Population Lecture I
====================

**Author:** Ethan Ligon

**Date:** January 22, 2026



## Introduction



Today we’ll introduce some key “stylized facts” about human
population and its growth.  None of these are “causal” statements,
just observations about relationships.

-   **Fact I:** Population growth is fundamentally exponential, but the
    rate of growth has fallen over time.
-   **Fact II:** Population growth rates are generally higher in places
    where people are poorer.
-   **Fact III:** Variation in growth rates across countries is
    accounted for more by variation in fertility than by mortality.
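
To make Fact I concrete: under a constant growth rate $r$, population follows $N_t = N_0 e^{rt}$, so it doubles every $\ln 2 / r$ years. The sketch below computes that doubling time; the two rates are illustrative assumptions, not WDI estimates, and `doubling_time` is a helper name introduced here.

```python
import math

def doubling_time(rate):
    """Years needed for a population to double at a constant,
    continuously compounded annual growth rate."""
    return math.log(2) / rate

# Illustrative (assumed) annual growth rates, not WDI estimates
for rate in (0.02, 0.01):
    print(f"{rate:.0%} per year: doubles in about {doubling_time(rate):.0f} years")
```

Halving the growth rate doubles the time it takes the population to double, which is why a falling rate matters so much over long horizons.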



## Getting Data



### The World Development Indicators & `wbdata`



The World Bank maintains a large set of “World Development Indicators” (WDI),
including information on population.

-   API for WDI is available at [https://datahelpdesk.worldbank.org/knowledgebase/articles/889392-about-the-indicators-api-documentation](https://datahelpdesk.worldbank.org/knowledgebase/articles/889392-about-the-indicators-api-documentation)

-   A `python` module that uses the API is `wbdata`, written by Oliver Sherouse.

-   Available at [http://github.com/OliverSherouse/wbdata](http://github.com/OliverSherouse/wbdata).

-   Documented at [https://wbdata.readthedocs.io](https://wbdata.readthedocs.io).



### Getting Population Data Using wbdata



#### Goals



We want to devise ways to visualize the following:

-   Global population growth from 1960 to the present;
-   Population growth rates versus GDP per capita;
-   Age-sex population pyramids.



#### Methods (using wbdata)



We walk through the process of getting data from the WDI into a
`pandas` DataFrame.

The `wbdata` module has several key functions we’ll want to use:

-   **`get_countries()`:** Returns codes for different countries or
    regions.
-   **`get_sources()`:** Gives a list of the different data sources that
    can be accessed using the module, each with a numeric key.
-   **`get_indicators()`:** Given a source, this returns a list of
    available variables (indicators).
-   **`get_dataframe()`:** Given a source and a list of indicators,
    this returns a dataframe populated with the requested data.

Begin by importing the module:



In [1]:
## If import fails with "ModuleNotFoundError"
## uncomment below & try again
# %pip install wbdata

import wbdata

##### `wbdata.get_countries()`



What countries and regions are available?  We can list all the country
codes, or search for particular strings:



In [1]:
import wbdata

# Return list of all country/region codes:
wbdata.get_countries()

# Return list matching a query term:
#wbdata.get_countries(query="World")
#wbdata.get_countries(query="United")

## Try your own search!
# wbdata.get_countries(query="")

##### `wbdata.get_sources()`



To see the datasets we can access via the API, use `get_sources()`.



In [1]:
wbdata.get_sources()

##### `wbdata.get_indicators()`



“Population estimates and projections” looks promising.
What indicators/variables are available?



In [1]:
SOURCE = 40 # "Population estimates and projections"

indicators = wbdata.get_indicators(source=SOURCE)
indicators

#### Getting Population Over Time



Let’s get data on the global population and see how it has changed over
time. The variable `SP.POP.TOTL` seems like a reasonable place to
start.

We want to get a `pandas.DataFrame` of total population:



In [1]:
# Give the variable a readable label for clarity
variable_labels = {"SP.POP.TOTL":"World Population"}

world = wbdata.get_dataframe(variable_labels, country="WLD",parse_dates=True)

# Print a few years' data
world.head()

## Plotting Data



### Plotting data from pandas.DataFrame



Let’s make a time-series plot of global population.  We’ll use `plotly` as a backend for plotting data in a `pandas.DataFrame`.
Here are a couple of lines to set up the plotting environment:



In [1]:
import pandas as pd
pd.options.plotting.backend = 'plotly'

### Plotting Global Population Over time



With that done, once we have a DataFrame, making a plot takes just one
line of code:



In [1]:
# Useful arguments to pass include title and labels
world.plot(title="Fact I: Growth Rates Falling over Time",
            labels=dict(date='Year',value='Population'))

### Plotting Different Countries’ Population Growth Rates



Globally, population growth has been basically linear over the last 60
years.

-   The population increases by about 1 billion every 12 years.
-   A roughly constant absolute increase on an ever-larger base implies
    that the *rate* of growth is falling over time.
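
The arithmetic behind this claim can be sketched as follows. The starting level of 3 billion in 1960 is a round illustrative figure rather than a value pulled from the WDI, and `average_growth_rate` is a helper name introduced here.

```python
import math

START_YEAR = 1960
START_POP = 3.0    # billions; a round illustrative figure (assumption)
STEP_YEARS = 12    # years per additional billion, as in the text

def average_growth_rate(pop_start, pop_end, years):
    """Average annual growth rate, computed as a difference in logs."""
    return (math.log(pop_end) - math.log(pop_start)) / years

rates = []
pop = START_POP
for i in range(5):
    rate = average_growth_rate(pop, pop + 1.0, STEP_YEARS)
    rates.append(rate)
    year = START_YEAR + i * STEP_YEARS
    print(f"{year}-{year + STEP_YEARS}: {pop:.0f} -> {pop + 1:.0f} billion, "
          f"{rate:.2%} per year")
    pop += 1.0
```

Each step adds the same 1 billion people, but the implied annual rate shrinks at every step because the base keeps growing.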

How do population growth rates vary by country?



In [1]:
import numpy as np

variable_labels = {"SP.POP.TOTL":"Population"}

# Three letter codes come from wbdata.get_countries()
countries = {"WLD":"World",
             "LIC":"Low income",
             "LMC":"Lower middle income",
             "UMC":"Upper middle income",
             "HIC":"High income",
            }

df = wbdata.get_dataframe(variable_labels, country = countries,parse_dates=True).squeeze()

df = df.unstack('country')
df = df.sort_index()

# Differences (over time) in logs give us growth rates
np.log(df).diff().plot(title="Fact II: Poorer places have higher growth rates",
                       labels=dict(value="Growth Rate",date='Year'))
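
Why do differences in logs give growth rates? If population grows by a proportion $g$, then $\log N_{t+1} - \log N_t = \log(1+g) \approx g$ when $g$ is small. The sketch below compares the two measures on made-up numbers; `log_growth` and `pct_growth` are helper names introduced here.

```python
import math

def log_growth(n_now, n_next):
    """Growth rate measured as a difference in logs."""
    return math.log(n_next) - math.log(n_now)

def pct_growth(n_now, n_next):
    """Growth rate measured as a proportional change."""
    return (n_next - n_now) / n_now

# Made-up growth rates, chosen to show where the approximation degrades
for g in (0.01, 0.03, 0.10):
    n_now, n_next = 100.0, 100.0 * (1 + g)
    print(f"proportional: {pct_growth(n_now, n_next):.4f}   "
          f"log difference: {log_growth(n_now, n_next):.4f}")
```

At the 1-3% annual rates typical of population data the two measures are nearly identical; the gap only becomes noticeable at much higher rates.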

### Population Growth vs Per capita GDP



Our second stylized fact was that there’s an inverse association between
income and population growth.  We’ll investigate this fact here,
constructing a scatter plot relating population growth rates to (log) GDP per capita.



In [1]:
import numpy as np
# wbdata.get_indicators(query="GDP per capita")

indicators = {"NY.GDP.PCAP.CD":"GDP per capita",
              "SP.DYN.TFRT.IN":"Total Fertility Rate",
              "SP.POP.GROW":"Population Growth Rate",
              "SP.DYN.AMRT.MA":"Male Mortality",
              "SP.DYN.AMRT.FE":"Female Mortality",
              "SP.POP.1564.FE.ZS":"% Adult Female",
              "SP.POP.TOTL.FE.ZS":"% Female"}

data = wbdata.get_dataframe(indicators,parse_dates=True)

# Just grab data from one year
df = data.xs("2023-01-01",level='date').copy()  # copy so we can add columns below

df['Log GDP per capita'] = np.log(df['GDP per capita'])

df.plot.scatter(title="Fact II: Population growth is lower in higher-income countries",
         x="Log GDP per capita",y="Population Growth Rate",
         hover_name=df.reset_index('country')['country'].values.tolist())
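
One way to put a number on the association in a scatter plot like this is the slope of a least-squares line. The sketch below uses plain Python and hypothetical values (not WDI data); `ols_slope` is a helper name introduced here.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical values for five countries, for illustration only
log_gdp = [6.5, 7.5, 8.5, 9.5, 10.5]       # log GDP per capita
pop_growth = [2.8, 2.3, 1.5, 0.9, 0.4]     # population growth, % per year

slope = ols_slope(log_gdp, pop_growth)
print(f"slope: {slope:.2f} percentage points per log point of GDP per capita")
```

A negative slope is what Fact II predicts. Like the scatter plot itself, it describes an association and says nothing about causation.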

### Decomposing Population Growth



Consider the human population at a particular time $t$, and let
$N_t$ denote its size.  Also, let
$\phi_t$ be the *share* of the population at time $t$ that are women
of child-bearing age (e.g., 15–49).

Now, as a matter of accounting, population in the next period $t+1$ will be given by
$$
    N_{t+1} = (1-\mbox{mortality rate})N_t + \mbox{TFR}\cdot\phi_t N_t.
 $$

Thus, we can think of population growth as depending on mortality, fertility, and the share of the population that can bear children.
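
Dividing the accounting identity by $N_t$ and subtracting one gives the implied one-period growth rate. This is a sketch that reads TFR as births per woman of child-bearing age *per period*, an assumption needed for the identity to hold period by period (the published TFR is a lifetime measure):

$$
    \frac{N_{t+1} - N_t}{N_t} = \mbox{TFR}\cdot\phi_t - \mbox{mortality rate}.
$$

So the growth rate falls when fertility falls, when $\phi_t$ falls, or when mortality rises.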

We’ve seen that population growth is falling over time.  Is the fall due to changes in mortality, fertility, or $\phi_t$?
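
One way to frame this question is to split the change in the growth rate between two periods into a mortality piece, a fertility piece, and a $\phi_t$ piece. The sketch below does this with midpoint weights so the pieces add up exactly. All inputs are hypothetical, fertility is treated as a per-period rate (an assumption), and the function names are introduced here for illustration.

```python
def growth_rate(mortality, fertility, phi):
    """One-period growth rate implied by the accounting identity:
    births per person (fertility * phi) minus deaths per person."""
    return fertility * phi - mortality

def decompose_change(before, after):
    """Split the change in growth between two periods into mortality,
    fertility, and phi contributions.  Midpoint weights make the three
    pieces sum exactly to the total change."""
    m0, b0, p0 = before
    m1, b1, p1 = after
    return {
        "mortality": -(m1 - m0),
        "fertility": (b1 - b0) * (p0 + p1) / 2,
        "phi": (p1 - p0) * (b0 + b1) / 2,
    }

# Hypothetical (mortality, per-period fertility, phi); illustration only
before = (0.012, 0.14, 0.24)
after = (0.008, 0.07, 0.26)

parts = decompose_change(before, after)
total = growth_rate(*after) - growth_rate(*before)
for name, value in parts.items():
    print(f"{name:>9}: {value:+.4f}")
print(f"{'total':>9}: {total:+.4f}")
```

With these made-up inputs, lower mortality and a higher $\phi_t$ both push growth up, so the whole decline comes from the fertility term. Whether the real data look like this is what the following sections examine.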



### Mortality Over Time



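Can mortality changes account for declining population growth?  Look at
adult mortality rates, measured as deaths per 1,000 adults.



In [1]:
# .copy() lets us add columns to `world` later without pandas warnings
world = data.xs("World",level='country').copy()

world[["Male Mortality","Female Mortality"]].plot(title="Adult Mortality (deaths per 1,000 adults)")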
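
The `xs` calls used in this notebook take a cross-section of a DataFrame indexed by both country and date. The sketch below shows the same pattern on a toy frame; the country names and values are hypothetical, and only the index layout mirrors what `wbdata.get_dataframe` returns here.

```python
import pandas as pd

# Hypothetical toy data with a (country, date) MultiIndex
index = pd.MultiIndex.from_product(
    [["Aland", "Borduria"], pd.to_datetime(["2022-01-01", "2023-01-01"])],
    names=["country", "date"],
)
toy = pd.DataFrame({"Population Growth Rate": [1.0, 0.9, 2.5, 2.4]}, index=index)

# One country, all dates: the result is indexed by date
one_country = toy.xs("Borduria", level="country")

# One date, all countries: the result is indexed by country
one_year = toy.xs(pd.Timestamp("2023-01-01"), level="date")

print(one_country)
print(one_year)
```

In both cases the level that was selected on is dropped from the result, which is why the earlier cells could plot `world` against dates and `df` against countries directly.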

### Adult female share of population over time



Decreases in population growth could also be due to a decreasing share of adult women, perhaps due to sex selection at birth.  How does this share ($\phi_t$) vary over time?



In [1]:
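# "% Adult Female" is the % of females who are aged 15-64.
# To get adult women as a % of total population, multiply by the
# % of the population that is female.  Storing the result in a new
# column keeps this cell safe to re-run.
world["% Adult Female (of total)"] = world["% Adult Female"]*world["% Female"]/100

world["% Adult Female (of total)"].plot(title="Adult Females as % of World Population")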
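
The unit conversion in this cell is easy to get wrong, so here is the arithmetic on its own. The inputs are hypothetical round numbers and `share_of_total` is a helper name introduced here.

```python
def share_of_total(pct_of_females_adult, pct_female):
    """Adult women as a percent of the total population, given the
    percent of females who are adult and the percent of the
    population that is female."""
    return pct_of_females_adult * pct_female / 100

# Hypothetical: 65% of females are adult, and 50% of the population is female
adult_female_pct = share_of_total(65.0, 50.0)
print(f"Adult women are {adult_female_pct}% of the total population")
```

Dividing by 100 once keeps the result in percent; dividing by 10,000 would instead give a share between 0 and 1.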

### Fertility over time



Finally, decreases in population growth could be due to reduced fertility.  How does global fertility vary over time?



In [1]:
world["Total Fertility Rate"].plot(title="World Total Fertility Rate")

### Relation between income and fertility



In [1]:
# Axis labels default to the column names passed as x and y
df.plot.scatter(x="Log GDP per capita",y="Total Fertility Rate",
         hover_name=df.reset_index('country')['country'].values.tolist(),
         title="Fact II: Women in Poorer Countries Have Higher Fertility")