## Understanding Age-Sex Composition



To relate the total fertility rate (TFR) of a country to population
growth, we need to know some other things about the country:

1.  Women of child-bearing age, as a proportion of population
2.  Mortality rates (which will vary with age)
3.  Rates of net migration

We won't have much to say about migration yet, but the number of
women of child-bearing age and rates of mortality can both be
helpfully visualized by constructing *population pyramids*, which
report the age and sex composition of a population at
a point in time.



### Preface



In [1]:
%pip install wbdata
import wbdata

### Building a population pyramid



The next cell builds the list of age ranges that name the age-sex counts we want
 (e.g., how many males are there between the ages of 10 and 14?).



In [1]:
# WDI age-sex data comes as variables named
# "SP.POP.LLHH.MA" for males and "SP.POP.LLHH.FE" for females,
# where LL is the *low* end of the age range (e.g., "05" for
# five-year-olds) and HH is the *high* end (e.g., "09").

# We construct a list of age-ranges.

# Start with an empty list of age ranges
age_ranges = []

# Ranges top out at 80, and go in five year increments
for i in range(0,80,5):
    age_ranges.append(f"{i:02d}"+f"{i+4:02d}")

age_ranges.append("80UP")

print(age_ranges)
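
The codes in `age_ranges` are compact but not very readable on a plot axis. As a minimal sketch, the helper below turns a code such as "0509" into "5-9" and "80UP" into "80+". The name `age_label` is hypothetical (it is not part of `wbdata`), and the list of codes is rebuilt here only so the example runs on its own.

```python
def age_label(code):
    """Turn an age code such as "0509" into "5-9"; "80UP" becomes "80+".

    Hypothetical helper for illustration; not part of wbdata.
    """
    if code.endswith("UP"):
        return f"{int(code[:2])}+"
    return f"{int(code[:2])}-{int(code[2:])}"

# Rebuild the same codes as the cell above so this sketch is self-contained
age_ranges = [f"{i:02d}{i+4:02d}" for i in range(0, 80, 5)] + ["80UP"]

labels = [age_label(code) for code in age_ranges]
print(labels)
```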

Next we construct a dictionary of indicators, with labels, that we
 want to grab.



In [1]:
male_variables = {"SP.POP."+age_range+".MA":"Males "+age_range for age_range in age_ranges}
female_variables = {"SP.POP."+age_range+".FE":"Females "+age_range for age_range in age_ranges}

# Merge into a new dictionary so that male_variables itself is left unchanged
variables = {**male_variables, **female_variables}

print(variables)

Get the data!



In [1]:
# WLD is the World; substitute your own code or list of codes.
# Remember you can search for the appropriate codes using
# wbdata.get_countries(query="Some Place")

df = wbdata.get_dataframe(variables,country="WLD",parse_dates=True)
# Sanity check: the age-sex counts for 2022 should sum to roughly the total population
print(df.xs("2022-01-01").sum(axis=0))
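
The first item on our list was women of child-bearing age as a proportion of population, and the counts we just fetched are enough to compute it. The sketch below assumes that "child-bearing age" means ages 15 to 49 (a common convention, but an assumption here) and works on a plain dictionary keyed by labels like "Females 1519". The function name `childbearing_share` and the toy numbers are invented for illustration.

```python
def childbearing_share(counts, low=15, high=49):
    """Share of the total population that is female and aged low..high.

    counts maps labels such as "Females 1519" to population counts.
    The 15-49 default is an assumed convention, not taken from the data.
    """
    total = sum(counts.values())
    women = 0
    for label, n in counts.items():
        sex, code = label.split()
        # Skip males and the open-ended top group ("80UP")
        if sex != "Females" or code.endswith("UP"):
            continue
        lo, hi = int(code[:2]), int(code[2:])
        if lo >= low and hi <= high:
            women += n
    return women / total

# Toy numbers, invented for illustration only (not real data)
toy = {
    "Males 1014": 100, "Males 1519": 100,
    "Females 1014": 100, "Females 1519": 100,
    "Females 4549": 50, "Females 5054": 50,
}
share = childbearing_share(toy)
print(share)
```

With the DataFrame from the cell above, something like `childbearing_share(df.xs("2022-01-01").to_dict())` should give the corresponding figure for the world, provided none of the counts are missing.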

### Plotting a Population Pyramid



Now we put together some code for the population pyramid.  The structure
 of the DataFrame is more complicated than it was above, so the simple DataFrame plotting methods won't work here (or at least I don't see quite how to do it).   We use the more general `plotly` library instead.



In [1]:
import plotly.offline as py
import plotly.graph_objs as go
import pandas as pd
import numpy as np

py.init_notebook_mode(connected=True)

layout = go.Layout(barmode='overlay',
                   yaxis=go.layout.YAxis(range=[0, 90], title='Age'),
                   xaxis=go.layout.XAxis(title='Number'))

year = '2022-01-01'

# Male counts are plotted as positive values, so their bars extend to the right
bins = [go.Bar(x = df.loc[str(year),:].filter(regex="Male").values,
               y = [int(s[:2])+1 for s in age_ranges],
               orientation='h',
               name='Men',
               marker=dict(color='purple'),
               hoverinfo='skip'
               ),

        # Female counts are negated so that their bars extend to the left
        go.Bar(x = -df.loc[str(year),:].filter(regex="Female").values,
               y=[int(s[:2])+1 for s in age_ranges],
               orientation='h',
               name='Women',
               marker=dict(color='pink'),
               hoverinfo='skip',
               )
        ]
py.iplot(dict(data=bins, layout=layout))
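
Because the female counts are negated to push their bars to the left, the horizontal axis shows negative numbers on that side. One possible remedy, sketched here in plain Python, is to compute symmetric tick positions together with labels that show absolute values. The helper name `symmetric_ticks` and the example maximum are hypothetical; the two lists it returns are meant to be passed to Plotly's `tickvals` and `ticktext` axis properties.

```python
def symmetric_ticks(max_count, n=4):
    """Tick positions from -max_count to +max_count, labelled by absolute value.

    Hypothetical helper for illustration. Labels are in millions.
    """
    step = max_count / n
    tickvals = [step * k for k in range(-n, n + 1)]
    ticktext = [f"{abs(v) / 1e6:.0f}M" for v in tickvals]
    return tickvals, ticktext

# Example maximum chosen for illustration only (not taken from the data)
tickvals, ticktext = symmetric_ticks(400e6)
print(tickvals)
print(ticktext)
```

These could then be supplied to the layout, for example as `xaxis=dict(title='Number', tickvals=tickvals, ticktext=ticktext)`.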

### Changes in Pyramid Over Time



Let's try a more ambitious visualization, showing how the shape of the population pyramid has changed at twenty-year intervals.



In [1]:
# Count down by increments of 20 years
years = range(2020,1959,-20)

# This makes a list of graphs, year by year
bins = [go.Bar(x = df.loc[str(year)+'-01-01',:].filter(regex="Male").values,
               y = [int(s[:2])+1 for s in age_ranges],
               orientation='h',
               name='Men {:d}'.format(year),
               hoverinfo='skip',
               opacity=0.5
              )
        for year in years]

bins += [go.Bar(x = -df.loc[str(year)+'-01-01',:].filter(regex="Female").values,
                y=[int(s[:2])+1 for s in age_ranges],
                orientation='h',
                name='Women {:d}'.format(year),
                hoverinfo='skip',
                opacity=0.5
               )
         for year in years]

py.iplot(dict(data=bins, layout=layout))
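
Since total population grows over time, overlaid pyramids of raw counts differ in size as well as in shape. If the shape is what matters, one option is to convert each year's counts into shares of that year's total before plotting. The sketch below shows the idea on plain lists; `to_shares` is a hypothetical helper and the numbers are invented.

```python
def to_shares(counts):
    """Convert a list of counts into shares of their total."""
    total = sum(counts)
    return [c / total for c in counts]

# Toy numbers, invented for illustration only (not real data).
# Each list stands for one year's counts across all age-sex groups.
early = [30, 20, 10]
late = [60, 60, 80]

early_shares = to_shares(early)
late_shares = to_shares(late)
print(early_shares)
print(late_shares)
```

With the DataFrame used above, `df.div(df.sum(axis=1), axis=0)` should do the same for every year at once, assuming no counts are missing within a row.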