In [None]:
from IPython.core.display import HTML
from datascience import *

import matplotlib
matplotlib.use('Agg')
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import os
plt.style.use('fivethirtyeight')

import pandas as pd
import zipfile
import io
import math

def css_styling():
    styles = open('../notebook_styles.css', 'r').read()
    return HTML(styles)
css_styling()

In [None]:
#Loading testing data
from client.api.notebook import Notebook 
lab06 = Notebook('lab06.ok')
_ = lab06.auth(inline=True)

# Lab 06 - The Demographic Dividend

## Introductions

**What is your partner's name?**

[ANSWER HERE]

**What is your partner's favorite movie?**

[ANSWER HERE]

**What is your partner's favorite food?**

[ANSWER HERE]

## Age structure and economic productivity

A [dependency ratio](https://en.wikipedia.org/wiki/Dependency_ratio) is a quantity that is intended to roughly capture what fraction of a population is at economically productive age ranges. Thus, a dependency ratio can be thought of as a summary of how conducive a population's age structure is to economic productivity.

We'll focus on three different dependency ratios today. The first is the **child dependency ratio**:

$$
\text{Child Dependency Ratio} = 100 \times \frac{\text{# children aged 0-14}}{\text{# adults aged 15-64}}.
$$

The child dependency ratio captures, on average, how many children there are for every 100 working-age adults.

The second metric is the **old-age dependency ratio**:

$$
\text{Old-age Dependency Ratio} = 100 \times \frac{\text{# adults aged 65+}}{\text{# adults aged 15-64}}.
$$

The old-age dependency ratio captures, on average, how many older people there are for every 100 working-age adults.
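As a quick illustration of the two ratios defined so far, here is a minimal sketch with made-up counts. The numbers and variable names are hypothetical and are not taken from the UNPD data.

```python
# Hypothetical counts (in thousands) for an illustrative population;
# these are made-up numbers, not UNPD estimates.
children = 250       # ages 0-14
working_age = 1000   # ages 15-64
older = 150          # ages 65+

# Each ratio is 100 times the dependent group divided by the working-age group.
child_dr = 100 * children / working_age
old_age_dr = 100 * older / working_age

print(child_dr)    # 25.0
print(old_age_dr)  # 15.0
```

In this made-up population there are 25 children and 15 older people for every 100 working-age adults.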

Finally, the **total dependency ratio** is given by

$$
\text{Total Dependency Ratio} = 100 \times \frac{\text{# children aged 0-14} + \text{# adults aged 65+}}{\text{# adults aged 15-64}}.
$$

The total dependency ratio captures, on average, how many children and older adults there are for every 100 working-age adults.

**Question - What is the range of possible values that the total dependency ratio can take on?**

[ANSWER HERE]

**Question - Mathematically, what is the relationship between the total, child, and old-age dependency ratios?**  
*[HINT: If given the child and old-age dependency ratios, how might you calculate the total dependency ratio?]*

[ANSWER HERE]

**Question - In the data exercises below, we're going to look only at the female population, ignoring males. What impact, if any, do you think this will have on the old-age dependency ratio?**

[ANSWER HERE]

**Question - The total dependency ratio is a widely used way to try to quantify how many people need to be supported by people who are working. What are two limitations of the total dependency ratio when used for this purpose?**

[ANSWER HERE]

## Looking at historical dependency ratios around the world

We'll open up a dataset with the UNPD estimates for the age-distribution of countries around the world, over time. (To keep things simple, we'll only look at female populations today.)

In [None]:
unpd_pop = Table.read_table('../data/UNPD/unpd_f_pop_byage_cleaned.csv')
unpd_pop

This function is from Lab 05; it will help us grab population distributions from the UNPD dataset.

In [None]:
def get_pop(country, reference_date):
    raw_dat = unpd_pop.where('area', country).where('reference_date', reference_date)
    return(raw_dat)

get_pop('United States of America', 1980).show()

It will also be helpful to have a list with all of the periods available in the UNPD dataset.

In [None]:
all_periods = np.unique(unpd_pop['reference_date'])
all_periods

**Question - fill in the code below to write a function that takes information about a population (returned by `get_pop`) and calculates the old-age dependency ratio.**

In [None]:
def calculate_old_dr(population_data):
    
    # calculate the total number of old people
    tot_old = ...
    # calculate the total number of working-age people
    tot_working = ...
    
    # use tot_old and tot_working to calculate the old-age dependency ratio
    old_dr = ...
    return(old_dr)
    
us_old_dr = calculate_old_dr(get_pop('United States of America', 1980))
us_old_dr

In [None]:
_ = lab06.grade('test_old_dr')

**Question - fill in the code below to write a function that takes information about a population (returned by `get_pop`) and calculates the child dependency ratio.**

In [None]:
def calculate_child_dr(population_data):
    
    # calculate the total number of children
    tot_child = ...
    # calculate the total number of working-age people
    tot_working = ...
    
    # use tot_child and tot_working to calculate the child dependency ratio
    child_dr = ...
    
    return(child_dr)
    
us_child_dr = calculate_child_dr(get_pop('United States of America', 1980))
us_child_dr

In [None]:
_ = lab06.grade('test_child_dr')

**Question - fill in the code below to write a function that takes information about a population (returned by `get_pop`) and calculates the total dependency ratio.**  
*[HINT: you should make use of the two functions you just wrote]*

In [None]:
def calculate_total_dr(population_data):
    
    total_dr = ...

    return(total_dr)
    
us_total_dr = calculate_total_dr(get_pop('United States of America', 1980))
us_total_dr

In [None]:
_ = lab06.grade('test_total_dr')

Now that we can calculate the various dependency ratios, we'd like to be able to take a look at how they have changed over time.  This will help us understand what potential there has been for different countries to experience the demographic dividend.

**Question - What pattern in dependency ratios would be favorable for producing a demographic dividend? In other words, what time trend in dependency ratios would lead to the opportunity for economic development?**

[ANSWER HERE]

OK, now we'll look at actual time trends in dependency ratios.

**Question - fill in the function below so that it calculates a time series of the child, old-age, and total dependency ratios for the given country.**

In [None]:
def get_all_dr(country):
    
    total_dr = make_array()
    child_dr = make_array()
    old_dr = make_array()
    
    for period in all_periods:
        
        ## grab the population data for this country and time period
        pop_data = ...
        
        ## calculate the total, old-age, and child dependency ratios
        ## (hint: use the functions you wrote above!)
        current_total_dr = ...
        current_old_dr = ...
        current_child_dr = ...
        
        ## save the dependency ratios you just calculated
        total_dr = ...
        child_dr = ...
        old_dr = ...
        
    result = Table().with_columns('period', all_periods,
                                  'total_dr', total_dr,
                                  'old_dr', old_dr,
                                  'child_dr', child_dr,                                  
                                  'area', country)
    return(result)

us_all_dr = get_all_dr('United States of America')
us_all_dr.show()

In [None]:
_ = lab06.grade('test_get_all_dr')

**Question - Plot the time-trajectory of the three dependency ratios for the US**   
*[HINT: In the cell above, you calculated the ratios for the US and stored them in the Table `us_all_dr`]*

In [None]:
us_all_dr.plot(...)
plt.title("United States");

**Question - Fill in the code below to make a function that will first calculate and then plot the three dependency ratios over time for the given country.**  
*[HINT: You should be sure to use the `get_all_dr` function you wrote earlier]*

In [None]:
def plot_all_dr(country):
    
    ## get the dependency ratios for the country
    dr_data = ...
    
    ## plot the results
    ...

plot_all_dr('Germany')

**Question - Use `plot_all_dr` to plot the time-trajectory of the three dependency ratios for Uganda, Guatemala, Sweden, and Japan.**

In [None]:
...
...
...
...

**Question - Sort the four countries you just looked at into the stage of the demographic transition they are in; use the categories early, middle, and late. Do the dependency ratio patterns look like they are related to the demographic transition?**

[ANSWER HERE]

## Looking at dependency ratios using simulation

In the first part of the lab, which you've just finished, you looked at how dependency ratios have changed over time in the real world.

In the second part of the lab, we're going to use the simulation tools that we developed last lab to get a sense for how dependency ratios change as populations do. This analysis will help us understand what kind of factors affect dependency ratios in the long run. For example, we might wonder if a given set of fertility and mortality rates implies a relatively high or low dependency ratio. Understanding this would help us know how to come up with policy suggestions that could help countries manage their dependency ratios. It might even help them develop economically.

One way to study this question would be to use mathematical demography. But in this class, we're practicing our coding, so we'll use simulations instead. We'll re-use some of the code that we developed in the formal demography lab. To save us from having to type all of those population projection functions in again, I've put them in a module called `leslie`. We can load the module like this:

In [None]:
import leslie

To see an example of using the module, recall that in the formal demography lab, we wrote a function called `self_project` that takes a country and a time period and projects that country's population at that time period forward, assuming that birth and death rates are fixed.  `self_project` is in the `leslie` module, so we can use it now. The only difference from last week is that we have to specify that the `self_project` function is inside of the `leslie` module by calling it like this:

In [None]:
india_proj = leslie.self_project('India', 1970, 20)
india_proj

Note that we called `leslie.self_project(...)` to use the `self_project` function that is inside the `leslie` module.
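To make the idea of projecting with fixed birth and death rates concrete, here is a toy sketch of a single projection step. It is not the implementation inside the `leslie` module: the function name `project_one_step`, the three age groups, and all of the rates are hypothetical and chosen only for illustration.

```python
def project_one_step(pop, fertility, survival):
    """Advance an age-grouped population by one time step (toy example)."""
    # Births: each age group contributes its size times its fertility rate.
    births = sum(n * f for n, f in zip(pop, fertility))
    # Survivors of each group move up into the next age group;
    # zip stops at the shorter list, so the last group has no survivors.
    aged = [n * s for n, s in zip(pop, survival)]
    return [births] + aged

# Hypothetical population with three age groups: child, working, old.
pop = [100.0, 200.0, 40.0]
fertility = [0.0, 0.5, 0.0]   # assumed births per person, by age group
survival = [0.75, 0.5]        # assumed share surviving into the next group

# The same rates are applied at every step: that is what "fixed" means here.
step1 = project_one_step(pop, fertility, survival)
step2 = project_one_step(step1, fertility, survival)
print(step1)  # [100.0, 75.0, 100.0]
print(step2)  # [37.5, 75.0, 37.5]
```

The real projection works with many more age groups and with rates estimated from data, but each iteration follows the same pattern of applying one fixed set of rates to the current population.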

We'd like to develop an understanding of what happens to dependency ratios when birth and death rates are fixed over long periods of time. We'll do this by taking the results of `self_project` and calculating the three dependency ratios for each iteration of the projection.

In order to make this a bit easier, we're providing you with a function that will calculate, for each row in a table, the sum across a given set of columns:

In [None]:
def row_sums(tab, which_cols):
    """
    Calculate the sum across which_cols in the Table tab
    row by row.
    """
    tab_forsum = tab.select(which_cols)
    return(make_array(tab_forsum.to_df().sum(axis=1))[0])

To see an example of this function in action, let's create a test Table:

In [None]:
test_table = Table().with_columns('a', np.arange(4),
                                  'b', make_array('a', 'b', 'c', 'd'),
                                  'c', np.arange(4),
                                  'd', np.arange(start=10, stop=14))
test_table

We can use `row_sums` to calculate the row-wise sum of columns 'c' and 'd' like this:

In [None]:
row_sums(test_table, np.arange(start=2,stop=4))

We passed in `np.arange(start=2, stop=4)` because columns 'c' and 'd' are at indexes 2 and 3.
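As a check on what `row_sums` should return here, the same row-wise sum can be worked out by hand. This is a minimal sketch in plain Python; the list names are hypothetical, and the values are copied from columns 'c' and 'd' of the test Table above.

```python
# Columns 'c' and 'd' of the test Table, written out as plain lists.
col_c = [0, 1, 2, 3]
col_d = [10, 11, 12, 13]

# Add the two columns element by element, giving one total per row.
expected_row_sums = [c + d for c, d in zip(col_c, col_d)]
print(expected_row_sums)  # [10, 12, 14, 16]
```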

**Question - Fill in the code below to calculate the three dependency ratios for the given population projection output.**   
*[HINT: You should use the function `row_sums` here. Note also that the columns with the child ages are at indexes 2 to 4; the working ages are indexes 5 to 14; and the old ages are indexes 15 to 19]*

In [None]:
def add_dep_ratios(proj_output):
    
    # add columns for the proportion of the population that is
    # at child, working, and old ages (hint: use the row_sums function)
    res = proj_output.with_columns('prop_old', ...,
                                   'prop_child', ...,
                                   'prop_working', ...)
    
    # calculate the dependency ratios (hint: use the columns you just added above)
    res = res.with_columns('total_dr', ...,
                           'child_dr', ...,
                           'old_dr',   ...)
    return(res)

india_proj_withdr = add_dep_ratios(india_proj)
india_proj_withdr

In [None]:
_ = lab06.grade('test_add_dep_ratios')

**Question - Fill in the code below to plot the total, child, and old-age dependency ratios (y axis) by iteration (x axis). This plot will show how the dependency ratios change as the population is projected forward in time.**

In [None]:
...
plt.title("India (1970) projection");

**Question - Now fill in the code below to project the 2015 Uganda population forward 20 time steps, calculate the dependency ratios, and plot the results**  
*[HINT: Be sure to use the `leslie.self_project` and `add_dep_ratios` functions]*

In [None]:
## project 2015 Uganda forward 20 time steps
uganda_proj = ...
## calculate dependency ratios
uganda_proj_withdr = ...
## plot the result
...
plt.title("Uganda (2015) projection");

**Question - Finally, do the same thing for Japan. That is, write some code to project the 2015 Japan population forward 20 time steps, calculate the dependency ratios, and plot the results**

In [None]:
japan_proj = ...
japan_proj_withdr = ...
...
plt.title("Japan (2015) projection");

**Question - From these examples, what do you conclude about the dependency ratios -- do they change over time, or do they tend to converge to stable values?**

[ANSWER HERE]

**Question - From these examples, what do you conclude about how dependency ratios are related to the population rates that are used in the projection? Do the dependency ratios tend to settle on the same values no matter what the rates are? Or might they depend on the rates?**

[ANSWER HERE]

If you are interested in finding more information about the demographic dividend, here are some resources:

* [The Wikipedia page](https://en.wikipedia.org/wiki/Demographic_dividend) has an overview
* The Health Policy Project has developed [DemDiv](http://www.healthpolicyproject.com/index.cfm?id=software&get=DemDiv), a spreadsheet-based model for projecting population scenarios related to the demographic dividend
* JHU and the Gates Institute have created [www.demographicdividend.org](http://www.demographicdividend.org/). Among other things, the site has different case studies related to the demographic dividend in Africa

## Run all tests

This cell re-runs all of the unit tests in the notebook and summarizes the results.

In [None]:
# this cell runs all the tests at once!
print("Running all tests...")
_ = [lab06.grade(q[:-3]) for q in os.listdir("tests") if q.startswith('test')]
print("Finished running all tests.")

### Submit your assignment by MIDNIGHT on the day of class

Please submit your lab by running the cell below. You can submit as many times as you want, up to midnight on the day of the class. No late submissions are allowed, and the system will prevent you from submitting late.

In [None]:
_ = lab06.submit()