# Energy Access 


In this notebook, you will investigate the Human Development Index (HDI) and observe how various factors such as GNI per capita, life expectancy, and education affect HDI, using data from the World Bank.

**Dependencies:**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.style.use('fivethirtyeight')

from IPython.display import display, Latex, Markdown

<br>

----


<img src="hdi.png" width=800>

In this section, we will normalize each individual metric (GNI, life expectancy, education) and compute the HDI following the United Nations Development Programme [guide](http://hdr.undp.org/sites/default/files/hdr2016_technical_notes_0.pdf "UNDP HDI Notes"), which describes the index as follows: "The Human Development Index (HDI) is a summary measure of average achievement in key dimensions of human development: a long and healthy life, being knowledgeable and have a decent standard of living. The HDI is the geometric mean of normalized indices for each of the three dimensions."

<br>
The formula for calculating HDI is shown below:  
<img src="hdicalc.png" width=300>
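
As a text version of the calculation, here is a sketch based on the UNDP technical notes linked above; the notation is chosen here and may differ from the image. Each dimension is first rescaled between a minimum and a maximum "goalpost":

$$
I_{\text{dimension}} = \frac{\text{actual value} - \text{minimum value}}{\text{maximum value} - \text{minimum value}}
$$

For income, the natural logarithm of each value is taken before rescaling. The HDI is then the geometric mean of the three dimension indices:

$$
\text{HDI} = \left(I_{\text{Health}} \cdot I_{\text{Education}} \cdot I_{\text{Income}}\right)^{1/3}
$$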

<br>


In [27]:
# RUN THIS CELL
life_ed_gni = pd.read_csv('data/life_ed_gni.csv')
life_ed_gni.head()

| Unnamed: 0 | Income Group | Country | Year(s) | Life | Ed_expected | HDI Rank | Ed_mean | GNI |
|---|---|---|---|---|---|---|---|---|
| 0 | Low income | Afghanistan | 2012 | 60 | 9.9 | 169 | 3.4 | 1900.0 |
| 1 | Lower middle income | Angola | 2012 | 51 | 11.4 | 150 | 4.8 | 5550.0 |
| 2 | Upper middle income | Albania | 2012 | 74 | 14.2 | 75 | 9.6 | 10450.0 |
| 3 | High income | Andorra | 2012 | 83 | 13.5 | 32 | 9.6 | NaN |
| 4 | High income | United Arab Emirates | 2012 | 76 | 13.3 | 42 | 9.2 | 60600.0 |


Define a function that normalizes GNI. Test this function by inputting Afghanistan's GNI per capita (PPP) in 2012 (use the `life_ed_gni` table).

In [28]:
#Solution
def normalize_GNI(gni):
    """
    Normalize GNI to get the Income Index.

    Args:
        gni: GNI per capita (PPP) of a country in a given year

    Returns:
        The Income Index (float)
    """
    # Log scale with goalposts of 100 (minimum) and 75,000 (maximum)
    numerator = np.log(gni) - np.log(100)
    denominator = np.log(75000) - np.log(100)
    return np.divide(numerator, denominator)

Run the following cell -- if it raises an error, that means there's an error in the function.

In [29]:
# TEST YOUR FUNCTION
first_num = life_ed_gni.loc[0,'GNI']

test_gni_ans = normalize_GNI(first_num)

assert np.isclose(test_gni_ans, 0.4447743835010624)
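
A quick way to sanity-check a log-scale index is to evaluate it at the goalposts and to compare equal proportional steps. The sketch below is illustrative only: `income_index` and its `lo`/`hi` parameters are hypothetical names introduced here, using the same goalposts (100 and 75,000) as the solution above.

```python
import numpy as np

def income_index(gni, lo=100, hi=75000):
    """Log-scale income index; `lo` and `hi` are the GNI goalposts."""
    return (np.log(gni) - np.log(lo)) / (np.log(hi) - np.log(lo))

# The goalposts map to the two ends of the 0-1 scale.
at_min = income_index(100)
at_max = income_index(75000)

# On a log scale, doubling GNI adds the same amount to the index
# regardless of the starting level.
step_low = income_index(2000) - income_index(1000)
step_high = income_index(40000) - income_index(20000)

print(at_min, at_max)
print(step_low, step_high)
```

The two printed steps should agree up to floating-point rounding, which is why the checks in this notebook are best made with a tolerance rather than exact equality.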

Define a function that normalizes life expectancy. Test this function by inputting Afghanistan's life expectancy for both sexes in 2012.

In [30]:
#Solution
def normalize_life(life):
    """
    Normalize life expectancy to get the Life Expectancy Index.

    Args:
        life: life expectancy at birth for both sexes, in years

    Returns:
        The Life Expectancy Index (float)
    """
    # Goalposts of 20 years (minimum) and 85 years (maximum)
    sub = life - 20
    constants = 85 - 20
    return np.divide(sub, constants)

In [31]:
# TEST YOUR FUNCTION
life_num = life_ed_gni.loc[0,'Life']

test_life_ans = normalize_life(life_num)
assert np.isclose(test_life_ans, 0.6153846153846154)
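
Worked by hand with Afghanistan's life expectancy of 60 years from the table above, the expected test value follows directly from the goalposts of 20 and 85 years:

$$
I_{\text{Health}} = \frac{60 - 20}{85 - 20} = \frac{40}{65} \approx 0.615
$$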

Define a function that calculates the Education Index. Test this function by inputting Afghanistan's mean years of schooling and expected years of schooling for 2012.

In [32]:
#Solution
def normalize_ed(mean_var, exp_var):
    """
    Normalize years of schooling to get the Education Index.

    Args:
        mean_var: mean years of schooling
        exp_var: expected years of schooling

    Returns:
        The Education Index (float)
    """
    # Maximum goalposts: 15 mean years and 18 expected years (minimum is 0)
    mysi = np.divide(mean_var, 15)
    eysi = np.divide(exp_var, 18)
    add = mysi + eysi
    return np.divide(add, 2)

In [33]:
# TEST YOUR FUNCTION
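ed_nums = life_ed_gni.loc[0, ['Ed_mean', 'Ed_expected']]

# Index by label; integer positions on a labeled Series are deprecated in recent pandas
test_ed_ans = normalize_ed(ed_nums['Ed_mean'], ed_nums['Ed_expected'])
assert np.isclose(test_ed_ans, 0.38833333333333336)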
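
Worked by hand with Afghanistan's values from the table above (3.4 mean years and 9.9 expected years of schooling), the Education Index is the average of the two rescaled schooling measures:

$$
I_{\text{Education}} = \frac{1}{2}\left(\frac{3.4}{15} + \frac{9.9}{18}\right) \approx \frac{0.227 + 0.550}{2} \approx 0.388
$$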

Define a function that calculates the HDI. Test this function by inputting Afghanistan's normalized GNI, normalized life expectancy for both sexes, and normalized education index, all for 2012.

In [34]:
#Solution
def calc_hdi(gni_var, life_var, ed_var):
    """
    Compute HDI from normalized GNI, life expectancy, and education values.

    Args:
        gni_var: the Income Index
        life_var: the Life Expectancy Index
        ed_var: the Education Index

    Returns:
        The HDI (float)
    """
    # Geometric mean: cube root of the product of the three indices
    var = gni_var * life_var * ed_var
    return var ** (1 / 3)

In [35]:
# These three values were calculated using the previous three functions
assert np.isclose(calc_hdi(test_gni_ans, test_life_ans, test_ed_ans), 0.4736930620781577)
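
To see what the geometric mean does in practice, the sketch below compares it with a plain arithmetic mean. It uses Afghanistan's three indices rounded from the test values above; the second, "balanced" country is hypothetical and exists only for illustration.

```python
import numpy as np

# Afghanistan's 2012 indices (income, life, education), rounded.
indices = np.array([0.4448, 0.6154, 0.3883])

geometric = np.prod(indices) ** (1 / 3)
arithmetic = indices.mean()

# A hypothetical country with the same arithmetic mean
# but perfectly balanced indices.
balanced = np.full(3, arithmetic)
geometric_balanced = np.prod(balanced) ** (1 / 3)

print(round(geometric, 4), round(arithmetic, 4), round(geometric_balanced, 4))
```

The geometric mean of uneven indices comes out below their arithmetic mean, while the balanced case shows no gap, so a low score in one dimension is not fully offset by a high score in another.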

Use `.apply()` to create three new columns in the `life_ed_gni` data frame.  

* The first new column will be normalized GNI, called `'GNI_n'`
* The second new column will be normalized life expectancy, called `'Life_n'`
* The third new column will be normalized education, called `'Ed_n'`

In [36]:
#Solution
life_ed_gni['GNI_n'] = life_ed_gni['GNI'].apply(normalize_GNI)
life_ed_gni['Life_n'] = life_ed_gni['Life'].apply(normalize_life)
life_ed_gni['Ed_n'] = life_ed_gni[['Ed_mean', 'Ed_expected']].apply(
    lambda row: normalize_ed(row['Ed_mean'], row['Ed_expected']), axis=1
)
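
Because the formulas are plain arithmetic, the same columns can also be computed directly on whole columns without `.apply()`. The sketch below is a self-contained illustration rather than part of the exercise: it rebuilds only the five rows shown by `life_ed_gni.head()`, and the `df` and `HDI_calc` names are assumptions introduced here.

```python
import numpy as np
import pandas as pd

# The five rows shown by life_ed_gni.head(); Andorra's GNI is missing.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Angola", "Albania", "Andorra",
                "United Arab Emirates"],
    "Life": [60, 51, 74, 83, 76],
    "Ed_expected": [9.9, 11.4, 14.2, 13.5, 13.3],
    "Ed_mean": [3.4, 4.8, 9.6, 9.6, 9.2],
    "GNI": [1900.0, 5550.0, 10450.0, np.nan, 60600.0],
})

# Same formulas as the solution functions, written as column arithmetic.
df["GNI_n"] = (np.log(df["GNI"]) - np.log(100)) / (np.log(75000) - np.log(100))
df["Life_n"] = (df["Life"] - 20) / (85 - 20)
df["Ed_n"] = (df["Ed_mean"] / 15 + df["Ed_expected"] / 18) / 2

# Geometric mean of the three indices; a missing GNI propagates as NaN.
df["HDI_calc"] = (df["GNI_n"] * df["Life_n"] * df["Ed_n"]) ** (1 / 3)

print(df[["Country", "GNI_n", "Life_n", "Ed_n", "HDI_calc"]].round(3))
```

Rows with a missing input, such as Andorra's GNI here, end up with `NaN` in the computed columns, so it is worth deciding how to handle them before plotting or ranking.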