In [2]:
import numpy as np
import math
from datascience import *
from scipy import stats

# Welcome to IAS-150's Data Science Module

Today we will be examining a data set that the UN produces every year, called the Gender Inequality Index (GII). This is the UN's annual ranking of 188 countries in terms of gender _inequality_.

### Load in the UN Gender Inequality Index (GII) data from 2016
Note: the table has been modified slightly from its original format for ease of use. The original table can be found at: http://hdr.undp.org/en/composite/GII

In [3]:
data = Table.read_table('GII_data-numbers.csv')
data

HDI rank,Country,Value (2015),Rank (2015),"Maternal mortality ratio (deaths per 100,000 live births)","Adolescent birth rate (births per 1,000 women ages 15–19)",Share of seats in parliament (% held by women),% Female population with at least some secondary education,% Male population with at least some secondary education,Female Labour force participation rate,Male Labour force participation rate
1,Norway,0.05,6.0,5.0,5.9,39.64,96.07,94.6,61.18,68.53
2,Australia,0.12,24.0,6.0,14.13,30.53,91.37,91.53,58.57,70.92
2,Switzerland,0.04,1.0,5.0,2.95,28.86,96.07,97.36,62.68,74.85
4,Germany,0.07,9.0,6.0,6.69,36.86,96.38,96.96,54.53,66.43
5,Denmark,0.04,2.0,6.0,4.04,37.43,89.08,98.53,58.04,66.16
5,Singapore,0.07,11.0,10.0,3.82,23.91,75.52,81.92,58.24,76.43
7,Netherlands,0.04,3.0,7.0,3.99,36.44,86.18,90.28,57.53,70.24
8,Ireland,0.13,26.0,8.0,10.43,19.91,86.76,82.22,52.38,67.82
9,Iceland,0.05,5.0,3.0,6.07,41.27,100.0,97.18,70.66,77.5
10,Canada,0.1,18.0,7.0,9.77,28.27,100.0,100.0,60.97,70.28


## Clean Data:
##### (Only pay attention if you're interested)
Right now, all of the values that look like numbers are actually being stored as _strings_, which is the Python type for text. You can't do normal arithmetic on strings, so we have to convert them into _floats_ (floating-point numbers). Missing entries, which are marked `'..'` in the data, become `nan` ("not a number"). That's what the code below does.

In [4]:
un = Table()
for label in data.labels:
    clean_col = make_array()
    for i in np.arange(len(data.column(label))):
        if data.column(label).item(i) == '..':
            # '..' marks a missing value; store it as NaN
            clean_col = np.append(clean_col, np.nan)
        elif label == 'Country' or label == 'HDI rank':
            # leave these two columns as they are
            clean_col = np.append(clean_col, data.column(label).item(i))
        else:
            # every other column is converted to a float
            clean_col = np.append(clean_col, float(data.column(label).item(i)))
    un.append_column(label, clean_col)
un

HDI rank,Country,Value (2015),Rank (2015),"Maternal mortality ratio (deaths per 100,000 live births)","Adolescent birth rate (births per 1,000 women ages 15–19)",Share of seats in parliament (% held by women),% Female population with at least some secondary education,% Male population with at least some secondary education,Female Labour force participation rate,Male Labour force participation rate
1,Norway,0.05,6,5,5.9,39.64,96.07,94.6,61.18,68.53
2,Australia,0.12,24,6,14.13,30.53,91.37,91.53,58.57,70.92
2,Switzerland,0.04,1,5,2.95,28.86,96.07,97.36,62.68,74.85
4,Germany,0.07,9,6,6.69,36.86,96.38,96.96,54.53,66.43
5,Denmark,0.04,2,6,4.04,37.43,89.08,98.53,58.04,66.16
5,Singapore,0.07,11,10,3.82,23.91,75.52,81.92,58.24,76.43
7,Netherlands,0.04,3,7,3.99,36.44,86.18,90.28,57.53,70.24
8,Ireland,0.13,26,8,10.43,19.91,86.76,82.22,52.38,67.82
9,Iceland,0.05,5,3,6.07,41.27,100.0,97.18,70.66,77.5
10,Canada,0.1,18,7,9.77,28.27,100.0,100.0,60.97,70.28
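
To see why the conversion matters, here is a minimal sketch on a few toy values rather than the full table. The helper name `to_float` is hypothetical and not part of the notebook; it only mirrors the rule used above, where `'..'` becomes `nan` and everything else is passed to `float`.

```python
import numpy as np

# Toy raw cells: two numbers stored as text and one missing entry.
raw = ['0.05', '0.12', '..']

def to_float(text):
    """Convert one raw cell to a float, mapping '..' to NaN (hypothetical helper)."""
    return np.nan if text == '..' else float(text)

cleaned = np.array([to_float(cell) for cell in raw])

# Adding strings glues the text together instead of doing arithmetic.
glued = raw[0] + raw[1]
print(glued)

# Adding floats gives a numeric sum (up to floating-point rounding).
total = cleaned[0] + cleaned[1]
print(total)
```

The list comprehension builds the cleaned column in one pass, which is one alternative to growing an array with repeated `np.append` calls.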


The 'Value' and 'Rank' columns give each country's GII value and GII rank for 2015. As you can see, the lower the value, the better the rank (a rank of 1 means the least inequality). Note that the rows are ordered by HDI rank, not by GII rank: Norway is listed first, but Switzerland, with a value of 0.04, holds GII rank 1.
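
As a quick check of the relationship between value and rank, the sketch below sorts six of the rows shown above by their 2015 value and confirms that the ranks come out in increasing order. It uses plain NumPy arrays with numbers copied from the table, not the full `un` table, so it only illustrates the idea.

```python
import numpy as np

# Six rows copied from the table above: country, 2015 GII value, 2015 GII rank.
countries = np.array(['Norway', 'Australia', 'Switzerland', 'Germany', 'Ireland', 'Canada'])
values = np.array([0.05, 0.12, 0.04, 0.07, 0.13, 0.10])
ranks = np.array([6, 24, 1, 9, 26, 18])

# Order the rows from lowest to highest GII value.
order = np.argsort(values)
sorted_countries = countries[order]
sorted_ranks = ranks[order]

for name, value, rank in zip(sorted_countries, values[order], sorted_ranks):
    print(name, value, rank)

# If lower values mean better ranks, the ranks are now in increasing order too.
ranks_increase = bool(np.all(np.diff(sorted_ranks) > 0))
print(ranks_increase)
```

On the full table, `un.sort('Value (2015)')` should produce the same kind of ordering, although how rows with missing values are placed is not checked here.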

### How the GII is calculated:
![](gii_breakdown.png)

The Gender Inequality Index (GII) reflects gender-based disadvantage in three dimensions—reproductive health, empowerment and the labour market—for as many countries as data of reasonable quality allow. It shows the loss in potential human development due to inequality between female and male achievements in these dimensions. It ranges from 0, where women and men fare equally, to 1, where one gender fares as poorly as possible in all measured dimensions. (taken from UNDP technical notes)

In [5]:
np.mean(un.column(5))

nan

In [6]:
np.nanmean(un.column(5))

47.867595628415302
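
The first call returns `nan` because column 5 (the adolescent birth rate, counting columns from 0) presumably contains at least one missing value, and any arithmetic that involves `nan` gives `nan`. `np.nanmean` ignores those entries and averages the rest. A minimal sketch on a toy array, using the first three birth rates from the table plus one missing entry added for illustration:

```python
import numpy as np

# First three adolescent birth rates from the table, plus one missing entry
# (the NaN is an assumption added for illustration).
rates = np.array([5.9, 14.13, 2.95, np.nan])

plain_mean = np.mean(rates)    # the NaN propagates through the sum
safe_mean = np.nanmean(rates)  # averages only the three real numbers

print(plain_mean)
print(safe_mean)
```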

In [7]:
un.show()

HDI rank,Country,Value (2015),Rank (2015),"Maternal mortality ratio (deaths per 100,000 live births)","Adolescent birth rate (births per 1,000 women ages 15–19)",Share of seats in parliament (% held by women),% Female population with at least some secondary education,% Male population with at least some secondary education,Female Labour force participation rate,Male Labour force participation rate
1,Norway,0.05,6.0,5.0,5.9,39.64,96.07,94.6,61.18,68.53
2,Australia,0.12,24.0,6.0,14.13,30.53,91.37,91.53,58.57,70.92
2,Switzerland,0.04,1.0,5.0,2.95,28.86,96.07,97.36,62.68,74.85
4,Germany,0.07,9.0,6.0,6.69,36.86,96.38,96.96,54.53,66.43
5,Denmark,0.04,2.0,6.0,4.04,37.43,89.08,98.53,58.04,66.16
5,Singapore,0.07,11.0,10.0,3.82,23.91,75.52,81.92,58.24,76.43
7,Netherlands,0.04,3.0,7.0,3.99,36.44,86.18,90.28,57.53,70.24
8,Ireland,0.13,26.0,8.0,10.43,19.91,86.76,82.22,52.38,67.82
9,Iceland,0.05,5.0,3.0,6.07,41.27,100.0,97.18,70.66,77.5
10,Canada,0.1,18.0,7.0,9.77,28.27,100.0,100.0,60.97,70.28
