## Health Nutrition and Population Statistics
_____________

* Clearly state the goal of your project (what were you exploring?)
* Describe the data.
* What features (columns) did you have to work with?
    * What features were you interested in?
    * Were the features numerical/categorical/text?
    * Was a lot of data missing? If so, what did you do to handle it?
    * How did the features relate to each other and to the values that you were interested in?

In [2]:
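# Core data-analysis and plotting libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Render plots inline and reload edited modules automatically
%matplotlib inline
%load_ext autoreload
%autoreload 2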


In [4]:
health_data = pd.read_csv('data/data.csv')

In [5]:
health_data.head()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2007,2008,2009,2010,2011,2012,2013,2014,2015,Unnamed: 60
0,Arab World,ARB,% of females ages 15-49 having comprehensive c...,SH.HIV.KNOW.FE.ZS,,,,,,,...,,,,,,,,,,
1,Arab World,ARB,% of males ages 15-49 having comprehensive cor...,SH.HIV.KNOW.MA.ZS,,,,,,,...,,,,,,,,,,
2,Arab World,ARB,"Adolescent fertility rate (births per 1,000 wo...",SP.ADO.TFRT,133.555013,134.159119,134.857912,134.504576,134.105211,133.569626,...,49.999851,49.887046,49.781207,49.672975,49.536047,49.383745,48.796558,48.196418,,
3,Arab World,ARB,Adults (ages 15+) and children (0-14 years) li...,SH.HIV.TOTL,,,,,,,...,,,,,,,,,,
4,Arab World,ARB,Adults (ages 15+) and children (ages 0-14) new...,SH.HIV.INCD.TL,,,,,,,...,,,,,,,,,,


In [7]:
health_data.tail()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2007,2008,2009,2010,2011,2012,2013,2014,2015,Unnamed: 60
89005,Zimbabwe,ZWE,Use of insecticide-treated bed nets (% of unde...,SH.MLR.NETS.ZS,,,,,,,...,,,17.3,,9.7,,,26.8,,
89006,Zimbabwe,ZWE,Use of Intermittent Preventive Treatment of ma...,SH.MLR.SPF2.ZS,,,,,,,...,,,13.9,,7.3,,,12.9,,
89007,Zimbabwe,ZWE,Vitamin A supplementation coverage rate (% of ...,SN.ITK.VITA.ZS,,,,,,,...,83.0,0.0,77.0,49.0,47.0,61.0,34.0,32.0,,
89008,Zimbabwe,ZWE,Wanted fertility rate (births per woman),SP.DYN.WFRT,,,,,,,...,,,,,3.5,,,,,
89009,Zimbabwe,ZWE,Women's share of population ages 15+ living wi...,SH.DYN.AIDS.FE.ZS,,,,,,,...,58.586086,58.760796,58.812421,58.825943,58.899308,58.93908,58.900126,58.822335,58.855551,
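
Both previews end in an `Unnamed: 60` column with no visible values, a pattern that typically comes from a trailing comma on each CSV row. If that column turns out to be entirely empty, it could be dropped as in the minimal sketch below. The `toy` frame and its numbers are made up for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the layout above; the values are illustrative only.
toy = pd.DataFrame({
    "Country Name": ["Arab World", "Zimbabwe"],
    "Indicator Code": ["SP.ADO.TFRT", "SP.DYN.WFRT"],
    "2014": [48.2, np.nan],
    "2015": [np.nan, 3.5],
    "Unnamed: 60": [np.nan, np.nan],
})

# Drop only the columns in which every value is missing.
cleaned = toy.dropna(axis="columns", how="all")
print(list(cleaned.columns))
```

Using `how="all"` keeps sparse year columns that still hold at least one value, so only fully empty columns are removed.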


In [8]:
health_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 89010 entries, 0 to 89009
Data columns (total 61 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Country Name    89010 non-null  object 
 1   Country Code    89010 non-null  object 
 2   Indicator Name  89010 non-null  object 
 3   Indicator Code  89010 non-null  object 
 4   1960            35482 non-null  float64
 5   1961            35325 non-null  float64
 6   1962            35889 non-null  float64
 7   1963            35452 non-null  float64
 8   1964            35483 non-null  float64
 9   1965            35603 non-null  float64
 10  1966            35538 non-null  float64
 11  1967            36022 non-null  float64
 12  1968            35577 non-null  float64
 13  1969            35630 non-null  float64
 14  1970            36910 non-null  float64
 15  1971            37578 non-null  float64
 16  1972            38015 non-null  float64
 17  1973            37542 non-null 
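
The non-null counts above suggest heavy missingness: 35,482 filled values out of 89,010 rows for 1960 is only about 40%. One way to quantify this per year is sketched below on a small made-up frame. The `toy` data and the rule of treating all-digit column names as years are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame with two year columns; the values are illustrative only.
toy = pd.DataFrame({
    "Indicator Code": ["A", "B", "C", "D"],
    "1960": [1.0, np.nan, np.nan, np.nan],
    "1961": [1.0, 2.0, np.nan, np.nan],
})

# Assumption: year columns are exactly those whose names are all digits.
year_cols = [col for col in toy.columns if col.isdigit()]

# The mean of a boolean mask is the fraction of missing values per column.
missing_share = toy[year_cols].isna().mean()
print(missing_share)
```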

In [9]:
health_data.transpose()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,89000,89001,89002,89003,89004,89005,89006,89007,89008,89009
Country Name,Arab World,Arab World,Arab World,Arab World,Arab World,Arab World,Arab World,Arab World,Arab World,Arab World,...,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe,Zimbabwe
Country Code,ARB,ARB,ARB,ARB,ARB,ARB,ARB,ARB,ARB,ARB,...,ZWE,ZWE,ZWE,ZWE,ZWE,ZWE,ZWE,ZWE,ZWE,ZWE
Indicator Name,% of females ages 15-49 having comprehensive c...,% of males ages 15-49 having comprehensive cor...,"Adolescent fertility rate (births per 1,000 wo...",Adults (ages 15+) and children (0-14 years) li...,Adults (ages 15+) and children (ages 0-14) new...,Adults (ages 15+) living with HIV,Adults (ages 15+) newly infected with HIV,"Age at first marriage, female","Age at first marriage, male",Age dependency ratio (% of working-age populat...,...,Urban population,Urban population (% of total),Urban population growth (annual %),Urban poverty headcount ratio at national pove...,Use of any antimalarial drug (% of pregnant wo...,Use of insecticide-treated bed nets (% of unde...,Use of Intermittent Preventive Treatment of ma...,Vitamin A supplementation coverage rate (% of ...,Wanted fertility rate (births per woman),Women's share of population ages 15+ living wi...
Indicator Code,SH.HIV.KNOW.FE.ZS,SH.HIV.KNOW.MA.ZS,SP.ADO.TFRT,SH.HIV.TOTL,SH.HIV.INCD.TL,SH.DYN.AIDS,SH.HIV.INCD,SP.DYN.SMAM.FE,SP.DYN.SMAM.MA,SP.POP.DPND,...,SP.URB.TOTL,SP.URB.TOTL.IN.ZS,SP.URB.GROW,SI.POV.URHC,SH.MLR.PREG.ZS,SH.MLR.NETS.ZS,SH.MLR.SPF2.ZS,SN.ITK.VITA.ZS,SP.DYN.WFRT,SH.DYN.AIDS.FE.ZS
1960,,,133.555,,,,,,,87.7992,...,473101,12.608,4.89775,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2012,,,49.3837,,,,,,,61.6819,...,4.78243e+06,32.834,1.60077,,,,,61,,58.9391
2013,,,48.7966,,,,,,,61.6787,...,4.86482e+06,32.654,1.70815,,,,,34,,58.9001
2014,,,48.1964,,,,,,,61.7198,...,4.95506e+06,32.501,1.83779,,,26.8,12.9,32,,58.8223
2015,,,,,,,,,,61.7542,...,5.05155e+06,32.376,1.92863,,,,,,,58.8556
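
Transposing turns every row into a column, which is hard to work with at 89,010 columns. An alternative worth considering is reshaping the year columns into a long format with `DataFrame.melt`. The sketch below uses a made-up `toy` frame with abbreviated indicator names and rounded values, and the `Year` and `Value` column names are illustrative choices.

```python
import numpy as np
import pandas as pd

# Toy frame in the same wide layout; names are abbreviated, values illustrative.
toy = pd.DataFrame({
    "Country Name": ["Arab World", "Zimbabwe"],
    "Country Code": ["ARB", "ZWE"],
    "Indicator Name": ["Adolescent fertility rate", "Wanted fertility rate"],
    "Indicator Code": ["SP.ADO.TFRT", "SP.DYN.WFRT"],
    "2013": [48.8, np.nan],
    "2014": [48.2, np.nan],
})

id_cols = ["Country Name", "Country Code", "Indicator Name", "Indicator Code"]

# One row per (country, indicator, year); rows without a value are dropped.
long_form = (
    toy.melt(id_vars=id_cols, var_name="Year", value_name="Value")
       .dropna(subset=["Value"])
       .astype({"Year": int})
       .reset_index(drop=True)
)
print(long_form)
```

In long format, filtering by indicator and plotting a value over time becomes a plain row selection rather than a column lookup.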


In [11]:
health_data['Indicator Name'].unique()

array(['% of females ages 15-49 having comprehensive correct knowledge about HIV (2 prevent ways and reject 3 misconceptions)',
       '% of males ages 15-49 having comprehensive correct knowledge about HIV (2 prevent ways and reject 3 misconceptions)',
       'Adolescent fertility rate (births per 1,000 women ages 15-19)',
       'Adults (ages 15+) and children (0-14 years) living with HIV',
       'Adults (ages 15+) and children (ages 0-14) newly infected with HIV',
       'Adults (ages 15+) living with HIV',
       'Adults (ages 15+) newly infected with HIV',
       'Age at first marriage, female', 'Age at first marriage, male',
       'Age dependency ratio (% of working-age population)',
       'Age dependency ratio, old', 'Age dependency ratio, young',
       'Age population, age 0, female, interpolated',
       'Age population, age 0, male, interpolated',
       'Age population, age 01, female, interpolated',
       'Age population, age 01, male, interpolated',
       'Age popula