<!-- PROJECT LOGO -->
<br/>
<p align="center">
    <img src="https://raw.githubusercontent.com/Team-17-Bedu/proyecto_python/main/img/icono.png" alt="Logo" width="135" height="135">

  <h3 align="center"><strong>Countries with the Highest Quality of Life</strong></h3>

  <p align="center">
    Final project for the module "Procesamiento de datos con Python" (Data Processing with Python)
  </p>
</p>

#### About the project:

This project analyzes the quality of life in a set of countries in order to determine which of them could be the best option for living or looking for a job. The analysis relies on statistical and computational tools, with the Python programming language as the main one.

The main objective is to calculate the HDI (Human Development Index, IDH in Spanish) by following the documentation provided with the __*Human Development Report 2020*__, available at the following [link](http://hdr.undp.org/sites/default/files/hdr2020_technical_notes.pdf). The results are then used to support our previous project, written in R, which is available at the following [link](https://github.com/Team-17-Bedu/proyecto).

All indicators are obtained through the API provided by the [UNDP](http://ec2-54-174-131-205.compute-1.amazonaws.com/API/Login.php). The following indicators are analyzed:
* Expected years of schooling
* Gross national income (GNI) per capita
* Life expectancy at birth
* Mean years of schooling
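
For reference, the four indicators map to the numeric identifiers that the later sections pass to the API. The mapping below is only a convenience sketch: the IDs are the ones used in this notebook, while the dictionary name `INDICATOR_IDS` and its keys are hypothetical labels chosen for illustration.

```python
# Hypothetical lookup table; the IDs are the ones requested later in this notebook.
INDICATOR_IDS = {
    "expected_years_of_schooling": "69706",
    "mean_years_of_schooling": "103006",
    "life_expectancy_at_birth": "69206",
    "gni_per_capita": "195706",
}

for name, indicator_id in INDICATOR_IDS.items():
    print(f"{name}: {indicator_id}")
```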

<br/>
<p align="center">
    <img src="https://raw.githubusercontent.com/Team-17-Bedu/proyecto_python/main/img/logo-undp.jpg">
</p>

### Preparation:
To begin, the required modules were imported. (Because of GitHub's limitations the widgets cannot be rendered; to see them, view the notebook in Google Colab.)


In [10]:
import requests
import pandas as pd
import numpy as np
import ipywidgets as widgets
from IPython.display import display

Because access to the API requires logging in, the `Session` object from the `requests` module was used.

In [11]:
session = requests.Session()

Two text boxes were defined with the `widgets` module to enter the authentication data, `user` and `password` respectively.

In [12]:
user = widgets.Text()
password = widgets.Text()
display(user, password)

Text(value='')

Text(value='')

Definition of the tuple that holds the access credentials

In [13]:
access = (user.value, password.value)

Logging in to the UNDP API

In [14]:
session.get("http://ec2-54-174-131-205.compute-1.amazonaws.com/API/Login.php",
            auth=access)

<Response [200]>

Because the `json` received in the response has a somewhat chaotic structure, an algorithm was needed to normalize the data into a homogeneous `DataFrame`. The function `normalize_data` was defined for this purpose.

In [15]:
def normalize_data(data_dirt: dict, indicator_id: str) -> dict:
    # One 'Country' column plus one column per year
    new_data = {
        'Country': list(data_dirt['country_name'].values())
    }
    # Keys of the country mapping, in the same order as the names above
    countries = list(data_dirt['country_name'].keys())
    for year, year_data in data_dirt['indicator_value'].items():
        values = []
        last_key = ''
        for key, val in year_data[indicator_id].items():
            current_pos = countries.index(key)
            last_pos = countries.index(last_key) if last_key != '' else 0
            last_key = key
            # Pad with zeros the countries that have no value between
            # the previously seen country and the current one
            zeros = current_pos - last_pos
            i = 1
            while i < zeros and zeros != 0:
                values.append(0)
                i += 1
            values.append(val)
        new_data[year] = values
    return new_data
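
The padding logic above depends on the order in which countries appear in each year. As an alternative, the sketch below looks every country up directly and falls back to 0 when a value is missing. It assumes the same response shape that `normalize_data` reads (`country_name` mapping keys to names, and `indicator_value` nested by year, indicator and country key). The function name `normalize_by_lookup`, the three-letter keys and the sample payload are hypothetical.

```python
def normalize_by_lookup(data_dirt: dict, indicator_id: str) -> dict:
    """Build one 'Country' column plus one column per year, filling gaps with 0."""
    keys = list(data_dirt['country_name'].keys())
    new_data = {'Country': [data_dirt['country_name'][k] for k in keys]}
    for year, year_data in data_dirt['indicator_value'].items():
        by_country = year_data.get(indicator_id, {})
        # Every column has exactly one entry per country, in the same order
        new_data[year] = [by_country.get(k, 0) for k in keys]
    return new_data


# Made-up payload: Angola has no value for 1990
sample = {
    'country_name': {'AFG': 'Afghanistan', 'AGO': 'Angola', 'ALB': 'Albania'},
    'indicator_value': {
        '1990': {'69706': {'AFG': 2.593, 'ALB': 11.603}},
        '1991': {'69706': {'AFG': 2.919, 'AGO': 3.252, 'ALB': 11.764}},
    },
}
result = normalize_by_lookup(sample, '69706')
print(result['1990'])  # [2.593, 0, 11.603]
```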

The function `get_data` builds the request URL with an f-string, since the URL for data requests has a different syntax from the login URL. It then requests the data and passes it through the normalizer defined above.

In [16]:
def get_data(indicator: str) -> dict:
    url = f'http://ec2-54-174-131-205.compute-1.amazonaws.com/API/HDRO_API.php/indicator_id={indicator}/structure=yic'
    return normalize_data(session.get(url).json(), indicator)
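
A request can fail or time out, and in that case `.json()` either raises or returns something unexpected. The sketch below separates URL construction from the request and checks the HTTP status first. `build_url` and `fetch_indicator` are hypothetical helper names; `raise_for_status` and the `timeout` argument are standard parts of `requests`, and the 30 second timeout is an arbitrary choice.

```python
BASE_URL = 'http://ec2-54-174-131-205.compute-1.amazonaws.com/API/HDRO_API.php'


def build_url(indicator: str) -> str:
    # Same URL pattern that get_data builds
    return f'{BASE_URL}/indicator_id={indicator}/structure=yic'


def fetch_indicator(session, indicator: str, timeout: float = 30.0) -> dict:
    """Request one indicator and return the decoded JSON, failing loudly on HTTP errors."""
    response = session.get(build_url(indicator), timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    return response.json()
```

With the `session` created earlier, `normalize_data(fetch_indicator(session, '69706'), '69706')` would then play the role of `get_data('69706')`.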

### Calculation of the Education Index
#### Expected years of schooling
Retrieval of the records for the expected years of schooling indicator (`ID` = 69706)

In [17]:
data = get_data('69706')

Creation of the `DataFrame` from the normalized data and display of its `head` and `tail`

In [18]:
expectativa = pd.DataFrame(data)
expectativa

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,2.593,2.919,3.246,3.573,3.899,4.226,4.553,4.879,5.206,...,9.532,9.478,9.973,10.172,10.262,10.235,10.262,10.139,10.139,10.176
1,Angola,3.441,3.252,3.234,3.657,3.767,3.877,3.987,4.096,4.206,...,8.646,9.545,9.916,10.286,10.657,11.028,11.399,11.777,11.777,11.777
2,Albania,11.603,11.764,10.664,10.127,10.091,10.166,10.228,10.505,10.669,...,13.000,13.748,14.587,14.926,15.252,15.076,14.805,14.816,14.696,14.696
3,Andorra,10.799,10.799,10.799,10.799,10.799,10.799,10.799,10.799,10.799,...,11.672,11.672,13.524,13.139,13.495,13.140,13.300,13.046,13.300,13.300
4,United Arab Emirates,10.314,10.663,10.507,10.539,10.860,11.093,10.847,10.550,10.376,...,12.208,12.361,12.514,12.666,13.005,13.667,13.643,13.643,14.344,14.344
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
188,Samoa,11.655,11.699,11.744,11.788,11.832,11.877,11.921,11.832,12.010,...,12.859,12.508,12.380,12.458,12.535,12.433,12.431,12.520,12.520,12.730
189,Yemen,7.531,7.545,7.559,7.573,7.587,7.602,7.616,7.630,7.594,...,8.589,8.977,8.457,8.809,8.737,8.701,8.664,8.664,8.664,8.769
190,South Africa,11.371,11.946,12.315,12.684,12.752,13.038,13.022,13.007,12.992,...,12.808,12.793,12.872,13.147,13.406,13.765,13.668,13.668,13.668,13.791
191,Zambia,7.518,7.744,7.971,8.197,8.424,8.650,8.876,9.103,9.329,...,11.017,10.932,10.931,10.960,11.046,11.133,11.219,11.305,11.392,11.478


* Dimensions of the `DataFrame`
    * 193 rows
    * 31 columns

In [19]:
expectativa.shape

(193, 31)

* Column names

In [20]:
expectativa.columns

Index(['Country', '1990', '1991', '1992', '1993', '1994', '1995', '1996',
       '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005',
       '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014',
       '2015', '2016', '2017', '2018', '2019'],
      dtype='object')

#### Calculation of the expected years of schooling index by country

The calculation of the expected years of schooling index for each country is explained below. The formula used is:

$$\Huge\frac{\alpha - \theta}{\gamma - \theta}$$

Where:

* $\alpha$ : The expected years of schooling in the country.
* $\theta$ : The minimum expected years of schooling. The __*Human Development Report 2020*__ specifies 0 years.
* $\gamma$ : The maximum expected years of schooling. The __*Human Development Report 2020*__ specifies 18 years.

This calculation was performed for every country in each of the years.

In [21]:
for year in range(1990, 2020):
    year = str(year)
    indices = []
    for cell in expectativa[year]:
        try:
            indices.append((float(cell) - 0) / (18 - 0))
        except (TypeError, ValueError):
            indices.append(0)
    expectativa[f'Indice_Expect_Educacion_{year}'] = indices

expectativa = expectativa.drop([str(i) for i in range(1990, 2020)],axis=1)
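
The same min-max scaling can be written without explicit loops. This is a sketch rather than the notebook's method: `add_dimension_index` is a hypothetical helper, and it uses `pd.to_numeric(..., errors='coerce')` so that cells that cannot be parsed end up with an index of 0, mirroring the `except` branch above. The `'n/a'` cell in the sample is made up to exercise that path.

```python
import pandas as pd


def add_dimension_index(df, years, minimum, maximum, prefix):
    """Replace the raw year columns with (value - minimum) / (maximum - minimum)."""
    values = df[years].apply(pd.to_numeric, errors='coerce')
    # Unparseable cells become NaN above and are mapped to an index of 0 here
    indices = ((values - minimum) / (maximum - minimum)).fillna(0.0)
    indices.columns = [f'{prefix}_{year}' for year in years]
    return pd.concat([df.drop(columns=years), indices], axis=1)


sample = pd.DataFrame({
    'Country': ['Afghanistan', 'Angola'],
    '2018': [10.139, 'n/a'],   # 'n/a' is a made-up unparseable cell
    '2019': [10.176, 11.777],
})
years = ['2018', '2019']
scaled = add_dimension_index(sample, years, 0, 18, 'Indice_Expect_Educacion')
print(scaled)
```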

In [61]:
expectativa

Unnamed: 0,Country,Indice_Expect_Educacion_1990,Indice_Expect_Educacion_1991,Indice_Expect_Educacion_1992,Indice_Expect_Educacion_1993,Indice_Expect_Educacion_1994,Indice_Expect_Educacion_1995,Indice_Expect_Educacion_1996,Indice_Expect_Educacion_1997,Indice_Expect_Educacion_1998,...,Indice_Expect_Educacion_2010,Indice_Expect_Educacion_2011,Indice_Expect_Educacion_2012,Indice_Expect_Educacion_2013,Indice_Expect_Educacion_2014,Indice_Expect_Educacion_2015,Indice_Expect_Educacion_2016,Indice_Expect_Educacion_2017,Indice_Expect_Educacion_2018,Indice_Expect_Educacion_2019
0,Afghanistan,0.144056,0.162167,0.180333,0.198500,0.216611,0.234778,0.252944,0.271056,0.289222,...,0.529556,0.526556,0.554056,0.565111,0.570111,0.568611,0.570111,0.563278,0.563278,0.565333
1,Angola,0.191167,0.180667,0.179667,0.203167,0.209278,0.215389,0.221500,0.227556,0.233667,...,0.480333,0.530278,0.550889,0.571444,0.592056,0.612667,0.633278,0.654278,0.654278,0.654278
2,Albania,0.644611,0.653556,0.592444,0.562611,0.560611,0.564778,0.568222,0.583611,0.592722,...,0.722222,0.763778,0.810389,0.829222,0.847333,0.837556,0.822500,0.823111,0.816444,0.816444
3,Andorra,0.599944,0.599944,0.599944,0.599944,0.599944,0.599944,0.599944,0.599944,0.599944,...,0.648444,0.648444,0.751333,0.729944,0.749722,0.730000,0.738889,0.724778,0.738889,0.738889
4,United Arab Emirates,0.573000,0.592389,0.583722,0.585500,0.603333,0.616278,0.602611,0.586111,0.576444,...,0.678222,0.686722,0.695222,0.703667,0.722500,0.759278,0.757944,0.757944,0.796889,0.796889
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
188,Samoa,0.647500,0.649944,0.652444,0.654889,0.657333,0.659833,0.662278,0.657333,0.667222,...,0.714389,0.694889,0.687778,0.692111,0.696389,0.690722,0.690611,0.695556,0.695556,0.707222
189,Yemen,0.418389,0.419167,0.419944,0.420722,0.421500,0.422333,0.423111,0.423889,0.421889,...,0.477167,0.498722,0.469833,0.489389,0.485389,0.483389,0.481333,0.481333,0.481333,0.487167
190,South Africa,0.631722,0.663667,0.684167,0.704667,0.708444,0.724333,0.723444,0.722611,0.721778,...,0.711556,0.710722,0.715111,0.730389,0.744778,0.764722,0.759333,0.759333,0.759333,0.766167
191,Zambia,0.417667,0.430222,0.442833,0.455389,0.468000,0.480556,0.493111,0.505722,0.518278,...,0.612056,0.607333,0.607278,0.608889,0.613667,0.618500,0.623278,0.628056,0.632889,0.637667


#### Mean years of schooling
Retrieval of the records for the mean years of schooling indicator (`ID` = 103006)

In [23]:
data = get_data('103006')

Creation of the `DataFrame` from the normalized data and display of its `head` and `tail`

In [24]:
promedio = pd.DataFrame(data)
promedio

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,1.490,1.564,1.638,1.712,1.786,1.860,1.920,1.980,2.040,...,3.230,3.310,3.390,3.470,3.550,3.630,3.630,3.780,3.930,3.930
1,Angola,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,...,4.726,4.726,4.788,4.851,4.915,4.980,5.100,5.125,5.174,5.174
2,Albania,7.830,7.828,7.826,7.824,7.822,8.027,8.175,8.323,8.471,...,9.292,9.958,10.025,10.025,10.025,10.025,10.025,10.055,10.055,10.146
3,Andorra,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,...,10.364,10.402,10.439,10.477,10.515,10.590,10.590,10.519,10.502,10.502
4,United Arab Emirates,5.623,5.915,6.207,6.500,6.792,7.084,7.323,7.562,7.802,...,9.883,10.036,10.189,10.342,10.495,10.648,10.923,12.111,12.111,12.111
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
185,Samoa,7.602,7.724,7.845,7.966,8.088,8.209,8.331,8.452,8.573,...,10.030,10.490,10.511,10.532,10.553,10.574,10.595,10.595,10.595,10.779
186,Yemen,0.290,0.362,0.434,0.506,0.578,0.650,0.760,0.870,0.980,...,2.600,2.800,3.000,3.000,3.000,3.000,3.000,3.000,3.200,3.200
187,South Africa,6.490,6.600,7.177,7.523,7.869,8.215,8.327,8.438,8.549,...,10.215,9.574,9.821,9.905,9.989,10.134,10.155,10.155,10.241,10.241
188,Zambia,4.680,4.952,5.224,5.496,5.768,6.040,6.008,5.976,5.944,...,6.600,6.674,6.748,6.822,6.769,6.864,6.960,7.056,7.104,7.152


* Dimensions of the `DataFrame`
    * 190 rows
    * 31 columns

In [25]:
promedio.shape

(190, 31)

* Column names

In [26]:
promedio.columns

Index(['Country', '1990', '1991', '1992', '1993', '1994', '1995', '1996',
       '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005',
       '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014',
       '2015', '2016', '2017', '2018', '2019'],
      dtype='object')

#### Calculation of the mean years of schooling index by country

The calculation of the mean years of schooling index for each country is explained below. The formula used is:

$$\Huge\frac{\alpha - \theta}{\gamma - \theta}$$

Where:

* $\alpha$ : The mean years of schooling in the country.
* $\theta$ : The minimum mean years of schooling. The __*Human Development Report 2020*__ specifies 0 years.
* $\gamma$ : The maximum mean years of schooling. The __*Human Development Report 2020*__ specifies 15 years.

This calculation was performed for every country in each of the years.

In [27]:
for year in range(1990, 2020):
    year = str(year)
    indices = []
    for cell in promedio[year]:
        try:
            indices.append((float(cell) - 0) / (15 - 0))
        except (TypeError, ValueError):
            indices.append(0)
    promedio[f'Indice_Promedio_Edu_{year}'] = indices

promedio = promedio.drop([str(i) for i in range(1990, 2020)], axis=1)

Result of the mean years of schooling index calculation for each country in each year.

In [60]:
promedio

Unnamed: 0,Country,Indice_Promedio_Edu_1990,Indice_Promedio_Edu_1991,Indice_Promedio_Edu_1992,Indice_Promedio_Edu_1993,Indice_Promedio_Edu_1994,Indice_Promedio_Edu_1995,Indice_Promedio_Edu_1996,Indice_Promedio_Edu_1997,Indice_Promedio_Edu_1998,...,Indice_Promedio_Edu_2010,Indice_Promedio_Edu_2011,Indice_Promedio_Edu_2012,Indice_Promedio_Edu_2013,Indice_Promedio_Edu_2014,Indice_Promedio_Edu_2015,Indice_Promedio_Edu_2016,Indice_Promedio_Edu_2017,Indice_Promedio_Edu_2018,Indice_Promedio_Edu_2019
0,Afghanistan,0.099333,0.104267,0.109200,0.114133,0.119067,0.124000,0.128000,0.132000,0.136000,...,0.215333,0.220667,0.226000,0.231333,0.236667,0.242000,0.242000,0.252000,0.262000,0.262000
1,Angola,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.315067,0.315067,0.319200,0.323400,0.327667,0.332000,0.340000,0.341667,0.344933,0.344933
2,Albania,0.522000,0.521867,0.521733,0.521600,0.521467,0.535133,0.545000,0.554867,0.564733,...,0.619467,0.663867,0.668333,0.668333,0.668333,0.668333,0.668333,0.670333,0.670333,0.676400
3,Andorra,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.690933,0.693467,0.695933,0.698467,0.701000,0.706000,0.706000,0.701267,0.700133,0.700133
4,United Arab Emirates,0.374867,0.394333,0.413800,0.433333,0.452800,0.472267,0.488200,0.504133,0.520133,...,0.658867,0.669067,0.679267,0.689467,0.699667,0.709867,0.728200,0.807400,0.807400,0.807400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
185,Samoa,0.506800,0.514933,0.523000,0.531067,0.539200,0.547267,0.555400,0.563467,0.571533,...,0.668667,0.699333,0.700733,0.702133,0.703533,0.704933,0.706333,0.706333,0.706333,0.718600
186,Yemen,0.019333,0.024133,0.028933,0.033733,0.038533,0.043333,0.050667,0.058000,0.065333,...,0.173333,0.186667,0.200000,0.200000,0.200000,0.200000,0.200000,0.200000,0.213333,0.213333
187,South Africa,0.432667,0.440000,0.478467,0.501533,0.524600,0.547667,0.555133,0.562533,0.569933,...,0.681000,0.638267,0.654733,0.660333,0.665933,0.675600,0.677000,0.677000,0.682733,0.682733
188,Zambia,0.312000,0.330133,0.348267,0.366400,0.384533,0.402667,0.400533,0.398400,0.396267,...,0.440000,0.444933,0.449867,0.454800,0.451267,0.457600,0.464000,0.470400,0.473600,0.476800


#### Calculation of the Education Index
Creation of an empty `DataFrame` to store all the results of the calculation

In [55]:
header = ['Country']
educacion = pd.DataFrame({}, columns=header)

#### Calculation of the Education Index by country

The calculation of the education index for each country is explained below. The formula used is:

$$\Huge\frac{\alpha + \beta}{2}$$

Where:

* $\alpha$ : The expected years of schooling index of the country.
* $\beta$ : The mean years of schooling index of the country.

This calculation was performed for every country in each of the years.


In [56]:
def calcular_indice(_pais, expect, prom):
    dicci = {
        "Country": _pais
    }
    prom, expect = prom[0][1:], expect[0][1:]
    indice = [(float(prom[i]) + float(expect[i])) / 2 for i in range(0, len(prom))]

    for i in range(1990, 2020):
        dicci[str(i)] = indice[i - 1990]
    return dicci


rows = []
for country in promedio["Country"]:
    expect = expectativa.loc[expectativa.Country == country].to_numpy()
    prom = promedio.loc[promedio.Country == country].to_numpy()
    if len(expect) > 0 and len(prom) > 0:
        rows.append(calcular_indice(country, expect, prom))

# DataFrame.append was removed in pandas 2.0; add all rows with a single concat
educacion = pd.concat([educacion, pd.DataFrame(rows)], ignore_index=True)
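
Aligning the two tables by country can also be delegated to pandas instead of looping over rows. The sketch below is an alternative under stated assumptions, not the notebook's code: `education_index` is a hypothetical name, it assumes country names are unique in both tables, and it takes the year from the last four characters of each column name.

```python
import pandas as pd


def education_index(expected: pd.DataFrame, mean: pd.DataFrame) -> pd.DataFrame:
    """Average the two schooling indices for the countries present in both tables."""
    e = expected.set_index('Country')
    m = mean.set_index('Country')
    # 'Indice_Expect_Educacion_2019' -> '2019', so both tables share column labels
    e.columns = [c[-4:] for c in e.columns]
    m.columns = [c[-4:] for c in m.columns]
    common = m.index.intersection(e.index)
    result = (e.loc[common] + m.loc[common]) / 2
    result.insert(0, 'Country', result.index)
    return result.reset_index(drop=True)


# Values for 2019 taken from the two index tables above; Angola is left out of
# the second table on purpose to show that unmatched countries are dropped
expected_2019 = pd.DataFrame({
    'Country': ['Afghanistan', 'Angola', 'Albania'],
    'Indice_Expect_Educacion_2019': [0.565333, 0.654278, 0.816444],
})
mean_2019 = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania'],
    'Indice_Promedio_Edu_2019': [0.262000, 0.676400],
})
combined = education_index(expected_2019, mean_2019)
print(combined)
```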


Result of the education index calculation for each country in each year.

In [59]:
educacion

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,0.121694,0.133217,0.144767,0.156317,0.167839,0.179389,0.190472,0.201528,0.212611,...,0.372444,0.373611,0.390028,0.398222,0.403389,0.405306,0.406056,0.407639,0.412639,0.413667
1,Angola,0.095583,0.090333,0.089833,0.101583,0.104639,0.107694,0.110750,0.113778,0.116833,...,0.397700,0.422672,0.435044,0.447422,0.459861,0.472333,0.486639,0.497972,0.499606,0.499606
2,Albania,0.583306,0.587711,0.557089,0.542106,0.541039,0.549956,0.556611,0.569239,0.578728,...,0.670844,0.713822,0.739361,0.748778,0.757833,0.752944,0.745417,0.746722,0.743389,0.746422
3,Andorra,0.299972,0.299972,0.299972,0.299972,0.299972,0.299972,0.299972,0.299972,0.299972,...,0.669689,0.670956,0.723633,0.714206,0.725361,0.718000,0.722444,0.713022,0.719511,0.719511
4,United Arab Emirates,0.473933,0.493361,0.498761,0.509417,0.528067,0.544272,0.545406,0.545122,0.548289,...,0.668544,0.677894,0.687244,0.696567,0.711083,0.734572,0.743072,0.782672,0.802144,0.802144
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
184,Samoa,0.577150,0.582439,0.587722,0.592978,0.598267,0.603550,0.608839,0.610400,0.619378,...,0.691528,0.697111,0.694256,0.697122,0.699961,0.697828,0.698472,0.700944,0.700944,0.712911
185,Yemen,0.218861,0.221650,0.224439,0.227228,0.230017,0.232833,0.236889,0.240944,0.243611,...,0.325250,0.342694,0.334917,0.344694,0.342694,0.341694,0.340667,0.340667,0.347333,0.350250
186,South Africa,0.532194,0.551833,0.581317,0.603100,0.616522,0.636000,0.639289,0.642572,0.645856,...,0.696278,0.674494,0.684922,0.695361,0.705356,0.720161,0.718167,0.718167,0.721033,0.724450
187,Zambia,0.364833,0.380178,0.395550,0.410894,0.426267,0.441611,0.446822,0.452061,0.457272,...,0.526028,0.526133,0.528572,0.531844,0.532467,0.538050,0.543639,0.549228,0.553244,0.557233


### Calculation of the Health Index
Retrieval of the records for the life expectancy at birth indicator (`ID` = 69206)

In [32]:
data = get_data('69206')

Creation of the `DataFrame` from the normalized data and display of its `head` and `tail`

In [33]:
salud = pd.DataFrame(data)
salud

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,50.331,50.999,51.641,52.256,52.842,53.398,53.924,54.424,54.906,...,61.028,61.553,62.054,62.525,62.966,63.377,63.763,64.130,64.486,64.83
1,Angola,45.306,45.271,45.230,45.201,45.201,45.246,45.350,45.519,45.763,...,55.350,56.330,57.236,58.054,58.776,59.398,59.925,60.379,60.782,61.15
2,Albania,71.836,71.803,71.802,71.860,71.992,72.205,72.495,72.838,73.208,...,76.562,76.914,77.252,77.554,77.813,78.025,78.194,78.333,78.458,78.57
3,Andorra,76.517,76.682,76.854,77.030,77.213,77.414,77.644,77.909,78.207,...,80.818,80.935,81.054,81.173,81.294,81.416,81.540,81.663,81.786,81.91
4,United Arab Emirates,71.939,72.208,72.466,72.715,72.957,73.194,73.428,73.657,73.883,...,76.332,76.521,76.711,76.903,77.095,77.285,77.470,77.647,77.814,77.97
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
186,Samoa,66.281,66.470,66.655,66.842,67.037,67.249,67.491,67.764,68.067,...,71.663,71.906,72.136,72.351,72.549,72.730,72.895,73.046,73.187,73.32
187,Yemen,57.346,57.730,58.047,58.318,58.566,58.817,59.096,59.415,59.782,...,65.549,65.768,65.920,66.016,66.066,66.085,66.087,66.086,66.096,66.13
188,South Africa,63.307,63.384,63.247,62.894,62.331,61.561,60.595,59.489,58.315,...,57.669,58.895,60.060,61.099,61.968,62.649,63.153,63.538,63.857,64.13
189,Zambia,49.249,48.125,46.987,45.919,44.983,44.242,43.735,43.461,43.413,...,55.655,57.126,58.502,59.746,60.831,61.737,62.464,63.043,63.510,63.89


* Dimensions of the `DataFrame`
    * 191 rows
    * 31 columns

In [34]:
salud.shape

(191, 31)

* Column names

In [35]:
salud.columns

Index(['Country', '1990', '1991', '1992', '1993', '1994', '1995', '1996',
       '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005',
       '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014',
       '2015', '2016', '2017', '2018', '2019'],
      dtype='object')

#### Calculation of the Health Index by country

The calculation of the health index for each country is explained below. The formula used is:

$$\Huge\frac{\alpha - \theta}{\gamma - \theta}$$

Where:

* $\alpha$ : The life expectancy at birth in the country.
* $\theta$ : The minimum life expectancy. The __*Human Development Report 2020*__ specifies 20 years.
* $\gamma$ : The maximum life expectancy. The __*Human Development Report 2020*__ specifies 85 years.

This calculation was performed for every country in each of the years.
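
As a quick check of the formula, the sketch below applies it to one value from the table above, Afghanistan's life expectancy in 2019 (64.83 years). The helper name `dimension_index` is hypothetical.

```python
def dimension_index(value: float, minimum: float, maximum: float) -> float:
    """Min-max scaling used for the HDI dimension indices."""
    return (value - minimum) / (maximum - minimum)


# Life expectancy of 64.83 years with goalposts of 20 and 85 years
health = dimension_index(64.83, 20, 85)
print(round(health, 6))  # 0.689692
```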

In [36]:
# Calculation of the health index
for i in range(1990,2020):
    ind = str(i)
    salud[ind] = (salud[ind]-20)/(85-20)

Result of the health index calculation for each country in each year.

In [58]:
salud

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,0.466631,0.476908,0.486785,0.496246,0.505262,0.513815,0.521908,0.529600,0.537015,...,0.631200,0.639277,0.646985,0.654231,0.661015,0.667338,0.673277,0.678923,0.684400,0.689692
1,Angola,0.389323,0.388785,0.388154,0.387708,0.387708,0.388400,0.390000,0.392600,0.396354,...,0.543846,0.558923,0.572862,0.585446,0.596554,0.606123,0.614231,0.621215,0.627415,0.633077
2,Albania,0.797477,0.796969,0.796954,0.797846,0.799877,0.803154,0.807615,0.812892,0.818585,...,0.870185,0.875600,0.880800,0.885446,0.889431,0.892692,0.895292,0.897431,0.899354,0.901077
3,Andorra,0.869492,0.872031,0.874677,0.877385,0.880200,0.883292,0.886831,0.890908,0.895492,...,0.935662,0.937462,0.939292,0.941123,0.942985,0.944862,0.946769,0.948662,0.950554,0.952462
4,United Arab Emirates,0.799062,0.803200,0.807169,0.811000,0.814723,0.818369,0.821969,0.825492,0.828969,...,0.866646,0.869554,0.872477,0.875431,0.878385,0.881308,0.884154,0.886877,0.889446,0.891846
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
186,Samoa,0.712015,0.714923,0.717769,0.720646,0.723646,0.726908,0.730631,0.734831,0.739492,...,0.794815,0.798554,0.802092,0.805400,0.808446,0.811231,0.813769,0.816092,0.818262,0.820308
187,Yemen,0.574554,0.580462,0.585338,0.589508,0.593323,0.597185,0.601477,0.606385,0.612031,...,0.700754,0.704123,0.706462,0.707938,0.708708,0.709000,0.709031,0.709015,0.709169,0.709692
188,South Africa,0.666262,0.667446,0.665338,0.659908,0.651246,0.639400,0.624538,0.607523,0.589462,...,0.579523,0.598385,0.616308,0.632292,0.645662,0.656138,0.663892,0.669815,0.674723,0.678923
189,Zambia,0.449985,0.432692,0.415185,0.398754,0.384354,0.372954,0.365154,0.360938,0.360200,...,0.548538,0.571169,0.592338,0.611477,0.628169,0.642108,0.653292,0.662200,0.669385,0.675231


### Calculation of the Income Index
Retrieval of the records for the gross national income (GNI) per capita indicator (`ID` = 195706)

In [38]:
data = get_data('195706')

Creation of the `DataFrame` from the normalized data and display of its `head` and `tail`

In [39]:
ingreso = pd.DataFrame(data)
ingreso

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,2477.906,2059.950,1921.695,1320.535,950.897,1344.364,1223.441,1130.017,1057.778,...,1917.395,2013.614,2164.641,2229.907,2214.414,2128.162,2134.866,2229.658,2217.176,2229.362
1,Angola,4823.397,5380.463,2064.354,2024.850,1550.029,3396.198,3357.465,3962.468,3884.616,...,6913.161,6887.004,7282.050,7478.856,7704.368,7652.152,7189.032,6861.581,6360.551,6104.055
2,Albania,4937.523,3496.390,3207.627,3684.634,4102.593,4771.803,5252.751,4681.337,5115.353,...,10774.722,11237.447,11365.140,11806.358,11951.263,12273.473,12753.307,13071.095,13636.864,13998.300
3,Andorra,45393.316,44773.183,43487.951,41568.988,41401.230,41761.304,43356.786,47312.497,48964.850,...,49261.522,47366.247,47347.416,48486.415,50567.870,51779.832,53245.151,54371.345,55253.539,56000.303
4,United Arab Emirates,102433.229,96250.378,93043.562,92505.415,96784.515,101303.368,101490.336,103567.129,99915.860,...,54911.287,56152.975,57447.351,60007.281,62499.798,65528.563,66881.303,67667.530,67195.144,67462.095
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
186,Samoa,5938.879,5697.935,5433.785,5682.170,3700.967,3931.255,4370.056,4379.692,4477.627,...,6062.656,5969.899,5703.125,5693.886,5780.675,6237.246,6409.360,6287.699,6126.974,6308.652
187,Yemen,2958.129,2788.322,2762.329,2749.622,2681.535,2915.890,3232.027,3325.112,3504.726,...,4304.374,3493.180,3600.009,3607.262,3136.621,2194.613,1845.552,1619.640,1563.988,1593.704
188,South Africa,9974.994,9725.792,9312.598,9212.297,9312.299,9387.381,9582.037,9660.656,9537.183,...,12195.419,12346.453,12403.797,12523.410,12549.480,12528.094,12356.603,12322.437,12231.840,12129.230
189,Zambia,2015.630,1866.013,1818.721,1940.969,1755.783,1763.636,1836.819,1853.643,1787.271,...,2872.212,2873.238,3333.817,3389.723,3263.274,3403.717,3237.725,3330.785,3365.698,3325.519


* Dimensions of the `DataFrame`
    * 191 rows
    * 31 columns

In [40]:
ingreso.shape

(191, 31)

* Column names

In [41]:
ingreso.columns

Index(['Country', '1990', '1991', '1992', '1993', '1994', '1995', '1996',
       '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005',
       '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014',
       '2015', '2016', '2017', '2018', '2019'],
      dtype='object')

#### Calculation of the Income Index by country

The calculation of the income index for each country is explained below. The formula used is:

$$\Huge\frac{\log(\alpha) - \log(\theta)}{\log(\gamma) - \log(\theta)}$$

Where:

* $\alpha$ : The gross national income (GNI) per capita of the country.
* $\theta$ : The minimum gross national income per capita. The __*Human Development Report 2020*__ specifies 100.
* $\gamma$ : The maximum gross national income per capita. The __*Human Development Report 2020*__ specifies 75,000.

This calculation was performed for every country in each of the years.
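
Because income is scaled on a logarithmic axis, a worked value helps. The sketch below uses natural logarithms and the goalposts of 100 and 75,000 that the code in this section uses, applied to Afghanistan's 2019 GNI per capita from the table above (2229.362). The name `income_index` and the optional `cap` flag are hypothetical; capping is shown only as one possible way to keep the index at or below 1 and is not something the notebook does.

```python
import math


def income_index(gni: float, minimum: float = 100.0, maximum: float = 75000.0,
                 cap: bool = False) -> float:
    """Log-scaled min-max index for GNI per capita."""
    if cap:
        # Optional assumption: treat incomes above the maximum as equal to it
        gni = min(gni, maximum)
    return (math.log(gni) - math.log(minimum)) / (math.log(maximum) - math.log(minimum))


print(round(income_index(2229.362), 6))     # Afghanistan, 2019
print(income_index(102433.229) > 1)         # True: this income exceeds the maximum goalpost
print(income_index(102433.229, cap=True))   # 1.0
```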

In [42]:
# Calculation of the income index
for i in range(1990,2020):
    ind = str(i)
    ingreso[ind] = (np.log(ingreso[ind])-np.log(100))/(np.log(75000)-np.log(100))

Result of the income index calculation for each country in each year.

In [43]:
ingreso

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,0.484889,0.456984,0.446489,0.389818,0.340213,0.392519,0.378282,0.366283,0.356304,...,0.446151,0.453547,0.464472,0.468959,0.467906,0.461905,0.462380,0.468942,0.468094,0.468922
1,Angola,0.585502,0.602011,0.457306,0.454388,0.414022,0.532508,0.530775,0.555802,0.552805,...,0.639874,0.639301,0.647727,0.651755,0.656242,0.655215,0.645785,0.638743,0.627289,0.621071
2,Albania,0.589034,0.536900,0.523879,0.544821,0.561052,0.583877,0.598383,0.580986,0.594379,...,0.706909,0.713261,0.714967,0.720721,0.722563,0.726582,0.732375,0.736093,0.742494,0.746445
3,Andorra,0.924151,0.922074,0.917674,0.910857,0.910246,0.911554,0.917218,0.930407,0.935592,...,0.936505,0.930578,0.930518,0.934109,0.940458,0.944036,0.948251,0.951413,0.953844,0.955872
4,United Arab Emirates,1.047088,1.037683,1.032565,1.031688,1.038519,1.045412,1.045691,1.048750,1.043329,...,0.952905,0.956283,0.959726,0.966311,0.972459,0.979607,0.982694,0.984459,0.983401,0.984000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
186,Samoa,0.616928,0.610671,0.603501,0.610253,0.545489,0.554608,0.570592,0.570925,0.574265,...,0.620043,0.617714,0.610809,0.610564,0.612849,0.624332,0.628444,0.625549,0.621638,0.626052
187,Yemen,0.511647,0.502717,0.501303,0.500606,0.496818,0.509475,0.525024,0.529313,0.537260,...,0.568304,0.536761,0.541311,0.541615,0.520497,0.466549,0.440382,0.420658,0.415377,0.418220
188,South Africa,0.695259,0.691437,0.684880,0.683244,0.684875,0.686088,0.689188,0.690422,0.688479,...,0.725618,0.727477,0.728177,0.729627,0.729941,0.729684,0.727602,0.727183,0.726069,0.724796
189,Zambia,0.453698,0.442048,0.438170,0.447997,0.432850,0.433524,0.439666,0.441043,0.435535,...,0.507195,0.507249,0.529708,0.532220,0.526477,0.532842,0.525290,0.529570,0.531145,0.529331


### Calculation of the Human Development Index (`HDI`)
Using the data obtained and calculated above, the `HDI` is now calculated. As with the Education Index, an empty `DataFrame` is created to hold the calculated index.

In [51]:
header = ['Country']
indice_desarrollo_humano = pd.DataFrame({}, columns=header)

#### Calculation of the Human Development Index by country

The calculation of the human development index for each country is explained below. The formula used is the geometric mean of the three dimension indices:

$$\Huge{HDI=\left(I_{Health} \cdot I_{Education} \cdot I_{Income}\right)^{\frac{1}{3}}}$$

Where:

* $HDI$ : The human development index.
* $I_{Health}$ : The health index.
* $I_{Education}$ : The education index.
* $I_{Income}$ : The income index.

This calculation was performed for every country in each of the years.
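
The aggregation step is a geometric mean, which the code below computes with `np.power`. As a sanity check, this sketch applies it in plain Python to Afghanistan's 2019 indices from the tables above (health 0.689692, education 0.413667, income 0.468922). The function name `hdi` is hypothetical.

```python
def hdi(health: float, education: float, income: float) -> float:
    """Geometric mean of the three dimension indices."""
    return (health * education * income) ** (1 / 3)


print(round(hdi(0.689692, 0.413667, 0.468922), 3))  # 0.511
```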

In [52]:
def IDH(_pais, salud, educacion, ingreso):
    dicci = {
        "Country": _pais
    }
    s, e, i = salud[0][1:], educacion[0][1:], ingreso[0][1:]
    idh = np.power((s * e * i), (1/3))

    for i in range(1990, 2020):
        dicci[str(i)] = idh[i - 1990]
    return dicci


rows = []
for country in educacion["Country"]:
    i_salud = salud.loc[salud.Country == country].to_numpy()
    i_educacion = educacion.loc[educacion.Country == country].to_numpy()
    i_ingreso = ingreso.loc[ingreso.Country == country].to_numpy()
    if len(i_salud) > 0 and len(i_educacion) > 0 and len(i_ingreso) > 0:
        rows.append(IDH(country, i_salud, i_educacion, i_ingreso))

# DataFrame.append was removed in pandas 2.0; add all rows with a single concat
indice_desarrollo_humano = pd.concat(
    [indice_desarrollo_humano, pd.DataFrame(rows)], ignore_index=True)

Result of the human development index calculation for each country in each year.

In [57]:
indice_desarrollo_humano

Unnamed: 0,Country,1990,1991,1992,1993,1994,1995,1996,1997,1998,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,0.301969,0.307349,0.315698,0.311545,0.306704,0.330741,0.335027,0.339391,0.343925,...,0.471596,0.476699,0.489384,0.496208,0.499687,0.499912,0.501872,0.506297,0.509414,0.511449
1,Angola,0.279302,0.276516,0.251700,0.261568,0.256099,0.281362,0.284079,0.291727,0.294718,...,0.517260,0.532542,0.544492,0.554749,0.564651,0.572441,0.577930,0.582449,0.581500,0.581311
2,Albania,0.649509,0.631199,0.614983,0.617664,0.623857,0.636526,0.645523,0.645403,0.655441,...,0.744501,0.763920,0.775068,0.781797,0.786781,0.787499,0.787709,0.790128,0.791796,0.794782
3,Andorra,0.622343,0.622481,0.622117,0.621213,0.621738,0.622763,0.624882,0.628824,0.631068,...,0.837209,0.836501,0.858384,0.856293,0.863242,0.861973,0.865613,0.863365,0.867289,0.868483
4,United Arab Emirates,0.734672,0.743620,0.746320,0.752571,0.764489,0.775088,0.776830,0.778562,0.779812,...,0.820365,0.826067,0.831772,0.838366,0.846887,0.859155,0.864289,0.880806,0.888589,0.889568
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
184,Samoa,0.632903,0.633538,0.633789,0.638886,0.618115,0.624299,0.633153,0.635029,0.640719,...,0.698499,0.700591,0.698044,0.699870,0.702575,0.707027,0.709533,0.709951,0.709095,0.715387
185,Yemen,0.400704,0.401410,0.403833,0.406271,0.407771,0.413770,0.421354,0.426049,0.431074,...,0.505966,0.505955,0.504070,0.509378,0.501877,0.483497,0.473814,0.466629,0.467713,0.470202
186,South Africa,0.627028,0.633859,0.642229,0.647862,0.650282,0.653436,0.650426,0.645951,0.639971,...,0.664029,0.664650,0.674878,0.684557,0.692735,0.701218,0.702650,0.704598,0.706892,0.709058
187,Zambia,0.420746,0.417393,0.415938,0.418700,0.413920,0.414861,0.415506,0.415946,0.415509,...,0.526983,0.534188,0.549419,0.557295,0.560509,0.568866,0.571398,0.577504,0.581570,0.583990


### Translation of the country names from English to Spanish
For convenience, all the names in the `Country` column were translated into Spanish. To do so, a new function was defined that uses the `google_trans_new` module.

In [67]:
from google_trans_new import google_translator

def translate(df: pd.DataFrame) -> pd.DataFrame:
    translator = google_translator()
    regs = df.Country.count()
    for i in range(0, regs):
        pais = translator.translate(df.Country[i], lang_tgt='es', lang_src='en')
        # The translator may return a list of candidates; keep the first one
        df.loc[i, 'Country'] = pais.strip() if not isinstance(pais, list) else pais[0].strip()
    return df
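
Each call to the translator is a network request, so repeated names are worth translating only once. The sketch below is independent of `google_trans_new`: it takes any callable that maps an English name to a Spanish one, and the names `translate_names` and `fake_translate` are hypothetical.

```python
from typing import Callable, Dict, List


def translate_names(names: List[str], translate_fn: Callable[[str], str]) -> List[str]:
    """Translate each name, calling translate_fn at most once per distinct name."""
    cache: Dict[str, str] = {}
    translated = []
    for name in names:
        if name not in cache:
            cache[name] = translate_fn(name).strip()
        translated.append(cache[name])
    return translated


# Stand-in for a real translator, used only to make the example runnable offline
def fake_translate(name: str) -> str:
    return {'South Africa': 'Sudáfrica ', 'Albania': 'Albania'}.get(name, name)


print(translate_names(['South Africa', 'Albania', 'South Africa'], fake_translate))
```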

The function above is then called with the `DataFrame` named `indice_desarrollo_humano` as its argument.

In [68]:
indice_desarrollo_humano = translate(indice_desarrollo_humano)

### Saving the calculated indices to a CSV file
Once all the planned procedures are complete, the indices are saved to a new `.csv` file.

In [69]:
indice_desarrollo_humano.to_csv('HDI_data.csv', index=False)
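
To confirm that the export can be read back, a file with this layout (a `Country` column followed by one column per year) can be loaded with the standard `csv` module. The function name `load_hdi_csv` is hypothetical, and the example writes a small stand-in file to a temporary directory instead of relying on `HDI_data.csv` being present.

```python
import csv
import os
import tempfile


def load_hdi_csv(path: str) -> dict:
    """Return {country: {year: value}} from a CSV with a 'Country' column."""
    table = {}
    with open(path, newline='', encoding='utf-8') as handle:
        for row in csv.DictReader(handle):
            country = row.pop('Country')
            table[country] = {year: float(value) for year, value in row.items()}
    return table


# Stand-in file with two values taken from the final table above
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, 'HDI_sample.csv')
    with open(sample_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(['Country', '2018', '2019'])
        writer.writerow(['Afghanistan', 0.509414, 0.511449])
    loaded = load_hdi_csv(sample_path)

print(loaded['Afghanistan']['2019'])  # 0.511449
```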