## ***Basic*** and ***safely*** managed ***sanitation*** services in different countries

***

The work consists of a detailed analysis of data imported from an **Open Data source** about **worldwide phenomena**.

**SOURCE:** WHO Global Health Data Repository: [https://apps.who.int/gho/data/node.home]

From **6.2** SDG health and health-related targets **(Sanitation and hygiene)** in the section Water and sanitation, concatenated to the dataframe of the Health Equity Monitor section.

### **ANALYSIS GOAL**

***The analysis aims to identify social stratification and the health inequalities that arise from different socio-economic and cultural conditions***

The social class a population belongs to is also associated with life expectancy.
The analysis aims to show that the middle and upper classes have higher life expectancy and greater access to economic resources, and that the difference between urban and agricultural work is decisive for lifestyle and for the likelihood of falling ill or enjoying good health.

A further goal is to relate socio-cultural factors to economic ones, highlighting the correlation between education level and a person's life expectancy and lifestyle.
Social classes with more material (economic) resources are expected to have greater access to non-material ones as well (sanitation services).

1. **Libraries to import**

In [427]:
import numpy as np
import pandas as pd

2. After **downloading the complete data set as a CSV table**, I **saved it locally** and **displayed the file** with **head()**, which shows the first n rows of the dataframe (5 by default). The number of rows can be specified, for example `head(20)`.

In [428]:
df_sanitation = pd.read_csv(r"C:\Users\franc\Documents\UNIMI\COMPUTER SOCIETY\WSH_SANITATION_SAFELY_MANAGED,WSH_SANITATION_BASIC.csv")

In [429]:
df_sanitation.head(20) 

Unnamed: 0.1,Unnamed: 0,2017,2017.1,2017.2,2017.3,2017.4,2017.5,2016,2016.1,2016.2,...,2001.2,2001.3,2001.4,2001.5,2000,2000.1,2000.2,2000.3,2000.4,2000.5
0,,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...
1,Country,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural,...,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural
2,Afghanistan,43,62,37,,,,42,60,36,...,22,,,,24,30,22,,,
3,Albania,98,98,97,40,40,39,98,98,97,...,83,38,40,37,88,97,82,39,40,38
4,Algeria,88,90,82,18,16,21,88,90,82,...,73,19,18,19,84,91,72,19,18,19
5,Andorra,100,100,100,100,100,100,100,100,100,...,100,22,22,22,100,100,100,15,15,15
6,Angola,50,64,23,,,,49,63,22,...,8,,,,28,48,8,,,
7,Antigua and Barbuda,88,,,,,,88,,,...,,,,,82,,,,,
8,Argentina,,96,,,,,94,96,77,...,69,,,,87,89,69,,,
9,Armenia,94,100,83,48,45,,93,100,83,...,77,47,45,,87,92,77,47,45,


This is a two-way table showing, for each year from 2000 to 2017, the percentage of the population using sanitation services in the urban and rural areas of each country.

**Variables:**
1. the percentage of the population using at least basic sanitation services
2. the percentage of the population using safely managed sanitation services

3. the year variable (from 2000 to 2017)

4. the categorical variable Country; each value is further broken down by residence area into three sub-categories: Total, Urban and Rural.
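The indicator name and the Total/Urban/Rural labels appear above as ordinary data rows rather than as headers. A minimal sketch of one way to fold them into a three-level column index is shown here, assuming a small inline CSV that only mimics the layout of the WHO export; the shortened indicator labels (`Basic`, `Safely managed`) and all variable names are illustrative and not part of the original notebook.

```python
import io
import pandas as pd

# Tiny stand-in for the WHO export: three header rows, then one row per country.
csv_text = """,2017,2017,2016,2016
,Basic,Safely managed,Basic,Safely managed
Country,Total,Total,Total,Total
Albania,98,40,98,40
Algeria,88,18,88,18
"""

# Read everything as data first, so the header rows can be handled explicitly.
raw = pd.read_csv(io.StringIO(csv_text), header=None)

# Rows 0-2 (without the first column) become the three column levels.
header = raw.iloc[:3, 1:]
columns = pd.MultiIndex.from_arrays(
    header.to_numpy().tolist(), names=["Year", "Indicator", "Residence"]
)

# The remaining rows are the data, indexed by country.
body = raw.iloc[3:].set_index(0)
body.index.name = "Country"
body.columns = columns
tidy = body.astype(float)

print(tidy.loc["Albania", ("2017", "Basic", "Total")])
```

With the levels in place, a whole year or a whole residence category can be selected by label instead of by position.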

In [430]:
# check what kind of object the data is
type(df_sanitation)   # DataFrame = tabular data structure typical of the pandas library

pandas.core.frame.DataFrame

In [347]:
 # print() could have been used to show all the rows, but head() gives a more readable view

***

### Operations to tidy the dataframe

In [431]:
new = df_sanitation.rename(columns={'Unnamed: 0' : 'Years', '2017.1' : '2017', '2017.2' : '2017', '2017.3' : '2017', '2017.4' : '2017', '2017.5' : '2017', '2016.1' : '2016', '2016.2' : '2016', '2016.3' : '2016', '2016.4' : '2016', '2016.5' : '2016', '2001.1' : '2001', '2001.2' : '2001', '2001.3' : '2001', '2001.4' : '2001', '2001.5' : '2001', '2000.1' : '2000', '2000.2' : '2000', '2000.3' : '2000', '2000.4' : '2000', '2000.5' : '2000'})  # use rename to turn 'Unnamed: 0' into 'Years' and to strip the '.1' ... '.5' suffixes from the duplicated year columns
new

Unnamed: 0,Years,2017,2017.1,2017.2,2017.3,2017.4,2017.5,2016,2016.1,2016.2,...,2001,2001.1,2001.2,2001.3,2000,2000.1,2000.2,2000.3,2000.4,2000.5
0,,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...
1,Country,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural,...,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural
2,Afghanistan,43,62,37,,,,42,60,36,...,22,,,,24,30,22,,,
3,Albania,98,98,97,40,40,39,98,98,97,...,83,38,40,37,88,97,82,39,40,38
4,Algeria,88,90,82,18,16,21,88,90,82,...,73,19,18,19,84,91,72,19,18,19
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
191,Venezuela (Bolivarian Republic of),94,,,24,,,94,,,...,,,,,,,,,,
192,Viet Nam,84,94,78,,,,82,94,76,...,45,,,,52,81,43,,,
193,Yemen,59,88,43,,67,,59,88,43,...,27,,67,,42,86,27,,67,
194,Zambia,26,36,19,,,,26,36,19,...,12,,,,24,46,12,,,
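Listing every suffixed column by hand becomes long when many years are involved. As a possible alternative, `rename` also accepts a function that is applied to each column label. The sketch below assumes a toy dataframe whose column names follow the same `year.suffix` pattern; the data values are placeholders.

```python
import pandas as pd

# Toy frame with the same kind of duplicated, suffixed year labels.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["Unnamed: 0", "2017", "2017.1", "2016.3"],
)

def strip_year_suffix(label):
    """Return '2017' for '2017.1'; leave non-year labels untouched."""
    return label.split(".")[0] if label[:4].isdigit() else label

renamed = df.rename(columns=strip_year_suffix)
print(list(renamed.columns))
```

The function only touches labels that start with four digits, so a label such as `Unnamed: 0` is left as it is and can still be renamed separately.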


In [432]:
new = new.set_index('Years')   # use Years as the index
new

Unnamed: 0_level_0,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,...,2001,2001,2001,2001,2000,2000,2000,2000,2000,2000
Years,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...
Country,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,...,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural
Afghanistan,43,62,37,,,,42,60,36,,...,22,,,,24,30,22,,,
Albania,98,98,97,40,40,39,98,98,97,40,...,83,38,40,37,88,97,82,39,40,38
Algeria,88,90,82,18,16,21,88,90,82,18,...,73,19,18,19,84,91,72,19,18,19
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Venezuela (Bolivarian Republic of),94,,,24,,,94,,,24,...,,,,,,,,,,
Viet Nam,84,94,78,,,,82,94,76,,...,45,,,,52,81,43,,,
Yemen,59,88,43,,67,,59,88,43,,...,27,,67,,42,86,27,,67,
Zambia,26,36,19,,,,26,36,19,,...,12,,,,24,46,12,,,


***

### **ANALYSIS OF NULL VALUES (NaN)**

Python, as used here with NumPy and pandas, represents missing data with two markers:

- **None**
- **NaN**

Let us display and examine the null values in the dataset in detail.

The ***isnull*** function generates a boolean mask that flags the missing values.

In [433]:
print(new.isnull())  # boolean mask: True where a value is missing
# some null values are present in certain rows, so missing values may cause problems in the analysis

                                     2017   2017   2017   2017   2017   2017  \
Years                                                                          
NaN                                 False  False  False  False  False  False   
Country                             False  False  False  False  False  False   
Afghanistan                         False  False  False   True   True   True   
Albania                             False  False  False  False  False  False   
Algeria                             False  False  False  False  False  False   
...                                   ...    ...    ...    ...    ...    ...   
Venezuela (Bolivarian Republic of)  False   True   True  False   True   True   
Viet Nam                            False  False  False   True   True   True   
Yemen                               False  False  False   True  False   True   
Zambia                              False  False  False   True   True   True   
Zimbabwe 
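The full mask is hard to read on a table this wide. A short sketch of how it can be summarised per column and per row follows, assuming a three-row toy dataframe built from the Afghanistan, Albania and Algeria 2017 totals shown earlier; the column names `basic` and `safely` are illustrative.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {"basic": [43.0, 98.0, 88.0], "safely": [np.nan, 40.0, 18.0]},
    index=["Afghanistan", "Albania", "Algeria"],
)

mask = toy.isnull()                    # True where a value is missing
has_missing = mask.any()               # one boolean per column (note the parentheses)
missing_per_row = mask.sum(axis=1)     # number of missing values in each row

print(has_missing)
print(missing_per_row)
```

Calling `any()` with parentheses is what returns the booleans; without them the expression only refers to the method itself.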

##### **No information is available about the export process; some data may have been deleted or lost during that operation**

It is worth reasoning both about the explicit, self-describing missing values and about the fact that they will likely influence the final result of the analysis.

**NaN (not a number)** is the missing-value marker for floating-point numbers and is itself treated as a float, unlike **None**, the generic Python object used as a missing marker for object data.

Skipping a careful analysis of missing data is a methodological error.



***What are the consequences of NaN values?***

There are two ways to proceed:

- explicitly replacing NaN with 0 (np.nansum computes the sum treating NaN as 0) or with another specified value (data.fillna(0));

- dropping the missing value (dropna()).

Deleting entire rows, possibly 50% of them, does not lead to reliable results, because it discards data that are missing for one variable but present for others.

##### It would therefore be preferable to keep only the rows whose mask is FALSE (value not missing) in a single column of interest.
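The options discussed above can be compared side by side. This is a minimal sketch on a toy dataframe; the values and the column names `basic` and `safely` are placeholders chosen only to show the behaviour of each call.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {"basic": [43.0, 98.0, np.nan], "safely": [np.nan, 40.0, 18.0]},
    index=["A", "B", "C"],
)

# Option 1: replace NaN explicitly, or sum while treating NaN as 0.
filled = toy.fillna(0)
total_safely = np.nansum(toy["safely"])

# Option 2a: keep rows that are complete in ONE column of interest.
basic_known = toy[toy["basic"].notnull()]   # same rows as toy.dropna(subset=["basic"])

# Option 2b: keep only rows that are complete in EVERY column.
all_complete = toy.dropna()

print(len(toy), len(basic_known), len(all_complete))
```

Filtering on a single column keeps two of the three rows here, while a blanket `dropna()` keeps only one, which illustrates how much information the stricter rule can discard.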

OR:

In [434]:
new.dropna(inplace=True) # alternatively, drop every row that contains at least one NaN and display the resulting dataset
new

Unnamed: 0_level_0,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,...,2001,2001,2001,2001,2000,2000,2000,2000,2000,2000
Years,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using at least basic sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...,Population using safely managed sanitation se...
Country,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,...,Rural,Total,Urban,Rural,Total,Urban,Rural,Total,Urban,Rural
Albania,98,98,97,40,40,39,98,98,97,40,...,83,38,40,37,88,97,82,39,40,38
Algeria,88,90,82,18,16,21,88,90,82,18,...,73,19,18,19,84,91,72,19,18,19
Andorra,100,100,100,100,100,100,100,100,100,100,...,100,22,22,22,100,100,100,15,15,15
Austria,100,100,100,97,100,92,100,100,100,97,...,100,97,100,92,100,100,100,97,100,92
Belarus,98,98,96,81,82,76,98,98,96,80,...,96,90,88,94,92,90,96,90,88,94
Canada,99,99,99,82,82,82,99,99,99,82,...,99,77,77,77,100,100,99,77,77,77
China,85,91,76,72,84,56,83,90,74,68,...,44,27,30,25,56,77,44,27,29,25
Czechia,99,99,99,94,98,85,99,99,99,94,...,99,84,86,76,99,99,99,84,86,76


***

### FUNCTIONS: TRANSPOSE AND RENAME

In [435]:
new = new.transpose()  # transpose the dataframe for a new view: 38 countries as columns and 109 rows referring to the years
new

Years,NaN,Country,Albania,Algeria,Andorra,Austria,Belarus,Canada,China,Czechia,...,Russian Federation,Samoa,Senegal,Sierra Leone,Slovakia,Spain,Sweden,Switzerland,United Kingdom of Great Britain and Northern Ireland,United Republic of Tanzania
2017,Population using at least basic sanitation se...,Total,98,88,100,100,98,99,85,99,...,90,98,51,16,98,100,99,100,99,30
2017,Population using at least basic sanitation se...,Urban,98,90,100,100,98,99,91,99,...,95,98,65,26,99,100,99,100,99,43
2017,Population using at least basic sanitation se...,Rural,97,82,100,100,96,99,76,99,...,78,98,40,9,97,100,100,100,99,24
2017,Population using safely managed sanitation se...,Total,40,18,100,97,81,82,72,94,...,61,48,21,13,83,97,93,100,98,25
2017,Population using safely managed sanitation se...,Urban,40,16,100,100,82,82,84,98,...,63,38,22,20,88,97,94,100,99,31
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2000,Population using at least basic sanitation se...,Urban,97,91,100,100,90,100,77,99,...,94,99,63,20,99,100,99,100,99,11
2000,Population using at least basic sanitation se...,Rural,82,72,100,100,96,99,44,99,...,55,97,23,5,97,100,100,100,99,3
2000,Population using safely managed sanitation se...,Total,39,19,15,97,90,77,27,84,...,55,49,14,9,84,94,92,98,97,4
2000,Population using safely managed sanitation se...,Urban,40,18,15,100,88,77,29,86,...,61,41,16,16,90,95,93,98,99,9


In [436]:
new1 = new.rename(columns={'Country' : 'Residence', 'NaN' : 'Sanitation'})
new1

Years,NaN,Residence,Albania,Algeria,Andorra,Austria,Belarus,Canada,China,Czechia,...,Russian Federation,Samoa,Senegal,Sierra Leone,Slovakia,Spain,Sweden,Switzerland,United Kingdom of Great Britain and Northern Ireland,United Republic of Tanzania
2017,Population using at least basic sanitation se...,Total,98,88,100,100,98,99,85,99,...,90,98,51,16,98,100,99,100,99,30
2017,Population using at least basic sanitation se...,Urban,98,90,100,100,98,99,91,99,...,95,98,65,26,99,100,99,100,99,43
2017,Population using at least basic sanitation se...,Rural,97,82,100,100,96,99,76,99,...,78,98,40,9,97,100,100,100,99,24
2017,Population using safely managed sanitation se...,Total,40,18,100,97,81,82,72,94,...,61,48,21,13,83,97,93,100,98,25
2017,Population using safely managed sanitation se...,Urban,40,16,100,100,82,82,84,98,...,63,38,22,20,88,97,94,100,99,31
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2000,Population using at least basic sanitation se...,Urban,97,91,100,100,90,100,77,99,...,94,99,63,20,99,100,99,100,99,11
2000,Population using at least basic sanitation se...,Rural,82,72,100,100,96,99,44,99,...,55,97,23,5,97,100,100,100,99,3
2000,Population using safely managed sanitation se...,Total,39,19,15,97,90,77,27,84,...,55,49,14,9,84,94,92,98,97,4
2000,Population using safely managed sanitation se...,Urban,40,18,15,100,88,77,29,86,...,61,41,16,16,90,95,93,98,99,9
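In the output above the first column is still labelled `NaN`: the dictionary key `'NaN'` is a string, so it does not match a label that is an actual missing value. One possible way to rename such a column is to test each label with `pd.isna`. The sketch below assumes a toy dataframe with a missing column label; the shortened `Basic` values are placeholders.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    [["Basic", "Total", 98], ["Basic", "Urban", 98]],
    columns=[np.nan, "Country", "Albania"],
)

# Replace a missing column label by testing each label, not by matching the text 'NaN'.
toy.columns = ["Sanitation" if pd.isna(label) else label for label in toy.columns]

# Ordinary string labels can still be renamed with a dictionary.
toy = toy.rename(columns={"Country": "Residence"})

print(list(toy.columns))
```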


#### This table shows that, on average, the highest percentage of each country's population, whether using at least basic sanitation services or safely managed ones, belongs to the urban group (not the rural one).
#### This suggests that a first contributing factor, despite minimal differences for some countries, is the place of residence: people who live in urban areas and are employed in healthier, better-paid jobs have greater access to health and sanitation services.

##### - these data are important for explaining this correlation.

***

### DESCRIPTIVE STATISTICS

In [437]:
frame = new1.iloc[:6, 2:6] # from the dataframe, select the six rows for 2017 and the four countries whose name starts with the letter A
frame

Years,Albania,Algeria,Andorra,Austria
2017,98,88,100,100
2017,98,90,100,100
2017,97,82,100,100
2017,40,18,100,97
2017,40,16,100,100
2017,39,21,100,92


In [438]:
Stati = ['Albania', 'Algeria', 'Andorra', 'Austria']
print(Stati)

['Albania', 'Algeria', 'Andorra', 'Austria']


In [439]:
Anno = [2017]
print(Anno)

[2017]


In [451]:
Sanitation = ['Basictotal', 'Safelytotal']
Sanitation

['Basictotal', 'Safelytotal']

In [452]:
Basictotal = [98, 88, 100, 100]   # 'Total' percentage of the population using at least basic sanitation services (urban and rural areas combined)

In [453]:
Safelytotal = [40, 18, 100, 97]   # 'Total' percentage of the population using safely managed sanitation services (urban and rural areas combined)

In [454]:
print(Basictotal)                
print(Safelytotal)

[98, 88, 100, 100]
[40, 18, 100, 97]
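The two lists were typed in by hand from the 2017 rows. As a sketch of how the same values could be selected programmatically, the example below rebuilds the six 2017 rows for the four countries and filters the `Total` rows. It assumes the indicator and residence labels are available as regular columns, and the short labels `basic` and `safely` are stand-ins for the full indicator names.

```python
import pandas as pd

# The six 2017 rows for the four selected countries, as shown in the table above.
rows_2017 = pd.DataFrame({
    "Sanitation": ["basic"] * 3 + ["safely"] * 3,
    "Residence": ["Total", "Urban", "Rural"] * 2,
    "Albania": [98, 98, 97, 40, 40, 39],
    "Algeria": [88, 90, 82, 18, 16, 21],
    "Andorra": [100, 100, 100, 100, 100, 100],
    "Austria": [100, 100, 100, 97, 100, 92],
})
countries = ["Albania", "Algeria", "Andorra", "Austria"]

# Keep only the 'Total' rows and index them by indicator.
totals = rows_2017[rows_2017["Residence"] == "Total"].set_index("Sanitation")[countries]

basic_total = totals.loc["basic"].to_numpy()
safely_total = totals.loc["safely"].to_numpy()

print(basic_total)
print(safely_total)
```

Selecting by label avoids transcription errors and keeps working if more countries are added to the list.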


In [455]:
print(type(Basictotal))           # print and inspect the data type: both are lists
print(type(Safelytotal))

<class 'list'>
<class 'list'>


In [456]:
Basictotal=np.array(Basictotal)     # now create a one-dimensional array from the selected values
Safelytotal=np.array(Safelytotal)
print(Basictotal)
print(Safelytotal)

[ 98  88 100 100]
[ 40  18 100  97]


In [457]:
print(type(Basictotal))
print(type(Safelytotal))

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>


In [477]:
print("Population using at least basic sanitation services (%):",    np.mean(Basictotal), "%")    # mean share of the population using at least basic sanitation services, followed by the mean share using safely managed ones
print("Population using safely managed sanitation services (%):",    np.mean(Safelytotal), "%")

Population using at least basic sanitation services (%): 96.5 %
Population using safely managed sanitation services (%): 63.75 %


In [481]:
print(np.var(Basictotal))   # compute the variance (population variance, ddof=0 by default)
print(np.var(Safelytotal))

24.75
1269.1875


In [484]:
print(np.std(Basictotal))  # compute the standard deviation (population form, ddof=0 by default)
print(np.std(Safelytotal))

4.9749371855331
35.62565788866221
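`np.var` and `np.std` divide by n by default, which treats the four countries as the whole population. If they are regarded as a sample instead, the `ddof=1` argument divides by n - 1. The sketch below reuses the basic-sanitation values from above; which convention is appropriate is a modelling choice that the notebook does not state.

```python
import numpy as np

basic_total = np.array([98, 88, 100, 100])

population_var = np.var(basic_total)          # divides by n
sample_var = np.var(basic_total, ddof=1)      # divides by n - 1
sample_std = np.std(basic_total, ddof=1)

print(population_var, sample_var, sample_std)
```

With only four observations the two conventions differ noticeably, so it helps to say which one a reported figure uses.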


In [522]:
print(np.percentile(Basictotal, 50))    # 50th percentile (second quartile), i.e. the median of the distribution
print(np.percentile(Safelytotal, 50))

99.0
68.5


In [523]:
print(np.percentile(Basictotal, 75))    # 75th percentile (third quartile) of the distribution
print(np.percentile(Safelytotal, 75))

100.0
97.75


### This analysis shows that, on average across the four selected countries, only ***63.75%*** of the total population (urban and rural residents together) benefits from safely managed sanitation services, whereas a much higher share, ***96.5%***, uses at least basic sanitation services.

***

#### **GRAPHICAL ANALYSIS OF THE DATAFRAME**

In [501]:
%matplotlib inline 
import matplotlib.pyplot as plt
import seaborn as sns

In [513]:
Stati = ['Albania', 'Algeria', 'Andorra', 'Austria']

In [514]:
Sanitation = ['Basictotal', 'Safelytotal']
Sanitation

['Basictotal', 'Safelytotal']

In [515]:
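plot_df = pd.DataFrame({'Stati': Stati * 2, 'Percent': np.concatenate([Basictotal, Safelytotal]), 'Sanitation': np.repeat(Sanitation, len(Stati))})  # long format: one row per (country, indicator) pair
sns.scatterplot(data = plot_df, x = 'Stati', y = 'Percent', hue = 'Sanitation')  # x and y must be column names of the dataframe passed as data
plt.show()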

In [519]:
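# sns.load_dataset() only downloads seaborn's own example datasets by name, so 'frame' is not found;
# a local dataframe is passed to the plotting function directly through the data argument
new1 = sns.load_dataset('frame')

HTTPError: HTTP Error 404: Not Found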

***

### **DATAFRAME CONCATENATION**

In [397]:
import pandas as pd
import numpy as np

In [106]:
df = pd.read_csv(r"C:\Users\franc\Documents\UNIMI\COMPUTER SOCIETY\LIFEEX.csv")  # import another dataframe
df

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Life expectancy at birth (years),Life expectancy at birth (years).1,Life expectancy at birth (years).2,Life expectancy at age 60 (years),Life expectancy at age 60 (years).1,Life expectancy at age 60 (years).2,Healthy life expectancy (HALE) at birth (years),Healthy life expectancy (HALE) at birth (years).1,Healthy life expectancy (HALE) at birth (years).2,Healthy life expectancy (HALE) at age 60 (years),Healthy life expectancy (HALE) at age 60 (years).1,Healthy life expectancy (HALE) at age 60 (years).2
0,WHO region,Year,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female
1,Global,2016,72.0,69.8,74.2,20.5,19.0,21.9,63.3,62.0,64.8,15.8,14.8,16.8
2,Global,2015,71.7,69.5,73.9,20.4,18.9,21.8,63.0,61.7,64.5,15.7,14.7,16.6
3,Global,2010,70.1,68.0,72.3,19.9,18.4,21.3,61.7,60.4,63.1,15.3,14.3,16.2
4,Global,2005,68.2,66.1,70.3,19.3,17.8,20.7,60.0,58.7,61.3,14.8,13.7,15.8
5,Global,2000,66.5,64.4,68.7,18.8,17.2,20.2,58.5,57.2,59.9,14.3,13.2,15.4
6,Africa,2016,61.2,59.6,62.7,16.6,15.9,17.3,53.8,52.6,54.9,12.5,12.0,13.1
7,Africa,2015,60.7,59.1,62.2,16.6,15.8,17.3,53.3,52.1,54.4,12.4,11.8,13.0
8,Africa,2010,57.6,56.4,58.8,16.1,15.4,16.7,50.4,49.6,51.3,12.0,11.5,12.4
9,Africa,2005,53.4,52.3,54.4,15.4,14.7,16.1,46.7,46.0,47.4,11.4,10.9,11.8


* First, a few operations to tidy the dataset

In [107]:
data = df.rename(columns={'Unnamed: 0' : 'Continent', 'Unnamed: 1' : 'Year'})  # rename the first and second columns by passing the old and new names as a dictionary
# the rename function is used here
data

Unnamed: 0,Continent,Year,Life expectancy at birth (years),Life expectancy at birth (years).1,Life expectancy at birth (years).2,Life expectancy at age 60 (years),Life expectancy at age 60 (years).1,Life expectancy at age 60 (years).2,Healthy life expectancy (HALE) at birth (years),Healthy life expectancy (HALE) at birth (years).1,Healthy life expectancy (HALE) at birth (years).2,Healthy life expectancy (HALE) at age 60 (years),Healthy life expectancy (HALE) at age 60 (years).1,Healthy life expectancy (HALE) at age 60 (years).2
0,WHO region,Year,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female
1,Global,2016,72.0,69.8,74.2,20.5,19.0,21.9,63.3,62.0,64.8,15.8,14.8,16.8
2,Global,2015,71.7,69.5,73.9,20.4,18.9,21.8,63.0,61.7,64.5,15.7,14.7,16.6
3,Global,2010,70.1,68.0,72.3,19.9,18.4,21.3,61.7,60.4,63.1,15.3,14.3,16.2
4,Global,2005,68.2,66.1,70.3,19.3,17.8,20.7,60.0,58.7,61.3,14.8,13.7,15.8
5,Global,2000,66.5,64.4,68.7,18.8,17.2,20.2,58.5,57.2,59.9,14.3,13.2,15.4
6,Africa,2016,61.2,59.6,62.7,16.6,15.9,17.3,53.8,52.6,54.9,12.5,12.0,13.1
7,Africa,2015,60.7,59.1,62.2,16.6,15.8,17.3,53.3,52.1,54.4,12.4,11.8,13.0
8,Africa,2010,57.6,56.4,58.8,16.1,15.4,16.7,50.4,49.6,51.3,12.0,11.5,12.4
9,Africa,2005,53.4,52.3,54.4,15.4,14.7,16.1,46.7,46.0,47.4,11.4,10.9,11.8


In [108]:
data.set_index('Continent')

Unnamed: 0_level_0,Year,Life expectancy at birth (years),Life expectancy at birth (years).1,Life expectancy at birth (years).2,Life expectancy at age 60 (years),Life expectancy at age 60 (years).1,Life expectancy at age 60 (years).2,Healthy life expectancy (HALE) at birth (years),Healthy life expectancy (HALE) at birth (years).1,Healthy life expectancy (HALE) at birth (years).2,Healthy life expectancy (HALE) at age 60 (years),Healthy life expectancy (HALE) at age 60 (years).1,Healthy life expectancy (HALE) at age 60 (years).2
Continent,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
WHO region,Year,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female,Both sexes,Male,Female
Global,2016,72.0,69.8,74.2,20.5,19.0,21.9,63.3,62.0,64.8,15.8,14.8,16.8
Global,2015,71.7,69.5,73.9,20.4,18.9,21.8,63.0,61.7,64.5,15.7,14.7,16.6
Global,2010,70.1,68.0,72.3,19.9,18.4,21.3,61.7,60.4,63.1,15.3,14.3,16.2
Global,2005,68.2,66.1,70.3,19.3,17.8,20.7,60.0,58.7,61.3,14.8,13.7,15.8
Global,2000,66.5,64.4,68.7,18.8,17.2,20.2,58.5,57.2,59.9,14.3,13.2,15.4
Africa,2016,61.2,59.6,62.7,16.6,15.9,17.3,53.8,52.6,54.9,12.5,12.0,13.1
Africa,2015,60.7,59.1,62.2,16.6,15.8,17.3,53.3,52.1,54.4,12.4,11.8,13.0
Africa,2010,57.6,56.4,58.8,16.1,15.4,16.7,50.4,49.6,51.3,12.0,11.5,12.4
Africa,2005,53.4,52.3,54.4,15.4,14.7,16.1,46.7,46.0,47.4,11.4,10.9,11.8


In [109]:
data.groupby(['Continent'])[['Life expectancy at birth (years)']].aggregate('max')

Unnamed: 0_level_0,Life expectancy at birth (years)
Continent,Unnamed: 1_level_1
Africa,61.2
Americas,76.8
Eastern Mediterranean,69.1
Europe,77.5
Global,72.0
South-East Asia,69.5
WHO region,Both sexes
Western Pacific,76.9
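The result above contains a `WHO region` group whose maximum is `Both sexes`: the sub-header row is still part of the data, so the column holds text and the maximum is computed on strings. A minimal sketch of dropping that row and converting to numbers before grouping is shown below, on a five-row excerpt that uses values from the table; the variable names are illustrative.

```python
import pandas as pd

col = "Life expectancy at birth (years)"
toy = pd.DataFrame({
    "Continent": ["WHO region", "Global", "Global", "Africa", "Africa"],
    "Year": ["Year", "2016", "2015", "2016", "2015"],
    col: ["Both sexes", "72.0", "71.7", "61.2", "60.7"],
})

# Drop the sub-header row, then convert the text values to floats.
clean = toy.iloc[1:].copy()
clean[col] = pd.to_numeric(clean[col])

max_by_region = clean.groupby("Continent")[col].max()
print(max_by_region)
```

After the conversion the comparison is numeric, so the maximum no longer depends on how the strings happen to sort.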


In [444]:
pd.concat([tabella, data], join = 'inner',  ignore_index = False, axis = 1)

Unnamed: 0,Unnamed: 1,0,Continent,Year,Life expectancy at birth (years),Life expectancy at birth (years).1,Life expectancy at birth (years).2,Life expectancy at age 60 (years),Life expectancy at age 60 (years).1,Life expectancy at age 60 (years).2,Healthy life expectancy (HALE) at birth (years),Healthy life expectancy (HALE) at birth (years).1,Healthy life expectancy (HALE) at birth (years).2,Healthy life expectancy (HALE) at age 60 (years),Healthy life expectancy (HALE) at age 60 (years).1,Healthy life expectancy (HALE) at age 60 (years).2
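The output above shows column headers only and no rows. `tabella` is not defined in this section, so the cause cannot be confirmed, but this is the result `pd.concat` gives with `axis=1` and `join='inner'` when the two dataframes share no index labels. The sketch below illustrates that behaviour with placeholder frames; the labels and values are invented for the example only.

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
right = pd.DataFrame({"b": [10, 20]}, index=["y", "z"])

# axis=1 places the frames side by side; join='inner' keeps only shared index labels.
aligned = pd.concat([left, right], axis=1, join="inner")

# With no labels in common, the inner join keeps the columns but no rows.
right_other = right.set_axis(["p", "q"])
no_overlap = pd.concat([left, right_other], axis=1, join="inner")

print(aligned)
print(no_overlap)
```

Setting the same kind of key (for example the country or the region name) as the index of both dataframes before concatenating is one way to obtain matching rows.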
