# Analysis of Global Life Expectancy and Socio-Economic Indicators #

## Introduction: ##
Life expectancy is a crucial indicator of a country's quality of life. It is influenced by various socio-economic factors such as access to healthcare, education, income, and the quality of the environment. In this analysis, we will explore World Bank data to understand global trends in life expectancy and its relationship with socio-economic indicators.

In [36]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

In [72]:
#import data 
df = pd.read_csv("life expectancy.csv")

#Display dataset
df.head()

Unnamed: 0,Country Name,Country Code,Region,IncomeGroup,Year,Life Expectancy World Bank,Prevelance of Undernourishment,CO2,Health Expenditure %,Education Expenditure %,Unemployment,Corruption,Sanitation,Injuries,Communicable,NonCommunicable
0,Afghanistan,AFG,South Asia,Low income,2001,56.308,47.8,730.0,,,10.809,,,2179727.1,9689193.7,5795426.38
1,Angola,AGO,Sub-Saharan Africa,Lower middle income,2001,47.059,67.5,15960.0,4.483516,,4.004,,,1392080.71,11190210.53,2663516.34
2,Albania,ALB,Europe & Central Asia,Upper middle income,2001,74.288,4.9,3230.0,7.139524,3.4587,18.575001,,40.520895,117081.67,140894.78,532324.75
3,Andorra,AND,Europe & Central Asia,High income,2001,,,520.0,5.865939,,,,21.78866,1697.99,695.56,13636.64
4,United Arab Emirates,ARE,Middle East & North Africa,High income,2001,74.544,2.8,97200.0,2.48437,,2.493,,,144678.14,65271.91,481740.7


In [4]:
# Check which columns contain missing values
df.isna().any()

Country Name                      False
Country Code                      False
Region                            False
IncomeGroup                       False
Year                              False
Life Expectancy World Bank         True
Prevelance of Undernourishment     True
CO2                                True
Health Expenditure %               True
Education Expenditure %            True
Unemployment                       True
Corruption                         True
Sanitation                         True
Injuries                          False
Communicable                      False
NonCommunicable                   False
dtype: bool
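
`isna().any()` only reports whether a column has gaps, not how many. A minimal sketch of a count-and-share summary follows, run on the five rows printed by `df.head()` above (restricted to a few columns) so that it is self-contained. The helper name `missing_summary` is an assumption for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# The five rows shown by df.head() above, restricted to a few columns.
sample = pd.DataFrame({
    "Country Name": ["Afghanistan", "Angola", "Albania", "Andorra", "United Arab Emirates"],
    "Life Expectancy World Bank": [56.308, 47.059, 74.288, np.nan, 74.544],
    "Health Expenditure %": [np.nan, 4.483516, 7.139524, 5.865939, 2.48437],
    "Sanitation": [np.nan, np.nan, 40.520895, 21.78866, np.nan],
})


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and share of missing values per column, worst first."""
    summary = pd.DataFrame({
        "missing": frame.isna().sum(),   # number of NaN cells per column
        "share": frame.isna().mean(),    # fraction of NaN cells per column
    })
    return summary.sort_values("missing", ascending=False)


summary = missing_summary(sample)
print(summary)
```

On the full dataset the same call would be `missing_summary(df)`, which helps decide which columns are safe to pass to `dropna`.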

# 1. Global Trends in Life Expectancy: #

In [17]:
# Keep only rows that have values for life expectancy, sanitation, health expenditure and CO2
df_life = df.dropna(subset=["Life Expectancy World Bank", "Sanitation", "Health Expenditure %", "CO2"])
df_life.head()

Unnamed: 0,Country Name,Country Code,Region,IncomeGroup,Year,Life Expectancy World Bank,Prevelance of Undernourishment,CO2,Health Expenditure %,Education Expenditure %,Unemployment,Corruption,Sanitation,Injuries,Communicable,NonCommunicable
2,Albania,ALB,Europe & Central Asia,Upper middle income,2001,74.288,4.9,3230.0,7.139524,3.4587,18.575001,,40.520895,117081.67,140894.78,532324.75
5,Argentina,ARG,Latin America & Caribbean,Upper middle income,2001,73.755,3.0,125260.0,8.371798,4.83374,17.32,,48.053996,1397676.07,1507068.98,8070909.52
6,Armenia,ARM,Europe & Central Asia,Upper middle income,2001,71.8,26.1,3600.0,4.645627,2.46944,10.912,,46.351896,103371.75,122238.13,767916.19
9,Australia,AUS,East Asia & Pacific,High income,2001,79.634146,2.5,345640.0,7.696229,,6.74,,58.788894,612233.81,208282.73,4158052.86
10,Austria,AUT,Europe & Central Asia,High income,2001,78.57561,2.5,67910.0,9.269429,5.57548,4.01,,99.679399,240208.86,77701.17,2101883.59


In [18]:
#average life expectancy over the years.

average_life_expectancy = df_life.groupby("Year")["Life Expectancy World Bank"].mean().reset_index()

fig_average = px.line(average_life_expectancy, x="Year", y="Life Expectancy World Bank", title="Average Annual Life Expectancy")

fig_average.show()


Average life expectancy across the countries in the dataset rises steadily over the years covered, reflecting improvements in health and living standards.

# 2. Correlation between Income and Life Expectancy:

In [8]:
# Average life expectancy by income group
average_life_income = df_life.groupby("IncomeGroup")["Life Expectancy World Bank"].mean().reset_index()
fig_income = px.bar(average_life_income, x="Life Expectancy World Bank", y="IncomeGroup", title="Life Expectancy X Income Group")

fig_income.show()

Average life expectancy rises with income group, suggesting that wealthier countries tend to have higher life expectancy.
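
One caveat about the chart: `groupby` sorts the income groups alphabetically, so the bars do not appear in rank order. A minimal sketch of ordering them from lowest to highest income with an ordered categorical follows. It uses six rows taken from the outputs shown earlier in this notebook. The names `INCOME_ORDER` and `mean_by_income` are assumptions introduced for illustration.

```python
import pandas as pd

# Income groups as they appear in the data, from lowest to highest.
INCOME_ORDER = ["Low income", "Lower middle income", "Upper middle income", "High income"]

# Rows taken from the df.head() and df_life.head() outputs above.
sample = pd.DataFrame({
    "Country Name": ["Afghanistan", "Angola", "Albania", "Argentina",
                     "United Arab Emirates", "Australia"],
    "IncomeGroup": ["Low income", "Lower middle income", "Upper middle income",
                    "Upper middle income", "High income", "High income"],
    "Life Expectancy World Bank": [56.308, 47.059, 74.288, 73.755, 74.544, 79.634146],
})


def mean_by_income(frame: pd.DataFrame) -> pd.DataFrame:
    """Mean life expectancy per income group, sorted by income rank."""
    ordered = frame.assign(
        IncomeGroup=pd.Categorical(frame["IncomeGroup"],
                                   categories=INCOME_ORDER, ordered=True)
    )
    # Grouping on an ordered categorical returns groups in category order.
    return (
        ordered.groupby("IncomeGroup", observed=True)["Life Expectancy World Bank"]
        .mean()
        .reset_index()
    )


by_income = mean_by_income(sample)
print(by_income)
```

When plotting, passing `category_orders={"IncomeGroup": INCOME_ORDER}` to `px.bar` should keep the same order on the axis.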

# 3. Education and Life Expectancy: 

In [9]:
# Average education expenditure (% of GDP) by region
average_education_expenditure = df_life.groupby("Region")["Education Expenditure %"].mean().reset_index()

fig_education = px.bar(average_education_expenditure,
                       x="Region",
                       y="Education Expenditure %",
                       title="Education Expenditure X Region")

fig_education.show()

In [10]:
# Average life expectancy by region

average_life_region = df_life.groupby("Region")["Life Expectancy World Bank"].mean().reset_index()

fig_life_region = px.bar(average_life_region,
                         x="Region",
                         y="Life Expectancy World Bank",
                         title="Life Expectancy X Region")

fig_life_region.show()

Comparing the two charts, regional life expectancy appears to be related to the percentage spent on education, which suggests that education is one of the factors contributing to increased life expectancy.
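
Reading a relationship off two separate bar charts is imprecise, and a correlation coefficient gives a more direct check. The sketch below computes Pearson's r on the five rows shown by `df_life.head()` above. Five rows are far too few to support any conclusion, so this only illustrates the call. The helper name `education_life_correlation` is an assumption.

```python
import numpy as np
import pandas as pd

EDU = "Education Expenditure %"
LIFE = "Life Expectancy World Bank"

# The five rows shown by df_life.head() above (Australia has no education value).
sample = pd.DataFrame({
    "Country Name": ["Albania", "Argentina", "Armenia", "Australia", "Austria"],
    EDU: [3.4587, 4.83374, 2.46944, np.nan, 5.57548],
    LIFE: [74.288, 73.755, 71.8, 79.634146, 78.57561],
})


def education_life_correlation(frame: pd.DataFrame) -> tuple[float, int]:
    """Pearson correlation between the two columns and the number of rows used."""
    pairs = frame[[EDU, LIFE]].dropna()  # keep rows where both values exist
    return pairs[EDU].corr(pairs[LIFE]), len(pairs)


r, n = education_life_correlation(sample)
print(f"Pearson r = {r:.2f} over {n} complete rows")
```

On the full data, `education_life_correlation(df_life)` would use every country-year row with both values present. Even a strong correlation would not by itself show that education spending causes longer lives.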

# 4. CO2 Emissions

In [58]:


# Total CO2 emissions per country, summed over all available years
sum_co2 = df_life.groupby("Country Name")["CO2"].sum().reset_index()

# Top 10 countries by total CO2 emissions
df_sorted = sum_co2.sort_values(by="CO2", ascending=False)

top10_co2 = df_sorted.head(10)
top10_co2



Unnamed: 0,Country Name,CO2
20,China,149386300.0
100,United States,100815000.0
41,India,31741720.0
46,Japan,22492080.0
34,Germany,14642610.0
16,Canada,10438670.0
99,United Kingdom,8881190.0
61,Mexico,8513210.0
82,Saudi Arabia,8075290.0
45,Italy,7551710.0


In [70]:
fig_co2 = go.Figure(data=go.Choropleth(
    locations=sum_co2['Country Name'],  # country names used as map locations
    z=sum_co2['CO2'],  # total CO2 per country, mapped to color
    locationmode='country names',  # interpret `locations` as country names
    colorscale = 'Reds',
    colorbar_title = "CO2 Emission",
))

fig_co2.show()

China is the largest producer of CO2 over the period, followed by the USA, India, and Japan.
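
Because `df_life` was built with `dropna`, countries can differ in how many yearly rows they keep, and a summed total favors countries with more rows. A minimal sketch of reporting the yearly mean and the row count next to the total follows. The values are hypothetical placeholders, not taken from the dataset, and the helper name `co2_summary` is an assumption.

```python
import pandas as pd

# Hypothetical placeholder values, NOT from the dataset:
# country "A" keeps three yearly rows, country "B" keeps only one.
toy = pd.DataFrame({
    "Country Name": ["A", "A", "A", "B"],
    "Year": [2001, 2002, 2003, 2001],
    "CO2": [100.0, 110.0, 120.0, 300.0],
})


def co2_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Total, yearly mean and number of rows of CO2 per country."""
    return (
        frame.groupby("Country Name")["CO2"]
        .agg(total="sum", yearly_mean="mean", years="count")
        .sort_values("total", ascending=False)
        .reset_index()
    )


summary = co2_summary(toy)
print(summary)
```

In this toy case "A" leads on the total while "B" leads on the yearly mean, so checking the `years` column is a quick way to see whether a ranking by sum is comparing like with like.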

In [68]:
china = "China"

# Rows for China only
china_co2 = df_life[df_life["Country Name"] == china]

china_prod = px.bar(china_co2,
                    x="Year",
                    y="CO2",
                    labels={"Year": "Year", "CO2": "CO2 production"},
                    title="China's CO2 production by year")
china_prod.show()

CO2 production in China rises steadily over the years.

In [69]:
usa = "United States"

# Rows for the United States only
usa_co2 = df_life[df_life["Country Name"] == usa]

usa_prod = px.bar(usa_co2,
                  x="Year",
                  y="CO2",
                  labels={"Year": "Year", "CO2": "CO2 production"},
                  title="USA's CO2 production by year")
usa_prod.show()

In the USA, the second-largest producer, CO2 production instead decreases over the years.

# Conclusion

Average life expectancy in the dataset increases year after year, and several socio-economic factors are linked to it. Income is the clearest: as the second section of the analysis shows, life expectancy rises with income group, so a country's economic power is closely related to how long its population lives. The way a country spends on health and education is another important factor. Industrialization appears to be a further contributing factor, considering that China and the USA, the two largest CO2 producers, also have relatively high life expectancies.