# Life Expectancy and GDP Analysis


The goal of this project is to investigate the relationship between a country's economic output (GDP) and the life expectancy of its citizens across six countries: **Peru, Chile, Mexico, the UK, China and the United States of America**.

The analysis addresses the following questions:


- Has life expectancy increased over time in the six nations?
- Has GDP increased over time in the six nations?
- Is there a correlation between GDP and life expectancy of a country?
- What is the average life expectancy in these nations?
- What is the distribution of that life expectancy?


We will also reflect on the following:


- What did we learn throughout the process?
- Are the results what we expected?
- What are the key findings and takeaways?


**Data sources**

- GDP Source: [World Bank Group](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD)

- Life expectancy Data Source: [World Bank Group](https://data.worldbank.org/indicator/SP.DYN.LE00.IN)


## Import libraries


In [214]:
import numpy as np
import pandas as pd
import plotly.express as px
from scipy.stats import pearsonr

## Load csv


In [215]:
df = pd.read_csv("filtered_gdp_lifeExp.csv")

## Inspect all data


In [216]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 384 entries, 0 to 383
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Country Name            384 non-null    object 
 1   Country Code            384 non-null    object 
 2   Indicator Name GDP      384 non-null    object 
 3   Year                    384 non-null    int64  
 4   GDP                     362 non-null    float64
 5   Indicator Name LifeExp  384 non-null    object 
 6   Life Expectancy         378 non-null    float64
dtypes: float64(2), int64(1), object(4)
memory usage: 21.1+ KB


In [217]:
df.describe(include="all")

| | Country Name | Country Code | Indicator Name GDP | Year | GDP | Indicator Name LifeExp | Life Expectancy |
|---|---|---|---|---|---|---|---|
| count | 384 | 384 | 384 | 384.0 | 362.0 | 384 | 378.0 |
| unique | 6 | 6 | 1 | | | 1 | |
| top | Chile | CHL | GDP (current US$) | | | Life expectancy at birth, total (years) | |
| freq | 64 | 64 | 384 | | | 384 | |
| mean | | | | 1991.5 | 2452890000000.0 | | 70.392108 |
| std | | | | 18.497054 | 4885454000000.0 | | 7.931819 |
| min | | | | 1960.0 | 4110000000.0 | | 33.275 |
| 25% | | | | 1975.75 | 78400840000.0 | | 66.0185 |
| 50% | | | | 1991.5 | 310920800000.0 | | 72.637 |
| 75% | | | | 2007.25 | 1761255000000.0 | | 76.01575 |
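
The `df.info()` output above reports 362 non-null GDP values and 378 non-null life expectancy values out of 384 rows, so some country-years are missing. The following is a minimal sketch of how the gaps could be located per country. It runs on a small made-up frame (`toy`) that only borrows the notebook's column names; the numbers are illustrative and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up illustrative data using the notebook's column names (not real values).
toy = pd.DataFrame({
    "Country Name": ["Chile", "Chile", "Peru", "Peru"],
    "Year": [1960, 1961, 1960, 1961],
    "GDP": [np.nan, 4.1e9, np.nan, np.nan],
    "Life Expectancy": [57.0, 57.5, np.nan, 48.9],
})

# Flag missing cells, then count them per country for each feature.
missing_by_country = (
    toy[["GDP", "Life Expectancy"]]
    .isna()
    .groupby(toy["Country Name"])
    .sum()
)
print(missing_by_country)
```

Applied to the real `df`, the same expression would show which countries account for the missing rows that `dropna` later removes.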


## Create a function to display information over time

In [218]:
def point_plot(df, feature, countries):
  # Keep only the countries of interest and drop rows where the feature is missing.
  # .copy() avoids pandas' SettingWithCopyWarning when a column is added below.
  filtered_df = df[df['Country Name'].isin(countries)].dropna(subset=[feature]).copy()

  if feature == "GDP":
    # Add a human-readable GDP label (in trillions of US$) for the hover tooltip
    filtered_df['GDP in Trillions'] = filtered_df[feature].apply(lambda x: f"${x/1e12:.2f}T")
    feature_info = {'GDP in Trillions': True, feature: False}
  else:
    feature_info = {feature: True}

  # Create the plot
  fig = px.line(
    filtered_df, x='Year', y=feature, color='Country Name',
    hover_data={'Country Name': True, 'Year': True, **feature_info},
    markers=True
  )

  # Update layout for rotated x-axis labels and increased height
  fig.update_layout(
    xaxis=dict(tickmode='array', tickvals=filtered_df['Year'].unique(), tickangle=-70),
    # The y-axis plots raw dollar values, so label it with the indicator's unit
    yaxis_title="GDP (current US$)" if feature == "GDP" else "Life expectancy at birth (years)",
    xaxis_title="Year",
    title=dict(text=f"{feature} Over Time for Different Countries", x=0.5),
    legend_title="Country",
    height=1000
  )

  fig.update_traces(marker=dict(size=13))

  # fig.show() returns None, so there is nothing useful to return
  fig.show()
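
The hover label in `point_plot` formats GDP in trillions with two decimals. Since the smallest GDP in the `describe` output is about 4.11 billion US$, that format rounds small economies down to `$0.00T`. The sketch below checks this in isolation and shows one possible alternative; `format_usd` is a hypothetical helper and not part of the notebook.

```python
def format_usd(value):
    """Format a dollar amount in trillions, or in billions when below 1e12.

    Hypothetical helper for illustration; the notebook always uses trillions.
    """
    if abs(value) >= 1e12:
        return f"${value / 1e12:.2f}T"
    return f"${value / 1e9:.2f}B"


small_gdp = 4.11e9  # roughly the minimum GDP reported by df.describe()

notebook_label = f"${small_gdp / 1e12:.2f}T"  # the format used in point_plot
adaptive_label = format_usd(small_gdp)

print(notebook_label, adaptive_label)
```

Whether the extra unit is worth it depends on how much the early, small values matter for the hover text.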

## Inspect GDP and Life Expectancy data

In [219]:
countries = ["United States", "China", "United Kingdom", "Mexico", "Chile", "Peru"]
point_plot(df=df, feature="GDP", countries=countries)
point_plot(df=df, feature="Life Expectancy", countries=countries)

## Divide the data

The GDP values of these six countries differ too much to compare on a single plot. For better visualization, we divide the data into two groups:
- "United States", "China" and "United Kingdom"
- "Mexico", "Chile" and "Peru"

In [220]:
top_three_countries = ["United States", "China", "United Kingdom"]
bottom_three_countries = ["Mexico", "Chile", "Peru"]
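
As a possible alternative to splitting the countries into two groups, each country's GDP can be rescaled to an index (first available year = 100) so that economies of very different size share one axis. This is a swapped-in technique, not what the notebook does, and the sketch below uses made-up numbers in a toy frame with the notebook's column names.

```python
import pandas as pd

# Made-up illustrative values (not taken from the dataset).
toy = pd.DataFrame({
    "Country Name": ["United States", "United States", "Peru", "Peru"],
    "Year": [2000, 2001, 2000, 2001],
    "GDP": [10.0e12, 10.5e12, 50.0e9, 55.0e9],
})

# Sort so that "first" really is the earliest year within each country.
toy = toy.sort_values(["Country Name", "Year"])

# Divide each GDP value by that country's first-year GDP and scale to 100.
base = toy.groupby("Country Name")["GDP"].transform("first")
toy["GDP Index"] = toy["GDP"] / base * 100

print(toy)
```

An index shows relative growth only; the absolute gap between the countries is no longer visible, which is the trade-off against the two-group plots.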

In [221]:
point_plot(df=df, feature="GDP", countries=top_three_countries)

In [222]:
point_plot(df=df, feature="GDP", countries=bottom_three_countries)

#### Has GDP increased over time in the six nations?


- The USA only experienced GDP drops in 2008 and 2019.
- China's GDP has stopped increasing since 2021.
- The UK's GDP had significant decreases in 2007, 2014, and 2019.
- In Mexico, decreases occurred in 1994, 2008, 2014, and 2019.
- For Chile, GDP decreased around 2008, 2013, 2018, and 2021.
- Peru's GDP did not increase in 2008 and dropped in 2014 and 2019.
- The GDP of the USA and China has grown approximately exponentially over the period.
- The UK's GDP has been stagnant since 2007.
- China's GDP started to increase exponentially around 2005.
- China did not have a decrease in 2008 or 2019, unlike the other countries.
- Mexico has increased its GDP, but it has experienced several notable decreases over time.

- In 2008, the world faced a severe financial crisis, often referred to as the Great Recession. Major financial institutions collapsed, leading to widespread economic downturns.
- In 2014, the global economy was still recovering from the effects of the 2008 financial crisis. Many countries, especially those heavily reliant on global trade or commodities, faced challenges as global demand fluctuated. In addition, political conflicts and regional instability, such as the annexation of Crimea and the Ebola outbreak in West Africa, disrupted economic activity and investor confidence in the affected regions.
- In 2019, the COVID-19 pandemic began in Wuhan, China. The virus spread rapidly worldwide, leading to unprecedented public health measures.
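
The drop years listed above were read off the plots. A minimal sketch of how they could be listed programmatically is shown below, using the year-over-year percentage change within each country. The frame `toy`, the country labels `A` and `B`, and all numbers are made up for illustration.

```python
import pandas as pd

# Made-up illustrative data with the notebook's column names.
toy = pd.DataFrame({
    "Country Name": ["A"] * 4 + ["B"] * 4,
    "Year": [2006, 2007, 2008, 2009] * 2,
    "GDP": [100.0, 110.0, 105.0, 120.0, 50.0, 55.0, 60.0, 58.0],
})

# Sort by country and year so each change compares consecutive years.
toy = toy.sort_values(["Country Name", "Year"])

# Year-over-year relative change, computed separately for each country.
toy["GDP YoY"] = toy.groupby("Country Name")["GDP"].pct_change()

# Keep only the years in which GDP fell compared with the previous year.
drops = toy.loc[toy["GDP YoY"] < 0, ["Country Name", "Year", "GDP YoY"]]
print(drops)
```

On the real `df`, rows with missing GDP would need to be dropped first so that a gap is not compared across non-consecutive years.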

In [223]:
point_plot(df=df, feature="Life Expectancy", countries=top_three_countries)

In [224]:
point_plot(df=df, feature="Life Expectancy", countries=bottom_three_countries)

In [225]:
median_by_country = df.groupby("Country Name")["Life Expectancy"].median().reset_index().rename(columns={"Life Expectancy": "Median LifeExpec"})
print(median_by_country)

     Country Name  Median LifeExpec
0           Chile         73.574000
1           China         68.169000
2          Mexico         70.133000
3            Peru         65.434000
4  United Kingdom         76.082927
5   United States         75.365854
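
The opening questions ask about the average life expectancy and its distribution, while the cell above reports the median only. A minimal sketch of a fuller per-country summary follows; it uses a made-up toy frame, so the numbers are illustrative and do not come from the dataset.

```python
import pandas as pd

# Made-up illustrative values (not taken from the dataset).
toy = pd.DataFrame({
    "Country Name": ["Chile"] * 3 + ["Peru"] * 3,
    "Life Expectancy": [70.0, 72.0, 77.0, 60.0, 65.0, 66.0],
})

# Mean, median and range per country give a first view of the distribution.
summary = (
    toy.groupby("Country Name")["Life Expectancy"]
    .agg(["mean", "median", "min", "max"])
)
print(summary)
```

A gap between mean and median hints at a skewed distribution, which a histogram or box plot per country would show more directly.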


#### Has life expectancy increased over time in the six nations?

- Life expectancy has increased in all six countries over the years.
- In 2019, life expectancy decreased in every country except China.
- China's life expectancy at birth has increased notably over the years.
- Life expectancy in Mexico has stagnated since 2004.

## Correlation between GDP and Life Expectancy

In [226]:
fig = px.scatter(df, x="Life Expectancy", y="GDP", color="Country Name")
fig.show()

In [227]:
country_colors = {
    "Chile": "#1f77b4",
    "China": "#ff7f0e",
    "United Kingdom": "#2ca02c",
    "Mexico": "#9467bd",
    "Peru": "#d62728",
    "United States": "#17becf"
}

for country in countries:
  # Keep only rows where both features are present so both methods use the same pairs
  country_df = df[df["Country Name"] == country].dropna(subset=["GDP", "Life Expectancy"])
  fig = px.scatter(country_df, x="Life Expectancy", y="GDP", color="Country Name", color_discrete_map=country_colors)
  fig.show()
  # Pearson correlation via pandas; use country_df on both sides rather than the full df
  correlation = country_df["GDP"].corr(country_df["Life Expectancy"])
  print(f"{country} correlation: {correlation}")
  # Same coefficient via SciPy, which also returns the p-value
  corr, pvalue = pearsonr(country_df["GDP"], country_df["Life Expectancy"])
  print(f"{country} pearson corr: {corr}")
  print(f"{country} pearson pvalue: {pvalue}")


United States correlation: 0.8733998663254097
United States pearson corr: 0.8733998663254099
United States pearson pvalue: 1.0084064525724126e-20


China correlation: 0.6256033408048534
China pearson corr: 0.6256033408048536
China pearson pvalue: 4.203590353006337e-08


United Kingdom correlation: 0.9753399701880273
United Kingdom pearson corr: 0.9753399701880278
United Kingdom pearson pvalue: 9.791927542132303e-42


Mexico correlation: 0.8360965838843155
Mexico pearson corr: 0.8360965838843157
Mexico pearson pvalue: 1.4999133539884228e-17


Chile correlation: 0.771631778639865
Chile pearson corr: 0.7716317786398649
Chile pearson pvalue: 1.346776535237519e-13


Peru correlation: 0.8251354907456356
Peru pearson corr: 0.8251354907456359
Peru pearson pvalue: 3.237415574235191e-11


#### Is there a correlation between GDP and life expectancy of a country?
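- Yes. Within each country, GDP and life expectancy are positively correlated: the Pearson coefficients range from about 0.63 (China) to about 0.98 (United Kingdom), and every p-value is far below 0.05.
- Higher GDP generally goes together with better healthcare infrastructure, including access to medical facilities, doctors, and medicines, which can improve health outcomes, reduce mortality, and so raise life expectancy. Both series also trend upward over time, so the correlation alone does not show that GDP growth causes longer lives.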
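
For reference, the Pearson coefficient reported by both `Series.corr` and `pearsonr` is the covariance of the two series divided by the product of their standard deviations. The sketch below computes it from scratch on made-up numbers; `pearson_r` is a hypothetical helper written for illustration, not the SciPy implementation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values: life expectancy rises linearly with GDP, so r is 1.
gdp = [1.0, 2.0, 3.0, 4.0]
life_expectancy = [70.0, 72.0, 74.0, 76.0]
r_linear = pearson_r(gdp, life_expectancy)

# Made-up values: a perfectly decreasing relation gives r of -1.
r_inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])

print(r_linear, r_inverse)
```

The coefficient measures linear association only, so a curved GDP versus life expectancy relation can yield a value below 1 even when one series rises steadily with the other.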