## Introduction

This study analyzes Gross Domestic Product (GDP) and life expectancy data from the World Bank and the World Health Organization. It examines how the two variables relate across six countries between 2000 and 2015, looking for correlations and other patterns. The goal is a clearer understanding of the relationship between these two socioeconomic indicators.

**Questions that will be answered:**

+ Is there a correlation between a country's GDP and life expectancy?
+ Which country has the highest GDP and life expectancy, and how do they compare to the other countries in the analysis?
+ Is there a significant difference in the GDP and life expectancy of developed countries versus developing countries in the analysis?
+ Does the government expenditure on healthcare in each country have a direct impact on life expectancy?
+ How have the GDP and life expectancy of each country changed over time, and what factors may have contributed to these changes?

**Data Sources**

+ GDP source: [World Bank](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD) national accounts data and OECD National Accounts data files.
+ Life expectancy source: [World Health Organization](http://apps.who.int/gho/data/node.main.688)


## Import Python Modules

In [2]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
%matplotlib inline

### Loading the Data

First, we need to load the data. The dataset $all\_data.csv$ contains GDP and life expectancy figures for 6 countries. The CSV file is loaded into a DataFrame named $data$.

In [4]:
data = pd.read_csv('all_data.csv')

data.head()

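|   | Country | Year | Life expectancy at birth (years) | GDP |
|---|---------|------|----------------------------------|-----|
| 0 | Chile | 2000 | 77.3 | 77860930000.0 |
| 1 | Chile | 2001 | 77.3 | 70979920000.0 |
| 2 | Chile | 2002 | 77.8 | 69736810000.0 |
| 3 | Chile | 2003 | 77.9 | 75643460000.0 |
| 4 | Chile | 2004 | 78.0 | 99210390000.0 |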
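The column name `Life expectancy at birth (years)` is long to type in later plotting and grouping calls. One option, sketched here, is to rename it once after loading. The short name `LEABY` is a hypothetical choice, not something defined by the dataset, and the frame is rebuilt from the five rows shown above so the example runs without the CSV file.

```python
import pandas as pd

# The five rows printed by data.head(), rebuilt inline so this runs without all_data.csv.
sample = pd.DataFrame({
    "Country": ["Chile"] * 5,
    "Year": [2000, 2001, 2002, 2003, 2004],
    "Life expectancy at birth (years)": [77.3, 77.3, 77.8, 77.9, 78.0],
    "GDP": [77860930000.0, 70979920000.0, 69736810000.0,
            75643460000.0, 99210390000.0],
})

# "LEABY" is an assumed abbreviation chosen for illustration only.
renamed = sample.rename(columns={"Life expectancy at birth (years)": "LEABY"})

print(renamed.columns.tolist())
```

`rename` returns a new DataFrame, so the original column name stays available in `sample` if it is needed for axis labels.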


### EDA

First, we check general information about the DataFrame. There are 96 entries and no null values.

In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 4 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   Country                           96 non-null     object 
 1   Year                              96 non-null     int64  
 2   Life expectancy at birth (years)  96 non-null     float64
 3   GDP                               96 non-null     float64
dtypes: float64(2), int64(1), object(1)
memory usage: 3.1+ KB
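The non-null counts above can also be confirmed directly. The following is a minimal sketch, assuming a helper named `null_counts` (hypothetical, not part of the notebook) and using the five Chile rows from the earlier `head()` output as stand-in data.

```python
import pandas as pd

def null_counts(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values in each column."""
    return df.isnull().sum()

# Stand-in data: the five rows shown by data.head() earlier in the notebook.
sample = pd.DataFrame({
    "Country": ["Chile"] * 5,
    "Year": [2000, 2001, 2002, 2003, 2004],
    "Life expectancy at birth (years)": [77.3, 77.3, 77.8, 77.9, 78.0],
    "GDP": [77860930000.0, 70979920000.0, 69736810000.0,
            75643460000.0, 99210390000.0],
})

counts = null_counts(sample)
print(counts)
```

On the full dataset, calling `null_counts(data)` should report zero for every column if the `info()` output above holds.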


We can also see that the six countries are Chile, China, Germany, Mexico, the USA and Zimbabwe. The selection covers five continents: Africa, Asia, Europe, North America and South America.

In [6]:
print(data.Country.unique())

['Chile' 'China' 'Germany' 'Mexico' 'United States of America' 'Zimbabwe']
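The dataset spells out `United States of America`, while the prose abbreviates it to USA. If a shorter label is preferred for plot legends, a replacement along these lines may help. This is a sketch on a standalone Series of the six names printed above, and the abbreviation is an assumption rather than part of the data.

```python
import pandas as pd

# The six country names exactly as printed by data.Country.unique().
countries = pd.Series([
    "Chile", "China", "Germany", "Mexico",
    "United States of America", "Zimbabwe",
], name="Country")

# Map only the long name; every other value passes through unchanged.
short = countries.replace({"United States of America": "USA"})

print(short.tolist())
```

The same `replace` call could be applied to `data["Country"]`, at the cost of the labels no longer matching the source file.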


The data also covers the years 2000 to 2015. Counting both endpoints, that is 16 annual observations per country, which matches the 96 rows (6 countries × 16 years).

In [8]:
print(data.Year.unique())

[2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
 2014 2015]
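Because both 2000 and 2015 are included, the range holds 16 distinct years even though the span between the endpoints is 15. The short check below works through that arithmetic and compares it with the 96 rows reported by `info()`. It assumes every country has exactly one row per year, which the row count suggests but which this sketch does not verify against the file.

```python
import pandas as pd

# The years printed by data.Year.unique(): 2000 through 2015 inclusive.
years = pd.Series(range(2000, 2016))

n_years = years.nunique()          # distinct years, endpoints included
span = years.max() - years.min()   # elapsed years between first and last

# The six countries listed earlier in the notebook.
countries = ["Chile", "China", "Germany", "Mexico",
             "United States of America", "Zimbabwe"]

# Assumes one row per country per year.
expected_rows = n_years * len(countries)

print(n_years, span, expected_rows)  # 16 15 96
```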


### Conclusions

### Further Research