## Project Title

### Project Description

Our team will study the relationship (if any) between average land temperature (i.e., proposed dataset #2) and the evolution of each country's population and GDP. The population and GDP datasets will be pulled from the World Bank, where they are available in `.csv`, `.xml` and `.xlsx` formats, at the following links:

[Population](https://data.worldbank.org/indicator/SP.POP.TOTL)

[GDP (2015 USD)](https://data.worldbank.org/indicator/NY.GDP.MKTP.KD)

Behavior will be analysed mainly by visual means (such as scatterplots, population pyramids, etc.) and reported in Markdown in the corresponding `.ipynb` file.

> Imported libraries

In [8]:
import pandas as pd
import geopandas
import folium 

## Programming Component


### Reading and cleaning the main dataset


In [21]:
df = pd.read_csv(r'..\data\land-data\GlobalLandTemperaturesByCountry.csv')  # Read the country-level land temperature CSV (raw string keeps the backslashes literal)
df.head()

Unnamed: 0,dt,AverageTemperature,AverageTemperatureUncertainty,Country
0,1743-11-01,4.384,2.294,Åland
1,1743-12-01,,,Åland
2,1744-01-01,,,Åland
3,1744-02-01,,,Åland
4,1744-03-01,,,Åland


The next step is to convert the `dt` column to a datetime type; from it we will keep only the year.

In [22]:
df['dt'] = pd.to_datetime(df['dt'])  # Convert to datetime
df['dt'] = df['dt'].dt.year  # Extract year
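The conversion above can be tried in isolation on a few rows. This is a minimal sketch: the `sample` frame and the new `Year` column are illustrative names, and the three date strings are taken from the `df.head()` output shown earlier.

```python
import pandas as pd

# Three date strings copied from the head of the temperature file
sample = pd.DataFrame({"dt": ["1743-11-01", "1743-12-01", "1744-01-01"]})

# Parse the strings as dates, then keep only the calendar year
sample["Year"] = pd.to_datetime(sample["dt"]).dt.year

years = sample["Year"].tolist()
print(years)
```

Writing the year to a separate column, rather than overwriting `dt`, keeps the original dates available in case monthly detail is needed later.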

In [23]:
df.rename(columns={"dt": "Year"}, inplace=True) # Rename column
df_year_temp = df.groupby(['Country', 'Year']).mean() # Group by country and year
df_year_temp.drop(columns=['AverageTemperatureUncertainty'], inplace=True) # Drop unnecessary column
df_year_temp.head() 

Country,Year,AverageTemperature
Afghanistan,1838,18.379571
Afghanistan,1839,
Afghanistan,1840,13.413455
Afghanistan,1841,13.9976
Afghanistan,1842,15.154667
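Some years, such as Afghanistan 1839, come out empty. A likely explanation is that `mean()` skips missing months but returns `NaN` when every month of a year is missing. The sketch below illustrates this on a made-up frame (`toy` and its values are assumptions, not project data).

```python
import numpy as np
import pandas as pd

# Made-up monthly readings: 1838 has one valid month, 1839 has none
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "A"],
    "Year": [1838, 1838, 1839, 1839],
    "AverageTemperature": [18.0, np.nan, np.nan, np.nan],
})

# mean() ignores NaN values inside a group; an all-NaN group yields NaN
yearly = toy.groupby(["Country", "Year"])["AverageTemperature"].mean()

# One possible way to discard the years that carry no information
yearly_clean = yearly.dropna()
print(yearly_clean)
```

Whether such years should be dropped or kept as gaps depends on the plots planned later, so `dropna()` is only one option.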


In [24]:
df_year_temp.reset_index(inplace=True) # Reset the index so that Country and Year become regular columns, usable as keys to merge with the other datasets
df_year_temp.head()

Unnamed: 0,Country,Year,AverageTemperature
0,Afghanistan,1838,18.379571
1,Afghanistan,1839,
2,Afghanistan,1840,13.413455
3,Afghanistan,1841,13.9976
4,Afghanistan,1842,15.154667


### Reading and cleaning the two other datasets

In [27]:
df_gdp = pd.read_csv(r'..\data\gdp-data\GDP_Country.csv', skiprows=4)  # Read the World Bank GDP CSV, skipping its 4 metadata rows (raw string keeps the backslashes literal)
df_gdp.head()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,Unnamed: 66
0,Aruba,ABW,GDP (constant 2015 US$),NY.GDP.MKTP.KD,,,,,,,...,2862306000.0,2861720000.0,2963128000.0,3025850000.0,3191738000.0,3359555000.0,3380889000.0,2752412000.0,3225070000.0,
1,Africa Eastern and Southern,AFE,GDP (constant 2015 US$),NY.GDP.MKTP.KD,153696400000.0,154061100000.0,166362100000.0,174952800000.0,182972100000.0,192720900000.0,...,862334100000.0,897164500000.0,923143900000.0,946092800000.0,971065300000.0,996417800000.0,1016728000000.0,985792300000.0,1029191000000.0,
2,Afghanistan,AFG,GDP (constant 2015 US$),NY.GDP.MKTP.KD,,,,,,,...,19189250000.0,19712070000.0,19998160000.0,20450180000.0,20991490000.0,21241130000.0,22072000000.0,21553060000.0,17083570000.0,
3,Africa Western and Central,AFW,GDP (constant 2015 US$),NY.GDP.MKTP.KD,105675500000.0,107614700000.0,111674900000.0,119808200000.0,126269100000.0,131391300000.0,...,704676000000.0,746466400000.0,766958000000.0,767829900000.0,785533200000.0,808676300000.0,834480200000.0,826966700000.0,859759200000.0,
4,Angola,AGO,GDP (constant 2015 US$),NY.GDP.MKTP.KD,,,,,,,...,82433770000.0,86407070000.0,87219300000.0,84969040000.0,84841590000.0,83724810000.0,83138740000.0,78482970000.0,79346280000.0,
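The GDP table has one column per year, while the temperature table has one row per country and year. One possible way to align them is to reshape the GDP table to long format with `DataFrame.melt`. This is a sketch on a small stand-in frame: `wide` and `long_gdp` are hypothetical names, the figures are copied from the output above, and the real file also contains the indicator columns, which would need dropping too.

```python
import pandas as pd

# Small stand-in for df_gdp, with two countries and two year columns
wide = pd.DataFrame({
    "Country Name": ["Aruba", "Angola"],
    "Country Code": ["ABW", "AGO"],
    "2014": [2861720000.0, 86407070000.0],
    "2015": [2963128000.0, 87219300000.0],
    "Unnamed: 66": [None, None],  # empty trailing column, as in the real file
})

# Drop the empty trailing column, then turn the year columns into rows
long_gdp = (
    wide.drop(columns=["Unnamed: 66"])
        .melt(id_vars=["Country Name", "Country Code"],
              var_name="Year", value_name="GDP")
        .rename(columns={"Country Name": "Country"})
)

# Year labels arrive as strings; cast them so they match the integer years
long_gdp["Year"] = long_gdp["Year"].astype(int)
print(long_gdp)
```

The same reshaping would apply to the population file if it follows the same World Bank layout.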



### Combining and cleaning data
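A possible starting point for this step is an inner merge on `Country` and `Year`. The sketch below uses stand-in frames: `temps`, `gdp` and `merged` are hypothetical names, the temperature values are made up, and the GDP figures are copied from the Afghanistan row above. It also assumes both sources spell country names identically, which should be checked on the real data.

```python
import pandas as pd

# Stand-in for the yearly temperature table (temperature values are made up)
temps = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan"],
    "Year": [2012, 2013],
    "AverageTemperature": [14.0, 15.0],
})

# Stand-in for the GDP table in long format (one row per country and year)
gdp = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan"],
    "Year": [2013, 2014],
    "GDP": [19189250000.0, 19712070000.0],
})

# An inner merge keeps only the country-year pairs present in both tables
merged = temps.merge(gdp, on=["Country", "Year"], how="inner")
print(merged)
```

Using `how="outer"` with `indicator=True` instead would show which country-year pairs fail to match, which can help spot naming differences between the sources.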



### Calculations