<font size = 3> Project Owner: Ishan Patel</font>

<font size=5> Introduction/Motivation:</font>

<font size=3> Most people, caught up in their day-to-day work, do not concern themselves with seemingly distant matters such as global warming. Over the past century, the temperature of the Earth's surface has risen approximately 1.1 degrees Celsius. That might not seem like much, but it has a lasting effect on quality of life in different parts of the planet. Throughout this project, I plan to analyze data on how temperatures have changed from 1961 through 2019 in different countries throughout the world. I want to figure out whether global warming is something people should take more seriously or whether it is being exaggerated. By the end of the project, I aim to determine whether the rate of temperature increase is something to be concerned about in our lifetimes.</font>

<font size = 4> Step 1: Importing Dependencies</font>

<font size = 3> The first step is to import the libraries that will allow us to extract, modify, analyze, and visualize data cleanly and efficiently. We will use pandas and numpy to clean and modify the data, and seaborn (together with matplotlib) to create visualizations. Beautiful Soup is also imported for scraping and parsing data into a neat, readable table format.</font>

In [39]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import re
from bs4 import BeautifulSoup

<font size = 5>Part 1: Data collection</font>

In [40]:
df = pd.read_csv("./TempChange.csv", encoding='latin-1')
df.head()

Unnamed: 0,Area Code,Area,Months Code,Months,Element Code,Element,Unit,Y1961,Y1962,Y1963,...,Y2010,Y2011,Y2012,Y2013,Y2014,Y2015,Y2016,Y2017,Y2018,Y2019
0,2,Afghanistan,7001,January,7271,Temperature change,°C,0.777,0.062,2.744,...,3.601,1.179,-0.583,1.233,1.755,1.943,3.416,1.201,1.996,2.951
1,2,Afghanistan,7001,January,6078,Standard Deviation,°C,1.95,1.95,1.95,...,1.95,1.95,1.95,1.95,1.95,1.95,1.95,1.95,1.95,1.95
2,2,Afghanistan,7002,February,7271,Temperature change,°C,-1.743,2.465,3.919,...,1.212,0.321,-3.201,1.494,-3.187,2.699,2.251,-0.323,2.705,0.086
3,2,Afghanistan,7002,February,6078,Standard Deviation,°C,2.597,2.597,2.597,...,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597
4,2,Afghanistan,7003,March,7271,Temperature change,°C,0.516,1.336,0.403,...,3.39,0.748,-0.527,2.246,-0.076,-0.497,2.296,0.834,4.418,0.234


<font size = 4>Data Curation and Parsing - Remove Unneeded Columns, Rename Combined Months, Handle Missing Data</font>

In [41]:
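# Lowercase the column names, then strip the leading 'y' from year columns only ('y1961' -> '1961')
df.columns = df.columns.str.lower()
df.columns = df.columns.str.replace(r'^y(?=\d{4}$)', '', regex=True)
# Drop the numeric code columns; the matching text columns carry the same information
df = df.drop(columns=['area code', 'months code', 'element code'])
# The file is read as latin-1, so the dash in the multi-month labels shows up as '\x96'
months_replace = {'Dec\x96Jan\x96Feb': 'First Quarter', 'Mar\x96Apr\x96May': 'Second Quarter', 'Jun\x96Jul\x96Aug': 'Third Quarter', 'Sep\x96Oct\x96Nov': 'Fourth Quarter'}
df = df.replace(months_replace)
df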

Unnamed: 0,area,months,element,unit,1961,1962,1963,1964,1965,1966,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,January,Temperature change,°C,0.777,0.062,2.744,-5.232,1.868,3.629,...,3.601,1.179,-0.583,1.233,1.755,1.943,3.416,1.201,1.996,2.951
1,Afghanistan,January,Standard Deviation,°C,1.950,1.950,1.950,1.950,1.950,1.950,...,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950
2,Afghanistan,February,Temperature change,°C,-1.743,2.465,3.919,-0.202,-0.096,3.397,...,1.212,0.321,-3.201,1.494,-3.187,2.699,2.251,-0.323,2.705,0.086
3,Afghanistan,February,Standard Deviation,°C,2.597,2.597,2.597,2.597,2.597,2.597,...,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597
4,Afghanistan,March,Temperature change,°C,0.516,1.336,0.403,1.659,-0.909,-0.069,...,3.390,0.748,-0.527,2.246,-0.076,-0.497,2.296,0.834,4.418,0.234
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9651,OECD,Third Quarter,Standard Deviation,°C,0.247,0.247,0.247,0.247,0.247,0.247,...,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247
9652,OECD,Fourth Quarter,Temperature change,°C,0.036,0.461,0.665,-0.157,-0.203,-0.295,...,0.958,1.106,0.885,1.041,0.999,1.670,1.535,1.194,0.581,1.233
9653,OECD,Fourth Quarter,Standard Deviation,°C,0.378,0.378,0.378,0.378,0.378,0.378,...,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378
9654,OECD,Meteorological year,Temperature change,°C,0.165,-0.009,0.134,-0.190,-0.385,-0.166,...,1.246,0.805,1.274,0.991,0.811,1.282,1.850,1.349,1.088,1.297


<font size = 3> This table shows the temperature change, in °C, for each area (countries as well as groupings such as the OECD) in every year from 1961 through 2019, along with a standard deviation row for each area and period. I have removed the area code, months code, and element code columns since we will not need them. </font>
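
<font size = 3> One way to handle missing data in a table shaped like this one is sketched below. It is only an illustration under stated assumptions, not necessarily the approach used in this project: the `sample` DataFrame and its values are made up, and the choice to drop rows with no readings and linearly interpolate the remaining gaps is just one reasonable option.</font>

```python
import numpy as np
import pandas as pd

# Hypothetical sample shaped like the cleaned table (all values are made up)
sample = pd.DataFrame({
    "area": ["A", "A", "B", "C"],
    "months": ["January", "February", "January", "January"],
    "element": ["Temperature change"] * 4,
    "unit": ["°C"] * 4,
    "1961": [0.5, np.nan, np.nan, 0.1],
    "1962": [0.7, 0.2, np.nan, np.nan],
    "1963": [0.9, 0.4, np.nan, 0.3],
})

# Year columns are the ones whose names are all digits
year_cols = [c for c in sample.columns if c.isdigit()]

# Count missing values per year to see how much data is absent
missing_per_year = sample[year_cols].isna().sum()

# Drop rows that have no reading in any year
cleaned = sample.dropna(subset=year_cols, how="all").copy()

# Fill the remaining gaps by linear interpolation along each row (across years);
# gaps at the start or end of a row take the nearest available value
cleaned[year_cols] = cleaned[year_cols].interpolate(axis=1, limit_direction="both")

print(missing_per_year)
print(cleaned)
```

<font size = 3> Interpolating across years assumes temperature change varies smoothly from one year to the next; if that assumption is too strong, leaving the gaps as NaN and letting the plotting step skip them is a simpler alternative.</font>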

<font size = 5> Part 2: Data management/representation</font>

In [42]:
df

Unnamed: 0,area,months,element,unit,1961,1962,1963,1964,1965,1966,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Afghanistan,January,Temperature change,°C,0.777,0.062,2.744,-5.232,1.868,3.629,...,3.601,1.179,-0.583,1.233,1.755,1.943,3.416,1.201,1.996,2.951
1,Afghanistan,January,Standard Deviation,°C,1.950,1.950,1.950,1.950,1.950,1.950,...,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950,1.950
2,Afghanistan,February,Temperature change,°C,-1.743,2.465,3.919,-0.202,-0.096,3.397,...,1.212,0.321,-3.201,1.494,-3.187,2.699,2.251,-0.323,2.705,0.086
3,Afghanistan,February,Standard Deviation,°C,2.597,2.597,2.597,2.597,2.597,2.597,...,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597,2.597
4,Afghanistan,March,Temperature change,°C,0.516,1.336,0.403,1.659,-0.909,-0.069,...,3.390,0.748,-0.527,2.246,-0.076,-0.497,2.296,0.834,4.418,0.234
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9651,OECD,Third Quarter,Standard Deviation,°C,0.247,0.247,0.247,0.247,0.247,0.247,...,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247,0.247
9652,OECD,Fourth Quarter,Temperature change,°C,0.036,0.461,0.665,-0.157,-0.203,-0.295,...,0.958,1.106,0.885,1.041,0.999,1.670,1.535,1.194,0.581,1.233
9653,OECD,Fourth Quarter,Standard Deviation,°C,0.378,0.378,0.378,0.378,0.378,0.378,...,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378,0.378
9654,OECD,Meteorological year,Temperature change,°C,0.165,-0.009,0.134,-0.190,-0.385,-0.166,...,1.246,0.805,1.274,0.991,0.811,1.282,1.850,1.349,1.088,1.297


<font size = 3> This is the data after removing the columns that we will not need for this project. Additionally, I have handled missing data and renamed the multi-month labels in the months column.</font>
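
<font size = 3> With one column per year, the table is convenient to read but awkward to plot. A minimal sketch of reshaping it into a long format follows, assuming a made-up `sample` DataFrame with the same column layout; the names `long_df`, `year`, and `temp_change` are hypothetical and chosen only for illustration.</font>

```python
import pandas as pd

# Hypothetical sample with the same layout as the cleaned table (values are made up)
sample = pd.DataFrame({
    "area": ["A", "A", "B", "B"],
    "months": ["Meteorological year"] * 4,
    "element": ["Temperature change", "Standard Deviation"] * 2,
    "unit": ["°C"] * 4,
    "1961": [0.1, 0.5, -0.2, 0.4],
    "1962": [0.3, 0.5, 0.0, 0.4],
})

# Keep only the temperature-change rows; the standard-deviation rows
# hold a single value repeated across the year columns
temp = sample[sample["element"] == "Temperature change"]

# Reshape from one column per year to one row per (area, months, year)
long_df = temp.melt(
    id_vars=["area", "months", "element", "unit"],
    var_name="year",
    value_name="temp_change",
)

# Year labels come out of melt as strings; convert them to integers
long_df["year"] = long_df["year"].astype(int)

print(long_df)
```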

<font size = 5>Part 3: Exploratory data analysis</font>

<font size = 3> This is where we start creating visualizations to see how the temperature has changed from 1961 through 2019.</font>