# **Analyzing Climate Change**

Climate change is becoming a more pressing concern with each passing day. Emissions of greenhouse gases, mainly carbon dioxide and methane, drive global warming and drastic changes in weather. The main sources are the burning of fossil fuels, deforestation and industrial emissions. Over recent years, Earth’s surface temperature has risen sharply and heat waves have become more frequent. At the same time, glaciers are melting, which raises sea levels and shrinks the available land area. Humans are not the only ones affected; plants and animals are severely affected as well.

Scientists warn that the damage to the Earth will continue unless action is taken as early as possible. Major organisations are now working together on decisions to limit climate change for the sake of future generations. Bodies such as WHO and NASA publish research and guidance on climate change for countries around the world.

## **About the Dataset**

The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature records. The data is well packaged and can be sliced into interesting subsets (by country, by city, and so on). Berkeley Earth publishes the source data and the transformations applied to it, and uses methods that allow weather observations from short time spans to be included. The dataset consists of several files. The main one holds the global land and land-and-ocean temperature records from 1750 to 2015.

The other files hold the global average land temperature by country, by state, by major city and by city.
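
Every file read in this notebook sits in the same GitLab folder, so the repeated URL can be built by a small helper. The following is only a sketch: `BASE_URL` is copied from the cells in this notebook, while `DATASET_FILES`, `dataset_url` and `load_dataset` are hypothetical names introduced here for illustration, and the helper assumes every file has a `dt` date column.

```python
import pandas as pd

# Folder that hosts the CSV files read in this notebook.
BASE_URL = (
    "https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/"
    "-/raw/main/analyzing_climate/"
)

# File stems that the notebook cells actually read.
DATASET_FILES = (
    "GlobalTemperatures",
    "GlobalLandTemperaturesByCountry",
    "GlobalLandTemperaturesByState",
    "GlobalLandTemperaturesByMajorCity",
)


def dataset_url(name: str) -> str:
    """Return the raw CSV URL for one of the files listed above."""
    if name not in DATASET_FILES:
        raise ValueError(f"unknown dataset file: {name!r}")
    return f"{BASE_URL}{name}.csv"


def load_dataset(name: str, **read_csv_kwargs) -> pd.DataFrame:
    """Hypothetical helper: download one file and parse `dt` as dates."""
    return pd.read_csv(dataset_url(name), parse_dates=["dt"], **read_csv_kwargs)
```

With this in place, a call such as `load_dataset("GlobalTemperatures", index_col="dt")` would stand in for the longer `pd.read_csv(...)` lines.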

In [None]:

!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn tensorflow keras torch torchvision \
    tqdm scikit-image pmdarima plotly cufflinks --user -q --no-warn-script-location
import IPython
IPython.Application.instance().kernel.do_shutdown(True)


In [None]:
# !gdown https://drive.google.com/uc?id=1xCiW2E_y10IcTDThaMBtVLuqbKbC8QuW

In [None]:
# !unzip climate_change.zip

## **Implementation**

In [None]:
import pandas as pd
import seaborn as sns
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import plotly.graph_objs as go
import cufflinks as cf
init_notebook_mode(connected=True)
cf.go_offline()

In [None]:
global_temp = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalTemperatures.csv',parse_dates=["dt"], index_col="dt")
global_temp.head()

In [None]:
global_temp.tail()

In [None]:
global_temp.describe()

In [None]:
global_temp.info()  # prints the column summary itself and returns None
global_temp.shape

In [None]:
global_temp.index

In [None]:
global_temp.isna().sum()
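
Raw counts of missing values do not show where the gaps sit in time. As a minimal sketch on synthetic data (the values below are made up, and `demo` is a hypothetical stand-in for `global_temp`), the share of missing values and the first date with a real measurement can be read off per column:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: monthly rows indexed by date, with one column
# that only starts being recorded later than the other.
dates = pd.date_range("1750-01-01", periods=6, freq="MS")
demo = pd.DataFrame(
    {
        "LandAverageTemperature": [3.0, 3.1, np.nan, 8.5, 11.6, 12.9],
        "LandAndOceanAverageTemperature": [np.nan, np.nan, np.nan, 14.1, 14.9, 15.3],
    },
    index=dates,
)

# Fraction of missing values in each column.
missing_share = demo.isna().mean()

# First date on which each column holds a real measurement.
first_valid = demo.apply(lambda col: col.first_valid_index())

print(missing_share)
print(first_valid)
```

Running the same two expressions on `global_temp` would show whether the gaps are concentrated in the earliest part of the record.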

In [None]:
sns.heatmap(global_temp.corr())

In [None]:
for itr in global_temp.columns:
    global_temp.plot(y=[itr], figsize=[20,10])

In [None]:
global_temp.plot(figsize=(30,10))

In [None]:
temp = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalLandTemperaturesByCountry.csv')
temp.head()

In [None]:
temp.tail()

In [None]:
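# Use max() so each date is represented by its hottest country;
# last() would pick whichever row happens to come last in the file.
ax = temp.groupby(['dt'])['AverageTemperature'].max().sort_values(ascending=False).head(10).sort_values().plot(kind='barh');
ax.set_xlabel("Avg Temp");
plt.title("Datewise Highest Average Temperature");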

In [None]:
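# Use mean() so countries are ranked by their average over the whole record;
# last() would rank them by their final recorded month only.
ax = temp.groupby(['Country'])['AverageTemperature'].mean().sort_values(ascending=False).head(10).sort_values().plot(kind='barh');
ax.set_xlabel("Avg Temp");
plt.title("Countries with Highest Average Temperature");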

In [None]:
new_df = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalTemperatures.csv')
new_df['year'] = pd.to_datetime( new_df['dt']).dt.year # Converting date into year and making new column.

by_new = new_df.groupby(['year'] )['LandAverageTemperature'].mean().reset_index()
new_pivot = by_new.pivot_table(values='LandAverageTemperature', index='year')
new_pivot.iplot(kind='scatter')
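
A yearly-mean plot shows the trend visually; a rolling mean and a least-squares line put a number on it. The sketch below runs on synthetic data with a known slope (the series `yearly` and its values are assumptions made for illustration, not the notebook's data), so the estimate can be checked against the value that was put in:

```python
import numpy as np
import pandas as pd

# Synthetic yearly means: a 0.01 degree/year upward trend plus an
# alternating wiggle that stands in for year-to-year noise.
years = np.arange(1900, 2000)
temps = 8.0 + 0.01 * (years - 1900) + 0.2 * (-1.0) ** (years - 1900)
yearly = pd.Series(temps, index=years, name="LandAverageTemperature")

# A centred 10-year rolling mean smooths out the short-term wiggle.
smoothed = yearly.rolling(window=10, center=True).mean()

# Least-squares straight line; the slope is the average change per year.
slope, intercept = np.polyfit(yearly.index.to_numpy(), yearly.to_numpy(), deg=1)
print(f"estimated trend: {slope:.4f} degrees per year")
```

On the real data, the same two steps could be applied to `by_new.set_index('year')['LandAverageTemperature']`. A straight line is only a rough summary if the warming is not uniform over the period.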

In [None]:
# .copy() avoids pandas' SettingWithCopyWarning when the new column is added
india = temp[temp['Country']=='India'].copy()
india['year'] = pd.to_datetime(india['dt']).dt.year

new_india = india.groupby('year')['AverageTemperature'].mean().reset_index()
new_india.iplot(kind='scatter', x='year', y='AverageTemperature', title='Temperature trend in India',
               xTitle='Year', yTitle='Temperature')


In [None]:
df_state = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalLandTemperaturesByState.csv')

In [None]:
state = df_state[df_state['Country']=='India']
state = state.groupby('State')['AverageTemperature'].mean().reset_index()
state.sort_values('AverageTemperature',inplace=True, )
state = state[:10]
state.iplot(kind='bar', x='State', y='AverageTemperature', title='Top 10 Coolest States',
           xTitle='State', yTitle='Temperature', color='deepskyblue')

In [None]:
state = df_state[df_state['Country']=='India']
state = state.groupby('State')['AverageTemperature'].mean().reset_index()
state.sort_values('AverageTemperature',inplace=True, ascending=False)
state = state[:10]
state.iplot(kind='bar', x='State', y='AverageTemperature', title='Top 10 Hottest States',
           xTitle='State', yTitle='Temperature')
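
The two cells above sort the full table and slice the first ten rows. An alternative sketch, shown here on made-up values (`demo` and its state names are hypothetical), uses `nsmallest` and `nlargest` on the grouped means to get both rankings from a single aggregation:

```python
import pandas as pd

# Synthetic stand-in for the Indian rows of the by-state file.
demo = pd.DataFrame(
    {
        "State": ["P", "P", "Q", "Q", "R", "S"],
        "AverageTemperature": [26.0, 27.0, 18.0, 18.4, 30.1, 22.0],
    }
)

# Mean temperature per state, computed once.
state_means = demo.groupby("State")["AverageTemperature"].mean()

# Pick the extremes directly instead of sorting and slicing twice.
coolest = state_means.nsmallest(2)
hottest = state_means.nlargest(2)

print(coolest)
print(hottest)
```

On the notebook's data, the `2` would become `10` to match the charts.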


In [None]:
df_city = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalLandTemperaturesByMajorCity.csv')

In [None]:
# .copy() avoids pandas' SettingWithCopyWarning when the new column is added
temp_df = df_city[df_city['City']== 'Calcutta'].copy()
temp_df['year'] = pd.to_datetime(temp_df['dt']).dt.year

by_year = temp_df.groupby('year')['AverageTemperature'].mean().reset_index()
by_year.iplot(kind='scatter', x='year', y='AverageTemperature', title='Temperature trend of Calcutta',
             xTitle='Year', yTitle='Temperature', legend=True)

In [None]:
global_temp = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalTemperatures.csv')

In [None]:
# keep only the columns needed; .copy() avoids SettingWithCopyWarning below
global_temp = global_temp[['dt', 'LandAverageTemperature']].copy()

global_temp['dt'] = pd.to_datetime(global_temp['dt'])
global_temp['year'] = global_temp['dt'].dt.year
global_temp['month'] = global_temp['dt'].dt.month

def get_season(month):
    if month >= 3 and month <= 5:
        return 'spring'
    elif month >= 6 and month <= 8:
        return 'summer'
    elif month >= 9 and month <= 11:
        return 'autumn'
    else:
        return 'winter'
    
min_year = global_temp['year'].min()
max_year = global_temp['year'].max()
years = range(min_year, max_year + 1)

global_temp['season'] = global_temp['month'].apply(get_season)

spring_temps = []
summer_temps = []
autumn_temps = []
winter_temps = []

for year in years:
    curr_years_data = global_temp[global_temp['year'] == year]
    spring_temps.append(curr_years_data[curr_years_data['season'] == 'spring']['LandAverageTemperature'].mean())
    summer_temps.append(curr_years_data[curr_years_data['season'] == 'summer']['LandAverageTemperature'].mean())
    autumn_temps.append(curr_years_data[curr_years_data['season'] == 'autumn']['LandAverageTemperature'].mean())
    winter_temps.append(curr_years_data[curr_years_data['season'] == 'winter']['LandAverageTemperature'].mean())
sns.set(style="whitegrid")
sns.set_color_codes("pastel")
f, ax = plt.subplots(figsize=(10, 6))

plt.plot(years, summer_temps, label='Summers average temperature', color='orange')
plt.plot(years, autumn_temps, label='Autumns average temperature', color='r')
plt.plot(years, spring_temps, label='Springs average temperature', color='g')
plt.plot(years, winter_temps, label='Winters average temperature', color='b')

plt.xlim(min_year, max_year)

ax.set_ylabel('Average temperature')
ax.set_xlabel('Year')
ax.set_title('Average temperature in each season')
legend = plt.legend(loc='center left', bbox_to_anchor=(1, 0.5), frameon=True, borderpad=1, borderaxespad=1)
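
The per-year loop above can also be written as a single `groupby`. The sketch below uses synthetic monthly values (the `demo` frame is a made-up stand-in) and the same month-to-season mapping. Like the loop, it averages January, February and December of the same calendar year as that year's winter.

```python
import pandas as pd


def get_season(month: int) -> str:
    """Same month-to-season mapping as the cell above."""
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    if 9 <= month <= 11:
        return "autumn"
    return "winter"


# Synthetic data: in 2000 the temperature equals the month number,
# in 2001 it is one degree higher.
dates = pd.date_range("2000-01-01", periods=24, freq="MS")
demo = pd.DataFrame(
    {
        "dt": dates,
        "LandAverageTemperature": [float(d.month + (d.year - 2000)) for d in dates],
    }
)
demo["year"] = demo["dt"].dt.year
demo["season"] = demo["dt"].dt.month.map(get_season)

# One row per year, one column per season.
seasonal = (
    demo.groupby(["year", "season"])["LandAverageTemperature"]
    .mean()
    .unstack("season")
)
print(seasonal)
```

Calling `seasonal.plot(figsize=(10, 6))` on the result would draw one line per season, which is one way to replace the four separate `plt.plot` calls.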

In [None]:
temp_by_country = pd.read_csv('https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/analyzing_climate/GlobalLandTemperaturesByCountry.csv')
countries = temp_by_country['Country'].unique()
max_min_list = []

# getting max and min temps
for country in countries:
    curr_temps = temp_by_country[temp_by_country['Country'] == country]['AverageTemperature']
    max_min_list.append((curr_temps.max(), curr_temps.min()))
    
# nan cleaning
res_max_min_list = []
res_countries = []

for i in range(len(max_min_list)):
    if not np.isnan(max_min_list[i][0]):
        res_max_min_list.append(max_min_list[i])
        res_countries.append(countries[i])

# calc differences        
differences = []

for tpl in res_max_min_list:
    differences.append(tpl[0] - tpl[1])
    
# sorting
differences, res_countries = (list(x) for x in zip(*sorted(zip(differences, res_countries), key=lambda pair: pair[0], reverse=True)))

# plotting
f, ax = plt.subplots(figsize=(8, 8))
sns.barplot(x=differences[:15], y=res_countries[:15], palette=sns.color_palette("coolwarm", 25), ax=ax)

texts = ax.set(ylabel="", xlabel="Temperature difference", title="Countries with the highest temperature differences")
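
The max/min loop, NaN cleaning and sorting above can be condensed with `groupby(...).agg`. This is a sketch on synthetic data (country names `A`, `B`, `C` and their values are made up), not a replacement the original author proposed:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the by-country file; country C has no measurements.
demo = pd.DataFrame(
    {
        "Country": ["A", "A", "A", "B", "B", "C", "C"],
        "AverageTemperature": [-5.0, 10.0, 25.0, 20.0, 28.0, np.nan, np.nan],
    }
)

# Per-country max and min; countries with no data come out as NaN and are dropped.
extremes = (
    demo.groupby("Country")["AverageTemperature"]
    .agg(["max", "min"])
    .dropna()
)

# Temperature range per country, largest first.
temp_range = (extremes["max"] - extremes["min"]).sort_values(ascending=False)
print(temp_range.head(15))
```

The resulting series could be passed to `sns.barplot` as `x=temp_range.head(15).values` and `y=temp_range.head(15).index` to draw the same kind of chart.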

## **Related Articles**

> * [Analyzing Climate Change](https://analyticsindiamag.com/time-series-analysis-on-climate-change/)

> * [TadGAN](https://analyticsindiamag.com/hands-on-guide-to-tadgan-with-python-codes/)

> * [Pastas](https://analyticsindiamag.com/guide-to-pastas-a-python-framework-for-hydrogeological-time-series-analysis/)

> * [Bitcoin Price Prediction](https://analyticsindiamag.com/guide-to-implementing-time-series-analysis-predicting-bitcoin-price-with-rnn/)

> * [Time Series Forecasting with Darts](https://analyticsindiamag.com/hands-on-guide-to-darts-a-python-tool-for-time-series-forecasting/)

> * [Guide to Time Series Forecasting with GluonTS](https://analyticsindiamag.com/gluonts-pytorchts-for-time-series-forecasting/)

