# Intro

In this notebook we'll look at some visualizations of temperature changes across the world. The temperature measurements here are relative to a baseline computed from temperatures between 1951 and 1980. For example, a value of 2 indicates a 2°C increase over the baseline for the area of interest. Based on current climate research, we should expect to see an upward trend somewhere between linear and exponential.
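
To make the baseline idea concrete, here is a minimal sketch of how such a relative value could be derived from absolute temperatures. The series is synthetic and the location is hypothetical; the dataset used in this notebook already ships the differences, so nothing here is taken from it.

```python
import numpy as np
import pandas as pd

# Synthetic yearly mean temperatures (°C) for a hypothetical location, 1951-2020.
years = np.arange(1951, 2021)
temps = pd.Series(14.0 + 0.02 * (years - 1951), index=years)

# Baseline: the mean over 1951-1980 (label slicing with .loc is inclusive).
baseline = temps.loc[1951:1980].mean()

# Anomaly: each year's deviation from that baseline.
anomaly = temps - baseline
print(anomaly.tail())
```

By construction, the anomalies average to zero over the baseline period, so positive values mean "warmer than 1951-1980".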

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
!pip install seaborn --upgrade
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.set_style('darkgrid')

# Primary Cleaning

In [None]:
df = pd.read_csv('/kaggle/input/temperature-change/Environment_Temperature_change_E_All_Data_NOFLAG.csv', encoding='latin-1')
df.head()

In [None]:
df.columns = df.columns.str.lower()
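# Strip only the leading 'y' of year columns (e.g. 'y1961' -> '1961'), not every 'y'
df.columns = df.columns.str.replace(r'^y(?=\d)', '', regex=True)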
df.drop(columns=['area code', 'element code', 'months code', 'unit'], inplace = True)
df.head()

In [None]:
# df.months.unique()

In [None]:
# df.area.unique()

In [None]:
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
       'August', 'September', 'October', 'November', 'December']
seasons = ['Winter', 'Spring', 'Summer', 'Fall']

In [None]:
seasons_replace = {'Dec\x96Jan\x96Feb': 'Winter', 'Mar\x96Apr\x96May': 'Spring', 'Jun\x96Jul\x96Aug': 'Summer', 'Sep\x96Oct\x96Nov': 'Fall', }
df.replace(seasons_replace, inplace=True)

# Countries Visualization

These functions give us region-specific DataFrames for easy analysis.
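
The reshaping these helpers rely on (index by period, transpose, drop the non-numeric rows) can be sketched on a toy frame. The area name and values below are made up; only the column layout mirrors the cleaned data.

```python
import pandas as pd

# Toy frame in the same wide layout as the cleaned data (values are made up).
toy = pd.DataFrame({
    'area': ['Testland', 'Testland'],
    'months': ['January', 'Meteorological year'],
    'element': ['Temperature change', 'Temperature change'],
    '1961': [0.1, 0.2],
    '1962': [0.3, 0.4],
})

# Periods become columns; the first two transposed rows ('area', 'element') are dropped.
tidy = toy.set_index('months').transpose().iloc[2:].copy()
tidy['year'] = tidy.index
tidy = tidy.reset_index(drop=True).astype('float')
print(tidy)
```

Each remaining row is one year, with one column per period plus a numeric `year` column, which is the shape the plotting code expects.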

In [None]:
def country_df(df, country):
    dfn = df[(df['element'] == 'Temperature change') & (df['area'] == country)]
    dfn = dfn.set_index('months').transpose()[2:]
    dfn['year'] = dfn.index
    dfn.reset_index(drop=True, inplace=True)
    dfn.index.names = [country]
    dfn = dfn.astype('float')
    return dfn

In [None]:
def seasons_df(df, country):
    dfn = df[(df['element'] == 'Temperature change') & (df['area'] == country)]
    # Reassign instead of renaming in place to avoid modifying a slice of df
    dfn = dfn.rename(columns={'months': 'seasons'})
    dfn = dfn.set_index('seasons').transpose()[2:]
    dfn['year'] = dfn.index
    dfn.drop(columns=months, inplace=True)
    dfn.reset_index(drop=True, inplace=True)
    dfn.index.names = [country]
    dfn = dfn.astype('float')
    return dfn

In [None]:
usa = country_df(df, 'United States of America')
usa.head()

This function will let us easily view temperature trends in a country for any of the provided time periods.

In [None]:
def country_plot(data, period):
    p = plt.figure(figsize=(8,8))
    sns.regplot(data=data, x='year', y=period, fit_reg=True, lowess=True, scatter_kws={'alpha':0.2}, line_kws={'lw':2, 'alpha':0.75})
    plt.ylabel('∆ °C', rotation=0)
    plt.title(data.index.name)

In [None]:
country_plot(usa, 'Meteorological year')

In [None]:
afg = country_df(df, 'Afghanistan')
country_plot(afg, 'Meteorological year')

In [None]:
ger = country_df(df, 'Germany')
country_plot(ger, 'Meteorological year')

As expected, we see a clear increase in temperature over the years in our sampled countries.
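
One way to put a number on "a clear increase" is a least-squares line through the yearly values. The sketch below uses synthetic anomalies with a known trend, not the notebook's data, and a straight line is only a rough summary if the true trend is curved.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1961, 2020)

# Synthetic anomalies: a 0.02 °C/year trend plus Gaussian noise (made-up values).
anomaly = 0.02 * (years - 1961) - 0.2 + rng.normal(0, 0.1, years.size)

# Degree-1 fit returns the slope first, then the intercept.
slope, intercept = np.polyfit(years, anomaly, deg=1)
print(f'{slope * 10:.2f} °C per decade')
```

On the real frames, the same call would presumably take `usa['year']` and `usa['Meteorological year']`, assuming those columns contain no missing values.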

# Continents Visualization

Let's zoom out and take a look at the continental temperature changes. 

In [None]:
africa = seasons_df(df, 'Africa')
asia = seasons_df(df, 'Asia')
europe = seasons_df(df, 'Europe')
north_america = seasons_df(df, 'Northern America')
south_america = seasons_df(df, 'South America')
australia = seasons_df(df, 'Australia')
antarctica = seasons_df(df, 'Antarctica')

In [None]:
continents_str=["Africa","Asia","Europe","Northern America","South America","Australia","Antarctica"]
continents=[africa, asia, europe, north_america, south_america, australia, antarctica]

In [None]:
africa.head()

In [None]:
cont_yearly = pd.concat(continents, axis=1, ignore_index=False)
cont_yearly.drop(columns=seasons, inplace=True)
cont_yearly.set_index(africa['year'], inplace=True)
cont_yearly.drop('year', axis=1, inplace=True)
cont_yearly.columns = continents_str
cont_yearly.head()

In [None]:
sns.set_palette(sns.color_palette('muted', 7))
colors = sns.color_palette('muted')

In [None]:
violins = plt.figure(figsize=(15,10))
# Recent seaborn releases deprecate `bw` in favor of `bw_method`
sns.violinplot(data=cont_yearly, inner='quartile', cut=0, bw_method=0.3)
plt.ylabel('∆ °C', rotation=0)
plt.title('Continental Temperature Shifts  \nper year average')
plt.show()

According to our plot, all continents have a median temperature change above 0 (the inner lines mark quartiles, not the mean), with some upper quartiles reaching near or above 1°C.
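
The quartile lines drawn inside each violin can also be read off numerically. A minimal sketch with made-up values and hypothetical column names:

```python
import pandas as pd

# Made-up yearly anomalies (°C) for two hypothetical regions.
toy = pd.DataFrame({
    'Continent A': [0.0, 0.5, 1.0, 1.5, 2.0],
    'Continent B': [-0.4, 0.0, 0.2, 0.4, 0.8],
})

# Rows are the 25th, 50th (median), and 75th percentiles; columns are regions.
quartiles = toy.quantile([0.25, 0.5, 0.75])
print(quartiles)
```

Calling `cont_yearly.quantile([0.25, 0.5, 0.75])` should give the corresponding table for the continents plotted above.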

Let's look at the continents per season.

In [None]:
def continent_season_plot(season, axes=None, subplot=False):
    if not subplot:
        p = plt.figure(figsize=(10,10))
    for con, c in zip(continents, colors):
        sns.regplot(ax=axes, data=con, x='year', y=season, fit_reg=True, lowess=True, label=con.index.name, 
                    scatter_kws={'alpha':0.2}, ci=None, color=c, line_kws={'lw':2, 'alpha':0.75})
    if not subplot:
        plt.ylabel('∆ °C', rotation=0)
        plt.title(f'{season} ∆ Continental Temperatures')
        plt.legend(loc='best', frameon=False)
    else:
        axes.set_ylabel('∆ °C', rotation=0)
        axes.set_title(f'{season} ∆ Continental Temperatures')
        axes.legend(loc='upper left', frameon=True)

In [None]:
continent_season_plot('Winter')

In [None]:
fig, ax = plt.subplots(2,2, constrained_layout=True, figsize=(15,15))
fig.suptitle('Seasonal ∆ Continental Temperatures')
# ax.flat walks the 2x2 grid row by row, matching the order of `seasons`
for season, axis in zip(seasons, ax.flat):
    continent_season_plot(season, axis, subplot=True)
    axis.set_ylim(-3.5, 3.5)
plt.show()

Although the degree of increase varies between continents, it's clear that all have seen increasing temperatures in all seasons. Most appear to be consistently above 1°C in recent years.

# World Visualization

Now let's view the world as a whole. 
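
Note that the cell below averages the seven regions with equal weight. An area-weighted mean would be a different estimate; the sketch below contrasts the two using made-up anomalies and hypothetical weights that are not taken from the dataset.

```python
import numpy as np

# Made-up anomalies (°C) for three regions in a single year.
anomalies = np.array([1.0, 0.5, 2.0])

# Hypothetical relative land areas; not taken from the dataset.
areas = np.array([3.0, 1.0, 1.0])

unweighted = anomalies.mean()                    # every region counts equally
weighted = np.average(anomalies, weights=areas)  # larger regions count more
print(unweighted, weighted)
```

The equal-weight version is simpler and is what this notebook uses; it lets small regions influence the "world" series as much as large ones.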

In [None]:
# Unweighted mean across the continent frames (each region counts equally)
world = sum(continents) / len(continents)
world.index.name = 'world'
world.head()

In [None]:
plt.figure(figsize=(8,8))
for s in seasons:
    sns.regplot(data=world, x='year', y=s, fit_reg=True, lowess=True, label=s, scatter_kws={'alpha':0.2}, line_kws={'lw':2, 'alpha':0.75})
plt.gca().set_ylabel('∆ °C', rotation=0)
plt.gca().set_title('World ∆ Seasonal Temperatures')
plt.legend(loc='best', frameon=False)
plt.show()

As expected, we see a clear increase for the world as a whole.