## Final Project

The NOAA Merged Land Ocean Global Surface Temperature Analysis Dataset (NOAAGlobalTemp, formerly known as MLOST) merges two datasets into a single land–ocean surface temperature analysis. NOAAGlobalTemp is a spatially gridded (5° × 5°) global surface temperature dataset with monthly resolution from January 1880 to the present. It combines a global sea surface temperature (SST) dataset with a global land surface air temperature dataset to cover both the Earth's land and ocean surfaces; the current version is v5. The Extended Reconstructed Sea Surface Temperature (ERSST) version 5 provides the SST observations, and the land surface air temperature observations come from the Global Historical Climatology Network Monthly (GHCN-Monthly) database, version 4.

#### Background notes from https://www.ncdc.noaa.gov/monitoring-references/faq/anomalies.php

##### Anomaly
The term temperature anomaly means a departure from a reference value or long-term average. A positive anomaly indicates that the observed temperature was warmer than the reference value, while a negative anomaly indicates that the observed temperature was cooler than the reference value.

##### Why use anomalies?
Absolute estimates of global average surface temperature are difficult to compile for several reasons. Some regions have few temperature measurement stations (e.g., the Sahara Desert) and interpolation must be made over large, data-sparse regions. In mountainous areas, most observations come from the inhabited valleys, so the effect of elevation on a region's average temperature must be considered as well. For example, a summer month over an area may be cooler than average, both at a mountain top and in a nearby valley, but the absolute temperatures will be quite different at the two locations. The use of anomalies in this case will show that temperatures for both locations were below average.

Using reference values computed on smaller [more local] scales over the same time period establishes a baseline from which anomalies are calculated. This effectively normalizes the data so they can be compared and combined to more accurately represent temperature patterns with respect to what is normal for different places within a region.

For these reasons, large-area summaries incorporate anomalies, not the temperature itself. Anomalies more accurately describe climate variability over larger areas than absolute temperatures do, and they give a frame of reference that allows more meaningful comparisons between locations and more accurate calculations of temperature trends.
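
The mountain-and-valley example above can be made concrete with a small sketch. The station names and temperatures below are made up for illustration and are not taken from NOAAGlobalTemp; the point is only that two sites with very different absolute temperatures can share the same anomalies once each is compared with its own baseline.

```python
import pandas as pd

# Hypothetical July mean temperatures (degrees C) for two nearby stations.
obs = pd.DataFrame({
    "station": ["valley", "valley", "valley", "summit", "summit", "summit"],
    "year": [2001, 2002, 2003, 2001, 2002, 2003],
    "temp_c": [24.0, 25.0, 23.0, 9.0, 10.0, 8.0],
})

# Baseline: each station's own mean over the (toy) reference period.
baseline = obs.groupby("station")["temp_c"].transform("mean")

# Anomaly: departure of each observation from its station's baseline.
obs["anom"] = obs["temp_c"] - baseline

print(obs)
```

In 2003 both stations show an anomaly of -1.0 even though their absolute temperatures differ by 15 degrees, which is why anomalies from different places can be averaged together.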

In [44]:
import pandas as pd
import numpy as np

In [6]:
# The conversion is in another notebook
df_raw = pd.read_csv('NOAAGlobalTemp_v5.0.0_gridded_s188001_e202010_c20201108T133314.csv')

### Global temp anomaly in 1880
Some areas of the world have no temperature record for 1880, so grid cells with missing anomalies are dropped before averaging.

In [23]:
# Match the year at the start of the 'YYYY-MM-DD' time string
df_1880 = df_raw[df_raw['time'].str.startswith('1880')]
df_1880 = df_1880[['lat', 'lon', 'anom']]
df_1880 = df_1880.dropna()
df_1880_mean = df_1880.groupby(['lat', 'lon']).mean().reset_index()
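
To put a number on how much of the grid is missing in a given year, one option is to compare the count of grid cells that have at least one non-missing anomaly against the count of all grid cells present for that year. The sketch below is illustrative only: `coverage_fraction` is a hypothetical helper, the toy frame stands in for `df_raw`, and it assumes missing cells are stored as rows with `NaN` in `anom` (as the `dropna()` call above suggests).

```python
import numpy as np
import pandas as pd

def coverage_fraction(df: pd.DataFrame, year: str) -> float:
    """Fraction of grid cells with at least one non-missing anomaly in `year`."""
    in_year = df[df["time"].str.startswith(year)]
    # Every (lat, lon) pair that appears for this year, observed or not.
    total_cells = in_year.groupby(["lat", "lon"]).ngroups
    # Only the pairs that have at least one real anomaly value.
    observed_cells = in_year.dropna(subset=["anom"]).groupby(["lat", "lon"]).ngroups
    return observed_cells / total_cells

# Toy stand-in for df_raw: two grid cells, one of which is never observed.
toy = pd.DataFrame({
    "time": ["1880-01-01", "1880-01-01", "1880-02-01", "1880-02-01"],
    "lat": [2.5, 7.5, 2.5, 7.5],
    "lon": [2.5, 2.5, 2.5, 2.5],
    "anom": [0.1, np.nan, -0.2, np.nan],
})

print(coverage_fraction(toy, "1880"))
```

Calling the same helper with `'1880'` and `'2020'` on the real data would give a rough sense of how coverage changed between the two maps.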

In [33]:
import plotly.express as px
fig = px.density_mapbox(df_1880_mean, lat='lat', lon='lon', z='anom', radius=15,
                        center=dict(lat=0, lon=180), zoom=0,
                        mapbox_style="stamen-terrain")
fig.show()

### Global temp anomaly in 2020

In [35]:
# Match the year at the start of the 'YYYY-MM-DD' time string
df_2020 = df_raw[df_raw['time'].str.startswith('2020')]
df_2020 = df_2020[['lat', 'lon', 'anom']]
df_2020 = df_2020.dropna()
df_2020_mean = df_2020.groupby(['lat', 'lon']).mean().reset_index()

In [36]:
import plotly.express as px
fig = px.density_mapbox(df_2020_mean, lat='lat', lon='lon', z='anom', radius=15,
                        center=dict(lat=0, lon=180), zoom=0,
                        mapbox_style="stamen-terrain")
fig.show()

### A comparison of data in 1880 and 2020

In [42]:
# The distribution of temperature anomaly in 1880 and 2020
import plotly.figure_factory as ff

data_1880_2020 = [df_1880_mean['anom'], df_2020_mean['anom']]
labels = ['Temp Anomaly in 1880', 'Temp Anomaly in 2020']
fig = ff.create_distplot(data_1880_2020, labels)
fig.show()

### World average temp anomaly from 1880 - 2020

In [None]:
df_raw[['Year', 'Month', 'Day']] = df_raw['time'].str.split('-', expand = True)

In [71]:
df_year_mean = df_raw[['Year', 'anom']]
df_year_mean = df_year_mean.groupby(['Year']).mean().reset_index()

In [72]:
import plotly.express as px

fig = px.line(df_year_mean, x="Year", y="anom", title='World average temp anomaly from 1880 - 2020')
fig.show()
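
One caveat with the plain mean used here: on a 5° × 5° latitude-longitude grid, cells near the poles cover far less area than cells near the equator, so an unweighted mean over-counts high latitudes. A common approximation is to weight each cell by the cosine of its latitude. The sketch below shows that idea on a toy frame; `area_weighted_mean` is a hypothetical helper, and this is not necessarily the procedure NOAA uses for its published global series.

```python
import numpy as np
import pandas as pd

def area_weighted_mean(df: pd.DataFrame) -> float:
    """Mean of 'anom' weighted by cos(latitude), ignoring missing cells."""
    valid = df.dropna(subset=["anom"])
    # Cell area on a regular lat-lon grid is roughly proportional to cos(lat).
    weights = np.cos(np.deg2rad(valid["lat"]))
    return float(np.average(valid["anom"], weights=weights))

# Toy example: one equatorial cell and one cell at 60 degrees north.
toy = pd.DataFrame({
    "lat": [0.0, 60.0],
    "lon": [2.5, 2.5],
    "anom": [1.0, 0.0],
})

print(area_weighted_mean(toy))
```

For the toy frame the weighted mean is about 0.67, compared with an unweighted mean of 0.5, because the 60° cell gets half the weight of the equatorial one. On the real data the helper could be applied per year with `df_raw.groupby('Year').apply(area_weighted_mean)`.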

### World temp anomaly in each year 

In [74]:
df_each_year = df_raw[['Year', 'Month', 'anom']].dropna()
df_each_year = df_each_year.groupby(['Year', 'Month']).mean().reset_index()

In [78]:
import plotly.express as px

fig = px.line(df_each_year, x="Month", y="anom",
              color="Year")
fig.show()

With one line per year the plot is hard to read, so next we plot only the 5 coldest and 5 warmest years.

In [79]:
# Average only the numeric 'anom' column; 'Month' is a string and cannot be averaged
df_each_year_2 = df_each_year.groupby('Year')['anom'].mean().reset_index()

In [94]:
df_each_year_2_ascend = df_each_year_2.sort_values(by='anom', ascending=True)
warm_5_year = list(df_each_year_2_ascend['Year'])[-5:]
cold_5_year = list(df_each_year_2_ascend['Year'])[:5]
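
As an aside, the same selection can be written without a full sort by using `DataFrame.nlargest` and `DataFrame.nsmallest`. The sketch below uses a made-up yearly table with the same `Year` and `anom` column names; the values are illustrative, not results from the dataset.

```python
import pandas as pd

# Hypothetical yearly mean anomalies, standing in for df_each_year_2.
yearly = pd.DataFrame({
    "Year": ["1900", "1950", "2000", "2016"],
    "anom": [-0.3, -0.1, 0.4, 0.9],
})

# Pick the two warmest and two coldest years directly.
warmest = yearly.nlargest(2, "anom")["Year"].tolist()
coldest = yearly.nsmallest(2, "anom")["Year"].tolist()

print(warmest, coldest)
```

Note that `nlargest` returns the warmest year first, whereas slicing the ascending sort returns it last; only the order of the plotted traces is affected.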

In [101]:
import plotly.graph_objects as go

df_warm_5 = df_each_year[df_each_year['Year'].isin(warm_5_year)]
df_cold_5 = df_each_year[df_each_year['Year'].isin(cold_5_year)]

month = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
         'August', 'September', 'October', 'November', 'December']
fig = go.Figure()
for year in warm_5_year:
    # Exact match on the year string rather than a substring search
    df_that_year = df_warm_5[df_warm_5['Year'] == year]
    anom = list(df_that_year['anom'])
    # Slice the month labels in case a year has fewer than 12 months of data
    fig.add_trace(go.Scatter(x=month[:len(anom)], y=anom, name='Temp anom in ' + year,
                             line=dict(color='firebrick', width=2)))
for year in cold_5_year:
    df_that_year = df_cold_5[df_cold_5['Year'] == year]
    anom = list(df_that_year['anom'])
    fig.add_trace(go.Scatter(x=month[:len(anom)], y=anom, name='Temp anom in ' + year,
                             line=dict(color='royalblue', width=2)))

# Edit the layout
fig.update_layout(title='Temp anomaly in each year for the 5 warmest and coldest years',
                   xaxis_title='Month',
                   yaxis_title='Temperature Anomaly (degrees C)')

fig.show()