# Keeling Curve - Time Series 

Author: Harry Yau

This notebook investigates time series forecasting with the Facebook Prophet package (https://facebook.github.io/prophet/). The dataset is the carbon dioxide (CO<sub>2</sub>) record from the Mauna Loa Observatory, a monitoring station that has been recording data since 1958. The record is known as the Keeling Curve, named after the scientist Charles David Keeling, and it shows the accumulation of CO<sub>2</sub> in the atmosphere. The Keeling Curve is famous for drawing attention to the rise in atmospheric CO<sub>2</sub> concentrations.

For more information about the Keeling Curve, see the Wikipedia page: https://en.wikipedia.org/wiki/Keeling_Curve

In [1]:
import pandas as pd
import numpy as np

import datetime
import matplotlib.pyplot as plt


import warnings
warnings.filterwarnings('ignore')

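#The package was renamed from fbprophet to prophet in version 1.0.
try:
    from prophet import Prophet
except ImportError:
    from fbprophet import Prophet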

#### Loading the Data

The data is downloaded from this website: http://scrippsco2.ucsd.edu/data/atmospheric_co2/mlo. The monthly record is used.

In [None]:
url = "http://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv"

data = pd.read_csv(url, skiprows=56, na_values=-99.99)

#Change the column names. The column names in the data set span two rows.
colname = ['Yr', 'Mn', 'Excel Date', 'Date', 'CO2 (PPM)', 'CO2 Seasonally Adjusted (PPM)', 
           'CO2 Fit (PPM)', 'CO2 Seasonally Adjusted Fit (PPM)', 'CO2 Filled (PPM)', 'CO2 Seasonally Adjusted Filled (PPM)']
data.columns = colname

In [None]:
data.head()

We will use the column 'CO2 Filled (PPM)' as the monthly CO<sub>2</sub> data. For a detailed explanation of the different CO<sub>2</sub> columns, download the csv file and read rows 39 to 51.
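
The loading step passes `na_values=-99.99`, so that sentinel is read as NaN. The sketch below illustrates the effect on a tiny in-memory CSV; the column names are shortened and the numbers are made up for illustration, so it is not a copy of the real file.

```python
import io
import pandas as pd

# Hypothetical three-row sample; the numbers are illustrative, not real measurements.
sample_csv = io.StringIO(
    "Yr,Mn,CO2,CO2 Filled\n"
    "1958,1,-99.99,-99.99\n"
    "1958,3,315.71,315.71\n"
    "1958,6,-99.99,317.27\n"
)

# The sentinel -99.99 is mapped to NaN while the file is parsed.
sample = pd.read_csv(sample_csv, na_values=-99.99)

# Count missing values per column after the sentinel is mapped to NaN.
missing_counts = sample.isna().sum()
print(missing_counts)
```

In this made-up sample the raw column has a gap in the middle, while the filled column is missing only in the first row, which is the pattern that makes the filled column convenient for forecasting.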

#### Convert Dates to Time Stamp

In [None]:
data['TimeStamp'] = pd.to_datetime(data['Yr'].astype(str).str.cat(data['Mn'].astype(str), sep='-'))
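
The string concatenation above produces values such as `1958-3`, which pandas is expected to read as the first day of that month. As an alternative sketch, not the notebook's method, the timestamp can be assembled from numeric columns, which avoids string parsing. The small `sample` frame is hypothetical.

```python
import pandas as pd

# Hypothetical year/month pairs standing in for the 'Yr' and 'Mn' columns.
sample = pd.DataFrame({"Yr": [1958, 1958, 1959], "Mn": [3, 12, 1]})

# pd.to_datetime can assemble dates from columns named year, month and day;
# day=1 pins every observation to the first of its month.
sample["TimeStamp"] = pd.to_datetime(
    pd.DataFrame({"year": sample["Yr"], "month": sample["Mn"], "day": 1})
)
print(sample)
```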

#### Creating the DataFrame for Prophet

Prophet's `fit` method requires the input DataFrame to have two columns named 'ds' and 'y'.

- ds: A column of timestamps
- y: The values that the time series forecast will be based on
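
As a minimal illustration of that layout, the sketch below builds a small frame with hypothetical values, renames its columns to 'ds' and 'y', and drops rows without a target value. The names `raw` and `example` are made up for this example.

```python
import pandas as pd

# Hypothetical monthly values; the numbers are illustrative only.
raw = pd.DataFrame(
    {
        "TimeStamp": pd.to_datetime(["1958-03-01", "1958-04-01", "1958-05-01"]),
        "CO2 Filled (PPM)": [315.71, 317.45, None],
    }
)

# Rename to the names Prophet expects, then drop rows without a target value.
example = (
    raw.rename(columns={"TimeStamp": "ds", "CO2 Filled (PPM)": "y"})
    .dropna(subset=["y"])
    .reset_index(drop=True)
)
print(example.dtypes)
```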

In [None]:
df_prophet = data.loc[:, ['TimeStamp', 'CO2 Filled (PPM)']].copy()
df_prophet.columns = ['ds', 'y']
df_prophet.dropna(inplace=True) #The 'filled' data is being used, so rows of NaN will only occur at the top and bottom of the dataset.
df_prophet.reset_index(drop=True, inplace=True)

In [None]:
df_prophet.head()

#### Fitting the Prophet Model

In [None]:
keeling_model = Prophet(interval_width=0.95, yearly_seasonality=True, weekly_seasonality=False, daily_seasonality=False)
keeling_model.fit(df_prophet)

#### Forecasting using Prophet

We will forecast the Keeling Curve 20 years into the future at a monthly frequency.
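
The timestamps built from the year and month columns fall on the first of each month, so the frequency chosen for the future dates decides whether the forecast continues that pattern. The pandas-only sketch below compares month-start and month-end offsets; the start date is hypothetical.

```python
import pandas as pd

# Hypothetical last observed month, stamped on the first of the month.
last_observed = pd.Timestamp("2020-01-01")

# Month-start offsets continue the first-of-month pattern of the observations.
month_start = pd.date_range(
    start=last_observed, periods=4, freq=pd.offsets.MonthBegin()
)

# Month-end offsets land on the last day of each month instead.
month_end = pd.date_range(
    start=last_observed, periods=4, freq=pd.offsets.MonthEnd()
)

print(month_start)
print(month_end)
```

With month-end dates, each forecast point sits almost a full month later than a first-of-month stamp for the same calendar month, which matters when the model has a yearly seasonal component.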

In [None]:
year = 20
fcst_period = year * 12

#'MS' (month start) matches the first-of-month timestamps built from 'Yr' and 'Mn'.
future_dates = keeling_model.make_future_dataframe(periods=fcst_period, freq='MS')
keeling_fcst = keeling_model.predict(future_dates)
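
The frame returned by `predict` covers the historical dates as well as the future ones, with the point forecast in `yhat` and the interval bounds in `yhat_lower` and `yhat_upper`. The sketch below uses a hand-made stand-in for that frame (all values hypothetical) to show one way of keeping only the future rows, by comparing dates rather than relying on row positions.

```python
import pandas as pd

# Hand-made stand-in for Prophet's forecast output; values are hypothetical.
forecast = pd.DataFrame(
    {
        "ds": pd.date_range("2019-11-01", periods=5, freq=pd.offsets.MonthBegin()),
        "yhat": [410.0, 411.2, 412.5, 413.1, 413.9],
        "yhat_lower": [409.5, 410.6, 411.7, 412.0, 412.5],
        "yhat_upper": [410.5, 411.8, 413.3, 414.2, 415.3],
    }
)
last_observed = pd.Timestamp("2020-01-01")  # hypothetical end of the history

# Keep only rows dated after the last observation.
future_only = forecast[forecast["ds"] > last_observed]
print(future_only[["ds", "yhat", "yhat_lower", "yhat_upper"]])
```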

In [None]:
#Font Settings
font_size_title = 20
font_size_label = 15
font_size_tick = 12

#Index of the first forecast-only row; the rows before it cover the observed dates
first_idx = df_prophet.shape[0]

plt.figure(figsize=(8,8))

plt.fill_between(keeling_fcst['ds'][first_idx:], keeling_fcst['yhat_lower'][first_idx:], 
                 keeling_fcst['yhat_upper'][first_idx:], color='lightcoral', alpha = 0.5)
plt.plot(keeling_fcst['ds'][first_idx:], keeling_fcst['yhat'][first_idx:], color='red', linestyle ='dashed')
plt.plot(df_prophet['ds'], df_prophet['y'], 'k-') #'ds' is already a datetime column
plt.xlabel('Year', fontsize=font_size_label)
plt.ylabel('CO$_2$ Concentration (ppm)', fontsize=font_size_label)
plt.title('Monthly Average CO$_2$ Concentration', fontsize=font_size_title)
plt.xticks(fontsize=font_size_tick)
plt.yticks(fontsize=font_size_tick)
plt.show();

#### Plotly Version of the Chart

In [None]:
import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode()

In [None]:
#Font Settings
font_size_title = 20
font_size_label = 15
font_size_tick = 12

#Index of the first forecast-only row; the rows before it cover the observed dates
first_idx = df_prophet.shape[0]

fig = go.Figure()

#Add traces
fig.add_trace(go.Scatter(x=keeling_fcst['ds'][first_idx:], 
                         y=keeling_fcst['yhat_lower'][first_idx:], 
                         mode='lines', 
                         line={'color': 'lightcoral'},
                         hoverinfo='skip',
                         showlegend=False
                        ))

fig.add_trace(go.Scatter(x=keeling_fcst['ds'][first_idx:], 
                         y=keeling_fcst['yhat_upper'][first_idx:], 
                         fill='tonexty',
                         fillcolor='rgba(240,128,128,0.5)', #Opacity only accepts RGBA
                         mode='lines', 
                         line={'color': 'lightcoral'},
                         hoverinfo='skip', 
                         name='95% Uncertainty Interval'
                        ))

fig.add_trace(go.Scatter(x=df_prophet['ds'], 
                         y=df_prophet['y'], 
                         mode='lines',
                         line={'color': 'black'},
                         hoverinfo='x+y', #Flags are joined with '+'; a list would be applied per point
                         name='Observed'
                        ))

fig.add_trace(go.Scatter(x=keeling_fcst['ds'][first_idx:], 
                         y=keeling_fcst['yhat'][first_idx:].round(2), 
                         mode='lines', 
                         line={'dash': 'dash', 'color': 'red'}, 
                         name='Projected'
                        ))

fig.update_layout(
    
    title=go.layout.Title(
        text = 'Monthly Average CO<sub>2</sub> Concentration',
        font=dict(size=font_size_title)
     ),
    
    xaxis=go.layout.XAxis(
        title=go.layout.xaxis.Title(
            text = 'Year',
            font = dict(size=font_size_label)
        )
    ),
    
    yaxis=go.layout.YAxis(
        title=go.layout.yaxis.Title(
            text='CO<sub>2</sub> Concentration',
            font = dict(size=font_size_label)
        ),
        ticksuffix=' ppm', 
        showtickprefix='first',
        showticksuffix='last'
    ),
    
    legend=go.layout.Legend(
        x=0,
        y=1
    )
    
)

plotly.offline.iplot(fig, filename = 'basic-line')