# Machine Learning - Simple Time Series Forecasting Models

In this notebook, we are going to explore a couple of different machine learning models to predict time-series data.

Here is a link to all articles/tutorials:
 - [Time Series Archive](http://machinelearningmastery.com/category/time-series/)
 
Here are links to specific articles:
 - [How to Make Out-of-Sample Forecasts with ARIMA in Python](http://machinelearningmastery.com/make-sample-forecasts-arima-python/)
 - [Sensitivity Analysis of History Size to Forecast Skill with ARIMA in Python](http://machinelearningmastery.com/sensitivity-analysis-history-size-forecast-skill-arima-python/)
 - [Feature Selection for Time Series Forecasting with Python](http://machinelearningmastery.com/feature-selection-time-series-forecasting-python/)
 - [Simple Time Series Forecasting Models to Test So That You Don’t Fool Yourself](http://machinelearningmastery.com/simple-time-series-forecasting-models/)
 - [Autoregression Models for Time Series Forecasting With Python](http://machinelearningmastery.com/autoregression-models-time-series-forecasting-python/)

## Simple Time Series Forecasting Models

We will start by testing a few simple time series forecasting techniques, working through the following five steps:

1. Dataset Description: An overview of the standard time series dataset we will use.
2. Test Setup: How we will evaluate forecast models in this tutorial.
3. Persistence Forecast: The persistence forecast and how to automate it.
4. Expanding Window Forecast: The expanding window forecast and how to automate it.
5. Rolling Window Forecast: The rolling window forecast and how to automate it.

### Dataset Description

In [1]:
import pandas as pd
import numpy as np

from sklearn.metrics import mean_squared_error

import matplotlib.pyplot as plt
%matplotlib inline

import plotly.offline as py
py.init_notebook_mode(connected=True)

# import only the plotly classes used below instead of a wildcard import
from plotly.graph_objs import Scatter, Layout, Figure

In [2]:
# load dataset
data = pd.read_csv('data/slo_weather_history.csv', index_col=0)

# display first few rows
data.head()

date,dew_point_f_avg,dew_point_f_high,dew_point_f_low,events,humidity_%_avg,humidity_%_high,humidity_%_low,precip_in_sum,sea_level_press_in_avg,sea_level_press_in_high,sea_level_press_in_low,temp_f_avg,temp_f_high,temp_f_low,visibility_mi_avg,visibility_mi_high,visibility_mi_low,wind_gust_mph_high,wind_mph_avg,wind_mph_high
2012-01-01,44.0,50.0,34.0,Fog,80.0,100.0,25.0,0.0,30.15,30.23,30.08,56.0,73.0,39.0,6.0,10.0,0.0,0.0,1.0,8.0
2012-01-02,47.0,52.0,43.0,Fog,93.0,100.0,63.0,0.0,30.23,30.3,30.19,52.0,63.0,42.0,4.0,10.0,0.0,0.0,3.0,14.0
2012-01-03,43.0,50.0,37.0,Fog,85.0,100.0,32.0,0.01,30.24,30.28,30.17,58.0,77.0,39.0,6.0,10.0,0.0,0.0,2.0,10.0
2012-01-04,42.0,47.0,37.0,,69.0,96.0,33.0,0.0,30.24,30.3,30.2,56.0,73.0,39.0,10.0,10.0,8.0,0.0,1.0,9.0
2012-01-05,42.0,51.0,36.0,,66.0,93.0,23.0,0.0,30.15,30.22,30.09,60.0,78.0,42.0,10.0,10.0,7.0,22.0,4.0,18.0


In [3]:
def plot_helper(x, traces, trace_names, title):
    """Plot one or more traces against a shared x axis."""
    # build one line per trace, labelled with the matching name
    plot_data = [
        Scatter(x=np.array(x), y=trace, name=name)
        for trace, name in zip(traces, trace_names)
    ]

    layout = Layout(title=title)

    # assemble the figure and render it inline
    fig = Figure(data=plot_data, layout=layout)
    py.iplot(fig)

### Persistence Forecast

The persistence forecast involves using the previous observation to predict the next time step.

For this reason, the approach is often called the naive forecast.

Instead of blindly using the previous observation, in this section we automate the persistence forecast and evaluate how well any arbitrary prior time step predicts the next one.

We will explore using each of the prior 730 days (2 years) of point observations in a persistence model. Each configuration will be evaluated using the test harness and RMSE scores collected. We will then display the scores and graph the relationship between the persisted time step and the model skill.
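To make the idea concrete before running it on the full dataset, here is a minimal sketch of a lag-`p` persistence forecast on a tiny series. The helper name `persistence_forecast` and the toy values are invented for illustration and are not part of the weather data.

```python
import numpy as np

def persistence_forecast(train, test, lag):
    """Walk-forward persistence: predict each test value with the value `lag` steps earlier."""
    history = list(train)
    predictions = []
    for obs in test:
        # the forecast is simply the observation `lag` steps back
        predictions.append(history[-lag])
        # the true value becomes available only after the forecast is made
        history.append(obs)
    return predictions

# toy series, invented for illustration only
toy_series = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 14.0])
toy_train, toy_test = toy_series[:4], toy_series[4:]

preds_lag1 = persistence_forecast(toy_train, toy_test, lag=1)  # uses t-1
preds_lag2 = persistence_forecast(toy_train, toy_test, lag=2)  # uses t-2

# root mean squared error for each lag
rmse_lag1 = float(np.sqrt(np.mean((toy_test - np.array(preds_lag1)) ** 2)))
rmse_lag2 = float(np.sqrt(np.mean((toy_test - np.array(preds_lag2)) ** 2)))
```

Note that `lag` must not exceed the length of the training history, otherwise `history[-lag]` raises an `IndexError`.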

In [4]:
# split into train and test sets
X = data['temp_f_low'].values
train, test = X[0: -730], X[-730:]

In [5]:
persistence_values = range(1, 731)
scores = list()

for p in persistence_values:
    # walk-forward validation
    history = [x for x in train]
    predictions = list()
    
    for i in range(len(test)):
        # make prediction
        yhat = history[-p]
        predictions.append(yhat)
        
        # observation
        history.append(test[i])
    
    # record performance for this lag
    rmse = np.sqrt(mean_squared_error(test, predictions))
    scores.append(rmse)
    # print('p=%d RMSE:%.3f' % (p, rmse))

In [6]:
# plot scores over persistence values
plot_helper(persistence_values, [scores], ['RMSE'], 'Persisted Observation to RMSE on the Daily Temperature for San Luis Obispo, CA')

Unfortunately, the results show that no longer lag beats the simplest choice: the best result comes from t-1, with an RMSE of 2.970 °F. The second-best result comes from t-365, with an RMSE of 6.568 °F.
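Rather than reading the best lags off the plot, they can be ranked programmatically. The sketch below assumes a list of lags and a matching list of RMSE scores; the helper name `rank_lags` and the toy numbers are hypothetical and do not come from the weather data.

```python
import numpy as np

def rank_lags(lags, scores, top_n=3):
    """Return the `top_n` (lag, score) pairs with the lowest scores."""
    order = np.argsort(scores)[:top_n]
    return [(lags[int(i)], float(scores[int(i)])) for i in order]

# toy values, invented for illustration only
toy_lags = [1, 2, 3, 4]
toy_scores = [3.0, 4.5, 5.2, 3.8]

best = rank_lags(toy_lags, toy_scores, top_n=2)
```

In the notebook, the same call would take `list(persistence_values)` and `scores` as its first two arguments.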

### Expanding Window Forecast

An expanding window refers to a model that calculates a statistic on all available historic data and uses that to make a forecast. It is an expanding window because it grows as more real observations are collected.

Two good starting point statistics to calculate are the mean and the median historical observation.
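As a small illustration, an expanding-window forecast can also be written with pandas, assuming the series fits in memory. `expanding()` computes the statistic over all values up to and including each row, so `shift(1)` is needed to make each forecast depend only on earlier observations. The toy values are invented for illustration.

```python
import pandas as pd

# toy series, invented for illustration only
toy = pd.Series([10.0, 12.0, 20.0, 13.0, 12.0])

# statistic over all observations so far, shifted so row t only sees rows < t
expanding_mean = toy.expanding().mean().shift(1)
expanding_median = toy.expanding().median().shift(1)

# the first forecast is NaN because there is no history yet
forecasts = pd.DataFrame({
    'observed': toy,
    'mean_forecast': expanding_mean,
    'median_forecast': expanding_median,
})
```

The outlier of 20.0 pulls the mean forecast up more than the median forecast, which is one reason to try both statistics.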

In [7]:
def calc_RMSE_for(train, test, func):
    # walk-forward validation
    history = [x for x in train]
    predictions = list()

    for i in range(len(test)):
        # make prediction
        yhat = func(history)
        predictions.append(yhat)

        # observation
        history.append(test[i])

    # return performance
    return np.sqrt(mean_squared_error(test, predictions)), predictions

In [8]:
rmse, predictions = calc_RMSE_for(train, test, np.mean)
print('RMSE for mean: %.3f' % rmse)

rmse, predictions = calc_RMSE_for(train, test, np.median)
print('RMSE for median: %.3f' % rmse)

RMSE for mean: 7.260
RMSE for median: 7.004


On this problem the historical median (RMSE 7.004) produced a slightly better result than the mean (RMSE 7.260), but both were far worse than the t-1 persistence forecast (RMSE 2.970).

In [16]:
# plot predictions vs observations
# (predictions holds the median forecast, the last model evaluated above)
plot_helper(range(1, len(test) + 1), [test, predictions], ['Test', 'Forecast'], 'Line Plot of Predicted Values vs Test Dataset for the Median Expanding Window Model')

The plot shows what a poor forecast looks like and how it does not follow the movements of the data at all, other than a slight rising trend.

### Rolling Window Forecast

A rolling window model involves calculating a statistic on a fixed contiguous block of prior observations and using it as a forecast. It is much like the expanding window, but the window size remains fixed and counts backwards from the most recent observation. It may be more useful on time series problems where recent lag values are more predictive than older lag values.

We will automatically check different rolling window sizes from 1 to 730 days (2 years) and start by calculating the mean observation and using that as a forecast.
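For a single window size, the same forecast can be sketched with pandas `rolling()`, again shifted by one step so that each forecast uses only earlier observations. The toy values are invented for illustration.

```python
import pandas as pd

# toy series, invented for illustration only
toy = pd.Series([10.0, 12.0, 20.0, 13.0, 12.0])

# mean of the previous 2 observations, used as the forecast for the current row
rolling_w2 = toy.rolling(window=2).mean().shift(1)

# a window of 1 just repeats the previous observation
rolling_w1 = toy.rolling(window=1).mean().shift(1)
```

With `window=1` the forecast reduces to the previous observation, that is, the t-1 persistence forecast.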

In [10]:
def calc_RMSE_rolling_window_for(train, test, func, max_window):
    # candidate window sizes: 1 .. max_window - 1 (range excludes the stop value)
    window_sizes = range(1, max_window)
    scores = list()

    for w in window_sizes:
        # walk-forward validation
        history = [x for x in train]
        predictions = list()

        for i in range(len(test)):
            # make prediction
            yhat = func(history[-w:])
            predictions.append(yhat)

            # observation
            history.append(test[i])

        # report performance
        rmse = np.sqrt(mean_squared_error(test, predictions))
        scores.append(rmse)
        # print('w=%d RMSE:%.3f' % (w, rmse))
    
    return scores

In [11]:
scores = calc_RMSE_rolling_window_for(train, test, np.mean, 731)

# plot RMSE against rolling window size
plot_helper(range(1, 731), [scores], ['RMSE'], 'Line Plot of Rolling Window Size to RMSE for a Mean Forecast on the Daily Temperature for San Luis Obispo, CA')

In [12]:
scores = calc_RMSE_rolling_window_for(train, test, np.median, 731)

# plot RMSE against rolling window size
plot_helper(range(1, 731), [scores], ['RMSE'], 'Line Plot of Rolling Window Size to RMSE for a Median Forecast on the Daily Temperature for San Luis Obispo, CA')

On both plots, the best result was achieved with a window size of w=1, with an RMSE of 4.311342 °F; this is essentially a t-1 persistence model.

We could imagine better results from a weighted combination of the observations in the window. This idea leads to linear models such as Autoregression (AR) and ARIMA.
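As a rough sketch of what a weighted combination means, the weights can be estimated by ordinary least squares over all windows in the history. This is not a full AR or ARIMA implementation: the helper name `fit_window_weights` is hypothetical, there is no intercept term, and the toy series is generated from known weights so that the fit can be checked.

```python
import numpy as np

def fit_window_weights(series, window):
    """Least-squares weights mapping the last `window` values (oldest first) to the next value."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - window
    # each row holds `window` consecutive observations; the target is the value that follows
    X = np.array([series[i:i + window] for i in range(n_rows)])
    y = series[window:]
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

# toy series generated as 0.3 * x[t-2] + 0.7 * x[t-1], invented for illustration only
toy = [0.0, 10.0]
for _ in range(8):
    toy.append(0.3 * toy[-2] + 0.7 * toy[-1])

weights = fit_window_weights(toy, window=2)

# one-step forecast: weighted sum of the two most recent observations
next_value = float(np.dot(weights, toy[-2:]))
```

On this noise-free toy series the fit recovers weights close to 0.3 and 0.7. On real data the weights would only be estimates, and a library implementation of AR or ARIMA would be the more robust choice.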