In [1]:
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
import warnings
import statsmodels.api as sm
warnings.filterwarnings("ignore")

## **Reframe Time Series Forecasting Problems**

A reframing can ultimately result in better and/or more robust forecasts.

After completing this tutorial, you will know:
- How to reframe your time series forecast problem as an alternate regression problem.
- How to reframe your time series forecast problem as a classification prediction problem.
- How to reframe your time series forecast problem with an alternate time horizon.

**Benefits of Reframing Your Problem**

Reframing your problem is the idea of exploring alternate perspectives on what is to be predicted. There are two potential benefits to exploring alternate framings of your time series forecast problem:

**Simplify Your Problem:** Perhaps the largest wins on a forecasting project can come from a reframing of the problem. This is because the structure and type of prediction problem has so much more impact than the choice of data transforms, choice of model, or the choice of model hyperparameters. It is the biggest lever in a project and must be carefully considered.

**Ensemble Forecast:** Models of differing perspectives on the same problem may capture different information from the input and in turn result in predictions that are skillful, but in different ways. These predictions may be combined in an ensemble to result in a more skillful or more robust forecast.

**daily-minimum-temperatures.csv** is the reference dataset used throughout to illustrate reframing. It holds 3,650 daily minimum temperatures in degrees Celsius, from 1981-01-01 to 1990-12-31.

**Naive Time Series Forecast**

The naive framing forecasts the series as is: here, the next day's minimum temperature. The risk is that a preconceived idea of how to frame the problem has influenced data collection and, in turn, perhaps **limited** the results.

**Regression Framings**

Below are 5 different ways that this prediction problem could be rephrased as an alternate regression problem:
- Forecast the change in the minimum temperature compared to the previous day.
- Forecast the minimum temperature relative to the average from the past 14 days.
- Forecast the minimum temperature relative to the average of the same month last year.
- Forecast the minimum temperature rounded to the nearest 5 degrees Celsius.
- Forecast the average minimum temperature for the next 7 days.
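
A few of the framings listed above can be sketched as derived target columns. This is a minimal illustration under stated assumptions: the series is synthetic (a stand-in, not the real temperature data), and the column names `diff_prev_day`, `rel_14d_mean`, and `next_7d_mean` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the temperature series (assumption: not the real data).
rng = np.random.default_rng(0)
index = pd.date_range("1981-01-01", periods=60, freq="D")
temps = pd.Series(15 + rng.normal(0, 3, size=60), index=index, name="Temp")

targets = pd.DataFrame({"temp": temps})

# Change in the minimum temperature compared to the previous day.
targets["diff_prev_day"] = temps.diff()

# Minimum temperature relative to the mean of the previous 14 days.
# shift(1) keeps the current day out of its own reference window.
targets["rel_14d_mean"] = temps - temps.shift(1).rolling(14).mean()

# Average minimum temperature over the next 7 days (t+1 .. t+7), aligned to day t.
# The result is NaN wherever any of the 7 future days is missing.
targets["next_7d_mean"] = sum(temps.shift(-k) for k in range(1, 8)) / 7

print(targets.head(16))
```

Each derived column is a different regression target built from the same observations, so a model can be trained per framing and the results compared.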

**Transforming the granularity of a prediction problem does change the difficulty of the problem and can be very useful if the requirements of the problem permit such redefinitions.**
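
As a rough sketch of a coarser granularity, the values can be rounded to the nearest 5 degrees Celsius. The five inputs are the first observations shown in this notebook; the variable names are illustrative.

```python
import pandas as pd

# First five observations of the series, as shown in this notebook.
temps = pd.Series([20.7, 17.9, 18.8, 14.6, 15.8])

# Round to the nearest multiple of 5 degrees Celsius.
# Note: exact halves (e.g. 12.5 / 5 = 2.5) are rounded to the nearest even integer.
rounded = (temps / 5).round() * 5

print(rounded.tolist())  # → [20.0, 20.0, 20.0, 15.0, 15.0]
```

The five distinct inputs collapse to two distinct targets, which shows how a coarser target reduces the number of values a model has to tell apart.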

In [67]:
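# The squeeze=True argument was removed from read_csv in pandas 2.0;
# squeeze the single data column into a Series explicitly instead.
series = pd.read_csv("daily-minimum-temperatures.csv", index_col=0, header=0, parse_dates=True).squeeze("columns")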
# Create lagged dataset
values = series.values
values = pd.DataFrame(values)
dataframe = pd.concat([values.shift(1),values], axis=1)
dataframe.columns=["t", "t+1"]
dataframe.head()

Unnamed: 0,t,t+1
0,,20.7
1,20.7,17.9
2,17.9,18.8
3,18.8,14.6
4,14.6,15.8


**Classification Framings**

Classification involves predicting categorical or label outputs (like hot and cold). Below are 5 different ways that this prediction problem can be rephrased as a classification problem:
- Forecast whether a minimum temperature will be cold, moderate, or warm.
- Forecast whether a change in minimum temperature will be small or large.
- Forecast whether the minimum temperature will be a monthly minimum or not.
- Forecast whether the minimum will be higher or lower than the minimum in the previous year.
- Forecast whether the minimum temperature will rise or fall over the next 7 days.
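
Two of the binary framings listed above can be sketched as label columns. This is an illustration only: the inputs are the first eight observations shown in this notebook, the 2-degree threshold for a "large" change is a hypothetical choice, and the column names `rose` and `large_change` are made up for the example.

```python
import pandas as pd

# First eight observations of the series, as shown in this notebook.
temps = pd.Series([20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4])
frame = pd.DataFrame({"t": temps.shift(1), "t+1": temps})

change = frame["t+1"] - frame["t"]

# 1 if the temperature rose compared with the previous day, else 0 (no change counts as 0).
frame["rose"] = (change > 0).astype(int)

# 1 if the absolute change is at least 2 degrees (hypothetical threshold), else 0.
frame["large_change"] = (change.abs() >= 2.0).astype(int)

# The first row has no previous day, so its labels are meaningless; drop it.
frame = frame.dropna()

print(frame)
```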

A move to classification can simplify the prediction problem. This approach opens up ideas on labels as well as binary classification framings.
Because the output variable is natively numeric, most classification framings are likely to keep an ordinal structure (e.g. cold, moderate, hot). That is, there is an ordered relationship between the classes being predicted, which may not be the case when predicting labels like dog and cat.
The ordinal relationship permits both a hard classification problem and an integer prediction problem whose output can be rounded post hoc into a specific category. Below is an example of transforming the Minimum Daily Temperatures forecast problem to a classification problem where each temperature value is an ordinal value of cold, moderate, or hot. These labels are mapped to integer values, defined as:
- 0 (cold): < 10 degrees Celsius.
- 1 (moderate): >= 10 and < 25 degrees Celsius.
- 2 (hot): >= 25 degrees Celsius.

In [68]:
dataframe

Unnamed: 0,t,t+1
0,,20.7
1,20.7,17.9
2,17.9,18.8
3,18.8,14.6
4,14.6,15.8
...,...,...
3645,14.6,14.0
3646,14.0,13.6
3647,13.6,13.5
3648,13.5,15.7


In [69]:
# reframe regression as classification

# Assign with .loc: chained indexing (dataframe['t+1'][i] = ...) may write to a copy.
for i in range(len(dataframe)):
    value = dataframe.loc[i, 't+1']
    # cold: < 10 degrees Celsius
    if value < 10.0:
        dataframe.loc[i, 't+1'] = 0
    # hot: >= 25 degrees Celsius
    elif value >= 25.0:
        dataframe.loc[i, 't+1'] = 2
    # moderate: >= 10 and < 25 degrees Celsius
    else:
        dataframe.loc[i, 't+1'] = 1
        
dataframe.head()

Unnamed: 0,t,t+1
0,,1.0
1,20.7,1.0
2,17.9,1.0
3,18.8,1.0
4,14.6,1.0
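
The same mapping can also be written without an explicit loop. The sketch below is one possible vectorized alternative using `pd.cut`, not the approach used in this notebook; the input values are illustrative and chosen to hit the bin edges.

```python
import numpy as np
import pandas as pd

# Illustrative values, chosen to include the boundaries at 10 and 25 degrees.
temps = pd.Series([20.7, 9.9, 10.0, 24.9, 25.0, 3.2])

# right=False makes each bin closed on the left: [-inf, 10), [10, 25), [25, inf).
bins = [-np.inf, 10.0, 25.0, np.inf]
labels = pd.cut(temps, bins=bins, labels=[0, 1, 2], right=False)

# Convert the categorical result to plain integer class codes.
codes = labels.astype(int)

print(codes.tolist())  # → [1, 0, 1, 1, 2, 0]
```

Setting `right=False` matters here: with the default right-closed bins, a value of exactly 25.0 would fall in the moderate class rather than the hot one.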


In [70]:
# Boolean masks, one per class
cold = dataframe['t+1'] ==  0
medium = dataframe['t+1'] == 1
hot = dataframe['t+1'] == 2

In [71]:
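# Filter rows by class. A notebook cell displays only its last expression,
# so the output below shows the rows labelled hot.
dataframe.loc[cold].head()
dataframe.loc[medium].head()
dataframe.loc[hot].head()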

Unnamed: 0,t,t+1
14,21.5,2.0
39,17.1,2.0
384,19.7,2.0
410,22.8,2.0
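
The row indices above suggest that hot days are rare in this series. A quick way to check class balance is to count the labels, sketched here on a small hypothetical label vector rather than the real column.

```python
import pandas as pd

# Hypothetical class codes (0 = cold, 1 = moderate, 2 = hot), for illustration only.
labels = pd.Series([1.0, 1.0, 0.0, 2.0, 1.0, 0.0, 1.0])

# Number of rows per class, ordered by class code.
counts = labels.value_counts().sort_index()

# Share of each class; a very small share signals an imbalanced problem.
shares = counts / counts.sum()

print(counts)
print(shares.round(3))
```

On the real data, the same two lines applied to `dataframe['t+1']` would show how unevenly the three classes are represented.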


**Time Horizon Framings**

The time horizon is the number of time steps in the future that are being predicted. Below are 5 different ways that this prediction problem can be re-phrased as a different time horizon:
- Forecast the minimum temperature for the next 7 days.
- Forecast the minimum temperature in 30 days' time.
- Forecast the average minimum temperature next month.
- Forecast the day in the next week that will have the lowest minimum temperature.
- Forecast one year of minimum temperature values.
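
Some of the horizons listed above can be sketched as targets built with `shift` and `resample`. This is a minimal illustration on a synthetic series (an assumption, not the real data); all variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the temperature series (assumption: not the real data).
rng = np.random.default_rng(1)
index = pd.date_range("1981-01-01", periods=90, freq="D")
temps = pd.Series(15 + rng.normal(0, 3, size=90), index=index)

# Minimum temperature in 30 days' time, aligned to the day the forecast is made.
target_30d = temps.shift(-30)

# Which day of the next week (1..7) has the lowest minimum temperature.
steps = list(range(1, 8))
future = pd.concat([temps.shift(-k) for k in steps], axis=1, keys=steps)
complete = future.dropna()  # keep only days with a full 7-day look-ahead
coldest_day = complete.idxmin(axis=1)

# Average minimum temperature per calendar month ("MS" = month-start bins).
monthly_mean = temps.resample("MS").mean()

print(coldest_day.head())
print(monthly_mean)
```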

It is easy to get caught up in the idea that you require one-step forecasts. Focusing on reframing of the problem around time horizon forces you to think about point versus multi-step forecasts and how far in the future to consider.

You may be able to forecast far into the future, but the skill is likely to vary, degrading the further into the future you project. When thinking through the horizon of the forecast, also consider the minimum acceptable performance of forecasts.
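
One way to see skill degrade with horizon is to score a persistence forecast (predict that the value h steps ahead equals today's value) at several horizons. The sketch below uses a synthetic random walk as a stand-in, so the numbers say nothing about the temperature data itself.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk series (assumption: a stand-in, not the temperature data).
rng = np.random.default_rng(42)
series = pd.Series(15 + rng.normal(0, 1, size=2000).cumsum())

# Mean absolute error of the persistence forecast at horizons of 1 to 7 steps.
mae_by_horizon = {
    h: (series.shift(-h) - series).abs().mean()
    for h in range(1, 8)
}

for h, mae in mae_by_horizon.items():
    print(f"horizon {h}: MAE = {mae:.3f}")
```

Comparing such a curve against the minimum acceptable error gives a rough idea of how far ahead a forecast remains useful.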

In [72]:
series

Date
1981-01-01    20.7
1981-01-02    17.9
1981-01-03    18.8
1981-01-04    14.6
1981-01-05    15.8
              ... 
1990-12-27    14.0
1990-12-28    13.6
1990-12-29    13.5
1990-12-30    15.7
1990-12-31    13.0
Name: Temp, Length: 3650, dtype: float64

In [77]:
# create lagged dataset
# shift(n) with positive n moves values down to later rows (a lag);
# negative n moves them up to earlier rows (a lead)
values = pd.DataFrame(series.values)
dataframe = pd.concat([values.shift(1),values,
                    values.shift(-1),values.shift(-2),
                    values.shift(-3),values.shift(-4),
                    values.shift(-5),values.shift(-6)],axis=1)

dataframe.columns = ['t', 't+1', 't+2', 't+3', 't+4', 't+5', 't+6', 't+7']
dataframe

Unnamed: 0,t,t+1,t+2,t+3,t+4,t+5,t+6,t+7
0,,20.7,17.9,18.8,14.6,15.8,15.8,15.8
1,20.7,17.9,18.8,14.6,15.8,15.8,15.8,17.4
2,17.9,18.8,14.6,15.8,15.8,15.8,17.4,21.8
3,18.8,14.6,15.8,15.8,15.8,17.4,21.8,20.0
4,14.6,15.8,15.8,15.8,17.4,21.8,20.0,16.2
...,...,...,...,...,...,...,...,...
3645,14.6,14.0,13.6,13.5,15.7,13.0,,
3646,14.0,13.6,13.5,15.7,13.0,,,
3647,13.6,13.5,15.7,13.0,,,,
3648,13.5,15.7,13.0,,,,,


In [78]:
dataframe.head(14)

Unnamed: 0,t,t+1,t+2,t+3,t+4,t+5,t+6,t+7
0,,20.7,17.9,18.8,14.6,15.8,15.8,15.8
1,20.7,17.9,18.8,14.6,15.8,15.8,15.8,17.4
2,17.9,18.8,14.6,15.8,15.8,15.8,17.4,21.8
3,18.8,14.6,15.8,15.8,15.8,17.4,21.8,20.0
4,14.6,15.8,15.8,15.8,17.4,21.8,20.0,16.2
5,15.8,15.8,15.8,17.4,21.8,20.0,16.2,13.3
6,15.8,15.8,17.4,21.8,20.0,16.2,13.3,16.7
7,15.8,17.4,21.8,20.0,16.2,13.3,16.7,21.5
8,17.4,21.8,20.0,16.2,13.3,16.7,21.5,25.0
9,21.8,20.0,16.2,13.3,16.7,21.5,25.0,20.7


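Running the example prints the first 14 records of the transformed dataset. The problem is now defined as: given the minimum daily temperature from the day before in degrees Celsius, forecast the minimum daily temperature for each of the next 7 days. Rows at the start and end of the frame contain NaN values where a shifted observation is unavailable, so they would typically be dropped before fitting a model.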
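
The wide frame above can be wrapped in a small helper that also drops the incomplete rows. `make_supervised` is a hypothetical function written for this sketch; it keeps this notebook's column naming, where `t` is the previous day's value and `t+1` is the first value to forecast.

```python
import pandas as pd

def make_supervised(values, n_out=7):
    """Return a frame with input column 't' and targets 't+1' .. 't+n_out'."""
    s = pd.Series(values, dtype="float64").reset_index(drop=True)
    columns = {"t": s.shift(1)}
    for step in range(1, n_out + 1):
        # 't+1' is the unshifted series, 't+2' leads by one row, and so on.
        columns[f"t+{step}"] = s.shift(-(step - 1))
    # Drop rows where the input or any of the targets is missing.
    return pd.DataFrame(columns).dropna().reset_index(drop=True)

# First ten observations of the series, as shown in this notebook.
frame = make_supervised([20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0])

X = frame[["t"]]             # model input: the previous day's temperature
y = frame.drop(columns="t")  # model targets: the next 7 days

print(frame)
```

With the full series, the call would be `make_supervised(series.values)`. The resulting `X` and `y` could then be passed to any estimator that supports multi-output regression.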