In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 

## Benefits of Reframing Your Problem
Reframing your problem is the idea of exploring alternate perspectives on what is to be predicted.
There are two potential benefits to exploring alternate framings of your time series forecast
problem:
1. Simplify your problem.
2. Provide the basis for an ensemble forecast.

Both benefits ultimately lead to more skillful and/or more robust forecasts.

### Simplify Your Problem
Perhaps the largest wins on a forecasting project can come from a reframing of the problem.
This is because the structure and type of the prediction problem have far more impact than the
choice of data transforms, choice of model, or the choice of model hyperparameters. It is the
biggest lever in a project and must be carefully considered.

### Ensemble Forecast
In addition to changing the problem you are working on, reframing plays another role: it
can provide you with a suite of different, but highly related problems that you can model.
The benefit of this is that the framings may be different enough to require differences in data
preparation and modeling methods.
Models of differing perspectives on the same problem may capture different information
from the input and in turn result in predictions that are skillful, but in different ways. These
predictions may be combined in an ensemble to result in a more skillful or more robust forecast.
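
As a rough illustration of combining framings, the sketch below averages two deliberately naive forecasts built from the same short history: one predicts the value directly, the other predicts the day-to-day change and converts it back to a temperature. The three-day window, the plain average, and all variable names are assumptions made for this example, not a recommended model.

```python
import numpy as np

# A short history of daily minimum temperatures (illustrative values).
history = np.array([20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8])

# Framing 1: forecast the value directly as the mean of the last 3 observations.
direct_forecast = history[-3:].mean()

# Framing 2: forecast the change from the previous day (mean of the last
# 3 changes), then add it back to the most recent observation.
changes = np.diff(history)
change_forecast = history[-1] + changes[-3:].mean()

# Ensemble: a simple unweighted average of the two framings.
ensemble_forecast = np.mean([direct_forecast, change_forecast])

print(direct_forecast, change_forecast, ensemble_forecast)
```

In practice each framing would be served by a fitted model, and the combination weights could be tuned on held-out data.
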
In this tutorial, we will explore three different ways you may consider reframing your time series
forecast problem. Before we dive in, let's look at a simple univariate time series problem of
forecasting the minimum daily temperature to use as context for the discussion.

## Regression Framings

Most time series prediction problems are regression problems, requiring the prediction of a
real-valued output. Below are 5 different ways that this prediction problem could be rephrased
as an alternate regression problem:
* Forecast the change in the minimum temperature compared to the previous day.
* Forecast the minimum temperature relative to the average from the past 14 days.
* Forecast the minimum temperature relative to the average from the same month last year.
* Forecast the minimum temperature rounded to the nearest 5 degrees Celsius.
* Forecast the average minimum temperature for the next 7 days.
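
Several of the framings above can be derived from the raw series with `diff`, `shift`, and `rolling`. The following is a minimal sketch on a synthetic series (the random values, the column names, and the choice to exclude the current day from the 14-day average are all assumptions for illustration).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a daily minimum temperature series (hypothetical values).
rng = np.random.default_rng(0)
dates = pd.date_range("1981-01-01", periods=60, freq="D")
temps = pd.Series(15 + rng.normal(0, 3, size=len(dates)), index=dates, name="temp")

framings = pd.DataFrame({
    "temp": temps,
    # Change compared to the previous day.
    "diff_prev_day": temps.diff(),
    # Value relative to the mean of the previous 14 days (current day excluded).
    "rel_14d_mean": temps - temps.shift(1).rolling(14).mean(),
    # Mean of the next 7 days: a trailing 7-day mean, shifted back 7 steps.
    "mean_next_7d": temps.rolling(7).mean().shift(-7),
})

print(framings.head(16))
```

Rows at the start or end of the series contain `NaN` wherever the required window is incomplete, so they would normally be dropped before modeling.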


In [33]:
import re
df = pd.read_csv('./dataset/daily-minimum-temperatures-in-me.csv', parse_dates=['Date'], index_col='Date')
# Strip stray '?' characters from the raw values, then convert the column to float.
df.iloc[:, 0] = df.iloc[:, 0].apply(lambda x: re.sub(r'\?', '', x)).astype(float)

In [34]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3650 entries, 1981-01-01 to 1990-12-31
Data columns (total 1 columns):
 #   Column                                                         Non-Null Count  Dtype  
---  ------                                                         --------------  -----  
 0   Daily minimum temperatures in Melbourne, Australia, 1981-1990  3650 non-null   float64
dtypes: float64(1)
memory usage: 57.0 KB


In [35]:
X = df.values
new_df = pd.DataFrame(
    {
        't+1':X.flatten(),
        't':pd.Series(X.flatten()).shift(1)
    }
)

df_ = new_df.copy(deep=True)

# Coarsen the target to a multiple of 5. Truncation toward zero means positive
# values are rounded down (17.9 -> 15.0), not to the nearest multiple of 5.
df_['t+1'] = np.trunc(new_df['t+1'] / 5) * 5.0
df_.head()

,t+1,t
0,20.0,
1,15.0,20.7
2,15.0,17.9
3,10.0,18.8
4,15.0,14.6
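
Note that `int(x / 5) * 5.0` truncates, which is why 17.9 maps to 15.0 in the output above. If rounding to the nearest multiple of 5 is what is wanted, one possible sketch is shown below; the variable names are illustrative, and the values are the first five observations from the output above.

```python
import pandas as pd

temps = pd.Series([20.7, 17.9, 18.8, 14.6, 15.8])

# Truncation toward zero, equivalent to int(x / 5) * 5.0 on each value.
truncated = (temps / 5).astype(int) * 5.0

# Rounding to the nearest multiple of 5. Series.round rounds exact .5 ties
# to the even neighbor, so e.g. 12.5 would map to 10.0.
nearest = (temps / 5).round() * 5.0

print(pd.DataFrame({"temp": temps, "truncated": truncated, "nearest": nearest}))
```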


## Classification Framings
Classification involves predicting categorical or label outputs (like hot and cold). Below are 5
different ways that this prediction problem can be rephrased as a classification problem:

* Forecast whether a minimum temperature will be cold, moderate, or warm.
* Forecast whether a change in minimum temperature will be small or large.
* Forecast whether the minimum temperature will be a monthly minimum or not.
* Forecast whether the minimum will be higher or lower than the minimum in the previous year.
* Forecast whether the minimum temperature will rise or fall over the next 7 days.
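
Two of these labels can be sketched directly with pandas. The example below is a minimal illustration, assuming a 2-degree threshold for a "large" change and treating a tie as "fall"; both choices, and the variable names, are assumptions rather than part of the original problem definition.

```python
import numpy as np
import pandas as pd

# First ten observations of the series (illustrative subset).
temps = pd.Series([20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0])

# Small vs. large day-to-day change; the first row has no previous day and stays NaN.
change = temps.diff()
change_size = pd.cut(
    change.abs(),
    bins=[-np.inf, 2.0, np.inf],  # assumed threshold: 2 degrees
    labels=["small", "large"],
    right=False,                  # intervals are [low, high)
)

# Rise or fall over the next 7 days: compare the value 7 days ahead with today.
# Rows without a value 7 days ahead are left unlabeled (NaN).
future = temps.shift(-7)
direction = pd.Series(
    np.where(future > temps, "rise", "fall"), index=temps.index
).where(future.notna())

print(pd.DataFrame({"temp": temps, "change_size": change_size, "direction": direction}))
```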

In [43]:
# Class labels: 0 = cold (< 10), 1 = moderate, 2 = warm (>= 25).
new_df['class'] = new_df['t+1'].apply(lambda x: 0 if x < 10 else 2 if x >= 25 else 1)
new_df[new_df['t+1'] > 25]

,t+1,t,class
384,25.2,19.7,2
410,26.3,22.8,2


## Time Horizon Framings

Another axis that can be varied is the time horizon. The time horizon is the number of time
steps in the future that are being predicted. Below are 5 different ways that this prediction
problem can be rephrased as a different time horizon:
* Forecast the minimum temperature for the next 7 days.
* Forecast the minimum temperature in 30 days' time.
* Forecast the average minimum temperature next month.
* Forecast the day in the next week that will have the lowest minimum temperature.
* Forecast one year of minimum temperature values.
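
The horizon-based targets can be built by shifting the series backwards. The sketch below uses a synthetic series and treats each row's date as time `t`, so `t+1` is the following day; that convention, the random data, and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the daily minimum temperature series (hypothetical values).
rng = np.random.default_rng(1)
dates = pd.date_range("1981-01-01", periods=40, freq="D")
temps = pd.Series(15 + rng.normal(0, 3, size=len(dates)), index=dates)

# Multi-step target: one column per lead time t+1 .. t+7.
horizon = pd.DataFrame({f"t+{h}": temps.shift(-h) for h in range(1, 8)})

# Keep only rows where the full 7-day window is observed.
complete = horizon.dropna()

# Lead time (1 to 7) of the coldest day within the next week.
coldest_lead = complete.to_numpy().argmin(axis=1) + 1

# Single-step target at a longer horizon: the value 30 days ahead.
target_30d = temps.shift(-30)

print(complete.head())
print(coldest_lead[:5])
```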

In [47]:
dataframe = pd.DataFrame(
    {
        't':df.iloc[:, 0].shift(1),
        't+1':df.iloc[:, 0],
        't+2':df.iloc[:, 0].shift(-1),
        't+3':df.iloc[:, 0].shift(-2),
        't+4':df.iloc[:, 0].shift(-3),
        't+5':df.iloc[:, 0].shift(-4),
        't+6':df.iloc[:, 0].shift(-5),
        't+7':df.iloc[:, 0].shift(-6),
    }
)
dataframe

Date,t,t+1,t+2,t+3,t+4,t+5,t+6,t+7
1981-01-01,,20.7,17.9,18.8,14.6,15.8,15.8,15.8
1981-01-02,20.7,17.9,18.8,14.6,15.8,15.8,15.8,17.4
1981-01-03,17.9,18.8,14.6,15.8,15.8,15.8,17.4,21.8
1981-01-04,18.8,14.6,15.8,15.8,15.8,17.4,21.8,20.0
1981-01-05,14.6,15.8,15.8,15.8,17.4,21.8,20.0,16.2
...,...,...,...,...,...,...,...,...
1990-12-27,14.6,14.0,13.6,13.5,15.7,13.0,,
1990-12-28,14.0,13.6,13.5,15.7,13.0,,,
1990-12-29,13.6,13.5,15.7,13.0,,,,
1990-12-30,13.5,15.7,13.0,,,,,
