## Piecewise Linear Regression & Change-points

* Simple changes in trend can be modeled with piecewise linear regression.
* Piecewise linear regression amounts to creating one new feature for each change-point.
* This lets a linear model handle a non-linear trend in a time series.
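
As a minimal sketch of the idea, assuming synthetic data rather than the retail sales series (the change-point at `t = 50` and the helper name `hinge_features` are made up for illustration), each change-point contributes a feature `max(0, t - c)`. An ordinary least-squares fit on those features can then recover a different slope on each side of the change-point:

```python
import numpy as np

def hinge_features(t, changepoints):
    """Return one column max(0, t - c) per change-point c (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.clip(t - c, 0, None) for c in changepoints])

# Synthetic series: slope 2 up to t = 50, slope -1 afterwards (no noise).
t = np.arange(100)
y = np.where(t < 50, 2.0 * t, 100.0 - 1.0 * (t - 50))

# Design matrix: intercept, time since start, time since the change-point.
X = np.column_stack([np.ones(len(t)), hinge_features(t, [0, 50])])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept, slope_before, slope_change = coef
# The slope after the change-point is the sum of the two trend coefficients.
slope_after = slope_before + slope_change
```

The coefficient on each change-point feature is the change in slope at that point, so the slope of any segment is the running sum of the coefficients of the features that are active in it.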

#### PROS
* Easy to implement 
* Provides a method of handling simple non-linear trend

#### CONS
* Change-points must be chosen manually
* Mostly used for linear models
* If the time series is highly non-linear, breaking it down into change-points becomes tedious

In [64]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.pipeline import make_pipeline, make_union
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error

from feature_engine.imputation import DropMissingData
from feature_engine.timeseries.forecasting.lag_features import LagFeatures

from sktime.transformations.series.time_since import TimeSince
from sktime.transformations.series.summarize import WindowSummarizer

## Data 

The retail sales dataset can be found [here](https://raw.githubusercontent.com/facebook/prophet/master/examples/example_retail_sales.csv).


In [2]:
data = pd.read_csv(
    "../../Datasets/example_retail_sales.csv",
    parse_dates=["ds"],
    index_col=["ds"],
)

data.plot(figsize=(15,4));

<img src='./plots/retail-sales-dataset.png'>

## Change points

In [16]:
ax = data.plot(figsize=(15,4));
changepoints = ["2008-01-01", "2009-04-01"]
ax.vlines(x=changepoints, ymin=ax.get_ylim()[0], ymax=ax.get_ylim()[1], colors=['salmon'])
ax.set(title=' & '.join(changepoints)+' : Changepoints');

<img src='./plots/retail-sales-dataset-changepoints.png'>

#### How do we create features to achieve piecewise linear regression?

In [20]:
data.index - pd.to_datetime("1992-01-01")

TimedeltaIndex([   '0 days',   '31 days',   '60 days',   '91 days',
                 '121 days',  '152 days',  '182 days',  '213 days',
                 '244 days',  '274 days',
                ...
                '8613 days', '8644 days', '8674 days', '8705 days',
                '8735 days', '8766 days', '8797 days', '8826 days',
                '8857 days', '8887 days'],
               dtype='timedelta64[ns]', name='ds', length=293, freq=None)

In [26]:
# Since our time frequency is `MS` (month start),
# let's convert the time difference to months

np.round((data.index - pd.to_datetime("1992-01-01")) / np.timedelta64(1, 'M'))

Float64Index([  0.0,   1.0,   2.0,   3.0,   4.0,   5.0,   6.0,   7.0,   8.0,
                9.0,
              ...
              283.0, 284.0, 285.0, 286.0, 287.0, 288.0, 289.0, 290.0, 291.0,
              292.0],
             dtype='float64', name='ds', length=293)
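
Dividing by `np.timedelta64(1, 'M')` treats a month as a fixed average duration, which is why the result has to be rounded, and some newer pandas versions may reject month units in timedelta arithmetic. As a hedged alternative, the number of whole calendar months can be computed directly from the year and month of each timestamp. The short index and the reference date below are illustrative only:

```python
import numpy as np
import pandas as pd

# Small month-start index for illustration (not the retail sales data).
index = pd.date_range("1992-01-01", periods=6, freq="MS")
reference = pd.Timestamp("1992-03-01")

# Whole calendar months between each timestamp and the reference date.
months_since = (index.year - reference.year) * 12 + (index.month - reference.month)

# Clip negative values to zero, as for any change-point feature.
months_since = np.maximum(months_since.to_numpy(), 0)
```

This avoids the rounding step because the arithmetic is done on integer years and months rather than on durations.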

In [3]:
changepoints = [
    "1992-01-01",  # start of time series
    "2008-01-01",  # changepoint
    "2009-04-01",  # changepoint
]

df = data.copy()

index = data.index
for ch in changepoints:
    
    # Create feature
    feature = index - pd.to_datetime(ch)
    feature = feature / np.timedelta64(1, 'M')
    feature = np.round(feature)

    # Clip negative values to zero
    feature = np.clip(feature, a_min=0, a_max=None)

    # add to dataframe
    df[f'time-since-{ch}'] = feature

# check out 
df

| ds | y | time-since-1992-01-01 | time-since-2008-01-01 | time-since-2009-04-01 |
|---|---|---|---|---|
| 1992-01-01 | 146376 | 0.0 | 0.0 | 0.0 |
| 1992-02-01 | 147079 | 1.0 | 0.0 | 0.0 |
| 1992-03-01 | 159336 | 2.0 | 0.0 | 0.0 |
| 1992-04-01 | 163669 | 3.0 | 0.0 | 0.0 |
| 1992-05-01 | 170068 | 4.0 | 0.0 | 0.0 |
| ... | ... | ... | ... | ... |
| 2016-01-01 | 400928 | 288.0 | 96.0 | 81.0 |
| 2016-02-01 | 413554 | 289.0 | 97.0 | 82.0 |
| 2016-03-01 | 460093 | 290.0 | 98.0 | 83.0 |
| 2016-04-01 | 450935 | 291.0 | 99.0 | 84.0 |


## Creating the features with sktime

In [4]:
transformer = TimeSince(
    start=changepoints,
    keep_original_columns=True,
    to_numeric=True,
    positive_only=True,
)

df = data.copy()

transformer.fit_transform(df)

| ds | y | time_since_1992-01-01 00:00:00 | time_since_2008-01-01 00:00:00 | time_since_2009-04-01 00:00:00 |
|---|---|---|---|---|
| 1992-01-01 | 146376 | 0 | 0 | 0 |
| 1992-02-01 | 147079 | 1 | 0 | 0 |
| 1992-03-01 | 159336 | 2 | 0 | 0 |
| 1992-04-01 | 163669 | 3 | 0 | 0 |
| 1992-05-01 | 170068 | 4 | 0 | 0 |
| ... | ... | ... | ... | ... |
| 2016-01-01 | 400928 | 288 | 96 | 81 |
| 2016-02-01 | 413554 | 289 | 97 | 82 |
| 2016-03-01 | 460093 | 290 | 98 | 83 |
| 2016-04-01 | 450935 | 291 | 99 | 84 |


## Let's forecast

In [79]:
# capture the trend: features for piecewise linear regression
time_since = TimeSince(start=changepoints, positive_only=True, keep_original_columns=False)

# create lag-features
window_summary = WindowSummarizer(
    lag_feature={
    'lag':[1], 
    'mean':[[1, 12]]
    },
    target_cols='y',
    truncate='bfill'
)

# feature union 
features = make_union(time_since, window_summary)
# scaling
features_scaled = make_pipeline(features, MinMaxScaler())
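
To make the lag features less opaque, here is a rough pandas sketch of what they are assumed to compute: the value one step back, and the mean of a 12-step window ending one step back, with leading gaps back-filled. This reading of the `[lag, window_length]` convention is an assumption, and the toy series and column names are illustrative, not the transformer's actual output:

```python
import pandas as pd

# Toy series 1..15 standing in for the target column `y`.
y = pd.Series(range(1, 16), dtype=float)

features = pd.DataFrame({
    # value of y one step earlier
    "y_lag_1": y.shift(1),
    # mean of the 12 values ending one step earlier
    "y_mean_1_12": y.shift(1).rolling(window=12).mean(),
})

# Fill the leading NaNs with the first available value (back-fill).
features = features.bfill()
```

Because both features only look at values strictly before the current row, they can be computed at forecast time without knowing the current target.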

In [95]:
# Define time of first forecast, this determines our train / test split
forecast_start_time = pd.to_datetime("2013-01-01")

# model 
model = LinearRegression()

# Define number of steps to forecast.
num_of_forecast_steps = 42

# forecast horizon
forecast_horizon = pd.date_range(start=forecast_start_time, periods=num_of_forecast_steps, freq='MS')

# How much data in the past is needed to create our features
look_back_window_size = pd.DateOffset(months=12)

In [109]:
# --- CREATE TRAINING & TESTING DATAFRAME  --- #
# Ensure we only have training data up to the start
# of the forecast.

df_train = data.query(f'index < "{forecast_start_time}"')
df_test = data.query(f'index >= "{forecast_start_time}"')

train_features = features_scaled.fit_transform(df_train)
train_targets = df_train['y']

# train the model
model.fit(train_features, train_targets)
# let's make predictions on train data
y_preds_train = pd.DataFrame(
    data=model.predict(train_features), index=df_train.index, 
    columns=['predictions_on_training_data'])


# plots
ax = df_train.plot(figsize=(15,4))
y_preds_train.plot(ax=ax)

<img src='./plots/piecewise-regression-retail-sales-prediction-train-set.png'>

#### Recursive forecast

In [106]:
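index = forecast_start_time - look_back_window_size
predict_df = data.loc[data.index >= index].copy()
predict_df['predictions_on_test_data'] = np.nan

for fh in forecast_horizon:
    # Build features from all rows up to and including the forecast time.
    # Note: `y` in predict_df still holds the observed values, so the lag
    # features here come from actuals rather than from earlier predictions.
    x = predict_df.loc[:fh, ['y']]
    x = features_scaled.transform(x)
    x = x[-1]  # feature row for the latest timestamp

    # get prediction
    y_pred = model.predict([x])

    predict_df.loc[fh, ['predictions_on_test_data']] = y_pred[0]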

# predict_df.dropna(inplace=True)
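
In a recursive forecast, each prediction is fed back as the lag input for the next step, so no future observations are needed. A minimal, self-contained sketch on synthetic data follows; the one-lag model and all names are illustrative assumptions, not the pipeline used in this notebook:

```python
import numpy as np

# Synthetic history: a noiseless linear series (illustration only).
history = [10.0 + 2.0 * i for i in range(24)]

# One-lag linear model y[t] = a + b * y[t-1], fitted by least squares.
y_prev = np.array(history[:-1])
y_next = np.array(history[1:])
X = np.column_stack([np.ones(len(y_prev)), y_prev])
(a, b), *_ = np.linalg.lstsq(X, y_next, rcond=None)

# Recursive forecast: each prediction is appended to the history and
# becomes the lag feature for the next step.
n_steps = 6
forecasts = []
for _ in range(n_steps):
    y_pred = a + b * history[-1]
    forecasts.append(y_pred)
    history.append(y_pred)
```

The key design point is that the loop writes predictions back into the same series the lag features are read from. Errors can therefore compound over the horizon, which is the usual trade-off of recursive forecasting.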


In [110]:
# plots
ax = df_train.plot(figsize=(15,4))
y_preds_train.plot(ax=ax)
predict_df.plot(ax=ax)

#### We can see that the changepoint features can help capture changes in trend in the data when using linear models.

<img src='./plots/piecewise-regression-retail-sales-prediction-test-set.png'>