
# Table Of Contents
* [Introduction](#intro)
* [Reading Data and Preparation](#rdp)
* [Decomposition of Data](#dcp)
* [Model Fitting](#fit)
* [Conclusion](#conc)

# Introduction
<a id="intro"></a>

When it comes to autoregressive forecasting, ARIMA and its variants dominate time series practice. Can machine learning models serve the same purpose? They can, if we restructure the time series data into a tabular form that a regression model can learn from.
This notebook focuses on using [Sktime](https://github.com/alan-turing-institute/sktime), a new and promising Python library, for multi-step-ahead forecasts with tree ensembles (or any regression model).
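
To make the idea of restructuring a series for a regression model concrete, here is a minimal sketch of a sliding-window transformation. The helper name `make_lag_matrix` and the toy series are assumptions made for illustration; this is not Sktime's internal code.

```python
import numpy as np

def make_lag_matrix(series, window_length):
    """Turn a 1-D series into (X, y) pairs for a tabular regressor.

    Each row of X holds `window_length` consecutive observations and the
    matching entry of y is the observation that immediately follows them.
    """
    values = np.asarray(series, dtype=float)
    n_rows = len(values) - window_length
    if n_rows <= 0:
        raise ValueError("series must be longer than window_length")
    X = np.stack([values[i:i + window_length] for i in range(n_rows)])
    y = values[window_length:]
    return X, y

# Hypothetical toy series, used only to show the shape of the result.
toy_series = [10, 20, 30, 40, 50, 60]
X, y = make_lag_matrix(toy_series, window_length=2)
print(X)  # rows: [10, 20], [20, 30], [30, 40], [40, 50]
print(y)  # targets: 30, 40, 50, 60
```

Any regressor that accepts a feature matrix and a target vector can then be trained on `X` and `y`.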

In [None]:
! pip install sktime==0.4.3

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import warnings
import itertools
import matplotlib.pyplot as plt
import lightgbm as lgb
from sktime.forecasting.base import ForecastingHorizon
from pylab import rcParams

rcParams['figure.figsize'] = 18, 8
warnings.filterwarnings("ignore")

from statsmodels.tsa.seasonal import seasonal_decompose

#-----------------------Imports from Sktime--------------------------

from sktime.transformers.single_series.detrend import Deseasonalizer, Detrender
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)
from sktime.utils.plotting import plot_series
from sktime.forecasting.compose import (
    TransformedTargetForecaster,
    ReducedRegressionForecaster
)

# Reading Data and Preparation
<a id="rdp"></a>

In [None]:
#------------------------Reading Data from CSV---------------------------

df = pd.read_csv("/kaggle/input/demand-forecasting-kernels-only/train.csv"
                ,parse_dates=['date'])
df.head()

## Preparing the Data

In [None]:
#--------------------------Data Preparation-------------------------

#For the sake of demonstration, we will train our model on monthly aggregated sales data of a particular store.

# Select sales for Store 1 Only.
store1_agg = df.loc[df['store']==1].groupby(['date'])['sales'].sum()
store1_agg.index = pd.to_datetime(store1_agg.index)

#Aggregate the Data on a Monthly basis.
store1_agg_monthly = store1_agg.resample('M').sum()
store1_agg_monthly.head()

In [None]:
#--------------------Visualize Data on a Time Plot-------------------

sns.lineplot(
    data=store1_agg_monthly, 
)
plt.show()

## Decomposition of Data
<a id="dcp"></a>

The plot above suggests that the time series is seasonal and heteroskedastic: the seasonal swings grow as the level of the series rises. In such a case we use a **multiplicative decomposition** to examine its **seasonal and trend** components and the residuals.
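
As a rough illustration of why growing seasonal swings point to a multiplicative model, the sketch below builds a synthetic series (an assumption, not the store data) as trend times seasonality. It compares the seasonal swing in absolute units with the swing measured as a ratio to the trend.

```python
import numpy as np

# Synthetic monthly series (assumed for illustration):
# a linear trend multiplied by a fixed 12-month seasonal pattern.
n_months = 48
t = np.arange(n_months)
trend = 100.0 + 5.0 * t
seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * t / 12)
y = trend * seasonal

# Additive view: the swing in absolute units grows with the level.
additive_resid = y - trend
first_year_swing = np.ptp(additive_resid[:12])
last_year_swing = np.ptp(additive_resid[-12:])

# Multiplicative view: the swing as a ratio to the trend stays constant.
ratio = y / trend
first_year_ratio_swing = np.ptp(ratio[:12])
last_year_ratio_swing = np.ptp(ratio[-12:])

print(first_year_swing, last_year_swing)
print(first_year_ratio_swing, last_year_ratio_swing)
```

When the ratio-based swing is stable while the absolute swing grows, dividing out the components (rather than subtracting them) is the more natural choice.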

In [None]:
#----------------------Time Series Decomposition---------------------

seasonal_decompose(store1_agg_monthly,model="multiplicative",period=12).plot()
plt.show()

## Model Fitting
<a id="fit"></a>
Here we explore Sktime's regression wrapper for time series forecasting.

In [None]:
#--------------------Time Series Train-Test split-------------------

store1_agg_monthly.index = store1_agg_monthly.index.to_period('M') 
y_train, y_test = temporal_train_test_split(store1_agg_monthly, test_size=0.2)

### Fitting a Detrender
In this step we fit a **linear trend** detrender, which we will later use to forecast the future trend.
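
Conceptually, linear detrending fits a straight line against time, subtracts it, and extrapolates the same line to forecast the trend. The sketch below shows this with `np.polyfit` on made-up numbers; ordinary least squares is used here as a stand-in, and the toy values are assumptions rather than the notebook's data.

```python
import numpy as np

# Hypothetical toy series with an upward drift.
y_toy = np.array([12.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0, 27.0])
t = np.arange(len(y_toy))

# Fit a degree-1 polynomial; np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(t, y_toy, deg=1)
fitted_trend = intercept + slope * t

# Detrending subtracts the in-sample fitted line.
residuals = y_toy - fitted_trend

# Forecasting the trend extends the same line past the last observation.
future_t = np.arange(len(y_toy), len(y_toy) + 3)
future_trend = intercept + slope * future_t

print(residuals.round(2))
print(future_trend.round(2))
```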

In [None]:
#--------------------------Detrender-----------------------------

#degree=1 for Linear
forecaster = PolynomialTrendForecaster(degree=1) 
transformer = Detrender(forecaster=forecaster)

#Get the residuals after fitting a linear trend
y_resid = transformer.fit_transform(y_train)

# Internally, the Detrender uses the in-sample predictions
# of the PolynomialTrendForecaster
forecaster = PolynomialTrendForecaster(degree=1)
fh_ins = -np.arange(len(y_train))  # in-sample forecasting horizon
y_pred = forecaster.fit(y_train).predict(fh=fh_ins)

plot_series(y_train, y_pred, y_resid, labels=["y_train", "fitted linear trend", "residuals"]);

### Fitting a Deseasonalizer
In this step we fit a Deseasonalizer, which we will later use to forecast the seasonality. We extract the monthly seasonality here (seasonal period `sp=12`).
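
For intuition, multiplicative seasonal indices can be estimated with the classical ratio-to-moving-average idea sketched below. The function name `multiplicative_seasonal_indices` and the synthetic series are assumptions for illustration, and the sketch does not claim to reproduce the library's internals.

```python
import numpy as np

def multiplicative_seasonal_indices(y, sp):
    """Estimate one multiplicative seasonal index per position in the cycle.

    Uses the ratio-to-moving-average idea; assumes an even `sp` and
    several full cycles of data.
    """
    y = np.asarray(y, dtype=float)
    # Centred 2 x sp moving average: half weight on the two end points.
    weights = np.r_[0.5, np.ones(sp - 1), 0.5] / sp
    moving_avg = np.convolve(y, weights, mode="valid")
    half = sp // 2
    # Ratio of each observation to the smoothed level around it.
    ratios = y[half:len(y) - half] / moving_avg
    positions = np.arange(half, len(y) - half) % sp
    indices = np.array([ratios[positions == p].mean() for p in range(sp)])
    return indices / indices.mean()  # normalise so the indices average to 1

# Synthetic series (assumed): mild linear trend times a known seasonal pattern.
t = np.arange(48)
true_seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * t / 12)
y_toy = (200.0 + 2.0 * t) * true_seasonal

indices = multiplicative_seasonal_indices(y_toy, sp=12)
deseasonalized = y_toy / indices[t % 12]
print(indices.round(3))
```

Dividing the series by the index for each month removes the seasonal pattern; multiplying a forecast by the same indices puts it back.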

In [None]:
#--------------------------Deseasonalizer-----------------------------

# Multiplicative Deseasonalizer, sp=12 (seasonal period for monthly data)
deseasonalizer = Deseasonalizer(model="multiplicative", sp=12)

# Fit once and reuse the result; despite its name, `seasonal` holds the deseasonalized series.
seasonal = deseasonalizer.fit_transform(y_train)
plot_series(seasonal);

### Fitting an Autoregressor.
After removing the seasonal component and the linear trend, we fit an autoregressor that regresses the remainder on its own previous values. These three sequential steps can be conveniently pipelined using **TransformedTargetForecaster**. At prediction time the pipeline forecasts with the autoregressor, then adds back the forecast trend and reapplies the seasonality, combining the three components into a single forecast. To generate prediction intervals, we build three such pipelines, one per quantile (10th, median, and 90th).
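
The recursive strategy for multi-step forecasts can be sketched as follows: a one-step-ahead model is applied repeatedly, and each prediction is fed back into the input window. The helper `recursive_forecast`, the least-squares model, and the toy series are assumptions for illustration, standing in for the tree ensemble used in this notebook.

```python
import numpy as np

def recursive_forecast(history, predict_one_step, window_length, n_steps):
    """Forecast n_steps ahead by feeding each prediction back as an input."""
    window = list(history[-window_length:])
    forecasts = []
    for _ in range(n_steps):
        next_value = predict_one_step(np.array(window))
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return np.array(forecasts)

# Toy series 1, 2, ..., 10 and a linear one-step model fitted by least squares.
series = np.arange(1.0, 11.0)
window_length = 2
X = np.stack([series[i:i + window_length]
              for i in range(len(series) - window_length)])
y = series[window_length:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

forecasts = recursive_forecast(
    series, lambda w: float(w @ coef), window_length, n_steps=3
)
print(forecasts.round(2))  # continues the sequence: 11, 12, 13
```

Because later steps depend on earlier predictions, errors can accumulate over the horizon, which is one reason to inspect prediction intervals alongside the point forecast.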

In [None]:
#----------------------------Create Pipeline--------------------

def get_transformed_target_forecaster(alpha, params):

    # Initialize the LightGBM regressor; with objective='quantile',
    # alpha selects the quantile to be estimated.
    regressor = lgb.LGBMRegressor(alpha=alpha, **params)

    #-----------------------Forecaster Pipeline-----------------

    # 1. Remove the seasonal component.
    # 2. Remove the linear trend, fitted by a trend forecaster.
    # 3. Fit an autoregressor to the residuals (regressing on the four previous values).

    forecaster = TransformedTargetForecaster(
        [
            ("deseasonalise", Deseasonalizer(model="multiplicative", sp=12)),
            ("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
            (
                # Recursive strategy for multi-step-ahead forecasts,
                # autoregressing on the four previous values (window_length=4).
                "forecast",
                ReducedRegressionForecaster(
                    regressor=regressor, window_length=4, strategy="recursive",
                ),
            ),
        ]
    )
    return forecaster

In [None]:
#-------------------Fitting an Auto Regressive Light-GBM-----------------

#Setting Quantile Regression Hyper-parameter.
params = {
    'objective':'quantile'
}

#The 10th and 90th percentiles (alpha=0.1 and 0.9) bound an 80 percent prediction interval;
#alpha=0.5 gives the median forecast.
quantiles = [.1, .5, .9] #Hyper-parameter "alpha" in Light GBM

#Capture forecasts for 10th/median/90th quantile, respectively.
forecasts = []

#Iterate for each quantile.
for alpha in quantiles:
    
    forecaster = get_transformed_target_forecaster(alpha,params)
    
    #Initialize ForecastingHorizon class to specify the horizon of forecast
    fh = ForecastingHorizon(y_test.index, is_relative=False)
    
    #Fit on Training data.
    forecaster.fit(y_train)
    
    #Forecast the values.
    y_pred = forecaster.predict(fh)
    
    #List of forecasts made for each quantile.
    y_pred.index.name="date"
    y_pred.name=f"predicted_sales_q_{alpha}"
    forecasts.append(y_pred)
    
#Append the actual data for plotting.
store1_agg_monthly.index.name = "date"
store1_agg_monthly.name = "original"
forecasts.append(store1_agg_monthly)
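
For intuition on why the `alpha` values above yield different quantiles: quantile regression minimizes the pinball loss, which penalizes under- and over-prediction asymmetrically. The sketch below is a plain NumPy illustration on a made-up sample; the function name `pinball_loss` and the data are assumptions.

```python
import numpy as np

def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for quantile level alpha."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1) * diff)))

# Hypothetical sample: the integers 1..100.
sample = np.arange(1, 101)

def best_constant(alpha):
    """Constant prediction (chosen from the sample) with the lowest loss."""
    losses = [pinball_loss(sample, np.full(sample.shape, c), alpha)
              for c in sample]
    return int(sample[int(np.argmin(losses))])

best_low = best_constant(0.1)
best_mid = best_constant(0.5)
best_high = best_constant(0.9)
print(best_low, best_mid, best_high)
```

The best constant for each `alpha` sits near the matching sample quantile, which is why fitting three models with different `alpha` values gives a lower bound, a median, and an upper bound.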

In [None]:
#-------------------Final Plotting of Forecasts------------------

plot_data = pd.melt(
    pd.concat(forecasts, axis=1).reset_index(),
    id_vars=['date'],
    value_vars=[
        'predicted_sales_q_0.1',
        'predicted_sales_q_0.5',
        'predicted_sales_q_0.9',
        'original',
    ],
)
plot_data['date'] = pd.to_datetime(plot_data['date'].astype(str).to_numpy())
plot_data['if_original'] = plot_data['variable'].apply(
    lambda r: 'original' if r == 'original' else 'predicted'
)
sns.lineplot(
    data=plot_data,
    x='date',
    y='value',
    hue='if_original',
    style='if_original',
    markers=['o', 'o'],
)
plt.show()

## Conclusion
<a id="conc"></a>
I will create more notebooks demonstrating other features of this library. Thanks!