# Item Demand Forecasting
This notebook details some basic code to get simple time series forecasting algorithms up and running for several item demand profiles. The focus here is not an optimised algorithm but a simple base model from which we can explore the concepts of machine learning engineering in the rest of the book.

## The Problem
The problem outlined in the book is deceptively complex, so here we reiterate some of the details:

1. The aim is to create a suite of forecasting models for many items in many regions
2. The user will interface with the models via a basic frontend
3. The user should have the ability to:
    a. Select the forecasts they want to see
    b. Ask for a retrain of the selected forecasting models (and then have the forecasts updated)
    c. [BONUS] Add conditional variables to perform scenario analysis
    
From a machine learning point of view this can be delivered via:

1. A batch training of N baseline models (N = combinations of region and product)
2. Storing the models in MLflow.
3. The app, when asked for specific models, can retrieve them from MLflow and cache them for the session.
4. The app, when asked for specific forecasts, can run the forecasts on the cached models.
5. The app, when asked for retrains, can send a request to trigger a training run of the selected models in the original batch system.
6. The training system, when asked for retrains, can compare performance against the current models in MLflow and decide whether to promote the new ones to 'production'.
7. The app will have to detect that there is a new production model in MLflow and replace the current model (mechanism still to be worked out).
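
Steps 3, 6 and 7 can be sketched without any infrastructure. The example below is a minimal illustration only: `InMemoryRegistry`, `SessionModelCache`, the `(region, item)` key and the version counter are all hypothetical names invented here, and the registry is a plain dictionary standing in for MLflow rather than a use of its API. It promotes a retrained model only when its validation RMSE improves, and lets the session cache notice a new production version through a cheap version check.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

ModelKey = Tuple[str, str]  # (region, item); hypothetical key format


@dataclass
class ModelVersion:
    version: int
    model: Any
    rmse: float  # validation error, lower is better


@dataclass
class InMemoryRegistry:
    """Hypothetical stand-in for a model registry such as MLflow."""
    production: Dict[ModelKey, ModelVersion] = field(default_factory=dict)

    def production_version(self, key: ModelKey) -> int:
        """Cheap metadata lookup: the current production version number."""
        return self.production[key].version

    def load_production(self, key: ModelKey) -> ModelVersion:
        """Stands in for the (expensive) download of a model artefact."""
        return self.production[key]

    def promote_if_better(self, key: ModelKey, model: Any, rmse: float) -> bool:
        """Step 6: promote the retrained model only if it beats production."""
        current = self.production.get(key)
        if current is not None and rmse >= current.rmse:
            return False
        next_version = 1 if current is None else current.version + 1
        self.production[key] = ModelVersion(next_version, model, rmse)
        return True


class SessionModelCache:
    """Steps 3 and 7: cache models per session, reload on a new version."""

    def __init__(self, registry: InMemoryRegistry) -> None:
        self._registry = registry
        self._cache: Dict[ModelKey, ModelVersion] = {}

    def get(self, key: ModelKey) -> Any:
        cached = self._cache.get(key)
        current_version = self._registry.production_version(key)
        if cached is None or cached.version != current_version:
            cached = self._registry.load_production(key)
            self._cache[key] = cached
        return cached.model


registry = InMemoryRegistry()
key = ("region-1", "item-42")
registry.promote_if_better(key, model="baseline", rmse=120.0)

cache = SessionModelCache(registry)
first = cache.get(key)

registry.promote_if_better(key, model="worse retrain", rmse=150.0)  # rejected
after_worse = cache.get(key)

registry.promote_if_better(key, model="better retrain", rmse=90.0)  # promoted
after_better = cache.get(key)
```

The strings stand in for fitted models. In a real deployment the version check would be a call to the registry service, so how often the app polls is a design decision this sketch leaves open.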

## This Notebook
This notebook creates some baseline experiments to show that forecasting can be done at the region and item level. It does not concern itself with scaling to all regions and items, nor with the deployment architecture and methodology (to be discussed later in the book).

## Set Up

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 22})
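# The package was renamed from `fbprophet` to `prophet` in its 1.0 release;
# try the new name first and fall back to the old one.
try:
    from prophet import Prophet
except ImportError:
    from fbprophet import Prophet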


## Get Dataset

In [None]:
df = pd.read_csv('../data/rossman/train.csv')

In [None]:
df.dtypes

In [None]:
df['Date'] = pd.to_datetime(df['Date'])

In [None]:
df[['Date', 'Store', 'Sales']]

In [None]:
df[['StateHoliday','SchoolHoliday']].value_counts()

## Train basic forecaster

In [None]:
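# Prophet expects the date column to be named 'ds' and the target 'y'
df.rename(columns={'Date': 'ds', 'Sales': 'y'}, inplace=True)
# keep open days only, for a single store (store 4, despite the variable name)
df_store1 = df[
    (df['Store'] == 4) &
    (df['Open'] == 1)
]
# sort chronologically so that positional (iloc) splits are time-ordered
df_store1 = df_store1.sort_values('ds', ascending=True).reset_index(drop=True)
df_store1.plot(x='ds', y='y')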

In [None]:
df_store1['DayOfWeek'].max()

In [None]:
# df_store1 = df_store1[['ds', 'y']].set_index('ds').resample('1W').mean().reset_index()
# df_store1.plot(x='ds', y='y')
# #df_store1[['ds', 'y']].set_index('ds').resample('1W').mean().reset_index().plot(x='ds', y='y')

In [None]:
# df_store1[
#     (df_store1['ds']>'2013-02') & (df_store1['ds']<'2013-09')
# ].sort_values('ds', ascending=False).plot(x='ds', y='y')

In [None]:
# df_store1 = df_store1[
#     (df_store1['ds']>'2013-02') & (df_store1['ds']<'2013-09')
# ]

In [None]:
df_store1.sort_values('ds', ascending=True)

In [None]:
seasonality = {
    'yearly': True,
    'weekly': True,
    'daily': False
}

def train_predict(df, train_index, seasonality=seasonality):
    """Fit Prophet on the first `train_index` rows and predict the remaining rows."""
    # chronological split: df must already be sorted by 'ds'
    df_train = df.iloc[:train_index].copy()
    df_test = df.iloc[train_index:].copy()

    # create Prophet model with 95% uncertainty intervals
    model = Prophet(
        yearly_seasonality=seasonality['yearly'],
        weekly_seasonality=seasonality['weekly'],
        daily_seasonality=seasonality['daily'],
        interval_width=0.95
    )

    # train and predict
    model.fit(df_train)
    predicted = model.predict(df_test)
    return predicted, df_train, df_test
    
    
    
# def time_split_train_test(df, time_series_splits, seasonality=seasonality):
#     # for outputting
#     df_results = pd.DataFrame()
    
#     for i,(train_i,test_i) in enumerate(time_series_splits.split(df)):
        
#         # grab split data
#         df_train = df.copy().iloc[train_i,:]
#         df_test = df.copy().iloc[test_i,:]

#         # create Prophet model
#         model=Prophet(
#             yearly_seasonality=seasonality['yearly'],
#             weekly_seasonality=seasonality['weekly'],
#             daily_seasonality=seasonality['daily']
#         )

#         # train and predict
#         model.fit(df_train)
#         predicted = model.predict(df_test)

#         # combine pred and training df's for plotting
#         df_pred = predicted.loc[:,["ds","yhat"]]
        
#         df_pred["y"] = df_test['y'].tolist()
        
#         # Train or Test?
#         df_train["train"]=True
#         df_pred["train"]=False
        
#         df_sub = df_train.append(df_pred).reset_index(drop=True)
#         df_sub["split"]=i
#         df_sub["rmse"]=(np.mean((df_sub.yhat-df_sub.y)**2))**.5 #calculating rmse for the split
        
#         df_results = df_results.append(df_sub).reset_index(drop=True)
#     return df_results
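
The commented-out helper above iterates over `time_series_splits.split(df)`, which looks like the interface of scikit-learn's `TimeSeriesSplit`. That is an assumption, as the notebook never defines the object. To make the idea concrete without that dependency, here is a minimal sketch of an expanding-window splitter; `expanding_window_splits` and its parameters are hypothetical names introduced for illustration.

```python
from typing import Iterator, Tuple

import numpy as np


def expanding_window_splits(
    n_samples: int, n_splits: int, test_size: int
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (train_idx, test_idx) pairs of positional indices.

    Each training window starts at row 0 and ends where its test window
    begins, so no future rows leak into training.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        test_start = first_test_start + i * test_size
        train_idx = np.arange(0, test_start)
        test_idx = np.arange(test_start, test_start + test_size)
        yield train_idx, test_idx


# toy example: 10 time-ordered rows, 3 folds of 2 test rows each
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print("train:", train_idx.tolist(), "test:", test_idx.tolist())
```

The index arrays are positional, so they would be used with `df.iloc[train_idx]` and `df.iloc[test_idx]` on a frame already sorted by `ds`.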

In [None]:
train_index = int(0.8*df_store1.shape[0])
predicted, df_train, df_test = train_predict(
    df = df_store1,
    train_index = train_index,
    seasonality=seasonality
)
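
A plot shows the fit qualitatively; a few summary numbers make runs comparable. The sketch below is one possible way to score a forecast, not part of the original notebook: `forecast_metrics` is a hypothetical helper that computes RMSE, MAE and the share of true values falling inside the uncertainty interval. It assumes the test frame has a `y` column and the prediction frame has `yhat`, `yhat_lower` and `yhat_upper`, as used in the plotting cell. Values are aligned by position rather than by index, since the two frames need not share an index. The toy frames are made-up numbers for illustration only.

```python
import numpy as np
import pandas as pd


def forecast_metrics(df_test: pd.DataFrame, predicted: pd.DataFrame) -> dict:
    """Score a forecast against held-out truth, aligning rows by position."""
    y_true = df_test["y"].to_numpy(dtype=float)
    y_pred = predicted["yhat"].to_numpy(dtype=float)
    lower = predicted["yhat_lower"].to_numpy(dtype=float)
    upper = predicted["yhat_upper"].to_numpy(dtype=float)

    errors = y_pred - y_true
    inside = (y_true >= lower) & (y_true <= upper)
    return {
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
        "mae": float(np.mean(np.abs(errors))),
        "coverage": float(np.mean(inside)),  # fraction of truth inside the band
    }


# toy data with deliberately mismatched indices
toy_test = pd.DataFrame({"y": [10.0, 12.0, 9.0, 11.0]}, index=[7, 8, 9, 10])
toy_pred = pd.DataFrame({
    "yhat": [11.0, 12.0, 7.0, 11.0],
    "yhat_lower": [9.0, 10.0, 5.0, 9.0],
    "yhat_upper": [13.0, 14.0, 8.5, 13.0],
})
metrics = forecast_metrics(toy_test, toy_pred)
print(metrics)
```

With the notebook's own variables the call would be `forecast_metrics(df_test, predicted)`. For a model built with `interval_width=0.95`, a coverage far below 0.95 on the test set would suggest the intervals are too narrow.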

In [None]:
fig, ax = plt.subplots(figsize=(20,10))

df_test.plot(x='ds', y='y', ax=ax, label='Truth', linewidth=1, markersize=5, color='tab:blue',alpha=0.9, marker='o')
predicted.plot(x='ds', y='yhat', ax=ax, label='Prediction', linewidth=2, markersize=5, color='k')
ax.fill_between(x=predicted['ds'], y1=predicted['yhat_upper'], y2=predicted['yhat_lower'], alpha=0.1, color='k')
df_train.iloc[train_index-100:].plot(x='ds', y='y', ax=ax, color='tab:blue', label='_nolegend_', alpha=0.5, marker='o')

ax.set_xlabel('Date')
ax.set_ylabel('Sales')

In [None]:
train_index = int(0.8*df_store1.shape[0])
# grab split data
df_train = df_store1.copy().iloc[0:train_index]
df_test = df_store1.copy().iloc[train_index:]

#create Prophet model
model=Prophet(
    yearly_seasonality=seasonality['yearly'],
    weekly_seasonality=seasonality['weekly'],
    daily_seasonality=seasonality['daily']
)

# train and predict
model.fit(df_train)
predicted = model.predict(df_test)

In [None]:
df_store1.iloc[0:train_index]