# Store Item Demand Forecasting Challenge

<a href="https://www.kaggle.com/c/demand-forecasting-kernels-only">Link to competition on Kaggle.</a>

This notebook presents several time series forecasting techniques of varying sophistication.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

pd.options.display.max_columns = 99
plt.rcParams['figure.figsize'] = (16, 9)

## Load Data

In [None]:
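# Parse the date column by name: its position differs between the two files
# (test.csv is assumed to start with an `id` column).
df_train = pd.read_csv('../input/train.csv', parse_dates=['date'])
df_test = pd.read_csv('../input/test.csv', parse_dates=['date'])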
df_train.shape, df_test.shape

In [None]:
df_train.head()

In [None]:
# One faint scatter series per (store, item) pair.
# plt.plot with an 'o' marker replaces plt.plot_date, which is deprecated in recent matplotlib.
for _, t in df_train.groupby(['store', 'item']):
    plt.plot(t['date'], t['sales'], 'o', alpha=0.1)
plt.grid()
plt.xlabel('date')
plt.ylabel('sales');

## Average Method

For our first and simplest model, we predict every future day's sales for a store-item pair as the mean of that pair's historical sales.

In [None]:
am_results = df_test.copy()
# Float placeholder so the per-pair means can be assigned without a dtype clash.
am_results['sales'] = 0.0

for s in am_results['store'].unique():
    for i in am_results['item'].unique():
        historical_average = df_train.loc[(df_train['store'] == s) & (df_train['item'] == i), 'sales'].mean()
        am_results.loc[(am_results['store'] == s) & (am_results['item'] == i), 'sales'] = historical_average
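
The nested loop above can also be written as a single group-by followed by a merge. The following is a minimal sketch on toy data, not a replacement for the cell above: `toy_train`, `toy_test`, `pair_means` and `toy_forecast` are hypothetical names, and the values are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for df_train / df_test (hypothetical values, same column names).
toy_train = pd.DataFrame({
    'store': [1, 1, 1, 2, 2, 2],
    'item':  [1, 1, 1, 1, 1, 1],
    'sales': [10, 12, 14, 20, 22, 27],
})
toy_test = pd.DataFrame({
    'id':    [0, 1, 2, 3],
    'store': [1, 1, 2, 2],
    'item':  [1, 1, 1, 1],
})

# One mean per (store, item) pair, computed in a single pass.
pair_means = (
    toy_train.groupby(['store', 'item'], as_index=False)['sales']
    .mean()
)

# A left merge keeps every test row, in its original order.
toy_forecast = toy_test.merge(pair_means, on=['store', 'item'], how='left')
print(toy_forecast)
```

On the full data set this avoids re-filtering `df_train` once per pair, which is where the loop spends most of its time.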

In [None]:
am_results.drop(['date', 'store', 'item'], axis=1, inplace=True)
am_results.head()

In [None]:
am_results.to_csv('am_results.csv', index=False)

Scores 28.35111 on the leaderboard.
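
To put the score in context: to our understanding, this competition ranks submissions by SMAPE (symmetric mean absolute percentage error), where lower is better. The sketch below shows one common way to compute it, assuming the percentage form and the convention that a term counts as zero when both the actual and the forecast are zero. The function name `smape` and the sample values are ours, not part of the competition code.

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (assumed convention)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denom = np.abs(actual) + np.abs(forecast)
    diff = np.abs(forecast - actual)
    # Where both values are zero the denominator is zero; count that term as 0.
    ratio = np.divide(2.0 * diff, denom, out=np.zeros_like(denom), where=denom != 0)
    return 100.0 * ratio.mean()

# Made-up example: one miss, one exact hit, one zero-zero pair.
example_score = smape([10, 20, 0], [12, 20, 0])
print(example_score)
```

A function like this could be used to score a method locally, for example by holding out the last three months of `df_train`, before submitting.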

## Seasonal Naive Method

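For this model, we predict each day's sales as the value observed 365 days earlier for the same store and item. Because 365 is not a multiple of 7, the matched day falls on a different day of the week.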

In [None]:
snm_results = df_test.copy()
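# pd.to_datetime is a no-op if the column is already datetime; astype(np.datetime64) fails on recent pandas.
snm_results['date'] = pd.to_datetime(snm_results['date'])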
snm_results['sales'] = 0

In [None]:
import datetime

prev_dates = snm_results.loc[(snm_results['store'] == 1) & (snm_results['item'] == 1), 'date'] - datetime.timedelta(days=365)

In [None]:
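# Copy last year's values into the matching test rows for each pair.
# This relies on both frames being sorted by date within each (store, item) pair,
# and on every pair having the same number of matching rows.
for s in snm_results['store'].unique():
    for i in snm_results['item'].unique():
        snm_results.loc[(snm_results['store'] == s) & (snm_results['item'] == i), 'sales'] = \
            df_train.loc[((df_train['store'] == s) & (df_train['item'] == i)) & (df_train['date'].isin(prev_dates)), 'sales'].values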
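
The positional assignment above works only while the row order of both frames agrees. As an alternative, the same 365-day lookback can be sketched as a key-based merge, which matches rows by date, store and item instead of by position. This is an illustrative sketch on toy data: `toy_train`, `toy_test`, `lookup` and `toy_snm` are hypothetical names and the sales values are made up.

```python
import pandas as pd

# Toy stand-ins for df_train / df_test (hypothetical values, same column names).
toy_train = pd.DataFrame({
    'date':  pd.to_datetime(['2017-01-01', '2017-01-02', '2017-01-01', '2017-01-02']),
    'store': [1, 1, 2, 2],
    'item':  [1, 1, 1, 1],
    'sales': [13, 11, 30, 28],
})
toy_test = pd.DataFrame({
    'id':    [0, 1, 2, 3],
    'date':  pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-01', '2018-01-02']),
    'store': [1, 1, 2, 2],
    'item':  [1, 1, 1, 1],
})

# Shift each training date forward by 365 days so it lines up with the
# test date it will be used to predict.
lookup = toy_train.assign(date=toy_train['date'] + pd.Timedelta(days=365))

# Join on all three keys; a missing match would show up as NaN
# instead of silently misaligning rows.
toy_snm = toy_test.merge(lookup, on=['date', 'store', 'item'], how='left')
print(toy_snm)
```

With this form, a quick `toy_snm['sales'].isna().sum()` check reveals any test date that has no counterpart in the training data.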

In [None]:
snm_results.drop(['date', 'store', 'item'], axis=1, inplace=True)
snm_results.head()

In [None]:
snm_results.to_csv('snm_results.csv', index=False)

Scores x on the leaderboard.