<a id="1"></a>
<h3 style="background-color:green;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Problem definition</h3>

# Store Sales - Time Series Forecasting
- You'll use time-series forecasting to predict store sales using data from Corporación Favorita, a large Ecuador-based grocery retailer.
- You'll build a model that more accurately predicts the unit sales for thousands of items sold at different Favorita stores.

### Importing libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import statsmodels.api as sm
import itertools

import warnings
warnings.filterwarnings("ignore")

matplotlib.rcParams['axes.labelsize'] = 14
matplotlib.rcParams['xtick.labelsize'] = 12
matplotlib.rcParams['ytick.labelsize'] = 12
matplotlib.rcParams['text.color'] = 'k'
plt.style.use('Solarize_Light2')

In [None]:
print(plt.style.available)

### Importing data

In [None]:
DATA_PATH = "../input/store-sales-time-series-forecasting"
data = pd.read_csv(f"{DATA_PATH}/train.csv")

In [None]:
# work on a copy so the raw data stays untouched
df = data.copy()

In [None]:
df.head()

In [None]:
df.info()

In [None]:
# cast the date column to datetime (the dates are ISO formatted, YYYY-MM-DD, so dayfirst is not needed)
df['date'] = pd.to_datetime(df['date'])

In [None]:
# check the descriptive stats
df.describe()

In [None]:
# finding min and max date
df.date.min(),df.date.max()

### Data processing

In [None]:
# checking for null values
df.isna().sum()

In [None]:
# let us start with only the AUTOMOTIVE family
automotive_df = df[df['family'] == 'AUTOMOTIVE']
automotive_df

In [None]:
# dropping the columns we do not need; reassigning avoids an in-place edit on a slice of df
cols = ['id', 'store_nbr', 'family', 'onpromotion']
automotive_df = automotive_df.drop(columns=cols)

In [None]:
automotive_df = automotive_df.groupby('date')['sales'].sum().reset_index()
automotive_df.set_index('date',inplace=True)
automotive_df

In [None]:
automotive_df.index
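
Before resampling, it can help to check whether every calendar day is present in the index, since `isna()` only reports missing values, not missing rows. The sketch below uses a small made-up series (`daily_sales` is hypothetical, not the Favorita data) with one day deliberately removed.

```python
import pandas as pd

# Hypothetical daily series with one calendar day deliberately left out.
dates = pd.date_range("2016-12-20", "2016-12-31", freq="D")
dates = dates[dates != pd.Timestamp("2016-12-25")]
daily_sales = pd.Series(range(len(dates)), index=dates, name="sales")

# Build the complete calendar for the observed span and compare it with the index.
full_range = pd.date_range(daily_sales.index.min(), daily_sales.index.max(), freq="D")
missing_days = full_range.difference(daily_sales.index)
print(missing_days)
```

The same two lines should work on `automotive_df.index`, assuming it is a `DatetimeIndex` as above.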

- I will resample the series to monthly frequency: each value is the average daily sales for that month, and the start of each month is used as the timestamp.

In [None]:
y = automotive_df['sales'].resample('MS').mean()
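
To make the resampling step concrete, here is a minimal sketch on made-up numbers (the `daily` series is hypothetical). `'MS'` groups the rows by calendar month and labels each group with the first day of that month.

```python
import pandas as pd

# Hypothetical daily totals: 10 units per day in January 2017, 20 per day in February 2017.
idx = pd.date_range("2017-01-01", "2017-02-28", freq="D")
daily = pd.Series([10.0] * 31 + [20.0] * 28, index=idx, name="sales")

# 'MS' = month start: one row per month, timestamped on the 1st.
monthly = daily.resample("MS").mean()
print(monthly)
```

Taking the mean rather than the sum keeps months comparable even when they have different numbers of days or when the last month in the data is only partly observed.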

### Visualizing TS data

In [None]:
y.plot(figsize=(20,6))
plt.show()

### Visualizing using decomposition technique

In [None]:
# set the default figure size through the matplotlib import from above
matplotlib.rcParams['figure.figsize'] = (18, 8)

decomposition = sm.tsa.seasonal_decompose(y,model='additive')
fig = decomposition.plot()
plt.show()

### Time series forecasting with ARIMA (Autoregressive Integrated Moving Average)

In [None]:
# for testing purposes, using order (p, d, q) = (1, 1, 0) and seasonal_order = (1, 1, 0, 12)
mod = sm.tsa.statespace.SARIMAX(y,
                                order=(1, 1, 0),
                                seasonal_order=(1, 1, 0, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()
print(results.summary().tables[1])

### Model diagnostics 

In [None]:
# investigating any unusual behaviour
results.plot_diagnostics(figsize=(16, 8))
plt.show()

### One-step forecasting and validation

In [None]:
pred = results.get_prediction(start=pd.to_datetime('2017-01-01'), dynamic=False)
pred_ci = pred.conf_int()
ax = y['2014':].plot(label='observed')
pred.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.7, figsize=(14, 7))
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('automotive')
plt.legend()
plt.show()

### RMSE of our forecast

In [None]:
y_forecasted = pred.predicted_mean
y_truth = y['2017-01-01':]
mse = ((y_forecasted - y_truth) ** 2).mean()
print('The Mean Squared Error of our forecasts is {}'.format(round(mse, 2)))

In [None]:
print('The Root Mean Squared Error of our forecasts is {}'.format(round(np.sqrt(mse), 2)))
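
The subtraction above relies on pandas aligning the two series by date. A minimal sketch of the same calculation as a reusable helper, on made-up numbers (`rmse`, `truth` and `forecast` are hypothetical names), with the alignment made explicit:

```python
import numpy as np
import pandas as pd

def rmse(y_true: pd.Series, y_pred: pd.Series) -> float:
    """Root mean squared error over the timestamps the two series share."""
    y_true, y_pred = y_true.align(y_pred, join="inner")
    return float(np.sqrt(((y_pred - y_true) ** 2).mean()))

idx = pd.date_range("2017-01-01", periods=4, freq="MS")
truth = pd.Series([10.0, 12.0, 14.0, 16.0], index=idx)
forecast = pd.Series([11.0, 11.0, 15.0, 15.0], index=idx)

# Every error is +1 or -1, so the RMSE is 1.0.
print(rmse(truth, forecast))
```

Note that the model above was fit on the whole series, including 2017, so the error reported in this section is an in-sample figure.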

### Visualising the forecast

In [None]:
# visualizing it for 25 steps
pred_uc = results.get_forecast(steps=25)
pred_ci = pred_uc.conf_int()
ax = y.plot(label='observed', figsize=(14, 7))
pred_uc.predicted_mean.plot(ax=ax, label='Forecast')
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.25)
ax.set_xlabel('Date')
ax.set_ylabel('automotive')
plt.legend()
plt.show()

### Will continue..