# Alcohol Sales Prediction
![alcohol](https://i0.wp.com/xtalks.com/wp-content/uploads/2020/04/drinks-e1585833542715.jpg?resize=1098%2C600&ssl=1)

In this notebook we predict alcohol sales time series data using an LSTM model in PyTorch.

[Dataset Link](https://www.kaggle.com/bulentsiyah/for-simple-exercises-time-series-forecasting)

Thanks to [@bulentsiyah](https://www.kaggle.com/bulentsiyah/) for the dataset.

# Intro to LSTMs
> Long short-term memory (LSTM) is a recurrent neural network (RNN) architecture used in the field of deep learning. LSTM networks are well-suited to classifying, processing and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series.
>
> Source: Wikipedia

LSTM networks are a type of recurrent neural network capable of learning order dependence in sequence prediction problems.

We won't go into the details of LSTMs here. If you're curious, follow [this link](https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/).

# Standard Imports


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
sns.set_style("darkgrid")

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Loading Data

In [None]:
sales = pd.read_csv("../input/for-simple-exercises-time-series-forecasting/Alcohol_Sales.csv",index_col=0,parse_dates=True)

In [None]:
sales.head()

In [None]:
sales.tail()

In [None]:
sales.plot(figsize=(16,5),grid=True,legend = False);

We can observe that sales follow a repeating pattern within each year, while the overall level has increased from 1992 to 2019.
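To see how a yearly pattern and a long-term increase can be told apart, here is a minimal sketch on a synthetic series (not the alcohol sales data). The names `series`, `kernel` and `smoothed` are illustrative. A 12-month moving average covers one full cycle, so the seasonal part averages out and only the trend remains.

```python
import numpy as np

# Synthetic monthly series (NOT the alcohol sales data): linear trend + yearly cycle
months = np.arange(120)
trend = 100 + 2.0 * months
seasonal = 10 * np.sin(2 * np.pi * months / 12)
series = trend + seasonal

# A 12-month moving average spans exactly one cycle, so the seasonal part cancels
window = 12
kernel = np.ones(window) / window
smoothed = np.convolve(series, kernel, mode="valid")  # 120 - 12 + 1 values

print(smoothed[:3])
```

On the real data, the same idea could be tried with `sales.rolling(12).mean()`, though the trend there will not be as clean as in this toy case.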

In [None]:
X = sales.index
Y = sales['S4248SM144NCEN'].values.astype(float)

In [None]:
X = np.array(X)

In [None]:
X[0]

The dates are now stored as `datetime64` values, NumPy's dedicated date-time data type.
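As a small illustration of what `datetime64` offers, the sketch below builds a date, shifts it by one month, and creates a range of month starts. The values are made up for the example, and the exact precision of the dtype in `X` depends on pandas, so it is not assumed here.

```python
import numpy as np

# A single date at day precision
d = np.datetime64("2019-01-01")
print(d.dtype)  # datetime64[D]

# Month arithmetic is done at month precision
next_month = d.astype("datetime64[M]") + 1
print(next_month)  # 2019-02

# A range of month starts, converted back to day precision
months = np.arange("2019-02", "2019-05", dtype="datetime64[M]").astype("datetime64[D]")
print(months)
```

The same `np.arange` pattern is used later in this notebook to build the date axis for the predictions.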

# Defining Training and Testing Data

In [None]:
test_size = 12
train_set = Y[:-test_size]
test_set = Y[-test_size:]

## Normalizing the Train Set

In [None]:
from sklearn.preprocessing import MinMaxScaler

In [None]:
scaler = MinMaxScaler(feature_range=(-1,1))

In [None]:
train_norm = scaler.fit_transform(train_set.reshape(-1,1))

In [None]:
train_norm = train_norm.flatten()

In [None]:
train_norm = torch.FloatTensor(train_norm)
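For intuition, the scaling to the range (-1, 1) can be written out by hand. The sketch below is a plain NumPy version of the same linear mapping, using hypothetical helper names `scale_to_range` and `unscale`; it is not the scikit-learn implementation. Because the minimum and maximum are taken from the data passed in, fitting on the training set only keeps test-set statistics out of the scaling.

```python
import numpy as np

def scale_to_range(values, lo=-1.0, hi=1.0):
    """Linearly map values so that min -> lo and max -> hi.

    Also returns the fitted min and max so the mapping can be inverted.
    """
    vmin, vmax = values.min(), values.max()
    scaled = (values - vmin) / (vmax - vmin) * (hi - lo) + lo
    return scaled, vmin, vmax

def unscale(scaled, vmin, vmax, lo=-1.0, hi=1.0):
    """Invert scale_to_range using the stored min and max."""
    return (scaled - lo) / (hi - lo) * (vmax - vmin) + vmin

toy = np.array([10.0, 15.0, 20.0])  # illustrative values, not real sales
scaled, vmin, vmax = scale_to_range(toy)
restored = unscale(scaled, vmin, vmax)

print(scaled)    # [-1.  0.  1.]
print(restored)  # [10. 15. 20.]
```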

## Preparing the Train Data

We'll be dividing our dataset into windows of size 12.<br>
After our model has been trained, we will predict the sales for the next 12 months, i.e. one year.

In [None]:
def get_windows(data,ws):
    out = []
    L = len(data)
    for i in range(L-ws):
        out.append((data[i:i+ws],data[i+ws:i+ws+1]))
    return out

In the code above, we append tuples to a list. Each tuple holds the sales data for one window, which in our case spans a year, together with the sales for the month immediately after that window. That following month is the value the model learns to predict.
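A toy run makes the window/target pairing concrete. The sketch below repeats the same function on a plain Python list with a window size of 3; the list `toy` is made up for illustration.

```python
def get_windows(data, ws):
    """Return (window, next_value) pairs for every position where both fit."""
    out = []
    for i in range(len(data) - ws):
        out.append((data[i:i + ws], data[i + ws:i + ws + 1]))
    return out

toy = [1, 2, 3, 4, 5, 6]
pairs = get_windows(toy, 3)
for window, target in pairs:
    print(window, "->", target)
# [1, 2, 3] -> [4]
# [2, 3, 4] -> [5]
# [3, 4, 5] -> [6]
```

A series of length `L` therefore yields `L - ws` training pairs.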

In [None]:
window_size = 12
train_data = get_windows(train_norm,window_size)

## Defining and Instantiating the LSTM Model, Optimizer and Loss Function

In [None]:
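class LSTM(nn.Module):
    def __init__(self,in_size = 1,hidden_size = 100,out_size = 1):
        super().__init__()
        self.hidden_size = hidden_size
        # LSTM layer followed by a linear layer that maps the hidden state to one output value
        self.lstm = nn.LSTM(in_size,hidden_size)
        self.linear = nn.Linear(hidden_size,out_size)
        # Initial (hidden state, cell state), each shaped (num_layers, batch, hidden_size)
        self.hidden = (torch.zeros(1,1,self.hidden_size).cuda(),
                       torch.zeros(1,1,self.hidden_size).cuda())
    def forward(self,X):
        # Reshape the window to (seq_len, batch=1, features), the layout nn.LSTM expects
        lstm_out,self.hidden = self.lstm(X.view(len(X),1,-1),self.hidden)
        pred = self.linear(lstm_out.view(len(X),-1))
        # Return only the prediction for the last time step
        return pred[-1]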

In [None]:
model = LSTM().cuda()

In [None]:
optimizer = torch.optim.Adam(model.parameters(),lr = 0.001)
criterion = nn.MSELoss()

In [None]:
model
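The cells above call `.cuda()` directly, which assumes a GPU is available. As a hedged alternative, the sketch below picks the device at runtime and shows the tensor shapes that pass through an `nn.LSTM` layer for one window of 12 values. The random `window` tensor is a stand-in for a normalized window, and the variable names are illustrative.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

lstm = nn.LSTM(input_size=1, hidden_size=100).to(device)

# Stand-in for one normalized window of 12 monthly values
window = torch.randn(12, device=device)

# Initial hidden and cell state: (num_layers, batch, hidden_size)
h0 = torch.zeros(1, 1, 100, device=device)
c0 = torch.zeros(1, 1, 100, device=device)

# nn.LSTM expects input shaped (seq_len, batch, input_size) by default
out, (hn, cn) = lstm(window.view(12, 1, -1), (h0, c0))

print(out.shape)  # torch.Size([12, 1, 100])
print(hn.shape)   # torch.Size([1, 1, 100])
```

The output holds one hidden vector per time step, which is why the model's `forward` keeps only the last row after the linear layer.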

# Training the Model
Here we train our model on all of the data except the last 12 months. Those final 12 months are our test data, which we will predict afterwards.

In [None]:
import time
start = time.time()
epochs = 100
for i in range(epochs):
    for X_train,Y_train in train_data:
        X_train = X_train.cuda()
        Y_train = Y_train.cuda()
        optimizer.zero_grad()
        # Reset the hidden and cell state so that each window starts from zeros
        model.hidden = (torch.zeros(1,1,model.hidden_size).cuda(),
                        torch.zeros(1,1,model.hidden_size).cuda())
        Y_pred = model(X_train)
        loss = criterion(Y_pred,Y_train)
        loss.backward()
        optimizer.step()
    print(f"Epoch : {i+1} LOSS : {loss.item():.7f}")
end = time.time()
dur = end-start
print(f"Duration : {int(dur/60)} minutes and {int(dur%60)} seconds")

## Testing on the Last Window
In the code below, we predict the next value from the last window and then append that prediction to the sequence, so the window slides forward by one step. Repeating this 12 times gives a forecast for the whole test period.

In [None]:
future = 12
preds = train_norm[-window_size:].tolist()
model.eval()
for i in range(future):
    X_test = torch.FloatTensor(preds[-window_size:]).cuda()
    with torch.no_grad():
        model.hidden = (torch.zeros(1,1,model.hidden_size).cuda(),
                      torch.zeros(1,1,model.hidden_size).cuda())
        preds.append(model(X_test).item())
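The sliding-window logic of the loop above can be looked at in isolation. The sketch below uses a hypothetical stand-in predictor, `mean_of_window`, in place of the trained LSTM, so it only illustrates how each prediction is fed back in as input; the numbers are made up.

```python
def rolling_forecast(history, predict_next, window_size, steps):
    """Forecast `steps` values, feeding each prediction back into the window."""
    values = list(history[-window_size:])
    for _ in range(steps):
        # Predict from the most recent window, then extend the sequence
        values.append(predict_next(values[-window_size:]))
    return values[-steps:]

def mean_of_window(window):
    """Hypothetical stand-in for the model: predicts the window average."""
    return sum(window) / len(window)

forecast = rolling_forecast([1.0, 2.0, 3.0], mean_of_window, window_size=3, steps=2)
print(forecast)
```

The first forecast is computed from observed values only; the second already depends on the first forecast. Errors can therefore accumulate as the horizon grows.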


In [None]:
preds[-window_size:]

We can see that our predicted data is still normalized. Let's invert the normalization.

## Inverting the Normalization

In [None]:
true_predictions = scaler.inverse_transform(np.array(preds[-window_size:]).reshape(-1,1))
true_predictions

In [None]:
sales['S4248SM144NCEN'][-12:]

In [None]:
dates = np.arange('2018-02-01', '2019-02-01', dtype='datetime64[M]').astype('datetime64[D]')
dates

In [None]:
plt.figure(figsize=(20,7))
plt.grid(True)
plt.plot(sales['S4248SM144NCEN'],label = 'Original')
plt.plot(dates,true_predictions,label = 'Predicted')
plt.legend()
plt.show()

In [None]:
plt.figure(figsize=(20,7))
plt.grid(True)
plt.plot(sales['S4248SM144NCEN']['2018-01-01':],label = "Original")
plt.plot(dates,true_predictions,label = "Predicted")
plt.legend()
plt.show()

Visually, the predicted values follow the original data closely over the 12-month test period.
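The visual comparison above could be backed by an error metric. The sketch below defines two common ones, RMSE and MAPE, as hypothetical helpers and runs them on made-up numbers; it does not report results for this model.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Illustrative values only, not taken from the sales data
toy_actual = [100.0, 200.0]
toy_pred = [110.0, 180.0]

print(rmse(toy_actual, toy_pred))
print(mape(toy_actual, toy_pred))
```

In this notebook, the helpers could be called as `rmse(test_set, true_predictions.flatten())` once the predictions have been de-normalized.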

## Training with the entire Dataset
To predict next year's sales, we are going to train our model on the entire dataset this time! Note that the model is not re-initialized here, so training continues from the weights learned above.

In [None]:
Y_norm = scaler.fit_transform(Y.reshape(-1,1))
Y_norm = torch.FloatTensor(Y_norm).view(-1)
full_train_data = get_windows(Y_norm,window_size)

In [None]:
start = time.time()
epochs = 100
model.train()
for i in range(epochs):
    for X_train,Y_train in full_train_data:
        X_train = X_train.cuda()
        Y_train = Y_train.cuda()
        optimizer.zero_grad()
        model.hidden = (torch.zeros(1,1,model.hidden_size).cuda(),
                        torch.zeros(1,1,model.hidden_size).cuda())
        Y_pred = model(X_train)
        loss = criterion(Y_pred,Y_train)
        loss.backward()
        optimizer.step()
    print(f"Epoch {i+1} LOSS : {loss.item():.8f}")
end = time.time()
print(f"Train Duration {int((end-start)/60)} minutes {int((end-start)%60)} seconds")

## Predicting into the Unknown Future
Using the same approach as explained earlier, we are going to predict the sales for the next year.

In [None]:
model.eval()
preds = Y_norm[-window_size:].tolist()
for i in range(future):
    X_test = torch.FloatTensor(preds[-window_size:]).cuda()
    with torch.no_grad():
        model.hidden = (torch.zeros(1,1,model.hidden_size).cuda(),
                      torch.zeros(1,1,model.hidden_size).cuda())
        preds.append(model(X_test).item())

In [None]:
preds[-window_size:]

## Inverting the Normalization

In [None]:
true_predictions = scaler.inverse_transform(np.array(preds[-window_size:]).reshape(-1,1))
true_predictions

In [None]:
true_predictions = true_predictions.flatten()

In [None]:
dates = np.arange('2019-02-01', '2020-02-01', dtype='datetime64[M]').astype('datetime64[D]')
dates

In [None]:
plt.figure(figsize=(20,6))
plt.grid(True)
plt.plot(sales['S4248SM144NCEN'])
plt.plot(dates,true_predictions)
plt.show()

In [None]:
plt.figure(figsize=(20,6))
plt.grid(True)
plt.plot(sales['S4248SM144NCEN']['2017-01-01':])
plt.plot(dates,true_predictions)
plt.show()

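Note that the gap in the plot appears because the two lines are drawn from separate series: the original data ends in January 2019 and the predictions start in February 2019, and no line segment is drawn between them. In time, the two series are continuous.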
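If the gap is distracting, one option is to prepend the last observed point to the forecast so the two lines meet. The sketch below shows the idea on made-up stand-in arrays; all names and values are illustrative.

```python
import numpy as np

# Toy stand-ins for the observed series and the forecast
obs_dates = np.arange("2018-11", "2019-02", dtype="datetime64[M]").astype("datetime64[D]")
obs_values = np.array([10.0, 12.0, 11.0])
fc_dates = np.arange("2019-02", "2019-04", dtype="datetime64[M]").astype("datetime64[D]")
fc_values = np.array([13.0, 12.5])

# Start the forecast line at the last observed point
joined_dates = np.concatenate([obs_dates[-1:], fc_dates])
joined_values = np.concatenate([obs_values[-1:], fc_values])

print(joined_dates)
print(joined_values)
```

Plotting `joined_dates` against `joined_values` instead of the forecast arrays alone would then draw a connected line.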