# MuchLearningSuchWow - LSTM - Training

This notebook contains the code we used to define and train our LSTM network. The training code is based primarily on [this kernel](https://www.kaggle.com/bountyhunters/baseline-lstm-with-keras-0-7).

### Imports & Data Paths

In [1]:
import numpy as np
import pandas as pd
import pickle

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import Conv1D
from keras.utils import plot_model
from keras.optimizers import Adam

Using TensorFlow backend.


In [2]:
outputPath = "output/"
modelPath = "models/"
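A minimal sketch of building file paths from these two directories with `os.path.join`, which adds a separator only when one is missing. The file names are the ones used elsewhere in this notebook; the variable names `train_data_file` and `model_plot_file` are hypothetical and only serve the illustration.

```python
import os

outputPath = "output/"
modelPath = "models/"

# os.path.join does not add a second separator when the directory
# already ends with one, so no "//" appears in the result.
train_data_file = os.path.join(outputPath, "preprocessed_train_data.pkl")
model_plot_file = os.path.join(modelPath, "model.png")

print(train_data_file)
print(model_plot_file)
```

Note that the file name passed as the second argument should not start with a slash, because `os.path.join` treats such a component as an absolute path and discards the directory before it.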

### Constants

In [3]:
timesteps = 14 # Number of previous days that will be used to predict the next day
startDay = 1000 # Number of days at start of data that will be ignored during training

### Loading Data

In [4]:
with open(outputPath + "preprocessed_train_data.pkl", "rb") as f: # outputPath already ends with "/"
    df_train = pickle.load(f)

### Create Training Data and Labels

In [5]:
X_train = []
y_train = []
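# 1913 - startDay = 913 days remain once the first startDay days are skipped
for i in range(timesteps, 1913 - startDay):
    X_train.append(df_train[i-timesteps:i]) # Window of the previous `timesteps` days
    y_train.append(df_train[i][0:30490]) # Only use first 30490 columns (sales) as labels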

In [6]:
del df_train

In [7]:
# Convert data to np array to be able to feed it to the model
X_train = np.array(X_train)
y_train = np.array(y_train)
print(X_train.shape)
print(y_train.shape)

(899, 14, 30491)
(899, 30490)
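The windowing above can be illustrated on a toy array. This is a sketch under stated assumptions: the helper `make_windows` and the toy data are hypothetical, and the input is assumed to be a 2D NumPy array with one row per day. It shows why 913 days with 14 timesteps yield 913 - 14 = 899 samples.

```python
import numpy as np

def make_windows(data, timesteps, n_label_cols):
    """Build (samples, timesteps, features) inputs and next-day labels."""
    X, y = [], []
    for i in range(timesteps, len(data)):
        X.append(data[i - timesteps:i])   # the previous `timesteps` days
        y.append(data[i][:n_label_cols])  # the day to predict, label columns only
    return np.array(X), np.array(y)

# Toy data: 10 days, 3 "sales" columns plus 1 extra feature column
toy = np.arange(10 * 4).reshape(10, 4)
X, y = make_windows(toy, timesteps=3, n_label_cols=3)

print(X.shape)  # (7, 3, 4)
print(y.shape)  # (7, 3)
```

Each sample's label is the row that immediately follows its window, so the first `timesteps` days never appear as labels.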


### LSTM Model

In [8]:
model = Sequential()

# 1D convolution layer
model.add(Conv1D(filters=32,
                 kernel_size=7,
                 strides=1,
                 padding="causal",
                 activation="relu",
                 input_shape=(X_train.shape[1], X_train.shape[2])))

# LSTM layers
layer_1_units=150
model.add(LSTM(units = layer_1_units, return_sequences = True))
model.add(Dropout(0.1))

layer_2_units=300
model.add(LSTM(units = layer_2_units, return_sequences = True))
model.add(Dropout(0.1))

layer_3_units=400
model.add(LSTM(units = layer_3_units))
model.add(Dropout(0.1))

# Output layer
model.add(Dense(units = 30490))
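To make the effect of `padding="causal"` concrete, here is a small NumPy sketch of a causal 1D convolution with ReLU. It is an illustration only, not the Keras implementation: the function `causal_conv1d`, the random weights, and the reduced channel count are assumptions made for the example.

```python
import numpy as np

def causal_conv1d(x, kernel, bias):
    """x: (timesteps, channels); kernel: (kernel_size, channels, filters)."""
    kernel_size = kernel.shape[0]
    # Left-pad with kernel_size - 1 zero rows so that output step t
    # only sees input steps t - kernel_size + 1 .. t.
    pad = np.zeros((kernel_size - 1, x.shape[1]))
    padded = np.concatenate([pad, x], axis=0)
    out = np.stack([
        np.tensordot(padded[t:t + kernel_size], kernel, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ]) + bias
    return np.maximum(out, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
x = rng.normal(size=(14, 5))          # 14 timesteps, 5 channels (toy size)
kernel = rng.normal(size=(7, 5, 32))  # kernel_size=7, 32 filters
bias = np.zeros(32)

out = causal_conv1d(x, kernel, bias)
print(out.shape)  # (14, 32)

# Changing future inputs must not change earlier outputs
x_future = x.copy()
x_future[10:] += 1.0
out_future = causal_conv1d(x_future, kernel, bias)
print(np.allclose(out[:10], out_future[:10]))  # True
```

The output keeps all 14 timesteps, which is why the following LSTM layer still receives a sequence of length 14.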

In [9]:
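plot_model(model, modelPath + "model.png") # modelPath already ends with "/"
model.summary() # summary() prints the table itself; wrapping it in print() would also print "None"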

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_1 (Conv1D)            (None, 14, 32)            6830016   
_________________________________________________________________
lstm_1 (LSTM)                (None, 14, 150)           109800    
_________________________________________________________________
dropout_1 (Dropout)          (None, 14, 150)           0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 14, 300)           541200    
_________________________________________________________________
dropout_2 (Dropout)          (None, 14, 300)           0         
_________________________________________________________________
lstm_3 (LSTM)                (None, 400)               1121600   
_________________________________________________________________
dropout_3 (Dropout)          (None, 400)               0         
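The parameter counts in the summary can be checked by hand with the standard formulas for convolutional, LSTM, and dense layers. The sketch below is plain Python and needs no Keras. The helper names are hypothetical, and the layer name `dense_1` is an assumption, since the output layer does not appear in the printed summary; its count is computed from the formula, not taken from the notebook output.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One weight per (kernel position, input channel, filter) plus one bias per filter
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    return input_dim * units + units

n_features = 30491  # last dimension of X_train

counts = {
    "conv1d_1": conv1d_params(7, n_features, 32),
    "lstm_1": lstm_params(32, 150),
    "lstm_2": lstm_params(150, 300),
    "lstm_3": lstm_params(300, 400),
    "dense_1": dense_params(400, 30490),  # assumed layer name
}

for name, n in counts.items():
    print(f"{name}: {n}")
# conv1d_1: 6830016
# lstm_1: 109800
# lstm_2: 541200
# lstm_3: 1121600
# dense_1: 12226490
```

Under these formulas, the first convolution and the output layer account for most of the weights, because both scale with the roughly 30,000 series in the data.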

### Training

In [10]:
# Compiling the model
model.compile(optimizer = Adam(learning_rate=0.001), loss = 'mean_squared_error')

# Fitting the model to the training set
nr_epochs = 50
batch_size = 16
model.fit(X_train, y_train, epochs = nr_epochs, batch_size = batch_size, verbose=1)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.callbacks.History at 0x2693b606188>
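As a quick sanity check on the training configuration, the number of weight updates follows from the sample count and batch size. This is a sketch that assumes the final, smaller batch of each epoch is kept rather than dropped; the variable names other than `nr_epochs` and `batch_size` are hypothetical.

```python
import math

n_samples = 899   # first dimension of X_train
batch_size = 16
nr_epochs = 50

steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * nr_epochs

print(steps_per_epoch)  # 57
print(last_batch_size)  # 3
print(total_updates)    # 2850
```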

### Saving Result

In [11]:
model.save(modelPath + "lstm_model") # modelPath already ends with "/"