# MuchLearningSuchWow - LSTM - Testing

This notebook contains the code we used to test our LSTM network: it loads the trained model, predicts sales for 28 days, and writes the submission file. The training code is based primarily on [this kernel](https://www.kaggle.com/bountyhunters/baseline-lstm-with-keras-0-7).

### Imports & Data Paths

In [1]:
import numpy as np
import pandas as pd
import pickle
import time
import keras

Using TensorFlow backend.


In [2]:
inputPath = "input/m5-forecasting-accuracy/"
outputPath = "output/"
modelPath = "models/"
submissionPath = "submissions/"

### Constants

In [3]:
timesteps = 14 # Number of previous days that will be used to predict the next day

### Loading Data

In [4]:
# The path variables already end in "/", so no extra separator is needed
with open(outputPath + "unscaled_train_data.pkl", "rb") as f:
    df_train = pickle.load(f)
with open(outputPath + "days_before_event_valid.pkl", "rb") as f:
    daysBeforeEventValid = pickle.load(f)
with open(outputPath + "scaler.pkl", "rb") as f:
    scaler = pickle.load(f)

model = keras.models.load_model(modelPath + "lstm_model")
df_sample_submission = pd.read_csv(inputPath + "sample_submission.csv")

### Testing

In [5]:
# Create initial inputs for testing (the last "timesteps" days of the training data)
inputs = df_train[-timesteps:]
inputs = scaler.transform(inputs)

# Add a batch dimension: X_test has shape (1, timesteps, 30491)
X_test = np.array([inputs[0:timesteps]])

In [6]:
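# Predict sales for all 28 days, feeding each prediction back in as input for the next day
predictions = []
for j in range(timesteps, timesteps + 28):
    # Predict the next day from the most recent "timesteps" days
    predicted_sales = model.predict(X_test[0, j - timesteps:j].reshape(1, timesteps, 30491))
    # Append the days-before-event value for the predicted day (the offset 1913 skips the training days)
    testInput = np.column_stack((np.array(predicted_sales), daysBeforeEventValid[0][1913 + j - timesteps]))
    X_test = np.append(X_test, testInput).reshape(1, j + 1, 30491)
    # Undo the scaling and keep only the 30490 sales columns
    predicted_sales = scaler.inverse_transform(testInput)[:, 0:30490]
    predictions.append(predicted_sales)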
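
The loop above is a form of recursive (autoregressive) multi-step forecasting: each one-day prediction is appended to the input window and used to predict the following day. The sketch below shows the same idea on toy data. The names `recursive_forecast`, `predict_fn`, and `mean_model` are hypothetical and stand in for the notebook's loop and for `model.predict`; the sketch also skips the inverse scaling step, so it is an illustration rather than a replacement for the cell above.

```python
import numpy as np

def recursive_forecast(predict_fn, history, extra_features, horizon, timesteps):
    """Roll a one-step-ahead model forward `horizon` days.

    predict_fn     : hypothetical stand-in for model.predict; maps a
                     (1, timesteps, n_features) window to (1, n_series)
    history        : (timesteps, n_features) array of scaled inputs
    extra_features : (horizon,) array with the known extra feature per day
    """
    window = np.asarray(history, dtype=float)
    forecasts = []
    for day in range(horizon):
        # Predict the next day from the most recent `timesteps` rows
        next_sales = predict_fn(window[-timesteps:][np.newaxis, :, :])
        # Append the known feature so the new row matches the input width
        next_row = np.column_stack((next_sales, [extra_features[day]]))
        window = np.vstack((window, next_row))
        forecasts.append(next_sales[0])
    return np.array(forecasts)

# Toy stand-in model (assumption): predicts each series' mean over the window
def mean_model(x):
    return x[:, :, :-1].mean(axis=1)

toy_history = np.ones((3, 5))   # 3 days, 4 series + 1 event feature
toy_events = np.zeros(2)        # event feature for the 2 forecast days
toy_forecast = recursive_forecast(mean_model, toy_history, toy_events,
                                  horizon=2, timesteps=3)
```

Because every prediction becomes an input for later days, errors can accumulate over the 28-day horizon; this is a known trade-off of the recursive approach.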

### Writing Submission File

In [7]:
# Note: right now, we are simply submitting the validation predictions twice (the pd.concat line), not validation & evaluation
submission = pd.DataFrame(data=np.array(predictions).reshape(28, 30490))
submission = submission.T
submission = pd.concat((submission, submission), ignore_index=True)
idColumn = df_sample_submission[["id"]]
submission[["id"]] = idColumn
# Move the "id" column to the front
cols = list(submission.columns)
cols = cols[-1:] + cols[:-1]
submission = submission[cols]
colsdeneme = ["id"] + [f"F{i}" for i in range(1, 29)]
submission.columns = colsdeneme
currentDateTime = time.strftime("%d%m%Y_%H%M%S")  # Not used in the file name below
submission.to_csv(submissionPath + "submission.csv", index=False)
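
As a rough illustration of the layout the cell above produces, the following sketch builds the same kind of table from toy data: one row per series, an `id` column first, then one `F` column per forecast day, with the forecast block duplicated to fill both the validation and evaluation rows. The helper `build_submission` and the toy ids are hypothetical and only mirror the steps above on a small scale.

```python
import numpy as np
import pandas as pd

def build_submission(predictions, ids, horizon=28):
    """predictions: (horizon, n_series) array; ids: 2 * n_series row ids."""
    forecast = pd.DataFrame(
        np.asarray(predictions).T,
        columns=[f"F{i}" for i in range(1, horizon + 1)],
    )
    # Duplicate the block so validation and evaluation rows are both filled
    forecast = pd.concat((forecast, forecast), ignore_index=True)
    # Put the id column first, as in the sample submission
    forecast.insert(0, "id", list(ids))
    return forecast

toy_predictions = np.arange(6).reshape(3, 2)   # 3 days, 2 series
toy_ids = ["A_validation", "B_validation", "A_evaluation", "B_evaluation"]
toy_submission = build_submission(toy_predictions, toy_ids, horizon=3)
```

Inserting the `id` column at position 0 avoids the separate column-reordering step, at the cost of assuming the ids are already in the same order as the forecast rows.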