# Develop RNN Models with Keras
In this tutorial, we go over creating two Keras models.

The first model is a simple Artificial Neural Network (ANN) with Keras. 

The second model is an RNN model based on the LSTM architecture.

### Jupyter Environment
Let's start by getting familiar with the Jupyter environment and some simple tricks.

In [None]:
# Use this cell for some simple commands.
# Press ctrl+enter to execute a cell
# Use shift+enter to execute a cell and move on to the next cell
a = 1
print(a)

## A. Simple ANN model with Keras

### 1. Import necessary packages

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pathlib import Path

# import keras layers needed
from keras.layers import Activation
from keras.layers import Dense

# Import keras Sequential model
from keras.models import Sequential

# Import Scikit-learn data splitting functions
from sklearn.model_selection import train_test_split

### 2. Download a sample dataset
Next, we will download a sample dataset: the "Appliances Energy Prediction" dataset.

Here is more information about this dataset:
https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction

Attribute Information:

date time year-month-day hour:minute:second<br>
Appliances, energy use in Wh<br>
lights, energy use of light fixtures in the house in Wh<br>
T1, Temperature in kitchen area, in Celsius<br>
RH_1, Humidity in kitchen area, in %<br>
T2, Temperature in living room area, in Celsius<br>
RH_2, Humidity in living room area, in %<br>
T3, Temperature in laundry room area<br>
RH_3, Humidity in laundry room area, in %<br>
T4, Temperature in office room, in Celsius<br>
RH_4, Humidity in office room, in %<br>
T5, Temperature in bathroom, in Celsius<br>
RH_5, Humidity in bathroom, in %<br>
T6, Temperature outside the building (north side), in Celsius<br>
RH_6, Humidity outside the building (north side), in %<br>
T7, Temperature in ironing room , in Celsius<br>
RH_7, Humidity in ironing room, in %<br>
T8, Temperature in teenager room 2, in Celsius<br>
RH_8, Humidity in teenager room 2, in %<br>
T9, Temperature in parents room, in Celsius<br>
RH_9, Humidity in parents room, in %<br>
To, Temperature outside (from Chievres weather station), in Celsius<br>
Pressure (from Chievres weather station), in mm Hg<br>
RH_out, Humidity outside (from Chievres weather station), in %<br>
Wind speed (from Chievres weather station), in m/s<br>
Visibility (from Chievres weather station), in km<br>
Tdewpoint (from Chievres weather station), °C<br>
rv1, Random variable 1, nondimensional<br>
rv2, Random variable 2, nondimensional<br>

In [None]:
# Let's get a sample dataset as a pandas dataframe
df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/00374/energydata_complete.csv")


# Print a few rows of the data
df...

In [None]:
# How many samples do we have in this data set?
print("Total number of samples: ", df.shape[0])

# Let's visualize some of the data

n_samples = 1000
feature_name = "T7"
target_name = "T2"

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(df[feature_name].values[:n_samples], 'b-')
ax2.plot(df[target_name].values[:n_samples], 'g-')
ax1.set_xlabel('Samples')
ax1.set_ylabel(feature_name, color='b')
ax2.set_ylabel(target_name, color='g')
plt.show()

### 3. Create input and output
We now extract our inputs and outputs from the dataframe. We will use the living room temperature (T2) as the target. To speed up training, we will use only a subset of the available features.
We will also exclude the kitchen temperature (T1), as it is highly correlated with the living room temperature.

In [None]:
features_to_use = ["lights", # energy use of light fixtures in the house in Wh
                   "T3", # Temperature in laundry room area
                   "T4", # Temperature in office room, in Celsius
                   "T5", # Temperature in bathroom, in Celsius
                   "T6", # Temperature outside the building (north side), in Celsius
                   "T7", # Temperature in ironing room, in Celsius
                   "T8", # Temperature in teenager room 2, in Celsius
                   "T9", # Temperature in parents room, in Celsius
                   "T_out", # Temperature outside (from Chievres weather station), in Celsius
                   "Press_mm_hg", # (from Chievres weather station), in mm Hg
                   "RH_out", # Humidity outside (from Chievres weather station), in %
                   "Windspeed", # Windspeed (from Chievres weather station), in m/s
                   "Visibility", # Visibility (from Chievres weather station), in km
                   "Tdewpoint" # Dew point (from Chievres weather station), °C
                  ]
# Grab a portion of the data to make training and testing faster
samples = -1
stride = 3
data = df[features_to_use].values[::stride, :]
target = df[target_name].values[::stride].reshape(-1, 1)

### 4. Split the data into train, test, validation
To train a model and evaluate its performance, we divide the data into train, validation, and test sets.

We will use the training and validation sets to design the architecture, train the model, and optimize the hyperparameters. Then we use the test set to report the final performance.

In [None]:
# Determine train test splits
test_ratio = 0.15

# Split the data into training and testing
x_trn, x_tst, y_trn, y_tst = train_test_split(data, target, test_size=test_ratio, shuffle=True, random_state=0)

# Split the training data into training and validation
x_trn, x_vld, y_trn, y_vld = train_test_split(x_trn, y_trn, test_size=test_ratio, shuffle=True, random_state=0)

# Print how many samples we have in each set
print("Number of samples in the training set: ", ...)
print("Number of samples in the validation set: ", ...)
print("Number of samples in the test set: ", ...)
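
Because the same `test_ratio` is applied twice, the three sets do not end up as 70/15/15. The short sketch below works out the effective fractions under the assumption that both splits use `test_ratio = 0.15`, as in the cell above; the variable names are illustrative only.

```python
# Effective split fractions when the same ratio is applied twice.
test_ratio = 0.15

# First split: the test set is carved out of all the data.
test_fraction = test_ratio
# Second split: the validation set is carved out of the remaining data.
val_fraction = (1 - test_ratio) * test_ratio
train_fraction = (1 - test_ratio) * (1 - test_ratio)

print(f"train:      {train_fraction:.4f}")
print(f"validation: {val_fraction:.4f}")
print(f"test:       {test_fraction:.4f}")
```

Roughly 72% of the samples are left for training, about 13% for validation, and 15% for testing.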

### 5. Normalize the Data
Most of the time, we should "prepare" our data and make it ready for model development. The preparation might include dealing with missing data, normalization, etc.

Here, we will normalize the data. Can you explain why we need to normalize the data?

In [None]:
# Normalize the data
mean = x_trn.mean(axis=0)
std = x_trn.std(axis=0)
x_trn = ...
x_vld = ...
x_tst = ...
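
One way to think about the question above: features such as temperature and pressure live on very different scales, and gradient-based training tends to behave better when all inputs have comparable ranges. The sketch below illustrates standardization on synthetic data (the numbers are made up and are not taken from the energy dataset). Note that the statistics are computed on the training split only and then reused for any other split.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic features on very different scales (illustrative values only):
# a temperature-like column and a pressure-like column.
x_train = np.column_stack([rng.normal(20.0, 2.0, size=500),
                           rng.normal(755.0, 7.0, size=500)])
x_new = np.column_stack([rng.normal(20.0, 2.0, size=100),
                         rng.normal(755.0, 7.0, size=100)])

# Statistics come from the training split only.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)

x_train_norm = (x_train - mean) / std
x_new_norm = (x_new - mean) / std  # reuse the training statistics

print(x_train_norm.mean(axis=0))  # close to 0 for each feature
print(x_train_norm.std(axis=0))   # close to 1 for each feature
```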

### 6. Developing ANN Model
First, we need to decide on the architecture. For now, we can start with a two-layer ANN, where the activation function of the first layer can be "relu" or "sigmoid" (feel free to try both). The activation function of the last layer should be linear, as our task is regression.

There are two ways of developing a Keras model: Sequential and Functional. In a nutshell, in Sequential models, we keep stacking layers on top of each other until we get the desired architecture.

Functional models give more flexibility in designing complex architectures with multiple inputs, multiple outputs, etc. They start by first defining the inputs and then passing them through the desired blocks.

Let's start by creating a Sequential model to predict the living room temperature!

In [None]:
# First, we create a placeholder for the model
model = ...

# Then, we add the two layers
model.add(Dense(units=10, activation=..., input_shape=x_trn.shape[1:]))
model.add(Dense(units=1, activation=...))

# Now, we can compile the model
model.compile(optimizer='adam',
              loss='mae')

# And see the summary
model.summary()
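
To get a feel for what the two Dense layers compute and how many parameters to expect in the summary, here is a plain NumPy sketch of the same shape of network. It is not Keras code: the weights are random, ReLU is just one of the two suggested choices for the hidden layer, and the input width of 14 assumes the `features_to_use` list defined earlier.

```python
import numpy as np

n_features = 14  # number of entries in features_to_use
n_hidden = 10    # units in the first Dense layer
n_outputs = 1    # units in the last Dense layer

rng = np.random.default_rng(0)

# A Dense layer holds a weight matrix and a bias vector.
w1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

def forward(x):
    """Two-layer network: ReLU hidden layer, linear output layer."""
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU is an assumption here
    return hidden @ w2 + b2                # linear output for regression

x_batch = rng.normal(size=(32, n_features))
print(forward(x_batch).shape)  # (32, 1)

n_params = w1.size + b1.size + w2.size + b2.size
print(n_params)  # 161
```

The count breaks down as 14 × 10 + 10 = 150 parameters for the hidden layer and 10 × 1 + 1 = 11 for the output layer.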

### 7. Training the model
We can now start training the model. Once this model is trained, we can go back and add more layers to make it more powerful. We should be careful about overfitting though. That's why we should keep an eye on the validation loss.

In [None]:
# Train the model; history will include some information about the training
history = model.fit(...,
                    ...,
                    validation_data=(..., ...),
                    epochs=50,
                    batch_size=32)

# Plot the training and validation loss curves
plt.plot(history.history["loss"], label="Training Loss")
plt.plot(history.history["val_loss"], label="Validation Loss")
plt.legend()
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

### 8. Making Predictions with the Trained Model
Once the model is trained and we have optimized the parameters, we can evaluate its performance on the test set or on any new input data.

In [None]:
# Let's evaluate the model on the test set
test_loss = model.evaluate(..., ...)
print("Test loss (MAE) is: ", str(test_loss))

# Make predictions on the full dataset (normalized with the training statistics)
target_prd = model.predict((data-mean)/std)
samples_to_plot = 200
plt.figure(figsize=(10, 4))
plt.plot(target[-samples_to_plot:])
plt.plot(target_prd[-samples_to_plot:], "--")
plt.show()
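
Since the model is compiled with `loss='mae'` and no extra metrics, the value returned by `model.evaluate` is a mean absolute error in the units of the target (degrees Celsius here), not an accuracy. As a minimal sketch of what that number means, the helper below (a hypothetical function, not part of Keras) computes the same quantity by hand on a few made-up values.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Made-up temperatures, for illustration only.
y_true = [20.0, 21.0, 19.5, 22.0]
y_pred = [20.5, 20.0, 19.5, 23.5]

print(mean_absolute_error(y_true, y_pred))  # 0.75
```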

## B. Simple RNN Model with Keras
Now that we are familiar with building models in Keras, we can start developing RNN models. The steps are much the same; we just need to pick the right layers. In this tutorial we will use recurrent layers such as LSTM and GRU.

### 1. Import Necessary Packages
In addition to the packages we loaded before, let's import the following packages.

In [None]:
# Import the recurrent layers from keras
from keras.layers import SimpleRNN
from keras.layers import LSTM
from keras.layers import GRU

### 2. Load the Dataset
Let's chop the data we loaded earlier into overlapping windows of 12 time steps and print some information about the result.

In [None]:
def chop_data(data, target, history_to_use):
    k = history_to_use
    data_list = []
    target_list = []
    while k <= data.shape[0]:
        data_list += [data[k-history_to_use:k, :]]
        target_list += [target[k-history_to_use:k]]        
        k += 1
    chopped_data = np.stack(data_list, axis=0)
    chopped_target = np.stack(target_list)
    return chopped_data, chopped_target
data_ts, target_ts = chop_data(data, target, 12)

In [None]:
print(data_ts.shape)
print(target_ts.shape)
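
To see what the chopping step produces, the sketch below restates it compactly and runs it on a tiny toy array. `make_windows` is a hypothetical helper written for this illustration; it is meant to mirror `chop_data`, producing one window per starting index with a step of one.

```python
import numpy as np

def make_windows(data, target, history_to_use):
    """Compact restatement of the chopping step (hypothetical helper)."""
    n_windows = data.shape[0] - history_to_use + 1
    data_windows = np.stack(
        [data[i:i + history_to_use] for i in range(n_windows)])
    target_windows = np.stack(
        [target[i:i + history_to_use] for i in range(n_windows)])
    return data_windows, target_windows

# Toy data: 20 time steps, 3 features, 1 target value per step.
toy_data = np.arange(20 * 3).reshape(20, 3)
toy_target = np.arange(20).reshape(-1, 1)

toy_data_ts, toy_target_ts = make_windows(toy_data, toy_target, 5)
print(toy_data_ts.shape)    # (16, 5, 3)
print(toy_target_ts.shape)  # (16, 5, 1)

# Consecutive windows overlap in all but one time step.
print(toy_target_ts[0].ravel())  # [0 1 2 3 4]
print(toy_target_ts[1].ravel())  # [1 2 3 4 5]
```

With `N` time steps and a history of `h`, this yields `N - h + 1` windows, each of shape `(h, n_features)`.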

### 3. Split the data into train, test, validation
This is pretty much the same as before. However, splitting time-series data can be tricky, especially if the data distribution changes over time. Here the windows also overlap, so after shuffling, the training and test sets share many of the same time steps. As a result, shuffling the data before splitting can produce misleadingly good results. For now, we will not get into the details, but this is something you should read up on further.

In [None]:
# Determine train test splits
test_ratio = 0.15

# Split the data into training and testing
x_trn, x_tst, y_trn, y_tst = train_test_split(data_ts, target_ts, test_size=test_ratio, shuffle=True, random_state=0)

# Split the training data into training and validation
x_trn, x_vld, y_trn, y_vld = train_test_split(x_trn, y_trn, test_size=test_ratio, shuffle=True, random_state=0)

# Print how many samples we have in each set
print("Number of samples in the training set: ", x_trn.shape[0])
print("Number of samples in the validation set: ", x_vld.shape[0])
print("Number of samples in the test set: ", x_tst.shape[0])
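
As a point of comparison for the shuffled split above, here is a minimal sketch of a chronological split, where the earliest windows go to training and the latest to testing. `chronological_split` is a hypothetical helper written for this illustration, and the rounding rule for the set sizes is an arbitrary choice.

```python
import numpy as np

def chronological_split(x, y, test_ratio=0.15):
    """Split along the first axis without shuffling (hypothetical helper)."""
    n_total = x.shape[0]
    n_test = round(n_total * test_ratio)
    n_val = round((n_total - n_test) * test_ratio)
    n_train = n_total - n_test - n_val
    train = (x[:n_train], y[:n_train])
    val = (x[n_train:n_train + n_val], y[n_train:n_train + n_val])
    test = (x[n_train + n_val:], y[n_train + n_val:])
    return train, val, test

# Toy windows: 100 windows of 12 steps and 3 features; y stores the window index.
toy_x = np.zeros((100, 12, 3))
toy_y = np.arange(100)

(trn_x, trn_y), (vld_x, vld_y), (tst_x, tst_y) = chronological_split(toy_x, toy_y)
print(trn_x.shape[0], vld_x.shape[0], tst_x.shape[0])  # 72 13 15
print(trn_y.max() < vld_y.min() < tst_y.min())         # True
```

Neighbouring windows at the boundaries between the sets still overlap, so a gap of a few windows between the sets may be worth considering.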

### 4. Normalize the Data
Most of the time, we should "prepare" our data and make it ready for model development. The preparation might include dealing with missing data, normalization, etc.

Here, we will normalize the data. Can you explain why we need to normalize the data?

In [None]:
mean = x_trn.mean(axis=(0, 1))
std = x_trn.std(axis=(0, 1))

x_trn = (x_trn - mean)/std
x_vld = (x_vld - mean)/std
x_tst = (x_tst - mean)/std
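
The windowed data is three-dimensional, which is why the statistics above are taken over `axis=(0, 1)`: this pools all windows and all time steps and leaves one mean and one standard deviation per feature. The sketch below checks the resulting shapes on synthetic data (the values are made up for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic windows: (samples, time steps, features), illustrative values only.
x = rng.normal(loc=[20.0, 755.0, 5.0], scale=[2.0, 7.0, 1.0],
               size=(200, 12, 3))

# Pool over samples and time steps; keep one statistic per feature.
mean = x.mean(axis=(0, 1))
std = x.std(axis=(0, 1))
print(mean.shape)  # (3,)

# Broadcasting aligns the (3,) statistics with the last axis of x.
x_norm = (x - mean) / std
print(x_norm.shape)  # (200, 12, 3)
```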

### 5. Developing RNN Model
We can use the same Sequential API as before to develop our RNN model.

In [None]:
from keras import initializers
# First, we create a placeholder for the model
model = ...

# Then, we add the two layers
model.add(GRU(units=15, return_sequences=..., input_shape=x_trn.shape[1:]))
model.add(Dense(units=1, bias_initializer=initializers.Constant(target.mean())))

# Now, we can compile the model
model.compile(optimizer='Adam',
              loss='mae')

# And see the summary
model.summary()
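
The `return_sequences` argument controls whether a recurrent layer outputs its state at every time step or only at the last one. The sketch below illustrates the difference in output shapes with a plain tanh recurrence in NumPy. This is a deliberately simplified stand-in, not the GRU computation, and `simple_rnn_forward` is a hypothetical helper.

```python
import numpy as np

def simple_rnn_forward(x, w_in, w_rec, bias, return_sequences):
    """Plain tanh recurrence over the time axis of x: (batch, time, features)."""
    batch, n_steps, _ = x.shape
    state = np.zeros((batch, w_rec.shape[0]))
    outputs = []
    for t in range(n_steps):
        state = np.tanh(x[:, t, :] @ w_in + state @ w_rec + bias)
        outputs.append(state)
    if return_sequences:
        return np.stack(outputs, axis=1)  # (batch, time, units)
    return state                          # (batch, units)

rng = np.random.default_rng(0)
n_features, n_units = 14, 15
w_in = rng.normal(scale=0.1, size=(n_features, n_units))
w_rec = rng.normal(scale=0.1, size=(n_units, n_units))
bias = np.zeros(n_units)

x = rng.normal(size=(4, 12, n_features))
all_steps = simple_rnn_forward(x, w_in, w_rec, bias, return_sequences=True)
last_step = simple_rnn_forward(x, w_in, w_rec, bias, return_sequences=False)
print(all_steps.shape)  # (4, 12, 15)
print(last_step.shape)  # (4, 15)
```

When comparing against the shape of `target_ts`, keep in mind that a Keras `Dense` layer applied to a three-dimensional input acts on the last axis only.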

### 6. Training the model
We can now start training the model. Once this model is trained, we can go back and add more layers to make it more powerful. We should be careful about overfitting though. That's why we should keep an eye on the validation loss.

In [None]:
# Train the model; history will include some information about the training
history = model.fit(...,
                    ...,
                    validation_data=(..., ...),
                    epochs=50,
                    batch_size=32)

# Plot the training and validation loss curves
plt.plot(history.history["loss"], label="Training Loss")
plt.plot(history.history["val_loss"], label="Validation Loss")
plt.legend()
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

In [None]:
# Let's evaluate the model on the test set
test_loss = model.evaluate(..., ...)
print("Test loss (MAE) is: ", str(test_loss))

# Make some predictions and plot some of the results
target_ts_prd = model.predict((data_ts-mean)/std)
samples_to_plot = 200
plt.figure(figsize=(10, 4))
plt.plot(target_ts[-samples_to_plot:, 10])
plt.plot(target_ts_prd[-samples_to_plot:, 10], "--")
plt.show()

### What to try next
You can read about the following topics if you would like to pursue this further:
- Setting shuffle to False when splitting the data (Why do we get different result? Should we shuffle or not shuffle the data?)
- Dropout
- BatchNormalization
- Bidirectional RNN
- Loss functions (mae, mse, hinge loss, etc.)
- Optimization methods (Adam, SGD, Adadelta, etc.)
- Regularization
- Early stopping

Then, revise the model and check if it results in better or worse performance.
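
As a starting point for the early stopping item, here is a minimal pure-Python sketch of the usual patience rule: stop once the validation loss has not improved for a given number of consecutive epochs. The function name, its parameters, and the loss values are all made up for illustration. In Keras, the `keras.callbacks.EarlyStopping` callback (with its `patience` argument) is the place to look for this behaviour.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) under a simple patience rule."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Improvement: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch

# Made-up validation losses, one per epoch.
val_losses = [1.00, 0.80, 0.70, 0.72, 0.71, 0.73, 0.69]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

In this toy run, training would stop at epoch 5 and never reach the lower loss at epoch 6, which shows why the patience value is itself a hyperparameter worth tuning.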