# Intro to Recurrent Neural Networks
### Starter Code
* **PyData Bristol - 5th Meetup:** https://www.meetup.com/PyData-Bristol/events/255667468/
* **Event URL:** https://www.eventbrite.co.uk/e/intro-to-recurrent-neural-networks-tickets-52401888459
* **Date:** Tue 13th November 2018
* **Instructor:** John Sandall
* **Contact:** john@coefficient.ai / @john_sandall

---

In [None]:
import numpy as np

%matplotlib inline
np.random.seed(0)

## Lab: Build A Recurrent Neural Network

Let's build a basic RNN using just numpy. We won't train it for now; instead, we'll just get a feel for how it works. We'll use input data with 20 samples, each with two features, at two time points (t=0 and t=1).

In [None]:
n_features = 2
n_samples = 20

In [None]:
# Create our input data. Here's X at t=0
X0 = np.random.randint(low=-10, high=10, size=(n_samples, n_features))
X0

In [None]:
# Similarly here's X at t=1
X1 = np.random.randint(low=-10, high=10, size=(n_samples, n_features))

Let's also create the weight matrices `Wx` (connecting X to neurons) and `Wy` (connecting output y at t-1 to neurons at time t).

In [None]:
n_neurons = 3

# Connects 2 features to 3 neurons
Wx = np.random.randint(low=-5, high=5, size=(n_features, n_neurons))
Wx

In [None]:
# Connects the 3-neuron output at time t-1 to the 3 neurons at time t (the recurrent weights)
Wy = np.random.randint(low=-5, high=5, size=(n_neurons, n_neurons))
Wy

In [None]:
# We'll also need the bias
b = np.ones(n_neurons)
b

> #### Exercise: Calculate Y0!
> 
> **Tips**:
> - Remember `Y0 = activation(X0*Wx + b)` and `Y1 = activation(X1*Wx + Y0*Wy + b)`
> - You'll need `np.matmul()` to multiply two matrices.
> - You'll need `np.heaviside(some_vector, 0)` for your activation function.
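
One possible way to work through the tips above is sketched here. It is not the only valid solution. The block rebuilds the inputs with its own local random generator (an assumption, so it can run on its own), which means the values will differ from the cells above; the helper name `step` is also just illustrative.

```python
import numpy as np

# Assumption: a local generator so this sketch is self-contained.
rng = np.random.RandomState(0)
n_samples, n_features, n_neurons = 20, 2, 3

X0 = rng.randint(low=-10, high=10, size=(n_samples, n_features))  # inputs at t=0
X1 = rng.randint(low=-10, high=10, size=(n_samples, n_features))  # inputs at t=1
Wx = rng.randint(low=-5, high=5, size=(n_features, n_neurons))    # input weights
Wy = rng.randint(low=-5, high=5, size=(n_neurons, n_neurons))     # recurrent weights
b = np.ones(n_neurons)                                            # bias


def step(z):
    """Heaviside step activation: 1 where z > 0, otherwise 0."""
    return np.heaviside(z, 0)


# At t=0 there is no previous output, so only the input term contributes.
Y0 = step(np.matmul(X0, Wx) + b)

# At t=1 the previous output Y0 is fed back in through Wy.
Y1 = step(np.matmul(X1, Wx) + np.matmul(Y0, Wy) + b)

print(Y0.shape, Y1.shape)
```

Both outputs have one row per sample and one column per neuron, and because of the step activation every entry is either 0 or 1.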

## Lab: Build A Recurrent Neural Network using Keras

Let's work through a simple example now using Keras.

In [None]:
from keras.layers import SimpleRNN, Dense, TimeDistributed
from keras.models import Sequential

In [None]:
# Check if Keras is using GPU version of TensorFlow
from tensorflow.python.client import device_lib

print(device_lib.list_local_devices())

Let's now look at 5 time steps, with:
- input X has 20 samples and two features
- output y is of length 3 (we have three neurons).

In [None]:
# Keras expects input of shape (number of samples, number of time steps, number of features)
n_steps = 5

X = np.random.randint(low=-10, high=10, size=(n_samples, n_steps, n_features))
X.shape

In [None]:
y = np.random.randint(low=-10, high=10, size=(n_samples, n_steps, n_neurons))
y.shape
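
To make the shape convention concrete, here is a rough numpy sketch of the forward pass a simple recurrent layer performs when it returns an output at every time step. The weight names `Wx`, `Wh` and `b` and their random values are assumptions for illustration; in Keras the layer creates and learns its own weights, and `SimpleRNN` uses `tanh` as its default activation.

```python
import numpy as np

# Assumption: a local generator so this sketch is self-contained.
rng = np.random.RandomState(0)
n_samples, n_steps, n_features, n_neurons = 20, 5, 2, 3

X = rng.randint(low=-10, high=10, size=(n_samples, n_steps, n_features))

# Hypothetical small random weights, standing in for learned ones.
Wx = rng.normal(scale=0.1, size=(n_features, n_neurons))  # input -> neurons
Wh = rng.normal(scale=0.1, size=(n_neurons, n_neurons))   # previous output -> neurons
b = np.zeros(n_neurons)

h = np.zeros((n_samples, n_neurons))  # state before the first time step
outputs = []
for t in range(n_steps):
    # Each step combines the current input with the previous step's output.
    h = np.tanh(X[:, t, :] @ Wx + h @ Wh + b)
    outputs.append(h)

# Stack along a new time axis: (samples, time steps, neurons).
Y = np.stack(outputs, axis=1)
print(Y.shape)  # (20, 5, 3)
```

The result has the same layout as the `y` array above: one vector of length 3 per sample and per time step.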

> #### Exercise: Define a simple `Sequential` RNN model using Keras
> - The model should contain one layer (`SimpleRNN` with 3 units and `return_sequences=True`).
> - Assign it to a variable called `model`
> - Use the Keras documentation if you get stuck!

In [None]:
# Define your model here...


> #### Exercise: Compile & fit the model
> - Use MSE loss and `rmsprop` optimizer.
> - Fit it to X and y, using 10 epochs and batch size of 32.
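
If you get stuck, the two exercises above could be approached roughly as follows. This is a sketch under a few assumptions: the data is regenerated locally and cast to `float32` as a precaution, the input shape is left for Keras to infer on the first call to `fit`, and `verbose=0` is only there to keep the output quiet.

```python
import numpy as np
from keras.layers import SimpleRNN
from keras.models import Sequential

np.random.seed(0)
n_samples, n_steps, n_features, n_neurons = 20, 5, 2, 3

# Same shapes as in the cells above, cast to float32 as a precaution.
X = np.random.randint(-10, 10, size=(n_samples, n_steps, n_features)).astype("float32")
y = np.random.randint(-10, 10, size=(n_samples, n_steps, n_neurons)).astype("float32")

# One recurrent layer with 3 units, returning an output at every time step.
model = Sequential([SimpleRNN(n_neurons, return_sequences=True)])

# MSE loss with the rmsprop optimizer, as the exercise asks.
model.compile(loss="mse", optimizer="rmsprop")

# 10 epochs with a batch size of 32 (all 20 samples fit in one batch).
history = model.fit(X, y, epochs=10, batch_size=32, verbose=0)

preds = model.predict(X[:1], verbose=0)
print(preds.shape)  # one sample, 5 time steps, 3 outputs per step
```

Because the default `tanh` activation keeps outputs between -1 and 1 while these random targets range from -10 to 9, the loss stays large here; the point is only to see the pieces fit together.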

Let's try it out! We'll generate some new data `X_new` in the same shape as X.

In [None]:
# X has 20 samples; X_new will have just one, so we want it to have shape (1, 5, 2)
X.shape

In [None]:
# This has shape (1, 5, 2)
X_new = np.array([
    [[1, 0],  # t = 0 (two features)
     [0, 1],  # t = 1
     [0, 1],  # t = 2
     [0, 1],  # t = 3
     [0, 1],  # t = 4
    ]
])
X_new.shape

In [None]:
# Our RNN is able to predict some outcomes y of length 3, for each time step.
model.predict(X_new)

> #### Exercise: Predict single value outputs for y (instead of vectors of length 3)
> - Within your `Sequential` model, add a fully connected `Dense` layer with a single unit after the `SimpleRNN` layer (`units=1`; older Keras versions called this argument `output_dim`)
> - Compile as before
> - Fit to the new y provided
> - Predict for `X_new` again, confirming that your outputs are a single time series of 5 numbers.

In [None]:
# We want a newly shaped y to predict: 20 samples over 5 time steps, with a scalar output at each time step.
y = np.random.randint(low=-10, high=10, size=(n_samples, n_steps, 1))
y.shape
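
To see why a single-unit dense layer yields one number per time step, here is a small numpy sketch of what such a layer computes on a 3D input. The array `rnn_out` is a random stand-in for the recurrent layer's output, and `W` and `bias` are hypothetical weights; none of these come from a trained model.

```python
import numpy as np

# Assumption: a local generator so this sketch is self-contained.
rng = np.random.RandomState(0)

# Stand-in for the recurrent layer's output: 1 sample, 5 time steps, 3 neurons.
rnn_out = rng.uniform(-1, 1, size=(1, 5, 3))

# A dense layer with one unit has a (3, 1) weight matrix and a single bias.
W = rng.normal(size=(3, 1))
bias = np.zeros(1)

# The same weights are applied independently at every time step.
dense_out = rnn_out @ W + bias
print(dense_out.shape)  # (1, 5, 1)
```

The last axis shrinks from 3 to 1 while the sample and time axes are untouched, which is the single time series of 5 numbers the exercise asks you to confirm.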

In [None]:
model.predict(X_new)

> #### Exercise: Train a more fully fledged RNN on data with a learnable pattern.
> - We'll construct an X input with `1` at t=0 and `0` otherwise.
> - Our `y` output just has a simple pattern.
> - The RNN should be able to learn the relationship between the X pattern, and the corresponding y pattern.
> - Re-use your code from before, i.e. a Sequential model containing a SimpleRNN (this time with 50 units), plus a Dense layer with 1 unit and `sigmoid` activation.
> - Compile as before, and fit to `x_train` and `y_train` using 10 epochs.

In [None]:
# These are our sequences. The RNN should learn to predict the
# 0.8 and 0.6 correctly because it can remember the 1 in the inputs.
x_seed = [1, 0, 0, 0, 0, 0]
y_seed = [1, 0.8, 0.6, 0, 0, 0]

In [None]:
# Let's create 1000 identical samples.
n_samples = 1000

x_train = np.array([[x_seed] * n_samples]).reshape(n_samples, len(x_seed), 1)
y_train = np.array([[y_seed] * n_samples]).reshape(n_samples, len(y_seed), 1)

x_train.shape
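
As a sanity check on the construction above, the sketch below builds the same arrays with `np.tile` and confirms that the reshape keeps each sample intact, i.e. every sample is a copy of the seed sequence. Using `np.tile` is just an alternative shown for comparison, not something the notebook requires.

```python
import numpy as np

x_seed = [1, 0, 0, 0, 0, 0]
y_seed = [1, 0.8, 0.6, 0, 0, 0]
n_samples = 1000

# The construction used in the notebook.
x_train = np.array([[x_seed] * n_samples]).reshape(n_samples, len(x_seed), 1)
y_train = np.array([[y_seed] * n_samples]).reshape(n_samples, len(y_seed), 1)

# Alternative: shape the seed as (1, time steps, 1), then repeat it along the sample axis.
x_tiled = np.tile(np.array(x_seed).reshape(1, -1, 1), (n_samples, 1, 1))
y_tiled = np.tile(np.array(y_seed).reshape(1, -1, 1), (n_samples, 1, 1))

print(x_train.shape, y_train.shape)    # (1000, 6, 1) (1000, 6, 1)
print(np.array_equal(x_train, x_tiled))  # True
print(x_train[0, :, 0], y_train[0, :, 0])
```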

In [None]:
# Define your model here...


In [None]:
# Compile...


In [None]:
# Fit...


In [None]:
# Let's predict for this x_new
x_new = np.array([[[1],[0],[0],[0],[0],[0]]])
x_new

In [None]:
model.predict(x_new)

## Lab: LSTMs and GRUs

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM, GRU

> #### Exercise: Try using the LSTM and GRU layers from Keras on the previous example. Do they appear to perform any better?
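
One way to set up the comparison is sketched below. The helper `build_model` is a hypothetical name introduced here, the training data is rebuilt locally so the block can run on its own, and the settings (50 units, a sigmoid `Dense` output, MSE loss, `rmsprop`, 10 epochs) simply mirror the previous exercise. Results vary from run to run, so treat any difference between the two layers as indicative only.

```python
import numpy as np
from keras.layers import GRU, LSTM, Dense
from keras.models import Sequential

x_seed = [1, 0, 0, 0, 0, 0]
y_seed = [1, 0.8, 0.6, 0, 0, 0]
n_samples = 1000

# 1000 identical samples, each of shape (6 time steps, 1 feature).
x_train = np.tile(np.array(x_seed, dtype="float32").reshape(1, -1, 1), (n_samples, 1, 1))
y_train = np.tile(np.array(y_seed, dtype="float32").reshape(1, -1, 1), (n_samples, 1, 1))


def build_model(recurrent_layer):
    """Hypothetical helper: a recurrent layer followed by a 1-unit sigmoid output."""
    model = Sequential([
        recurrent_layer(50, return_sequences=True),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="mse", optimizer="rmsprop")
    return model


x_new = x_train[:1]
results = {}
for name, layer in [("LSTM", LSTM), ("GRU", GRU)]:
    model = build_model(layer)
    history = model.fit(x_train, y_train, epochs=10, verbose=0)
    preds = model.predict(x_new, verbose=0)
    results[name] = (history.history["loss"][-1], preds)
    print(name, "final loss:", results[name][0])
    print(preds[0, :, 0])
```

Comparing the printed predictions against `y_seed` gives a quick feel for how well each layer has picked up the pattern.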

> #### Exercise: Try adding some additional components from the example provided [on the Keras docs here](https://keras.io/getting-started/sequential-model-guide/), such as Dropout. How does this improve things?

> #### Suggested "homework" exercise: Work through the Keras "text generation example" code: https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py
> 
> Try applying this to your own text dataset!