# An RNN model to generate sequences
RNN models can generate long sequences based on past data. This can be used to predict stock prices, temperatures, traffic or sales data from past patterns. They can also be adapted to [generate text](https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.g139650d17f_0_1185). The quality of the prediction depends on the training data, the network architecture, the hyperparameters, how far ahead in time you are predicting, and so on. Most importantly, it depends on whether your training data contains examples of the behaviour you are trying to predict.

<div class="alert alert-block alert-info">
<ol>
Things to do:<br/>
<li> [Choose a waveform](#assignment1) then execute the entire notebook. See results at the bottom: not great...
<li> [Implement the RNN model](#assignment2) and try again
<li> Check that your state is passed around correctly:
    <ol>
    <li> Did you use `dynamic_rnn(initial_state=Hin)` [in your model](#assignment3A) ?
    <li> [During inference](#assignment3B): check the state (hint: it's OK)
    <li> [In the training loop](#assignment3C): check the state (hint: it's a bug)
    <li> [When batching your data](#assignment3C): check the state (hint: it's a bug - [see this slide](https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.g139650d17f_0_584) to understand why then use `rnn_minibatch_sequencer` instead of `dumb_minibatch_sequencer`)
    </ol>
<li> [Make the predictions more robust](#assignment4). You can try any of the following:
    <ol>
    <li> Use GRUCell instead of BasicRNNCell [in your model](#model)
    <li> Train longer NB_EPOCHS 5 -> 10 [in your training loop](#training)
    <li> Larger SEQLEN (16->32) [in hyperparameters](#hyperparameters)
    <li> Use a stacked RNN cell with 2 layers with `tf.nn.rnn_cell.MultiRNNCell` [in your model](#model).<br/>Do not forget to also set N_LAYERS=2 [in hyperparameters](#hyperparameters).
    <li> Use dropout between the RNN cell layers [in your model](#model)
    </ol>
</ol>
    
Play with these options until you get a good fit for at least 128 predicted samples. You can then try a [different waveform](#assignment1).
</div>

In [None]:
import math
import numpy as np
from matplotlib import pyplot as plt
import utils_prettystyle
import utils_batching
import tensorflow as tf
print("Tensorflow version: " + tf.__version__)

<a name="assignment1"></a>
## Generate fake dataset
<div class="alert alert-block alert-info">
**Assignment #1**: Choose a waveform. Three possible choices on the next line: 0, 1 or 2
</div>

In [None]:
WAVEFORM_SELECT = 0 # select 0, 1 or 2

def create_time_series(datalen):
    # sum of two sine waves with nearby frequencies, plus uniform noise
    frequencies = [(0.2, 0.15), (0.35, 0.3), (0.6, 0.55)]
    freq1, freq2 = frequencies[WAVEFORM_SELECT]
    noise = np.random.random(datalen) * 0.2  # uniform in [0, 0.2), one value per sample
    x1 = np.sin(np.arange(0, datalen) * freq1) + noise
    x2 = np.sin(np.arange(0, datalen) * freq2) + noise  # same noise vector as x1
    x = x1 + x2
    return x.astype(np.float32)

DATA_SEQ_LEN = 1024*128
data = create_time_series(DATA_SEQ_LEN)
plt.plot(data[:512])
plt.show()

<a name="hyperparameters"></a>
## Hyperparameters

In [None]:
RNN_CELLSIZE = 64   # size of the RNN cells
N_LAYERS = 1         # number of stacked RNN cells (needed for tensor shapes but code must be changed manually)
SEQLEN = 16         # unrolled sequence length
BATCHSIZE = 32      # mini-batch size
DROPOUT_PKEEP = 0.7 # probability of neurons not being dropped (should be between 0.5 and 1)

## Visualize training sequences
This is what the neural network will see during training.

In [None]:
# The function dumb_minibatch_sequencer splits the data into batches of sequences sequentially.
for features, labels, epoch in utils_batching.dumb_minibatch_sequencer(data, BATCHSIZE, SEQLEN, nb_epochs=1):
    break
print("Features shape: " + str(features.shape))
print("Labels shape: " + str(labels.shape))
print("Excerpt from first batch:")
subplot = 231
for i in range(6):
    plt.subplot(subplot)
    plt.plot(features[i])
    subplot += 1
plt.show()
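
For next-step prediction, the label sequence is usually the feature sequence shifted one step into the future. The sketch below assumes that convention; the actual `utils_batching` implementation is not shown in this notebook, and `make_example` is a hypothetical helper used only for illustration.

```python
def make_example(series, start, seqlen):
    """Hypothetical helper: one (features, labels) pair for next-step prediction."""
    x_seq = series[start:start + seqlen]
    y_seq = series[start + 1:start + seqlen + 1]  # same window, one step later
    return x_seq, y_seq

series = list(range(10))  # stand-in for the waveform samples
x_seq, y_seq = make_example(series, start=2, seqlen=4)
print(x_seq)  # [2, 3, 4, 5]
print(y_seq)  # [3, 4, 5, 6]
```

Under this assumption, the value the model should output at position `t` is simply the input at position `t+1`.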

<a name="assignment2"></a>
<a name="assignment3A"></a>
<a name="model"></a>
## The model definition
When executed, this function instantiates the TensorFlow graph for our model.

<div class="alert alert-block alert-info">
**Assignment #2**: implement a single-layer RNN using GRU cells (tf.nn.rnn_cell.GRUCell)
</div>

<div class="alert alert-block alert-info">
**Assignment #3.A**: check that state is passed around correctly: `dynamic_rnn(initial_state=Hin)`
</div>

![deep RNN schematic](images/RNN1.svg)
<div style="text-align: right; font-family: monospace">
  X shape [BATCHSIZE, SEQLEN, 1]<br/>
  Y shape [BATCHSIZE, SEQLEN, 1]<br/>
  H shape [BATCHSIZE, RNN_CELLSIZE*N_LAYERS]
</div>
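
To make the shapes above concrete, here is a minimal pure-Python sketch of unrolling a recurrent cell over a batch of sequences. It is not the TensorFlow solution to the assignment: it uses a basic tanh cell rather than a GRU, random untrained weights, and scalar inputs (the trailing dimension of size 1 is dropped). The names `BATCH`, `SEQ`, `CELL`, `cell_step` and `unroll` are illustrative assumptions.

```python
import math
import random

BATCH, SEQ, CELL = 2, 5, 3  # small stand-ins for BATCHSIZE, SEQLEN, RNN_CELLSIZE

random.seed(0)
# Hypothetical weights of a basic tanh cell: h_t = tanh(Wx * x_t + Wh . h_{t-1} + b)
Wx = [random.uniform(-1, 1) for _ in range(CELL)]
Wh = [[random.uniform(-1, 1) for _ in range(CELL)] for _ in range(CELL)]
b = [0.0] * CELL

def cell_step(x_t, h_prev):
    """One time step for one sequence: scalar input, state vector of size CELL."""
    return [math.tanh(Wx[j] * x_t
                      + sum(Wh[j][k] * h_prev[k] for k in range(CELL))
                      + b[j])
            for j in range(CELL)]

def unroll(X, Hin):
    """X: [BATCH][SEQ] inputs, Hin: [BATCH][CELL] states -> (outputs, final states)."""
    outputs, Hout = [], []
    for x_seq, h in zip(X, Hin):
        seq_out = []
        for x_t in x_seq:
            h = cell_step(x_t, h)  # the state is carried from one step to the next
            seq_out.append(h)
        outputs.append(seq_out)
        Hout.append(h)             # final state of this sequence
    return outputs, Hout

X = [[math.sin(0.2 * t + s) for t in range(SEQ)] for s in range(BATCH)]
Hin = [[0.0] * CELL for _ in range(BATCH)]
outputs, Hout = unroll(X, Hin)
print(len(outputs), len(outputs[0]), len(outputs[0][0]))  # 2 5 3
print(len(Hout), len(Hout[0]))                            # 2 3
```

The cell outputs have size `CELL` per time step; a final layer with a single neuron would be needed to bring them down to the Y shape listed above.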

In [None]:
def model_fn(features, Hin, labels, dropout_pkeep):
    # inputs shapes during training (for inference, we will use BATCHSIZE=1 and SEQLEN=1):
    # features [BATCHSIZE, SEQLEN, 1]
    # labels [BATCHSIZE, SEQLEN, 1]
    # Hin [BATCHSIZE, RNN_CELLSIZE*N_LAYERS]
    X = features
    batchsize = tf.shape(X)[0] # determined dynamically
    seqlen = tf.shape(X)[1] # determined dynamically
    
    # Goals:
    # Transform input "X=features" into output "Yout"
    # Transform input "Hin" into output "H" (these will be input and output states in an RNN)
    # Compute a loss between "Yout" and "labels" and minimize it
    
    # dummy model that does almost nothing (one trainable variable is needed)
    Yr = X * tf.Variable(tf.ones([]))
    H = Hin
    
    # TODO: create a tf.nn.rnn_cell.GRUCell
    # TODO: unroll the cell using tf.nn.dynamic_rnn
    # TODO: add a regression head using tf.layers.dense with just 1 neuron and no activation
    # TIP: you might need to reshape the sequence of outputs from the unrolled RNN (tf.reshape)
    
    # Yr[BATCHSIZE, SEQLEN, 1]
    Yout = Yr[:,-1,:]
    # Last output in sequence Yout [BATCHSIZE, 1]
    
    # shapes:
    # Yr [BATCHSIZE, SEQLEN, 1]
    # labels[BATCHSIZE, SEQLEN, 1]
    loss = tf.losses.mean_squared_error(labels, Yr)  # argument order: (labels, predictions)
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = optimizer.minimize(loss)
    
    # output shapes:
    # Yout [BATCHSIZE, 1]
    # H [BATCHSIZE, RNN_CELLSIZE*N_LAYERS]
    return Yout, H, loss, train_op

## Instantiate the model

In [None]:
# placeholder for inputs
Hin = tf.placeholder(tf.float32, [None, RNN_CELLSIZE * N_LAYERS])
features = tf.placeholder(tf.float32, [None, None, 1]) # [BATCHSIZE, SEQLEN, 1]
labels = tf.placeholder(tf.float32, [None, None, 1]) # [BATCHSIZE, SEQLEN, 1]
dropout_pkeep = tf.placeholder(tf.float32)

# instantiate the model
Yout, H, loss, train_op = model_fn(features, Hin, labels, dropout_pkeep)

<a name="assignment3B"></a>
<a name="inference"></a>
## Inference
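This is a generative model: to generate a sequence, run one trained RNN cell in a loop, feeding the output and output state of each iteration back in as the input and input state of the next.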


<div class="alert alert-block alert-info">
**Assignment #3.B**: Check that the RNN state is passed around correctly (hint: it's OK)
</div>
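
The generation loop can be illustrated without TensorFlow. The sketch below replaces the trained RNN with a toy cell that continues a pure sine wave exactly, using the identity sin((t+1)w) = 2 cos(w) sin(tw) - sin((t-1)w). The frequency `FREQ` is assumed known here, whereas the notebook's model has to learn the waveform from data; `toy_cell` and `generate` are hypothetical names.

```python
import math

FREQ = 0.2  # assumed known in this toy example

def toy_cell(y_in, h_in):
    """Stand-in for a trained cell. The state is the previous sample."""
    y_out = 2 * math.cos(FREQ) * y_in - h_in
    return y_out, y_in  # the new state is the sample just consumed

def generate(prime, run_length):
    h, y = 0.0, 0.0
    for sample in prime:         # priming: feed real data to build up the state
        y, h = toy_cell(sample, h)
    results = []
    for _ in range(run_length):  # generation: feed the previous output back in
        results.append(y)
        y, h = toy_cell(y, h)
    return results

prime = [math.sin(FREQ * t) for t in range(8)]
generated = generate(prime, 5)
expected = [math.sin(FREQ * (8 + i)) for i in range(5)]
print(max(abs(g - e) for g, e in zip(generated, expected)))  # tiny rounding error
```

Priming matters: the toy cell starts from a zero state, so its first output is wrong, and only after seeing a couple of real samples does its state describe the waveform.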

In [None]:
def prediction_run(prime_data, run_length):
    H_ = np.zeros([1, RNN_CELLSIZE * N_LAYERS]) # zero state initially
    Yout_ = np.zeros([1, 1])
    data_len = prime_data.shape[0]

    # prime the state from data
    if data_len > 0:
        Yin = np.array(prime_data)
        Yin = np.reshape(Yin, [1, data_len, 1]) # reshape as one sequence
        feed = {Hin: H_, features: Yin, dropout_pkeep: 1.0} # no dropout during inference
        Yout_, H_ = sess.run([Yout, H], feed_dict=feed)
    
    # run prediction
    # To generate a sequence, run a trained cell in a loop passing as input and input state
    # respectively the output and output state from the previous iteration.
    results = []
    for i in range(run_length):
        Yout_ = np.reshape(Yout_, [1, 1, 1]) # batch of a single sequence of a single vector with one element
        feed = {Hin: H_, features: Yout_, dropout_pkeep: 1.0} # no dropout during inference
        Yout_, H_ = sess.run([Yout, H], feed_dict=feed)
        results.append(Yout_[0,0])
        
    return np.array(results)

## Initialize TensorFlow session
This resets all weights and biases to their initial values.

In [None]:
# first input state
Hzero = np.zeros([BATCHSIZE, RNN_CELLSIZE * N_LAYERS])
# variable initialization
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run([init])

<a name="assignment3C"></a>
<a name="training"></a>
## The training loop
You can re-execute this cell to continue training.

<div class="alert alert-block alert-info">
**Assignment #3.C**: find and resolve RNN state bugs.<br/>
**Hint**: there are 2 bugs. One is in the core of the training loop and one is in the way the data was sliced into batches of sequences. Special care is needed when batching sequences for an RNN. [See this slide](https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.g139650d17f_0_584) to understand the situation. You can fix the batching by using `rnn_minibatch_sequencer` instead of `dumb_minibatch_sequencer`.
</div>
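
The batching issue can be shown on a toy series. The sketch below contrasts two hypothetical slicing strategies, `naive_batches` and `continuous_batches`; they illustrate the idea only and are not the actual code of `dumb_minibatch_sequencer` or `rnn_minibatch_sequencer`. The output state of row `r` in one batch is a meaningful input state for row `r` of the next batch only if that row continues the same stretch of data.

```python
def naive_batches(data, batchsize, seqlen):
    """Hypothetical: cut data into consecutive chunks and fill batches in order."""
    chunks = [data[i:i + seqlen]
              for i in range(0, len(data) - seqlen + 1, seqlen)]
    return [chunks[i:i + batchsize]
            for i in range(0, len(chunks) - batchsize + 1, batchsize)]

def continuous_batches(data, batchsize, seqlen):
    """Hypothetical: row r of every batch comes from the same contiguous stretch."""
    nb_batches = len(data) // (batchsize * seqlen)
    row_len = nb_batches * seqlen
    return [[data[r * row_len + k * seqlen: r * row_len + (k + 1) * seqlen]
             for r in range(batchsize)]
            for k in range(nb_batches)]

data = list(range(24))
naive = naive_batches(data, batchsize=2, seqlen=3)
cont = continuous_batches(data, batchsize=2, seqlen=3)

# Row 0 of two successive batches:
print(naive[0][0], naive[1][0])  # [0, 1, 2] [6, 7, 8]  -> samples 3..5 are skipped
print(cont[0][0], cont[1][0])    # [0, 1, 2] [3, 4, 5]  -> row 0 continues seamlessly
```

With the naive slicing, carrying the state from one batch to the next would pair each state with unrelated data, so both bugs need to be addressed together.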

In [None]:
NB_EPOCHS = 5

H_ = Hzero
losses = []
indices = []
for i, (next_features, next_labels, epoch) in enumerate(utils_batching.dumb_minibatch_sequencer(data, BATCHSIZE, SEQLEN, nb_epochs=NB_EPOCHS)):
    next_features = np.expand_dims(next_features, axis=2) # model wants 3D inputs [BATCHSIZE, SEQLEN, 1] 
    next_labels = np.expand_dims(next_labels, axis=2)

    feed = {Hin: Hzero, features: next_features, labels: next_labels, dropout_pkeep: DROPOUT_PKEEP}
    Yout_, _, loss_, _ = sess.run([Yout, H, loss, train_op], feed_dict=feed)
    # print progress
    if i%100 == 0:
        print("epoch " + str(epoch) + ", batch " + str(i) + ", loss=" + str(np.mean(loss_)))
    if i%10 == 0:
        losses.append(np.mean(loss_))
        indices.append(i)

In [None]:
plt.ylim(ymax=np.amax(losses[1:])) # ignore first value for scaling
plt.plot(indices, losses)
plt.show()

In [None]:
PRIMELEN=256
RUNLEN=512
OFFSET=20

prime_data = data[OFFSET:OFFSET+PRIMELEN]

results = prediction_run(prime_data, RUNLEN)

disp_data = data[OFFSET:OFFSET+PRIMELEN+RUNLEN]
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
plt.subplot(211)
plt.text(PRIMELEN,2.5,"DATA |", color=colors[1], horizontalalignment="right")
plt.text(PRIMELEN,2.5,"| PREDICTED", color=colors[0], horizontalalignment="left")
displayresults = np.ma.array(np.concatenate((np.zeros([PRIMELEN]), results)))
displayresults = np.ma.masked_where(displayresults == 0, displayresults)
plt.plot(displayresults)
displaydata = np.ma.array(np.concatenate((prime_data, np.zeros([RUNLEN]))))
displaydata = np.ma.masked_where(displaydata == 0, displaydata)
plt.plot(displaydata)
plt.subplot(212)
plt.text(PRIMELEN,2.5,"DATA |", color=colors[1], horizontalalignment="right")
plt.text(PRIMELEN,2.5,"| +PREDICTED", color=colors[0], horizontalalignment="left")
plt.plot(displayresults)
plt.plot(disp_data)
RMSELEN=128
plt.axvspan(PRIMELEN, PRIMELEN+RMSELEN, color='grey', alpha=0.1, ymin=0.05, ymax=0.95)
plt.show()

rmse = math.sqrt(np.mean((data[OFFSET+PRIMELEN:OFFSET+PRIMELEN+RMSELEN] - results[:RMSELEN])**2))
print("RMSE on {} predictions (shaded area): {}".format(RMSELEN, rmse))

<a name="assignment4"></a>

<div class="alert alert-block alert-info">
**Assignment #4**: Make the predictions more robust. You can try any of the following:
<ol>
    <li> Use GRUCell instead of BasicRNNCell [in your model](#model)</li>
    <li> Train longer NB_EPOCHS 5 -> 10 [in your training loop](#training)</li>
    <li> Larger SEQLEN (16->32) [in hyperparameters](#hyperparameters)</li>
    <li> Use a stacked RNN cell with 2 layers with `tf.nn.rnn_cell.MultiRNNCell` [in your model](#model)</li>
    <li> Use dropout between the RNN cell layers [in your model](#model)</li>
</ol>
Play with these options until you get a good fit for at least 128 predicted samples. You can then try a [different waveform](#assignment1).
</div>
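
For the dropout option, `DROPOUT_PKEEP` is the probability of keeping each value. The pure-Python sketch below shows the usual "inverted dropout" scheme, in which surviving values are scaled by `1/pkeep` so that the expected activation is unchanged. The `dropout` function is a hypothetical illustration, not the TensorFlow implementation.

```python
import random

def dropout(values, pkeep, rng):
    """Keep each value with probability pkeep and scale survivors by 1/pkeep."""
    return [v / pkeep if rng.random() < pkeep else 0.0 for v in values]

rng = random.Random(42)
activations = [1.0] * 10000
dropped = dropout(activations, pkeep=0.7, rng=rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(kept_fraction)  # close to 0.7
print(mean_after)     # close to 1.0, the mean before dropout
```

During inference no values are dropped, which is why the notebook feeds `dropout_pkeep: 1.0` in `prediction_run`.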

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
[http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.