<a href="https://colab.research.google.com/github/raquelaoki/DataAnalysis2016/blob/master/RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## RNN

Recurrent Neural Networks (RNNs) are a type of neural network (NN) that can have a long or short 'memory'. When the sequence of interest has $K$ elements (time steps), the unrolled network has $K$ layers, one for each element. In the folded RNN format there are three nodes: $x_t$ (the input at step $t$), $s_t$ (the hidden state at step $t$, the 'memory', $s_t = f(U\times x_t+W\times s_{t-1})$), and $o_t$ (the output at step $t$, $o_t = \mathrm{actfunc}(V\times s_t)$). An RNN shares the parameters $U$, $V$ and $W$ across all steps.

Several input/output variations can be adopted: one-to-many (image description), many-to-one (sentiment analysis), and many-to-many (text generation, translation).
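
As a rough illustration of the recurrence and of the many-to-one and many-to-many variations above, the following NumPy sketch unrolls a tiny RNN over $K$ steps. The sizes, the random weights, the zero initial state, and the choice of `tanh` for $f$ and softmax for the output activation are assumptions made for this example only.

```python
import numpy as np

def rnn_forward(xs, U, W, V):
    """Unroll a simple RNN over a sequence xs of shape (K, input_dim)."""
    s = np.zeros(W.shape[0])          # initial hidden state (assumed to be zeros)
    states, outputs = [], []
    for x_t in xs:                    # the same U, W, V are reused at every step
        s = np.tanh(U @ x_t + W @ s)  # s_t = f(U x_t + W s_{t-1}), with f = tanh
        z = V @ s
        o = np.exp(z - z.max())
        o /= o.sum()                  # o_t = softmax(V s_t)
        states.append(s)
        outputs.append(o)
    return np.array(states), np.array(outputs)

rng = np.random.default_rng(0)
K, input_dim, hidden_dim, output_dim = 5, 3, 4, 2   # illustrative sizes
U = rng.normal(size=(hidden_dim, input_dim))
W = rng.normal(size=(hidden_dim, hidden_dim))
V = rng.normal(size=(output_dim, hidden_dim))
xs = rng.normal(size=(K, input_dim))

states, outputs = rnn_forward(xs, U, W, V)
many_to_one = outputs[-1]   # keep only the last output (e.g. sentiment analysis)
many_to_many = outputs      # keep one output per step (e.g. text generation)
print(states.shape, many_to_one.shape, many_to_many.shape)
```

The one-to-many case is not covered by this sketch; it would feed a single input and keep producing outputs from the evolving hidden state.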

### LSTM

Long Short-Term Memory (LSTM) networks are a subtype of RNN designed for long-term dependencies. A plain RNN can in principle also capture long-term dependencies, but in its pure form it performs poorly on them.
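
The text above does not spell out the LSTM equations, so the following is only a minimal sketch of one commonly used formulation (input, forget and output gates plus a candidate update). The gate ordering, the names `lstm_step`, `W` and `b`, and the sizes are assumptions for illustration. The point to notice is that the cell state `c` is updated additively, which is what helps information survive over many steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*H, D+H) and b has shape (4*H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much of the old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values for the cell state
    c = f * c_prev + i * g       # additive cell-state update
    h = o * np.tanh(c)           # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                      # illustrative input and hidden sizes
W = rng.normal(size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(6, D)):   # run six steps over random inputs
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```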


References: 
- [The Unreasonable Effectiveness of Recurrent Neural Networks](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

- [Tutorial in Lua 1](https://github.com/jcjohnson/torch-rnn)
- [Tutorial in Lua 2](https://github.com/karpathy/char-rnn/blob/master/model/LSTM.lua)
- [Tutorial in Python - More theory](https://towardsdatascience.com/recurrent-neural-networks-by-example-in-python-ffd204f99470)
- [Tutorial for LSTM - Tensorflow 1](https://adventuresinmachinelearning.com/recurrent-neural-networks-lstm-tutorial-tensorflow/)
- [Tutorial for RNN/LSTM - Tensorflow 2](https://github.com/dragen1860/TensorFlow-2.x-Tutorials/blob/master/09-RNN-Sentiment-Analysis/main.py)

Note: This exercise uses TensorFlow 2.



In [1]:
#!pip uninstall tensorflow -y
#!pip uninstall tf-nightly -y
#!pip uninstall tf-nightly-gpu -y 
#!pip install tensorflow-gpu==2.0.0 



In [0]:
import os
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Based on the code available at:
# https://github.com/dragen1860/TensorFlow-2.x-Tutorials/blob/master/09-RNN-Sentiment-Analysis/main.py
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
tf.__version__

In [0]:
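# os.environ exposes the process environment variables; setting
# TF_CPP_MIN_LOG_LEVEL to '2' hides TensorFlow's C++ INFO and WARNING messages
# (it may need to be set before TensorFlow is imported to take full effect).
# Fix the random seeds for reproducibility and check the TensorFlow version.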
tf.random.set_seed(22)
np.random.seed(22)
assert tf.__version__.startswith('2.')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'



In [17]:
# TODO: test with a different dataset.
# Although this is a working example, it does not show the generated text back.



# Load the dataset, keeping only the top n most frequent words;
# rarer words are replaced by the out-of-vocabulary index.
top_words = 10000
# truncate and pad input sequences
max_review_length = 80
(X_train, y_train), (X_test, y_test) = keras.datasets.imdb.load_data(num_words=top_words)
# X_train = tf.convert_to_tensor(X_train)
# y_train = tf.one_hot(y_train, depth=2)
print('Pad sequences (samples x time)')
x_train = keras.preprocessing.sequence.pad_sequences(X_train, maxlen=max_review_length)
x_test = keras.preprocessing.sequence.pad_sequences(X_test, maxlen=max_review_length)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

Pad sequences (samples x time)
x_train shape: (25000, 80)
x_test shape: (25000, 80)
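
To make the padding step concrete, here is a small pure-Python sketch of what `pad_sequences` does to a single sequence under its default settings (pad on the left with zeros, and drop items from the start when the sequence is too long). The helper name `pad_pre` is hypothetical and only mimics that default behaviour; it is not part of Keras.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the pad_sequences defaults for one sequence (maxlen must be > 0)."""
    seq = list(seq)[-maxlen:]                    # truncating='pre': keep the last maxlen items
    return [value] * (maxlen - len(seq)) + seq   # padding='pre': pad on the left

short_review = pad_pre([5, 6, 7], maxlen=5)
long_review = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short_review)  # [0, 0, 5, 6, 7]
print(long_review)   # [3, 4, 5, 6, 7]
```

With `max_review_length = 80`, every review therefore ends up as exactly 80 word indices, which matches the `(25000, 80)` shapes printed above.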


In [0]:
class RNN(keras.Model):

    def __init__(self, units, num_classes, num_layers):
        super(RNN, self).__init__()

        # Two stacked LSTM layers: the first returns the full sequence so that
        # the second can consume it. num_classes and num_layers are currently unused.
        self.rnn = keras.layers.LSTM(units, return_sequences=True)
        self.rnn2 = keras.layers.LSTM(units)

        # Alternative configurations (left disabled):
        # self.cells = [keras.layers.LSTMCell(units) for _ in range(num_layers)]
        # self.rnn = keras.layers.RNN(self.cells, unroll=True)
        # self.rnn = keras.layers.RNN(self.cells, return_sequences=True, return_state=True)
        # self.rnn = keras.layers.LSTM(units, unroll=True)
        # self.rnn = keras.layers.StackedRNNCells(self.cells)

        # The vocabulary has top_words = 10000 words; each word is embedded into a
        # vector of length 100. The maximum sentence length is 80 words.
        self.embedding = keras.layers.Embedding(top_words, 100, input_length=max_review_length)
        self.fc = keras.layers.Dense(1)

    def call(self, inputs, training=None, mask=None):

        # [b, sentence len] => [b, sentence len, word embedding]
        x = self.embedding(inputs)
        # [b, sentence len, word embedding] => [b, sentence len, units]
        x = self.rnn(x)
        # [b, sentence len, units] => [b, units]
        x = self.rnn2(x)
        # [b, units] => [b, 1], a single logit per review
        x = self.fc(x)
        print(x.shape)  # debug output: shape of the final tensor

        return x
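
To get a feel for the size of the model defined above, the following sketch counts its trainable parameters by hand. It assumes the values used in this notebook (`top_words = 10000`, embedding size 100, `units = 64`) and the usual LSTM layout of four gates, each with input weights, recurrent weights and a bias; the helper function names are made up for this example.

```python
def embedding_params(vocab_size, embed_dim):
    # One embedding vector per vocabulary entry
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, outputs):
    # Weight matrix plus one bias per output
    return input_dim * outputs + outputs

top_words, embed_dim, units = 10000, 100, 64
counts = {
    "embedding": embedding_params(top_words, embed_dim),
    "lstm_1": lstm_params(embed_dim, units),   # input is the word embedding
    "lstm_2": lstm_params(units, units),       # input is the first LSTM's output
    "dense": dense_params(units, 1),
}
total = sum(counts.values())
print(counts)
print("total:", total)  # total: 1075329
```

Under these assumptions almost all of the parameters sit in the embedding layer rather than in the recurrent layers.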

In [10]:
#def main():

units = 64
num_classes = 2
batch_size = 32
epochs = 20

model = RNN(units, num_classes, num_layers=2)

model.compile(optimizer=keras.optimizers.Adam(0.001),
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

# train
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          validation_data=(x_test, y_test), verbose=1)

# evaluate on test set
scores = model.evaluate(x_test, y_test, batch_size, verbose=1)
print("Final test loss and accuracy :", scores)



(None, 1)
Train on 25000 samples, validate on 25000 samples
Epoch 1/20
(None, 1)
(None, 1)
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Final test loss and accuracy : [1.238747698506564, 0.81832]
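
Because the model ends in `Dense(1)` without an activation and the loss is built with `from_logits=True`, the model outputs are raw logits rather than probabilities. A probability of 0.5 corresponds to a logit of 0, so it matters which of the two a 0.5 threshold is applied to. How the `'accuracy'` string is resolved can depend on the Keras version, so the sketch below is only an illustration with made-up logits and labels, not a statement about how the score above was computed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([-2.0, -0.1, 0.2, 0.4, 0.7, 3.0])  # made-up raw model outputs
labels = np.array([0, 0, 1, 1, 1, 1])                # made-up true labels

# Thresholding probabilities at 0.5 is the same as thresholding logits at 0
pred_from_probs = (sigmoid(logits) > 0.5).astype(int)
# Thresholding the raw logits at 0.5 misclassifies logits between 0 and 0.5
pred_from_logits = (logits > 0.5).astype(int)

acc_probs = (pred_from_probs == labels).mean()
acc_logits = (pred_from_logits == labels).mean()
print(acc_probs, acc_logits)
```

If this distinction turns out to matter in a given setup, one option is an explicit `keras.metrics.BinaryAccuracy(threshold=0.0)` metric, or a sigmoid output combined with `from_logits=False`.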
