## Lab 10, Part 2: Recurrent Neural Networks (RNN) -- Extra Credit

For modeling sequential data such as sentences, documents, and videos, a standard approach is to use a recurrent neural network (RNN). At each timestep, an RNN takes an element (such as a word) as input, combines it with past information encoded as a vector (for example, everything in the sentence before this timestep), generates a new vector encoding both the current input and the past information, and then passes that vector on to the next timestep.
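The per-timestep update described above can be sketched in a few lines of NumPy. This is a minimal illustration of a plain (non-LSTM) recurrent cell, not the model used later in this lab; the sizes, the weight names `W_xh`, `W_hh`, `b_h`, and the helper `rnn_step` are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 5, 8   # illustrative sizes, not taken from the lab

# Randomly initialised parameters of a plain recurrent cell (hypothetical names)
W_xh = rng.normal(scale=0.1, size=(input_dim, hidden_dim))   # input -> state
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # state -> state
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """Combine the current input with the previous state into a new state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# A toy sequence of 4 one-hot "words"
sequence = np.eye(input_dim)[[0, 3, 1, 4]]

h = np.zeros(hidden_dim)   # no past information before the first timestep
for x_t in sequence:
    h = rnn_step(x_t, h)   # the new state is handed on to the next timestep

print(h.shape)
```

After the loop, `h` is a single vector that depends on every element of the sequence, which is the "past information encoded as a vector" mentioned above.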

For more details about LSTM (a very popular variant of RNN), please refer to http://colah.github.io/posts/2015-08-Understanding-LSTMs/. A very good video explaining RNNs is available at https://www.youtube.com/watch?v=WCUNPb-5EYI.

### Generating text with Long Short-Term Memory Networks

RNNs can be used to generate text. For more information, please read: https://karpathy.github.io/2015/05/21/rnn-effectiveness/.

The following is an example script to generate text from Nietzsche's writings.

Note:
- At least 20 epochs are required before the generated text starts sounding coherent.
- It is recommended to run this script on a GPU, as recurrent networks are quite computationally intensive.
- If you try this script on new data, make sure your corpus has at least ~100k characters. ~1M is better.

In [1]:
#Import necessary libraries 
from __future__ import print_function
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils import get_file
import numpy as np
import random
import sys
import io

Using TensorFlow backend.


In [2]:
#Get the data - available from amazon
path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower() # make it all lowercase 
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

corpus length: 600893
total chars: 57


In [3]:
# Cut the text in semi-redundant sequences of maxlen characters
## Cut the text into a series of windows. 
## Each window is 40 characters
## The window moves 3 steps forward each step

maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

# Turn these sentences into one-hot encoded vectors
## For each character position, the entry at that character's index is one; all other entries are zero

print('Vectorization...')
# use the built-in bool: the np.bool alias was removed in NumPy 1.24
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

nb sequences: 200285
Vectorization...
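To make the windowing and one-hot encoding above concrete, here is the same procedure on a tiny string. The string and the `toy_*` names are invented for this illustration; only the loop structure mirrors the cell above.

```python
import numpy as np

toy_text = "hello world"
toy_maxlen, toy_step = 4, 3   # much smaller than maxlen=40, step=3 used above

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Slide a window of toy_maxlen characters, moving toy_step characters each time;
# the character right after each window is the prediction target.
windows, targets = [], []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i: i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

print(windows)   # ['hell', 'lo w', 'worl']
print(targets)   # ['o', 'o', 'd']

# One-hot encode: shape is (number of windows, window length, alphabet size)
x_toy = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, char in enumerate(window):
        x_toy[i, t, toy_char_indices[char]] = True

print(x_toy.shape)   # (3, 4, 8)
```

Each row `x_toy[i, t]` contains exactly one `True`, at the index of the character in position `t` of window `i`.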


Now we have data to feed a model for text generation. Next, we build an LSTM model to fit the data. Using Keras, this takes only a few lines of code!

In [4]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

Build model...
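As a rough sanity check on the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM layout (four gate blocks, each with input weights, recurrent weights, and a bias) and the 57-character vocabulary printed earlier; the helper names are made up for this example, and `model.summary()` is the authoritative source for the actual counts.

```python
def lstm_param_count(units, input_dim):
    """Parameters of a standard LSTM layer: 4 gate blocks, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * input_dim + units * units + units)

def dense_param_count(units, input_dim):
    """Parameters of a fully connected layer: weights plus biases."""
    return input_dim * units + units

vocab_size = 57    # 'total chars' printed in the data-loading cell
lstm_units = 128   # as in LSTM(128, ...)

lstm_params = lstm_param_count(lstm_units, vocab_size)
dense_params = dense_param_count(vocab_size, lstm_units)

print('LSTM :', lstm_params)
print('Dense:', dense_params)
print('Total:', lstm_params + dense_params)
```

Note that the window length (`maxlen`) does not appear in the count: the same weights are reused at every timestep.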


In [5]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()
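The `temperature` argument of `sample` (called `diversity` in the loop above) controls how adventurous the generated text is. The sketch below applies the same rescaling to a made-up three-way distribution without drawing a sample, so the effect can be inspected directly. `apply_temperature` is a hypothetical helper written for this illustration; subtracting the maximum before exponentiating is a numerical-stability addition that does not change the result.

```python
import numpy as np

def apply_temperature(probs, temperature):
    """Rescale a probability vector as sample() does, but return the
    resulting distribution instead of drawing from it."""
    logits = np.log(np.asarray(probs, dtype='float64')) / temperature
    exp_logits = np.exp(logits - logits.max())   # stability only; cancels out
    return exp_logits / exp_logits.sum()

probs = [0.6, 0.3, 0.1]   # invented next-character probabilities
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(apply_temperature(probs, temperature), 3))
```

Temperatures below 1 sharpen the distribution toward the most likely character, which tends to give repetitive but safe text; a temperature of 1 leaves the model's probabilities unchanged; temperatures above 1 flatten the distribution, which gives more varied text with more misspellings.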

### Training (reduce the number of epochs, it takes a lot of time!!)
- Each epoch takes 5-10 minutes or so on a CPU (an epoch took 7.5 minutes on my PC)
- Recall that at least 20 epochs of training are needed before the results become intelligible
- So you're gonna have to let that puppy run for a while (2-3 hours)

In [7]:
print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=10,
          callbacks=[print_callback])
model.save('shakespear100.h5')

Epoch 1/10

----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: "e control of women, in the marriage cust"
e control of women, in the marriage cust in the proper of the proper and strect of the sumet and the such and the proper of the proper and the suphisting of the proper of the such the the and and the such the super the consification of the proper of the strection of the proper of the super of the man interpress of the prople of the such intellest that is a contine of the man interpoure of the super and the superstand in the super that i
----- diversity: 0.5
----- Generating with seed: "e control of women, in the marriage cust"
e control of women, in the marriage cust propection of fuelly to the spirit of hempentuly that in the proposity of the has of the superstanting that the something with rither and let propliance case of littous and still is with the into the man interman to perhaps inter the conternations of the scirction, its end, the presen

"actually" experience-.=--things for them be invoined that the awbilly you are, "to with efual commanik, whose ghild, still hafter by its toolound, bast for partance
at i doce.--it
hatility of himgent, say to things evidings as crest and individualismss, imperasitayars revalien involplene fullening itte.the may not."--the most mankhand when he south but returning minds, best hoper as things himselfs? micho, for
----- diversity: 1.2
----- Generating with seed: "s of our soul as anything
"actually" exp"
s of our soul as anything
"actually" explanasy, the historance, that they"ktrow tom, upunity and ornam abyest victogy taste-this bndy of tt
day he
turning; this hyporien dustwangs, over-stilipnefulme--and altisusi
endure. in anset mighty how conous indeed
ectoers
of
regaditure--whan hus dangerscies h
that of one's points
aif at
the sour by the whole that thas knowed are noble aroful-prease what binok"
! pinism us, nkir. by for onlf the 
Epoch 5/10

----- Generating text after Epoch: 4
---

virtues of the sense of the propers of the more the subtlest the same present which the same in the participately and all the sense of the stands of the same the sense of the same lights of the spirit of the fact of the consideration of the stands of the specially of the sense of the present of the same belief of a sense of the sense of the such man and sense of the same belief of the more which th
----- diversity: 0.5
----- Generating with seed: " and relapse into old loves and narrow
v"
 and relapse into old loves and narrow
virtues itself in in the more which the such scientian falses origin of the domain of the respectant by
the philosophers in
the appreciately end the earth as in a respect of the and all german one; and a pain themselves by the hers of it, the necessary than their life conscience of the former to there is not believed that that is more and respectacle that the works in the strength of his far in the
----- diversity: 1.0
----- Generating with seed: " and relapse in



out the right to the contradientic the spiritual philosophers approan of the such more philosophers to charm of the soul this superscitibles to him in the capacity, as the
----- diversity: 1.0
----- Generating with seed: " an
important refinement of vision and o"
 an
important refinement of vision and on the mistom having has the possible matious have
no fanc it despainary of this very worthy
who
you no had been an and preal strengs
thereforouse of
the knowledge of the
difficultences it havideng, extentedly and to a nature
savage who would be brack to hur hoftine here is is the herper, and
fasher thing the trageronism as
the is proportion: what, allathy!",", one masple. to canner
virtue
can not 
----- diversity: 1.2
----- Generating with seed: " an
important refinement of vision and o"
 an
important refinement of vision and of the subery of the image only in our sendence expedience of the
frimiely to inothe."--not much readced increake a nobly rigng cause would punish, neale thingsenes

In [None]:
model.fit(x, y,
          batch_size=128,
          epochs=50,
          callbacks=[print_callback])
model.save('shakespear100.h5')

Epoch 1/50

----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: "" is perhaps the greatest audacity and ""
" is perhaps the greatest audacity and "the spiritualism of the most surpristion of the subjection of the surpristion of the most soul of the surpristion of the surcultarily of the soul. the surpristed to the soul. the sense of the subjected to the conscious and as a morality of the spiritualism of the soul of the surpristion of the surpristion of the struggle and who has a morality of the states of the soul of the surpristion of the sp
----- diversity: 0.5
----- Generating with seed: "" is perhaps the greatest audacity and ""
" is perhaps the greatest audacity and "states of the such as thereby all the struggle with a certain their personaly and a spiritualism of the conditions of the loves of its
any soul in ohthers and the master, the free spirituality is the surpress of scholar simply and in the end, and an end the religious desparnd in the gr



allegness of the scholar is desprebtrant of him reprisent and all and in man as the complevely much read
----- diversity: 1.0
----- Generating with seed: "i" "free-thinkers,"
and whatever these h"
i" "free-thinkers,"
and whatever these have he attained
the would commanding "will much statumage." but traver, ipparoding to modern
must
they more
allaters of su-my had a most advancemated which "ejuthhtic thou us,
evhilavent cellent of just univertip. i herangeve that notial he culture!


1ohhe power--it necessarilizances every against enterdendered vidatedly, which of problem,"
gryen will so fellig, is an interpretation!ahus habition
----- diversity: 1.2
----- Generating with seed: "i" "free-thinkers,"
and whatever these h"
i" "free-thinkers,"
and whatever these halling kind efflat his motives. i knownvem! something wend lightnable wasn, enflined cannheub, which
sacovisis "modenswdenced may cannot philosoch to preglibity that ourve "sacome obtranga to away "mask: thereifanktishic
quire, me

In [8]:
del model


## Load pre-trained model
Since training this LSTM model on a CPU for more epochs is time consuming, we provide a pre-trained model that was trained on a GPU for 100 epochs. Use the following code to check how coherent the model's output is.

Loading the model requires the h5py package; please install it before running the following code.

In [9]:
# load the pre-trained model (a single LSTM)
print('Load pre-trained model...')
from keras.models import load_model
model = load_model('shakespear100.h5')


def lstm_generate(seed, model):
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        generated += seed
        print('----- Generating with seed: "' + seed + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(seed):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

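            generated += next_char
            # slide the window forward; note that `seed` is not reset between
            # diversity values, so each diversity continues from the previous text
            seed = seed[1:] + next_char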

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()


seed = "from an anguish with which no other is t"
lstm_generate(seed, model)


Load pre-trained model...
----- diversity: 0.2
----- Generating with seed: "from an anguish with which no other is t"
from an anguish with which no other is the master, and should also the sense of the sense of the subjection of the subjection of the person of the consciousness, and the so the subjection of the subjection of the sense of man is not the more and the consciousness of the present the more the consciousness of the present the consciousness and the sense of the sense of the such a present and the subjection of the case to the subjection of 
----- diversity: 0.5
----- Generating with seed: "ection of the case to the subjection of "
ection of the case to the subjection of contemptations to see us which have not thought what there is a finally every and nature be the above-sensuious man and desire known in all religious age everything the consciousness, and even who have been the great not consual"--in the see marter the things, that the claims, he may be indesprected is every

### Exercise: try generating baby names
- The baby name data set contains 8000 names. You can download and process it as follows:

```python
name_path = get_file('names.txt', origin='http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/corpora/names/other/names.txt')
with io.open(name_path, encoding='utf-8') as f:
    text = f.read()  # read the whole file; case is kept here, unlike the Nietzsche corpus

text = text.split()
text = ', '.join(text)
```

Using the baby name data set, complete the following tasks:

- Train an LSTM to generate baby names.
- How long does it take to train? How coherent do the generated names sound?
- Can you train the LSTM so that, for every epoch, the order of the names is shuffled before calling model.fit()? How long does it take to train? Does it improve the coherence?
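As a starting point for the shuffling task, one possible way to produce a freshly ordered corpus for each epoch is sketched below. `shuffled_corpus` is a hypothetical helper, and the four names are a stand-in for the list obtained from `text.split()` on the names file; the windowing, vectorization, and training steps are left as comments because they are the exercise itself.

```python
import random

def shuffled_corpus(names, seed=None):
    """Return the names joined into one string, in a freshly shuffled order."""
    names = list(names)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(names)
    return ', '.join(names)

names = ['Ada', 'Ben', 'Cleo', 'Dov']    # stand-in for text.split() on names.txt

for epoch in range(3):
    corpus = shuffled_corpus(names, seed=epoch)
    # Here one would rebuild sentences / next_chars and x / y from `corpus`
    # (as in the windowing cell above) and then call
    # model.fit(x, y, batch_size=128, epochs=1).
    print(epoch, corpus)
```

Because the windows span name boundaries, shuffling changes which names follow which, so `x` and `y` need to be rebuilt from the new corpus each time rather than reused.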

