# Character-Level RNN using LSTM cells

- Trained on 1MB of Shakespeare.
- Outputs "fake" Shakespeare.

Much of the code is adapted from the [Keras text-generation example](https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py).

In [2]:
## Much borrowed from https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
from keras.models import load_model
import numpy as np
import random
import sys

#path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with open("tiny-shakespeare.txt") as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

def sample(preds, temperature=1.0):
    # Sample an index from a probability array. temperature < 1 sharpens the
    # distribution (safer picks); temperature > 1 flattens it (more variety).
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

Using TensorFlow backend.


corpus length: 1115394
total chars: 39
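
The `sample` helper above divides the log-probabilities by `temperature` before renormalising. A minimal sketch of just that rescaling step, using a made-up three-way distribution (the name `rescale` and the numbers in `toy` are illustrative and not part of the notebook):

```python
import numpy as np

def rescale(preds, temperature=1.0):
    # Same arithmetic as sample(), but returns the adjusted
    # distribution instead of drawing an index from it.
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_preds = np.exp(logits)
    return exp_preds / np.sum(exp_preds)

toy = [0.6, 0.3, 0.1]  # hypothetical model output
for t in (0.5, 1.0, 2.0):
    print(t, np.round(rescale(toy, t), 3))
```

At a temperature of 1.0 the distribution is unchanged; lower values push more probability onto the most likely character, and higher values spread it out.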


## Setup Environment

- Import Keras.
- Open the Shakespeare corpus.
- Give each character an index and create dictionaries to translate between characters and indices.
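
The two lookup dictionaries can be tried out on a short stand-in string. This is only a sketch: the sample text is made up, while `char_indices` and `indices_char` follow the names used in the notebook.

```python
text = "to be or not to be"  # stand-in corpus, not the real data

chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}  # character -> index
indices_char = {i: c for i, c in enumerate(chars)}  # index -> character

# Round trip: encode a string to indices, then decode it again.
encoded = [char_indices[c] for c in "to be"]
decoded = ''.join(indices_char[i] for i in encoded)
print(chars)
print(encoded)
print(decoded)
```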

## Setup Training Data

- Cut the corpus into overlapping sequences of 50 characters, starting a new sequence every 3 characters.
- Convert the character indices into "one-hot" vector encodings.
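
The windowing and one-hot steps are easier to follow on a tiny input. The sketch below uses a made-up string and a window of 4 characters (the code cell sets `maxlen = 50`); everything else mirrors the vectorization loop in the notebook.

```python
import numpy as np

text = "hello world"   # stand-in corpus
maxlen, step = 4, 3    # toy values chosen for readability

chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}

# Each window of maxlen characters is paired with the character after it.
sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i:i + maxlen])
    next_chars.append(text[i + maxlen])

# One-hot encode: X[i, t, k] is True when window i has character k at step t.
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = True
    y[i, char_indices[next_chars[i]]] = True

print(sentences, next_chars)
print(X.shape, y.shape)
```

Every time step in `X` and every row in `y` has exactly one `True` entry, which is what makes the encoding "one-hot".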

## Model

- The model has one hidden layer of 128 LSTM cells, followed by a dense softmax layer over the 39 characters.
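
The parameter counts that `model.summary()` reports for this architecture (86,016 for the LSTM and 5,031 for the dense layer) can be reproduced by hand. The sketch below assumes the standard LSTM layout of four weight blocks (input, forget and output gates plus the cell candidate), each with input weights, recurrent weights and a bias.

```python
units = 128      # LSTM cells in the hidden layer
input_dim = 39   # distinct characters reported for the corpus

# Four blocks, each with input weights, recurrent weights and a bias vector.
lstm_params = 4 * (units * input_dim + units * units + units)
# Dense softmax layer: one weight per (hidden unit, character) plus biases.
dense_params = units * input_dim + input_dim

print(lstm_params, dense_params, lstm_params + dense_params)
```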

## Training

- Train on batches of 128 examples.
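
As a rough sense of scale, the batch size and the number of training sequences fix how many weight updates one epoch involves. A small sketch, taking the sequence count from the "nb sequences" line printed by the vectorization cell:

```python
import math

n_sequences = 371782  # "nb sequences" printed by the notebook
batch_size = 128

# One weight update per batch; the final batch is smaller than the rest.
updates_per_epoch = math.ceil(n_sequences / batch_size)
last_batch = n_sequences - (updates_per_epoch - 1) * batch_size
print(updates_per_epoch, last_batch)
```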


In [5]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 50
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
# Use the built-in bool: the np.bool alias is not available in every NumPy release.
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
    
print('done.')

nb sequences: 371782
Vectorization...
done.


In [3]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
model.summary()



Build model...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 128)               86016     
_________________________________________________________________
dense_1 (Dense)              (None, 39)                5031      
_________________________________________________________________
activation_1 (Activation)    (None, 39)                0         
Total params: 91,047.0
Trainable params: 91,047.0
Non-trainable params: 0.0
_________________________________________________________________


In [4]:
# Training the Model.
model.fit(X, y, batch_size=128, epochs=5)
model.save("keras-shakespeare-LSTM-model.h5")

Epoch 1/5

KeyboardInterrupt: 

In [8]:
model = load_model("keras-shakespeare-LSTM-model.h5")
quote = "Be not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon them."
quote = quote.lower()

def sample_model(seed, length=400):
    # Generate `length` characters, seeded with the first maxlen characters of seed.
    sentence = seed.lower()[:maxlen]
    generated = sentence

    for i in range(length):
        # One-hot encode the current window of characters.
        x = np.zeros((1, maxlen, len(chars)))
        for t, char in enumerate(sentence):
            x[0, t, char_indices[char]] = 1.

        # Predict the next-character distribution and sample from it.
        preds = model.predict(x, verbose=0)[0]
        next_index = sample(preds, 0.8)
        next_char = indices_char[next_index]

        # Slide the window forward by one character.
        generated += next_char
        sentence = sentence[1:] + next_char
    print(generated)

sample_model(quote)

be not afraid of greatness: some are born great, so,
which infectiesse of shoeves there loves,
there were expt my sovereign against the strest
throw then forrole as give me brother--
good moring to alon your mother,
and there to reads to ready's prosh, my space
the friend of herebother to the fellowidse
which i leave to he would true mother and thir?

angelo:
there and witch a dild.

propt,
is the kate as my shepphing.

juliet:
if he hath he will


In [9]:
# train the model, output generated text after each iteration
for iteration in range(1, 60):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(X, y,
              batch_size=128,
              epochs=1)

    start_index = random.randint(0, len(text) - maxlen - 1)

    for diversity in [0.5]:
        print()
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x[0, t, char_indices[char]] = 1.

            preds = model.predict(x, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()


--------------------------------------------------
Iteration 1
Epoch 1/1

----- diversity: 0.5
----- Generating with seed: "ch hope have all the line of john of gaunt!

richa"
ch hope have all the line of john of gaunt!

richard youk:
why, so so now a crowers and will be sould i say
that prison of you here so so so like his soul
and we so like the monger to the more fore all me
and so the head to the painted and lase for a crain.

second murderer:
a may you hold to lest were soul to the stay,
and the hath you so follow the fore shall he couses
consin you the other give the beneres are to
to the made of my bound of your

--------------------------------------------------
Iteration 2
Epoch 1/1

----- diversity: 0.5
----- Generating with seed: "r conscience sake, to help to get thee a wife.

se"
r conscience sake, to help to get thee a wife.

second serving:
without the man thou art whom made unto unter there,
and the lady great shall poor of the king richard
as the rettent to the many t



 the post possess'd upon
the gods presence and what he was a forth some
when i have and with me the daughter.

claudio:
i love me, i would thee,
through'd speak the people of this friends,
that w

--------------------------------------------------
Iteration 9
Epoch 1/1

----- diversity: 0.5
----- Generating with seed: "hat studied torments, tyrant, hast for me?
what wh"
hat studied torments, tyrant, hast for me?
what while the denution to him hear and lord?

capulet:
why, if i do be rich a bards,
and thy dead; i think then the day to the sacred
that had so, and this is been the very rawle
to see the common too is bear the rost,
i will we love the partied to hereford,
which, have thens he is not be in saw men,
and i'll weep the duke him and seed.

gloucester:
and i know a far mess in this heart,
lows is marry and

--------------------------------------------------
Iteration 10
Epoch 1/1

----- diversity: 0.5
----- Generating with seed: "ys to wail.
to fear the foe, since fear oppresseth

KeyboardInterrupt: 