# Text Generation: LSTM

Long Short-Term Memory (LSTM) networks are a special kind of recurrent neural network (RNN) designed to learn long-term dependencies in sequences.

For a good explanation see this [post](http://colah.github.io/posts/2015-08-Understanding-LSTMs/).

In this example we will use an LSTM to generate text, character by character, based on some input text (Nietzsche's writings).

In [2]:
from __future__ import print_function

from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file

import numpy as np
import random
import sys
import io


## Data

The dataset is a freely available text file of [Nietzsche's writings](https://s3.amazonaws.com/text-datasets/nietzsche.txt) that you can download.

This text needs to be split into smaller sequences that can then be used to train the LSTM.

In [5]:
# get the data
path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt
corpus length: 600893


Collect the distinct characters in the text, sort them, and build two dictionaries: one mapping each character to an integer index and one mapping each index back to its character.

In [6]:
# how many different characters is that?
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

total chars: 57
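
As a small illustration of how the two dictionaries are used, the sketch below builds them for a short made-up string and round-trips the string through its integer indices. The `toy_*` names are illustrative and not part of the notebook.

```python
# Toy illustration of the two lookup tables (hypothetical sample string).
toy_text = "hello world"
toy_chars = sorted(set(toy_text))

# character -> index, and index -> character
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

# Encode the string as indices, then decode it back.
encoded = [toy_char_indices[c] for c in toy_text]
decoded = "".join(toy_indices_char[i] for i in encoded)

print(toy_chars)
print(encoded)
print(decoded)
```

Because the characters are sorted before being enumerated, the same text always produces the same index assignment.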


### Pre-processing

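Here we cut the text into overlapping sequences of `maxlen` characters, moving forward `step` characters each time. For every sequence we also store the character that follows it, since that is the target the LSTM is trained to predict.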

In [7]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 200285
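
To make the windowing concrete, here is a minimal sketch that runs the same loop on a ten-character string. `make_windows` and the `toy_*` names are hypothetical helpers introduced only for this illustration; the final lines check the window count for the corpus length reported above.

```python
def make_windows(text, maxlen, step):
    """Return (sequences, next_chars) using the same loop as the cell above."""
    sequences, targets = [], []
    for i in range(0, len(text) - maxlen, step):
        sequences.append(text[i: i + maxlen])   # input window
        targets.append(text[i + maxlen])        # character right after the window
    return sequences, targets


toy_sequences, toy_targets = make_windows("abcdefghij", maxlen=4, step=3)
for seq, target in zip(toy_sequences, toy_targets):
    print(repr(seq), '->', repr(target))

# Number of windows the loop yields for a 600893-character corpus
# with maxlen=40 and step=3 (values taken from the notebook).
n_windows = len(range(0, 600893 - 40, 3))
print(n_windows)
```

A smaller `step` yields more, heavily overlapping training examples; a larger `step` yields fewer examples and less redundancy.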


One-hot encode the sequences and their next characters into boolean arrays.

In [11]:
print('Vectorization...')
# the np.bool alias is not available in every NumPy release; the builtin bool is equivalent
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...


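We now have an array `x` of shape (number of sequences, `maxlen`, number of characters) in which each character is represented by a single True at its index, and an array `y` that encodes each following character in the same way.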
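
The encoding is easier to see on a tiny example. The sketch below applies the same loops to a made-up three-character vocabulary and two length-2 sequences; all `toy_*` names are illustrative and not part of the notebook.

```python
import numpy as np

# Hypothetical miniature vocabulary and training pairs.
toy_chars = ['a', 'b', 'c']
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_sentences = ['ab', 'bc']
toy_next_chars = ['c', 'a']
toy_maxlen = 2

# Same layout as x and y above: (sequence, position, character) and (sequence, character).
toy_x = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        toy_x[i, t, toy_char_indices[char]] = True
    toy_y[i, toy_char_indices[toy_next_chars[i]]] = True

print(toy_x.astype(int))
print(toy_y.astype(int))
```

Each row of `toy_x[i]` contains exactly one True, so the number of True entries in the input array equals the number of sequences times the sequence length.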

## The Model: LSTM

In [9]:
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
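
For a rough sense of the model's size, the trainable parameters can be counted by hand. The sketch below assumes the standard Keras LSTM layer with bias terms, which holds four weight blocks (three gates plus the candidate cell state), and uses the values from this notebook (128 units, 57 characters). If that assumption holds, `model.summary()` should report the same figures.

```python
# Values taken from the notebook.
units = 128        # LSTM(128)
n_chars = 57       # len(chars)

# Each of the 4 LSTM weight blocks has an input kernel, a recurrent kernel and a bias.
lstm_params = 4 * (units * (n_chars + units) + units)

# Dense layer: one weight per (unit, character) pair plus one bias per character.
dense_params = units * n_chars + n_chars

total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)
```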

In [10]:
# compile the model
optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

### Helper functions

In [14]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
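
The `sample` helper rescales the predicted distribution before drawing from it: dividing the log-probabilities by `temperature` and renormalising is the same as raising each probability to the power `1 / temperature`. The sketch below isolates that rescaling step and applies it to a made-up three-way distribution. `apply_temperature` and `toy_preds` are illustrative names, not part of the notebook.

```python
import numpy as np


def apply_temperature(preds, temperature=1.0):
    """Rescale a probability vector the same way sample() does, without sampling."""
    preds = np.asarray(preds).astype('float64')
    scaled = np.exp(np.log(preds) / temperature)
    return scaled / np.sum(scaled)


toy_preds = [0.5, 0.3, 0.2]  # hypothetical model output
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(apply_temperature(toy_preds, temperature), 3))
```

Temperatures below 1 push most of the probability mass onto the likeliest character, which makes the generated text more conservative and repetitive. Temperatures above 1 flatten the distribution and make unlikely characters more probable.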

In [15]:
def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

Wrap the function in a `LambdaCallback` so that it is called at the end of each epoch.

In [16]:
print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

## Fit

In [18]:
model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

Epoch 1/60

----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: " with a variety of
allied sentiments and"
 with a variety of
allied sentiments and are the constingt in the morality and the world, and the the instand and the sould and the morality and the morality and the the morality and intenting the the great in the morality and the truent of the trueth of the world and the morality and the sound and and the properality and any sould and the morality and the condividual the constingt of the condividusing and the sould and the lite the mor
----- diversity: 0.5
----- Generating with seed: " with a variety of
allied sentiments and"
 with a variety of
allied sentiments and cruation of the respination of the world, and indeven, and every in formance, and indeed and mose psince, accound we are an aristanion and anth had in the sould, the an
ancinnences of the constitally sound and has even caring
to the are as the desint the still not always and truence, 

KeyboardInterrupt: 