https://gilberttanner.com/blog/generating-text-using-a-recurrent-neuralnetwork

In [1]:
# Creating our dictionary
with open('shakespeare.txt', 'r') as file:
    text = file.read().lower()
print('text length', len(text))

chars = sorted(list(set(text)))  # getting all unique chars
print('total chars: ', len(chars))

char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

text length 425879
total chars:  39


To get useful training data, we will split the text into subsequences of 40 characters, starting a new subsequence every 3 characters, and store the character that follows each subsequence as its target. Then we will one-hot encode both into boolean arrays.

In [2]:
import numpy as np

maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])

# Use the built-in bool: the np.bool alias is not available in every NumPy release
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
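
To make the windowing and one-hot encoding easier to picture, here is a minimal sketch that runs the same steps on a tiny string. The names toy_text, toy_maxlen, toy_step and the other toy_ variables are made up for this illustration and are not part of the original notebook.

```python
import numpy as np

# Hypothetical toy input, standing in for the Shakespeare text
toy_text = "hello world"
toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Much smaller window and the same step as above, so the result stays readable
toy_maxlen, toy_step = 4, 3
windows, targets = [], []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])   # input subsequence
    targets.append(toy_text[i + toy_maxlen])     # character that follows it

# One-hot encode: one row per window, one column per unique character
x_toy = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, char in enumerate(window):
        x_toy[i, t, toy_char_indices[char]] = True
    y_toy[i, toy_char_indices[targets[i]]] = True

print(windows)                    # ['hell', 'lo w', 'worl']
print(targets)                    # ['o', 'o', 'd']
print(x_toy.shape, y_toy.shape)   # (3, 4, 8) (3, 8)
```

Every time step in x_toy has exactly one True entry, which marks the character at that position. The real x and y arrays have the same layout, only with 40 time steps and 39 characters.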

Although creating an RNN sounds complex, the implementation is fairly easy using Keras. We will create a simple RNN with the following structure:

LSTM layer: learns the sequence
Dense (fully connected) layer: one output neuron for each unique char
Softmax activation: transforms the outputs into probability values

We will use the RMSprop optimizer and the categorical crossentropy loss function.

In [3]:
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

# learning_rate replaces the deprecated lr argument in current Keras releases
optimizer = RMSprop(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

Helper Functions

In order to see the improvements our model makes whilst training we will create two helper functions. These two functions are from the official LSTM text generation example from the Keras Team.

The first helper function samples an index from the model output (a probability array). Its temperature parameter controls how much freedom the function has when creating text: low values favor the most likely characters, while higher values produce more varied text. The second function generates text with four different temperatures at the end of each epoch, so we can see how our model is doing.

In [4]:
# Import from the same keras package as the model, not the private tensorflow.python path
from keras.callbacks import LambdaCallback
import sys
import random


def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()


print_callback = LambdaCallback(on_epoch_end=on_epoch_end)
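
To get a feel for the temperature parameter, the sketch below applies the same reweighting as sample() (log, divide by temperature, renormalize) to a small made-up distribution and returns the new probabilities instead of drawing from them. The function apply_temperature and the example probabilities are assumptions for illustration only.

```python
import math


def apply_temperature(probs, temperature=1.0):
    # Same reweighting as in sample(): log, divide by temperature, renormalize
    scaled = [math.log(p) / temperature for p in probs]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical model output over three characters
probs = [0.6, 0.3, 0.1]
for temperature in (0.2, 1.0, 1.2):
    reweighted = apply_temperature(probs, temperature)
    print(temperature, [round(p, 3) for p in reweighted])
```

A temperature below 1 pushes almost all probability onto the most likely character, a temperature of 1 leaves the distribution unchanged, and a temperature above 1 flattens it so that less likely characters are picked more often.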

We will also define two other callbacks. The first is ModelCheckpoint. It saves our model at the end of every epoch in which the loss improves on the best value seen so far.

In [5]:
from keras.callbacks import ModelCheckpoint

filepath = "weights.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss',
                             verbose=1, save_best_only=True,
                             mode='min')
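
As a rough illustration of what save_best_only=True with mode='min' means, the sketch below replays a made-up list of epoch losses and reports which epochs would trigger a save. The function epochs_that_save and the loss values are hypothetical, and the sketch only mirrors the comparison logic, not the actual Keras implementation.

```python
def epochs_that_save(losses):
    # Track the lowest loss so far; a save happens only when it is beaten
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved


# Hypothetical losses: epoch 2 is worse than epoch 1, so nothing is saved there
print(epochs_that_save([1.5, 1.2, 1.3, 1.1]))  # [0, 1, 3]
```

Because the file name is fixed, each save overwrites the previous one, so weights.hdf5 always holds the best model so far.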

The other callback, ReduceLROnPlateau, reduces the learning rate whenever the loss stops improving.

In [6]:
from keras.callbacks import ReduceLROnPlateau

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.2,
                              patience=1, min_lr=0.001)

callbacks = [print_callback, checkpoint, reduce_lr]
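
To see how these settings play out, here is a simplified sketch of the reduction rule using the values from this notebook (initial learning rate 0.01, factor 0.2, patience 1, min_lr 0.001). The function simulate_reduce_lr and the loss values are made up for illustration, and the sketch ignores the min_delta and cooldown options of the real callback.

```python
def simulate_reduce_lr(losses, lr=0.01, factor=0.2, patience=1, min_lr=0.001):
    # Returns the learning rate in effect after each epoch
    best = float("inf")
    wait = 0
    history = []
    for loss in losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Multiply by factor, but never go below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Hypothetical losses with two epochs that fail to improve
rates = simulate_reduce_lr([1.5, 1.3, 1.35, 1.2, 1.25])
print([round(r, 6) for r in rates])  # [0.01, 0.01, 0.002, 0.002, 0.001]
```

With these values the learning rate can only drop twice: from 0.01 to 0.002, and then to the floor of 0.001, because 0.002 * 0.2 = 0.0004 is below min_lr.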

Training a model and generating new text

For training we need to choose a batch_size and the number of epochs. For the batch_size I chose 128, which is just an arbitrary number. The cell below is set to 100 epochs. If you don't want to wait that long, you can lower that number, and if you want, you can train for a lot longer.

In [7]:
# train epoch
model.fit(x, y, batch_size=128, epochs=100, callbacks=callbacks)

Epoch 1/100
----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: "nd, who wooes
even now the frozen bosom "
nd, who wooes
even now the frozen bosom him and the south the shall the string
the shall the sead the shall in the shall the shall sir,
and the lid the shall the planter the sead of the streats to the shall the seed the sead the shall the for the say and the sead the sin the shall sould and the for the sine the say the shall the father the proming of the sard the say,
and the sould hamlet shall the sould and the sin the sark the seall t
----- diversity: 0.5
----- Generating with seed: "nd, who wooes
even now the frozen bosom "
nd, who wooes
even now the frozen bosom cost the fares and the bading of the first,
the sead and the sword, no should be and brutus of all hes and will the shull youne,
and the countrant them the know in than the spuck in it.
marce let on my lifir and not thou are made my love there and sir,
the sore me nor ear a butter spil

 speak to, good thinking my

Epoch 00040: loss improved from 1.09138 to 1.06577, saving model to weights.hdf5
Epoch 41/100
----- Generating text after Epoch: 40
----- diversity: 0.2
----- Generating with seed: "ter romeo and juliet above, at the windo"
ter romeo and juliet above, at the window
and all the pointce, and the seathman, the world
and the body and beauty of the word of the streets,
and the world it with a man itselfess.
cassius
i will not speak to the sears than the strenges,
and the body and lady more than the heaven,
to speak to the word of my mind to bed,
and be the seek his face of my sweet caesar,
the point!'
cossiclano

hamlet
what is the streets of his sword, and whe
----- diversity: 0.5
----- Generating with seed: "ter romeo and juliet above, at the windo"
ter romeo and juliet above, at the window
and death, then then, that may be so all his face.
king claudius
why, is the dispose to caesar's dead of the
scilentge of made you?
or shall come and fair on the world:
the

<tensorflow.python.keras.callbacks.History at 0x7fbfaedd9040>

To generate text ourselves, we will create a function similar to the on_epoch_end function. It picks a random starting index, takes the 40 characters that follow it as a seed, and then uses them to make predictions. As parameters we pass the number of characters we want to generate and the diversity (temperature) of the generated text.

In [8]:
def generate_text(length, diversity):
    # Get random starting text
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated = ''
    sentence = text[start_index: start_index + maxlen]
    generated += sentence
    for i in range(length):
        x_pred = np.zeros((1, maxlen, len(chars)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_indices[char]] = 1.

        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, diversity)
        next_char = indices_char[next_index]

        generated += next_char
        sentence = sentence[1:] + next_char
    return generated
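
The core of generate_text is the sliding window: each predicted character is appended to the output, and the window drops its first character so it always keeps the same length. The sketch below shows only that mechanism. It replaces model.predict and sample with a trivial stand-in rule (the next letter in a three-letter alphabet), so toy_generate and its alphabet are purely hypothetical.

```python
def toy_generate(seed, length, alphabet="abc"):
    window = seed
    generated = seed
    for _ in range(length):
        # Stand-in for model.predict + sample: pick the letter after the last one
        last = alphabet.index(window[-1])
        next_char = alphabet[(last + 1) % len(alphabet)]

        generated += next_char
        # Slide the window: drop the oldest character, append the new one
        window = window[1:] + next_char
    return generated


print(toy_generate("ab", 4))  # abcabc
```

The returned string contains the seed followed by the generated characters, just as generate_text returns the 40-character seed plus length new characters.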

Now we can create text by just calling the generate_text function:

In [25]:
# generate text
print(generate_text(250, 0.5))

r of what might fall, so to prevent
the standed is at the earth her face of the streets;
and she is, you and there is no more thans
the roman be read, the world it did true.
what, with heavens, my lord, i am to thee,
to caesar's spiring with all the bloody,
and for the winds, to live thee 


Recurrent Neural Networks (RNNs) are well suited to sequential data because they can remember previous inputs via an internal memory. They have achieved strong results on many sequence tasks, such as text and speech, and are widely used in industry. An RNN can be used to generate text in the style of a specific author.

The steps of creating a text generation RNN are:

Creating or gathering a dataset
Building the RNN model
Creating new text by taking a random sentence as a starting point

The details of this project can be found here. I’d encourage anyone to play around with the code and maybe change the dataset and preprocessing steps and see what happens.

There are also a lot of things you can improve about the model to get better outputs. A few of them are:

Using a more sophisticated network structure (more LSTM or Dense layers)
Training for more epochs
Playing around with the batch_size

If you liked this article, consider subscribing to my YouTube channel and following me on social media.

If you have any questions or criticism, I can be reached via Twitter or the comment section.
