The end goal of this project is to build a recurrent neural network model that can generate sequential text data. Varying degrees of randomness are introduced through the softmax temperature parameter in order to strike the right balance between generating predictable, realistic sequences and surprising, creative sequences.

The data used for training this model is a free ebook (Magic Shadow), which can be found on www.gutenberg.org.

Much of the structure of the code used can be found in the 8th chapter of Francois Chollet's book Deep Learning with Python.

PREPARING THE DATA

The data is downloaded, parsed and converted to lower case.

In [1]:
import numpy as np
from tensorflow import keras

In [2]:
path = keras.utils.get_file(
    'magic_shadow.txt',
    origin='https://www.gutenberg.org/files/64578/64578-0.txt')

Downloading data from https://www.gutenberg.org/files/64578/64578-0.txt


In [3]:
text = open(path, encoding='utf-8').read().lower()

In [4]:
print('Corpus Length:', len(text))

Corpus Length: 432284


Sequences of 60 characters are extracted, one-hot encoded, and packed into a 3D NumPy array x of shape (number of sequences, sequence length, number of unique characters in the corpus). The character that comes after each extracted sequence (the target, y) is one-hot encoded in the same pass.

In [5]:
maxLen = 60
step = 3  # A new sequence is sampled every 3 characters
sentences = []  # Holds the extracted sequences (x)
nextChars = []  # Holds the targets, i.e. the follow-up characters (y)

In [6]:
for i in range(0, len(text) - maxLen, step): 
    sentences.append(text[i: i + maxLen])
    nextChars.append(text[i + maxLen])

In [7]:
print('Number of Sequences:', len(sentences))

Number of Sequences: 144075
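
As a small illustration of the sliding window above, the sketch below applies the same loop to a toy string and then checks the sequence count reported for the real corpus. The names toy_text, toy_max_len and toy_step are hypothetical and not part of the notebook.

```python
# Toy illustration of the sliding-window extraction.
# toy_text, toy_max_len and toy_step are hypothetical names for this sketch.
toy_text = "hello world"
toy_max_len = 5
toy_step = 3

toy_sentences = []   # extracted windows (inputs)
toy_next_chars = []  # character that follows each window (targets)
for i in range(0, len(toy_text) - toy_max_len, toy_step):
    toy_sentences.append(toy_text[i: i + toy_max_len])
    toy_next_chars.append(toy_text[i + toy_max_len])

print(toy_sentences)
print(toy_next_chars)

# The same range arithmetic, applied to the corpus length, window length
# and step used in the notebook, gives the number of sequences.
corpus_length, max_len, step = 432284, 60, 3
n_sequences = len(range(0, corpus_length - max_len, step))
print(n_sequences)
```

A step of 3 means consecutive windows overlap by 57 characters, so the corpus yields roughly one training sequence for every three characters.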


In [8]:
chars = sorted(list(set(text)))  #List of unique characters in the corpus
print('Unique Characters:', len(chars)) 

Unique Characters: 89


In [9]:
# Maps each unique character to its index in the list of characters
charIndices = {char: i for i, char in enumerate(chars)}

One Hot Encoding of the Characters into Binary Arrays

In [10]:
# The built-in bool is used because the np.bool alias was removed in NumPy 1.24
x = np.zeros((len(sentences), maxLen, len(chars)), dtype=bool)

In [11]:
y = np.zeros((len(sentences), len(chars)), dtype=bool)

In [12]:
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, charIndices[char]] = 1  # One-hot encode each character of the sequence
    y[i, charIndices[nextChars[i]]] = 1  # One-hot encode the target once per sequence
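
To make the encoding concrete, here is a minimal sketch on a three-character toy alphabet. It builds the same kind of arrays, decodes one sequence back to text, and estimates the memory used by the full x array from the counts printed earlier. All toy_* names are hypothetical.

```python
import numpy as np

# Hypothetical toy data: two sequences of length 3 over the alphabet a, b, c.
toy_sentences = ["abc", "bca"]
toy_next_chars = ["a", "b"]
toy_chars = sorted(set("".join(toy_sentences)))
toy_indices = {char: i for i, char in enumerate(toy_chars)}

toy_x = np.zeros((len(toy_sentences), 3, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        toy_x[i, t, toy_indices[char]] = True
    toy_y[i, toy_indices[toy_next_chars[i]]] = True

# Each time step has exactly one True entry, so argmax recovers the character.
decoded = "".join(toy_chars[k] for k in toy_x[0].argmax(axis=-1))
print(toy_x.shape, decoded)

# Boolean arrays use one byte per element, so the full x array
# (144075 sequences x 60 steps x 89 characters) needs this many bytes.
full_x_bytes = 144075 * 60 * 89
print(round(full_x_bytes / 1e6), "MB")
```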

Building the Network

The neural network consists of one long short-term memory (LSTM) layer with 128 hidden units, followed by a dense layer with 89 units (one per unique character). The dense layer uses a softmax activation because the model is expected to output a probability distribution over all 89 possible next characters rather than a single binary (0 or 1) value.

In [13]:
from tensorflow.keras import layers, models

In [14]:
model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxLen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

In [15]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm (LSTM)                  (None, 128)               111616    
_________________________________________________________________
dense (Dense)                (None, 89)                11481     
Total params: 123,097
Trainable params: 123,097
Non-trainable params: 0
_________________________________________________________________
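
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM layout (four weight sets, one each for the input, forget and output gates and the cell candidate, each with an input kernel, a recurrent kernel and a bias). The helper names are hypothetical.

```python
def lstm_param_count(input_dim, units):
    # Four weight sets, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units) and a bias vector (units).
    return 4 * ((input_dim + units) * units + units)

def dense_param_count(input_dim, units):
    # One weight matrix plus one bias vector.
    return input_dim * units + units

n_chars, n_units = 89, 128
lstm_params = lstm_param_count(n_chars, n_units)
dense_params = dense_param_count(n_units, n_chars)
print(lstm_params, dense_params, lstm_params + dense_params)
```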


In [16]:
# learning_rate replaces the deprecated lr argument
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.RMSprop(learning_rate=0.01))
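
For a sense of scale when reading the training loss, the sketch below computes categorical cross-entropy for a single example by hand. A model that spreads its probability evenly over all 89 characters scores ln(89), about 4.49, so a useful model should fall below that. The function here is a hypothetical pure-Python stand-in, not the Keras implementation.

```python
import math

def categorical_crossentropy(true_index, preds):
    # Loss for one example: negative log of the probability
    # that the model assigned to the true class.
    return -math.log(preds[true_index])

n_chars = 89
uniform_preds = [1.0 / n_chars] * n_chars
uniform_loss = categorical_crossentropy(0, uniform_preds)
print(round(uniform_loss, 2))

# A confident, correct prediction gives a much lower loss.
confident_loss = categorical_crossentropy(1, [0.05, 0.9, 0.05])
print(round(confident_loss, 2))
```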

Function to sample the next character given the model’s predictions

The code reweights the original probability distribution coming out of the model and draws a character index from it (the sampling function).

In [17]:
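def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature  # Rescale the log-probabilities by the temperature
    expPreds = np.exp(preds)
    preds = expPreds / np.sum(expPreds)  # Renormalize so the probabilities sum to 1
    probas = np.random.multinomial(1, preds, 1)  # Draw one sample from the reweighted distribution
    return np.argmax(probas)  # Index of the sampled character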
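
To see what the reweighting step does on its own, the sketch below applies it to a made-up three-class distribution without drawing a sample. The reweight helper and toy_preds are hypothetical. Low temperatures concentrate probability on the most likely class, and high temperatures flatten the distribution.

```python
import numpy as np

def reweight(preds, temperature=1.0):
    # Same arithmetic as the sampling function, minus the random draw.
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_preds = np.exp(logits)
    return exp_preds / np.sum(exp_preds)

toy_preds = [0.6, 0.3, 0.1]  # made-up model output
for temperature in [0.2, 0.5, 1.0, 2.0]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

At a temperature of 1.0 the distribution is returned unchanged, which is why that setting samples directly from the model's own predictions.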

In [18]:
import random, sys

The following loop repeatedly trains the model and generates text. After every epoch, it generates text using a range of different temperatures. This shows how the generated text evolves as the model begins to converge, as well as the impact of temperature on the sampling strategy.

In [21]:
for epoch in range(1, 61):  # The model is trained for 60 epochs (range excludes its upper bound)
    print('Epoch:', epoch)
    model.fit(x, y, batch_size=128, epochs=1)  # Fits the model for one pass over the data
    startIndex = random.randint(0, len(text) - maxLen - 1)
    generatedText = text[startIndex: startIndex + maxLen]  # The seed text is selected at random
    print('Generating Text with Seed: "' + generatedText + '"')

    for temperature in [0.2, 0.5, 1.0]:  # Range of sampling temperatures
        print('......Temperature:', temperature)
        sys.stdout.write(generatedText)

        for i in range(500):  # Generates 500 characters starting from the current text window
            sampled = np.zeros((1, maxLen, len(chars)))
            for t, char in enumerate(generatedText):  # One-hot encodes the current window
                sampled[0, t, charIndices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]  # Predicted distribution over the next character
            nextIndex = sample(preds, temperature)  # Samples the next character index
            nextChar = chars[nextIndex]

            generatedText += nextChar  # Appends the new character...
            generatedText = generatedText[1:]  # ...and drops the oldest so the window stays at maxLen

            sys.stdout.write(nextChar)

Epoch: 1
Generating Text with Seed: "instrument was patented, made and worked before any
other sa"
......Temperature: 0.2
instrument was patented, made and worked before any
other same and state and screen of the projector of the projector of the magic lantern of the screen in the magic lantern of the screen and screen of the fact and scheet and stanter of the real in the projection of the real the projection of the magic lantern of the projection of the real in the first the ending the motion pictures of the first in the projection of the projection of the projector and show and scheet and stampfer the magic lantern of the world the screen and the screen of the project gut......Temperature: 0.5
rn of the world the screen and the screen of the project gutenberg-tm intertance of the projector which would that the unine to the projector and the magic lantern best in the ports, schoot.

experiment of projector of the ending the antient stant and stanter, drence that the stered the project

the work on a theatres of a sun arrament was made or years in the screen of real the magic lantern as the origine and shadow plays were projector and a art which was the english in marce......Temperature: 1.0
lays were projector and a art which was the english in marcels device battocornatic notory instrument (1 on observe 7there one in 1886, experiments somephance.”

realy
ushaucal art.

back, th: as
the first also during glassing and a hund1. then was the cortion
the peleingly of
the geieninia was obfainion of the jesuntde of a proved
with a reporter at the
thearval exhibiting to teid.

in london defusion bescreid of was no records and _la know and thought made with
somethcodires wall action.

steel phirci of picture.. exhibiting pelacisascopes.

storme
in Epoch: 20
Generating Text with Seed: "r senses really work.

alhazen did valuable work himself but"
......Temperature: 0.2
r senses really work.

alhazen did valuable work himself but the model of the scientific work of the model o

In [None]:
# In the samples above, temperature 0.5 seems to give the best mix of realistic and creative sequences
# The generated text tends to become more coherent as the number of training epochs increases
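
In the loop above, the text window is carried over from one temperature to the next, which is why each temperature's output in the log starts where the previous one ended. If a side-by-side comparison from the same seed is preferred, one possible variant is sketched below. The generate helper, the predict_fn parameter and the uniform stand-in model are assumptions made for this sketch. With the trained network, predict_fn could be `lambda batch: model.predict(batch, verbose=0)`.

```python
import numpy as np

def sample(preds, temperature=1.0):
    # Same reweight-and-draw logic as the notebook's sampling function.
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    return np.argmax(np.random.multinomial(1, preds, 1))

def generate(predict_fn, seed, chars, char_indices, n_chars, temperature):
    # Hypothetical helper: generates n_chars characters from seed without
    # modifying the caller's seed string.
    window = seed
    generated = []
    for _ in range(n_chars):
        sampled = np.zeros((1, len(window), len(chars)))
        for t, char in enumerate(window):
            sampled[0, t, char_indices[char]] = 1.0
        preds = predict_fn(sampled)[0]
        next_char = chars[sample(preds, temperature)]
        generated.append(next_char)
        window = window[1:] + next_char  # slide the window forward
    return "".join(generated)

# Stand-in for a trained model: always predicts a uniform distribution.
toy_chars = ['a', 'b', 'c']
toy_indices = {char: i for i, char in enumerate(toy_chars)}

def uniform_predict(batch):
    return np.full((batch.shape[0], len(toy_chars)), 1.0 / len(toy_chars))

seed = "abca"
outputs = {}
for temperature in [0.2, 0.5, 1.0]:
    # Every temperature starts again from the same seed.
    outputs[temperature] = generate(uniform_predict, seed, toy_chars,
                                    toy_indices, 20, temperature)
    print(temperature, seed + outputs[temperature])
```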