# Long Short-Term Memory for Text Generation

This notebook uses an LSTM neural network to generate text, character by character, from Nietzsche's writings.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import time
import random
import sys
import io


import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import Sequential
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import LambdaCallback
from tensorflow.keras.utils import get_file



## Dataset

### Get the data
A dataset of Nietzsche's writings is available online. The following code downloads it and converts the text to lowercase.

In [2]:
path = get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt


### Visualize data

In [3]:
print('corpus length:', len(text))

corpus length: 600893


In [4]:
print(text[10:513])

supposing that truth is a woman--what then? is there not ground
for suspecting that all philosophers, in so far as they have been
dogmatists, have failed to understand women--that the terrible
seriousness and clumsy importunity with which they have usually paid
their addresses to truth, have been unskilled and unseemly methods for
winning a woman? certainly she has never allowed herself to be won; and
at present every kind of dogma stands with sad and discouraged mien--if,
indeed, it stands at all!


In [5]:
chars = sorted(list(set(text)))
# total number of distinct characters
print('total chars:', len(chars))

total chars: 57


### Clean data

We cut the text into overlapping sequences of `maxlen` characters, moving the window forward 3 characters at a time.
The features for each example form a one-hot matrix of size `maxlen` × `len(chars)`.
The label for each example is a one-hot vector of length `len(chars)` that represents the next character.
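
The windowing and one-hot encoding described above can be tried on a toy string first. This is a minimal sketch: `toy_text`, `toy_maxlen`, and `toy_step` are made-up values chosen only to keep the output small, not the notebook's actual settings.

```python
import numpy as np

toy_text = "hello world"        # hypothetical stand-in for the corpus
toy_maxlen, toy_step = 4, 3     # much smaller than the notebook's 40 and 3

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Slide a window of toy_maxlen characters over the text, toy_step at a time.
# The character right after each window is the prediction target.
windows, targets = [], []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

# One-hot encode: one row per time step, one column per distinct character.
toy_x = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, ch in enumerate(window):
        toy_x[i, t, toy_char_indices[ch]] = True
    toy_y[i, toy_char_indices[targets[i]]] = True

print(windows)                   # ['hell', 'lo w', 'worl']
print(targets)                   # ['o', 'o', 'd']
print(toy_x.shape, toy_y.shape)  # (3, 4, 8) (3, 8)
```

Each time step of each window has exactly one `True` entry, which is what makes the encoding "one-hot".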

In [6]:
# create (character, index) and (index, character) dictionary
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

In [7]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 200285
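
As a sanity check, the number of sequences can be derived from the corpus length alone. The sketch below assumes only the values printed above (corpus length 600893, `maxlen = 40`, `step = 3`).

```python
import math

corpus_length = 600893   # from the "corpus length" output above
maxlen = 40
step = 3

# Count the window start positions generated by the loop above.
n_sequences = len(range(0, corpus_length - maxlen, step))

# Equivalent closed form: ceil((corpus_length - maxlen) / step).
closed_form = math.ceil((corpus_length - maxlen) / step)

print(n_sequences, closed_form)  # 200285 200285
```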


In [8]:
print('Vectorization...')
# np.bool was deprecated in NumPy 1.20; the built-in bool is the drop-in replacement
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...


## The model

### Build the model

## 1. Model 1 - 128 LSTM units with 1 layer

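We need a recurrent layer with input shape `(maxlen, len(chars))` and a dense layer with output size `len(chars)`. The softmax activation turns the dense output into a probability distribution over the next character.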

In [9]:
# Build the model
model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(maxlen, len(chars))),
    tf.keras.layers.Dense(len(chars), activation='softmax')
])
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

### Inspect the model

Use the `.summary` method to print a simple description of the model

In [10]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128)               95232     
                                                                 
 dense (Dense)               (None, 57)                7353      
                                                                 
Total params: 102,585
Trainable params: 102,585
Non-trainable params: 0
_________________________________________________________________
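
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the usual LSTM layout of four gates, each with an input kernel, a recurrent kernel, and a bias; `lstm_params` and `dense_params` are illustrative helpers, not Keras functions.

```python
def lstm_params(input_dim, units):
    # Four gates (input, forget, cell candidate, output), each with
    # input weights, recurrent weights, and one bias per unit.
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    # One weight per (input, output) pair plus one bias per output.
    return input_dim * units + units

n_chars = 57  # from the "total chars" output above

lstm_count = lstm_params(n_chars, 128)
dense_count = dense_params(128, n_chars)
print(lstm_count, dense_count, lstm_count + dense_count)  # 95232 7353 102585
```

The same helpers also account for a stacked model: a second LSTM layer takes the first layer's `units` as its `input_dim`.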


### Train the model

In [11]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
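
To see what the `temperature` argument does, the rescaling step of `sample` can be isolated and applied to a small distribution. In this sketch `apply_temperature` is a hypothetical helper that repeats only the rescaling lines of `sample`, and `toy_probs` is a made-up distribution.

```python
import numpy as np

def apply_temperature(preds, temperature):
    # Same rescaling as in sample(): divide the log-probabilities by the
    # temperature, then renormalize so the result sums to 1.
    preds = np.asarray(preds, dtype='float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

toy_probs = [0.6, 0.3, 0.1]  # hypothetical next-character probabilities

for temperature in (0.5, 1.0, 2.0):
    print(temperature, np.round(apply_temperature(toy_probs, temperature), 3))
```

A temperature below 1 sharpens the distribution toward the most likely character, a temperature of 1 leaves it unchanged, and a temperature above 1 flattens it. This is why the diversity 0.5 samples are more repetitive and the diversity 1.0 samples more erratic.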

In [12]:
class PrintLoss(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, _):
        # Function invoked at end of each epoch. Prints generated text.
        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.5, 1.0]:
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index: start_index + maxlen]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            sys.stdout.write(generated)

            for i in range(400):
                x_pred = np.zeros((1, maxlen, len(chars)))
                for t, char in enumerate(sentence):
                    x_pred[0, t, char_indices[char]] = 1.

                preds = model.predict(x_pred, verbose=0)[0]
                next_index = sample(preds, diversity)
                next_char = indices_char[next_index]

                sentence = sentence[1:] + next_char

                sys.stdout.write(next_char)
                sys.stdout.flush()
            print()

In [13]:
EPOCHS = 60
BATCH = 128

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)

history = model.fit(x, y,
                    batch_size = BATCH,
                    epochs = EPOCHS,
                    validation_split = 0.2,
                    verbose = 1,
                    callbacks = [early_stop, PrintLoss()])

Epoch 1/60
----- Generating text after Epoch: 0
----- diversity: 0.5
----- Generating with seed: "--the greeks, for
instance, were a natio"
--the greeks, for
instance, were a natiot hhe the sons whit an the the ther andertithe tor
ftor wire be the whe hore he wore pererand of the matres fhat of whers the frarit to the thit be the med we there the wimo the the for whe to the ang as wall fhan the the the bhas win he sang and remenesons and sore the the asd ofre dere be cor mere the soun the the remong and thit theis be the mone longtithe sor inmesby fpered and bere and the in
----- diversity: 1.0
----- Generating with seed: "--the greeks, for
instance, were a natio"
--the greeks, for
instance, were a natiols, da re9atwis tpecy ociphi dosikg iope taeli! limand weds. ho ckere tooverhoe ace h pnde none"s ly érifslty entiws be ale of mhe rorerod"
s, mosl
orese toy an me fgre hare bh puinhelb rebsondygang the. tiny mp sotell -fmemgen the farl prof be greandiss bpese fosptutfrrpern ant, wur in
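
The training cell passes `EarlyStopping(monitor='val_loss', patience=2)`, so training can end before the requested number of epochs. The sketch below is a simplified pure-Python model of that patience rule, not the Keras implementation; `simulate_early_stopping` is a hypothetical helper and the validation losses are invented for illustration.

```python
def simulate_early_stopping(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember the new best loss and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: stop once this has happened `patience` times in a row.
            wait += 1
            if wait >= patience:
                return epoch
    return None

made_up_losses = [2.0, 1.8, 1.7, 1.75, 1.72, 1.6]  # invented, not from this notebook
print(simulate_early_stopping(made_up_losses))  # 4
```

In this invented run the loss at epoch 5 would have been a new best, but training has already stopped at epoch 4. A small `patience` trades some potential improvement for shorter training.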

## 2. Model 2 - 256 LSTM units with 2 layers


In [31]:
# define LSTM model
model = Sequential()
# return_sequences=True passes the full sequence of hidden states to the next LSTM layer
model.add(tf.keras.layers.LSTM(256, input_shape=(maxlen, len(chars)), return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(256))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(len(chars), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')



In [32]:
model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_3 (LSTM)               (None, 40, 256)           321536    
                                                                 
 dropout (Dropout)           (None, 40, 256)           0         
                                                                 
 lstm_4 (LSTM)               (None, 256)               525312    
                                                                 
 dropout_1 (Dropout)         (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 57)                14649     
                                                                 
Total params: 861,497
Trainable params: 861,497
Non-trainable params: 0
_________________________________________________________________


In [35]:
EPOCHS = 30
BATCH = 64

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)

history = model.fit(x, y,
                    batch_size = BATCH,
                    epochs = EPOCHS,
                    validation_split = 0.2,
                    verbose = 1,
                    callbacks = [early_stop, PrintLoss()])

Epoch 1/30
----- Generating text after Epoch: 0
----- diversity: 0.5
----- Generating with seed: "y he would then
have repulsed somewhat t"
y he would then
have repulsed somewhat the repulled of the sours are that his wall on the reality the sare and allope thith the sencestesting to the amen as the many as itserfing to the shith be dith the somst of the seress and the self the saper of the parly and some and gelinge to one the strever and has has a prearidion, the which as the alr the stind of the prosulion, and and renours and it which the mevility and and a the reves the
----- diversity: 1.0
----- Generating with seed: "y he would then
have repulsed somewhat t"
y he would then
have repulsed somewhat the toal thay thisk betres actainht abbucies ove scensed of couthicivor: "purtedidy,
ceater,, a pronocabitly. and mil, hau4h the mash his apve
ave
ucwistss, with nitfrety,
who speniany ven exseng
mphist utlahspind exhired be hin
sorcen to loulth a per in) he it ally uncirlesudes"-.ofe.s 

## 3. Model 3 - 512 LSTM units with 2 layers - Best Performance

In [15]:
# define LSTM model: two stacked LSTM layers of 512 units, each followed by dropout
model = Sequential()
model.add(tf.keras.layers.LSTM(512, input_shape=(maxlen, len(chars)), return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(512))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(len(chars), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')



In [16]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_8 (LSTM)               (None, 40, 512)           1167360   
                                                                 
 dropout_6 (Dropout)         (None, 40, 512)           0         
                                                                 
 lstm_9 (LSTM)               (None, 512)               2099200   
                                                                 
 dropout_7 (Dropout)         (None, 512)               0         
                                                                 
 dense_1 (Dense)             (None, 57)                29241     
                                                                 
Total params: 3,295,801
Trainable params: 3,295,801
Non-trainable params: 0
_________________________________________________________________


In [19]:
EPOCHS = 10
BATCH = 128

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)

history = model.fit(x, y,
                    batch_size = BATCH,
                    epochs = EPOCHS,
                    validation_split = 0.2,
                    verbose = 1,
                    callbacks = [early_stop, PrintLoss()])

Epoch 1/10
----- Generating text after Epoch: 0
----- diversity: 0.5
----- Generating with seed: "ions under which, climatically and
hered"
ions under which, climatically and
hered the conderetice more the compile engiinly and the destition encemand for the pols the mank for the meyule the simed of the comfall, the filling the the be compouling of the susces of the ape the comsting the mare the mant is the
mance and sulest the misuling the secict of made meare
the his compally and a repearing the composion the his is the ale the suls the more a commany the sorle the sele th
----- diversity: 1.0
----- Generating with seed: "ions under which, climatically and
hered"
ions under which, climatically and
hered; in deorr more. im; as itseures ther, his so foray,-"there the canlcentive? fithon their belpent" the "mophand
freoth a kay oph his con-is the
memt, mod mishtul the evecal: lithars mederded lymiaces the.
tiensly te recomed
whomed, and
mears
wricatly necedolt mothir
comned
lorong themen

The model improves as the number of training epochs and the number of LSTM units increase.