# Long Short-term Memory for Text Generation

This notebook trains an LSTM neural network on Nietzsche's writings and uses it to generate text one character at a time.

The training dataset is available online at https://s3.amazonaws.com/text-datasets/nietzsche.txt

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import time
import random
import sys
import io


import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import LambdaCallback
from tensorflow.keras.utils import get_file

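# import from tensorflow.keras throughout, rather than mixing it with the standalone keras package
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Dropout, LSTM, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D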

## Dataset

### Get the data
The dataset of Nietzsche's writings is available online. The following code downloads it and reads it as lowercase text.

In [None]:
path = get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt


### Visualize data

In [None]:
print('corpus length:', len(text))

corpus length: 600893


In [None]:
print(text[10:513])

supposing that truth is a woman--what then? is there not ground
for suspecting that all philosophers, in so far as they have been
dogmatists, have failed to understand women--that the terrible
seriousness and clumsy importunity with which they have usually paid
their addresses to truth, have been unskilled and unseemly methods for
winning a woman? certainly she has never allowed herself to be won; and
at present every kind of dogma stands with sad and discouraged mien--if,
indeed, it stands at all!


In [None]:
chars = sorted(list(set(text)))
# total number of distinct characters
print('total chars:', len(chars))

total chars: 57


### Clean data

We cut the text into overlapping sequences of `maxlen` characters, starting a new sequence every 3 characters.
The features for each example are a one-hot matrix of size `maxlen` x number of distinct characters.
The label for each example is a one-hot vector of length number of distinct characters, which represents the character that follows the sequence.
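To make the windowing and one-hot encoding concrete, here is a minimal sketch on a toy string. The names `toy_text`, `toy_maxlen`, and `toy_step` and their values are illustrative assumptions, not the settings used in this notebook.

```python
import numpy as np

# Toy corpus and window settings (illustrative values only).
toy_text = "hello world"
toy_maxlen = 4
toy_step = 3

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Slide a window of toy_maxlen characters over the text,
# moving toy_step characters at a time; the label is the next character.
windows = []
targets = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

# One-hot encode: x_toy[i, t, k] is True when character k sits at position t of window i.
x_toy = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, char in enumerate(window):
        x_toy[i, t, toy_char_indices[char]] = True
    y_toy[i, toy_char_indices[targets[i]]] = True

print(windows)                   # ['hell', 'lo w', 'worl']
print(targets)                   # ['o', 'o', 'd']
print(x_toy.shape, y_toy.shape)  # (3, 4, 8) (3, 8)
```

Each row of a feature matrix contains exactly one `True`, so `x_toy` holds one `True` per character of every window.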

In [None]:
# create (character, index) and (index, character) dictionary
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

In [None]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 200285
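The sequence count follows directly from the corpus length, `maxlen`, and `step`. The sketch below reproduces it from the numbers printed above and also estimates the size of the one-hot feature array, assuming one byte per boolean element (which is how NumPy stores `bool`).

```python
import math

corpus_length = 600893  # printed corpus length
maxlen = 40
step = 3

# The loop visits start positions 0, 3, 6, ... below corpus_length - maxlen.
n_sequences = len(range(0, corpus_length - maxlen, step))
assert n_sequences == math.ceil((corpus_length - maxlen) / step)
print(n_sequences)  # 200285

# Rough size of the feature array: sequences x maxlen x distinct characters.
n_chars = 57
x_bytes = n_sequences * maxlen * n_chars
print(round(x_bytes / 1e6), "MB")  # 457 MB
```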


In [None]:
print('Vectorization...')
# Use the built-in bool: the np.bool alias was deprecated in NumPy 1.20.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    # one-hot encode every character of the input sequence
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    # one-hot encode the character that follows the sequence
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...


## The model

### Build the model

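We need a recurrent layer with input shape `(maxlen, len(chars))` and a dense layer with output size `len(chars)`. The softmax activation on the dense layer turns its output into a probability distribution over the next character.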

In [None]:
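def build_lstm_model(optimizer):
    lstm = Sequential()
    # the recurrent layer reads maxlen one-hot encoded characters
    lstm.add(LSTM(128, input_shape=(maxlen, len(chars))))
    lstm.add(layers.BatchNormalization())
    lstm.add(Dropout(0.1))
    # softmax over the vocabulary gives the next-character distribution
    lstm.add(Dense(len(chars), activation='softmax'))
    # pass the optimizer explicitly; otherwise compile() falls back to its default
    lstm.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return lstm

In [None]:
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model = build_lstm_model(optimizer)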

### Inspect the model

Use the `.summary()` method to print a short description of the model's layers and parameter counts.

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128)               95232     
                                                                 
 batch_normalization (BatchN  (None, 128)              512       
 ormalization)                                                   
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense (Dense)               (None, 57)                7353      
                                                                 
Total params: 103,097
Trainable params: 102,841
Non-trainable params: 256
_________________________________________________________________
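The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for LSTM, batch normalization, and dense layers; the helper function names are hypothetical and exist only for this illustration.

```python
def lstm_param_count(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel, and a bias.
    return 4 * (units * (input_dim + units) + units)


def batchnorm_param_count(features):
    # gamma and beta are trainable; moving mean and variance are not.
    return 4 * features


def dense_param_count(input_dim, units):
    # Weight matrix plus bias vector.
    return input_dim * units + units


n_chars = 57   # distinct characters in the corpus
units = 128    # LSTM units

lstm_params = lstm_param_count(n_chars, units)
bn_params = batchnorm_param_count(units)
dense_params = dense_param_count(units, n_chars)
total_params = lstm_params + bn_params + dense_params
non_trainable = 2 * units  # moving mean and moving variance

print(lstm_params, bn_params, dense_params)  # 95232 512 7353
print(total_params, total_params - non_trainable, non_trainable)  # 103097 102841 256
```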


### Train the model

In [None]:
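def sample(preds, temperature=1.0):
    # Sample a character index from the probability array `preds`.
    # Temperatures below 1 sharpen the distribution; temperatures above 1 flatten it.
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # draw a single one-hot sample and return the index of the chosen character
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)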
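To see what the temperature does before any random draw happens, the sketch below applies the same rescaling as `sample` to a small hand-picked distribution. The helper `apply_temperature` and the values in `toy_probs` are assumptions made for illustration.

```python
import numpy as np


def apply_temperature(preds, temperature=1.0):
    # Same rescaling as in sample(), without the random draw.
    preds = np.asarray(preds, dtype="float64")
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)


toy_probs = [0.5, 0.3, 0.2]
for temperature in (0.5, 1.0, 2.0):
    rescaled = apply_temperature(toy_probs, temperature)
    print(temperature, np.round(rescaled, 3))
```

At temperature 1.0 the distribution is unchanged. At 0.5 each probability is squared and renormalized, so the most likely character gains mass; at 2.0 the square root is taken instead, which moves the distribution toward uniform. This is why the lower diversity value tends to produce more repetitive text.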

In [None]:
tf.random.set_seed(14)
class PrintLoss(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # Invoked at the end of each epoch. Despite the class name, this prints
        # text generated by the current model rather than the loss.
        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.5, 1.0]:
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index: start_index + maxlen]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            sys.stdout.write(generated)

            for i in range(400):
                x_pred = np.zeros((1, maxlen, len(chars)))
                for t, char in enumerate(sentence):
                    x_pred[0, t, char_indices[char]] = 1.

                # self.model is the model this callback is attached to
                preds = self.model.predict(x_pred, verbose=0)[0]
                next_index = sample(preds, diversity)
                next_char = indices_char[next_index]

                sentence = sentence[1:] + next_char

                sys.stdout.write(next_char)
                sys.stdout.flush()
            print()
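The generation loop in the callback can be read in isolation: encode the current window, predict a distribution, pick a character, then slide the window forward by one. The sketch below isolates that loop, with a hypothetical stand-in `cyclic_predict` in place of `model.predict` so that it runs without a trained network. The function `generate` and the toy alphabet are assumptions for illustration only.

```python
import numpy as np


def generate(predict_fn, seed_text, char_indices, indices_char, length, rng):
    maxlen = len(seed_text)
    n_chars = len(char_indices)
    sentence = seed_text
    generated = seed_text
    for _ in range(length):
        # One-hot encode the current window of maxlen characters.
        x_pred = np.zeros((1, maxlen, n_chars))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_indices[char]] = 1.0
        # Predict the next-character distribution and draw from it.
        preds = predict_fn(x_pred)[0]
        next_index = int(rng.choice(n_chars, p=preds))
        next_char = indices_char[next_index]
        # Slide the window: drop the oldest character, append the new one.
        sentence = sentence[1:] + next_char
        generated += next_char
    return generated


toy_chars = ["a", "b", "c"]
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}


def cyclic_predict(x_pred):
    # Hypothetical stand-in for model.predict: puts all probability on the
    # character that follows the last input character in the toy alphabet.
    last_index = int(np.argmax(x_pred[0, -1]))
    preds = np.zeros((1, len(toy_chars)))
    preds[0, (last_index + 1) % len(toy_chars)] = 1.0
    return preds


rng = np.random.default_rng(0)
result = generate(cyclic_predict, "ab", toy_char_indices, toy_indices_char, 5, rng)
print(result)  # abcabca
```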

In [None]:
EPOCHS = 10
BATCH = 128

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)

history = model.fit(x, y,
                    batch_size = BATCH,
                    epochs = EPOCHS,
                    validation_split = 0.2,
                    verbose = 1,
                    callbacks = [early_stop, PrintLoss()])

Epoch 1/10
----- Generating text after Epoch: 0
----- diversity: 0.5
----- Generating with seed: "n
"prudently and apart." wisdom: that se"
n
"prudently and apart." wisdom: that sect
ch thon thelly
ch the
ch
the

full ns

xpllict ll
the th


xpllonthy

noll

xprelly
chall

nsthe

folly

hall

xplly
hinl

lf
ch the
ch
phinsthin the pr noplly
phis
ch
chimy lly
knon

hilly
nct

no
pllly
nctlly
blly
lf
the
plecthes

xthepliclly
thiclly

f
cullinclly
nolly; the hich

hin lly
chint

nthe thom

xplly
ch the ss no
lly ff
ch
phiclly

xpllinthin

xpsontint

fullly
thiclly,
thilly
n


----- diversity: 1.0
----- Generating with seed: "n
"prudently and apart." wisdom: that se"
n
"prudently and apart." wisdom: that se
sh ll
": fully tisn the
btelf-lfolly
ins lf rs lid

nllly ny

ha
llymplolly
thecly
hons ésf
llken phy
[5f
ll ghilléfinl oncull,
;
_
ch. lf ll

knont ly

xpliplonly
bllonchi(ll
pherysten
t; nl thely hirplyglett eht

furll hy ns
hy lly thot
chyce?; n

hophillsy

pllftunck
non: "s

pplinc