# Harshraj Jadeja

# Long Short-Term Memory for Text Generation

This notebook uses an LSTM neural network to generate text in the style of Nietzsche's writings.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import time
import random
import sys
import io
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import LambdaCallback
from tensorflow.keras.utils import get_file

## Dataset

### Get the data
A dataset of Nietzsche's writings is available online. The following code downloads it and reads it as lowercase text.

In [2]:
path = get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()

### Visualize data

In [3]:
print('corpus length:', len(text))

corpus length: 600893


In [4]:
print(text[10:513])

supposing that truth is a woman--what then? is there not ground
for suspecting that all philosophers, in so far as they have been
dogmatists, have failed to understand women--that the terrible
seriousness and clumsy importunity with which they have usually paid
their addresses to truth, have been unskilled and unseemly methods for
winning a woman? certainly she has never allowed herself to be won; and
at present every kind of dogma stands with sad and discouraged mien--if,
indeed, it stands at all!


In [5]:
chars = sorted(list(set(text)))
# total number of unique characters
print('total chars:', len(chars))

total chars: 57


### Clean data

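We cut the text into overlapping sequences of `maxlen` characters, advancing the window by 3 characters each time.
The features for each example form a matrix of size `maxlen` × number of unique characters, with one one-hot row per character.
The label for each example is a one-hot vector whose length is the number of unique characters; it represents the character that follows the sequence.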
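
To make these shapes concrete, here is a minimal sketch of the same windowing and one-hot encoding on a toy string. The names `toy_text`, `toy_maxlen`, and `toy_step` are illustrative and not part of the notebook, and plain Python lists stand in for the NumPy arrays used in the cells that follow.

```python
# Toy illustration of the windowing and one-hot encoding described above.
# toy_text, toy_maxlen and toy_step are hypothetical values chosen for readability.
toy_text = "hello world"
toy_maxlen = 4
toy_step = 3

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Slide a window of toy_maxlen characters over the text, moving toy_step at a time.
windows = []
targets = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])


def one_hot(char):
    """Return a list of 0/1 values with a single 1 at the character's index."""
    vec = [0] * len(toy_chars)
    vec[toy_char_indices[char]] = 1
    return vec


# Features: one (toy_maxlen x number of characters) matrix per window.
toy_x = [[one_hot(c) for c in w] for w in windows]
# Labels: one vector of length (number of characters) per window.
toy_y = [one_hot(c) for c in targets]

print(windows)   # ['hell', 'lo w', 'worl']
print(targets)   # ['o', 'o', 'd']
print(len(toy_x), len(toy_x[0]), len(toy_x[0][0]))  # 3 4 8
```

Each window overlaps the previous one by `toy_maxlen - toy_step` characters, which is why the notebook's code calls the sequences "semi-redundant".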

In [6]:
# create (character, index) and (index, character) dictionary
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

In [7]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 200285
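
As a sanity check, the sequence count above can be reproduced from the corpus length alone. This is a small sketch using only the numbers printed in this notebook (600893 characters, `maxlen = 40`, `step = 3`); the variable names are illustrative.

```python
import math

# Values taken from the notebook output and cell In [7].
corpus_length = 600893
window_length = 40   # maxlen
stride = 3           # step

# range(0, corpus_length - window_length, stride) yields this many start indices.
expected_sequences = math.ceil((corpus_length - window_length) / stride)
print(expected_sequences)  # 200285
```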


In [8]:
print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool_)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool_)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...


## The model

### Build the model

We need a recurrent layer with input shape `(maxlen, len(chars))` and a dense layer with output size `len(chars)`.

In [9]:
# Define the number of units in the LSTM layer.
# This is a hyperparameter that represents the dimensionality of the output space.
# More units can allow the model to capture more complex patterns but also increases computational complexity.
lstm_units = 128  # Adjust this number based on the complexity of the task and computational constraints.

# Initialize the Sequential model
model = tf.keras.Sequential([
    # Add an LSTM layer as the first layer of the model
    # input_shape is required on the first layer of the model so it knows what input shape to expect
    # Here, input_shape=(maxlen, len(chars)) means each input sequence will be of length 'maxlen'
    # and each character in the sequence is represented as a one-hot encoded vector of length 'len(chars)'
    tf.keras.layers.LSTM(lstm_units, input_shape=(maxlen, len(chars))),
    
    # Add a Dense output layer
    # The number of units equals the number of unique characters (len(chars))
    # This is because we want to output a probability distribution over all possible characters
    # Softmax activation function is used to output probabilities
    tf.keras.layers.Dense(len(chars), activation='softmax'),
])

# Compile the model
# 'categorical_crossentropy' is used as the loss function since this is a multi-class classification problem
# The 'adam' optimizer is an adaptive-learning-rate variant of stochastic gradient descent
# Accuracy is monitored as a metric to observe the performance of the model during training
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Display the model's architecture
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128)               95232     
                                                                 
 dense (Dense)               (None, 57)                7353      
                                                                 
Total params: 102585 (400.72 KB)
Trainable params: 102585 (400.72 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


### Inspect the model

Use the `.summary` method to print a simple description of the model.

In [10]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128)               95232     
                                                                 
 dense (Dense)               (None, 57)                7353      
                                                                 
Total params: 102585 (400.72 KB)
Trainable params: 102585 (400.72 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
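
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the default Keras LSTM layout (four gates, each with an input kernel, a recurrent kernel, and a bias) and a dense layer with a bias term. The helper names are illustrative and are not Keras functions.

```python
def lstm_param_count(input_dim, units):
    """Parameters of an LSTM layer: 4 gates x (input kernel + recurrent kernel + bias)."""
    return 4 * ((input_dim + units) * units + units)


def dense_param_count(input_dim, units):
    """Parameters of a dense layer: weight matrix plus one bias per unit."""
    return input_dim * units + units


num_chars = 57      # 'total chars' printed earlier in the notebook
lstm_units = 128    # value used in the model definition

lstm_params = lstm_param_count(num_chars, lstm_units)
dense_params = dense_param_count(lstm_units, num_chars)

print(lstm_params)                 # 95232
print(dense_params)                # 7353
print(lstm_params + dense_params)  # 102585
```

These match the `Param #` column and the total reported by `model.summary()`.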


### Train the model

In [11]:
def sample(preds, temperature=1.0):
    # Helper function to sample an index from a probability array.
    # temperature < 1 sharpens the distribution; temperature > 1 flattens it.
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw one multinomial sample (a one-hot vector) and return the chosen index.
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
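
To see what the temperature does before any sampling happens, the following sketch applies the same log, divide, exponentiate, and renormalize steps to a small made-up distribution. The function name `apply_temperature` and the example probabilities are assumptions for illustration; the standard library `math` module is used here instead of NumPy to keep the example dependency-free.

```python
import math


def apply_temperature(probs, temperature=1.0):
    """Rescale a probability distribution the way `sample` does, without drawing."""
    scaled = [math.log(p) / temperature for p in probs]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical model output over three characters.
example_probs = [0.5, 0.3, 0.2]

for t in (0.5, 1.0, 2.0):
    rescaled = apply_temperature(example_probs, t)
    print(t, [round(p, 3) for p in rescaled])
```

A temperature below 1 concentrates probability on the most likely character, which gives more conservative text. A temperature above 1 spreads probability more evenly, which gives more varied text with more errors. A temperature of exactly 1 leaves the distribution unchanged.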

In [12]:
class PrintLoss(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # Invoked by Keras at the end of each epoch. Prints text generated by the current model.
        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.5, 1.0]:
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index: start_index + maxlen]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            sys.stdout.write(generated)

            for i in range(400):
                x_pred = np.zeros((1, maxlen, len(chars)))
                for t, char in enumerate(sentence):
                    x_pred[0, t, char_indices[char]] = 1.

                preds = self.model.predict(x_pred, verbose=0)[0]
                next_index = sample(preds, diversity)
                next_char = indices_char[next_index]

                sentence = sentence[1:] + next_char

                sys.stdout.write(next_char)
                sys.stdout.flush()
            print()
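
The generation loop inside the callback can also be read in isolation: predict a distribution for the current window, sample one character, append it, and slide the window forward by one. The sketch below separates that loop from Keras. `generate_text`, `predict_fn`, and `cyclic_predict` are hypothetical names, and `predict_fn` stands in for the one-hot encoding, `model.predict`, and temperature steps used in the notebook.

```python
import random


def generate_text(seed, predict_fn, char_indices, indices_char, length=50, rng=None):
    """Generate `length` characters by repeatedly predicting and sliding the window.

    predict_fn is a hypothetical callable: it receives the current window as a
    list of character indices and returns one probability per character.
    """
    rng = rng or random.Random()
    window = seed
    generated = []
    for _ in range(length):
        encoded = [char_indices[c] for c in window]
        probs = predict_fn(encoded)
        # Sample the next character index in proportion to its probability.
        next_index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        next_char = indices_char[next_index]
        generated.append(next_char)
        # Drop the oldest character and append the new one.
        window = window[1:] + next_char
    return seed + "".join(generated)


# Stand-in vocabulary and predictor, used only to exercise the loop.
demo_chars = ["a", "b", "c"]
demo_char_indices = {c: i for i, c in enumerate(demo_chars)}
demo_indices_char = {i: c for i, c in enumerate(demo_chars)}


def cyclic_predict(encoded_window):
    """Put all probability on the character that follows the last one in the window."""
    probs = [0.0] * len(demo_chars)
    probs[(encoded_window[-1] + 1) % len(demo_chars)] = 1.0
    return probs


print(generate_text("ab", cyclic_predict, demo_char_indices, demo_indices_char, length=4))
# abcabc
```

Because the window always keeps the same length as the seed, the model sees a fixed-size input at every step, which matches the `(maxlen, len(chars))` input shape it was trained with.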

In [13]:
EPOCHS = 60
BATCH = 128

early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)

history = model.fit(x, y,
                    batch_size=BATCH,
                    epochs=EPOCHS,
                    validation_split=0.2,
                    verbose=1,
                    callbacks=[early_stop, PrintLoss()])

Epoch 1/60
----- Generating text after Epoch: 0
----- diversity: 0.5
----- Generating with seed: "r rage i even took a fancy to them, thes"
r rage i even took a fancy to them, thes
ang ristion so it optill susteveve wes ace sore and tha the ce the peroce wher and an the the the cher an that ont as and the ching an sery the the are he than and of of the and and on the to tist ang mat the coma morte fithist and in the teres the pererin the the the the hathere fond what
eand than wan on ang of be pathe
the fof ant on an the thar prase and or an an whe ther the is erere the the
----- diversity: 1.0
----- Generating with seed: "r rage i even took a fancy to them, thes"
r rage i even took a fancy to them, thes tn coeg rngol6 unon fasity tue
s
aspronct, net riot wo a
d ovef er. tho wades micanderevenonsbire t: sf of bonesd. int arsyyand the
 rarese fe masts gare heap-pefingerot, an onke-
ehyeadn s ans  ana owoum inte,roagnodegisof an ys pore pawof maaandestixad that inve fiatf
ins. falet eng 

he depth as of the mould, something uncome they but the inderers, and every of the store prespors and radience and are the suble and question sull the poncention all to be the fact the arist the constice, and gerition of the mest of the care in any and mistion of the secistion of the called and and siffection on oplers and as of mate the seavious and the portion of which there ind
concertion of a been and who hought that you he wall not
----- diversity: 1.0
----- Generating with seed: "he depth as of the mould, something unco"
he depth as of the mould, something uncol hanger great so acterded to the every, thar all thak faker as conterwe,bre womod and worctised.=itivations-"it "hos prestivigatly be be sinte notce and beiphy to the kind hack---my belile wherm aros to dove with still of
pasious "buc no it ald be ginfusyons beevy
and is reforel margos dusun andighy of
that it pleation watherelos, muge to be amoriclly freal fuirally and peilusal-somensp poupo or 
Epoch 8/60
----- Generatin

d the same thing to them; such only has been all that the man believed in really the good of the more prance of the responstion which is even the self-mendered to the concertation and nothing to the respibely and reficent and detirntated the about the philosopher to the more all the good mankind make their more also and all the signal and the pertance, the sounted there is not such all the read despirity of the highest and present, the 
----- diversity: 1.0
----- Generating with seed: "d the same thing to them; such only has "
d the same thing to them; such only has predenaby, sapsess,"

280. the prysons the pospticame detion, the philationness. it
must-do withwathing that canned for under commanicy afferman pressions--itsilitation, but to
foin,
of the rudicated,
filly the
idees, beloghined
and peremptiblaterogy?--188. and to a claced time for abligit
blacter, agalus insolarality; in reaction of the abowit indighlung, remath asy they of leadmany for what even
Epoch 20/60
----- Generati