<a href="https://colab.research.google.com/github/mahn-bonnie/Generative-AI-Series/blob/main/LSTM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



**Text Generation using a Recurrent Long Short-Term Memory Network**

In [11]:
# Conceptually, training starts by mapping each character that appears in the
# training text to a unique integer index.
# Each character is then one-hot encoded into a vector, which is the input
# format the network expects.
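
As a rough illustration of the two steps described above, the sketch below maps the characters of a toy string to integer indices and one-hot encodes them with NumPy. The string `toy_text` and the names `toy_vocab`, `char_to_index`, and `one_hot` are made up for this example; the notebook's own versions appear in Steps 3 and 4.

```python
import numpy as np

toy_text = "hello"  # hypothetical sample, not the article text

# Sorted unique characters: ['e', 'h', 'l', 'o']
toy_vocab = sorted(set(toy_text))
char_to_index = {c: i for i, c in enumerate(toy_vocab)}

# One row per character, one column per vocabulary entry
one_hot = np.zeros((len(toy_text), len(toy_vocab)), dtype=np.bool_)
for t, char in enumerate(toy_text):
    one_hot[t, char_to_index[char]] = True

print(char_to_index)
print(one_hot.astype(int))
```

Each row contains exactly one `1`, in the column of the character at that position.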

**Step 1: Importing the required libraries**

In [12]:
from __future__ import absolute_import, division, print_function, unicode_literals

import random
import sys

import numpy as np
import tensorflow as tf

# Import Keras through tensorflow.keras throughout, matching the later cells
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, LSTM
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import LambdaCallback, ModelCheckpoint, ReduceLROnPlateau

**Step 2: Loading the data into a string**

In [13]:

# Reading the text file into a string
with open('article1.txt', 'r') as file:
	text = file.read()

# A preview of the text file
print(text)


﻿Adam Ferguson sensed the moment was right: "Everyone was simply waiting but there was an air of suspense, too, with people looking off in different directions. I made the decision, on the spot, that this was going to be it." Spanning four complete pages inside Sunday's special section, "The Road to Nowhere," is a single photograph, shot by Mr. Ferguson, depicting about 100 people waiting in line at a food distribution center outside Diffa, Niger. Each of the picture's subjects had fled from a series of horrors, landing in one of the many makeshift settlements scattered alongside National Route 1, a stretch of truncated highway that is now one of the only places in the region to offer refuge from the terrorist group Boko Haram. How did the 47-by-20-inch photograph, one of the largest if not the largest images ever published in The New York Times, come together? "With great difficulty, is the short answer," Mr. Ferguson said. The Australian photographer was sent to Niger's Route 1 

**Step 3: Creating a mapping from each unique character in the text to a unique number**

In [14]:

# Storing all the unique characters present in the text
vocabulary = sorted(set(text))

# Creating dictionaries to map each character to an index and back
char_to_indices = {c: i for i, c in enumerate(vocabulary)}
indices_to_char = {i: c for i, c in enumerate(vocabulary)}

print(vocabulary)

[' ', '"', "'", '(', ')', ',', '-', '.', '0', '1', '2', '4', '7', '8', ':', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'H', 'I', 'L', 'M', 'N', 'R', 'S', 'T', 'W', 'Y', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'y', 'z', '\ufeff']
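
The last entry in the printed vocabulary, `'\ufeff'`, is a Unicode byte order mark (BOM) left at the start of the file rather than a real character of the article. One possible way to keep it out of the vocabulary, assuming the file is UTF-8 encoded, is to read it with Python's `utf-8-sig` codec, which drops a leading BOM. The sketch below demonstrates this on a temporary file (`demo.txt` is a made-up name) rather than on `article1.txt`.

```python
import os
import tempfile

# Write a small UTF-8 file that starts with a BOM (hypothetical demo file)
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("\ufeffAdam")

# Plain utf-8 keeps the BOM as the first character
with open(demo_path, "r", encoding="utf-8") as f:
    with_bom = f.read()

# utf-8-sig strips a leading BOM if one is present
with open(demo_path, "r", encoding="utf-8-sig") as f:
    without_bom = f.read()

print(sorted(set(with_bom)))     # includes '\ufeff'
print(sorted(set(without_bom)))  # ['A', 'a', 'd', 'm']
```

Reading the article this way would shrink the vocabulary by one entry, so the outputs recorded in this notebook would no longer match exactly.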


**Step 4: Pre-processing the data**

In [15]:
import numpy as np

# Dividing the text into overlapping subsequences of max_length characters,
# starting every `steps` characters. Each subsequence is one training input
# and the character that follows it is the target to predict
max_length = 100
steps = 5
sentences = []
next_chars = []
for i in range(0, len(text) - max_length, steps):
    sentences.append(text[i: i + max_length])
    next_chars.append(text[i + max_length])

# One-hot encoding each character into a boolean vector
X = np.zeros((len(sentences), max_length, len(vocabulary)), dtype = np.bool_)
y = np.zeros((len(sentences), len(vocabulary)), dtype = np.bool_)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_to_indices[char]] = 1
    y[i, char_to_indices[next_chars[i]]] = 1
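
To make the windowing concrete, here is a small sketch on a toy string with a shorter window. The values `toy_text`, `window`, and `stride` are illustrative stand-ins for `text`, `max_length`, and `steps`; they are not the settings used in this notebook.

```python
toy_text = "abcdefghij"   # hypothetical sample text
window = 4                # stands in for max_length
stride = 3                # stands in for steps

windows = []
targets = []
for i in range(0, len(toy_text) - window, stride):
    windows.append(toy_text[i: i + window])   # input sequence
    targets.append(toy_text[i + window])      # character that follows it

for w, t in zip(windows, targets):
    print(repr(w), '->', repr(t))
# 'abcd' -> 'e'
# 'defg' -> 'h'
```

After one-hot encoding, `X` has shape `(number of windows, max_length, len(vocabulary))` and `y` has shape `(number of windows, len(vocabulary))`.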


**Step 5: Building the LSTM network**

In [16]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Activation
from tensorflow.keras.optimizers import RMSprop

# Building the LSTM network for the task
model = Sequential()
model.add(LSTM(128, input_shape=(max_length, len(vocabulary))))
model.add(Dense(len(vocabulary)))
model.add(Activation('softmax'))

# Using learning_rate instead of lr
optimizer = RMSprop(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
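
For a sense of the model's size, the trainable parameter count can be worked out by hand. The sketch below assumes the 58-character vocabulary printed in Step 3 and the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias. It is a back-of-the-envelope check, and `model.summary()` remains the authoritative source.

```python
units = 128
vocab_size = 58  # assumed from the vocabulary printed in Step 3

# Four gates, each with (input + recurrent) weights and one bias per unit
lstm_params = 4 * (units * (vocab_size + units) + units)

# Dense layer: one weight per (unit, character) pair plus one bias per character
dense_params = units * vocab_size + vocab_size

print(lstm_params)                 # 95744
print(dense_params)                # 7482
print(lstm_params + dense_params)  # 103226
```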



**Step 6: Defining some helper functions which will be used during the training of the network**

In [17]:
# a) Helper function to sample the index of the next character from a
# probability array; a lower temperature makes the choice more conservative
def sample_index(preds, temperature = 1.0):
	preds = np.asarray(preds).astype('float64')
	preds = np.log(preds) / temperature
	exp_preds = np.exp(preds)
	preds = exp_preds / np.sum(exp_preds)
	probas = np.random.multinomial(1, preds, 1)
	return np.argmax(probas)

# b) Helper function to generate text after the end of each epoch
def on_epoch_end(epoch, logs):
	print()
	print('----- Generating text after Epoch: % d' % epoch)

	start_index = random.randint(0, len(text) - max_length - 1)
	for diversity in [0.2, 0.5, 1.0, 1.2]:
		print('----- diversity:', diversity)

		generated = ''
		sentence = text[start_index: start_index + max_length]
		generated += sentence
		print('----- Generating with seed: "' + sentence + '"')
		sys.stdout.write(generated)

		for i in range(400):
			x_pred = np.zeros((1, max_length, len(vocabulary)))
			for t, char in enumerate(sentence):
				x_pred[0, t, char_to_indices[char]] = 1.

			preds = model.predict(x_pred, verbose = 0)[0]
			next_index = sample_index(preds, diversity)
			next_char = indices_to_char[next_index]

			generated += next_char
			sentence = sentence[1:] + next_char

			sys.stdout.write(next_char)
			sys.stdout.flush()
		print()
print_callback = LambdaCallback(on_epoch_end = on_epoch_end)
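
The `diversity` values passed to `sample_index` act as a temperature on the predicted distribution. The sketch below repeats the same log, divide, exponentiate, and renormalise steps on a made-up three-way distribution, so the reshaping can be seen without a trained model. The function name `rescale` and the values in `probs` are assumptions for illustration only.

```python
import numpy as np

def rescale(probs, temperature):
    """Apply the same log / temperature / renormalise steps as sample_index."""
    logits = np.log(np.asarray(probs, dtype='float64')) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

probs = [0.5, 0.3, 0.2]  # hypothetical model output

for temperature in [0.2, 1.0, 1.2]:
    print(temperature, np.round(rescale(probs, temperature), 3))
```

A temperature below 1 concentrates probability on the most likely character, a temperature of 1 leaves the distribution unchanged, and a temperature above 1 flattens it, which is why the higher diversity settings produce more varied but less coherent text.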


In [18]:
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, LambdaCallback
import numpy as np
import random
import sys

# c) Callback to save the model after each epoch in which the loss
# improves on the best value seen so far
filepath = "weights_epoch_{epoch:02d}_loss_{loss:.4f}.keras"
checkpoint = ModelCheckpoint(filepath, monitor='loss',
                             verbose=1, save_best_only=True,
                             mode='min')

# d) Callback to reduce the learning rate each time the loss plateaus
reduce_alpha = ReduceLROnPlateau(monitor='loss', factor=0.2,
                                 patience=1, min_lr=0.001)

callbacks = [print_callback, checkpoint, reduce_alpha]
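
The effect of `factor=0.2` and `min_lr=0.001` can be traced with simple arithmetic. The sketch assumes the worst case in which the loss fails to improve every time the callback checks, and it models the update as the current rate multiplied by `factor`, floored at `min_lr`. The helper `next_lr` is made up for illustration and is not part of Keras.

```python
def next_lr(current_lr, factor=0.2, min_lr=0.001):
    # Hypothetical helper: shrink the rate, but never below min_lr
    return max(current_lr * factor, min_lr)

lr = 0.01  # initial RMSprop learning rate from Step 5
schedule = [lr]
for _ in range(3):
    lr = next_lr(lr)
    schedule.append(lr)

print(schedule)
```

With these settings the rate can drop at most twice: to about 0.002, and then to the 0.001 floor, where it stays.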


**Step 7: Training the LSTM model**

In [21]:
# Training the LSTM model
model.fit(X, y, batch_size = 128, epochs = 20, callbacks = callbacks)


Epoch 1/20
1/4 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step - loss: 3.1631
----- Generating text after Epoch:  0
----- diversity: 0.2
----- Generating with seed: "ls, we hope a reader can really start to connect with those people beyond a one-line caption that wo"
ls, we hope a reader can really start to connect with those people beyond a one-line caption that wo   s ss  e  s   oos  s e s  s  stes   ss  r  s a         is as ss  e    r o     s s       s ls r     o   ssrs  s s  se   s       e s  s  s     s    e       r s  e    s    n os   o o o   s     s o    s r  e seoe   o os  s    s s  ss s   ao      s  s es ios    o t e   a      ss   o  i   eo s  s  o o  s o s e  s  o  r e sss      s    s    s  s   s o s   s e s      s  e  o   o  sr    o r s     o as  s
----- diversity: 0.5
----- Generating with seed: "ls, we hope a reader can really start to connect with those people beyond a one-line caption that wo"
ls, we hope a reader can really start to connect with those

<keras.src.callbacks.history.History at 0x7e677c7260b0>

**Step 8: Generating new and random text**

In [22]:
# Defining a utility function to generate new text from a random seed,
# based on what the network has learned
def generate_text(length, diversity):
    # Pick a random max_length-character window of the text as the seed
    start_index = random.randint(0, len(text) - max_length - 1)
    generated = ''
    sentence = text[start_index: start_index + max_length]
    generated += sentence
    for i in range(length):
        # One-hot encode the current window
        x_pred = np.zeros((1, max_length, len(vocabulary)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_to_indices[char]] = 1.

        # Sample the next character from the predicted distribution
        preds = model.predict(x_pred, verbose = 0)[0]
        next_index = sample_index(preds, diversity)
        next_char = indices_to_char[next_index]

        # Append it and slide the window forward by one character
        generated += next_char
        sentence = sentence[1:] + next_char
    return generated

print(generate_text(500, 0.2))


o separate trips to Niger and almost two weeks on the ground before Mr. Ferguson captured an appropre an o  te co s on  an on s an o e on o e  o e f  an of s s on  t  on an  t e o t an s o o  o  e o s an  e t o  o o e s on e as w an f e t an an o e e s e  o s e o s t o tian te s ge  te s e o o s an o e s o s o e an s s an s on s on o  oi an s  o  o o e an o he e on e on  o e co o s  o d an  te s an o  s s on  s e t o an an  e an s on s e s on  an  on e on  an  an s on o  s o  or an o e s on  o an s an e o o al e   on s o e t an o t on e s  t e e on s s g s a e  an e o t o s an e e s on s the a
