<a href="https://colab.research.google.com/github/tawounfouet/road-to-deeplearning-mastery/blob/main/Text_Generation_With_LSTM_Recurrent_Neural_Networks_in_Python_with_Keras.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text Generation With LSTM Recurrent Neural Networks in Python with Keras


Recurrent neural networks can also be used as generative models.

Generative models like this are useful not only to study how well a model has learned a problem but also to learn more about the problem domain itself.


[Project source](https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/)

## Problem Description: Project Gutenberg

Many of the classical texts are no longer protected under copyright.

This means you can download all the text for these books for free and use them in experiments, like creating generative models. Perhaps the best place to get access to free books that are no longer protected by copyright is Project Gutenberg.


In this tutorial, you will use a favorite book from childhood as the dataset: [Alice’s Adventures in Wonderland by Lewis Carroll](https://www.gutenberg.org/ebooks/11).

## Develop a Small LSTM Recurrent Neural Network
In this section, you will develop a **simple LSTM network** to learn sequences of characters from Alice in Wonderland. In the next section, you will use this model to generate new sequences of characters.

In [None]:
!wget https://www.gutenberg.org/cache/epub/11/pg11.txt

--2024-03-09 14:15:16--  https://www.gutenberg.org/cache/epub/11/pg11.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 174385 (170K) [text/plain]
Saving to: ‘pg11.txt’


2024-03-09 14:15:16 (2.39 MB/s) - ‘pg11.txt’ saved [174385/174385]



In [None]:
# rename the downloaded file
#mv [options] source_file destination_file
! mv pg11.txt wonderland.txt

In [None]:
# import the classes and functions used to train the model

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import LSTM
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.utils import to_categorical

In [None]:

# load the text and convert it to lowercase
filename = "wonderland.txt"
with open(filename, 'r', encoding='utf-8') as f:
  raw_text = f.read()
raw_text = raw_text.lower()
print(raw_text[:200])

﻿the project gutenberg ebook of alice's adventures in wonderland
    
this ebook is for the use of anyone anywhere in the united states and
most other parts of the world at no cost and with almost no 


In [None]:
# create mapping of unique chars to integers
chars = sorted(list(set(raw_text)))
char_to_int = dict((c, i) for i, c in enumerate(chars))


In [None]:
list(char_to_int.items())[:10]

[('\n', 0),
 (' ', 1),
 ('!', 2),
 ('#', 3),
 ('$', 4),
 ('%', 5),
 ("'", 6),
 ('(', 7),
 (')', 8),
 ('*', 9)]
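
As a minimal sketch of how such a mapping is used, the toy example below builds the same kind of lookup on a short string and adds a reverse lookup to decode integers back into text. The names `toy_text`, `toy_char_to_int`, and `toy_int_to_char` are illustrative only and do not appear in the notebook.

```python
# Toy illustration of a character-to-integer mapping and its inverse.
toy_text = "hello world"

# Sort the unique characters so the mapping is deterministic.
toy_chars = sorted(set(toy_text))
toy_char_to_int = {c: i for i, c in enumerate(toy_chars)}
toy_int_to_char = {i: c for i, c in enumerate(toy_chars)}

# Encode the text as integers, then decode it again.
encoded = [toy_char_to_int[c] for c in toy_text]
decoded = "".join(toy_int_to_char[i] for i in encoded)

print(toy_chars)
print(encoded)
print(decoded)
```

The round trip returns the original string, which is the property the generation step relies on when it turns predicted indices back into characters.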

In [None]:
# now that the book has been loaded and the mapping prepared, summarize the dataset
n_chars = len(raw_text)
n_vocab = len(chars)
print(f"Total Characters: {n_chars}")
print(f"Total Vocab: {n_vocab}")

Total Characters: 163947
Total Vocab: 65


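We can see that the book has just under 164,000 characters and that, once converted to lowercase, there are only 65 distinct characters in the vocabulary for the network to learn. That is still much more than the 26 letters of the alphabet, because the vocabulary also includes punctuation, digits, whitespace, and other symbols.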


We now need to define the training data for the network. There is a lot of flexibility in how you choose to break up the text and expose it to the network during training.

In this project, we will split the book text into subsequences with a fixed length of 100 characters, an arbitrary choice. We could just as easily split the data by sentences, padding the shorter sequences and truncating the longer ones.
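
The sentence-based alternative is not used in this notebook, but a rough sketch of the padding and truncating step could look like the following. The helper `pad_or_truncate` and the toy integer sequences are hypothetical; Keras also ships a `pad_sequences` utility for the same purpose.

```python
def pad_or_truncate(encoded, length, pad_value=0):
    """Return a list of exactly `length` items.

    Short inputs are left-padded; long inputs keep their last `length` items.
    """
    if len(encoded) >= length:
        return encoded[-length:]
    return [pad_value] * (length - len(encoded)) + encoded


# Two hypothetical integer-encoded sentences of different lengths.
sentences = [[5, 3, 8], [1, 2, 3, 4, 5, 6, 7]]
fixed = [pad_or_truncate(s, 5) for s in sentences]
print(fixed)
```

Note that a padding value of 0 would collide with a real character in this notebook's mapping (index 0 is the newline), so an actual implementation would need to reserve a dedicated padding index.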

Each training pattern of the network comprises 100 time steps of one character (X) followed by one character output (y). When creating these sequences, we slide this window along the whole book one character at a time, allowing each character a chance to be learned from the 100 characters that preceded it (except the first 100 characters, of course).

For example, if the sequence length is 5 (for simplicity), then the first two training patterns would be as follows:

- CHAPT -> E
- HAPTE -> R
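
The sliding window can be sketched on a short string as follows. The variable names and the text `"CHAPTER I"` are chosen for illustration, with a window of 5 instead of 100.

```python
# Build (input, next-character) pairs with a sliding window.
text = "CHAPTER I"
window = 5  # shortened from 100 for readability

pairs = []
for i in range(len(text) - window):
    seq_in = text[i:i + window]   # the window of input characters
    seq_out = text[i + window]    # the character that follows the window
    pairs.append((seq_in, seq_out))

for seq_in, seq_out in pairs:
    print(f"{seq_in} -> {seq_out}")
```

The number of pairs is always the text length minus the window length, which is why a book of 163,947 characters yields 163,847 patterns with a window of 100.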

In [None]:
# prepare the dataset of input to output pairs encoded as integers
seq_length = 100
dataX = []
dataY = []

for i in range(0, n_chars - seq_length, 1):
  seq_in = raw_text[i:i + seq_length]
  seq_out = raw_text[i + seq_length]
  dataX.append([char_to_int[char] for char in seq_in])
  dataY.append(char_to_int[seq_out])

n_patterns = len(dataX)
print(f"Total Patterns: {n_patterns}")

Total Patterns: 163847


In [None]:
print(dataX[:3])

[[64, 49, 37, 34, 1, 45, 47, 44, 39, 34, 32, 49, 1, 36, 50, 49, 34, 43, 31, 34, 47, 36, 1, 34, 31, 44, 44, 40, 1, 44, 35, 1, 30, 41, 38, 32, 34, 6, 48, 1, 30, 33, 51, 34, 43, 49, 50, 47, 34, 48, 1, 38, 43, 1, 52, 44, 43, 33, 34, 47, 41, 30, 43, 33, 0, 1, 1, 1, 1, 0, 49, 37, 38, 48, 1, 34, 31, 44, 44, 40, 1, 38, 48, 1, 35, 44, 47, 1, 49, 37, 34, 1, 50, 48, 34, 1, 44, 35, 1, 30], [49, 37, 34, 1, 45, 47, 44, 39, 34, 32, 49, 1, 36, 50, 49, 34, 43, 31, 34, 47, 36, 1, 34, 31, 44, 44, 40, 1, 44, 35, 1, 30, 41, 38, 32, 34, 6, 48, 1, 30, 33, 51, 34, 43, 49, 50, 47, 34, 48, 1, 38, 43, 1, 52, 44, 43, 33, 34, 47, 41, 30, 43, 33, 0, 1, 1, 1, 1, 0, 49, 37, 38, 48, 1, 34, 31, 44, 44, 40, 1, 38, 48, 1, 35, 44, 47, 1, 49, 37, 34, 1, 50, 48, 34, 1, 44, 35, 1, 30, 43], [37, 34, 1, 45, 47, 44, 39, 34, 32, 49, 1, 36, 50, 49, 34, 43, 31, 34, 47, 36, 1, 34, 31, 44, 44, 40, 1, 44, 35, 1, 30, 41, 38, 32, 34, 6, 48, 1, 30, 33, 51, 34, 43, 49, 50, 47, 34, 48, 1, 38, 43, 1, 52, 44, 43, 33, 34, 47, 41, 30, 43, 33,

In [None]:
# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (n_patterns, seq_length, 1))
# normalize
X = X / float(n_vocab)
# one hot encode the output variable
y = to_categorical(dataY)
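
To make the three preparation steps concrete, here is a small sketch on hypothetical toy data that uses only NumPy. The one-hot step imitates `to_categorical` with an identity matrix and is not the Keras implementation itself.

```python
import numpy as np

# Hypothetical toy data: 3 patterns of 4 time steps, vocabulary of 6 symbols.
toy_X = [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
toy_y = [4, 5, 1]
toy_vocab = 6

# Reshape to [samples, time steps, features] and scale into the range [0, 1).
X_toy = np.reshape(toy_X, (len(toy_X), 4, 1)) / float(toy_vocab)

# One-hot encode the targets; the number of columns is inferred
# as the largest label plus one.
num_classes = max(toy_y) + 1
y_toy = np.eye(num_classes)[toy_y]

print(X_toy.shape)
print(y_toy.shape)
print(y_toy[0])
```

When `to_categorical` is called without `num_classes`, it likewise infers the class count from the largest label. That would be consistent with the model summary in this notebook reporting 64 output units while `n_vocab` is 65, assuming the highest-indexed character never occurs as a target.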

In [None]:
# define the LSTM model
model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 256)               264192    
                                                                 
 dropout (Dropout)           (None, 256)               0         
                                                                 
 dense (Dense)               (None, 64)                16448     
                                                                 
Total params: 280640 (1.07 MB)
Trainable params: 280640 (1.07 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
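
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for an LSTM layer (four gates, each with input weights, recurrent weights, and a bias) and for a dense layer; the helper names are illustrative.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # One weight per input-output pair plus one bias per output.
    return input_dim * units + units


lstm = lstm_params(input_dim=1, units=256)
dense = dense_params(input_dim=256, units=64)
print(lstm, dense, lstm + dense)
```

The results match the 264192, 16448, and 280640 reported by `model.summary()`.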


In [None]:
# define the checkpoint
filepath="weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

You can now fit the model to the data. Here, you use a modest 20 epochs and a large batch size of 128 patterns.
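
As a quick back-of-the-envelope check, the number of weight updates implied by these settings can be computed from the pattern count reported earlier. This assumes the final, partially filled batch of each epoch is kept.

```python
import math

n_patterns = 163847  # pattern count printed earlier in the notebook
batch_size = 128
epochs = 20

# Each epoch processes every pattern once, in batches of 128.
steps_per_epoch = math.ceil(n_patterns / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```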

In [None]:
model.fit(X, y, epochs=20, batch_size=128, callbacks=callbacks_list)

Epoch 1/20
Epoch 1: loss improved from inf to 3.01410, saving model to weights-improvement-01-3.0141.hdf5
Epoch 2/20
Epoch 2: loss improved from 3.01410 to 2.84742, saving model to weights-improvement-02-2.8474.hdf5
Epoch 3/20
Epoch 3: loss improved from 2.84742 to 2.76822, saving model to weights-improvement-03-2.7682.hdf5
Epoch 4/20
Epoch 4: loss improved from 2.76822 to 2.70493, saving model to weights-improvement-04-2.7049.hdf5
Epoch 5/20
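
One way to put these loss values in context is to compare them with the loss of a model that guesses uniformly. For categorical cross-entropy over K classes, that baseline is ln(K). The sketch below assumes the 64 output classes shown in the model summary.

```python
import math

num_classes = 64  # output units reported by model.summary()

# Cross-entropy of a uniform prediction over all classes.
uniform_loss = math.log(num_classes)
first_epoch_loss = 3.0141  # taken from the training log above

print(round(uniform_loss, 4))
print(first_epoch_loss < uniform_loss)
```

A first-epoch loss of about 3.01 is already below the uniform baseline of roughly 4.16, so the network has picked up at least the basic character frequencies.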

In [None]:

import sys

# reverse mapping from integers back to characters, needed to decode predictions
int_to_char = dict((i, c) for i, c in enumerate(chars))

# load the network weights
filename = "weights-improvement-19-1.9435.hdf5"
model.load_weights(filename)
model.compile(loss='categorical_crossentropy', optimizer='adam')
# pick a random seed
start = np.random.randint(0, len(dataX)-1)
pattern = list(dataX[start])  # copy so the training data is not modified
print("Seed:")
print("\"", ''.join([int_to_char[value] for value in pattern]), "\"")
# generate characters
for i in range(1000):
  # reshape to [samples, time steps, features] and normalize as in training
  x = np.reshape(pattern, (1, len(pattern), 1))
  x = x / float(n_vocab)
  prediction = model.predict(x, verbose=0)
  # greedy decoding: take the most likely next character
  index = int(np.argmax(prediction))
  result = int_to_char[index]
  sys.stdout.write(result)
  # slide the window: append the prediction and drop the oldest character
  pattern.append(index)
  pattern = pattern[1:]
print("\nDone.")
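
The generation loop can be traced without a trained model. In the sketch below, `fake_predict` is a hypothetical stand-in for `model.predict` that always favors the symbol following the last one in the window; only the argmax step and the window update mirror the loop above.

```python
def fake_predict(pattern, vocab_size):
    # Hypothetical stand-in for model.predict: puts all probability on the
    # symbol after the last one in the window, wrapping around the vocabulary.
    probs = [0.0] * vocab_size
    probs[(pattern[-1] + 1) % vocab_size] = 1.0
    return probs


vocab_size = 5
pattern = [0, 1, 2]  # seed window
generated = []

for _ in range(4):
    probs = fake_predict(pattern, vocab_size)
    index = max(range(vocab_size), key=probs.__getitem__)  # argmax
    generated.append(index)
    pattern = pattern[1:] + [index]  # drop the oldest symbol, append the newest

print(generated, pattern)
```

The window length stays constant throughout, so each prediction is fed back in as input for the next step.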
