# Character-level text generation with LSTM

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2015/06/15<br>
**Last modified:** 2020/04/30<br>
**Description:** Generate text from Nietzsche's writings with a character-level LSTM.

## Introduction

This example demonstrates how to use an LSTM model to generate
text character by character.

At least 20 epochs are required before the generated text
starts sounding locally coherent.

Running this script on a GPU is recommended, as recurrent
networks are quite computationally intensive.

If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.

## Setup

In [1]:
import keras
from keras import layers

import numpy as np
import random
import io

## Prepare the data

In [27]:
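# The upstream example downloads the Nietzsche corpus via the commented-out call below;
# this run reads a local "input.txt" instead.
# path = keras.utils.get_file(
#     "nietzsche.txt",
#     origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt",
# )
with io.open("input.txt", encoding="utf-8") as f:
    text = f.read().lower()
text = text.replace("\n", " ")  # Replace newlines with spaces for nicer display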
print("Corpus length:", len(text))

chars = sorted(list(set(text)))
print("Total chars:", len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# Cut the text into semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])
print("Number of sequences:", len(sentences))

x = np.zeros((len(sentences), maxlen, len(chars)), dtype="bool")
y = np.zeros((len(sentences), len(chars)), dtype="bool")
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Corpus length: 1115394
Total chars: 38
Number of sequences: 371785
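
To make the windowing and one-hot encoding above concrete, here is a minimal sketch on a toy string. The string and the values `toy_maxlen = 5` and `toy_step = 2` are illustrative assumptions, not the settings used in this notebook.

```python
import numpy as np

toy_text = "hello world"  # illustrative corpus, not the notebook's data
toy_maxlen = 5  # assumed window length for the demo
toy_step = 2  # assumed stride for the demo

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Each window of toy_maxlen characters is paired with the character that follows it.
windows = []
targets = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i : i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

# One-hot encode: x_toy[i, t, k] is True when window i has character k at position t.
x_toy = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype="bool")
y_toy = np.zeros((len(windows), len(toy_chars)), dtype="bool")
for i, window in enumerate(windows):
    for t, char in enumerate(window):
        x_toy[i, t, toy_char_indices[char]] = True
    y_toy[i, toy_char_indices[targets[i]]] = True

for window, target in zip(windows, targets):
    print(repr(window), "->", repr(target))
print("x shape:", x_toy.shape, "y shape:", y_toy.shape)
```

Because consecutive windows overlap by `toy_maxlen - toy_step` characters, the sequences are "semi-redundant": a smaller step yields more training samples from the same text.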


## Build the model: a single LSTM layer

In [3]:
model = keras.Sequential(
    [
        keras.Input(shape=(maxlen, len(chars))),
        layers.LSTM(128),
        layers.Dense(len(chars), activation="softmax"),
    ]
)
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss="categorical_crossentropy", optimizer=optimizer)
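
As a sanity check on the model size, the parameter count can be worked out by hand. This sketch assumes the standard LSTM formulation (four gates, each with an input kernel, a recurrent kernel, and a bias vector) and the 38-character vocabulary printed by the data preparation cell.

```python
vocab_size = 38  # "Total chars" printed by the data preparation cell
units = 128  # LSTM width used in the model

# Each of the 4 LSTM gates has an input kernel, a recurrent kernel, and a bias.
lstm_params = 4 * (vocab_size * units + units * units + units)
# The Dense layer maps the 128-dimensional LSTM output to one score per character.
dense_params = units * vocab_size + vocab_size

print("LSTM params:", lstm_params)
print("Dense params:", dense_params)
print("Total params:", lstm_params + dense_params)
```

Calling `model.summary()` is a quick way to compare these figures against what Keras reports.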

## Prepare the text sampling function

In [4]:

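def sample(preds, temperature=1.0):
    # Sample an index from a probability array after rescaling it by temperature.
    # Lower temperatures sharpen the distribution; higher ones flatten it.
    preds = np.asarray(preds).astype("float64")
    # Clip to avoid log(0), which emits a RuntimeWarning when a probability is exactly zero.
    preds = np.log(np.clip(preds, 1e-12, None)) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)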
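
To see what the temperature does, the sketch below applies the same rescaling as `sample()` to a small made-up distribution, without the random draw. The helper name `rescale` and the example probabilities are assumptions for illustration.

```python
import numpy as np


def rescale(probs, temperature):
    # Same transformation as in sample(), without the random draw.
    logits = np.log(np.asarray(probs, dtype="float64")) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)


example_probs = [0.5, 0.3, 0.2]  # illustrative distribution over three characters
for temperature in [0.2, 1.0, 1.2]:
    print(temperature, np.round(rescale(example_probs, temperature), 3))
```

A temperature of 1.0 leaves the distribution unchanged. Values below 1.0 concentrate probability on the most likely character, which gives repetitive but safe text, while values above 1.0 spread it out, which gives more varied but more error-prone text.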


## Train the model

In [24]:
batch_size = 128

model.fit(x, y, batch_size=batch_size, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x214eb45e560>

In [28]:
start_index = random.randint(0, len(text) - maxlen - 1)
for diversity in [0.2, 0.5, 1.0, 1.2]:
    print("...Diversity:", diversity)

    generated = ""
    sentence = text[start_index : start_index + maxlen]
    print('...Generating with seed: "' + sentence + '"')

    print()

    for i in range(400):
        x_pred = np.zeros((1, maxlen, len(chars)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_indices[char]] = 1.0
        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, diversity)
        next_char = indices_char[next_index]
        sentence = sentence[1:] + next_char
        generated += next_char

        # clear_output(wait=True)
        # print(generated)

    print("...Generated: ", generated)
    print("-")

...Diversity: 0.2
...Generating with seed: " be entreated: tell her i send to her my"

...Generated:   son the streed to him to the man then the strong the house of the prince to the master to the country to the strong the man to the son the rome, the love to the streed to my head to the man that the prince to the man the strend to the streem to the prosperous son that the streep and here is the streed of the stress, the strend to the man and the people to the man the strong to the people to the p
-
...Diversity: 0.5
...Generating with seed: " be entreated: tell her i send to her my"

...Generated:   soul to do been son here is the prince, there is the mere word of my soul, the poor the perpetious wit-battle, and come to the number to him can strive thee in the day thing in the streem to a streeder and seeming hath conferent with him to the summer and the blest, what had be so much banished with the sweets of her spirit of the infated by the strong for our son but such made him and hav


...Generated:   one, awinmifueast you will believe me, for that many trangmis, and while 't-mother.  execern: my tricking their'd by earlys and chefave, that bid the dfie'sh tuven thibsse, son will congod wellich nights me first in hustiminy: hot! thy aride; was not tealling for bynen to tell thee, he am a lane, i hear him! o woman! divy snablish time bidley's unnoldst edward, and answer about me fly in knockll,
-
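
The generation loop above can be sketched as a reusable function. Here `generate_text` and `predict_fn` are hypothetical names: `predict_fn` stands in for something like `lambda x: model.predict(x, verbose=0)[0]`, and a dummy predictor is used so the snippet runs without a trained model. The sketch also samples with `rng.choice` in place of `np.random.multinomial`.

```python
import numpy as np


def generate_text(predict_fn, seed, char_indices, indices_char, length, temperature=1.0, rng=None):
    # predict_fn takes a (1, maxlen, vocab) one-hot array and returns a
    # probability vector over the vocabulary (hypothetical interface).
    if rng is None:
        rng = np.random.default_rng()
    maxlen, vocab = len(seed), len(char_indices)
    window, generated = seed, ""
    for _ in range(length):
        x_pred = np.zeros((1, maxlen, vocab))
        for t, char in enumerate(window):
            x_pred[0, t, char_indices[char]] = 1.0
        preds = np.asarray(predict_fn(x_pred), dtype="float64")
        # Temperature rescaling, as in sample(); clipping guards against log(0).
        logits = np.log(np.clip(preds, 1e-12, None)) / temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        next_char = indices_char[int(rng.choice(vocab, p=probs))]
        window = window[1:] + next_char  # slide the window by one character
        generated += next_char
    return generated


# Dummy predictor: strongly favors the vocabulary entry after the window's last character.
demo_chars = sorted("abc ")
demo_char_indices = {c: i for i, c in enumerate(demo_chars)}
demo_indices_char = {i: c for i, c in enumerate(demo_chars)}


def dummy_predict(x_pred):
    last = int(np.argmax(x_pred[0, -1]))
    probs = np.full(len(demo_chars), 0.01)
    probs[(last + 1) % len(demo_chars)] = 1.0
    return probs / probs.sum()


print(generate_text(dummy_predict, "abc", demo_char_indices, demo_indices_char, length=10, temperature=0.2))
```

Passing the predictor as an argument keeps the sliding-window logic in one place, so the same function could serve both the diversity sweep and later predictions from a custom seed.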


## Save trained model

In [16]:
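# Saves in the TensorFlow SavedModel format (see the "Assets written" log below).
# Note: Keras 3 expects a ".keras" (or legacy ".h5") file name for model.save()
# and a ".weights.h5" file name for model.save_weights().
model.save('trained_text_gen')
model.save_weights('trained_text_gen_weights')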



INFO:tensorflow:Assets written to: trained_text_gen\assets
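
The saved model alone is not enough to generate text in a later session: the character-to-index mapping must match the one used during training. A minimal sketch of persisting it with the standard library follows. The file name `vocab.json` and the helpers `save_vocab` and `load_vocab` are assumptions, not part of the original notebook.

```python
import json
import os
import tempfile


def save_vocab(chars, path):
    # Store the ordered character list; both lookup tables can be rebuilt from it.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(chars, f, ensure_ascii=False)


def load_vocab(path):
    with open(path, encoding="utf-8") as f:
        chars = json.load(f)
    char_indices = {c: i for i, c in enumerate(chars)}
    indices_char = {i: c for i, c in enumerate(chars)}
    return chars, char_indices, indices_char


# Demo with a toy vocabulary written to a temporary directory.
demo_vocab = sorted(set("hello world"))
with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "vocab.json")  # hypothetical file name
    save_vocab(demo_vocab, vocab_path)
    loaded_chars, loaded_char_indices, loaded_indices_char = load_vocab(vocab_path)

print(loaded_chars == demo_vocab)
```

Storing only the ordered list sidesteps the fact that JSON object keys are always strings, which would otherwise turn the integer keys of `indices_char` into text on reload.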




## New predictions

In [23]:
diversity = 0.2  # Use a single diversity (temperature) value
text_len = 400  # Number of characters to generate
generated = ""
sentence = "Forecast laptop ratings based on specifications or predict price points"
sentence = sentence.lower()  # The model was trained on lowercased text
sentence = sentence[:maxlen]  # Keep the first maxlen (40) characters to match the model's input length

for i in range(text_len):
    x_pred = np.zeros((1, maxlen, len(chars)))
    for t, char in enumerate(sentence):
        x_pred[0, t, char_indices[char]] = 1.0
    preds = model.predict(x_pred, verbose=0)[0]
    next_index = sample(preds, diversity)
    next_char = indices_char[next_index]
    sentence = sentence[1:] + next_char
    generated += next_char
print(generated)

ns, the man be the streets to the street, the bed, and the world and the prince the bed that then the stranger that i think the man be the string the street the rest the bed the contrary of the common and the prince of the princes of the marketh be the descontent the more than the street of the matter than the morn the stries the streets to the mark the more than the bed the street to the man ther
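
A custom seed only works if every character is in the training vocabulary (otherwise `char_indices[char]` raises a `KeyError`) and if it fills the whole `maxlen` window. The helper below is a hedged sketch of one way to enforce that. The name `prepare_seed`, dropping unknown characters, and left-padding with spaces are all assumptions, and the pad character itself must be in the vocabulary.

```python
def prepare_seed(seed, maxlen, char_indices, pad_char=" "):
    # Lowercase to match the training text, then drop characters outside the vocabulary.
    cleaned = "".join(c for c in seed.lower() if c in char_indices)
    # Keep the first maxlen characters, as the cell above does.
    cleaned = cleaned[:maxlen]
    # Left-pad short seeds so the window always has maxlen characters (assumed policy).
    return cleaned.rjust(maxlen, pad_char)


# Toy vocabulary for the demo: space plus lowercase ASCII letters.
demo_indices = {c: i for i, c in enumerate(" abcdefghijklmnopqrstuvwxyz")}
print(repr(prepare_seed("Forecast laptop ratings based on specifications", 40, demo_indices)))
print(repr(prepare_seed("Héllo!", 10, demo_indices)))
```

With the notebook's own mapping, the call would look like `sentence = prepare_seed(raw_seed, maxlen, char_indices)`, where `raw_seed` is whatever prompt text is supplied.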
