# Character-level text generation with LSTM

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2015/06/15<br>
**Last modified:** 2020/04/30<br>
**Description:** Generate text from Nietzsche's writings with a character-level LSTM.

## Introduction

This example demonstrates how to use an LSTM model to generate
text character-by-character.

At least 20 epochs are required before the generated text
starts sounding locally coherent.

It is recommended to run this script on a GPU, as recurrent
networks are quite computationally intensive.

If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.


## Setup


In [1]:
from tensorflow import keras
from tensorflow.keras import layers

import numpy as np
import random
import io


## Prepare the data


In [2]:
path = keras.utils.get_file(
    "nietzsche.txt", origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt"
)
with io.open(path, encoding="utf-8") as f:
    text = f.read().lower()
text = text.replace("\n", " ")  # Replace newlines with spaces for nicer display
print("Corpus length:", len(text))

chars = sorted(list(set(text)))
print("Total chars:", len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# Cut the text into semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])
print("Number of sequences:", len(sentences))

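# One-hot encode inputs and targets. The `np.bool` alias was removed in
# NumPy 1.24, so the built-in `bool` is used instead.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)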
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1



Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt
Corpus length: 600893
Total chars: 56
Number of sequences: 200285
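
The windowing step above can be checked on a toy string. The sketch below is a minimal illustration and not part of the original script: `make_windows` is a hypothetical helper name, and the toy text and small `maxlen` are assumptions chosen for readability.

```python
def make_windows(text, maxlen, step):
    """Return (window, next_char) pairs, mirroring the loop in the data preparation cell."""
    pairs = []
    for i in range(0, len(text) - maxlen, step):
        pairs.append((text[i : i + maxlen], text[i + maxlen]))
    return pairs


# Each input window is paired with the single character that follows it.
for window, target in make_windows("hello world", maxlen=4, step=3):
    print(repr(window), "->", repr(target))

# The same range arithmetic gives the sequence count for the real corpus.
print(len(range(0, 600893 - 40, 3)))
```

Consecutive windows overlap by `maxlen - step` characters, which is why the sequences are called semi-redundant. The last line prints 200285, matching the "Number of sequences" value in the output above.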


## Build the model: a single LSTM layer


In [3]:
model = keras.Sequential(
    [
        keras.Input(shape=(maxlen, len(chars))),
        layers.LSTM(128),
        layers.Dense(len(chars), activation="softmax"),
    ]
)
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss="categorical_crossentropy", optimizer=optimizer)
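
To get a feel for the size of this model, the trainable parameter count can be worked out by hand. The sketch below assumes the standard LSTM formulation with bias terms (the Keras default) and the 56-character vocabulary reported for this corpus. The helper names are hypothetical. Calling `model.summary()` should report the same total.

```python
def lstm_param_count(input_dim, units):
    # Four weight blocks (input, forget, and output gates plus the cell
    # candidate), each with an input kernel, a recurrent kernel, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, units):
    # One weight per input/output pair plus one bias per output unit.
    return input_dim * units + units


vocab_size = 56     # "Total chars" reported for this corpus
hidden_units = 128  # size of the LSTM layer

lstm_params = lstm_param_count(vocab_size, hidden_units)
dense_params = dense_param_count(hidden_units, vocab_size)
print(lstm_params, dense_params, lstm_params + dense_params)
```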


## Prepare the text sampling function


In [4]:

def sample(preds, temperature=1.0):
    # Sample an index from a probability array after rescaling it by temperature
    preds = np.asarray(preds).astype("float64")
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
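
The effect of `temperature` is easier to see on a small distribution. The following sketch reproduces only the rescaling arithmetic of `sample`, using the standard library. `apply_temperature` is a hypothetical helper name, the three-element distribution is made up for illustration, and the code assumes every probability is strictly positive.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability list the same way `sample` does before drawing."""
    # Assumes every probability is strictly positive, since log(0) is undefined.
    scaled = [math.log(p) / temperature for p in probs]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = [0.5, 0.3, 0.2]
for temperature in [0.2, 0.5, 1.0, 1.2]:
    reweighted = apply_temperature(probs, temperature)
    print(temperature, [round(p, 3) for p in reweighted])
```

Temperatures below 1.0 concentrate probability on the most likely character, giving more conservative and repetitive text. Temperatures above 1.0 flatten the distribution, giving more varied but more error-prone text. A temperature of 1.0 leaves the distribution unchanged.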



## Train the model


In [5]:
epochs = 40
batch_size = 128

for epoch in range(epochs):
    model.fit(x, y, batch_size=batch_size, epochs=1)
    print()
    print("Generating text after epoch: %d" % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print("...Diversity:", diversity)

        generated = ""
        sentence = text[start_index : start_index + maxlen]
        print('...Generating with seed: "' + sentence + '"')

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.0
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            sentence = sentence[1:] + next_char
            generated += next_char

        print("...Generated: ", generated)
        print()



Generating text after epoch: 0
...Diversity: 0.2
...Generating with seed: "ch has hitherto been the secret wish and"
...Generated:   the self-and in the will of the forthing in the of the presente of the self-and the self and the simely and still and the self-will of the more the see and the precessing the self-and the power the more the propers of the his of the seems of the seem and the self-and has in the forthing the self--and in the forthing of the morals of the presente of the seem the seem the his out of its for are the

...Diversity: 0.5
...Generating with seed: "ch has hitherto been the secret wish and"
...Generated:   and himself himself is the consermanness of a one himself is itself the intendent of an her of the for intime of the world be living for she world the world has the conticual the seen itself will have the respensed for an thereffrenom of the proncess of the perhaps as her his in the see not and and his has an in pirerest and are of will still the for the hisher


...Generated:  enser? ever sustring. perhaps a fears incommon indivinity, ideal and tneil--the means, and it one honormouss, and concern forget typelous atcomsfrochted, conscioralization in affording thety is tune to trease of, when to credame willing; it cannot insaltanted the fundamable, as schoil tohile. the powerfunly deverie cistutifa, when threedor of its worm pro!nelianity must morough form suld principe 


Generating text after epoch: 11
...Diversity: 0.2
...Generating with seed: "" so he would like to have some hundreds"
...Generated:   of the striving to the contrast of the standards of the prompted and all the prompted in the other the more the present and the prompted and the and the comprehension of the ancient the contemple and the world of the standards of the philosophers and the state of the prompted to the more and the standards of the prompted and the contrast of the prompted and the such a still be the strivin of the 

...Diversity: 0.5
...Generating with seed: "" s