### Character-level text generation with LSTM

This notebook is based on the [Keras example](https://keras.io/examples/generative/lstm_character_level_text_generation/) on character-level text generation with an LSTM. We are going to learn how to generate text character by character. This is the same task as in the previous notebook.

### Imports

In [1]:
from tensorflow import keras
import tensorflow as tf

import numpy as np
import random, time

tf.__version__

'2.5.0'

### Data preparation

In [None]:
path = keras.utils.get_file(
    "nietzsche.txt", origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt"
)


### Loading the text.

In [4]:
with open(path, encoding="utf-8") as f:
    # Lowercase the text and replace newlines with spaces for nicer display.
    text = f.read().lower().replace("\n", " ")

print("Corpus length:", len(text))

Corpus length: 600893


### Unique characters

In [6]:
chars = sorted(list(set(text)))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
print(len(chars))

56
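To make the two lookup tables concrete, here is a minimal sketch on a toy string (the `toy_*` names are illustrative and not part of the notebook). It builds the same kind of character-to-index and index-to-character mappings and checks that encoding followed by decoding returns the original text.

```python
# Toy corpus; the notebook applies the same steps to the full text.
toy_text = "hello world"

# Sorting the unique characters gives a stable, reproducible vocabulary order.
toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

# Encode every character as its vocabulary index, then decode it back.
encoded = [toy_char_indices[c] for c in toy_text]
decoded = "".join(toy_indices_char[i] for i in encoded)

print(toy_chars)
print(encoded)
print(decoded == toy_text)
```

The two dictionaries are inverses of each other, which is what lets the generation loop later turn a sampled index back into a character.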


### Cut the text into semi-redundant sequences of maxlen characters

In [7]:
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])
    
print("Number of sequences:", len(sentences))

Number of sequences: 200285
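The windowing can be checked on a small example. The sketch below uses a hypothetical helper, `count_windows`, and a toy string to show how each window is paired with the character that follows it, and how the number of windows follows from the corpus length, `maxlen`, and `step`.

```python
import math


def count_windows(text_len, maxlen, step):
    """Number of start positions produced by range(0, text_len - maxlen, step)."""
    return max(0, math.ceil((text_len - maxlen) / step))


toy_text = "abcdefghij"  # 10 characters
toy_maxlen, toy_step = 4, 3

# Each window of toy_maxlen characters is paired with the character that follows it.
windows = [
    (toy_text[i : i + toy_maxlen], toy_text[i + toy_maxlen])
    for i in range(0, len(toy_text) - toy_maxlen, toy_step)
]
print(windows)  # [('abcd', 'e'), ('defg', 'h')]

# The same formula applied to the corpus length printed earlier.
print(count_windows(600893, 40, 3))  # 200285
```

Because `step` is smaller than `maxlen`, consecutive windows overlap, which is why the sequences are called semi-redundant.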


### Vectorize the sequences as one-hot arrays

In [8]:
# np.bool was deprecated in NumPy 1.20 and removed in 1.24; the builtin bool works everywhere.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
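The same one-hot encoding on a three-character vocabulary makes the array shapes easier to see. This is a minimal sketch with made-up toy data (`toy_sentences`, `toy_next_chars`), not the notebook's corpus.

```python
import numpy as np

toy_chars = ["a", "b", "c"]
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_sentences = ["ab", "bc"]   # input windows
toy_next_chars = ["c", "a"]    # target character for each window
toy_maxlen = 2

# x_toy: (num_sequences, maxlen, vocab_size); y_toy: (num_sequences, vocab_size)
x_toy = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        x_toy[i, t, toy_char_indices[char]] = True
    y_toy[i, toy_char_indices[toy_next_chars[i]]] = True

# argmax over the vocabulary axis recovers the original characters.
recovered = [
    "".join(toy_chars[k] for k in x_toy[i].argmax(axis=-1))
    for i in range(len(toy_sentences))
]
print(x_toy.shape, y_toy.shape)
print(recovered)
```

Every time step holds exactly one `True`, so `x_toy.sum()` equals the number of sequences times `toy_maxlen`.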

### Building the Model.

In [11]:
model = keras.Sequential(
    [
        keras.layers.Input(shape=(maxlen, len(chars))),
        keras.layers.LSTM(128),
        keras.layers.Dense(len(chars), activation="softmax"),
    ]
)
optimizer = keras.optimizers.Adam()
model.compile(loss="categorical_crossentropy", optimizer=optimizer, metrics=["acc"])
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 128)               94720     
_________________________________________________________________
dense_1 (Dense)              (None, 56)                7224      
=================================================================
Total params: 101,944
Trainable params: 101,944
Non-trainable params: 0
_________________________________________________________________
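The parameter counts in the summary can be reproduced by hand. The sketch below uses two hypothetical helper functions and assumes the standard LSTM layout of four gates, each with an input kernel, a recurrent kernel, and a bias vector.

```python
def lstm_param_count(input_dim, units):
    # 4 gates x (input kernel + recurrent kernel + bias)
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, units):
    # weight matrix + bias
    return input_dim * units + units


vocab_size = 56    # len(chars) in this notebook
lstm_units = 128

lstm_params = lstm_param_count(vocab_size, lstm_units)
dense_params = dense_param_count(lstm_units, vocab_size)
print(lstm_params, dense_params, lstm_params + dense_params)  # 94720 7224 101944
```

Note that the sequence length `maxlen` does not appear in either formula: the LSTM reuses the same weights at every time step.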


In [17]:
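def sample(preds, temperature=1.0):
    # Helper function to sample an index from a probability array.
    preds = np.asarray(preds).astype("float64")
    # Rescale the log-probabilities: temperature < 1 sharpens the
    # distribution, temperature > 1 flattens it.
    preds = np.log(preds) / temperature
    # Softmax: exponentiate and renormalize so the values sum to 1.
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw one sample; the argmax of the one-hot result is the chosen index.
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)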
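To see what the temperature does before any sampling happens, the sketch below isolates the reweighting step in a hypothetical helper, `reweight`, and applies it to a made-up three-class distribution. Unlike `sample`, it subtracts the maximum logit before exponentiating, a common numerical-stability variation that does not change the result.

```python
import numpy as np


def reweight(preds, temperature=1.0):
    """Return the temperature-adjusted distribution that sample() draws from."""
    preds = np.asarray(preds, dtype="float64")
    logits = np.log(preds) / temperature
    # Subtracting the max is an added stability step; it cancels after normalization.
    exp_logits = np.exp(logits - np.max(logits))
    return exp_logits / np.sum(exp_logits)


probs = [0.6, 0.3, 0.1]  # illustrative values only
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(probs, temperature), 3))
```

A temperature of 1.0 leaves the distribution unchanged, values below 1.0 concentrate probability on the most likely character, and values above 1.0 spread it out. This is why low-diversity samples tend to be repetitive and high-diversity samples tend to contain more invented words.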

### Training the model.

In [None]:
epochs = 40
batch_size = 128
model.fit(x, y, batch_size=batch_size, epochs=epochs)

In [18]:
start_index = random.randint(0, len(text) - maxlen - 1)
for diversity in [0.2, 0.5, 1.0, 1.2]:
    print("...Diversity:", diversity)

    generated = ""
    sentence = text[start_index : start_index + maxlen]
    print('...Generating with seed: "' + sentence + '"')

    for i in range(400):
        x_pred = np.zeros((1, maxlen, len(chars)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_indices[char]] = 1.0
        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, diversity)
        next_char = indices_char[next_index]
        sentence = sentence[1:] + next_char
        generated += next_char

    print("...Generated: ", generated)
    print()

...Diversity: 0.2
...Generating with seed: "is stupid to do wrong"; while they accep"
...Generated:  t it is the soul of the same time of man in the moral of the same part of the present its life he who in which it is not in the same wis not be a morality of the experience of the sight of the senses and consciousness and individual and in the same impulse of the sentem in the senses and individual will to the artistic conception of the senses of the process of the senses of the senses in the prob

...Diversity: 0.5
...Generating with seed: "is stupid to do wrong"; while they accep"
...Generated:  t of the power and perhaps even and in conscience in the ammention to him the prilations, the same defities the individual relations of the pirituless and inveritable, in the sand himself in short, and honest as in the subject of comparisons of his certain conceptions and perhaps strange, even when the higher who who may be great soul of his extracte, and in the will to the proposition and mor
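The generation loop above can also be written as a reusable function. The following is a sketch under stated assumptions, not the notebook's code: `generate` and `cyclic_predict` are hypothetical names, the temperature step is left out, and `cyclic_predict` is a deterministic stand-in for `model.predict` so that the sliding-window logic can be checked without a trained model.

```python
import numpy as np


def generate(predict_fn, seed, char_indices, indices_char, length, rng):
    """Extend `seed` by `length` characters using a sliding window."""
    maxlen = len(seed)
    window = seed
    generated = ""
    for _ in range(length):
        # One-hot encode the current window, as in the notebook.
        x_pred = np.zeros((1, maxlen, len(char_indices)))
        for t, char in enumerate(window):
            x_pred[0, t, char_indices[char]] = 1.0
        probs = predict_fn(x_pred)[0]
        next_index = int(rng.choice(len(probs), p=probs))
        next_char = indices_char[next_index]
        # Drop the oldest character and append the new one.
        window = window[1:] + next_char
        generated += next_char
    return generated


toy_chars = ["a", "b", "c"]
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}


def cyclic_predict(x_pred):
    # Stand-in for model.predict: always predicts the next letter in a -> b -> c -> a.
    last = int(x_pred[0, -1].argmax())
    probs = np.zeros((1, len(toy_chars)))
    probs[0, (last + 1) % len(toy_chars)] = 1.0
    return probs


rng = np.random.default_rng(0)
out = generate(cyclic_predict, "ab", toy_char_indices, toy_indices_char, 5, rng)
print(out)  # cabca
```

With a real model, `predict_fn` could be something like `lambda x: model.predict(x, verbose=0)`, with the probabilities passed through `sample` to apply a temperature.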

The code in this notebook is adapted from the [Keras character-level text generation example](https://keras.io/examples/generative/lstm_character_level_text_generation/).