In [1]:
import tensorflow
tensorflow.keras.__version__

'2.2.4-tf'

# Text generation with LSTM

This notebook contains the code samples found in Chapter 8, Section 1 of [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text features far more content, in particular further explanations and figures: in this notebook, you will only find source code and related comments.

----

[...]

## Implementing character-level LSTM text generation


Let's put these ideas into practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a 
language model. You could use any sufficiently large text file or set of text files -- Wikipedia, The Lord of the Rings, etc. In this 
example we will use some of the writings of Nietzsche, the late-19th century German philosopher (translated to English). The language model 
we will learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the 
English language.

## Preparing the data

Let's start by downloading the corpus and converting it to lowercase:

In [2]:
import tensorflow.keras
import numpy as np

path = tensorflow.keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
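# The corpus contains non-ASCII characters, so read it explicitly as UTF-8
with open(path, encoding='utf-8') as f:
    text = f.read().lower()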
print('Corpus length:', len(text))

Corpus length: 600893


In [4]:
text[:100]

'preface\n\n\nsupposing that truth is a woman--what then? is there not ground\nfor suspecting that all ph'


Next, we will extract partially-overlapping sequences of length `maxlen`, one-hot encode them and pack them in a 3D Numpy array `x` of 
shape `(sequences, maxlen, unique_characters)`. Simultaneously, we prepare an array `y` containing the corresponding targets: the one-hot 
encoded characters that come right after each extracted sequence.

In [17]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = {char: i for i, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# The builtin `bool` works across NumPy versions; the `np.bool` alias is unavailable in some releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 200278
Unique characters: 57
Vectorization...
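
As a quick sanity check on the numbers printed above, the sequence count and the memory taken by `x` and `y` can be worked out from `maxlen`, `step` and the corpus length alone. The sketch below hard-codes the values reported in this notebook (corpus length 600893, 57 unique characters) and assumes one byte per element, which is how NumPy stores boolean arrays:

```python
import math

corpus_length = 600893   # printed by the download cell
maxlen = 60
step = 3
n_chars = 57             # printed as 'Unique characters'

# One window starts at every `step`-th index in range(0, corpus_length - maxlen)
n_sequences = math.ceil((corpus_length - maxlen) / step)
assert n_sequences == len(range(0, corpus_length - maxlen, step))

# NumPy stores each boolean element in one byte
x_bytes = n_sequences * maxlen * n_chars
y_bytes = n_sequences * n_chars

print('Number of sequences:', n_sequences)
print('x size: %.1f MB' % (x_bytes / 1e6))
print('y size: %.1f MB' % (y_bytes / 1e6))
```

The count agrees with the 200278 sequences reported above. At roughly 685 MB, `x` is by far the larger array, which is why a compact dtype matters here.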


In [6]:
sentences[:4]

['preface\n\n\nsupposing that truth is a woman--what then? is the',
 'face\n\n\nsupposing that truth is a woman--what then? is there ',
 'e\n\n\nsupposing that truth is a woman--what then? is there not',
 '\nsupposing that truth is a woman--what then? is there not gr']

In [15]:
chars[0]

'\n'

In [16]:
char_indices['\n']

0

In [14]:
char_indices


{'\n': 0,
 ' ': 1,
 '!': 2,
 '"': 3,
 "'": 4,
 '(': 5,
 ')': 6,
 ',': 7,
 '-': 8,
 '.': 9,
 '0': 10,
 '1': 11,
 '2': 12,
 '3': 13,
 '4': 14,
 '5': 15,
 '6': 16,
 '7': 17,
 '8': 18,
 '9': 19,
 ':': 20,
 ';': 21,
 '=': 22,
 '?': 23,
 '[': 24,
 ']': 25,
 '_': 26,
 'a': 27,
 'b': 28,
 'c': 29,
 'd': 30,
 'e': 31,
 'f': 32,
 'g': 33,
 'h': 34,
 'i': 35,
 'j': 36,
 'k': 37,
 'l': 38,
 'm': 39,
 'n': 40,
 'o': 41,
 'p': 42,
 'q': 43,
 'r': 44,
 's': 45,
 't': 46,
 'u': 47,
 'v': 48,
 'w': 49,
 'x': 50,
 'y': 51,
 'z': 52,
 'ä': 53,
 'æ': 54,
 'é': 55,
 'ë': 56}

In [7]:
next_chars[:4]

['r', 'n', ' ', 'o']
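
To make the windowing and one-hot encoding easier to inspect, here is a minimal sketch of the same procedure on a toy string. The names `toy_text`, `toy_maxlen`, `toy_step` and the `decode` helper are illustrative and not part of the original notebook. Decoding the arrays back to characters is a cheap way to check that the encoding round-trips:

```python
import numpy as np

toy_text = 'hello world'
toy_maxlen = 4
toy_step = 2

# Same windowing scheme as above, on a much smaller input
starts = range(0, len(toy_text) - toy_maxlen, toy_step)
windows = [toy_text[i: i + toy_maxlen] for i in starts]
targets = [toy_text[i + toy_maxlen] for i in starts]

toy_chars = sorted(set(toy_text))
toy_indices = {c: i for i, c in enumerate(toy_chars)}

toy_x = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, c in enumerate(window):
        toy_x[i, t, toy_indices[c]] = True
    toy_y[i, toy_indices[targets[i]]] = True


def decode(one_hot_rows):
    """Map one-hot rows of shape (timesteps, n_chars) back to a string."""
    return ''.join(toy_chars[j] for j in one_hot_rows.argmax(axis=-1))


decoded = [decode(rows) for rows in toy_x]
decoded_targets = [toy_chars[j] for j in toy_y.argmax(axis=-1)]

print(windows, targets)
print(toy_x.shape, toy_y.shape)
print(decoded == windows, decoded_targets == targets)
```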

## Building the network

Our network is a single `LSTM` layer followed by a `Dense` classifier with a softmax over all possible characters. Note, however, that 
recurrent neural networks are not the only way to generate sequence data; 1D convnets have also proven extremely successful at it in 
recent times.

In [18]:
from tensorflow.keras import layers

model = tensorflow.keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))
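
To get a feel for the size of this model, we can count its trainable parameters by hand. This sketch assumes the default Keras `LSTM` formulation (four gate blocks, each with an input kernel, a recurrent kernel and a bias) and the 57-character vocabulary found above. The result should agree with what `model.summary()` reports:

```python
units = 128      # LSTM units, as in the model above
n_chars = 57     # vocabulary size: both the input feature size and the output size

# Each of the 4 gate blocks has an input kernel (n_chars x units),
# a recurrent kernel (units x units) and a bias (units)
lstm_params = 4 * (units * (n_chars + units) + units)

# Dense layer: one weight per (unit, character) pair, plus one bias per character
dense_params = units * n_chars + n_chars

total_params = lstm_params + dense_params
print('LSTM:', lstm_params)
print('Dense:', dense_params)
print('Total:', total_params)
```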

Since our targets are one-hot encoded, we will use `categorical_crossentropy` as the loss to train the model:

In [19]:
optimizer = tensorflow.keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
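
For intuition about what this loss measures, here is a minimal NumPy sketch of categorical crossentropy for one-hot targets. It is not the Keras implementation: the `eps` clipping is an assumption added for numerical safety, and the example probabilities are made up. With one-hot targets, the loss reduces to the negative log of the probability assigned to the correct character:

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample crossentropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


# Two samples over a 3-character vocabulary (illustrative values)
y_true = np.array([[0., 1., 0.],
                   [0., 0., 1.]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)   # a confident correct prediction gives the lower loss
```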

## Training the language model and sampling from it


Given a trained model and a seed text snippet, we generate new text by repeatedly:

* 1) Drawing from the model a probability distribution over the next character given the text available so far
* 2) Reweighting the distribution to a certain "temperature"
* 3) Sampling the next character at random according to the reweighted distribution
* 4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, 
and draw a character index from it (the "sampling function"):

In [20]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
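
To see what the reweighting step does on its own, the sketch below isolates the deterministic part of `sample` in a helper we call `reweight` (a name introduced here for illustration) and applies it to a made-up distribution. Dividing the log-probabilities by the temperature is equivalent to raising each probability to the power `1 / temperature` and renormalizing:

```python
import numpy as np


def reweight(preds, temperature=1.0):
    """Return the temperature-adjusted distribution, without sampling from it."""
    preds = np.asarray(preds, dtype='float64')
    scaled = np.log(preds) / temperature
    exp_preds = np.exp(scaled)
    return exp_preds / np.sum(exp_preds)


# Illustrative distribution over four characters
original = np.array([0.5, 0.3, 0.15, 0.05])

for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(original, temperature), 3))
```

A temperature below 1 concentrates the probability mass on the most likely character, a temperature of 1 leaves the distribution unchanged, and a temperature above 1 flattens it.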


Finally, this is the loop where we repeatedly train the model and generate text. We generate text using a range of different temperatures 
after every epoch. This allows us to see how the generated text evolves as the model starts converging, as well as the impact of 
temperature in the sampling strategy.

In [7]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Train on 200278 samples
--- Generating with seed: "expression, l'art pour l'art, along
with numerous others, ha"
------ temperature: 0.2
expression, l'art pour l'art, along
with numerous others, hand and the sense of the reastion of the man an a man in the detion of the paration of the spiriture of the such the power of the way the sense and the present the strenge of the man and not the spirit of the present and sense of the morality of the man and sense of the spirit of the fals and the section of the person of the the man and the commont the sense of the for the strengent and the present
------ temperature: 0.5
e commont the sense of the for the strengent and the present "child become and the respless and man and streally man, as a suncentiened but the become and the present the reastion of our seives an an one must under of the make for the reself and persome
of deater by the incertained and proper, and who entionation, and the neture of the doman the morality has confiner 

same to intellectually had to solitude doer philosopher for their
exist fis spoul al even of the truthes) sundompand. he endoin acceant: just themward:
   for the master
tipeatively as yell-and have
dream ivelf nonly rest at kind to sciences- cure portantfully man can is, fell into grouts and how it receptianeds, very be danger, me old defeesic--by himself do homes, brubtly hat, a consider are noble
way 
------ temperature: 1.2
-by himself do homes, brubtly hat, a consider are noble
way sectesing.

  
 but timegs, how to which way, willeding somem we would niflicaiste contion. a
wrety assialt:
pertainlyunobour these spepticate
times to "wholle yet discleatsanting as there kreress in obdiginal must would not the
swaing; freedo-lw; for do at is
social suscecuusiot should,  at danger. on a could too commink there not person to anticitation mubjerage in
 strength so
nature aroung pul
epoch 9
Train on 200278 samples
--- Generating with seed: "an almost masculine
stupidity, of which a well-r

ecommended so naively; or the lowering of the emotions to another and the sense of the spirit of the spirit of the sense of the same taste of the such a such as a suffering and soul is a such a such a more and such a properation of the proper of the soul is a such a problem of the sense of the such a such a such a suffering and sufferer and man who has the soul of the soul is as the spirit of the sense of the world in the spirit of the most proper of the s
------ temperature: 0.5
sense of the world in the spirit of the most proper of the sense what is consciment he words and without the reason to aspect to at be even the soul, in something and seeks and of the soul, it is as the human free species of the conscience of the same torre of the side its event and pleted to in the morality as something how a sufferers every free spotis that is the sufficalt and discovered and for the sensuse of the emotions and one with a greates prise i
------ temperature: 1.0
r the sensuse of the emotions 

comming from 
epoch 17
Train on 200278 samples
--- Generating with seed: "er for ourselves!

239. the weaker sex has in no previous ag"
------ temperature: 0.2
er for ourselves!

239. the weaker sex has in no previous again the problem of the present of the strength the same of the sense and the sense of the same that the spirit is a soul of the strength of the same the strength of the same and the same that it is the strength of the strength the far the religious and the superstitious and the sense of the end of the present his soul of the strength there is the respect to the strength and the art of the substora
------ temperature: 0.5
e is the respect to the strength and the art of the substoration and there is in the rests thought that which must be like a strong to incliging the the
some and the religious the consider with the concerning to the strength the strange, or with the relext from the same also to the entire becausfing the south, and for the rests the sort to lighten the a

1oveth us onces! all no discyse a older conceive as something manifictors to the predestructions, we dolms and in which the hourc is thought lied only
could ferrow, as a woves fhere severity lages
hander insirfiertant puritaly, who
hondar
astative,
accounder sor has for it as the immense" is generals; with four oft a man thle interruptious and  spority of one b
------ temperature: 1.2
with four oft a man thle interruptious and  spority of one believe asore however cases, a latter and of nivisroved pryoust hope, his eased mut and
what is warisial
pjessimu of acute , up comparangousness
to 'seals.

28reloks.aping kni(itnyen it maqur") threated by so imperion, but himal-and bringly and all
exa discompares could now but advance and woman: by pessimistic and frant eass than   t must vart, peshylly, inflicter lement regard to the coette orali
epoch 25
Train on 200278 samples
--- Generating with seed: " on a sudden, the entire movement of the world
stopped short"
------ temperature: 0.2
 on a

intellectual men of this age, which is swept more and more all the sense of the most and discovered to the same that he must be all the self-exception of the singular and discovered to the self-as the most soul of the most success of the senses of the souls of the most man is all the higher and the self-all the sense of the world and self-and the self-as a man and the fatherly and say that is a soul and the subtless" and the senses of the senses of the 
------ temperature: 0.5
 soul and the subtless" and the senses of the senses of the sense of its account in the soul also the predicate of life. he is not it is all the the conduct that is all the world is all the order to be to the individual intermol the expression of the problem that is not the spirit will be his sun and remarer and mistaken, the dealist which is also the other will for the serious more which is a man who had to one's self-all the truth of which
the man which 
------ temperature: 1.0
 who had to one's self-all the tr

season"; the really have perhaps the books; this matters of a pate-"thought!--but
it was a thengl".
gon itself
------ temperature: 1.2
atters of a pate-"thought!--but
it was a thengl".
gon itself too wrety edutcess. the heights with symphization is evil of of men: festes" a
pal perpalusal itself--and
                  thoreuluporable idea of the
othe persons in the do is only far, that with the cause is lived cat greed: just hitherto been
these
noble, however, a gods.thas assures, fortuned at all idea way of the unites
of the stregare their own
new ends, not-be tornunted is very nemility

epoch 40
Train on 200278 samples
--- Generating with seed: "unchangeable "i am this"; a thinker cannot learn anew about "
------ temperature: 0.2
unchangeable "i am this"; a thinker cannot learn anew about the soul of the strong of the sense of the same the strong the strong to the substrument and the strength, and the same the strong the strong to the sense of the strength, and the strong the state o

on of the sense of the most conscience of the present the stone of the same tranged and fear of consequently is the proper of
the fact that they has happens in mystelt, to the instincts to the case is at the fact to the sense of the complication of contempt a recipore and the contrary something which has no one has no desire and intelligen to german sacrificting to maning the complice in the sympathy, man who can no longer and to look only the regard to th
------ temperature: 1.0
thy, man who can no longer and to look only the regard to the correspons! how heirt a souls and herger, if i schildesized
yet we have non every to taltrites
of different of the "lains heegfunt self
contemplation of hard strangeness, youther been our graginates the belief that seemary has been the facta
obacable begins to peasants,ly kinds, themselves had god that it is defited of his spirit in -it same "in more languan poss evener in well, so much a more a
------ temperature: 1.2
 same "in more languan poss ev

unformarred to  scarr. "but soundeding, kept the masterful as to wing-byles no i"gnixity-like particl
restror
treates, feels is poine of aklids at crevivilsm if spi)ing ristopce and . one
language is it: so be
epoch 55
Train on 200278 samples
--- Generating with seed: "we have always fools and appearances against us!

227. hones"
------ temperature: 0.2
we have always fools and appearances against us!

227. hones are as and and because the consequently and in the senses of the same taste, and all the strength in the same time the strength in the most sense of the same the most successful and such a sounds of the senses of the same time of the same the best and the proper the same to be self-and and the same time and the same to any discovered and conscience of the spirit and stronger of the same and consc
------ temperature: 0.5
 conscience of the spirit and stronger of the same and conscience of all an artist--than to the "estimates of an inversable profound religious and merely the s


As you can see, a low temperature results in extremely repetitive and predictable text, but where local structure is highly realistic: in 
particular, most words (a word being a local pattern of characters) are real English words. With higher temperatures, the generated text 
becomes more interesting, surprising, even creative; it may sometimes invent completely new words that sound somewhat plausible (such as 
"eterned" or "troveration"). With the highest temperatures, the local structure starts breaking down and most words look like semi-random 
strings of characters. In this specific setup, 0.5 arguably gives the most interesting generated text. Always experiment 
with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and 
realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is 
sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is 
a distinction between what communications are about, and the statistical structure of the messages in which communications are encoded. To 
illustrate this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like 
our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic 
statistical structure, thus making it impossible to learn a language model like we just did.
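
The thought experiment can be illustrated roughly with a general-purpose compressor. The sketch below is only an illustration under stated assumptions: the corpus is a hypothetical stand-in built from random words drawn from a small made-up vocabulary, and single-symbol Shannon entropy is a crude proxy for "statistical structure". It compares the entropy per symbol of the plain text with that of its `zlib`-compressed bytes:

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


# Hypothetical stand-in corpus: random words from a small vocabulary
rng = random.Random(0)
vocabulary = ['the', 'spirit', 'of', 'man', 'and', 'sense', 'soul', 'is',
              'a', 'strength', 'morality', 'world', 'truth', 'will', 'to', 'power']
plain = ' '.join(rng.choices(vocabulary, k=5000)).encode('ascii')
compressed = zlib.compress(plain, 9)

plain_entropy = entropy_bits(plain)
compressed_entropy = entropy_bits(compressed)

print('plain:      %d bytes, %.2f bits/byte' % (len(plain), plain_entropy))
print('compressed: %d bytes, %.2f bits/byte' % (len(compressed), compressed_entropy))
```

The compressed stream carries exactly the same message (it decompresses back to the original), yet its bytes look much closer to uniform noise, leaving far less regularity for a next-symbol model to exploit.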


## Takeaways

* We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
* In the case of text, such a model is called a "language model" and could be based on either words or characters.
* Sampling the next token requires a balance between adhering to what the model judges likely, and introducing randomness.
* One way to handle this is the notion of _softmax temperature_. Always experiment with different temperatures to find the "right" one.