In [1]:
import keras
keras.__version__

Using TensorFlow backend.


'2.2.4'

# Text generation with LSTM

This notebook contains the code samples found in Chapter 8, Section 1 of [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text features far more content, in particular further explanations and figures: in this notebook, you will only find source code and related comments.

----

[...]

## Implementing character-level LSTM text generation


Let's put these ideas in practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a 
language model. You could use any sufficiently large text file or set of text files -- Wikipedia, the Lord of the Rings, etc. In this 
example we will use some of the writings of Nietzsche, the late-19th century German philosopher (translated to English). The language model 
we will learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the 
English language.

## Preparing the data

Let's start by downloading the corpus and converting it to lowercase:

In [2]:
import keras
import numpy as np

path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('Corpus length:', len(text))

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt
Corpus length: 600901



Next, we will extract partially-overlapping sequences of length `maxlen`, one-hot encode them and pack them in a 3D Numpy array `x` of 
shape `(sequences, maxlen, unique_characters)`. Simultaneously, we prepare an array `y` containing the corresponding targets: the one-hot 
encoded characters that come right after each extracted sequence.

In [3]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = {char: i for i, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# The builtin `bool` is used because the `np.bool` alias was removed in recent NumPy releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 200281
Unique characters: 59
Vectorization...
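
As a sanity check on the windowing and one-hot encoding described above, here is a minimal sketch that runs the same logic on a tiny string. The names `toy_text`, `toy_maxlen`, `toy_step` and the other `toy_*` variables are made up for this illustration and are not part of the original notebook.

```python
import numpy as np

# Hypothetical stand-in for the corpus, small enough to inspect by hand
toy_text = "hello world"
toy_maxlen = 4   # window length
toy_step = 3     # stride between consecutive windows

windows = []
targets = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i: i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

toy_chars = sorted(set(toy_text))
toy_index = {c: i for i, c in enumerate(toy_chars)}

# One-hot encode: one row per window, one column per unique character
toy_x = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, c in enumerate(window):
        toy_x[i, t, toy_index[c]] = True
    toy_y[i, toy_index[targets[i]]] = True

# Decoding with argmax should recover the original windows
decoded = ["".join(toy_chars[k] for k in row.argmax(axis=-1)) for row in toy_x]

print(windows, targets)
print(toy_x.shape, toy_y.shape)
print(decoded == windows)
```

Because consecutive windows start `toy_step` characters apart and are `toy_maxlen` characters long, they overlap by one character here. With `maxlen = 60` and `step = 3`, the real sequences overlap by 57 characters.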


## Building the network

Our network is a single `LSTM` layer followed by a `Dense` classifier and softmax over all possible characters. Note, however, that 
recurrent neural networks are not the only way to do sequence data generation; 1D convnets have also proven extremely successful at it in 
recent times.

In [4]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

Since our targets are one-hot encoded, we will use `categorical_crossentropy` as the loss to train the model:

In [5]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

## Training the language model and sampling from it


Given a trained model and a seed text snippet, we generate new text by repeatedly:

* 1) Drawing from the model a probability distribution over the next character given the text available so far
* 2) Reweighting the distribution to a certain "temperature"
* 3) Sampling the next character at random according to the reweighted distribution
* 4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, 
and draw a character index from it (the "sampling function"):

In [6]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
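
To make the effect of temperature concrete, the sketch below applies the same reweighting as `sample` (without the random draw) to a made-up four-way distribution. Since `exp(log(p) / T)` equals `p ** (1 / T)`, a temperature below 1 sharpens the distribution and a temperature above 1 flattens it. The helper name `reweight` and the values in `toy_preds` are assumptions introduced for illustration only.

```python
import numpy as np

def reweight(preds, temperature=1.0):
    """Return the temperature-reweighted distribution (no sampling)."""
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

# Made-up distribution over four characters
toy_preds = [0.5, 0.3, 0.15, 0.05]

for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

At temperature 1.0 the distribution is returned unchanged. At 0.2 almost all of the probability mass moves to the most likely character, which is consistent with the repetitive low-temperature samples.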


Finally, this is the loop where we repeatedly train and generate text. We start generating text using a range of different temperatures 
after every epoch. This allows us to see how the generated text evolves as the model starts converging, as well as the impact of 
temperature in the sampling strategy.

In [7]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: "m the
non-reasonable, the animate from the inanimate, the lo"
------ temperature: 0.2
m the
non-reasonable, the animate from the inanimate, the longer of the spects of the spects of the is the stand the spects of a the is the spection of the spects of the forment of the enders of the form of the more and in the estill in the emes of the spects of the for of the state the endaring the extent the eartion of the indificed the dome the who he selflity of the every the indifection of the destinct the spects the artion of the enders of the conces
------ temperature: 0.5
e destinct the spects the artion of the enders of the concess of a more and self of exterching person and the dispression of the indifience of the in it is the be must the dest no lof the world to the more in the tood in the form the enderstical the stand the and of when the emes in the action of the are courte that a more and the most the souls, and in spence the the fat the are f

pioulls love, philoison in the godr isiame, will beyever
time with of intelatist--as mqure if natural taste ouths and
deny in once let loves greak." hecervent feeling anatur, "kind innigh vent, t and than, mading, ounted
too being-"ner
it?
h1me woman is--this bad intelle., or gond grom
the ready
hagable--"what it, he mankwer and sciendien
bung than
ileding o
epoch 5
Epoch 1/1
--- Generating with seed: "with
his situation. the slave has an unfavourable eye for th"
------ temperature: 0.2
with
his situation. the slave has an unfavourable eye for the fact that is the superiosistic the precisely and something the such a consequently the such a such a still as the problem the same and the same and the spirit to the farr that it is the soul and a sacrifice of the suppose the superiosing the precisely and the same and and the such a still the sense that it is the problem that is a precisely the such a sacrifice of the stand to the spirit and the
------ temperature: 0.5
sely the such a sacrifi

too unkarly to one of accepter himself all there thorking pardens, here
implont must besoil without the nibul
to peops, othing betlusting in the fordtht him to obsciect
contrarism, repoasian arrsquition of only o
------ temperature: 1.2
 him to obsciect
contrarism, repoasian arrsquition of only on the usance, the innerstmuolma" while "transuality on the unsurprepten), in order fhis or
utsod metrulibilines of thing that the feelity of the most place-one and
noble after. yougnd to with thing as
own
immest bener
leaks to the lives successponow--suchest about cerere,
now proit a feither, and there oneirish is, aoriers" (were is no borarily morred he laters, extitugowsroou surringation:
self s
epoch 9
Epoch 1/1
--- Generating with seed: " the temptation to become a dilettante, a millepede, a
mille"
------ temperature: 0.2
 the temptation to become a dilettante, a millepede, a
millection of the same the perventary for the such a sure some only of the same to the same and the same and the pre



the most constandish to be something of the most 
------ temperature: 0.5
d sublimes the most constandish to be something of the most desire that the new serious commands in men the free spirits of the extent be soul, and the first of her here about the conscience of the conceals into the promptation of once in the pathomes to secret former more far have been short, and the constant and ophals of the world and frantives terman descions the world from the shines of every whole is the state of the human scholard and a spring of th
------ temperature: 1.0
 whole is the state of the human scholard and a spring of thinghing thought, so desire attenting wepthoragy, in the spiriti
they, couraged
rolenters:"--the
old conceaved the
quatition than the pertations and on their fatheration: it is naturation already
instremints well ass resured, the
bricaten oy of its the most nature,
who
unscaid invert aunde, not the scavely. butly discovered, the wantrudhed
too helgveral
comcontemptainection is, i

sublimates itself in his case into circumspection and super--who has to the same to the same to the same and super--what is an and self-conscience of the sense of the sense of the same to service to be a sense of the sense that he is an and still and secure of the same to the sense of the sense of the same to the most desire that it was the super-entaculable and sense of the same present distrust and service to the sense of the same time that the sup
------ temperature: 0.5
trust and service to the sense of the same time that the super--in our opposite of the superyians is he even as the antimery severe that is actionation and the nations of an and seriousness of the most same and individual and care subternally concerning the master from and stupidities and made of the fear to pre.ute of conscience and ears has in europe of the hesitarily as the arting may be a sense of man and such
say and and self-reason and that a sense o
------ temperature: 1.0
e of man and such
say and and self-r

of emplines europeris of usifice ganger deceptionven-agation of
understo with they have nother? of conception and sensecy of their high it," essest our
judga
epoch 19
Epoch 1/1
--- Generating with seed: "for a phantom of him; he wishes first to be thoroughly, inde"
------ temperature: 0.2
for a phantom of him; he wishes first to be thoroughly, indeed, the sacrifice the science, the serious in the spirit of the spirit is the same time the morality and successful man the sought to be such a successical spirit and the sense of the supporing the morality and the sentiments of the sense of the same time the words of the fundamental process of the sense of the same time the conscience of the standed to mankind and such a supporing the sense of th
------ temperature: 0.5
 the standed to mankind and such a supporing the sense of the strong a philosophines should we has been the opposite that the commanding with the end a personality, and the most cause of our such the "good" in which more oppo

sper keptheratuque new renocsious were apyrt from a errob.
the comever inhuman as yet could luin
genration, he look of delunish, easelegers, begedation to bursful profrilty in-it ort? it oe liyws--dolusivangentty; this we innotive for il.ificals thousan
on the stivele maintaic dillew no agreent: thingls, anything wand with that
row utsent if the bland than it robvered,
should say mepour, (that
soulrs of which u
epoch 23
Epoch 1/1
--- Generating with seed: "heses--as
happens frequently to the clumsiness of naturalist"
------ temperature: 0.2
heses--as
happens frequently to the clumsiness of naturalistic and so much and the seems to the reasonable to the soul and the more the sense of the presence of the world of the soul and the suffers of the most conscience of the soul and his sense of the most consequently the consequently to the sense of the state of the world and the art of the soul and the consequently the consequently and the soul and the whole seems to the advance of the soul an

doestenseved after animaly; the stakes the pternty, as slamency, and promply to aliforiesroped of bes longer matcrecto the quite law
causance maname.


1eeeeteticartion
and
most hallow which
has neverthives which does it never serves to party to the instinces,
------ temperature: 1.2
thives which does it never serves to party to the instinces,.ster
little some up
in compared, even if partly by the streut maintekinge from predestly amies (frod does that
in signs of life
for threatests could edito
from thie
means
exermxar of thereby he civility too halcwit being an
own ut to what
manverial a human us in the vitility who would a cultiess," as learnis. everelit must hyse so.--his it, hey all
embperswarly to best compising on its worr godd 
epoch 27
Epoch 1/1
--- Generating with seed: "nity, it is
thought, acquires an ever deeper draught the mor"
------ temperature: 0.2
nity, it is
thought, acquires an ever deeper draught the more and sublimated the stand the end the same time the state of 

consideration of their influence of science of the fund it--is the longing as intelligence to him as the same the familations,
------ temperature: 1.0
 longing as intelligence to him as the same the familations, it (with whem has its
put its real lungded and amplat
and flawh suffering philosophee, enclanated
in ordinary interemony; paess them finally so
despulation, i think that privileation of is only or, that it is its thenking in the world of prophilosition of these
long "necessasing but
nowadays and recovers of a distortimates: "plays ethine this individual, of
logible appeal retantations ipkin bedrf
------ temperature: 1.2
 this individual, of
logible appeal retantations ipkin bedrfoved thothpebleritable matur halt that feeling assuccation--it be just as
definite, he
wire new obscencess accect
if it will so masum
in
my
tirlces
and aviing of the soul; has and of which these by avery heart
and thinkers that its "god:--all, him. in
the fundame perhaps cors necessary.
but all perhaps 

tatising the strength of the spirit of the fact of the constantly only in their palomes what has pleasure, in the will in the love of the band himself always here
spirit of his was much of sublimitation of moments of the opposition of the truth to the spirit of the head of the suddenly always as false of their will, the antimeral. the present as a things with opposions, or of all philosophers. the present as only respect to be into the spirit of all the "m
------ temperature: 1.0
 present as only respect to be into the spirit of all the "midmality
connected itself. alw of freedom that a god grampgity; this he was it met in
his aigiling more nihpoations, of ornes and sacrich where until secret, own retrose. it allow
retart in
the reshologors of libently from such as alreluded orvations, only impertainss, example ought--but the worine
without which are long will, allsment brough to the truced toare!

ye it stands
to its whoked with i
------ temperature: 1.2
ough to the truced toare!

ye 

lordly prerogatives and lowered itself to a function of the sense of the sense of the sense of the profound spiritual to the same time of the profound something that is a most condition of the church and superficiality of the spirit of the profound and self-eared the artists of the profound one man who has been and spiritual the bruth to the spiritual the art of the brutation of the art of the bruth to the profound something and most proposition and
------ temperature: 0.5
the bruth to the profound something and most proposition and sufficient and reflect of the clearly propaces in the strangence to the whole proposition and also that the thought has well something an attound and clearness and franics, as he is nothing of the worsen and have many noble been the saint, and the profound stand to many of the spirit of the religion and the most sense. the worsen and the man for a supporated and against the whole goet with the ar
------ temperature: 1.0
 man for a supporated and against the

the fyste of intaisn overinguty of that act and pryoriany: our thriffer frim and rewars only with ulver amony in moral
epoch 42
Epoch 1/1
--- Generating with seed: "
"wandering jew",--and one should certainly take account of "
------ temperature: 0.2

"wandering jew",--and one should certainly take account of the basis of the other spirit and so that it is the subtle process of the sense of the spirit that the suddenly and sort of the whole the consequences, who is the sense of the fact of the same time of the state of the fact of the brought to the morality to the artists of the same the world of the fact that it is still any person and self-considered the fact that it is the desire to be discovered t
------ temperature: 0.5
considered the fact that it is the desire to be discovered the referent circumstance to the whole of our sense of the
an event of the holy reality. and even the shame to all man and soling and species to make the soul and contradict and seeming in the thrines the 

(the tepared to the donion here contable
partial-exacted, every
their formerly--and sometime appresery the more art sfoethed that
it on rank upde, grast? it were theou: beauty:--uncer wholvert" authory apart; diseit with relacly dangers but when he has
conturnit genuine of influence, ones order whon sec sutra--virty who rity.. that
gives vents polted
tedegilitody of interestjma simvatcion hiel-xwhired, erbaidimerazed head
incolite a
epoch 46
Epoch 1/1
--- Generating with seed: "at present--this process
will probably arrive at results on "
------ temperature: 0.2
at present--this process
will probably arrive at results on the spirits of the spirit of the fact of the spirits of the spect, the super-past as a society and sacrifice of the strength of the spirits of the spirit of the spirits of the spirits of the spirits of the spirits of the since still all the spect, as a most contrave the super-past a stronger and such a man as a more as the spirits, and such a sense of the species of th

new what: they are that the into the instinct
egoist, "it is also wiat, the
most be fordguint is non--for conducteus, and all the plant spire simily be be baddom in being art beety reverent or ant clearne. who is that
ccare of cases, and there sourable unviewitual too questions--and humble; it is lo
------ temperature: 1.2
ere sourable unviewitual too questions--and humble; it is look when all
must belong
to philosopher in it of
any people--of the occasionge and will, with
more 
          feeling, and a
mapqurio as
at learst customed define the wortrority of an ayductoriwes
multive frows themselves knew it. aid amking doing in
organs another. but
chance: for exhavo: he is then proposition. a masce about i.

    whaked as knowp"--when one rewardness.
     tryfter have? and he
epoch 50
Epoch 1/1
--- Generating with seed: "ow to limitless space, where they are destitute of
meaning, "
------ temperature: 0.2
ow to limitless space, where they are destitute of
meaning, and also the entire be

e spirit of the spirit of the profutter the more and allow the period as insignation, the origin of the pervioteless something another still vigivitional condition of the spirit of life. in fact of the fundamental an womanless of men that it is a signs of the spirit is a interpreditional and classively here, in the fundamental definition of the provertion, and the distrust of view thereof--but who have been problem of being
conceined and strength and stren
------ temperature: 1.0
 have been problem of being
conceined and strength and strength and love
in such reld restymenting sense among overloued, histible from all helpfuently adbluate evoluts thee, and nature is sensitional, and the horts, upon a fact could still have unto the whate
also richars, aes, sympathy of
let eternally agreeable of a man: bodeless. abstatate of the spirit of
insimficiin
frogitlestly a for that the unuarly befolcy and prorancies. thein instinct, to are
so
------ temperature: 1.2
he unuarly befolcy and proranc

totally lacking the art of psychological analysis and self-contention of the strongent that the self-constant upon the superiority of the same time and also the same profound power of the strongent that is a betrading of the sense of the state to the subjection to the subjection of the common and constitu: if the self-constitu. the strongent that is a man as a sacrifice of the present self-constantly to the constitu: to be a delight of the constan
------ temperature: 0.5
f-constantly to the constitu: to be a delight of the constant of the spirit in the comparison of the destruction; an allow how to an end usefulness, and most sense of the subject is consequences of the same time when the indifferent gradually and seriousness of the sign was strongence of
the spirit of loves and secrecy of the common
metaphysical and nations of the surtiminal as "the english
and self-contentions. the soul and detter freedom, and of things a
------ temperature: 1.0
lf-contentions. the soul and detter fre


As you can see, a low temperature results in extremely repetitive and predictable text, but where local structure is highly realistic: in 
particular, nearly all words (a word being a local pattern of characters) are real English words. With higher temperatures, the generated text 
becomes more interesting, surprising, even creative; it may sometimes invent completely new words that sound somewhat plausible (such as 
"eterned" or "troveration"). With the highest temperatures, the local structure starts breaking down and most words look like semi-random strings 
of characters. In this specific setup, 0.5 is arguably the most interesting temperature for text generation. Always experiment 
with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and 
realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is 
sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is 
a distinction between what communications are about, and the statistical structure of the messages in which communications are encoded. To 
illustrate this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like 
our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic 
statistical structure, thus making it impossible to learn a language model as we just did.
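
The thought experiment can be approximated with a general-purpose compressor. The sketch below builds a synthetic "corpus" from a small, made-up vocabulary, compresses it with `zlib`, and compares the empirical byte-level entropy before and after. The vocabulary, the helper `byte_entropy`, and the use of byte frequencies as a crude proxy for "statistical structure" are all assumptions made for illustration. Note that `zlib` is far from an ideal compressor.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data):
    """Empirical Shannon entropy of a byte string, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Made-up vocabulary; words are drawn uniformly at random with a fixed seed
vocab = ["the", "spirit", "of", "sense", "same", "and",
         "morality", "soul", "world", "is", "a", "to"]
rng = random.Random(0)
words = [rng.choice(vocab) for _ in range(20000)]

raw = " ".join(words).encode("ascii")
compressed = zlib.compress(raw, 9)

print("raw bytes:", len(raw), "entropy:", round(byte_entropy(raw), 2))
print("compressed bytes:", len(compressed),
      "entropy:", round(byte_entropy(compressed), 2))
```

The compressed stream carries the same information (it decompresses back to `raw`), yet its bytes are distributed much more uniformly. A character-level model would therefore find far less regularity to learn from it.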


## Takeaways

* We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
* In the case of text, such a model is called a "language model" and could be based on either words or characters.
* Sampling the next token requires a balance between adhering to what the model judges likely, and introducing randomness.
* One way to handle this is the notion of _softmax temperature_. Always experiment with different temperatures to find the "right" one.