In [1]:
import keras
keras.__version__

Using TensorFlow backend.


'2.2.0'

# Text generation with LSTM

This notebook contains the code samples found in Chapter 8, Section 1 of [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text features far more content, in particular further explanations and figures: in this notebook, you will only find source code and related comments.

----

[...]

## Implementing character-level LSTM text generation


Let's put these ideas in practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a 
language model. You could use any sufficiently large text file or set of text files -- Wikipedia, the Lord of the Rings, etc. In this 
example we will use some of the writings of Nietzsche, the late-19th century German philosopher (translated to English). The language model 
we will learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the 
English language.

## Preparing the data

Let's start by downloading the corpus and converting it to lowercase:

In [3]:
import keras
import numpy as np
import codecs

path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
# Use a context manager so the file handle is closed after reading
with codecs.open(path, "r", "utf-8") as f:
    text = f.read().lower()
print('Corpus length:', len(text))

Corpus length: 600893



Next, we will extract partially-overlapping sequences of length `maxlen`, one-hot encode them and pack them in a 3D Numpy array `x` of 
shape `(sequences, maxlen, unique_characters)`. Simultaneously, we prepare an array `y` containing the corresponding targets: the one-hot 
encoded characters that come right after each extracted sequence.

In [4]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(set(text))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = {char: i for i, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# Use the builtin `bool`: the `np.bool` alias was removed in recent NumPy releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 200278
Unique characters: 57
Vectorization...
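
To make the windowing and one-hot encoding easier to follow, here is a minimal sketch of the same procedure on a toy string. The `toy_*` names and the `'hello world'` corpus are assumptions made for illustration only; they are not part of the original notebook. The last line applies the same counting rule to the corpus length reported above.

```python
import numpy as np

# Hypothetical toy corpus and settings, chosen only for illustration
toy_text = 'hello world'
toy_maxlen = 4
toy_step = 3

# Slide a window of `toy_maxlen` characters, moving `toy_step` characters at a time
toy_sentences = []
toy_next_chars = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    toy_sentences.append(toy_text[i: i + toy_maxlen])
    toy_next_chars.append(toy_text[i + toy_maxlen])

toy_chars = sorted(set(toy_text))
toy_char_indices = {char: i for i, char in enumerate(toy_chars)}

# One-hot encode inputs (3D) and targets (2D)
toy_x = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        toy_x[i, t, toy_char_indices[char]] = True
    toy_y[i, toy_char_indices[toy_next_chars[i]]] = True

print(toy_sentences, toy_next_chars)
print(toy_x.shape, toy_y.shape)

# Same counting rule applied to the real corpus: length 600893, maxlen 60, step 3
n_real_sequences = len(range(0, 600893 - 60, 3))
print(n_real_sequences)
```

Each timestep of `toy_x` has exactly one active entry, and each row of `toy_y` marks the single character that follows the window.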


## Building the network

Our network is a single `LSTM` layer followed by a `Dense` classifier and softmax over all possible characters. But let us note that 
recurrent neural networks are not the only way to do sequence data generation; 1D convnets also have proven extremely successful at it in 
recent times.

In [5]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))
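
As a rough sanity check on the size of this network, the sketch below counts its trainable weights by hand. It assumes the standard LSTM formulation with four gates and bias terms, and the 57 unique characters reported above; the helper names are hypothetical. If these assumptions hold, the total should agree with what `model.summary()` reports.

```python
def lstm_param_count(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel and a bias
    return 4 * ((input_dim + units) * units + units)


def dense_param_count(input_dim, units):
    # One weight matrix plus one bias vector
    return input_dim * units + units


n_chars = 57      # unique characters reported above
lstm_units = 128  # size of the LSTM layer in the model

lstm_params = lstm_param_count(n_chars, lstm_units)
dense_params = dense_param_count(lstm_units, n_chars)
total_params = lstm_params + dense_params

print('LSTM parameters:', lstm_params)
print('Dense parameters:', dense_params)
print('Total parameters:', total_params)
```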

Since our targets are one-hot encoded, we will use `categorical_crossentropy` as the loss to train the model:

In [6]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
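
To give a feel for what this loss measures, here is a small NumPy sketch of categorical crossentropy for one-hot targets. It is an illustration written for this note, not the Keras implementation; the function name, the `eps` clipping value, and the example arrays are assumptions. With one-hot targets the loss reduces to the negative log of the probability assigned to the correct character, so a model that guesses uniformly over 57 characters scores `log(57)`.

```python
import numpy as np


def categorical_crossentropy_np(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); `eps` is an illustrative choice
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


# Hypothetical 3-character vocabulary: the true next character is index 2
y_true_example = np.array([[0., 0., 1.]])
y_pred_example = np.array([[0.1, 0.2, 0.7]])
example_loss = categorical_crossentropy_np(y_true_example, y_pred_example)
print('Loss for the example:', example_loss[0])

# Baseline: uniform predictions over 57 characters
uniform_pred = np.full((1, 57), 1.0 / 57)
uniform_true = np.zeros((1, 57))
uniform_true[0, 0] = 1.0
uniform_loss = categorical_crossentropy_np(uniform_true, uniform_pred)
print('Loss for a uniform guess:', uniform_loss[0])
```

A training loss well below the uniform baseline suggests the model has picked up some of the character statistics.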

## Training the language model and sampling from it


Given a trained model and a seed text snippet, we generate new text by repeatedly:

* 1) Drawing from the model a probability distribution over the next character given the text available so far
* 2) Reweighting the distribution to a certain "temperature"
* 3) Sampling the next character at random according to the reweighted distribution
* 4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, 
and draw a character index from it (the "sampling function"):

In [7]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
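
To see what the reweighting step does on its own, the sketch below applies the same log, divide, exponentiate, normalize sequence to a small made-up distribution without sampling from it. The helper name `reweight_distribution` and the example probabilities are assumptions for illustration. Temperatures below 1 should concentrate mass on the most likely entry, while temperatures above 1 should flatten the distribution.

```python
import numpy as np


def reweight_distribution(preds, temperature=1.0):
    # Same reweighting as in `sample`, but returning the distribution itself
    preds = np.asarray(preds, dtype='float64')
    scaled = np.log(preds) / temperature
    exp_scaled = np.exp(scaled)
    return exp_scaled / np.sum(exp_scaled)


# Hypothetical distribution over four characters
original = np.array([0.5, 0.3, 0.15, 0.05])

reweighted = {}
for temperature in [0.2, 0.5, 1.0, 1.2]:
    reweighted[temperature] = reweight_distribution(original, temperature)
    print(temperature, np.round(reweighted[temperature], 3))
```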


Finally, this is the loop where we repeatedly train the model and generate text. We start generating text using a range of different temperatures 
after every epoch. This allows us to see how the generated text evolves as the model starts converging, as well as the impact of 
temperature in the sampling strategy.

In [8]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: "t only
permitted to make this attempt, it is commanded by th"
------ temperature: 0.2
t only
permitted to make this attempt, it is commanded by the a consequence of the the greater and sense of the great the the consequence of the conscious and subjection of the the intermate of the morality there when which the great the the sense of the the serrity of the histerment the reader of him of the conscience of the besting the there is the interpater and soul, the the regarded the great the properved and sense of the consequent and the are of th
------ temperature: 0.5
 the properved and sense of the consequent and the are of the dengance of the presence of the great one of the pleasure of serrate, there senses of the serration ensens and in the beception of one's learnt the in the read them there
is good one and him estence of unlow to
stould all one frot are of serfing exception of the high ent of a reader of the dount that german and one has e

to sinklof menricly, as he seconbavoused sary, morals.
      kaptyy
servarl of valuatitad susts enjoyselner, of in alloract,ly fundgenessely,
that digdar"n
seek is ay no livisagary.=--an to-ttrud. could indowfe, with gultidly redulty has, burder are taste) "to orial" mumble exparliser
ideased fren
thest, or "goy in their harby past are voncew?
epoch 5
Epoch 1/1
--- Generating with seed: "eneration, which has inherited as it were different standard"
------ temperature: 0.2
eneration, which has inherited as it were different standard the sense of the stronger the sensition of the same the spirit of the spirit of the promise the belief in the are and and the sensition of the considered to a states of the stronger to language and all the present to the sense of the spirit the spirit and the feeling that the sense of the sense of the spirit in the sense of the success of the morality of the profore and success of the prover the 
------ temperature: 0.5
f the morality of the profore and succ

encomely, and finds wisled--and be concealed teacing origin, and recall under their  uss an enough and sate in the -flaity to a loug out our ersish human liess of period heaving and philosophical metaphysical man one, being itself.

entificulaten of to confeted to all
perhaps accumured itself that is self-aristof with answardmed and joy meties as
couming the conmust learned.


------ temperature: 1.2
 answardmed and joy meties as
couming the conmust learned.


1ihe him, too antair
noting now
still the suppocate
of a jest to's life; first n
which me happiness a ma law with the undenedent to his treevat viratinateons reputumsely away to ony vie
avoughs but
exonect,


uticlos oneuthable fins dsations clears, mediorkations)
fore, fragrrund hen--a hat affords dispoces. one been rety, (are degreemsely
gutter and cails--every confairy to the disful wonsens
sy
epoch 9
Epoch 1/1
--- Generating with seed: "mething dreadful also.

90. heavy, melancholy men turn light"
------ temperature: 0.2
meth

sentiments and finds as an accustomed as consequence is something of the earthled been the conditioned and state of the sense of over
consequence of the sense of the later, and the conscience of its politice is a desentions therefore an englished tha
------ temperature: 1.0
e of its politice is a desentions therefore an englished than poosises. how timent. let us nothing,--in his riched ials, they are really
sict and bey and hitherto
possernce is, that greater with traienance of sin into
me, as in this delicate fore; and he tale, how make certainly phenomenre is nom all thereoular one should otherw, understined and witny? achied siteds
it is cra"unce sonners for those and undivide, ontestrures and
growherers: threedism. may i
------ temperature: 1.2
e and undivide, ontestrures and
growherers: threedism. may it ishes no sweal-indirined moral rekgode origin, eay abifedf, in they curalting. 
  encessenifiincess and knew fore, no decided no slang, 


1treat




a "nething
satisfifity emideser--refenschings in
truthqued" loved, long
upmoratic which aushow hartuin." of such only
if the self supposers. ther ourselvess a flein-sccting assumen useful natural;
they so that
ogrindance,
the colonc therewind
individua
epoch 13
Epoch 1/1
--- Generating with seed: "new
individual an unobjectionable opportunity for a new poss"
------ temperature: 0.2
new
individual an unobjectionable opportunity for a new possibilitic and the germans and super--as the consideration of the same the sense of the sense of the sense of the sense of the consideration of the will to the spirit and problem of the same many and problem and the sense of the same time that is not to be a morality of the same still and sound of the sense of the same time the sense of the same time the same the general and standed the sense of the
------ temperature: 0.5
 same time the same the general and standed the sense of the definite with a
characterism, not the and the same matter, in the fac

change, the lolower. but whoever not be the frectisving and world, unhapse its rexpect it. if the cases to blands, lickwish, believed, with man, how read has the stor master, amulal: the here--break for institutient.--him, and the general b
------ temperature: 1.2
l: the here--break for institutient.--him, and the general been sense": its way indiaved, or compariagious induld which rangs of sapisive
learned to 's philosofhalous partious firgs is do
pleenced to the hesure, dangers mlood which from those does
to temiragt ought
timility
new inays without disbady: the awher oraire closite and of what the schopenhauer, knowlrow to oneself what away an.. chad "germed apart
equal consequentlyd expeciatfund sensible us some
epoch 17
Epoch 1/1
--- Generating with seed: "ntempt: for, as regards his "freedom,"
thereby hangs a tale."
------ temperature: 0.2
ntempt: for, as regards his "freedom,"
thereby hangs a tale. the self-active and the present self-experience of the self-work of the same the 

individual and man who has a man of which it is a man in the sense of the fact to it would be religiously for the ferring the one whe
------ temperature: 1.0
 fact to it would be religiously for the ferring the one when he is even, and begone of yet judgm. it general, epraisation of the purests have do any time will wacteness, but at one longly its spiritw of this wild," is
even what they not would be arch bold suncome of putofulin, these for well--theyelves--fapould a stroke,-but of his
grand, a religioe that is arts to effect, to the treacing springed of alt) bre
knowan origin. ashide his vilthely, and is any
------ temperature: 1.2
d of alt) bre
knowan origin. ashide his vilthely, and is any favuldop
follichd, "clums! if just taste of disintated the itselves--in the
phince, the grants young, yet into into him according to heate bound, she nature of prty shroht the intiohs upnaty, thinked in here--they ,hous have understentry itself, is tranyon
it for niblessow whate and has not ad"w

nse of the same sure of the sure, the sure, and the strong and at specially we the encount so not be gare the such a desire in a superior, but and be the same bad conscience, and there is survived
there the men and brought and bructry has great make a god. they are and can be developed and very destiny in the way in the enemy the devolsitic
continually with a philosophy in the enough the terms
there is a definite to have a soul under the conduct of the des
------ temperature: 1.0
re is a definite to have a soul under the conduct of the desired":
"not there
at an admit upchrious, now cannot
skint to whom
origins, as the summoly for a young away in detired--evident to instincture there an expection in "her, prompatur 
and suffer wimble,
in "tobe man--frue sensual soul. a
prices, for a man shall
make such a aget all fastitive crupitates man--usubles are other sherlivents, we would found with a way, period defy
oppear cangofering for l
------ temperature: 1.2
would found with a way, period

have to extol his course of the senses the stronger the conscience of the senses the senses the subtle and the senses in the senses to which the senses to the same and the conscious and the fact of the faith in the same and the senses to consequently see the consequence of the senses to the senses to the sense of the conviction of the conflictions and and the confounds to the senses to the conscious and the conscience 
------ temperature: 0.5
confounds to the senses to the conscious and the conscience of the madness of science of the facts and soul--and what is at all the same all the
learned to entirely and destress"" of best and profoundly and single of the cause of a case of the "in the enough, and to discostully, to explanation and has not once at all the whole for origin of a word, as a hare sympathy is only even say in the last, and cause of the same enough, and belief to sensed and the s
------ temperature: 1.0
and cause of the same enough, and belief to sensed and the spirit pi

KeyboardInterrupt: 


As you can see, a low temperature results in extremely repetitive and predictable text, but where local structure is highly realistic: in 
particular, all words (a word being a local pattern of characters) are real English words. With higher temperatures, the generated text 
becomes more interesting, surprising, even creative; it may sometimes invent completely new words that sound somewhat plausible (such as 
"eterned" or "troveration"). With a high temperature, the local structure starts breaking down and most words look like semi-random strings 
of characters. In this specific setup, 0.5 appears to be the most interesting temperature for text generation. Always experiment 
with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and 
realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is 
sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is 
a distinction between what communications are about, and the statistical structure of the messages in which communications are encoded. To 
illustrate this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like 
our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic 
statistical structure, thus making it impossible to learn a language model like we just did.
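
The thought experiment can be made slightly more concrete with a rough sketch: compress a short English passage with `zlib` and compare the per-symbol entropy of the byte frequencies before and after. The passage, the helper name, and the use of single-byte frequencies as a stand-in for "statistical structure" are all assumptions made for illustration; this is a crude proxy, not a rigorous measurement.

```python
import math
import zlib
from collections import Counter


def entropy_bits_per_symbol(data):
    # Empirical entropy of the byte frequency distribution, in bits per byte
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample passage, written for this illustration
sample_text = (
    "Language carries meaning through patterns that repeat at every scale. "
    "Letters cluster into familiar syllables, syllables into words, and words "
    "into phrases that a reader half expects before seeing them. A statistical "
    "model can exploit this redundancy to guess what comes next. A good "
    "compressor exploits the very same redundancy, but in order to remove it: "
    "whatever is predictable gets squeezed out, and what remains looks very "
    "much like noise to anyone who does not know how to decode it."
).encode('utf-8')

compressed = zlib.compress(sample_text, 9)

raw_entropy = entropy_bits_per_symbol(sample_text)
compressed_entropy = entropy_bits_per_symbol(compressed)

print('Raw bytes:', len(sample_text), 'entropy per byte:', round(raw_entropy, 2))
print('Compressed bytes:', len(compressed), 'entropy per byte:', round(compressed_entropy, 2))
```

The compressed stream still decodes to exactly the same message, yet its bytes look much closer to uniform noise, which leaves a next-character model far less regularity to learn from.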


## Takeaways

* We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
* In the case of text, such a model is called a "language model" and could be based on either words or characters.
* Sampling the next token requires balance between adhering to what the model judges likely, and introducing randomness.
* One way to handle this is the notion of _softmax temperature_. Always experiment with different temperatures to find the "right" one.

In [9]:
model.save("./modelTrain/model8.1.lstmseq")