In [1]:
import keras

Using TensorFlow backend.


In [2]:
keras.__version__

'2.2.4'

# Implementing character-level LSTM text generation
Let's put these ideas into practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a language model. You could use any sufficiently large text file or set of text files -- Wikipedia, The Lord of the Rings, etc. In this example we will use some of the writings of Nietzsche, the late-19th-century German philosopher (translated into English). The language model we learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the English language.


In [3]:
import keras
import numpy as np

path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('Corpus length:', len(text))

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt
Corpus length: 600893


In [9]:
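maxlen = 60  # length of each extracted character sequence
step = 3     # sample a new sequence every `step` characters
sentences = []   # extracted sequences
next_chars = []  # targets: the character that follows each sequence
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i:i + maxlen])
    next_chars.append(text[i + maxlen])
print(len(sentences))

chars = sorted(set(text))  # unique characters in the corpus
print(len(chars))
# Map each character to its index in `chars`.
char_indices = {char: i for i, char in enumerate(chars)}

# One-hot encode the sequences and targets into binary arrays.
# (`np.bool` was removed in NumPy 1.24; the builtin `bool` is equivalent.)
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1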

200278
58
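
To make the shapes of `x` and `y` concrete, here is a toy-sized sketch of the same windowing and one-hot encoding. The string `toy_text` and every `toy_*` name are hypothetical, chosen only for illustration so that they do not overwrite the notebook's real variables.

```python
import numpy as np

# Hypothetical stand-in for the corpus; not part of the original notebook.
toy_text = "hello world"
toy_maxlen, toy_step = 4, 2

# Slide a window of `toy_maxlen` characters over the text.
starts = range(0, len(toy_text) - toy_maxlen, toy_step)
toy_sentences = [toy_text[i:i + toy_maxlen] for i in starts]
toy_next = [toy_text[i + toy_maxlen] for i in starts]

toy_chars = sorted(set(toy_text))
toy_index = {c: i for i, c in enumerate(toy_chars)}

# One-hot encode: toy_x is (sequences, toy_maxlen, vocabulary size).
toy_x = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, s in enumerate(toy_sentences):
    for t, c in enumerate(s):
        toy_x[i, t, toy_index[c]] = True
    toy_y[i, toy_index[toy_next[i]]] = True

# Decoding the first one-hot sequence should recover the first window.
decoded = ''.join(toy_chars[k] for k in toy_x[0].argmax(axis=-1))
print(toy_sentences[0], '->', toy_next[0], '| decoded:', decoded)
print(toy_x.shape, toy_y.shape)
```

Each row of `toy_x` holds exactly one `True` per time step, which is why taking `argmax` over the last axis recovers the original characters.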


In [10]:
from keras import layers

In [11]:
from keras.layers import LSTM,Dense
from keras.models import Sequential


In [12]:
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars), activation='softmax'))

In [14]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

In [15]:
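def sample(preds, temperature=1.0):
    # Reweight the predicted distribution by `temperature`, then renormalize.
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw one sample from the reweighted distribution and return its index.
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)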

Given a trained model and a seed text snippet, we generate new text by repeatedly:

1) Drawing from the model a probability distribution over the next character given the text available so far
2) Reweighting the distribution to a certain "temperature"
3) Sampling the next character at random according to the reweighted distribution
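4) Adding the new character at the end of the available text

The `sample` function defined above is the code we use to reweight the original probability distribution coming out of the model and draw a character index from it (the "sampling function"). The final cell trains the model and, after every epoch, generates text at several temperatures.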
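
To see what the reweighting step does on its own, the sketch below applies the same log / divide / exponentiate / renormalize arithmetic to a small made-up distribution, without sampling. The helper name `reweight` and the values in `toy_preds` are assumptions introduced for illustration only.

```python
import numpy as np

def reweight(preds, temperature=1.0):
    """Return the temperature-reweighted distribution (no sampling)."""
    preds = np.asarray(preds, dtype='float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

# Hypothetical next-character distribution over a 4-character vocabulary.
toy_preds = [0.5, 0.3, 0.15, 0.05]

for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

A temperature below 1 concentrates probability on the most likely character, so the generated text becomes more repetitive and predictable. A temperature above 1 flattens the distribution, which produces more surprising output. A temperature of exactly 1 leaves the distribution unchanged.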

In [16]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for one pass over the data.
    model.fit(x, y, batch_size=128, epochs=1)
    # Select a text seed at random.
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index:start_index + maxlen]

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # Generate 400 characters, one at a time.
        for i in range(400):
            # One-hot encode the current window of text.
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.
            # Sample the next character.
            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]
            # Append it and slide the window forward by one character.
            generated_text += next_char
            generated_text = generated_text[1:]
            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()