## Generate text from Nietzsche's writings using LSTM in Keras
* Author: Gao Yang
* Note:
    * 3 minutes per epoch on an Intel i5 CPU
    * Recommended: run on a GPU (e.g. FloydHub)
    * At least 20 epochs are required before the generated text starts to sound coherent.
    * If you use this on new data, make sure your corpus has at least ~100k characters; ~1M is better.

#### Set up the Keras environment

In [5]:
from __future__ import print_function, division
import keras
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.optimizers import RMSprop
from keras.callbacks import LambdaCallback

import numpy as np
import io
import sys
import random

#### Get Nietzsche's writings and prepare character datasets

In [7]:
path = '/Users/Yang/Projects/keras-examples/nietzsche.txt'
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('Corpus length:',len(text))

Corpus length: 600902


In [18]:
chars = sorted(list(set(text))) # all distinct characters that appear in the text
print('character number in total:',len(chars))
char_indices = dict((c,i) for i, c in enumerate(chars))
indices_char = dict((i,c) for i, c in enumerate(chars))

character number in total: 60
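
The two dictionaries above are inverse lookups: one maps a character to an integer index, the other maps the index back. A minimal sketch of the round trip on a toy string follows; `toy_text` and the `toy_*` names are stand-ins introduced here for illustration and are not part of the notebook.

```python
# Toy corpus standing in for the Nietzsche text (an assumption for illustration).
toy_text = "hello world"

# Same construction as in the cell above, on the toy corpus.
toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

# Encode the string as integer indices, then decode it back.
encoded = [toy_char_indices[c] for c in toy_text]
decoded = "".join(toy_indices_char[i] for i in encoded)

print(toy_chars)
print(encoded)
print(decoded)
```

Sorting the character set keeps the index assignment stable between runs, which matters if a trained model is reused later with the same corpus.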


In [25]:
# cut the text into overlapping sequences of max_len characters, starting every `step` characters
max_len = 40
step = 3
sentences = []
next_char = []

for i in range(0, len(text)-max_len, step): # i = 0, 3, 6, 9, ..., stopping before 600902-40 = 600862
    sentences.append(text[i:i+max_len]) # chop a piece of text
    next_char.append(text[i+max_len]) # the character that immediately follows that piece of text

print('total sequence number:',len(sentences))

total sequence number: 200288
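
The sequence count follows directly from the loop bounds. The sketch below repeats the same slicing on a ten-character toy string and recomputes the count for the corpus length, window length and step reported above; `make_windows` is a helper name introduced here for illustration.

```python
def make_windows(text, max_len, step):
    """Return (windows, next_chars) using the same slicing as the cell above."""
    windows, next_chars = [], []
    for i in range(0, len(text) - max_len, step):
        windows.append(text[i:i + max_len])   # input window
        next_chars.append(text[i + max_len])  # character to predict
    return windows, next_chars

# Toy example: 10 characters, windows of 4, step of 3.
toy_windows, toy_next = make_windows("abcdefghij", max_len=4, step=3)
print(toy_windows, toy_next)

# Number of windows for the notebook's values (corpus 600902, max_len 40, step 3).
n_sequences = len(range(0, 600902 - 40, 3))
print(n_sequences)
```

Because consecutive windows start only 3 characters apart while each is 40 characters long, neighbouring training examples overlap heavily. A larger `step` would give fewer, less redundant examples.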


In [33]:
# one-hot tensors; the builtin `bool` replaces `np.bool`, which was removed in NumPy 1.24
x = np.zeros((len(sentences), max_len, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)

for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1 # one-hot: mark which character sits at position t of sentence i
    y[i, char_indices[next_char[i]]] = 1 # one-hot: mark the character that immediately follows sentence i
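
To make the tensor layout concrete, here is a minimal sketch of the same one-hot encoding on a three-character toy vocabulary. All `toy_*` names are illustrative assumptions. The last lines compute the size of the full `x` tensor for the shapes reported above, at one byte per boolean.

```python
import numpy as np

# Toy vocabulary and two toy windows of length 2 (assumed values for illustration).
toy_chars = ['a', 'b', 'c']
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_sentences = ['ab', 'bc']
toy_next = ['c', 'a']

# x: (number of windows, window length, vocabulary size)
# y: (number of windows, vocabulary size)
x_toy = np.zeros((len(toy_sentences), 2, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)

for i, s in enumerate(toy_sentences):
    for t, ch in enumerate(s):
        x_toy[i, t, toy_char_indices[ch]] = True
    y_toy[i, toy_char_indices[toy_next[i]]] = True

print(x_toy.astype(int))
print(y_toy.astype(int))

# Size in bytes of the full input tensor: 200288 windows x 40 positions x 60 characters.
full_x_bytes = 200288 * 40 * 60
print(full_x_bytes / 1e6, "MB")
```

Each row along the last axis contains exactly one `True`, so the full input tensor is very sparse. At roughly 480 MB it still fits in memory on most machines, but a much larger corpus would need a generator that builds batches on the fly.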

##### Helper functions

In [41]:
def sample(prob_array, temp=1.0):
    
    "samples an index from a probability array"
    
    prob_array = np.asarray(prob_array).astype('float64')
    prob_array = np.log(prob_array) / temp
    exp_prob_array = np.exp(prob_array)
    prob_array = exp_prob_array / np.sum(exp_prob_array)
    probabilities = np.random.multinomial(1, prob_array, 1)
    
    return np.argmax(probabilities)

def on_epoch_end(epoch, logs):
    
    "prints generated text at end of each epoch"
    
    print()
    print('------ Generate text after epoch: {:02d}'.format(epoch))
    
    start_index = np.random.randint(0, len(text)-max_len-1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('diversity:{}'.format(diversity))
        
        generated = ''
        sentence = text[start_index:start_index+max_len]
        generated += sentence
        sys.stdout.write(generated)
        # or store into a file
        
        for i in range(400):
            x_prediction = np.zeros((1,max_len,len(chars)))
            for t, char in enumerate(sentence):
                x_prediction[0, t, char_indices[char]] = 1
                
            predictions = model.predict(x_prediction, verbose=0)[0]
            next_index = sample(predictions, diversity)
            next_char = indices_char[next_index]
            
            generated += next_char
            sentence = sentence[1:] + next_char
            
            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()
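
The `temp` argument of `sample` (called `diversity` in the callback) reshapes the predicted distribution before sampling. Dividing the log-probabilities by a temperature below 1 sharpens the distribution towards the most likely character, while a temperature above 1 flattens it. The sketch below isolates that rescaling step. `rescale` is a helper name introduced here, and the three-element `probs` list is an assumed example, not model output.

```python
import numpy as np

def rescale(prob_array, temp=1.0):
    """Apply the same temperature rescaling as `sample`, but return the distribution."""
    log_probs = np.log(np.asarray(prob_array, dtype='float64')) / temp
    exp_probs = np.exp(log_probs)
    return exp_probs / np.sum(exp_probs)

# Assumed example distribution over three characters.
probs = [0.5, 0.3, 0.2]

# The same four temperatures the callback loops over.
for temp in [0.2, 0.5, 1.0, 1.2]:
    print(temp, np.round(rescale(probs, temp), 3))
```

This is why the low-diversity samples in the training log repeat a few common words, while the samples at 1.0 and above contain more misspellings and invented words.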

#### Build an LSTM model

In [42]:
model = Sequential()
model.add(LSTM(128, input_shape=(max_len,len(chars)))) # input shape = shape of each slice of x
model.add(Dense(len(chars), activation='softmax'))

model.summary()

optimizer = RMSprop(learning_rate=0.01) # the argument is named `lr` in Keras versions before 2.3

model.compile(loss='categorical_crossentropy',
             optimizer=optimizer)


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_5 (LSTM)                (None, 128)               96768     
_________________________________________________________________
dense_5 (Dense)              (None, 60)                7740      
Total params: 104,508
Trainable params: 104,508
Non-trainable params: 0
_________________________________________________________________
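
The parameter counts in the summary can be checked by hand. An LSTM layer has four gate blocks, each with an input weight matrix, a recurrent weight matrix and a bias vector. The dense layer has one weight matrix and one bias vector. The helper names below are introduced here for illustration.

```python
def lstm_param_count(units, input_dim):
    # 4 gate blocks, each with input weights (input_dim x units),
    # recurrent weights (units x units) and a bias (units)
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(units, input_dim):
    # weight matrix (input_dim x units) plus a bias (units)
    return units * input_dim + units

lstm_params = lstm_param_count(units=128, input_dim=60)
dense_params = dense_param_count(units=60, input_dim=128)
total_params = lstm_params + dense_params

print(lstm_params, dense_params, total_params)
```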


#### Train

In [44]:
print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x,y,
         batch_size=128,
         epochs=20,
         callbacks=[print_callback])

Epoch 1/20

------ Generate text after epoch: 00
diversity:0.2
elief in "immediate certainties" is a morality of the soul and some of the soul and all the superious of the sense of the soul of the soul the serious of the same to the still the superious of the self-another and something the superioming of the superious of the still the soul of the soul of the superious of the soul and something of the still be some of the still the supposition of the still the soul of the soul the soul and all the s
diversity:0.5
elief in "immediate certainties" is a moral only from the contraction. an avily and extrint themsy the spiritual and seuratic only the morality of the sense of the spirit we place himself an its of the werves, and possible in the soul of the rest to the contract election that has his look the sciention as the belieful with one all the soul any would a such one the contrary of the soul and morality, and required in the soul of the last m
diversity:1.0
elief in "immediate certain

as anybody does who no longer feeling and such the strength, the suchience of the such a strong to the strongers of the sentiment the such the such a religion of the such a man as it is the sure of the such a child some present the such a conscience of the suchieme of the famer the suchieved and the states of the such a here the strongers of the such a strong to the stronger the stronger the suchiement and such a man is the such
diversity:0.5
enough,
as anybody does who no longer feelings of the happiness of the in the german the strength, one must not in the strength of the sentiments and same to the inderemolation, would be in the great sceep and sure of the subject them what spectiles of the indainest function of the proves for the subtle in the assiment to the intellectual the basis of the prestion of his artives to himself or to himself there is the regard himself when 
diversity:1.0
enough,
as anybody does who no longer ferlony ourselves ground of timive
which over formil idial p

  


erence,
the folly very indutaance,
.
ytal
party is qualimintw] if theosaten of transely knesath"--demins sometrifications." bruge
rock
werist
exception of "the suspicive at an
Epoch 6/20

------ Generate text after epoch: 05
diversity:0.2
each this
morality of mediocrity! it can not be strength of the sure of the profound that he seems to anything and the strength, and the stands the stands have nothing of the strength, and the profound, and what is the master of the strength, and the strength, and also the strength and profounds and such a sure of the pressions of the profound themselves of the strength and successived and such a surprising to the strength. the strength
diversity:0.5
each this
morality of mediocrity! it can not be the great contract that we have not the look and contravied of life of the sure of the same subjuct of the such and that is now with the case in the "such a society, and should that a religionity, as an accurration, and the man and also be look also itself, 

oever has done this, has perhaps just the strength of the fashing the develop and soul the soul of the such a man are the sense and the sense and self-best in a consideration of the and sense and conscience of the super of the present to the sense and soul of the same discording to the or and the same nothing the such a man is a desire and soul in the present to the more and souls and desire to the sense of the present of the same astic
diversity:0.5
oever has done this, has perhaps just the amort of the former and looks of the evil open, and in the experience is in the happiness and make the morality in the significance that the said also into every problem of a contemplental conditions of the more spectful of the necessary and influence of a man in the free more which nothing of man in the sense and species elements of the the conflict and state of the conditions of life and the s
diversity:1.0
oever has done this, has perhaps just the fashs, self-pushancting
thing looks of faver con

germans there is alternately the supposition of the superficial superficial and supposition and supposition of the principle of the supposition of the condition of the superficial and something of the sure has a subject and the superficial the standard and supposition of the same time and supposition and supposition of the subject and subject in the great the soul and supposition of the sure and in the spirit in the subject
diversity:0.5
 present-day
germans there is alternately the condition of the spirit the sense the way may swere in the moral nature of his taste, in the moral heart of the virtue in his except the point of the point in a conscience of a desire of the soul of a comperhaple commonity of the degrain of the spirit to a scaie the feeling of the succession and said the belief which that in the superficiality of the general of an inclination of the treat of
diversity:1.0
 present-day
germans there is alternateligurish as except the standards
ruble a developing,
unegoist ab

which from time to time the provencal and in the morality, the spirit of the continually the soul and man in the personality, the world the sense of the success and the whole present to the sense of the sense of the fashion that the world of the soul and the soul of the strength of the personality to the morality, and the heart that the art of the sense of the some the some individual and the success of the success and the world the sen
diversity:0.5
which from time to time the provencal and also a struggle has a character of the soul and strength, it would be means of the world place as a solemness of the remark the more to be often innocents and the continuality of the every an had how the intermres shar be desire to every antithest to persons in the man the world the pains of many order men and has a made nothing
the independence of the belief the remark in the morality, the cons
diversity:1.0
which from time to time the provencal and sud. "dreams.--this author and bided sugalt, he


<keras.callbacks.History at 0x125240c88>
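
Once training has finished, the generation loop from `on_epoch_end` can be pulled out into a standalone function. The following is a minimal sketch under stated assumptions, not the notebook's own code. `generate_text` and its `predict_fn` argument are names introduced here, and the uniform stub only stands in for a trained model so that the example runs without Keras.

```python
import numpy as np

def generate_text(predict_fn, seed, n_chars, char_indices, indices_char, temp=1.0):
    """Extend `seed` by `n_chars` sampled characters.

    `predict_fn` takes a one-hot array of shape (1, len(seed), vocab size)
    and returns an array of shape (1, vocab size) with next-character probabilities.
    """
    max_len = len(seed)
    n_vocab = len(char_indices)
    window = seed
    generated = seed
    for _ in range(n_chars):
        # one-hot encode the current window
        x_pred = np.zeros((1, max_len, n_vocab))
        for t, ch in enumerate(window):
            x_pred[0, t, char_indices[ch]] = 1
        # temperature rescaling and sampling, as in `sample`
        probs = np.asarray(predict_fn(x_pred)[0], dtype='float64')
        probs = np.exp(np.log(probs) / temp)
        probs = probs / np.sum(probs)
        next_index = int(np.argmax(np.random.multinomial(1, probs, 1)))
        next_ch = indices_char[next_index]
        generated += next_ch
        window = window[1:] + next_ch  # slide the window forward by one character
    return generated

# Stub "model" over a toy vocabulary: always predicts a uniform distribution.
toy_chars = ['a', 'b', 'c']
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

def uniform_predict(x_batch):
    return np.full((x_batch.shape[0], len(toy_chars)), 1.0 / len(toy_chars))

sample_out = generate_text(uniform_predict, 'abc', 10, toy_char_indices, toy_indices_char)
print(sample_out)
```

With the trained model, the stub would be replaced by something like `lambda a: model.predict(a, verbose=0)`, and the seed by a `max_len`-character slice of `text`.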