<a href="https://colab.research.google.com/github/j-buss/text_gen_lstm/blob/master/text_gen_lstm_detail_description.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Load libraries

In [0]:
import numpy as np
import keras
keras.__version__

Using TensorFlow backend.


'2.2.4'

### Get the Data

This example uses text from the works of Friedrich Nietzsche.

In [0]:
path = keras.utils.get_file(
    'nietzsche.txt',
    origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('Corpus length:', len(text))

Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt
Corpus length: 600893


### Prepare the text sentences

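Each training example is a fixed-length window of text paired with the character that follows it:


*   `maxlen`: the length of each extracted sequence (called a "sentence" here)
*   `step`: the number of characters the window shifts between consecutive sequences
*   `sentences`: the list of extracted sequences
*   `next_chars`: the list of targets, where each target is the character that immediately follows its sequence in the input data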
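
The windowing described above is easier to see on a short string. The following is a minimal sketch using a made-up text and made-up values for the window length and shift (the `toy_` names are introduced only for this illustration and are not part of the notebook):

```python
# Toy illustration of the windowing scheme (values chosen only for this sketch)
toy_text = "hello world"
toy_maxlen = 5   # window length
toy_step = 3     # shift between consecutive windows

toy_sentences = []
toy_next_chars = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    toy_sentences.append(toy_text[i: i + toy_maxlen])   # the window itself
    toy_next_chars.append(toy_text[i + toy_maxlen])     # the character right after it

print(toy_sentences)    # ['hello', 'lo wo']
print(toy_next_chars)   # [' ', 'r']
```

Each window overlaps the previous one by `toy_maxlen - toy_step` characters, which is why the real corpus yields roughly one sequence for every three characters.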



In [0]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

Number of sequences: 200278


In [0]:
sentences[:10]

['preface\n\n\nsupposing that truth is a woman--what then? is the',
 'face\n\n\nsupposing that truth is a woman--what then? is there ',
 'e\n\n\nsupposing that truth is a woman--what then? is there not',
 '\nsupposing that truth is a woman--what then? is there not gr',
 'pposing that truth is a woman--what then? is there not groun',
 'sing that truth is a woman--what then? is there not ground\nf',
 'g that truth is a woman--what then? is there not ground\nfor ',
 'hat truth is a woman--what then? is there not ground\nfor sus',
 ' truth is a woman--what then? is there not ground\nfor suspec',
 'uth is a woman--what then? is there not ground\nfor suspectin']

In [0]:
next_chars[:10]

['r', 'n', ' ', 'o', 'd', 'o', 's', 'p', 't', 'g']

In [0]:
# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = dict((char, chars.index(char)) for char in chars)

Unique characters: 57


In [0]:
# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# Use the built-in `bool`; the `np.bool` alias is removed in newer NumPy releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...
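
As a sanity check on the encoding, one-hot rows can be decoded back into characters by taking the argmax along the character axis. This is a small sketch with a hypothetical three-character alphabet (`toy_chars`, `toy_indices`, and `word` are illustrative names, not part of the notebook):

```python
import numpy as np

toy_chars = sorted(set("abca"))                      # ['a', 'b', 'c']
toy_indices = {c: i for i, c in enumerate(toy_chars)}

word = "cab"
one_hot = np.zeros((len(word), len(toy_chars)), dtype=bool)
for t, c in enumerate(word):
    one_hot[t, toy_indices[c]] = True                # one True per row

# Decoding: the position of the single True in each row is the character index
decoded = "".join(toy_chars[i] for i in one_hot.argmax(axis=1))
print(one_hot.astype(int))
print(decoded)
```

The same round trip should work on a row of `x`, for example `x[0].argmax(axis=1)`, although that is not run here.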


### Visualize the preprocessed data

In [0]:
x.shape[0]

200278

In [0]:
char_indices.keys()

dict_keys(['\n', ' ', '!', '"', "'", '(', ')', ',', '-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '=', '?', '[', ']', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ä', 'æ', 'é', 'ë'])

In [0]:
import pandas as pd
from IPython.display import display
pd.options.display.max_columns = None
pd.set_option('display.html.table_schema', True)
from IPython.display import HTML

Make a pandas DataFrame whose columns are the dictionary keys.

In [0]:
df = pd.DataFrame(data=x[0],columns=char_indices.keys())

To enhance readability, change all the True/False values to 1s and 0s.

In [0]:
# Cast the boolean one-hot values to integers (True -> 1, False -> 0)
df = df.astype(int)

Add the characters of the first sentence as a column, so that each one-hot row can be matched to its character.

In [0]:
df['1st Sentence'] = list(sentences[0])

In [0]:
result = pd.concat([df.loc[:,'1st Sentence'],df.loc[:,'a':'z']],axis=1)

In [0]:
result[0:7]

Unnamed: 0,1st Sentence,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
0,p,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
1,r,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
2,e,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,f,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,a,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,c,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,e,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


# Build and train the LSTM

## Define the model

In [0]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
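
To get a feel for the size of this network, the number of trainable weights can be estimated by hand. The sketch below assumes the usual LSTM layout of four gate blocks, each with input weights, recurrent weights, and a bias, and a Dense layer with a bias. The helper names are made up for this illustration, and `model.summary()` remains the authoritative source:

```python
def lstm_param_count(input_dim, units):
    # Four gate blocks, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    # One weight per input/output pair plus one bias per output unit
    return input_dim * units + units

n_chars = 57   # number of unique characters reported above
lstm_params = lstm_param_count(n_chars, 128)
dense_params = dense_param_count(128, n_chars)
print(lstm_params, dense_params, lstm_params + dense_params)
# 95232 7353 102585
```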

In [0]:
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

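The sampling function adds some variability to the output based on the "temperature".

With a temperature of 1 the predicted probabilities are used unchanged; with a smaller temperature the probability is concentrated on the most likely characters, so the output is less varied.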
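
The temperature step can also be written compactly. Writing $p_i$ for the predicted probability of character $i$ and $T$ for the temperature (symbols introduced here only for illustration), the reweighting that `sample` performs before the random draw is:

$$
p_i' = \frac{\exp(\log p_i / T)}{\sum_j \exp(\log p_j / T)} = \frac{p_i^{1/T}}{\sum_j p_j^{1/T}}
$$

Setting $T = 1$ gives $p_i' = p_i$, since the probabilities already sum to 1. For $T < 1$ each probability is raised to a power greater than 1, which shrinks small values proportionally more than large ones.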

## Execute the Model

Executing the model fits it to the input data. In addition, the cells below walk through how sample text is generated from the fitted model.

### Sample Text Generation

In [0]:
import random

In [0]:
## Select text at random from the text
start_index = random.randint(0, len(text) - maxlen - 1)
seed_text = text[start_index: start_index + maxlen]
print('--- Generating with seed: "' + seed_text + '"')

--- Generating with seed: "
it there. the stone must have moved itself there. that is t"


In [0]:
## create array to store vectorized seed text
sampled = np.zeros((1, maxlen, len(chars)),dtype='int')
sampled

array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]]])

In [0]:
### Encode the Seed Text
for t, char in enumerate(seed_text):
    sampled[0, t, char_indices[char]] = 1

In [0]:
sampled.shape
# The shape of (1, 60, 57) is for encoding 1 seed text, 60 characters long and 57 possible characters

(1, 60, 57)

Let's add a few other libraries to help show the encoding of the seed text

In [0]:
import pandas as pd
from IPython.display import display
pd.options.display.max_columns = None

In [0]:
## Create a DataFrame that holds the vectorized seed text.
## The column heads are the characters in char_indices, and the values in the
##      columns are the encoded seed text stored in `sampled`
df = pd.DataFrame(data=sampled[0],columns=char_indices.keys())

In [0]:
## Add the characters of the seed text as a column of the Data Frame
df['seed text'] = list(seed_text)

In [0]:
## Create a new DataFrame "result" which contains the characters from the seed text
## as well as the vectorized values from a to z
## Note: For this exercise we show only the 26 letter columns from a->z;
##    as we know from above, other characters are also found in the corpus
result = pd.concat([df.loc[:,'seed text'],df.loc[:,'a':'z']],axis=1)
result

Unnamed: 0,seed text,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
0,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,s,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
2,p,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
3,i,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,r,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
5,i,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,t,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
7,"""",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,-,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,-,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


## Predict the next values

### Train the model

Before we can predict the next values of text we need to actually run and train the model.

In [0]:
# Define the model:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

In [0]:
# Fit the model on the corpus of text:
model.fit(x, y,
          batch_size=128,
          epochs=1)

Epoch 1/1


<keras.callbacks.History at 0x7fe85a3bc668>

In [0]:
# Now use the model that was defined and trained above
# to predict the next character for the encoded seed text
preds = model.predict(sampled, verbose=0)[0]

In [0]:
# if we display the matrix of the predicted values we see the relative likelihood 
# for each of the characters as being the next letter in the string
preds

array([4.13491465e-02, 1.59794554e-01, 5.63299400e-04, 1.81057979e-03,
       3.50241200e-04, 1.35766546e-04, 3.16551632e-05, 2.95479549e-03,
       3.66532803e-03, 1.72107294e-03, 2.80760886e-15, 3.57212302e-05,
       6.65135929e-07, 6.74497672e-15, 4.62266076e-13, 4.39577357e-16,
       4.25214758e-18, 8.73243654e-17, 1.14059795e-09, 9.58926909e-19,
       4.27863415e-04, 7.96821143e-04, 1.63371675e-04, 2.29900354e-03,
       9.13479849e-19, 6.50109037e-17, 1.48621621e-13, 3.17209698e-02,
       3.21182003e-03, 4.93428018e-03, 1.95456669e-02, 4.70403470e-02,
       6.05899747e-03, 1.06031066e-02, 4.51016165e-02, 9.91097465e-02,
       1.00319623e-03, 9.85843828e-04, 1.44484844e-02, 1.75545420e-02,
       3.59608866e-02, 1.63418680e-01, 4.12609661e-03, 2.33757732e-04,
       6.73919544e-02, 1.23338886e-01, 5.10327034e-02, 1.46309603e-02,
       2.85846693e-03, 6.76764641e-03, 7.22476281e-03, 5.41723100e-03,
       1.79443072e-04, 1.05838266e-18, 1.14502207e-18, 9.78052669e-19,
      

In [0]:
np.argmax(preds)

41

In [0]:
preds[41]

0.16341868

In [0]:
## Do a 'reverse' lookup in the dictionary for the character associated with index 41
for character, value in char_indices.items():
    if value == 41:
        print("Character predicted is: " + character)

Character predicted is: o
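
The loop above scans the whole dictionary. Because the indices are simply positions in a sorted list, the lookup can also be done directly. The sketch below uses a hypothetical miniature alphabet (`toy_chars` and the other `toy_` names are illustrative) to show two alternatives:

```python
# Hypothetical miniature alphabet, standing in for the notebook's `chars`
toy_chars = sorted(set("hello"))                       # ['e', 'h', 'l', 'o']
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Option 1: the sorted list already maps index -> character
by_list = toy_chars[3]

# Option 2: build an explicit inverse dictionary once and reuse it
toy_indices_char = {i: c for c, i in toy_char_indices.items()}
by_dict = toy_indices_char[3]

print(by_list, by_dict)   # o o
```

In the notebook, `chars[41]` should give the same character as the loop, because `char_indices` was built from the positions in `chars`.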


The most likely next character is "o". Always taking the most likely character tends to make the output repetitive, so we may want to add some variability to the choice, as this results in more "interesting" output.

The function `sample` accomplishes this task, with the "temperature" controlling the amount of variability.

Let's decompose the function a bit to really understand what it is doing.

The function takes the prediction array as input.

In [0]:
preds

array([4.13491465e-02, 1.59794554e-01, 5.63299400e-04, 1.81057979e-03,
       3.50241200e-04, 1.35766546e-04, 3.16551632e-05, 2.95479549e-03,
       3.66532803e-03, 1.72107294e-03, 2.80760886e-15, 3.57212302e-05,
       6.65135929e-07, 6.74497672e-15, 4.62266076e-13, 4.39577357e-16,
       4.25214758e-18, 8.73243654e-17, 1.14059795e-09, 9.58926909e-19,
       4.27863415e-04, 7.96821143e-04, 1.63371675e-04, 2.29900354e-03,
       9.13479849e-19, 6.50109037e-17, 1.48621621e-13, 3.17209698e-02,
       3.21182003e-03, 4.93428018e-03, 1.95456669e-02, 4.70403470e-02,
       6.05899747e-03, 1.06031066e-02, 4.51016165e-02, 9.91097465e-02,
       1.00319623e-03, 9.85843828e-04, 1.44484844e-02, 1.75545420e-02,
       3.59608866e-02, 1.63418680e-01, 4.12609661e-03, 2.33757732e-04,
       6.73919544e-02, 1.23338886e-01, 5.10327034e-02, 1.46309603e-02,
       2.85846693e-03, 6.76764641e-03, 7.22476281e-03, 5.41723100e-03,
       1.79443072e-04, 1.05838266e-18, 1.14502207e-18, 9.78052669e-19,
      

What is the preds array? It holds, for each character in `chars`, the predicted probability that it is the next character. That is why we took the argmax of the preds array: it returns the index of the most probable character.

Because the values in the array are probabilities, they sum to 1.

In [0]:
np.sum(preds)

1.0

To drive home what the function does, let's use a smaller array with the same characteristics:


*   It represents probabilities
*   The probabilities sum to 1



In [0]:
# No special values in this array other than they already add to 1.
# we will do one more "random" example after this very simple one.
input = np.array([0.02673774, 0.87052092, 0.10274135])
input

array([0.02673774, 0.87052092, 0.10274135])

Set the temperature variable to 1 and convert the NumPy array to float64 to accommodate the calculations later on.

In [0]:
temperature = 1
prediction = np.asarray(input).astype('float64')

In [0]:
prediction_log = np.log(prediction)
prediction_log

array([-3.62167923, -0.13866349, -2.27554061])

In [0]:
prediction_log / temperature

array([-3.62167923, -0.13866349, -2.27554061])

In [0]:
exp_prediction = np.exp(prediction_log / temperature)
print(exp_prediction)
print(np.sum(exp_prediction))

[0.02673774 0.87052092 0.10274135]
1.00000001


With a temperature of 1 the values are left unchanged (up to floating-point rounding).

In [0]:
np.set_printoptions(precision=4)
print('beginning prediction array: \t\t{0}'.format(input))
temperature = 0.9
prediction = np.asarray(input).astype('float64')
prediction_log = np.log(prediction)
#print('prediction_log: \t\t\t{0}'.format(prediction_log))
pred_log_by_temp = prediction_log / temperature
#print('prediction_log / temperature: \t\t{0}'.format(pred_log_by_temp))

pred_exp = np.exp(pred_log_by_temp)
#print('new prediction before normalization: \t{0}'.format(pred_exp))

new_preds = pred_exp / np.sum(pred_exp)

print('new prediction: \t\t\t{0}'.format(new_preds))

beginning prediction array: 		[0.0267 0.8705 0.1027]
new prediction: 			[0.0187 0.8977 0.0836]


In the previous example we see that a temperature of 0.9 instead of 1 introduces less variability: the largest probability grows from 0.8705 to 0.8977, while the two smaller ones shrink.

In [0]:
for i in np.linspace(1,0.1,10):
  print(i)

1.0
0.9
0.8
0.7
0.6
0.5
0.3999999999999999
0.29999999999999993
0.19999999999999996
0.1


In [0]:
np.set_printoptions(precision=4)
def test_sampling(input_array):
  for temp in np.linspace(1,0.1,10):
    prediction = np.asarray(input_array).astype('float64')
    prediction_log = np.log(prediction)
    #print('prediction_log: \t\t\t{0}'.format(prediction_log))
    pred_log_by_temp = prediction_log / temp
    #print('prediction_log / temperature: \t\t{0}'.format(pred_log_by_temp))

    pred_exp = np.exp(pred_log_by_temp)
    #print('new prediction before normalization: \t{0}'.format(pred_exp))

    new_preds = pred_exp / np.sum(pred_exp)  
    print('\n')
    print('temperature: \t\t\t\t{0}'.format(temp))
    print('beginning prediction: \t\t\t{0}'.format(input_array))
    print('new prediction: \t\t\t{0}'.format(new_preds))


In the examples below we see that, as the temperature decreases, the probabilities become more concentrated on a single value and less spread out.

In [0]:
input = np.array([0.10, 0.25, 0.65])
test_sampling(input)
input = np.array([0.25, 0.35, 0.40])
test_sampling(input)



temperature: 				1.0
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.1  0.25 0.65]


temperature: 				0.9
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.085  0.2352 0.6799]


temperature: 				0.8
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0689 0.2165 0.7147]


temperature: 				0.7
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0521 0.1928 0.7551]


temperature: 				0.6
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0354 0.163  0.8015]


temperature: 				0.5
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0202 0.1263 0.8535]


temperature: 				0.3999999999999999
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0084 0.0833 0.9082]


temperature: 				0.29999999999999993
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[0.0019 0.0397 0.9585]


temperature: 				0.19999999999999996
beginning prediction: 			[0.1  0.25 0.65]
new prediction: 			[8.5459e-05 8.3456e

As the temperature decreases, the probability is consolidated onto the highest values.
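
The examples so far only lower the temperature, but the generation loop later in the notebook also uses a value of 1.2. The sketch below, with helper names `reweight` and `entropy` introduced only for this illustration, suggests that a temperature above 1 has the opposite effect and flattens the distribution. Entropy is used here as one possible measure of how spread out the probabilities are:

```python
import numpy as np

def reweight(probs, temperature):
    """Rescale a probability vector by temperature and renormalize."""
    logits = np.log(np.asarray(probs, dtype='float64')) / temperature
    scaled = np.exp(logits)
    return scaled / np.sum(scaled)

def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    probs = np.asarray(probs, dtype='float64')
    return float(-np.sum(probs * np.log(probs)))

p = np.array([0.10, 0.25, 0.65])
for t in [0.5, 1.0, 1.2]:
    q = reweight(p, t)
    # Print the temperature, the reweighted probabilities, and their entropy
    print(t, np.round(q, 4), round(entropy(q), 4))
```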

The remaining component of the `sample` function is the multinomial distribution. This is simply a simulation of one random draw given the probabilities: the result is a one-hot array marking the outcome that was drawn.

In [0]:
np.random.multinomial(1, input, 1)

array([[1, 0, 0]])
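
A single draw like the one above can land on any outcome, including the least likely one. To see that the draws follow the probabilities in the long run, the sketch below repeats the experiment many times. It assumes NumPy 1.17 or later for `np.random.default_rng`, and the seed and trial count are arbitrary choices for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded generator, for repeatability
p = np.array([0.25, 0.35, 0.40])

# One trial: a one-hot array; argmax gives the index of the chosen outcome
one_draw = rng.multinomial(1, p)
chosen = int(np.argmax(one_draw))

# Many trials at once: the counts per outcome approach n * p
n = 10000
counts = rng.multinomial(n, p)
print(one_draw, chosen)
print(counts / n)
```

This is why `sample` takes the argmax of the multinomial result: the draw is a one-hot array, and the argmax converts it back into a character index.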

Effect of the temperature on the example array [0.0267, 0.8705, 0.1027]:



*   1: [2.67%, 87.05%, 10.27%]
*   0.9: [1.87%, 89.77%, 8.36%]



In [0]:
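def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    # Divide the log-probabilities by the temperature
    preds = np.log(preds) / temperature

    # Exponentiate to return to (unnormalized) probabilities
    exp_preds = np.exp(preds)

    # Renormalize so the values sum to 1
    preds = exp_preds / np.sum(exp_preds)

    # Draw one outcome from the reweighted distribution and return its index
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)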

In [0]:
import random
import sys

In [0]:
def Execute_Model(x, y, model, text, maxlen, chars, char_indices, 
                  num_of_epochs=60, batch_size=128, num_chr_to_create=400,
                 print_output=True):
  # Note: range(1, num_of_epochs) performs num_of_epochs - 1 training passes
  for epoch in range(1, num_of_epochs):
    if print_output:
      print('epoch', epoch)
    
    callbacks_list = [
        keras.callbacks.ModelCheckpoint(
          filepath='my_model_' + f'{epoch:03}' + '.h5'
        )
    ]
    
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=batch_size,
              epochs=1,
              callbacks=callbacks_list)
    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    if print_output:
      print('--- Generating with seed: "' + generated_text + '"')
    for temperature in [0.2, 0.5, 1.0, 1.2]:
        if print_output:
          print('------ temperature:', temperature)
          sys.stdout.write(generated_text)

        # Generate num_chr_to_create characters (400 by default)
        for i in range(num_chr_to_create):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            if print_output:
              sys.stdout.write(next_char)
              sys.stdout.flush()
        if print_output:
          print()
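
The inner loop of `Execute_Model` keeps the model input at a fixed length by appending each new character and dropping the oldest one. The sketch below isolates that rolling-window step. It replaces the trained model with a made-up `fake_predict` function and always picks the most likely character instead of sampling, so it only illustrates the bookkeeping, not the quality of real predictions:

```python
import numpy as np

toy_chars = ['a', 'b', 'c']
toy_indices = {c: i for i, c in enumerate(toy_chars)}
toy_maxlen = 4

def fake_predict(window):
    """Stand-in for model.predict: favors the character after the last one."""
    probs = np.full(len(toy_chars), 0.05)
    nxt = (toy_indices[window[-1]] + 1) % len(toy_chars)
    probs[nxt] = 0.90
    return probs

window = "abca"
produced = ""
for _ in range(6):
    preds = fake_predict(window)
    next_char = toy_chars[int(np.argmax(preds))]   # greedy choice, no sampling
    produced += next_char
    window = (window + next_char)[1:]              # append, then drop the oldest
    assert len(window) == toy_maxlen               # the window length never changes

print(produced, window)   # bcabca abca
```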

In [0]:
x_subset = x[:10000,:,:]

In [0]:
x_subset.shape

(10000, 60, 57)

In [0]:
y_subset = y[:10000]

In [0]:
y_subset.shape

(10000, 57)

In [0]:
Execute_Model (x_subset, y_subset, model, text, maxlen, chars, char_indices, 
              num_of_epochs=2, batch_size=128, num_chr_to_create=40, print_output=False)

Epoch 1/1


In [0]:
import timeit
def wrapper(func, *args, **kwargs):
  def wrapped():
    return func(*args, **kwargs)
  return wrapped

In [0]:
wrapped = wrapper(Execute_Model, x_subset, y_subset, model, text, maxlen, chars, char_indices, 
              num_of_epochs=2, batch_size=128, num_chr_to_create=40, print_output=False)
timeit.timeit(wrapped, number=10)

NameError: ignored

In [0]:
Execute_Model(x_subset, y_subset, model, text, maxlen, chars, char_indices, 
              num_of_epochs=3, batch_size=128, num_chr_to_create=40)

epoch 1
Epoch 1/1
--- Generating with seed: "ll great facts.

[19] "der moralische mensch, sagt er, steht"
------ temperature: 0.2
ll great facts.

[19] "der moralische mensch, sagt er, stehttttttttttttttttttttttttttttttttttttttttt
------ temperature: 0.5
nsch, sagt er, stehttttttttttttttttttttttttttttttttttttttttttttttttbttkttatttttttttttadtttt tttttota
------ temperature: 1.0
tttttttttttttttttttttttttttbttkttatttttttttttadtttt tttttota f
  tttdco tbfbotototftttaibe fni ttiyt
------ temperature: 1.2
tttttadtttt tttttota f
  tttdco tbfbotototftttaibe fni ttiytbcbt tdd
s.kotbtbiat t otd
atsmeitthuttl
epoch 2
Epoch 1/1
--- Generating with seed: "o any distance, from a depth up
to any height, from a nook i"
------ temperature: 0.2
o any distance, from a depth up
to any height, from a nook illllllllllllllllllllllllllllllllllllllll
------ temperature: 0.5
eight, from a nook illllllllllllllllllllllllllllllllllllllllaatsl llllllllattt   llllllllllalllnlltt
------ temperature: 1.0
llllllllllll

In [0]:
%%time


#for epoch in range(1, 60):
for epoch in range(1, 5):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    callbacks_list = [
        keras.callbacks.ModelCheckpoint(
          filepath='my_model.h5'
        )
    ]
    model.fit(x, y,
              batch_size=128,
              epochs=1,
              callbacks=callbacks_list)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: " are higher problems than
the problems of pleasure and pain "
------ temperature: 0.2
 are higher problems than
the problems of pleasure and pain iiioiiii i iiiiiiii ii i

oioiiii  oiiii i ii oiiieiie iiii iiiiiiii ii oe iiiii  iiiiiiiiiii    ii  iiii  ii iaiiiiii ii o iiiiiii  ii iiii iiiii oiii  io i iiiioii iioiiiii oiiii   iiii iiibii iiiiiiiiii  iiiiiiiii i ii o iiiio iiiiiii nii iiiiii   i iiiii iiiii i iiiiiiii i eiiiioiioi i iiieiioiiiiiiiiiiii  i iii ii i oiiiioi iii ii iiiiii oioiii iiii i ii iii ii iiiiiii tiiii i iii iiii ioii oii
------ temperature: 0.5
ii oioiii iiii i ii iii ii iiiiiii tiiii i iii iiii ioii oii iai  r eioiitoiiiini i i ii  nioii ieiiiooiii oe ia oeioi in aiieb ir eiioii oiiiuoo ooi  
irieiiriooriren e  aio t iio io ioi iioi iineete ibio   aionea o iii toiito ii inooi ei itni    ite  i snoaiioiio   iiaini  to  iiii oi io i oitoto  onioa   ibo  ineiii  i t iii iiui ini a     i iiei eionee iiimeiioi a or i iio iaori iiooiiinio  i  aiooi iiieii eiriiooeioneloioiiob ii  niiii n oi rioitooie 
------ temperature: 1.0
 aiooi iiieii eiriiooeioneloioiiob ii  niiii n oi rioitooie iariwneii acimi u inrrhieou ntnmbi foiirgoi o  ai,ki o

In [0]:
#for epoch in range(1, 60):
for epoch in range(1, 5):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()