[View in Colaboratory](https://colab.research.google.com/github/Hamahmi/CIFAR-10/blob/master/txtgen.ipynb)

# Introduction
>In all the neural networks we implemented previously, we treated each input independently; that is, each input image was unrelated to the others. With text, the case differs. If we want our model to predict or generate a word that makes sense, it needs knowledge of the words that came before it, and this is where [recurrent neural networks (RNNs)](https://en.wikipedia.org/wiki/Recurrent_neural_network) in general and [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory) in particular play a role. RNNs are called recurrent because each output depends on previous computations: they can "remember" information about what has been calculated so far. Long short-term memory (LSTM) is a special kind of RNN designed to retain that information over longer sequences.
![alt text](https://www.kdnuggets.com/wp-content/uploads/reccurrent-network-arch.png)

>>The above image shows how each output depends on previous outputs as well as on the current input to the RNN.
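
>To make the recurrence concrete, here is a minimal sketch of a plain (non-LSTM) recurrent step with a single scalar state. The function names `rnn_step` and `final_state`, the tanh activation, and the weight values are illustrative assumptions and are not used elsewhere in this notebook.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One recurrent step: the new state mixes the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def final_state(sequence):
    """Run the recurrence over a whole sequence and return the last state."""
    h = 0.0  # the initial state carries no history
    for x_t in sequence:
        h = rnn_step(x_t, h)
    return h

# Both sequences end with the same input (0.3) but have different histories,
# so the final states differ: the state "remembers" earlier inputs.
h_a = final_state([0.1, 0.2, 0.3])
h_b = final_state([0.9, 0.8, 0.3])
print(h_a, h_b)
```

>An LSTM replaces this single update with gated updates, but the idea of carrying a state from one step to the next is the same.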

In [1]:
# importing necessary libraries

import sys
import numpy

from keras.utils      import np_utils
from keras.models     import Sequential
from keras.models     import load_model
from keras.layers     import Dense
from keras.layers     import Dropout
from keras.layers     import LSTM
from keras.callbacks  import ModelCheckpoint
from keras.optimizers import RMSprop


print("\nImporting ✓\n")

Using TensorFlow backend.



Importing ✓



In [3]:
# only needed on a hosted Colab runtime: upload the input file
from google.colab import files
uploaded = files.upload()



Saving warpeace_input.txt to warpeace_input.txt


In [4]:
# loading the data from input file 

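# read the whole book; the context manager closes the file afterwards
with open("warpeace_input.txt", encoding="utf-8") as f:
    raw_text = f.read()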
raw_text = raw_text.lower()

print("\nLoading ✓\n")


Loading ✓



## Data Preparation
>After loading [Leo Tolstoy's War and Peace](https://cs.stanford.edu/people/karpathy/char-rnn/warpeace_input.txt), we have to transform the characters of the book into a format our LSTM can understand. First we create a sorted list of all the distinct characters in the book. Then we create two dictionaries: one maps each character to a distinct integer to feed to our model, and the other maps the integer output of the network back to a character in order to generate text.

In [0]:
chars = sorted(list(set(raw_text)))
char_to_int = dict((c, i) for i, c in enumerate(chars))
int_to_char = dict((i, c) for i, c in enumerate(chars))
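
>As a small illustration of the two dictionaries, the sketch below builds them for the toy text 'hello' and checks that encoding followed by decoding returns the original text. The `toy_` names are illustrative only, chosen so that they do not overwrite the notebook's variables.

```python
toy_text = "hello"
toy_chars = sorted(set(toy_text))                        # ['e', 'h', 'l', 'o']
toy_char_to_int = {c: i for i, c in enumerate(toy_chars)}
toy_int_to_char = {i: c for i, c in enumerate(toy_chars)}

# Encode the text as integers, then map the integers back to characters
encoded = [toy_char_to_int[c] for c in toy_text]         # [1, 0, 2, 2, 3]
decoded = "".join(toy_int_to_char[i] for i in encoded)   # 'hello'
print(toy_chars, encoded, decoded)
```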

In [6]:
# exploring the input data

n_chars = len(raw_text)
n_vocab = len(chars)
print("The number of characters in Leo Tolstoy's War and Peace:", n_chars)
print("Number of different characters in the book: ", n_vocab)
print("Which are: ", chars)


print("\nExploring ✓\n")

The number of characters in Leo Tolstoy's War and Peace: 3196213
Number of different characters in the book:  57
Which are:  ['\n', ' ', '!', '"', "'", '(', ')', '*', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '=', '?', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'à', 'ä', 'é', 'ê', '\ufeff']

Exploring ✓



## Defining the Training Data
>The following code slides a window over the book to create the training inputs and their corresponding labels. Each input is a sequence of 100 characters (seq_length), and its label is the single character that follows it. For example, with a seq_length of 3, the text 'hello' yields these input and output pairs:

>>>>hel-->l

>>>>ell-->o
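
>The same toy example can be reproduced with a few lines of Python. `make_windows` is a hypothetical helper written for this illustration; it works on raw characters, before any integer encoding.

```python
def make_windows(text, seq_length):
    """Return (input window, next character) pairs for every position in the text."""
    pairs = []
    for i in range(len(text) - seq_length):
        seq_in = text[i:i + seq_length]    # the window of seq_length characters
        seq_out = text[i + seq_length]     # the character right after the window
        pairs.append((seq_in, seq_out))
    return pairs

pairs = make_windows("hello", 3)
for seq_in, seq_out in pairs:
    print(seq_in, "-->", seq_out)
# hel --> l
# ell --> o
```

>A text of length n yields n - seq_length pairs, which is why the book's 3196213 characters produce 3196113 patterns with a seq_length of 100.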

In [7]:
# encoding the dataset as integers

seq_length = 100
x = []
y = []
for i in range(0, n_chars - seq_length, 1):
	seq_in = raw_text[i:i + seq_length]
	seq_out = raw_text[i + seq_length]
	x.append([char_to_int[char] for char in seq_in])
	y.append(char_to_int[seq_out])
n_patterns = len(x)
print("Total Patterns: ", n_patterns)

Total Patterns:  3196113


## Data Transformation
>The code below reshapes the input into the form that Keras' LSTM layer expects: (samples, time steps, features). Here that is the number of samples (n_patterns, in our case 3196113), the length of each sample, which is 100 (seq_length), and the number of features, which is 1. We also normalize the input to the range 0-1 by dividing it by the number of unique characters in the book. Finally, we transform the output to a one-hot encoding, where the index of the output character is 1 and all other indexes are 0.

In [8]:

X = numpy.reshape(x, (n_patterns, seq_length, 1))
X = X / float(n_vocab)
Y = np_utils.to_categorical(y)


print("\nReshaping ✓\n")


Reshaping ✓
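
>To see what these three steps do to the data, here is a NumPy-only sketch on two tiny windows ('hel' and 'ell' encoded with e→0, h→1, l→2, o→3). The `toy_` names and the vocabulary size of 4 are assumptions for this illustration, and `np.eye` is used as a stand-in for `np_utils.to_categorical`.

```python
import numpy as np

toy_x = [[1, 0, 2], [0, 2, 2]]   # two integer-encoded windows of length 3
toy_y = [2, 3]                   # the integer label that follows each window
toy_vocab = 4                    # assumed vocabulary size for this toy example

# (samples, time steps, features), then scale the integers into the range 0-1
X_toy = np.reshape(toy_x, (len(toy_x), 3, 1)) / float(toy_vocab)

# One-hot encode the labels: row k of the identity matrix encodes class k
Y_toy = np.eye(toy_vocab)[toy_y]

print(X_toy.shape)   # (2, 3, 1)
print(Y_toy)
```

>Note that `to_categorical(y)` without an explicit class count infers it as the largest label plus one, so the width of `Y` equals the vocabulary size only if the highest character index actually occurs as a label.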



## LSTM 
>Because training an LSTM is very slow, we save checkpoints during training and later load the one with the lowest loss to generate text. Our model consists of 2 LSTM layers, each with 256 memory units (the dimensionality of the layer's output space). Each LSTM layer is followed by a dropout layer that randomly sets 20 percent of the previous layer's outputs to zero during training. The output layer is a Dense layer with 57 outputs, one per distinct character, matching the one-hot encoding described earlier.

In [0]:
# define RMSprop optimizer
optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

# define the checkpoint
filepath="weights-improvement-06.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
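
>The checkpoint files uploaded elsewhere in this notebook have names such as weights-improvement-02-1.8121.hdf5, which suggests they were written with a filename template. `ModelCheckpoint` fills named fields in `filepath` with the epoch number and the logged metrics; the sketch below imitates that with plain `str.format`. The template itself is an assumption, since the notebook does not show the one that was actually used.

```python
# Hypothetical template; the notebook does not show the one actually used.
template = "weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"

# Roughly what the callback would do at the end of epoch 2 with a loss of 1.8121
name = template.format(epoch=2, loss=1.8121)
print(name)   # weights-improvement-02-1.8121.hdf5
```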

In [14]:
# define the LSTM model

model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dropout(0.2))
model.add(Dense(Y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])


print("\nCompiling ✓\n")


Compiling ✓
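
>To get a feel for the size of this model, the sketch below counts its trainable parameters by hand. It assumes the standard LSTM formulation with four gates and bias terms, and a vocabulary of 57 characters; `lstm_params` and `dense_params` are hypothetical helpers, not Keras functions. Dropout layers have no parameters.

```python
def lstm_params(units, input_dim):
    """4 gates, each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(inputs, outputs):
    """A weight per input-output pair plus one bias per output."""
    return inputs * outputs + outputs

vocab_size = 57                                   # distinct characters in the book
first_lstm = lstm_params(256, 1)                  # 1 feature per time step
second_lstm = lstm_params(256, 256)               # fed by the first LSTM's 256 outputs
output_layer = dense_params(256, vocab_size)

total = first_lstm + second_lstm + output_layer
print(first_lstm, second_lstm, output_layer, total)
# 264192 525312 14649 804153
```

>If Keras is available, `model.summary()` should report comparable per-layer counts for the model defined above.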



In [1]:
from google.colab import files
uploaded = files.upload()

Saving weights-improvement-02-1.8121.hdf5 to weights-improvement-02-1.8121.hdf5


In [10]:
# fitting the model
# 5 epochs done so far; resume from the last saved model and train one more
# del model
model = load_model('weights-improvement-05-2.0932.hdf5')
model.fit(X, Y, epochs=1, batch_size=128, callbacks=[checkpoint])

from google.colab import files
files.download('weights-improvement-06.hdf5')

Epoch 1/1
  37888/3196113 [..............................] - ETA: 15:08:40 - loss: 2.2357 - acc: 0.3608

KeyboardInterrupt: ignored

----

## Text Generation
>After training our model, we load the weights from the checkpoint with the lowest loss. We then choose a random sequence from the book to use as a seed and feed it to the model. The for loop below generates 1000 characters. In each iteration it applies to the current sequence the same transformations that were applied to the training data, feeds it to the trained model, selects the index with the highest probability, and converts that index to its character using the dictionary defined at the beginning. Lastly, it appends the newly generated character to the model's input and removes the first character of that input, which mirrors the sliding window shown in the example in the 'Defining the Training Data' section. There we have it! Our model "speaking" its own words, inspired by none other than one of the greatest authors of all time, Leo Tolstoy.

In [0]:
# load the network weights

model.load_weights("weights-improvement-02-1.6481.hdf5")
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

In [22]:
# pick a random seed (copy it so the training data in x is not modified)
start = numpy.random.randint(0, len(x)-1)
pattern = list(x[start])
print(''.join([int_to_char[value] for value in pattern]))

# generate characters
for i in range(1000):
  # apply the same reshaping and normalization used for training
  s = numpy.reshape(pattern, (1, len(pattern), 1))
  s = s / float(n_vocab)
  prediction = model.predict(s, verbose=0)
  # greedy choice: take the most probable character
  index = numpy.argmax(prediction)
  result = int_to_char[index]
  sys.stdout.write(result)
  # slide the window: append the new character and drop the oldest one
  pattern.append(index)
  pattern = pattern[1:]

t you are on foot? and where are you going, please?"

"oh, yes!" said pierre.

the soldiers stopped.
 
"i was all the soldiers and the service of the soldiers and the soldiers and the soldiers and the soldiers was all the sound of the soldiers

KeyboardInterrupt: ignored
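
>The sliding-window logic of the generation loop can be tried without a trained model. In the sketch below, `generate` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for `model.predict` followed by `argmax`: it always returns the last index plus one, modulo 4.

```python
def generate(seed, predict_next, n_chars):
    """Greedy generation: predict, append the prediction, drop the oldest element."""
    window = list(seed)          # copy so the seed itself is not modified
    generated = []
    for _ in range(n_chars):
        next_index = predict_next(window)
        generated.append(next_index)
        window = window[1:] + [next_index]   # the window keeps a fixed length
    return generated, window

def toy_predict(window):
    """Stand-in for the trained model (an assumption, not the real network)."""
    return (window[-1] + 1) % 4

seed = [1, 0, 2]
generated, final_window = generate(seed, toy_predict, 5)
print(generated)      # [3, 0, 1, 2, 3]
print(final_window)   # [1, 2, 3]
```

>Because the choice is always the single most probable index, any deterministic predictor eventually cycles, which is consistent with the repeated "the soldiers and the soldiers" in the output above.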