The goal of this notebook is to implement simple text generation using LSTMs and TensorFlow.

## The data

We will be using a collection of books from the Gutenberg website, located in the folder data. The data is processed by the library utils; more specifically, its method process_data() takes care of getting the data ready.

In [1]:
from utils import *

To illustrate what it does, let's assume that our goal is to look at the last 5 characters and predict the next one in order to generate text, and that the whole book consists of

In [2]:
corpus="This is the whole book, it really doesn't have tons of stuff, but it should work fine for a simpel example."

First we want to create a token for each character. This can be done easily in many ways, for example

In [3]:
vocab=sorted(set(corpus))
char_to_token={char:i for i,char in enumerate(vocab)}
token_to_char={i:char for i,char in enumerate(vocab)}

We can count the distinct characters with

In [4]:
len(vocab)

26

Next, we create the tokens associated with the data

In [5]:
tokens_data=[char_to_token[char] for char in corpus]
print(tokens_data)

[4, 10, 11, 19, 0, 11, 19, 0, 20, 10, 8, 0, 23, 10, 16, 13, 8, 0, 6, 16, 16, 12, 2, 0, 11, 20, 0, 18, 8, 5, 13, 13, 25, 0, 7, 16, 8, 19, 15, 1, 20, 0, 10, 5, 22, 8, 0, 20, 16, 15, 19, 0, 16, 9, 0, 19, 20, 21, 9, 9, 2, 0, 6, 21, 20, 0, 11, 20, 0, 19, 10, 16, 21, 13, 7, 0, 23, 16, 18, 12, 0, 9, 11, 15, 8, 0, 9, 16, 18, 0, 5, 0, 19, 11, 14, 17, 8, 13, 0, 8, 24, 5, 14, 17, 13, 8, 3]
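As a sanity check on the two dictionaries, decoding the tokens should give back the original text. A minimal sketch, using a short stand-in string (`text` is an assumption made for illustration; the notebook's `corpus` would behave the same way):

```python
# Short stand-in text; not part of the notebook's data.
text = "hello world"

# Same construction as above: one integer token per distinct character.
vocab = sorted(set(text))
char_to_token = {char: i for i, char in enumerate(vocab)}
token_to_char = {i: char for i, char in enumerate(vocab)}

# Encode, then decode again.
tokens = [char_to_token[char] for char in text]
decoded = "".join(token_to_char[t] for t in tokens)

print(tokens)
print(decoded == text)
```

Because `char_to_token` and `token_to_char` are inverses of each other, the round trip loses nothing.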


As we want sequences of five elements, we break tokens_data into overlapping windows of length five

In [6]:
tokens_data_seq=[tokens_data[i:i+5] for i in range(len(tokens_data)-5)]
print(tokens_data_seq[:5])
print("\nThere are %d sequences in the corpus."%len(tokens_data_seq))

[[4, 10, 11, 19, 0], [10, 11, 19, 0, 11], [11, 19, 0, 11, 19], [19, 0, 11, 19, 0], [0, 11, 19, 0, 20]]

There are 102 sequences in the corpus.


We can get the X and y values from this: each target in y is the corresponding input sequence in X shifted one character forward

In [7]:
X=tokens_data_seq[:-1]
y=tokens_data_seq[1:]
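To see what this pairing does, here is a small sketch on a toy string, where each character stands in for its own token (the string and the name `seqs` are assumptions made for illustration). Each target overlaps its input in four positions and contributes one new character at the end:

```python
text = "abcdefgh"
window = 5

# Same slicing as above, applied to characters instead of tokens.
seqs = [text[i:i + window] for i in range(len(text) - window)]
X = seqs[:-1]
y = seqs[1:]

for x_seq, y_seq in zip(X, y):
    # The target is the input moved one position forward.
    assert y_seq[:-1] == x_seq[1:]
    print(x_seq, "->", y_seq, "| next character:", y_seq[-1])
```

The last element of each target is the character that follows the input window, which is the one we care about when generating text.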

We can then make two batches of 50 sequences out of this (the last sequence is dropped so that both batches have the same size)

In [8]:
X_batches=np.array([X[:50],X[50:-1]])
y_batches=np.array([y[:50],y[50:-1]])

In [9]:
X_batches.shape

(2, 50, 5)

That is, there are two batches of 50 sequences each. This, together with the dicts, is (almost) what the method process_data returns. We now import TensorFlow.

In [10]:
import tensorflow as tf

Equipped with the data, we are ready to build our graph. As always, we start with the placeholders.

In [11]:
input_tokens=tf.placeholder(shape=(None,5), dtype=tf.float32,name='input_tokens')
output_tokens=tf.placeholder(shape=(None,5),dtype=tf.int32,name='output_tokens')

These will be fed to an RNN made of LSTMs, hence the first step

<img src="lstmRNN.png">



In [12]:
with tf.name_scope("Recurrent_layers"):
    
    # three stacked LSTM cells with 100 units each
    lstms=[tf.contrib.rnn.LSTMCell(100) for i in range(3)]
    stacked_lstm = tf.contrib.rnn.MultiRNNCell(lstms)
    # zero state for a batch of 50; it is not passed to dynamic_rnn below,
    # which builds its own zero state from dtype
    initial_state=state=stacked_lstm.zero_state(50,dtype=tf.float32)
    
    # we must reshape the inputs since an RNN takes vectors, not scalars
    
    output = tf.reshape(input_tokens, (-1,5,1),name="reshaped_input")
    output,_ = tf.nn.dynamic_rnn(stacked_lstm,output,dtype=tf.float32)
    
    # now the output is a rank 3 tensor of shape (batch, 5, 100);
    # we want to make it a matrix so we can use a softmax layer
    
    output = tf.reshape(output,[-1,100])
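The final reshape turns one 100-dimensional vector per time step into one row of a matrix. A small NumPy sketch of the same bookkeeping, assuming a stand-in array in place of the real RNN output (the sizes and the names `rnn_output` and `flat` are illustrative):

```python
import numpy as np

batch_size, seq_len, hidden = 2, 5, 100

# Stand-in for the RNN output: one 100-dimensional vector per time step.
rnn_output = np.arange(batch_size * seq_len * hidden, dtype=np.float32)
rnn_output = rnn_output.reshape(batch_size, seq_len, hidden)

# Collapse the batch and time axes into a single axis of rows.
flat = rnn_output.reshape(-1, hidden)
print(rnn_output.shape, "->", flat.shape)

# Row i * seq_len + t of the matrix is time step t of sequence i.
i, t = 1, 3
print(np.array_equal(flat[i * seq_len + t], rnn_output[i, t]))
```

This means that, for a single input sequence, the last row of the matrix corresponds to the last time step.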

Next we create the softmax layer

In [13]:
with tf.name_scope("Softmax_layer"):
    W = tf.Variable(tf.random_normal(shape=(100,26)))
    b= tf.Variable(tf.zeros(26))
    
    output = tf.matmul(output,W)+b

Note the lack of an activation function in this layer: the softmax is applied inside the loss op

In [14]:
with tf.name_scope("loss"):
    one_hot=tf.nn.embedding_lookup(tf.eye(26),ids=output_tokens)
    loss=tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            logits=output,labels=one_hot
        )
    )
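To make explicit what "softmax inside the loss" means, here is a plain-Python sketch of softmax cross-entropy for a single position with a one-hot label. The function name `softmax_cross_entropy` and the example scores are assumptions made for illustration; this is not TensorFlow's implementation, only the same formula:

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy between softmax(logits) and a one-hot label index."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) simplifies to log_sum_exp - logits[label].
    return log_sum_exp - logits[label]

# Hypothetical raw scores for a 3-character vocabulary.
logits = [2.0, 1.0, 0.1]
print(softmax_cross_entropy(logits, 0))

# The same value computed the long way: softmax first, then -log.
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]
print(-math.log(probs[0]))
```

Working directly from the logits avoids computing the probabilities explicitly, which is why the layer above outputs raw scores.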


Let's use the Adam optimizer for the gradient descent

In [15]:
with tf.name_scope("optimizer"):
    global_step=tf.Variable(0,dtype=tf.int32,trainable=False,name="global_step")
    optimizer=tf.train.AdamOptimizer().minimize(loss,global_step=global_step)

As we want to be able to see how we are doing, we create some summary operations.

In [16]:
with tf.name_scope("summaries"):
    loss_summary =tf.summary.scalar("Loss",loss)
    loss_histogram = tf.summary.histogram("hist_loss",loss)
    summary_op=tf.summary.merge((loss_summary,loss_histogram))

We are ready for the training part

In [17]:
sess=tf.Session()

We initialize the variables

In [18]:
sess.run(tf.global_variables_initializer())

In [19]:
writer = tf.summary.FileWriter("./graph_notebook",sess.graph)

Now we run the training for a number of epochs

In [20]:
for i in range(200):
    print("Epoch %d"%(i+1), end="\r")
    for X_batch,y_batch in zip(X_batches,y_batches):
        feed_dict={input_tokens:X_batch,output_tokens:y_batch}
        _,summary=sess.run([optimizer,summary_op],feed_dict=feed_dict)
        writer.add_summary(summary,global_step=sess.run(global_step))

Epoch 200

Let's see how we did. To check, we need to create a story; we start with 5 characters

In [21]:
story='this '

Then we add one character at a time. First we convert the story to tokens.

In [22]:
story_tokens=[char_to_token[x] for x in story]

Now we create a loop that generates 1000 characters.

In [23]:
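for i in range(1000):
    # feed the last 5 tokens; the final row of the output holds the logits for the next character
    X=[story_tokens[-5:]]
    story_tokens.append(np.argmax(sess.run(output,feed_dict={input_tokens:X})[-1]))
# turn the tokens back into characters
story=''.join([token_to_char[i] for i in story_tokens])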

In [24]:
story

"this is the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book, it really doesn't have tons of stuff, but it the whole book,
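The repetition in this output follows from the generation scheme: the next character is chosen with argmax from the last 5 characters only, so as soon as a 5-character window comes up a second time, the text after it repeats as well. A minimal sketch of this effect, where a lookup table stands in for the trained network (the function `generate`, the toy corpus, and `table` are all hypothetical):

```python
def generate(seed, predict_next, n_chars, window=5):
    """Greedy generation: repeatedly append predict_next(last `window` chars)."""
    text = seed
    for _ in range(n_chars):
        text += predict_next(text[-window:])
    return text

# Hypothetical stand-in for the network: a table that maps each
# 5-character window of a toy corpus to the character that followed it.
corpus = "abcdefgabcde"
table = {corpus[i:i + 5]: corpus[i + 5] for i in range(len(corpus) - 5)}

# Starting from "abcde", the window "abcde" comes back after 7 steps,
# and from then on the generated text cycles.
out = generate("abcde", table.__getitem__, 14)
print(out)
```

Sampling from the predicted distribution instead of always taking the argmax is one common way to break such cycles.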

In [25]:
sess.close()

**Exercise:** Modify the code so it works for some other text, or any text.