<div style="    font-variant: small-caps;
    font-weight: normal;
    font-size: 30px;
    text-align: center;
    padding: 15px;
margin: 10px;">Long Short Term Memory (LSTM) in a Recurrent Neural Network (RNN) for Language Modeling</div>
<div style="    font-variant: small-caps;
    font-weight: normal;
    font-size: 20px;
    text-align: center;
    padding: 15px;">Deep Learning</div>
<div style="  float:right;
    font-size: 12px;
    line-height: 12px;
padding: 10px 15px 8px;">Luca BENEDETTO | Alberto IBARRONDO</div>

<div style=" display: inline-block; font-family: 'Lato', sans-serif; font-size: 12px; font-weight: bold; line-height: 12px; letter-spacing: 1px; padding: 10px 15px 8px; ">07/06/2017</div>

# Summary
This notebook builds and trains a Recurrent Neural Network based on Long Short-Term Memory (LSTM) units for a next-word prediction task.

# 1. Introduction

We will train an LSTM to predict the next word using a sample short story. The LSTM learns to predict the next item of a sentence from the 3 previous items (given as input). Punctuation marks are treated as dictionary items, so they can be predicted too. Figure 1 shows the LSTM and the process of next word prediction.

<img src="lstm.png" height="370" width="370"> 


Each word (and punctuation mark) from the text is encoded by a unique integer. The integer value corresponds to the index of that word (or punctuation mark) in the dictionary. The network outputs a vector with one score per dictionary entry; the index of the largest score identifies the predicted word in the reversed dictionary (see the "Symbols encoding" section). For example, with the dictionary printed in that section, a predicted index of 84 corresponds to the word "company".
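
As a minimal sketch of the "3 previous items, then the next item" setup described above, the snippet below pairs each 3-item context with the item that follows it. The helper name `context_windows` and the short token list are illustrative assumptions and are not part of the notebook.

```python
def context_windows(tokens, n_input=3):
    """Return (context, target) pairs: n_input consecutive tokens and the next one."""
    pairs = []
    for start in range(len(tokens) - n_input):
        context = tokens[start:start + n_input]
        target = tokens[start + n_input]
        pairs.append((context, target))
    return pairs


# Toy token list for illustration: the first words of the story,
# with a period appended as its own token.
tokens = ['there', 'was', 'once', 'a', 'young', 'shepherd', 'boy', '.']
for context, target in context_windows(tokens):
    print(context, '->', target)
```

Each printed line is one training example: the three words on the left are the input and the word on the right is what the network should predict.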



We use a sample short story from Aesop’s Fables (http://www.taleswithmorals.com/) to train the model. 


<div class="alert alert-warning">"There was once a young Shepherd Boy who tended his sheep at the foot of a mountain near a dark forest.
It was rather lonely for him all day, so he thought upon a plan by which he could get a little company and some excitement.
He rushed down towards the village calling out "Wolf, Wolf," and the villagers came out to meet him, and some of them stopped with him for a considerable time.
This pleased the boy so much that a few days afterwards he tried the same trick, and again the villagers came to his help.
But shortly after this a Wolf actually did come out from the forest, and began to worry the sheep, and the boy of course cried out "Wolf, Wolf," still louder than before.
But this time the villagers, who had been fooled twice before, thought the boy was again deceiving them, and nobody stirred to come to his help.
So the Wolf made a good meal off the boy's flock, and when the boy complained, the wise man of the village said:
"A liar will not be believed, even when he speaks the truth".</div>







## 1.1 Loading libraries
We start by loading the necessary libraries and creating the summary writer used for the TensorBoard logs. For more details about the rnn packages we import, see https://www.tensorflow.org/api_guides/python/contrib.rnn

In [1]:
import numpy as np
import collections # used to build the dictionary
import random
import time
import pickle # may be used to save your model 
import matplotlib.pyplot as plt
#Import Tensorflow and rnn
import tensorflow as tf
from tensorflow.contrib import rnn  

# Target log path
logs_path = 'lstm_words'
writer = tf.summary.FileWriter(logs_path)

## 1.2 Loading data

Let's load and split the text of our story.

In [2]:
def load_data(filename):
    with open(filename) as f:
        lines = f.readlines()
    # lowercase each line, split it on whitespace and flatten everything
    # into a single 1-D array of tokens (works for one or several lines)
    data = [word for line in lines for word in line.strip().lower().split()]
    data = np.array(data)
    print(data)
    return data

#Run the cell 
train_file ='data/story.txt'
train_data = load_data(train_file)
print("Loaded training data...")
print("Length training data: %d" %len(train_data))
print("Number of unique elements: %d" %len(set(train_data)))

['there' 'was' 'once' 'a' 'young' 'shepherd' 'boy' 'who' 'tended' 'his'
 'sheep' 'at' 'the' 'foot' 'of' 'a' 'mountain' 'near' 'a' 'dark' 'forest'
 '.' 'it' 'was' 'rather' 'lonely' 'for' 'him' 'all' 'day' ',' 'so' 'he'
 'thought' 'upon' 'a' 'plan' 'by' 'which' 'he' 'could' 'get' 'a' 'little'
 'company' 'and' 'some' 'excitement' '.' 'he' 'rushed' 'down' 'towards'
 'the' 'village' 'calling' 'out' 'wolf' ',' 'wolf' ',' 'and' 'the'
 'villagers' 'came' 'out' 'to' 'meet' 'him' ',' 'and' 'some' 'of' 'them'
 'stopped' 'with' 'him' 'for' 'a' 'considerable' 'time' '.' 'this'
 'pleased' 'the' 'boy' 'so' 'much' 'that' 'a' 'few' 'days' 'afterwards'
 'he' 'tried' 'the' 'same' 'trick' ',' 'and' 'again' 'the' 'villagers'
 'came' 'to' 'his' 'help' '.' 'but' 'shortly' 'after' 'this' 'a' 'wolf'
 'actually' 'did' 'come' 'out' 'from' 'the' 'forest' ',' 'and' 'began' 'to'
 'worry' 'the' 'sheep,' 'and' 'the' 'boy' 'of' 'course' 'cried' 'out'
 'wolf' ',' 'wolf' ',' 'still' 'louder' 'than' 'before' '.' 'but' 't
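
The output above contains the token 'sheep,' because `load_data` splits on whitespace only, so punctuation must already be separated by spaces in the text file. As a hedged alternative (the function `tokenize` and its regular expression are our own illustration, not what the notebook uses), punctuation can be split off in code:

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words and single punctuation marks."""
    # [a-z']+ keeps apostrophes inside words such as "boy's";
    # the second alternative emits each punctuation mark as its own token.
    # Any other character (digits, quotation marks) is simply dropped.
    return re.findall(r"[a-z']+|[.,:;!?]", text.lower())


print(tokenize("began to worry the sheep, and the boy's flock."))
```

With this tokenizer 'sheep' and ',' become two separate dictionary items, which would slightly change the vocabulary size reported below.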

## 1.3 Symbols encoding

The LSTM inputs can only be numbers. One way to convert words (symbols or any items) to numbers is to assign a unique integer to each word. The assignment is often based on frequency of occurrence, so that the most frequent words get the smallest indices.

Here, we define a function to build an indexed word dictionary (word->number). The "build_vocabulary" function builds both:

- Dictionary: used for encoding words to numbers for the LSTM inputs
- Reverse dictionary: used for decoding the outputs of the LSTM into words (and punctuation).

For example, in the story above, we have **113** distinct items (words and punctuation marks). The "build_vocabulary" function builds a dictionary with entries such as ['the': 0], [',': 1], ['company': 84],...


In [3]:
def build_vocabulary(words):
    count = collections.Counter(words).most_common()
    dic= dict()
    for word, _ in count:
        dic[word] = len(dic)
    reverse_dic= dict(zip(dic.values(), dic.keys()))
    return dic, reverse_dic
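
To make the two dictionaries concrete, here is a small round trip on a toy token list. It follows the same frequency-ordering idea as `build_vocabulary`; the names `word_to_id` and `id_to_word` and the toy tokens are assumptions made for this illustration.

```python
import collections

tokens = ['the', 'boy', 'and', 'the', 'wolf', 'and', 'the', 'sheep']

# More frequent words get smaller indices.
word_to_id = {}
for word, _ in collections.Counter(tokens).most_common():
    word_to_id[word] = len(word_to_id)
id_to_word = {index: word for word, index in word_to_id.items()}

encoded = [word_to_id[word] for word in tokens]    # words -> integers
decoded = [id_to_word[index] for index in encoded]  # integers -> words
print(word_to_id)
print(encoded)
print(decoded == tokens)
```

Words with the same count ('boy', 'wolf' and 'sheep' here) are ordered by however the counter breaks ties, so the exact index of a rare word may depend on the Python version used.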


Let's display the vocabulary

In [4]:
dictionary, reverse_dictionary = build_vocabulary(train_data)
vocabulary_size= len(dictionary) 
print("Vocabulary_size = ", vocabulary_size)
print("\n")
print("Dictionary : \n")
print(dictionary)
print("\n")
print("Reverted Dictionary : \n" )
print(reverse_dictionary)

Vocabulary_size =  113


Dictionary : 

{'all': 32, 'liar': 33, 'help': 17, 'cried': 34, 'course': 35, 'still': 36, 'pleased': 37, 'before': 18, 'excitement': 91, 'deceiving': 38, 'had': 39, 'young': 69, 'actually': 40, 'to': 6, 'villagers': 11, 'shepherd': 41, 'them': 19, 'lonely': 42, 'get': 44, 'dark': 45, 'not': 64, 'day': 47, 'did': 48, 'calling': 49, 'twice': 50, 'good': 51, 'stopped': 52, 'truth': 53, 'meal': 54, 'sheep,': 55, 'some': 20, 'tended': 56, 'louder': 57, 'flock': 58, 'out': 9, 'even': 59, 'trick': 60, 'said': 61, 'for': 21, 'be': 62, 'after': 63, 'come': 22, 'by': 65, 'boy': 7, 'of': 10, 'could': 66, 'days': 67, 'wolf': 5, 'afterwards': 68, ',': 1, 'down': 70, 'village': 23, 'sheep': 72, 'little': 73, 'from': 74, 'rushed': 75, 'there': 76, 'been': 77, '.': 4, 'few': 78, 'much': 79, "boy's": 80, ':': 81, 'was': 12, 'a': 2, 'him': 13, 'that': 83, 'company': 84, 'nobody': 85, 'but': 24, 'fooled': 86, 'with': 87, 'than': 43, 'he': 8, 'made': 89, 'wise': 90, 'this': 14, '

# 2. LSTM Model in TensorFlow

We now develop an LSTM model to predict the word following a sequence of 3 words.

## 2.1 Model definition

We define a 2-layer LSTM model.

In [1]:
def LSTMModel(x, n_input, weights, biases):

    # reshape to [1, n_input]
    x = tf.reshape(x, [-1, n_input])

    # Generate an n_input-element sequence of inputs
    # (e.g. [get] [a] [little] -> [44] [2] [73])
    x = tf.split(x,n_input,1)

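    # 2-layer LSTM with n_hidden units per layer.
    # Each layer gets its own cell object: reusing a single instance
    # ([rnn_cell]*2) can fail or share weights in some TensorFlow 1.x releases.
    rnn_cell = rnn.MultiRNNCell([rnn.BasicLSTMCell(n_hidden) for _ in range(2)])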
    
    # generate prediction
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)

    # there are n_input outputs but
    # we only want the last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
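
For intuition about what each LSTM unit computes at every time step, here is a minimal single-unit sketch in plain Python. It follows the standard LSTM gate equations; the function `lstm_step`, the weight layout and the toy values are assumptions for illustration, and the `forget_bias` default of 1.0 is, to our understanding, the default used by `rnn.BasicLSTMCell`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w, forget_bias=1.0):
    """One time step of a single-unit LSTM.

    w maps each gate name to a (w_x, w_h, b) triple of scalars.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre('input'))                 # how much new information to write
    f = sigmoid(pre('forget') + forget_bias)  # how much of the old cell state to keep
    o = sigmoid(pre('output'))                # how much of the cell state to expose
    g = math.tanh(pre('candidate'))           # candidate value to write
    c = f * c_prev + i * g                    # new cell state
    h = o * math.tanh(c)                      # new hidden state (the unit's output)
    return h, c


# Toy check with all weights at zero: every gate is sigmoid(0) = 0.5.
zero = (0.0, 0.0, 0.0)
toy_weights = {'input': zero, 'forget': zero, 'output': zero, 'candidate': zero}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=1.0, w=toy_weights, forget_bias=0.0)
print(h, c)
```

In the model above this computation runs with 64 units per layer, and the hidden states of the first layer are the inputs of the second.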


## 2.2 Training parameters

In [7]:
# Training Parameters
learning_rate = 0.001
epochs = 50000
display_step = 1000
n_input = 3

# Hidden dimension: number of units in each LSTM cell
n_hidden = 64
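
As a back-of-the-envelope check of the model size these parameters imply, the sketch below counts trainable parameters. It assumes that each LSTM layer holds one weight matrix of shape [input_size + n_hidden, 4 * n_hidden] plus a bias of length 4 * n_hidden (our understanding of `BasicLSTMCell`), that the first layer receives one number per time step, and that the vocabulary has the 113 entries printed earlier. The helper `lstm_layer_params` is hypothetical.

```python
def lstm_layer_params(input_size, n_hidden):
    """Parameters of one LSTM layer: 4 gates, each with weights and a bias."""
    return 4 * ((input_size + n_hidden) * n_hidden + n_hidden)


n_hidden = 64
vocabulary_size = 113  # value printed in the vocabulary section

layer_1 = lstm_layer_params(input_size=1, n_hidden=n_hidden)
layer_2 = lstm_layer_params(input_size=n_hidden, n_hidden=n_hidden)
output_layer = n_hidden * vocabulary_size + vocabulary_size
total = layer_1 + layer_2 + output_layer
print(layer_1, layer_2, output_layer, total)  # 16896 33024 7345 57265
```

Under these assumptions the network has tens of thousands of parameters for a single short story, which is worth keeping in mind when interpreting the very high training accuracies reached later.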

## 2.3 Graph creation

In [8]:
# LSTM  weights and biases
weights = { 'out': tf.Variable(tf.random_normal([n_hidden, vocabulary_size]))}
biases = {'out': tf.Variable(tf.random_normal([vocabulary_size])) }

# tf Graph input
x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocabulary_size])

#build the model
pred = LSTMModel(x, n_input, weights, biases)

# Loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# Model evaluation
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
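
The loss used above applies a softmax to the logits and then takes the cross-entropy against the one-hot label. The plain-Python sketch below reproduces that computation for a single example; the function name `softmax_cross_entropy` is our own and the uniform logits are a toy input.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy between softmax(logits) and a one-hot target."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    shift = max(logits)
    log_sum = math.log(sum(math.exp(v - shift) for v in logits))
    return -(logits[target_index] - shift - log_sum)


vocabulary_size = 113
uniform_logits = [0.0] * vocabulary_size  # a model with no preference at all
loss = softmax_cross_entropy(uniform_logits, target_index=0)
print(round(loss, 4))  # ln(113), about 4.7274
```

A model that spreads its probability evenly over the 113 entries has a loss of ln(113), so the values of roughly 4.5 to 4.6 logged at the start of training further below are close to chance level.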

## 2.4 Test Function

In [3]:
def LSTMtest(sentence, session, verbose=False):
    sentence = sentence.strip()
    words = sentence.split(' ')
    if len(words) != n_input:
        print("sentence length should be equal to", n_input, "!")
        return None
    try:
        # encode the input words as integers
        symbols_inputs = [dictionary[str(word)] for word in words]
    except KeyError as err:
        # err.args[0] is the word that is missing from the dictionary
        print("Word", err.args[0], "not in dictionary")
        return None
    keys = np.reshape(np.array(symbols_inputs), [-1, n_input, 1])
    # pred holds one score per word; the largest score gives the predicted word
    onehot_pred = session.run(pred, feed_dict={x: keys})
    onehot_pred_index = int(np.argmax(onehot_pred, 1)[0])
    words.append(reverse_dictionary[onehot_pred_index])
    sentence = " ".join(words)
    if verbose:
        print(sentence)
    return reverse_dictionary[onehot_pred_index]

# 3. Training the LSTM

During training, at each step (called an epoch in the code), 3 consecutive words are taken from the training data and encoded as integers to form the input vector. The training label is a one-hot vector encoding the word that comes after the 3 input words. We display the loss and the training accuracy every 1000 iterations and save the model at the end of training in the **lstm_model** folder.
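
A single training example as described above can be sketched in plain Python as follows. The helper `make_sample` and the four-word toy vocabulary are assumptions for illustration; the notebook itself uses `dictionary` and NumPy arrays.

```python
def make_sample(tokens, word_to_id, offset, n_input=3):
    """Return (input indices, one-hot label) for the window starting at offset."""
    inputs = [word_to_id[tokens[i]] for i in range(offset, offset + n_input)]
    label = [0.0] * len(word_to_id)
    label[word_to_id[tokens[offset + n_input]]] = 1.0
    return inputs, label


# Toy vocabulary for illustration only.
toy_ids = {'get': 0, 'a': 1, 'little': 2, 'company': 3}
toy_tokens = ['get', 'a', 'little', 'company']
inputs, label = make_sample(toy_tokens, toy_ids, offset=0)
print(inputs)  # [0, 1, 2]
print(label)   # [0.0, 0.0, 0.0, 1.0]
```

The label has as many entries as the vocabulary, with a single 1.0 at the index of the word that follows the window.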

In [11]:
model_saver = tf.train.Saver()

In [14]:
def LSTMTrain(name):
    # Initializing the variables
    start_time = time.time()
    init = tf.global_variables_initializer()

    print("Start Training")
    ##############################################

    with tf.Session() as session:
        session.run(init)
        step = 0
        offset = random.randint(0,n_input+1)
        end_offset = n_input + 1
        acc_total = 0
        loss_total = 0

        writer.add_graph(session.graph)

        while step < epochs:
            # Generate a minibatch. Add some randomness on selection process.
            if offset > (len(train_data) - end_offset):
                offset = random.randint(0, n_input+1)

            symbols_in_keys = [ [dictionary[ str(train_data[i])]] for i in range(offset, offset+n_input) ]
            symbols_in_keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])

            symbols_out_onehot = np.zeros([vocabulary_size], dtype=float)
            symbols_out_onehot[dictionary[str(train_data[offset+n_input])]] = 1.0
            symbols_out_onehot = np.reshape(symbols_out_onehot,[1,-1])

            _, acc, loss, onehot_pred = session.run([optimizer, accuracy, cost, pred], \
                                                    feed_dict={x: symbols_in_keys, y: symbols_out_onehot})
            loss_total += loss
            acc_total += acc

            if (step+1) % display_step == 0:

                symbols_in = [train_data[i] for i in range(offset, offset + n_input)]
                symbols_out = train_data[offset + n_input]
                # index of the largest score = predicted word
                symbols_out_pred = reverse_dictionary[int(np.argmax(onehot_pred, 1)[0])]

                # end=" " keeps the metrics and the example on the same output line
                print("Iter= " + str(step+1) + ", Loss= " + \
                      "{:.6f}".format(loss_total/display_step) + ", TrAcc= " + \
                      "{:.2f}%".format(100*acc_total/display_step), end=" ")
                print("%s - [%s] vs [%s]" % (symbols_in,symbols_out,symbols_out_pred))

                acc_total = 0
                loss_total = 0

            step += 1
            offset += (n_input+1)

    ##############################################
        print("Optimization Finished!")
        print("Elapsed time:%.1f s" %(time.time() - start_time))
        print("Run on command line: tensorboard --logdir=%s" % (logs_path))
        print("Point your web browser to the returned link")
    ##############################################
        model_saver.save(sess=session, save_path='lstm_model/'+name)
    ##############################################
        print("Model saved")       
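
The way the training loop walks through the text can be isolated in a few lines. The sketch below reproduces only the offset logic: the window advances by `n_input + 1` tokens per step and restarts from a small random offset once it would run past the end of the text. The generator `window_offsets`, the fixed seed and the text length of 20 tokens are assumptions for illustration.

```python
import random


def window_offsets(n_tokens, n_input, steps, seed=0):
    """Yield the start offset of the input window used at each training step."""
    rng = random.Random(seed)
    end_offset = n_input + 1
    offset = rng.randint(0, n_input + 1)
    for _ in range(steps):
        # Restart near the beginning once the window plus its label
        # would no longer fit inside the text.
        if offset > n_tokens - end_offset:
            offset = rng.randint(0, n_input + 1)
        yield offset
        offset += n_input + 1


offsets = list(window_offsets(n_tokens=20, n_input=3, steps=8))
print(offsets)
```

Because each pass starts from a different random offset, successive passes over the story see differently aligned windows rather than always the same ones.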

In [16]:
LSTMTrain('LSTMmodel_3')

Start Training
Iter= 1000, Loss= 4.619733, TrAcc= 6.50% ['speaks', 'the', 'truth'] - [.] vs [he]
Iter= 2000, Loss= 3.736615, TrAcc= 13.80% ['he', 'speaks', 'the'] - [truth] vs [forest]
Iter= 3000, Loss= 3.441579, TrAcc= 18.10% ['a', 'liar', 'will'] - [not] vs [:]
Iter= 4000, Loss= 2.886903, TrAcc= 32.50% ['believed', ',', 'even'] - [when] vs [shepherd]
Iter= 5000, Loss= 2.466733, TrAcc= 39.40% ['a', 'liar', 'will'] - [not] vs [:]
Iter= 6000, Loss= 2.355309, TrAcc= 41.70% ['be', 'believed', ','] - [even] vs [boy's]
Iter= 7000, Loss= 1.837935, TrAcc= 54.80% ['the', 'wise', 'man'] - [of] vs [of]
Iter= 8000, Loss= 1.723356, TrAcc= 55.60% [',', 'and', 'when'] - [the] vs [of]
Iter= 9000, Loss= 1.680242, TrAcc= 55.10% [',', 'the', 'wise'] - [man] vs [man]
Iter= 10000, Loss= 1.190535, TrAcc= 68.40% ['a', 'liar', 'will'] - [not] vs [not]
Iter= 11000, Loss= 1.245663, TrAcc= 65.70% ['young', 'shepherd', 'boy'] - [who] vs [who]
Iter= 12000, Loss= 0.923475, TrAcc= 76.30% ['once', 'a', 'young'] - [s

# 4. Testing the RNN

## 4.1 Next word prediction

We load the model (using the `model_saver` variable defined in the training section) and test the following sentences:
- 'get a little'
- 'nobody tried to'
- other sentences built from words in the story's vocabulary.

In [17]:
with tf.Session() as session:
    # Restore variables from disk.
    model_saver.restore(session, "lstm_model/LSTMmodel_3")
    print("Model restored.")
    print("\nTesting the model")
    test_sentences = ['get a little', 'nobody tried to', 'he rushed down', 'a wolf actually', 'a liar will']
    for sentence in test_sentences:
        LSTMtest(sentence, session, verbose=True)

Model restored.

Testing the model
get a little company
nobody tried to come
he rushed down towards
a wolf actually did
a liar will not


## 4.2 More fun with the Story Writer

Let's use the RNN/LSTM model trained above to create a
new story/fable.
For this we choose 3 words from the dictionary, which start the
story and initialize the network. From those 3 words the RNN generates
the next word of the story. We then take the last 3 words (the newly predicted one
and the last 2 from the input) and use the network to predict the fifth
word of the story, and so on until the story is 5 sentences long.

In [34]:
def RNNcreate_story(sentence, session, numOfSentences=5, n_input=3, verbose=False):
    # n_input must match the value used to build the restored graph
    cnt_sent = 0
    # marks counted as the end of a "sentence"; note that ',', ':' and ';' count too
    end_sent = ['.',',',':',';','!','?']
    
    sentence = sentence.strip()
    words = sentence.split(' ')
    # generate until enough marks have been produced
    while cnt_sent < numOfSentences or (cnt_sent == numOfSentences and sentence[-1] not in ['.','!','?']):
        # encode the last n_input words as integers
        symbols_inputs = [dictionary[str(words[i - n_input])] for i in range(n_input)]
        keys = np.reshape(np.array(symbols_inputs), [-1, n_input, 1])
        onehot_pred = session.run(pred, feed_dict={x: keys})
        # index of the largest score = predicted word
        onehot_pred_index = int(np.argmax(onehot_pred, 1)[0])
        new_word = reverse_dictionary[onehot_pred_index]
        words.append(new_word)
        sentence += " " + words[-1]
        # slide the window: keep only the last n_input words
        words = words[-n_input:]
        if verbose:
            print(sentence)
        if new_word in end_sent:
            cnt_sent += 1
    print(sentence)

In [19]:
#Your implementation goes here 
with tf.Session() as session:
    # Restore variables from disk.
    model_saver.restore(session, "lstm_model/LSTMmodel_3")
    print("Model restored.\n")
    RNNcreate_story('a wolf actually', session, numOfSentences=5)

Model restored.

a wolf actually did come out from the forest , and began stirred to come to his help . so the wolf made a good will again a this pleased the boy so much that a few days afterwards he tried the truth of a mountain near a dark will not twice before . but the villagers came out to meet him , and some of them stopped with him for a considerable time .


## 4.3 Playing with number of inputs

So far the number of inputs has been 3. We now look at what happens with other values (1, 2 and 5).

In [45]:
tf.reset_default_graph()

# Training Parameters
learning_rate = 0.001
epochs = 50000
display_step = 1000
n_input = 1
n_hidden = 64

weights = { 'out': tf.Variable(tf.random_normal([n_hidden, vocabulary_size]))}
biases = {'out': tf.Variable(tf.random_normal([vocabulary_size])) }

x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocabulary_size])

pred = LSTMModel(x, n_input, weights, biases)

# Loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# Model evaluation
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

model_saver = tf.train.Saver()

In [24]:
LSTMTrain('LSTMmodel_1')

Start Training
Iter= 1000, Loss= 4.519836, TrAcc= 6.10% ['time'] - [.] vs [.]
Iter= 2000, Loss= 4.327969, TrAcc= 8.20% ['stirred'] - [to] vs [the]
Iter= 3000, Loss= 4.211221, TrAcc= 8.50% ['company'] - [and] vs [the]
Iter= 4000, Loss= 4.007069, TrAcc= 10.60% ['of'] - [course] vs [came]
Iter= 5000, Loss= 3.964375, TrAcc= 10.60% ['a'] - [young] vs [and]
Iter= 6000, Loss= 4.051782, TrAcc= 10.10% ['a'] - [few] vs [and]
Iter= 7000, Loss= 3.840868, TrAcc= 13.30% ['made'] - [a] vs [the]
Iter= 8000, Loss= 3.962126, TrAcc= 11.90% ['rushed'] - [down] vs [a]
Iter= 9000, Loss= 3.617404, TrAcc= 13.80% [','] - [wolf] vs [and]
Iter= 10000, Loss= 3.657220, TrAcc= 14.50% ['his'] - [sheep] vs [.]
Iter= 11000, Loss= 3.966666, TrAcc= 12.30% ['he'] - [tried] vs [,]
Iter= 12000, Loss= 3.806246, TrAcc= 12.50% ['off'] - [the] vs [to]
Iter= 13000, Loss= 3.816383, TrAcc= 12.50% ['calling'] - [out] vs [a]
Iter= 14000, Loss= 3.806111, TrAcc= 12.10% ['but'] - [this] vs [.]
Iter= 15000, Loss= 3.783292, TrAcc= 13.10

In [51]:
#Your implementation goes here 
with tf.Session() as session:
    # Restore variables from disk.
    model_saver.restore(session, "lstm_model/LSTMmodel_1")
    print("Model restored.\n")
    RNNcreate_story('foot', session, numOfSentences=5, n_input=1)

Model restored.

foot he wolf , and some . he wolf , and some . he wolf , and some .


<div class='alert alert-info'>
We can see that with only one input the model almost immediately enters a loop and never generates anything different. Commas and periods are very frequent in the output; this is due to the high relative frequency of these symbols in the original text compared to any other word.
</div>
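
The loop has a simple explanation: a deterministic predictor that sees only one word must output the same next word every time that word appears, so the sequence repeats as soon as any word recurs. The sketch below illustrates this with a table of most frequent followers standing in for the trained LSTM (an assumption made for illustration; it is not the notebook's model). The function `greedy_bigram_story` and the toy sentence are hypothetical.

```python
import collections


def greedy_bigram_story(tokens, first_word, length):
    """Generate words by always picking the most frequent follower."""
    followers = collections.defaultdict(collections.Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        followers[current][nxt] += 1
    story = [first_word]
    # Stop at the requested length or when the last word has no known follower.
    while len(story) < length and followers[story[-1]]:
        story.append(followers[story[-1]].most_common(1)[0][0])
    return story


tokens = 'he cried wolf , and he cried wolf , and he ran .'.split()
story = greedy_bigram_story(tokens, 'he', 12)
print(' '.join(story))  # he cried wolf , and he cried wolf , and he cried
```

The words 'ran' and '.' never appear: once 'he' comes back, the same five-word cycle repeats forever, much like the output above.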

In [56]:
tf.reset_default_graph()

# Training Parameters
learning_rate = 0.001
epochs = 50000
display_step = 1000
n_input = 2
n_hidden = 64

weights = { 'out': tf.Variable(tf.random_normal([n_hidden, vocabulary_size]))}
biases = {'out': tf.Variable(tf.random_normal([vocabulary_size])) }

x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocabulary_size])

pred = LSTMModel(x, n_input, weights, biases)

# Loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# Model evaluation
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

model_saver = tf.train.Saver()

In [57]:
LSTMTrain('LSTMmodel_2')

Start Training
Iter= 1000, Loss= 4.336366, TrAcc= 9.80% ['by', 'which'] - [he] vs [a]
Iter= 2000, Loss= 3.687833, TrAcc= 18.00% ['a', 'considerable'] - [time] vs [the]
Iter= 3000, Loss= 3.523067, TrAcc= 19.00% ['come', 'out'] - [from] vs [who]
Iter= 4000, Loss= 2.776961, TrAcc= 25.70% [',', 'thought'] - [the] vs [the]
Iter= 5000, Loss= 3.306874, TrAcc= 19.50% ['complained', ','] - [the] vs [and]
Iter= 6000, Loss= 3.188090, TrAcc= 20.70% ['.', 'it'] - [was] vs [the]
Iter= 7000, Loss= 2.942333, TrAcc= 22.80% ['the', 'villagers'] - [came] vs [,]
Iter= 8000, Loss= 2.827688, TrAcc= 25.50% ['.', 'but'] - [shortly] vs [this]
Iter= 9000, Loss= 2.517571, TrAcc= 29.40% ['than', 'before'] - [.] vs [,]
Iter= 10000, Loss= 2.511359, TrAcc= 31.00% [',', 'and'] - [when] vs [again]
Iter= 11000, Loss= 2.387003, TrAcc= 34.50% ['the', 'foot'] - [of] vs [calling]
Iter= 12000, Loss= 2.246355, TrAcc= 38.80% [',', 'wolf'] - [,] vs [when]
Iter= 13000, Loss= 2.363784, TrAcc= 36.90% ['the', 'villagers'] - [came]

In [58]:
#Your implementation goes here 
with tf.Session() as session:
    # Restore variables from disk.
    model_saver.restore(session, "lstm_model/LSTMmodel_2")
    print("Model restored.\n")
    RNNcreate_story('a wolf', session, numOfSentences=5, n_input=2)

Model restored.

a wolf actually did come . considerable was for stopped out a wolf actually did come . considerable was for stopped out a wolf actually did come . considerable was for stopped out a wolf actually did come . considerable was for stopped out a wolf actually did come .


<div class='alert alert-info'>
With two input words the situation is similar to the previous one; the difference is that the loop is, as one would expect, a bit longer.
</div>

In [52]:
tf.reset_default_graph()

# Training Parameters
learning_rate = 0.001
epochs = 50000
display_step = 1000
n_input = 5
n_hidden = 64

weights = { 'out': tf.Variable(tf.random_normal([n_hidden, vocabulary_size]))}
biases = {'out': tf.Variable(tf.random_normal([vocabulary_size])) }

x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocabulary_size])

pred = LSTMModel(x, n_input, weights, biases)

# Loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# Model evaluation
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

model_saver = tf.train.Saver()

In [38]:
LSTMTrain('LSTMmodel_5')

Start Training
Iter= 1000, Loss= 4.470892, TrAcc= 8.50% ['come', 'to', 'his', 'help', '.'] - [so] vs [,]
Iter= 2000, Loss= 3.527248, TrAcc= 20.40% ['from', 'the', 'forest', ',', 'and'] - [began] vs [of]
Iter= 3000, Loss= 2.446506, TrAcc= 44.30% ['again', 'the', 'villagers', 'came', 'to'] - [his] vs [deceiving]
Iter= 4000, Loss= 1.936984, TrAcc= 53.30% ['towards', 'the', 'village', 'calling', 'out'] - [wolf] vs [a]
Iter= 5000, Loss= 1.706365, TrAcc= 56.40% ['village', 'said', ':', 'a', 'liar'] - [will] vs [much]
Iter= 6000, Loss= 0.857457, TrAcc= 77.30% [',', 'still', 'louder', 'than', 'before'] - [.] vs [.]
Iter= 7000, Loss= 0.514233, TrAcc= 87.40% ['the', 'villagers', 'came', 'to', 'his'] - [help] vs [help]
Iter= 8000, Loss= 0.455893, TrAcc= 87.90% ['villagers', 'came', 'out', 'to', 'meet'] - [him] vs [him]
Iter= 9000, Loss= 0.274175, TrAcc= 93.10% ['a', 'little', 'company', 'and', 'some'] - [excitement] vs [excitement]
Iter= 10000, Loss= 0.193359, TrAcc= 94.90% ['rather', 'lonely', '

In [55]:
#Your implementation goes here 
with tf.Session() as session:
    # Restore variables from disk.
    model_saver.restore(session, "lstm_model/LSTMmodel_5")
    print("Model restored.\n")
    RNNcreate_story('a boy a wolf and', session, numOfSentences=5, n_input=5)

Model restored.

a boy a wolf and villagers came out to so him time with with him for a considerable time . this pleased the boy so much that a few days afterwards he tried the same trick , and again the boy complained , the wise man of the village said : a liar will not be believed , even when he speaks the truth .


<div class='alert alert-info'>
In this case the number of words we consider is too large compared to the length of the text: the model tends to reproduce the exact sentences that were present in the original text, even though we started from a sentence ('a boy a wolf and') that was not there.
We could consider this as overfitting.
</div>
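
One way to see why longer contexts favour memorization is to count how many positions in a text have a context that is followed by more than one different word. The sketch below does this on a short excerpt stitched together from two phrases of the story; the function `ambiguous_positions` and the excerpt are assumptions for illustration, and the same count on the full story may differ.

```python
import collections


def ambiguous_positions(tokens, n_input):
    """Count positions whose n_input-word context has more than one follower."""
    followers = collections.defaultdict(set)
    for i in range(len(tokens) - n_input):
        followers[tuple(tokens[i:i + n_input])].add(tokens[i + n_input])
    return sum(
        1
        for i in range(len(tokens) - n_input)
        if len(followers[tuple(tokens[i:i + n_input])]) > 1
    )


tokens = ('the villagers came out to meet him , and again '
          'the villagers came to his help .').split()
for n in (1, 2, 3, 5):
    print(n, ambiguous_positions(tokens, n))
```

On this excerpt the counts are 4, 2, 2 and 0: with 5 input words every context has exactly one continuation, so a model with enough capacity can reproduce the text word for word.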