# LSTM with TFlow
This notebook follows the steps in [this article by Rowel Atienza](https://towardsdatascience.com/lstm-by-example-using-tensorflow-feb0c1968537) on building a Long Short-Term Memory (LSTM) network with TensorFlow.

The goal is a model that, given a few seed words, generates a new snippet of text.

As in the linked article, I will use the following short story:

```
long ago , the mice had a general council to consider what measures they could take to outwit their common enemy , the cat . some said this , and some said that but at last a young mouse got up and said he had a proposal to make , which he thought would meet the case . you will all agree , said he , that our chief danger consists in the sly and treacherous manner in which the enemy approaches us . now , if we could receive some signal of her approach , we could easily escape from her . i venture , therefore , to propose that a small bell be procured , and attached by a ribbon round the neck of the cat . by this means we should always know when she was about , and could easily retire while she was in the neighbourhood . this proposal met with general applause , until an old mouse got up and said that is all very well , but who is to bell the cat ? the mice looked at one another and nobody spoke . then the old mouse said it is easy to propose impossible remedies .
```

To start, we need to build our dictionary of words, which maps each word to an integer id.

In [1]:
from collections import Counter
import tensorflow as tf
from tensorflow.contrib import rnn
import random
import numpy as np
import time
# Target log path
logs_path = '/tmp/tensorflow/rnn_words'
writer = tf.summary.FileWriter(logs_path)

In [2]:
def build_dataset(documents):
    """
    Build the word <-> id mappings that the LSTM trains on and predicts with.
    `documents` is the list of word tokens; ids are assigned by descending frequency.
    """
    # Count the tokens passed in (not the global `words`)
    count = Counter(documents).most_common()
    dictionary = {word: i for i, (word, _) in enumerate(count)}
    reverse_dictionary = {i: word for i, (word, _) in enumerate(count)}
    return dictionary, reverse_dictionary

In [3]:
story = """
long ago , the mice had a general council to consider what measures they could take to outwit their common enemy , the cat . some said this , and some said that but at last a young mouse got up and said he had a proposal to make , which he thought would meet the case . you will all agree , said he , that our chief danger consists in the sly and treacherous manner in which the enemy approaches us . now , if we could receive some signal of her approach , we could easily escape from her . i venture , therefore , to propose that a small bell be procured , and attached by a ribbon round the neck of the cat . by this means we should always know when she was about , and could easily retire while she was in the neighbourhood . this proposal met with general applause , until an old mouse got up and said that is all very well , but who is to bell the cat ? the mice looked at one another and nobody spoke . then the old mouse said it is easy to propose impossible remedies .
"""
# Split on whitespace; this also drops the leading and trailing newlines
words = story.split()

dictionary, reverse_dictionary = build_dataset(words)
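
To make the two mappings concrete, here is a minimal standalone sketch of the same idea on a toy sentence. The helper name `build_vocab` and the sample text are made up for illustration; ids follow the order returned by `Counter.most_common()`, so the most frequent token gets id 0.

```python
from collections import Counter

def build_vocab(tokens):
    """Map each distinct token to an integer id, most frequent first."""
    counts = Counter(tokens).most_common()
    word_to_id = {word: i for i, (word, _) in enumerate(counts)}
    id_to_word = {i: word for word, i in word_to_id.items()}
    return word_to_id, id_to_word

# Hypothetical toy input, not the story used in this notebook
sample_tokens = "the cat and the mouse and the bell".split()
word_to_id, id_to_word = build_vocab(sample_tokens)
print(word_to_id)
```

The reverse mapping is what lets us turn a predicted id back into a word later on.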


## Creating and initialising LSTM
We are going to use TensorFlow to create our neural network.

In [4]:
def RNN(x, weights, biases):
    """
    This is our recurrent neural network (RNN) that we will be training
    """

    # reshape to [batch_size, n_input]
    x = tf.reshape(x, [-1, n_input])

    # Generate a n_input-element sequence of inputs
    # (eg. [had] [a] [general] -> [20] [6] [33])
    x = tf.split(x,n_input,1)

    # 1-layer LSTM with n_hidden units.
    rnn_cell = rnn.BasicLSTMCell(n_hidden)

    # generate prediction
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)

    # there are n_input outputs but
    # we only want the last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
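
The reshape and split steps in `RNN` can be hard to picture, so here is a rough NumPy sketch of the shapes involved. It only mirrors the tensor manipulation, not the LSTM itself, and the ids `20, 6, 33` are simply the illustrative values from the comment above.

```python
import numpy as np

n_input = 3

# One training example: three word ids, shaped [batch, n_input, 1] like the placeholder
batch = np.array([20, 6, 33], dtype=np.float32).reshape(-1, n_input, 1)

# NumPy analogue of tf.reshape(x, [-1, n_input]): drop the trailing axis
flat = batch.reshape(-1, n_input)

# NumPy analogue of tf.split(x, n_input, 1): one [batch, 1] array per time step
steps = np.split(flat, n_input, axis=1)
step_shapes = [s.shape for s in steps]
print(flat.shape, step_shapes)
```

Each element of `steps` plays the role of one time step fed to the recurrent cell, and only the output for the last step is used for the prediction.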

### Initialise our parameters

In [5]:
vocab_size = len(dictionary)

# Parameters
learning_rate = 0.001
training_iters = 50000
display_step = 1000
n_input = 3

# tf Graph input
x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocab_size])

# number of units in RNN cell
n_hidden = 512
# RNN output node weights and biases
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, vocab_size]))
}
biases = {
    'out': tf.Variable(tf.random_normal([vocab_size]))
}


Note that `n_input` sets how many words are fed in to predict the next word in the sequence. For example, with `n_input = 3` the first three tokens `long ago ,` are used to infer `the`.
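
As a small illustration of this windowing, the sketch below pairs every run of `n_input` tokens with the token that follows it. The helper `make_windows` is hypothetical and not part of the notebook; the training loop further down picks its windows by stepping an offset rather than enumerating them all.

```python
def make_windows(tokens, n_input):
    """Return (context, target) pairs: n_input consecutive tokens and the one after."""
    pairs = []
    for start in range(len(tokens) - n_input):
        context = tokens[start:start + n_input]
        target = tokens[start + n_input]
        pairs.append((context, target))
    return pairs

# First few tokens of the story, for illustration only
sample_tokens = "long ago , the mice had a general council".split()
pairs = make_windows(sample_tokens, n_input=3)
first_context, first_target = pairs[0]
print(first_context, "->", first_target)
```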

In [6]:
pred = RNN(x, weights, biases)

# Loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# Model evaluation
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()


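Running this cell prints a TensorFlow deprecation warning for `tf.nn.softmax_cross_entropy_with_logits`:

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.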
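
For intuition about the cost and accuracy defined in the cell above, here is a NumPy sketch of softmax cross-entropy against a one-hot label, plus the argmax comparison. It is an approximation of what the TensorFlow ops compute, written for a made-up four-word vocabulary, and the function name is my own.

```python
import numpy as np

def softmax_cross_entropy(logits, onehot_labels):
    """Per-example cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(onehot_labels * log_probs).sum(axis=1)

# Hypothetical logits and label for a single example
logits = np.array([[2.0, 0.5, -1.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0, 0.0]])

loss = softmax_cross_entropy(logits, labels).mean()
is_correct = np.argmax(logits, axis=1) == np.argmax(labels, axis=1)
print(loss, is_correct)
```

The loss shrinks as the logit for the correct word grows relative to the others, and the prediction counts as correct when the largest logit lines up with the 1 in the label.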



## Train Model
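
Each training step feeds `n_input` word ids as the input and a one-hot vector over the vocabulary as the label. Below is a minimal sketch of that label encoding, using a made-up five-word vocabulary and a hypothetical helper `one_hot`.

```python
import numpy as np

def one_hot(index, size):
    """Return a [1, size] float array with a single 1.0 at position `index`."""
    encoded = np.zeros((1, size), dtype=np.float32)
    encoded[0, index] = 1.0
    return encoded

# Hypothetical vocabulary, not the one built from the story
toy_dictionary = {"the": 0, ",": 1, "cat": 2, "bell": 3, "mouse": 4}
target = one_hot(toy_dictionary["cat"], len(toy_dictionary))
print(target)
```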

In [None]:
# Launch the graph
start_time = time.time()
with tf.Session() as session:
    session.run(init)
    step = 0
    offset = random.randint(0,n_input+1)
    end_offset = n_input + 1
    acc_total = 0
    loss_total = 0

    writer.add_graph(session.graph)

    while step < training_iters:
        # Generate a minibatch. Add some randomness on selection process.
        if offset > (len(words)-end_offset):
            offset = random.randint(0, n_input+1)

        symbols_in_keys = [ dictionary[ str(words[i])] for i in range(offset, offset + n_input) ]
        symbols_in_keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])
        # generate the one hot encoding
        symbols_out_onehot = [int(dictionary[str(words[offset+n_input])] == i) for i in range(vocab_size)]
        symbols_out_onehot = np.reshape(symbols_out_onehot,[1,-1])

        _, acc, loss, onehot_pred = session.run([optimizer, accuracy, cost, pred], \
                                                feed_dict={x: symbols_in_keys, y: symbols_out_onehot})
        loss_total += loss
        acc_total += acc
        if (step + 1) % display_step == 0:
            print("Iter= " + str(step+1) + ", Average Loss= " + \
                  "{:.6f}".format(loss_total/display_step) + ", Average Accuracy= " + \
                  "{:.2f}%".format(100*acc_total/display_step))
            acc_total = 0
            loss_total = 0
            symbols_in = [words[i] for i in range(offset, offset + n_input)]
            symbols_out = words[offset + n_input]
            # argmax in NumPy: tf.argmax(...).eval() would add a new op to the graph on every call
            symbols_out_pred = reverse_dictionary[int(np.argmax(onehot_pred, axis=1)[0])]
            print("%s - [%s] vs [%s]" % (symbols_in,symbols_out,symbols_out_pred))
        step += 1
        offset += (n_input + 1)
    print("Optimization Finished!")
    print("Elapsed time: {} minutes".format( (time.time() - start_time)/60.0 ) )
    print("Run on command line.")
    print("\ttensorboard --logdir=%s" % (logs_path))
    print("Point your web browser to: http://localhost:6006/")
    while True:
        prompt = "%s words: " % n_input
        sentence = input(prompt)
        sentence = sentence.strip()
        new_words = sentence.split(' ')
        if len(new_words) != n_input:
            continue
        try:
            symbols_in_keys = [dictionary[str(new_words[i])] for i in range(len(new_words))]
            for i in range(32):
                keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])
                onehot_pred = session.run(pred, feed_dict={x: keys})
                # argmax in NumPy: avoids growing the TensorFlow graph inside the loop
                onehot_pred_index = int(np.argmax(onehot_pred, axis=1)[0])
                sentence = "%s %s" % (sentence,reverse_dictionary[onehot_pred_index])
                symbols_in_keys = symbols_in_keys[1:]
                symbols_in_keys.append(onehot_pred_index)
            print(sentence)
        except KeyError:
            # Raised by the dictionary lookup when a word was never seen in the story
            print("Word not in dictionary")


Iter= 1000, Average Loss= 6.033603, Average Accuracy= 2.90%
['one', 'another', 'and'] - [nobody] vs [to]
Iter= 2000, Average Loss= 3.840894, Average Accuracy= 8.70%
['mice', 'looked', 'at'] - [one] vs [and]
Iter= 3000, Average Loss= 3.018023, Average Accuracy= 20.70%
['that', 'is', 'all'] - [very] vs [at]
Iter= 4000, Average Loss= 2.598684, Average Accuracy= 29.60%
['met', 'with', 'general'] - [applause] vs [applause]
Iter= 5000, Average Loss= 2.240356, Average Accuracy= 37.00%
['we', 'should', 'always'] - [know] vs [.]
Iter= 6000, Average Loss= 1.744312, Average Accuracy= 51.60%
['therefore', ',', 'to'] - [propose] vs [propose]
Iter= 7000, Average Loss= 1.827516, Average Accuracy= 51.70%
['could', 'receive', 'some'] - [signal] vs [signal]
Iter= 8000, Average Loss= 1.540800, Average Accuracy= 59.00%
['in', 'which', 'the'] - [enemy] vs [enemy]
Iter= 9000, Average Loss= 1.480212, Average Accuracy= 62.40%
['make', ',', 'which'] - [he] vs [he]
Iter= 10000, Average Loss= 1.203859, Average A