Importing the basic libraries and creating a helper function that formats the elapsed time

In [4]:
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn
import random 
import collections
import time

start_time = time.time()

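def elapsed(sec):
    # Format a duration given in seconds as seconds, minutes, or hours.
    if sec < 60:
        return str(sec) + " sec"
    elif sec < (60*60):
        return str(sec/60) + " min"
    else:
        # Without this branch, runs of an hour or more returned None.
        return str(sec/(60*60)) + " hr"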

I had many problems with paths here; they should be defined relative to the working directory, i.e. starting from './'.

In [5]:
log_path = 'tensorflow/logs'
writer = tf.summary.FileWriter(log_path)
training_file = 'input/paragraph.txt'

The text file named by training_file is preprocessed as follows:
1. content holds the lines of the paragraph.
2. strip(), a built-in Python string method, removes all leading and trailing whitespace from a string.
   e.g. "   geeks for geeks".strip() gives "geeks for geeks"
3. split() returns a list of strings obtained by breaking the given string at the specified separator (any whitespace by default).
4. The content is converted to a NumPy array.
5. reshape([-1, ]) flattens the array whatever its original dimensions: NumPy replaces -1 with the only value that fits the total number of elements. For an array of 12 elements, -1 becomes 12; with reshape([-1, 2]), -1 would become 6.
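
To make steps 2, 3 and 5 concrete, here is a small sketch in plain Python. The sample line is made up, and resolve_shape is a hypothetical helper (it is not part of NumPy) that only mimics how reshape fills in a single -1: it divides the total number of elements by the product of the known dimensions.

```python
from math import prod

# strip() removes leading/trailing whitespace; split() breaks on whitespace.
line = "   the mice held a council   \n"  # made-up sample line
tokens = line.strip().split()
print(tokens)  # ['the', 'mice', 'held', 'a', 'council']

def resolve_shape(total, shape):
    """Hypothetical helper: fill in a single -1 the way reshape infers it."""
    known = prod(d for d in shape if d != -1)  # product of the known dimensions
    if total % known != 0:
        raise ValueError("cannot reshape %d elements into %s" % (total, shape))
    return tuple(total // known if d == -1 else d for d in shape)

print(resolve_shape(12, (-1,)))    # (12,)
print(resolve_shape(12, (-1, 2)))  # (6, 2)
```

A shape such as (-1, 5) with 12 elements raises an error in this sketch, since 12 is not divisible by 5; NumPy's reshape rejects that case as well.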

In [6]:
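def read_data(fname):
    # Read every line of the file.
    with open(fname) as f:
        content = f.readlines()
    # Drop leading and trailing whitespace (including the newline).
    content = [x.strip() for x in content]
    # Break each line into a list of words.
    content = [line.split() for line in content]
    # Flatten to a 1-D array of words; this assumes every line yields the
    # same number of words (e.g. the whole paragraph sits on one line).
    content = np.array(content)
    content = np.reshape(content, [-1, ])
    return content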

training_data = read_data(training_file)
print("Loaded training data....")

Loaded training data....


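The most important requirement of the RNN is a pair of dictionaries:
one that maps each word to a value, and another that is the reverse of it.
1. Counter(words) creates a dictionary with the words as keys and their frequencies as values; most_common() returns these (word, frequency) pairs sorted from most to least frequent.
2. In the loop, word is the key and _ is the frequency, which is ignored.
3. We build a dictionary that gives each word a unique value: the size of the dictionary at the moment the word is added.
4. The reverse dictionary maps those values back to the words.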
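
As a quick check of the steps above, here is a minimal sketch on a made-up word list. The names word_to_id and id_to_word are chosen for this illustration only, so that they do not clash with the notebook's own variables.

```python
import collections

words = ["the", "cat", "the", "mouse", "the", "cat"]  # made-up sample

# (word, frequency) pairs, most frequent first
count = collections.Counter(words).most_common()
print(count)  # [('the', 3), ('cat', 2), ('mouse', 1)]

# each new word gets the current dictionary size as its unique id
word_to_id = {}
for word, _ in count:
    word_to_id[word] = len(word_to_id)

# swap keys and values to map ids back to words
id_to_word = dict(zip(word_to_id.values(), word_to_id.keys()))

print(word_to_id)  # {'the': 0, 'cat': 1, 'mouse': 2}
print(id_to_word)  # {0: 'the', 1: 'cat', 2: 'mouse'}
```

A side effect of iterating over most_common() is that the most frequent word receives id 0.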

In [7]:
def build_dataset(words):
    count = collections.Counter(words).most_common()
    dictionary = dict()
    for word, _ in count:
        dictionary[word] = len(dictionary)
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return dictionary, reverse_dictionary

dictionary, reverse_dictionary = build_dataset(training_data)

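Here we define the hyperparameters and the inputs of the graph.
1. vocab_size: the number of distinct words we know about.
2. n_input: each input is a sequence of 3 words.
3. n_hidden: 512 nodes in each hidden layer.
4. x and y are our TensorFlow placeholders for the input layer and the output layer.
5. I had a slight issue further on because of TensorFlow's default graph; make sure you reset it with tf.reset_default_graph().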

TensorFlow placeholders:
1. tf.placeholder("float", [None, n_input, 1]) says: "I don't know the number of rows, but I know each row has shape (n_input, 1)."
2. e.g. for images: (None, 128, 128, 3) says: "I don't know the number of rows, but each row has shape (128, 128, 3)."
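
To see what values fit these shapes, here is a small sketch in plain Python that builds one input row and one target row as nested lists. The five-word vocabulary and text are made up for illustration; the notebook derives its own from the training file.

```python
n_input = 3

# Hypothetical toy vocabulary and text, for illustration only.
word_to_id = {"the": 0, "mice": 1, "held": 2, "a": 3, "council": 4}
vocab_size = len(word_to_id)
text = ["the", "mice", "held", "a", "council"]
offset = 0

# One row for x: n_input word ids, each wrapped in its own list,
# giving the nested shape (1, n_input, 1).
x_batch = [[[word_to_id[w]] for w in text[offset:offset + n_input]]]

# One row for y: a one-hot vector marking the word that follows the
# window, giving the nested shape (1, vocab_size).
target_id = word_to_id[text[offset + n_input]]
y_batch = [[1.0 if i == target_id else 0.0 for i in range(vocab_size)]]

print(x_batch)  # [[[0], [1], [2]]]
print(y_batch)  # [[0.0, 0.0, 0.0, 1.0, 0.0]]
```

The leading dimension of 1 is the "number of rows" that None leaves open; here each training step feeds a single row.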

Note:
1. Our output y has the size of the vocabulary, i.e. it is a vector of 112 entries, one per word.
2. We use one-hot encoding for this output vector.
3. The input is 3 words: 3 consecutive words from the training set are chosen and fed to the RNN, which processes the sequence, and for the accuracy what we want is the last output of the RNN. The point to note, and the one that frustrated me the most when I was learning this during BTP, is this:
    1. the output of the RNN is h(t), a hidden-layer vector of size 512, and we still need to know which word it indicates.
    2. this is where the output weights and biases come in:
    3. y = (h(t) . weights['out']) + biases['out']
    4. and this will be a [1, 512] x [512, 112] product plus a bias of size 112,
    5. so here y will be a vector of 112 entries.
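
The projection from h(t) to a vocabulary-sized vector can be sketched in plain Python with toy sizes. The sizes 4 and 3 stand in for 512 and 112, and the random values are placeholders for what the RNN and the training would actually produce.

```python
import random

n_hidden, vocab_size = 4, 3  # toy sizes; the notebook uses 512 and 112
random.seed(0)

# Placeholder values standing in for the last RNN output and the
# trained output weights and biases.
h_t = [random.gauss(0, 1) for _ in range(n_hidden)]
w_out = [[random.gauss(0, 1) for _ in range(vocab_size)]
         for _ in range(n_hidden)]
b_out = [random.gauss(0, 1) for _ in range(vocab_size)]

# y[j] = sum_i h_t[i] * w_out[i][j] + b_out[j]
y = [sum(h_t[i] * w_out[i][j] for i in range(n_hidden)) + b_out[j]
     for j in range(vocab_size)]

# The predicted word is the index of the largest entry (argmax).
predicted_id = max(range(vocab_size), key=lambda j: y[j])
print(len(y), predicted_id)
```

Whatever the hidden size, y always has one entry per vocabulary word, which is why its index can be looked up directly in the reverse dictionary.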
    
 

<img src="LSTM.jpg" style="width: 500px; float: left; height: 600px;"/>

In [8]:
vocab_size = len(dictionary)
learning_rate = 1e-3
training_iters = 50000
display_step = 1000  # report every 1000 iterations, as in the training log below
n_input = 3
n_hidden = 512
tf.reset_default_graph()
x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocab_size])

#RNN output node weights and biases
weights = {
    'out' : tf.Variable(tf.random_normal([n_hidden, vocab_size]))
}
biases = {
    'out' : tf.Variable(tf.random_normal([vocab_size]))
}


In [9]:
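def RNN(x, weights, biases):
    # reshape to [batch_size, n_input]
    x = tf.reshape(x, [-1, n_input])
    # split into a list of n_input tensors, each of shape [batch_size, 1]
    x = tf.split(x, n_input, 1)
    # two stacked GRU layers with n_hidden units each
    rnn_cell = rnn.MultiRNNCell([rnn.GRUCell(n_hidden), rnn.GRUCell(n_hidden)])

    outputs, states = rnn.static_rnn(rnn_cell, x, dtype = tf.float32)
    # only the last output is projected onto the vocabulary
    return tf.matmul(outputs[-1], weights['out']) + biases['out']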

In [10]:
pred = RNN(x, weights, biases)

In [11]:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See @{tf.nn.softmax_cross_entropy_with_logits_v2}.



In [12]:
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

In [13]:
init = tf.global_variables_initializer()

In [14]:
# This cell trains for 50k iterations. If the session has already been saved, don't run it; run the next cell instead.
from tqdm import tqdm
with tf.Session() as session:
    session.run(init)
    offset = random.randint(0, n_input+1)
    end_offset = n_input+1
    acc_total = 0
    loss_total = 0
    
    writer.add_graph(session.graph)
    for step in tqdm(range(training_iters)):
        if offset > (len(training_data)-end_offset):
            offset = random.randint(0, n_input+1)
        symbols_in_keys = [ [dictionary[str(training_data[i])]] for i in range(offset, offset+n_input)]
        symbols_in_keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])
        
        symbols_out_onehot = np.zeros([vocab_size], dtype = float)
        symbols_out_onehot[dictionary[str(training_data[offset+n_input])]] = 1.0
        symbols_out_onehot = np.reshape(symbols_out_onehot, [1, -1])
        
        _, acc, loss, onehot_pred = session.run([optimizer, accuracy, cost, pred], \
                                               feed_dict = {x: symbols_in_keys, y: symbols_out_onehot})
    
        loss_total += loss
        acc_total += acc
        if (step+1)%display_step == 0:
            print("Iter= " + str(step+1) + ", Average loss= " + \
                 "{:.6f}".format(loss_total/display_step) + ", Average Accuracy= " + \
                 "{:.2f}%".format(100*acc_total/display_step))
            acc_total = 0
            loss_total = 0
            symbols_in = [training_data[i] for i in range(offset, offset+n_input)]
            symbols_out = training_data[offset+n_input]
            # argmax in NumPy: calling tf.argmax here would add a new op to the graph on every report
            symbols_out_pred = reverse_dictionary[int(np.argmax(onehot_pred, 1)[0])]
            print("%s - [%s] vs [%s]" % (symbols_in, symbols_out, symbols_out_pred))
        offset += (n_input+1)
    saver = tf.train.Saver()
    save_path = saver.save(session, "./sessions/model.ckpt")
    print("Optimization Finished!")
    print("Elapsed time: ", elapsed(time.time() - start_time))
    print("Run on command line.")
    print("\ttensorboard --logdir=%s" % (log_path))
    print("Point your web browser to: http://localhost:6006/")

  2%|▏         | 1003/50000 [00:40<32:37, 25.04it/s]

Iter= 1000, Average loss= 6.020145, Average Accuracy= 6.10%
['at', 'one', 'another'] - [and] vs [.]


  4%|▍         | 2003/50000 [01:17<31:02, 25.77it/s]

Iter= 2000, Average loss= 3.541041, Average Accuracy= 17.50%
['got', 'up', 'and'] - [said] vs [round]


  6%|▌         | 3001/50000 [01:57<30:37, 25.57it/s]

Iter= 3000, Average loss= 3.141578, Average Accuracy= 24.40%
[',', 'and', 'could'] - [easily] vs [said]


  8%|▊         | 4001/50000 [02:42<31:08, 24.61it/s]

Iter= 4000, Average loss= 3.092450, Average Accuracy= 30.70%
['neck', 'of', 'the'] - [cat] vs [i]


 10%|█         | 5002/50000 [03:26<30:54, 24.27it/s]

Iter= 5000, Average loss= 2.261444, Average Accuracy= 43.60%
['we', 'could', 'easily'] - [escape] vs [to]


 12%|█▏        | 6002/50000 [04:09<30:26, 24.09it/s]

Iter= 6000, Average loss= 2.582534, Average Accuracy= 40.50%
['of', 'her', 'approach'] - [,] vs [,]


 14%|█▍        | 7004/50000 [04:52<29:54, 23.96it/s]

Iter= 7000, Average loss= 2.445378, Average Accuracy= 42.80%
['us', '.', 'now'] - [,] vs [the]


 16%|█▌        | 8004/50000 [05:35<29:20, 23.85it/s]

Iter= 8000, Average loss= 1.973058, Average Accuracy= 49.90%
['in', 'which', 'the'] - [enemy] vs [enemy]


 18%|█▊        | 9003/50000 [06:18<28:44, 23.78it/s]

Iter= 9000, Average loss= 1.758262, Average Accuracy= 56.70%
['agree', ',', 'said'] - [he] vs [he]


 20%|██        | 10003/50000 [07:01<28:06, 23.71it/s]

Iter= 10000, Average loss= 1.580687, Average Accuracy= 60.90%
['and', 'said', 'he'] - [had] vs [is]


 22%|██▏       | 11002/50000 [07:39<27:08, 23.94it/s]

Iter= 11000, Average loss= 1.347442, Average Accuracy= 65.70%
['to', 'outwit', 'their'] - [common] vs [common]


 24%|██▍       | 12003/50000 [08:14<26:04, 24.29it/s]

Iter= 12000, Average loss= 1.335453, Average Accuracy= 68.10%
['another', 'and', 'nobody'] - [spoke] vs [of]


 26%|██▌       | 13005/50000 [08:49<25:05, 24.58it/s]

Iter= 13000, Average loss= 1.427956, Average Accuracy= 64.20%
['and', 'said', 'that'] - [is] vs [is]


 28%|██▊       | 14002/50000 [09:23<24:09, 24.84it/s]

Iter= 14000, Average loss= 1.460478, Average Accuracy= 66.20%
['applause', ',', 'until'] - [an] vs [of]


 30%|███       | 15003/50000 [09:58<23:15, 25.07it/s]

Iter= 15000, Average loss= 1.195041, Average Accuracy= 70.60%
['the', 'neighbourhood', '.'] - [this] vs [this]


 32%|███▏      | 16004/50000 [10:33<22:24, 25.28it/s]

Iter= 16000, Average loss= 1.231290, Average Accuracy= 70.60%
['cat', '.', 'by'] - [this] vs [this]


 34%|███▍      | 17001/50000 [11:13<21:46, 25.26it/s]

Iter= 17000, Average loss= 1.207707, Average Accuracy= 71.10%
['the', 'neck', 'of'] - [the] vs [the]


 36%|███▌      | 18001/50000 [11:56<21:14, 25.11it/s]

Iter= 18000, Average loss= 1.258300, Average Accuracy= 70.50%
[',', 'to', 'propose'] - [that] vs [that]


 38%|███▊      | 19003/50000 [12:39<20:39, 25.01it/s]

Iter= 19000, Average loss= 1.051942, Average Accuracy= 74.20%
['her', 'approach', ','] - [we] vs [said]


 40%|████      | 20001/50000 [13:23<20:05, 24.89it/s]

Iter= 20000, Average loss= 0.766569, Average Accuracy= 81.10%
['that', 'our', 'chief'] - [danger] vs [danger]


 42%|████▏     | 21002/50000 [14:02<19:23, 24.93it/s]

Iter= 21000, Average loss= 0.748996, Average Accuracy= 81.90%
['case', '.', 'you'] - [will] vs [,]


 44%|████▍     | 22004/50000 [14:39<18:39, 25.01it/s]

Iter= 22000, Average loss= 1.057588, Average Accuracy= 78.00%
['had', 'a', 'proposal'] - [to] vs [council]


 46%|████▌     | 23004/50000 [15:19<17:58, 25.03it/s]

Iter= 23000, Average loss= 0.806153, Average Accuracy= 79.90%
['could', 'take', 'to'] - [outwit] vs [outwit]


 48%|████▊     | 24003/50000 [15:56<17:15, 25.10it/s]

Iter= 24000, Average loss= 0.719810, Average Accuracy= 81.60%
['to', 'bell', 'the'] - [cat] vs [cat]


 50%|█████     | 25002/50000 [16:34<16:34, 25.14it/s]

Iter= 25000, Average loss= 0.892054, Average Accuracy= 80.30%
['well', ',', 'but'] - [who] vs [could]


 52%|█████▏    | 26002/50000 [17:17<15:57, 25.07it/s]

Iter= 26000, Average loss= 0.944023, Average Accuracy= 79.10%
['until', 'an', 'old'] - [mouse] vs [was]


 54%|█████▍    | 27000/50000 [17:59<15:19, 25.01it/s]

Iter= 27000, Average loss= 0.825691, Average Accuracy= 81.70%
['retire', 'while', 'she'] - [was] vs [was]


 56%|█████▌    | 28004/50000 [18:45<14:43, 24.89it/s]

Iter= 28000, Average loss= 0.641051, Average Accuracy= 85.60%
['the', 'neck', 'of'] - [the] vs [the]


 58%|█████▊    | 29003/50000 [19:27<14:05, 24.83it/s]

Iter= 29000, Average loss= 0.704618, Average Accuracy= 84.20%
['i', 'venture', ','] - [therefore] vs [and]


 60%|██████    | 30003/50000 [20:10<13:27, 24.78it/s]

Iter= 30000, Average loss= 0.659248, Average Accuracy= 85.90%
['receive', 'some', 'signal'] - [of] vs [always]


 62%|██████▏   | 31003/50000 [20:54<12:48, 24.71it/s]

Iter= 31000, Average loss= 0.934160, Average Accuracy= 80.10%
['enemy', 'approaches', 'us'] - [.] vs [.]


 64%|██████▍   | 32002/50000 [21:38<12:10, 24.64it/s]

Iter= 32000, Average loss= 0.701973, Average Accuracy= 85.20%
['said', 'he', ','] - [that] vs [that]


 66%|██████▌   | 33004/50000 [22:16<11:28, 24.70it/s]

Iter= 33000, Average loss= 0.760093, Average Accuracy= 84.00%
['thought', 'would', 'meet'] - [the] vs [the]


 68%|██████▊   | 34003/50000 [22:56<10:47, 24.70it/s]

Iter= 34000, Average loss= 0.626493, Average Accuracy= 85.50%
['this', ',', 'and'] - [some] vs [some]


 70%|███████   | 35002/50000 [23:35<10:06, 24.74it/s]

Iter= 35000, Average loss= 0.712144, Average Accuracy= 84.20%
['what', 'measures', 'they'] - [could] vs [they]


 72%|███████▏  | 36001/50000 [24:17<09:26, 24.70it/s]

Iter= 36000, Average loss= 0.722714, Average Accuracy= 84.70%
['to', 'consider', 'what'] - [measures] vs [common]


 74%|███████▍  | 37002/50000 [25:01<08:47, 24.64it/s]

Iter= 37000, Average loss= 0.732080, Average Accuracy= 84.60%
['general', 'council', 'to'] - [consider] vs [consider]


 76%|███████▌  | 38001/50000 [25:45<08:07, 24.59it/s]

Iter= 38000, Average loss= 0.721269, Average Accuracy= 84.20%
['it', 'is', 'easy'] - [to] vs [to]


 78%|███████▊  | 39001/50000 [26:29<07:28, 24.54it/s]

Iter= 39000, Average loss= 0.642090, Average Accuracy= 86.50%
['mice', 'looked', 'at'] - [one] vs [one]


 80%|████████  | 40001/50000 [27:13<06:48, 24.49it/s]

Iter= 40000, Average loss= 0.787285, Average Accuracy= 82.80%
['is', 'all', 'very'] - [well] vs [well]


 82%|████████▏ | 41003/50000 [27:55<06:07, 24.47it/s]

Iter= 41000, Average loss= 0.587612, Average Accuracy= 87.80%
['in', 'the', 'neighbourhood'] - [.] vs [.]


 84%|████████▍ | 42003/50000 [28:39<05:27, 24.43it/s]

Iter= 42000, Average loss= 0.589213, Average Accuracy= 86.80%
['ribbon', 'round', 'the'] - [neck] vs [neck]


 86%|████████▌ | 43003/50000 [29:16<04:45, 24.49it/s]

Iter= 43000, Average loss= 0.642746, Average Accuracy= 86.80%
['therefore', ',', 'to'] - [propose] vs [attached]


 88%|████████▊ | 44003/50000 [29:56<04:04, 24.49it/s]

Iter= 44000, Average loss= 0.743872, Average Accuracy= 84.20%
['her', 'approach', ','] - [we] vs [we]


 90%|█████████ | 45002/50000 [30:39<03:24, 24.46it/s]

Iter= 45000, Average loss= 0.543556, Average Accuracy= 88.10%
['the', 'enemy', 'approaches'] - [us] vs [us]


 92%|█████████▏| 46001/50000 [31:23<02:43, 24.43it/s]

Iter= 46000, Average loss= 0.712971, Average Accuracy= 85.30%
['in', 'the', 'sly'] - [and] vs [and]


 94%|█████████▍| 47003/50000 [32:07<02:02, 24.39it/s]

Iter= 47000, Average loss= 0.657937, Average Accuracy= 85.80%
['proposal', 'to', 'make'] - [,] vs [,]


 96%|█████████▌| 48001/50000 [32:50<01:22, 24.36it/s]

Iter= 48000, Average loss= 0.552095, Average Accuracy= 86.90%
[',', 'and', 'some'] - [said] vs [said]


 98%|█████████▊| 49001/50000 [33:35<00:41, 24.32it/s]

Iter= 49000, Average loss= 0.606203, Average Accuracy= 86.90%
['council', 'to', 'consider'] - [what] vs [their]


100%|██████████| 50000/50000 [34:11<00:00, 24.37it/s]


Iter= 50000, Average loss= 0.527941, Average Accuracy= 87.00%
['said', 'it', 'is'] - [easy] vs [easy]
Optimization Finished!
Elapsed time:  34.49983052810033 min
Run on command line.
	tensorboard --logdir=tensorflow/logs
Point your web browser to: http://localhost:6006/


In [15]:
# Create the saver here as well, so this cell also works when the training cell was skipped.
saver = tf.train.Saver()
with tf.Session() as session:
    saver.restore(session, "./sessions/model.ckpt")
    while True:
        prompt = "%s words: " % n_input
        sentence = input(prompt)
        if sentence == "exit":
            break
        sentence = sentence.strip()
        words = sentence.split(' ')
        # silently ask again unless exactly n_input words were given
        if len(words) != n_input:
            continue
        try:
            symbols_in_keys = [dictionary[str(words[i])] for i in range(len(words))]
            # generate 5 more words, feeding each prediction back as input
            for i in range(5):
                keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])
                onehot_pred = session.run(pred, feed_dict={x: keys})
                onehot_pred_index = int(np.argmax(onehot_pred, 1)[0])
                sentence = "%s %s" % (sentence, reverse_dictionary[onehot_pred_index])
                symbols_in_keys = symbols_in_keys[1:]
                symbols_in_keys.append(onehot_pred_index)
            print(sentence)
        except KeyError:
            print("Word not in the dictionary!")

INFO:tensorflow:Restoring parameters from ./sessions/model.ckpt
3 words: council to consider
council to consider their measures could take mouse
3 words: proposal to make
proposal to make , which he thought would
3 words: got up and
got up and said that is all very
3 words: mice looked at
mice looked at one another . then the
3 words: in the background
Word not in the dictionary!
3 words: in the backgroun
Word not in the dictionary!
3 words: in the
3 words: ok
3 words: her approach ,
her approach , and could easily retire while
3 words: ok google here
Word not in the dictionary!
3 words: what measures they
what measures they could take mouse got up
3 words: exit
