# 5. WordRNN - RNN Language Model with Words

We will build a language model with recurrent neural networks. This notebook covers only one-way (unidirectional) RNN/GRU/LSTM language models, but a one-way, one-layer network can be extended to bidirectional, deep recurrent networks. It is also possible to combine a CNN with an RNN to build a language model; we will cover that in the near future.
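
For reference, here is the quantity such a model estimates, written in standard textbook notation (the symbols below are not taken from WordRNN.py). A one-way RNN language model factorises the probability of a word sequence from left to right, summarises the history in a hidden state $h_t$, and predicts the next word from that state, where $x_t$ is the embedding of $w_t$ and $f$ is the RNN, GRU or LSTM cell:

$$
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1}), \qquad
h_t = f(h_{t-1}, x_t), \qquad
P(w_{t+1} \mid w_1, \dots, w_t) = \mathrm{softmax}(W h_t + b)_{w_{t+1}}
$$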

We will also generate some fun sentences from the tinyshakespeare dataset. Although the primary goal of WordRNN.py is to build a language model and train/test it on the PTB dataset, I've added some methods for sentence generation. It is not optimized for sentence generation, however: the `train()` method currently requires a validation set for evaluation, which is not needed for sampling. I plan to fix this, but it is not a priority.

### References
- [CS224n: Natural Language Processing with Deep Learning - Lecture 8](http://web.stanford.edu/class/cs224n/lectures/lecture8.pdf)
- [CS224n: Natural Language Processing with Deep Learning - Lecture 9](http://web.stanford.edu/class/cs224n/lectures/lecture9.pdf)
- [mkroutikov/tf-lstm-char-cnn](https://github.com/mkroutikov/tf-lstm-char-cnn)

In [1]:
from models import WordRNN

In [2]:
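def read_corpus(data_dir):
    """Read a space-tokenized text file into a list of token lists.

    Despite its name, `data_dir` is the path to a single text file.
    """
    corpus = []
    with open(data_dir, "r") as f:
        for line in f:
            tmp_line = line.strip().split(' ')
            # Skip blank lines and lines that contain only a single token.
            if len(tmp_line) == 1:
                continue
            corpus.append(tmp_line)
    return corpus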

## With Penn Treebank Data

In [3]:
train_corpus = read_corpus("data/ptb/ptb.train.txt")
valid_corpus = read_corpus("data/ptb/ptb.valid.txt")
test_corpus = read_corpus("data/ptb/ptb.test.txt")
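
Each corpus is now a list of sentences, and each sentence is a list of word strings. A word-level model needs integer ids rather than strings, so some vocabulary step has to happen before training. The sketch below shows one common way to do it; `build_vocab` and `encode` are hypothetical helpers written for illustration, and I am not claiming this is how `fit_to_corpus` does it internally.

```python
from collections import Counter


def build_vocab(corpus, unk_token="<unk>"):
    """Map each distinct token to an integer id (hypothetical helper)."""
    counts = Counter(token for sentence in corpus for token in sentence)
    # Most frequent tokens get the smallest ids; ties are broken
    # alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    word_to_id = {unk_token: 0}  # id 0 is reserved for unknown words
    for tok in ordered:
        if tok not in word_to_id:
            word_to_id[tok] = len(word_to_id)
    return word_to_id


def encode(sentence, word_to_id, unk_token="<unk>"):
    """Convert a list of tokens to ids, falling back to the unknown id."""
    unk_id = word_to_id[unk_token]
    return [word_to_id.get(tok, unk_id) for tok in sentence]


# Toy data in the same shape that read_corpus returns.
toy_corpus = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
vocab = build_vocab(toy_corpus)
ids = encode(["the", "bird", "sat"], vocab)  # "bird" is out of vocabulary
```

Here `"bird"` never appears in `toy_corpus`, so it is encoded with the id reserved for `<unk>`.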

If you modify some parts of WordRNN.py, you can use pretrained word embeddings with this model and compare its performance with and without them. I'll use pretrained embeddings in future models, but for this model I'll leave that to readers.
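
As a starting point for that exercise, here is a minimal sketch of building an initial embedding matrix from pretrained vectors. Everything in it is an assumption for illustration: `build_embedding_matrix`, `toy_vocab` and `toy_pretrained` are hypothetical, the vocabulary ids are assumed to be contiguous from 0, and how the resulting matrix would be fed into WordRNN.py depends on code not shown in this notebook.

```python
import random


def build_embedding_matrix(word_to_id, pretrained, dim, seed=0):
    """Return one vector per vocabulary id (hypothetical helper).

    Words found in `pretrained` keep their pretrained vector; all other
    words get a small random vector. Assumes ids run from 0 to len - 1.
    """
    rng = random.Random(seed)
    matrix = [None] * len(word_to_id)
    for word, idx in word_to_id.items():
        vector = pretrained.get(word)
        if vector is not None and len(vector) == dim:
            matrix[idx] = list(vector)
        else:
            # Out-of-vocabulary for the pretrained set: random init.
            matrix[idx] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return matrix


# Toy inputs; a real run would load vectors from a pretrained embedding file.
toy_vocab = {"<unk>": 0, "stock": 1, "market": 2}
toy_pretrained = {"stock": [0.1, 0.2, 0.3], "market": [0.4, 0.5, 0.6]}
embedding_matrix = build_embedding_matrix(toy_vocab, toy_pretrained, dim=3)
```

The comparison suggested above would then be a matter of training once with this matrix as the initial embedding weights and once with a fully random initialization.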

In [4]:
model = WordRNN.WordRNN(word_embedding_size=128,
                        hidden_size=512,
                        cell="LSTM",
                        num_unroll_steps=30,
                        learning_rate=0.01,
                        batch_size=64,
                        num_layers=2)
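
The `batch_size` and `num_unroll_steps` arguments presumably control how the token stream is cut into fixed-length windows for truncated backpropagation through time. The sketch below is a simplified illustration of that idea under my own assumptions; `make_batches` is a hypothetical helper, and the batching inside WordRNN.py may be laid out differently.

```python
def make_batches(token_ids, batch_size, num_unroll_steps):
    """Cut a flat list of token ids into (inputs, targets) batches.

    Targets are the inputs shifted by one position, so the model learns
    to predict the next word at every step (hypothetical helper).
    """
    tokens_per_batch = batch_size * num_unroll_steps
    # One extra token is needed so the last input has a target.
    num_batches = (len(token_ids) - 1) // tokens_per_batch
    batches = []
    for b in range(num_batches):
        inputs, targets = [], []
        for row in range(batch_size):
            start = b * tokens_per_batch + row * num_unroll_steps
            inputs.append(token_ids[start:start + num_unroll_steps])
            targets.append(token_ids[start + 1:start + num_unroll_steps + 1])
        batches.append((inputs, targets))
    return batches


# Toy stream of 25 token ids, 2 sequences per batch, 3 steps per sequence.
batches = make_batches(list(range(25)), batch_size=2, num_unroll_steps=3)
first_inputs, first_targets = batches[0]
```

With this layout the number of batches per epoch is roughly the number of tokens divided by `batch_size * num_unroll_steps`.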


In [5]:
model.fit_to_corpus(train_corpus, valid_corpus)


In [6]:
model.train(10, save_dir="save/05_rnn_lm", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 9821968
--------------------------------------------------------------------------------
000200: 1 [00200/00484], train_loss/perplexity = 6.77542877/876.0549316 secs/batch = 0.0493
000400: 1 [00400/00484], train_loss/perplexity = 6.61267948/744.4751587 secs/batch = 0.0534
Epoch training time: 24.902023315429688

Evaluating..

Finished Epoch 1
train_loss = 6.96030980, perflexity = 1053.96002249
validation_loss = 9.05610900, perflexity = 8570.73696744

000684: 2 [00200/00484], train_loss/perplexity = 7.03914165/1140.4083252 secs/batch = 0.0501
000884: 2 [00400/00484], train_loss/perplexity = 6.64665222/770.2015381 secs/batch = 0.0510
Epoch training time: 24.878760814666748

Evaluating..

Finished Epoch 2
train_loss = 7.07707315, perflexity = 1184.49659557
validation_loss = 6.71638093, perflexity = 825.82338824

001168: 3 [00200/00484], train_loss/perplexity = 6.8253

In [7]:
model.test(test_corpus, load_dir="save/05_rnn_lm")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch010_5.4634.model
--------------------------------------------------------------------------------
Restored model from checkpoint for testing. Size: 9821968
--------------------------------------------------------------------------------
test loss = 5.39853896, perplexity = 221.08316968
test samples: 002688, time elapsed: 1.0095, time per one batch: 0.0240
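
The perplexity values in these logs appear to be the exponential of the reported loss, which is the usual definition when the loss is the mean per-word cross-entropy measured in nats. A quick check against two of the numbers printed above (the function name `perplexity` is mine, not from WordRNN.py):

```python
import math


def perplexity(mean_cross_entropy):
    """Perplexity for a mean per-word cross-entropy given in nats."""
    return math.exp(mean_cross_entropy)


test_ppl = perplexity(5.39853896)   # test loss reported above
valid_ppl = perplexity(6.71638093)  # validation loss after epoch 2
```

Both values agree with the logged perplexities (221.08 and 825.82) to within rounding.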


Let's try sampling from the model trained on the PTB dataset.

In [8]:
model.sample(30, load_dir="save/05_rnn_lm", starter_word="stock")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch010_5.4634.model


"stock and and the N years old the N to N million shares and a $ N million of N N of its stock market 's new york stock exchange composite trading"
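
For readers curious about what sampling from a language model involves, here is a generic sketch: start from a word, ask the model for a distribution over the next word, draw from it, and repeat. The toy bigram table stands in for a trained network, and `sample_sequence`, `toy_next_word_probs` and `TOY_BIGRAMS` are hypothetical names; this is not the implementation of `model.sample`.

```python
import random


def sample_sequence(next_word_probs, starter_word, num_words, seed=None):
    """Generate text by repeatedly sampling the next word.

    `next_word_probs(history)` must return a dict that maps candidate
    words to probabilities, given the list of words generated so far.
    """
    rng = random.Random(seed)
    words = [starter_word]
    for _ in range(num_words):
        probs = next_word_probs(words)
        candidates = list(probs.keys())
        weights = list(probs.values())
        # Draw one word in proportion to its probability.
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)


# Toy stand-in for a trained model: it only looks at the previous word.
TOY_BIGRAMS = {
    "stock": {"market": 0.7, "exchange": 0.3},
    "market": {"stock": 0.5, "exchange": 0.5},
    "exchange": {"stock": 1.0},
}


def toy_next_word_probs(history):
    return TOY_BIGRAMS[history[-1]]


generated = sample_sequence(toy_next_word_probs, "stock", 10, seed=0)
```

An RNN would replace `toy_next_word_probs` with a forward pass that carries its hidden state from one step to the next, so that the whole history, not just the previous word, can influence each draw.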

## With tinyshakespeare Data
We will use the tinyshakespeare dataset to generate more 'plausible' sentences with the RNN. We will compare the results from this WordRNN with those from CharRNN in the next notebook.

In [48]:
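train_corpus = read_corpus("data/rnn/input.txt")
# train() currently requires a validation set, so the training file is reused here.
# The validation numbers below are therefore not measured on held-out data.
valid_corpus = read_corpus("data/rnn/input.txt")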

In [50]:
model2 = WordRNN.WordRNN(word_embedding_size=256,
                         hidden_size=512,
                         cell="LSTM",
                         num_unroll_steps=30,
                         learning_rate=0.005,
                         batch_size=64,
                         num_layers=2)


In [51]:
model2.fit_to_corpus(train_corpus, valid_corpus)

In [52]:
model2.train(50, save_dir="tmp", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 23236703
--------------------------------------------------------------------------------
Epoch training time: 9.39647626876831

Evaluating..

Finished Epoch 1
train_loss = 7.98915776, perflexity = 2948.81230240
validation_loss = 7.65021506, perflexity = 2101.09739967

Epoch training time: 9.324654579162598

Evaluating..

Finished Epoch 2
train_loss = 7.95522100, perflexity = 2850.41818842
validation_loss = 8.17054488, perflexity = 3535.26971930

Epoch training time: 9.602049112319946

Evaluating..

Finished Epoch 3
train_loss = 8.32056282, perflexity = 4107.47113733
validation_loss = 8.22334928, perflexity = 3726.96416135

Epoch training time: 9.568662881851196

Evaluating..

Finished Epoch 4
train_loss = 8.63934223, perflexity = 5649.61243285
validation_loss = 8.40822631, perflexity = 4483.80055992

Epoch training time: 9.521775484085083

Evaluating..

Finished 

Epoch training time: 8.92335033416748

Evaluating..

Finished Epoch 46
train_loss = 3.69393020, perflexity = 40.20254091
validation_loss = 3.29468464, perflexity = 26.96890788

Epoch training time: 8.883531332015991

Evaluating..

Finished Epoch 47
train_loss = 3.65591117, perflexity = 38.70276994
validation_loss = 3.28405757, perflexity = 26.68382477

Epoch training time: 8.856118440628052

Evaluating..

Finished Epoch 48
train_loss = 3.61137724, perflexity = 37.01699891
validation_loss = 3.16153612, perflexity = 23.60683112

Epoch training time: 8.85993766784668

Evaluating..

Finished Epoch 49
train_loss = 3.55849476, perflexity = 35.11030788
validation_loss = 3.11998596, perflexity = 22.64606180

Epoch training time: 8.856158256530762

Evaluating..

Finished Epoch 50
train_loss = 3.52563918, perflexity = 33.97548314
validation_loss = 3.00872398, perflexity = 20.26152931



In [53]:
model2.sample(50, load_dir="tmp", starter_word="the")

INFO:tensorflow:Restoring parameters from tmp/epoch050_3.0087.model


'the gods know I would have not be a good man, my lord, my lord, you have the king. I am not at the duke? You are not so. You will not be so. good villain! You must be whipt. I have done to a husband as you sleep? you shall not'