# 5. WordRNN - RNN Language Model with Words

We will build a language model with recurrent neural networks. This notebook covers only a unidirectional (one-way) RNN/GRU/LSTM language model, but it is possible to extend a one-way, single-layer network to bidirectional, deep recurrent neural networks. It is also possible to combine a CNN with an RNN to build a language model; we will cover that in the near future.

We will also generate some fun sentences from the tinyshakespeare dataset. Although the primary goal of WordRNN.py is to build a language model and train/test it on the PTB dataset, I've added some methods for sentence generation. It is not optimized for sentence generation, though, because the `train()` method currently requires a validation set for evaluation, which sampling does not need. I plan to fix this, but it is not a priority.

### References
- [CS224n: Natural Language Processing with Deep Learning - Lecture 8](http://web.stanford.edu/class/cs224n/lectures/lecture8.pdf)
- [CS224n: Natural Language Processing with Deep Learning - Lecture 9](http://web.stanford.edu/class/cs224n/lectures/lecture9.pdf)
- [mkroutikov/tf-lstm-char-cnn](https://github.com/mkroutikov/tf-lstm-char-cnn)

In [1]:
from models import WordRNN

In [2]:
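def read_corpus(data_dir):
    """Read a text file and return a list of lines, each split into tokens."""
    # Note: despite its name, data_dir is the path to a single text file.
    corpus = []
    with open(data_dir, "r") as f:
        for line in f:
            tmp_line = line.strip().split(' ')
            # Skip blank lines and lines that contain only a single token.
            if len(tmp_line) == 1:
                continue
            corpus.append(tmp_line)
    return corpus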

## With Penn Treebank Data

In [4]:
train_corpus = read_corpus("data/rnnlm_datasets/ptb/train.txt")
valid_corpus = read_corpus("data/rnnlm_datasets/ptb/valid.txt")
test_corpus = read_corpus("data/rnnlm_datasets/ptb/test.txt")

If you modify some parts of RNN_LM.py, you can use pretrained word embeddings with this model and compare its performance with and without them. I'll use pretrained embeddings in future models, but for this model I'll leave that to the reader.
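As a rough illustration of the first step, the sketch below builds an initial embedding matrix from pretrained vectors. The helper name `build_embedding_matrix`, the `word_to_id` vocabulary, and the toy `pretrained` dictionary are all hypothetical and not part of RNN_LM.py; how the matrix would be wired into the model depends on code not shown here. Plain Python lists are used to keep the example dependency-free.

```python
import random

def build_embedding_matrix(word_to_id, pretrained, embedding_size, seed=0):
    """Hypothetical helper: return one embedding row per vocabulary word.

    Assumes word ids run from 0 to len(word_to_id) - 1. Words found in
    `pretrained` get their pretrained vector; all other words keep a
    small random initialization.
    """
    rng = random.Random(seed)
    matrix = [[rng.uniform(-0.05, 0.05) for _ in range(embedding_size)]
              for _ in range(len(word_to_id))]
    for word, idx in word_to_id.items():
        vector = pretrained.get(word)
        # Only copy vectors whose dimension matches the model's embedding size.
        if vector is not None and len(vector) == embedding_size:
            matrix[idx] = list(vector)
    return matrix

# Toy example with made-up 4-dimensional "pretrained" vectors.
word_to_id = {"<unk>": 0, "stock": 1, "market": 2}
pretrained = {"stock": [1.0, 1.0, 1.0, 1.0], "market": [0.5, 0.5, 0.5, 0.5]}
embeddings = build_embedding_matrix(word_to_id, pretrained, embedding_size=4)
print(len(embeddings), len(embeddings[0]))
```

Training once with such a matrix as the initial embedding weights and once with purely random weights would give the comparison described above.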

In [8]:
model = WordRNN.WordRNN(word_embedding_size=128,
                        hidden_size=512,
                        cell="LSTM",
                        num_unroll_steps=30,
                        learning_rate=0.001,
                        batch_size=64,
                        num_layers=2)

DEBUG: 04152210


In [9]:
model.fit_to_corpus(train_corpus, valid_corpus)

In [10]:
model.train(10, save_dir="save/05_rnn_lm", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 9821327
--------------------------------------------------------------------------------
000200: 1 [00200/00484], train_loss/perplexity = 6.63781738/763.4268799 secs/batch = 0.0433
000400: 1 [00400/00484], train_loss/perplexity = 6.59713650/732.9932251 secs/batch = 0.0452
Epoch training time: 21.694878101348877

Evaluating..

Finished Epoch 1
train_loss = 6.74412218, perplexity = 849.05348673
validation_loss = 6.65829144, perplexity = 779.21845786

000684: 2 [00200/00484], train_loss/perplexity = 6.55697107/704.1356812 secs/batch = 0.0490
000884: 2 [00400/00484], train_loss/perplexity = 6.24998665/518.0059204 secs/batch = 0.0464
Epoch training time: 22.208510875701904

Evaluating..

Finished Epoch 2
train_loss = 6.50202824, perplexity = 666.49206807
validation_loss = 6.18816230, perplexity = 486.95041552

001168: 3 [00200/00484], train_loss/perplexity = 5.91213751

In [11]:
model.test(test_corpus, load_dir="save/05_rnn_lm")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch010_5.0937.model
--------------------------------------------------------------------------------
Restored model from checkpoint for testing. Size: 9821327
--------------------------------------------------------------------------------
test loss = 5.17648795, perplexity = 177.05987379
test samples: 002688, time elapsed: 0.7923, time per one batch: 0.0189
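
The perplexity values in these logs are consistent with the exponential of the reported loss, which is the usual definition when the loss is the average per-word cross-entropy in nats. Below is a minimal check using numbers copied from the output above. The function name `perplexity` is only for illustration, and whether WordRNN computes the value exactly this way is an assumption.

```python
import math

def perplexity(mean_cross_entropy):
    """Perplexity as exp of the mean per-word cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)

# Loss values copied from the logs above.
test_ppl = perplexity(5.17648795)    # test loss
valid_ppl = perplexity(6.65829144)   # validation loss after epoch 1
print(test_ppl, valid_ppl)
```

Because of the exponential, small changes in loss translate into large changes in perplexity, which is why the two are usually reported together.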


Let's try sampling from the model trained on the PTB dataset.

In [14]:
model.sample(30, load_dir="save/05_rnn_lm", starter_word="stock")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch010_5.0937.model


"stock rothschilds fleischmann benefit-seeking a N million 's a the N to the and the the a in N of the the a the the a the the the N in N"

## With tinyshakespeare Data
We will use the tinyshakespeare dataset for sampling. We will compare the results from this WordRNN with those of CharRNN in the notebook.

In [15]:
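train_corpus = read_corpus("data/rnn/input.txt")
# train() currently requires a validation set, so the same file is reused here;
# the validation loss reported below is therefore not a held-out estimate.
valid_corpus = read_corpus("data/rnn/input.txt")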
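
Since the cell above reads the same file for both training and validation, the validation loss it produces cannot reveal overfitting. If a held-out estimate is wanted, one option is to split the corpus before fitting. The helper `split_corpus` below is a hypothetical sketch and not part of WordRNN.py, and the split ratio is an arbitrary choice.

```python
def split_corpus(corpus, valid_fraction=0.1):
    """Hypothetical helper: hold out the last part of the corpus for validation."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    # Keep at least one line for validation, even for tiny corpora.
    n_valid = max(1, int(len(corpus) * valid_fraction))
    split_at = len(corpus) - n_valid
    return corpus[:split_at], corpus[split_at:]

# Toy corpus in the same shape read_corpus returns: a list of token lists.
toy_corpus = [["line", str(i)] for i in range(10)]
train_part, valid_part = split_corpus(toy_corpus, valid_fraction=0.2)
print(len(train_part), len(valid_part))
```

For sampling alone this makes little difference, which is why the notebook simply reuses the full file.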

In [17]:
model2 = WordRNN.WordRNN(word_embedding_size=256,
                        hidden_size=512,
                        cell="LSTM",
                        num_unroll_steps=30,
                        learning_rate=0.005,
                        batch_size=64,
                        num_layers=2)

DEBUG: 04152210


In [18]:
model2.fit_to_corpus(train_corpus, valid_corpus)

In [19]:
model2.train(50, save_dir="tmp", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 23236703
--------------------------------------------------------------------------------
Epoch training time: 8.462949514389038

Evaluating..

Finished Epoch 1
train_loss = 7.86568061, perplexity = 2606.28368003
validation_loss = 7.60973909, perplexity = 2017.75158543

Epoch training time: 9.069615602493286

Evaluating..

Finished Epoch 2
train_loss = 7.95595207, perplexity = 2852.50281576
validation_loss = 7.78833448, perplexity = 2412.29649810

Epoch training time: 9.387078523635864

Evaluating..

Finished Epoch 3
train_loss = 8.27326415, perplexity = 3917.71616978
validation_loss = 7.98781574, perplexity = 2944.85758807

Epoch training time: 10.45639181137085

Evaluating..

Finished Epoch 4
train_loss = 8.57017626, perplexity = 5272.05894093
validation_loss = 8.18246222, perplexity = 3577.65279150

Epoch training time: 10.32453727722168

Evaluating..

Finished

Epoch training time: 9.158948421478271

Evaluating..

Finished Epoch 46
train_loss = 3.69821189, perplexity = 40.37504491
validation_loss = 3.06578425, perplexity = 21.45127849

Epoch training time: 8.781265020370483

Evaluating..

Finished Epoch 47
train_loss = 3.64910029, perplexity = 38.44006570
validation_loss = 3.01453912, perplexity = 20.37969623

Epoch training time: 8.434995174407959

Evaluating..

Finished Epoch 48
train_loss = 3.59858961, perplexity = 36.54665287
validation_loss = 2.96528140, perplexity = 19.40016176

Epoch training time: 8.637258291244507

Evaluating..

Finished Epoch 49
train_loss = 3.57007013, perplexity = 35.51908406
validation_loss = 2.94526690, perplexity = 19.01573702

Epoch training time: 8.625694990158081

Evaluating..

Finished Epoch 50
train_loss = 3.53479892, perplexity = 34.28811923
validation_loss = 2.89230615, perplexity = 18.03485270



In [21]:
model2.sample(300, load_dir="tmp", starter_word="the")

INFO:tensorflow:Restoring parameters from tmp/epoch050_2.8923.model


"the man that is not so too! What O, I have the king. I am not for the time of marriage I will be a king. A I will be the king. A duke's matter, surely: I am not the king. How I pray you, sir, be not a word. My lord, I have not a king; for it is I love. What I am not known my very good time I will not be a love. When I have been a woman in a king. When I am a gentleman? Let me be a king. But I am not too much: or a' and play to the king. How I have a cause with me. How I will tell you so, I am a poor man and a very house he hath a very piece of the north, he had he had it is the very thing and a king's son should be I sent it. A mother, and a very man and the very house and old old house of a war I pray. I come to the poor house of his old looks, come to the good man and a very house and a old man I make, my good good good and and the maid is the very house of his life, and I have it home to his good and the man I see the good of a old man and the war I pray. I pray thee, I had a woman