# 5. WordRNN - RNN Language Model with Words

We will build a language model with recurrent neural networks. This notebook covers only unidirectional RNN/GRU/LSTM language models, but a single-layer unidirectional network can be extended to a deep bidirectional recurrent network. It is also possible to combine a CNN with an RNN to build a language model; we will cover that in the near future.

We will also generate some fun sentences from the tinyshakespeare dataset. Although the primary goal of WordRNN.py is to build a language model and train and test it on the PTB dataset, I've added some methods for sentence generation. It is not optimized for sentence generation, though: the `train()` method currently requires a validation set for evaluation, which sampling does not need. I'm planning to fix this, but it is not a priority.

### References
- [CS224n: Natural Language Processing with Deep Learning - Lecture 8](http://web.stanford.edu/class/cs224n/lectures/lecture8.pdf)
- [CS224n: Natural Language Processing with Deep Learning - Lecture 9](http://web.stanford.edu/class/cs224n/lectures/lecture9.pdf)
- [mkroutikov/tf-lstm-char-cnn](https://github.com/mkroutikov/tf-lstm-char-cnn)

In [1]:
from models import WordRNN

In [2]:
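def read_corpus(data_dir):
    """Read a space-tokenized text file into a list of token lists.

    Note: despite its name, `data_dir` is the path to a single text file.
    """
    corpus = []
    with open(data_dir, "r") as f:
        for line in f:  # iterate over the file directly, one line at a time
            tmp_line = line.strip().split(' ')
            # Skip blank lines and lines that contain only a single token.
            if len(tmp_line) == 1:
                continue
            corpus.append(tmp_line)
    return corpus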

## With Penn Treebank Data

In [3]:
train_corpus = read_corpus("data/rnnlm_datasets/ptb/train.txt")
valid_corpus = read_corpus("data/rnnlm_datasets/ptb/valid.txt")
test_corpus = read_corpus("data/rnnlm_datasets/ptb/test.txt")

If you modify parts of RNN_LM.py, you can use pretrained word embeddings with this model and compare its performance with and without them. I'll use pretrained embeddings in future models, but for this model I'll leave it to readers.
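
As a starting point for that exercise, here is a minimal sketch of how an initial embedding matrix could be assembled from pretrained vectors. Everything in it is an assumption for illustration: `build_embedding_matrix`, the toy `vocab` and the toy `pretrained` dictionary are hypothetical and are not part of this repository. How the matrix is then fed into the model depends on the model code, which is not shown here.

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return a (len(vocab), dim) float32 matrix and the number of hits.

    Rows for words found in `pretrained` are copied from it; all other
    rows keep a small random initialization.
    """
    rng = np.random.default_rng(seed)
    matrix = rng.uniform(-0.05, 0.05, size=(len(vocab), dim)).astype(np.float32)
    hits = 0
    for idx, word in enumerate(vocab):
        vec = pretrained.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
            hits += 1
    return matrix, hits

# Toy stand-ins (hypothetical): a 4-word vocabulary and 4-dimensional vectors.
vocab = ["<unk>", "the", "stock", "market"]
pretrained = {
    "the": np.ones(4, dtype=np.float32),
    "market": np.full(4, 2.0, dtype=np.float32),
}
embedding_matrix, hits = build_embedding_matrix(vocab, pretrained, dim=4)
print(embedding_matrix.shape, hits)
```

Words missing from the pretrained set keep their random rows, so they can still be learned during training.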

In [4]:
model = WordRNN.WordRNN(word_embedding_size=128,
                        hidden_size=512,
                        cell="LSTM",
                        num_unroll_steps=30,
                        learning_rate=0.001,
                        batch_size=64,
                        num_layers=2)

DEBUG: 04152210
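
To give a feel for what `batch_size` and `num_unroll_steps` control, here is a rough sketch of one common way to cut a token stream into mini-batches for truncated backpropagation through time. It is not taken from WordRNN.py, whose internals are not shown in this notebook; `make_batches` is a hypothetical helper. With `batch_size=64` and `num_unroll_steps=30`, each training step in such a scheme would consume 64 × 30 = 1,920 tokens.

```python
import numpy as np

def make_batches(token_ids, batch_size, num_unroll_steps):
    """Split a flat stream of token ids into inputs and next-word targets.

    Both returned arrays have shape (num_batches, batch_size, num_unroll_steps).
    """
    ids = np.asarray(token_ids)
    # One extra token is needed so that the last input has a target.
    num_batches = (len(ids) - 1) // (batch_size * num_unroll_steps)
    usable = num_batches * batch_size * num_unroll_steps
    x = ids[:usable].reshape(batch_size, num_batches, num_unroll_steps)
    y = ids[1:usable + 1].reshape(batch_size, num_batches, num_unroll_steps)
    # Put the batch index first so that x[i] is the i-th mini-batch.
    return x.transpose(1, 0, 2), y.transpose(1, 0, 2)

# Toy stream of 25 token ids, cut into batches of 2 rows x 3 time steps.
x, y = make_batches(np.arange(25), batch_size=2, num_unroll_steps=3)
print(x.shape)  # (4, 2, 3)
print(x[0])     # first mini-batch of inputs
print(y[0])     # the same positions shifted by one token
```

In this layout, row `r` of batch `i + 1` continues exactly where row `r` of batch `i` stopped, which is what lets a recurrent state be carried from one batch to the next.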


In [5]:
model.fit_to_corpus(train_corpus, valid_corpus)

Instructions for updating:
Use the retry module or similar alternatives.


In [6]:
model.train(20, save_dir="save/05_rnn_lm", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 9821327
--------------------------------------------------------------------------------
000200: 1 [00200/00484], train_loss/perplexity = 6.64135027/766.1287842 secs/batch = 0.0417
000400: 1 [00400/00484], train_loss/perplexity = 6.61470509/745.9846802 secs/batch = 0.0426
Epoch training time: 20.554283380508423

Evaluating..

Finished Epoch 1
train_loss = 6.75112559, perplexity = 855.02062786
validation_loss = 6.66651878, perplexity = 785.65579451

000684: 2 [00200/00484], train_loss/perplexity = 6.55243683/700.9501953 secs/batch = 0.0427
000884: 2 [00400/00484], train_loss/perplexity = 6.29491997/541.8125000 secs/batch = 0.0428
Epoch training time: 20.479009866714478

Evaluating..

Finished Epoch 2
train_loss = 6.53963027, perplexity = 692.03066569
validation_loss = 6.29975884, perplexity = 544.44059474

001168: 3 [00200/00484], train_loss/perplexity = 5.96246672

In [7]:
model.test(test_corpus, load_dir="save/05_rnn_lm")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch020_4.9714.model
--------------------------------------------------------------------------------
Restored model from checkpoint for testing. Size: 9821327
--------------------------------------------------------------------------------
test loss = 4.87975677, perplexity = 131.59865106
test samples: 002688, time elapsed: 0.7536, time per one batch: 0.0179
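
The loss and perplexity columns in these logs are two views of the same number: perplexity is the exponential of the mean per-word cross-entropy (in nats). A small sketch to check this against the values printed above; the helper names are mine, not from WordRNN.py.

```python
import math

def perplexity(mean_cross_entropy):
    """Perplexity is exp of the mean per-word cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy)

def perplexity_from_probs(word_probs):
    """Perplexity from the probabilities the model gave to the correct words."""
    mean_nll = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(mean_nll)

print(perplexity(4.87975677))  # ≈ 131.6, the test perplexity logged above
print(perplexity_from_probs([0.25, 0.25, 0.25, 0.25]))  # ≈ 4: a uniform guess over 4 words
```

A perplexity of about 131 can be read as the model being, on average, as uncertain as if it were choosing uniformly among roughly 131 words at every position.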


Let's try sampling from the model trained on the PTB dataset.
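
Sampling from a language model usually means repeatedly drawing the next word from the model's predicted distribution and feeding it back in. The sketch below shows that loop in isolation. It is an assumption about the general technique, not a description of what `model.sample()` does internally: `sample_next_word`, `generate`, the `temperature` parameter and the toy bigram table are all hypothetical.

```python
import numpy as np

def sample_next_word(logits, temperature=1.0, rng=None):
    """Draw one word id from softmax(logits / temperature)."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def generate(next_logits_fn, starter_id, length, temperature=1.0, seed=0):
    """Generate `length` word ids, starting from `starter_id`."""
    rng = np.random.default_rng(seed)
    ids = [starter_id]
    for _ in range(length - 1):
        logits = next_logits_fn(ids)  # scores for the next word given the history
        ids.append(sample_next_word(logits, temperature, rng))
    return ids

# Toy stand-in for a trained model: next-word scores depend only on the last word.
vocab = ["stock", "market", "the", "of"]
toy_logits = np.array([[0, 2, 1, 1],
                       [1, 0, 1, 2],
                       [2, 2, 0, 0],
                       [0, 1, 3, 0]], dtype=float)
ids = generate(lambda history: toy_logits[history[-1]], starter_id=0, length=10)
print(" ".join(vocab[i] for i in ids))
```

A lower `temperature` makes the draw closer to always picking the most likely word, while a higher one makes the output more varied.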

In [8]:
model.sample(30, load_dir="save/05_rnn_lm", starter_word="stock")

INFO:tensorflow:Restoring parameters from save/05_rnn_lm/epoch020_4.9714.model


"stock will make them to be the right of the stock and exchange commission in the past century 's market in the u.s. 's history of the market 's market and that"

## With tinyshakespeare Data
We will use the tinyshakespeare dataset for sampling, and compare the results of this WordRNN with those of CharRNN in the notebook.

In [9]:
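train_corpus = read_corpus("data/rnn/input.txt")
# train() currently requires a validation set, so the training file is reused here;
# the validation numbers below are therefore measured on the training data.
valid_corpus = read_corpus("data/rnn/input.txt")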

In [10]:
model2 = WordRNN.WordRNN(word_embedding_size=256,
                        hidden_size=512,
                        cell="LSTM",
                        num_unroll_steps=30,
                        learning_rate=0.005,
                        batch_size=64,
                        num_layers=2)

DEBUG: 04152210


In [11]:
model2.fit_to_corpus(train_corpus, valid_corpus)

In [12]:
model2.train(50, save_dir="tmp", print_every=200)

--------------------------------------------------------------------------------
Created and Initialized fresh model. Size: 23236703
--------------------------------------------------------------------------------
Epoch training time: 8.178820133209229

Evaluating..

Finished Epoch 1
train_loss = 7.85117774, perplexity = 2568.75786937
validation_loss = 7.53823048, perplexity = 1878.50303529

Epoch training time: 8.057024478912354

Evaluating..

Finished Epoch 2
train_loss = 7.93559659, perplexity = 2795.02573622
validation_loss = 7.71589308, perplexity = 2243.72582879

Epoch training time: 8.102108716964722

Evaluating..

Finished Epoch 3
train_loss = 8.16610162, perplexity = 3519.59647496
validation_loss = 7.87270113, perplexity = 2624.64553354

Epoch training time: 8.0701322555542

Evaluating..

Finished Epoch 4
train_loss = 8.39273260, perplexity = 4414.86530375
validation_loss = 7.97569005, perplexity = 2909.36479382

Epoch training time: 8.089577913284302

Evaluating..

Finished E

Epoch training time: 8.128581762313843

Evaluating..

Finished Epoch 46
train_loss = 2.94174491, perplexity = 18.94888152
validation_loss = 2.27822024, perplexity = 9.75929573

Epoch training time: 8.148298263549805

Evaluating..

Finished Epoch 47
train_loss = 2.91358507, perplexity = 18.42272702
validation_loss = 2.22284269, perplexity = 9.23354173

Epoch training time: 8.148903369903564

Evaluating..

Finished Epoch 48
train_loss = 2.87698229, perplexity = 17.76059590
validation_loss = 2.22527506, perplexity = 9.25602846

Epoch training time: 8.1575288772583

Evaluating..

Finished Epoch 49
train_loss = 2.84776381, perplexity = 17.24916630
validation_loss = 2.13689162, perplexity = 8.47305915

Epoch training time: 8.12691330909729

Evaluating..

Finished Epoch 50
train_loss = 2.81535885, perplexity = 16.69916714
validation_loss = 2.13372996, perplexity = 8.44631250



In [13]:
model2.sample(300, load_dir="tmp", starter_word="the")

INFO:tensorflow:Restoring parameters from tmp/epoch050_2.1337.model


"the king is in the service, which I had brought his power to his hand, I will not be it. What is it a very man of the king of the world. But, sir, I see your very good of love. I will be it. I that you shall have not the good a poor good of a husband, which he is a hair a kind of the gold the suit of you, I am as the good will be it. You are not to the duke? what you have heard you are as much a very man and of a house of a man a good a of a house of a good good of a good and a most of the poor courtesy. I advise you, my name? for you say, and that you have found to him it. I I have a delight in a man's house to a name. If I have heard it. Fare you well. I tell me a tawdry-lace and with your good I see your good good and good good of the good poor of my life, for I had been it. I I am but the good I will see it. You are a hair old. the Bless to the duke? What you have made you as they are not as a very good and of the present time of that the good which I come to make a good and a fac