# RNN in PyTorch

In this notebook we implement an RNN in PyTorch. Much of the code is taken from the fast.ai week-6 lesson notebook; this implementation is for learning and practice.

We will build a character-level language model that predicts the next character given the eight characters that precede it. A simple (vanilla) RNN is enough for this.

## Data Preparation 

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.io import *
from fastai.conv_learner import *
from fastai.column_data import *


In [2]:
PATH='data/nietzsche/'

In [3]:
text = open(f'{PATH}nietzsche.txt').read()
print('Corpus length :', len(text))

Corpus length : 600893


In [4]:
text[:400]

'PREFACE\n\n\nSUPPOSING that Truth is a woman--what then? Is there not ground\nfor suspecting that all philosophers, in so far as they have been\ndogmatists, have failed to understand women--that the terrible\nseriousness and clumsy importunity with which they have usually paid\ntheir addresses to Truth, have been unskilled and unseemly methods for\nwinning a woman? Certainly she has never allowed herself '

In [5]:
chars = sorted(list(set(text)))
vocab_size = len(chars)+1  # +1 for the "\0" padding character inserted at index 0 in the next cell
print("Total chars : {}".format(vocab_size))

Total chars : 85


In [6]:
chars.insert(0, "\0")

''.join(chars[1:-6])

'\n !"\'(),-.0123456789:;=?ABCDEFGHIJKLMNOPQRSTUVWXYZ[]_abcdefghijklmnopqrstuvwxy'

In [7]:
char_indices = {c : i for i, c in enumerate(chars)}
indices_char = {i : c for i,c in enumerate(chars)}

In [8]:
idx = [char_indices[c] for c in text]

idx[:10]

[40, 42, 29, 30, 25, 27, 29, 1, 1, 1]

In [9]:
''.join(indices_char[i] for i in idx[:70])

'PREFACE\n\n\nSUPPOSING that Truth is a woman--what then? Is there not gro'

### First RNN 

#### Create Inputs 

In [10]:
cs = 8

In [11]:
c_in_dat = [[idx[i+j] for i in range(cs)] for j in range(len(idx) - cs)]

In [12]:
c_out_dat = [idx[i+cs] for i in range(len(idx)-cs)]

In [13]:
xs = np.stack(c_in_dat, axis=0)

In [14]:
xs.shape

(600885, 8)

In [15]:
y = np.stack(c_out_dat)

In [16]:
xs[:cs, :cs]

array([[40, 42, 29, 30, 25, 27, 29,  1],
       [42, 29, 30, 25, 27, 29,  1,  1],
       [29, 30, 25, 27, 29,  1,  1,  1],
       [30, 25, 27, 29,  1,  1,  1, 43],
       [25, 27, 29,  1,  1,  1, 43, 45],
       [27, 29,  1,  1,  1, 43, 45, 40],
       [29,  1,  1,  1, 43, 45, 40, 40],
       [ 1,  1,  1, 43, 45, 40, 40, 39]])

In [17]:
n_fac = 42
n_hidden = 256

In [18]:
y[:cs]

array([ 1,  1, 43, 45, 40, 40, 39, 43])
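
To make the alignment between `xs` and `y` easier to see, here is a minimal sketch of the same sliding-window construction on a toy string. The string `"hello world"` and the window length `cs = 3` are assumptions chosen for illustration; the notebook itself uses the Nietzsche corpus and `cs = 8`.

```python
# Toy illustration of the sliding-window construction used above.
text = "hello world"
cs = 3  # window length (the notebook uses cs = 8)

chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}
idx = [char_indices[c] for c in text]

# Row j holds characters j .. j+cs-1; its target is character j+cs.
xs = [[idx[j + i] for i in range(cs)] for j in range(len(idx) - cs)]
y = [idx[j + cs] for j in range(len(idx) - cs)]

# Print the first few (window -> target) pairs as text.
for window, target in list(zip(xs, y))[:3]:
    print("".join(chars[i] for i in window), "->", chars[target])
```

Each row is the previous row shifted left by one character, and the last element of row `j+1` is the target of row `j`. This is the same pattern visible in the `xs[:cs, :cs]` and `y[:cs]` outputs above.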

#### Create and train Model

In [19]:
val_idx = get_cv_idxs(len(idx)-cs-1)

In [20]:
md = ColumnarModelData.from_arrays('.',val_idx, xs, y, bs=521)

In [21]:
class CharLoopModel(nn.Module):
    # A hand-rolled RNN: the same layers are applied to every character in turn
    def __init__(self, vocab_size, n_fac):
        super().__init__()
        self.e = nn.Embedding(vocab_size, n_fac)       # character index -> embedding
        self.l_in = nn.Linear(n_fac, n_hidden)         # embedding -> hidden
        self.l_hidden = nn.Linear(n_hidden, n_hidden)  # hidden -> hidden
        self.l_out = nn.Linear(n_hidden, vocab_size)   # hidden -> character scores
        
    def forward(self, *cs):
        bs = cs[0].size(0)
        h = V(torch.zeros(bs, n_hidden).cuda())  # initial hidden state
        for c in cs:
            inp = F.relu(self.l_in(self.e(c)))
            h = torch.tanh(self.l_hidden(h+inp))  # F.tanh is deprecated in favor of torch.tanh
        return F.log_softmax(self.l_out(h), dim=-1)
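
For readers without the old fastai helpers (`V`, `.cuda()`), the same loop can be sketched in plain PyTorch. This is an illustrative variant, not the notebook's code: the class name `CharLoopModelPlain` is hypothetical, `n_hidden` is passed in explicitly instead of being read from a global, and the input is assumed to be a single `(batch, seq_len)` tensor rather than one tensor per character.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharLoopModelPlain(nn.Module):
    """Same layer structure as CharLoopModel, without fastai helpers (hypothetical name)."""
    def __init__(self, vocab_size, n_fac, n_hidden):
        super().__init__()
        self.n_hidden = n_hidden
        self.e = nn.Embedding(vocab_size, n_fac)
        self.l_in = nn.Linear(n_fac, n_hidden)
        self.l_hidden = nn.Linear(n_hidden, n_hidden)
        self.l_out = nn.Linear(n_hidden, vocab_size)

    def forward(self, x):
        # x: LongTensor of character indices, shape (batch, seq_len)
        h = torch.zeros(x.size(0), self.n_hidden, device=x.device)
        for t in range(x.size(1)):
            inp = F.relu(self.l_in(self.e(x[:, t])))
            h = torch.tanh(self.l_hidden(h + inp))
        return F.log_softmax(self.l_out(h), dim=-1)

torch.manual_seed(0)
model = CharLoopModelPlain(vocab_size=85, n_fac=42, n_hidden=256)
batch = torch.randint(0, 85, (4, 8))  # 4 random sequences of 8 character indices
log_probs = model(batch)
print(log_probs.shape)
```

The output has one row of log-probabilities per sequence in the batch, which is the shape `F.nll_loss` expects together with a vector of target indices.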

In [22]:
m = CharLoopModel(vocab_size, n_fac).cuda()
opt = optim.Adam(m.parameters(), 1e-2)

In [23]:
fit(m, md, 1, opt, F.nll_loss)

epoch      trn_loss   val_loss                              
    0      2.019015   2.027871  



[array([2.02787])]

In [24]:
set_lrs(opt, 0.001)

In [25]:
fit(m, md, 3, opt, F.nll_loss)

epoch      trn_loss   val_loss                              
    0      1.750454   1.748661  
    1      1.698151   1.700396                              
    2      1.655771   1.669532                              



[array([1.66953])]

#### Test the Model

In [26]:
def get_next(inp):
    idxs = T(np.array([char_indices[c] for c in inp]))
    p = m(*VV(idxs))
    i = np.argmax(to_np(p))
    return chars[i]

In [27]:
get_next('for thos')

'e'

In [28]:
get_next('I am th')

'e'

### RNN in PyTorch

In [29]:
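class CharRnn(nn.Module):
    def __init__(self, vocab_size, n_fac):
        super().__init__()
        self.e = nn.Embedding(vocab_size, n_fac)
        self.rnn = nn.RNN(n_fac, n_hidden)  # handles the recurrent layers and the loop over time steps
        self.l_out = nn.Linear(n_hidden, vocab_size)
        
    def forward(self, *cs):
        bs = cs[0].size(0)
        h = V(torch.zeros(1, bs, n_hidden))  # (num_layers, batch, hidden)
        inp = self.e(torch.stack(cs))        # (seq_len, batch, n_fac)
        outp,h = self.rnn(inp, h)            # outp: (seq_len, batch, hidden)
        
        # predict from the hidden state of the last time step
        return F.log_softmax(self.l_out(outp[-1]), dim=-1)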

In [30]:
m = CharRnn(vocab_size, n_fac).cuda()
opt = optim.Adam(m.parameters(), 1e-3)

In [31]:
it = iter(md.trn_dl)
*xs,yt = next(it)

In [32]:
t = m.e(V(torch.stack(xs)))
t.size()

torch.Size([8, 521, 42])

In [33]:
ht = V(torch.zeros(1, 521,n_hidden))
outp, hn = m.rnn(t, ht)
outp.size(), hn.size()

(torch.Size([8, 521, 256]), torch.Size([1, 521, 256]))
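
The shapes above can be reproduced on the CPU, and the recurrence that `nn.RNN` applies can be unrolled by hand. The sketch below assumes a single-layer `nn.RNN` with its default `tanh` nonlinearity, a batch size of 4 instead of 521, and random inputs standing in for the embedded characters. Note that this recurrence, `tanh(W_ih x + b_ih + W_hh h + b_hh)`, is not identical to the one in `CharLoopModel`, which applies a ReLU to the input and adds it to the hidden state before a single linear layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, bs, n_fac, n_hidden = 8, 4, 42, 256

rnn = nn.RNN(n_fac, n_hidden)        # defaults: tanh, input shaped (seq_len, batch, features)
x = torch.randn(seq_len, bs, n_fac)  # stand-in for the embedded characters
h0 = torch.zeros(1, bs, n_hidden)    # (num_layers, batch, hidden)

with torch.no_grad():
    outp, hn = rnn(x, h0)

    # Manual unroll of the same recurrence, one time step at a time.
    h = h0[0]
    for t in range(seq_len):
        h = torch.tanh(x[t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                       + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)

print(outp.shape, hn.shape)
print(torch.allclose(outp[-1], hn[0]), torch.allclose(h, hn[0], atol=1e-4))
```

`outp` holds the hidden state at every time step, while `hn` holds only the final one, so `outp[-1]` and `hn[0]` agree. That is why `CharRnn.forward` can read its prediction from `outp[-1]`.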

In [34]:
t = m(*V(xs)); t.size()

torch.Size([521, 85])

In [35]:
fit(m, md, 4, opt, F.nll_loss)

epoch      trn_loss   val_loss                              
    0      1.890761   1.856236  
    1      1.68042    1.679702                              
    2      1.609937   1.606566                              
    3      1.539929   1.558446                              



[array([1.55845])]

In [36]:
set_lrs(opt, 1e-4)

In [37]:
fit(m, md, 4, opt, F.nll_loss)

epoch      trn_loss   val_loss                              
    0      1.473843   1.520136  
    1      1.466461   1.515069                              
    2      1.461275   1.510437                              
    3      1.453553   1.508202                              



[array([1.5082])]

In [38]:
def get_next(inp):
    idxs = T(np.array([char_indices[c] for c in inp]))
    p = m(*VV(idxs))
    i = np.argmax(to_np(p))
    return chars[i]

In [39]:
get_next('for thos')

'e'

In [40]:
def get_next_n(inp, n):
    res = inp
    for i in range(n):
        c = get_next(inp)
        res += c
        inp = inp[1:]+c
    return res

In [41]:
get_next_n('for thos', 40)

'for those and the same the same the same the sam'
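
The repeated "the same" is typical of greedy decoding: `get_next` always takes the `argmax`, so once the eight-character window repeats, the output cycles. One possible alternative, not used in this notebook, is to sample the next index from the predicted distribution. The sketch below is self-contained and uses only the standard library; the three-entry `log_probs` list is a made-up stand-in for one row of model output, and `greedy_index` and `sample_index` are hypothetical helper names.

```python
import math
import random

def greedy_index(log_probs):
    """Index of the largest log-probability (the role np.argmax plays in get_next)."""
    return max(range(len(log_probs)), key=lambda i: log_probs[i])

def sample_index(log_probs, rng):
    """Draw an index i with probability exp(log_probs[i])."""
    weights = [math.exp(lp) for lp in log_probs]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

# Hypothetical 3-character distribution standing in for one row of model output.
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
rng = random.Random(0)

draws = [sample_index(log_probs, rng) for _ in range(1000)]
counts = [draws.count(i) for i in range(3)]
print("greedy:", greedy_index(log_probs))
print("sampled counts:", counts)
```

Greedy decoding returns index 0 every time, whereas sampling also picks the less likely indices in proportion to their probability, which tends to break repetition at the cost of occasional unlikely characters.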