## Importing Necessary Libraries:

In [4]:
import torch
import numpy as np
from torch import nn
import torch.nn.functional as F

## Loading the Data:

In [5]:
with open('C:/Users/Geekquad/rnn_data/anna.txt', 'r') as f:
    text = f.read()

#### Checking out the first 500 characters:

In [6]:
text[:500]

"Chapter 1\n\n\nHappy families are all alike; every unhappy family is unhappy in its own\nway.\n\nEverything was in confusion in the Oblonskys' house. The wife had\ndiscovered that the husband was carrying on an intrigue with a French\ngirl, who had been a governess in their family, and she had announced to\nher husband that she could not go on living in the same house with him.\nThis position of affairs had now lasted three days, and not only the\nhusband and wife themselves, but all the members of their f"

## Tokenization:

In the cells below I am creating a couple of dictionaries to convert the characters to and from integers.
Encoding the characters as integers makes them easier to use as input to the network.

In [7]:
"""Creating two dictionaries:
   1. int2char: maps integers to characters
   2. char2int: maps characters to integers"""

chars = tuple(set(text))
int2char = dict(enumerate(chars))
char2int = {ch: ii for ii, ch in int2char.items()}

#ENCODING THE TEXT:
encoded = np.array([char2int[ch] for ch in text])

And we can see those same characters from above, encoded as integers.

In [8]:
encoded[:100]

array([77, 42, 11, 58, 19, 22, 30, 20, 38, 62, 62, 62, 56, 11, 58, 58, 35,
       20, 41, 11, 24, 17, 15, 17, 22, 49, 20, 11, 30, 22, 20, 11, 15, 15,
       20, 11, 15, 17, 45, 22, 55, 20, 22, 21, 22, 30, 35, 20, 18, 46, 42,
       11, 58, 58, 35, 20, 41, 11, 24, 17, 15, 35, 20, 17, 49, 20, 18, 46,
       42, 11, 58, 58, 35, 20, 17, 46, 20, 17, 19, 49, 20, 76,  7, 46, 62,
        7, 11, 35, 50, 62, 62, 68, 21, 22, 30, 35, 19, 42, 17, 46])
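
As a quick sanity check on the two dictionaries, the sketch below encodes a short sample string and decodes it back. It builds the vocabulary with `sorted(set(...))` rather than `tuple(set(...))`. That is an assumption on my part and not what the cell above does: the iteration order of a set of strings can differ between Python sessions, so sorting keeps the integer ids stable if the mapping is rebuilt later. `sample_text` and the `sample_*` names are made up for the illustration.

```python
import numpy as np

sample_text = "happy families are all alike"  # hypothetical stand-in for `text`

# Sorting makes the vocabulary order reproducible across runs (assumption:
# the notebook itself uses tuple(set(text)), whose order is not guaranteed).
sample_chars = tuple(sorted(set(sample_text)))
sample_int2char = dict(enumerate(sample_chars))
sample_char2int = {ch: ii for ii, ch in sample_int2char.items()}

# Encode to integers, then decode back to a string.
sample_encoded = np.array([sample_char2int[ch] for ch in sample_text])
sample_decoded = "".join(sample_int2char[int(i)] for i in sample_encoded)

print(sample_encoded[:5])
print(sample_decoded == sample_text)
```

If the round trip prints `True`, the two dictionaries are exact inverses of each other.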

## Pre-processing the data:

As in our char-RNN, the LSTM expects one-hot encoded input: each character is first converted to an integer (using the dictionary created above) and then to a vector in which only the entry at that integer index is 1 and every other entry is 0.
The one_hot_encode function below does this:

In [9]:
def one_hot_encode(arr, n_labels):
    one_hot = np.zeros((arr.size, n_labels), dtype = np.float32)
    one_hot[np.arange(one_hot.shape[0]), arr.flatten()] = 1
    one_hot = one_hot.reshape((*arr.shape, n_labels))
    return one_hot

In [10]:
test_seq = np.array([[3, 5, 1]])
one_hot = one_hot_encode(test_seq, 8)

print(one_hot)

[[[0. 0. 0. 1. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 1. 0. 0.]
  [0. 1. 0. 0. 0. 0. 0. 0.]]]
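
To see that the encoding loses no information, taking `argmax` over the last axis recovers the original integers. The sketch repeats `one_hot_encode` so that it runs on its own; `recovered` is just an illustrative name.

```python
import numpy as np

def one_hot_encode(arr, n_labels):
    one_hot = np.zeros((arr.size, n_labels), dtype=np.float32)
    one_hot[np.arange(one_hot.shape[0]), arr.flatten()] = 1
    return one_hot.reshape((*arr.shape, n_labels))

test_seq = np.array([[3, 5, 1]])
one_hot = one_hot_encode(test_seq, 8)

# Each one-hot vector contains a single 1, so argmax returns its index.
recovered = one_hot.argmax(axis=-1)

print(one_hot.shape)  # (1, 3, 8): batch, sequence step, label
print(recovered)      # [[3 5 1]]
```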


## Making training mini-batches

To train on this data, we split the encoded text into mini-batches. Each batch holds batch_size sequences of seq_length characters, and the targets are the inputs shifted one character ahead.

In [13]:
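def get_batches(arr, batch_size, seq_length):
    '''Yield (x, y) mini-batches of shape (batch_size, seq_length) from arr.'''
    batch_size_total = batch_size*seq_length
    n_batches = len(arr)//batch_size_total

    # Keep only enough characters to make full batches, one row per sequence
    arr = arr[:n_batches*batch_size_total]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], seq_length):
        x = arr[:, n:n+seq_length]
        # Targets are the inputs shifted one character ahead
        y = np.zeros_like(x)
        try:
            y[:,:-1], y[:,-1] = x[:,1:], arr[:, n+seq_length]
        except IndexError:
            # Last window: wrap around to the first column
            y[:, :-1], y[:, -1] = x[:, 1:], arr[:, 0]
        yield x, y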

Now I'll make some data sets and we can check out what's going on as we batch data. Here I am going to use a batch size of 8 and 50 sequence steps.

In [14]:
batches = get_batches(encoded, 8, 50)
x, y = next(batches)

In [17]:
print('x\n', x[:10, :10])
print('\ny\n', y[:10, :10])

x
 [[77 42 11 58 19 22 30 20 38 62]
 [49 76 46 20 19 42 11 19 20 11]
 [22 46 75 20 76 30 20 11 20 41]
 [49 20 19 42 22 20 53 42 17 22]
 [20 49 11  7 20 42 22 30 20 19]
 [53 18 49 49 17 76 46 20 11 46]
 [20 12 46 46 11 20 42 11 75 20]
 [31  8 15 76 46 49 45 35 50 20]]

y
 [[42 11 58 19 22 30 20 38 62 62]
 [76 46 20 19 42 11 19 20 11 19]
 [46 75 20 76 30 20 11 20 41 76]
 [20 19 42 22 20 53 42 17 22 41]
 [49 11  7 20 42 22 30 20 19 22]
 [18 49 49 17 76 46 20 11 46 75]
 [12 46 46 11 20 42 11 75 20 49]
 [ 8 15 76 46 49 45 35 50 20 74]]
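
The same behaviour is easier to follow on a tiny array. The sketch below repeats `get_batches` so it is self-contained and runs it on the integers 0 to 13 with a batch size of 2 and 3 sequence steps; `toy` and `toy_batches` are made-up names. The last two items do not fill a complete batch and are dropped, and the final target column wraps around to the start of each row.

```python
import numpy as np

def get_batches(arr, batch_size, seq_length):
    batch_size_total = batch_size * seq_length
    n_batches = len(arr) // batch_size_total
    arr = arr[:n_batches * batch_size_total]
    arr = arr.reshape((batch_size, -1))
    for n in range(0, arr.shape[1], seq_length):
        x = arr[:, n:n + seq_length]
        y = np.zeros_like(x)
        try:
            y[:, :-1], y[:, -1] = x[:, 1:], arr[:, n + seq_length]
        except IndexError:
            y[:, :-1], y[:, -1] = x[:, 1:], arr[:, 0]
        yield x, y

toy = np.arange(14)  # 14 items; only the first 12 fit into full batches
toy_batches = list(get_batches(toy, batch_size=2, seq_length=3))

for x, y in toy_batches:
    print(x.tolist(), y.tolist())
# [[0, 1, 2], [6, 7, 8]] [[1, 2, 3], [7, 8, 9]]
# [[3, 4, 5], [9, 10, 11]] [[4, 5, 0], [10, 11, 6]]
```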


## Building the Network:

In [42]:
class CharRNN(nn.Module):
    def __init__(self, tokens, n_hidden=256, n_layers=2, drop_prob=0.5, lr=0.001):
        super().__init__()
        self.drop_prob = drop_prob
        self.n_layers = n_layers
        self.n_hidden = n_hidden
        self.lr = lr
        
        self.chars = tokens
        self.int2char = dict(enumerate(self.chars))
        self.char2int = {ch: ii for ii, ch in self.int2char.items()}
        
        self.lstm = nn.LSTM(len(self.chars), n_hidden, n_layers, dropout=drop_prob, batch_first=True)
        self.dropout = nn.Dropout(drop_prob)
        self.fc = nn.Linear(n_hidden, len(self.chars))
        
    def forward(self, x, hidden):
        r_output, hidden = self.lstm(x, hidden)
        out = self.dropout(r_output)
        out = out.contiguous().view(-1, self.n_hidden)
        out = self.fc(out)
        return out, hidden
    
    def init_hidden(self, batch_size):
        weight = next(self.parameters()).data
        
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(), weight.new(self.n_layers, batch_size, self.n_hidden).zero_())
        
        return hidden       
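
The `view(-1, self.n_hidden)` call in `forward` merges the batch and sequence axes, and the training loop flattens the targets in the same way before computing the loss. The NumPy sketch below checks that the two flattenings line up, so that row `b * seq_length + t` of the flattened output corresponds to target `[b, t]`. NumPy's `reshape` stands in for PyTorch's `view` here, which is an assumption made only to keep the example light on dependencies, and the sizes are toy values rather than the notebook's.

```python
import numpy as np

batch_size, seq_length, n_hidden = 2, 3, 4  # toy sizes, not the notebook's

# Stand-in for the LSTM output: (batch, seq, hidden) because batch_first=True.
r_output = np.arange(batch_size * seq_length * n_hidden).reshape(
    batch_size, seq_length, n_hidden)
targets = np.arange(batch_size * seq_length).reshape(batch_size, seq_length)

flat_out = r_output.reshape(-1, n_hidden)                 # (batch*seq, hidden)
flat_targets = targets.reshape(batch_size * seq_length)   # (batch*seq,)

# Pick one (batch, step) pair and find its row after flattening.
b, t = 1, 2
row = b * seq_length + t
print(flat_out.shape, flat_targets.shape)
print(np.array_equal(flat_out[row], r_output[b, t]),
      flat_targets[row] == targets[b, t])
```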

In [43]:
def train(net, data, epochs=10, batch_size=10, seq_length=50, lr=0.001, clip=5, val_frac=0.1, print_every=10):
    ''' Training a network 
    
        Arguments
        ---------
        
        net: CharRNN network
        data: text data to train the network
        epochs: Number of epochs to train
        batch_size: Number of mini-sequences per mini-batch, aka batch size
        seq_length: Number of character steps per mini-batch
        lr: learning rate
        clip: maximum gradient norm used for clipping
        val_frac: Fraction of data to hold out for validation
        print_every: Number of steps for printing training and validation loss
    
    '''
    
    net.train()
    opt = torch.optim.Adam(net.parameters(), lr = lr)
    criterion = nn.CrossEntropyLoss()
    
    val_idx = int(len(data)*(1-val_frac))
    data, val_data = data[:val_idx], data[val_idx:]
    
    counter = 0
    n_chars = len(net.chars)
    for e in range(epochs):
        h = net.init_hidden(batch_size)
        
        for x, y in get_batches(data, batch_size, seq_length):
            
            counter += 1
            x = one_hot_encode(x, n_chars)
            inputs, targets = torch.from_numpy(x), torch.from_numpy(y)
            
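            # Detach the hidden state from its history so that we do not
            # backpropagate through every earlier batch
            h = tuple([each.data for each in h])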

            net.zero_grad()
            
            output, h = net(inputs, h)
            
            loss = criterion(output, targets.view(batch_size*seq_length).long())
            loss.backward()
            nn.utils.clip_grad_norm_(net.parameters(), clip)
            opt.step()
            
            # loss stats
            if counter % print_every == 0:
                val_h = net.init_hidden(batch_size)
                val_losses = []
                net.eval()
                for x, y in get_batches(val_data, batch_size, seq_length):
                    x = one_hot_encode(x, n_chars)
                    x, y = torch.from_numpy(x), torch.from_numpy(y)
                    
                    val_h = tuple([each.data for each in val_h])
                    
                    inputs, targets = x, y                                         
                    output, val_h = net(inputs, val_h)
                    val_loss = criterion(output, targets.view(batch_size*seq_length).long())
                
                    val_losses.append(val_loss.item())
                
                net.train()
                
                print("Epoch: {}/{}...".format(e+1, epochs),
                      "Step: {}...".format(counter),
                      "Loss: {:.4f}...".format(loss.item()),
                      "Val Loss: {:.4f}".format(np.mean(val_losses)))

## Instantiating the model:

In [44]:
n_hidden=512
n_layers=2

net = CharRNN(chars, n_hidden, n_layers)
print(net)

CharRNN(
  (lstm): LSTM(83, 512, num_layers=2, batch_first=True, dropout=0.5)
  (dropout): Dropout(p=0.5)
  (fc): Linear(in_features=512, out_features=83, bias=True)
)
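
To get a feel for the size of this model, the sketch below counts its trainable parameters by hand. It assumes PyTorch's standard `nn.LSTM` layout (four gates per layer, each with input weights, recurrent weights, and two separate bias vectors); `lstm_param_count` is a helper name introduced only for this illustration.

```python
def lstm_param_count(input_size, hidden_size, num_layers):
    """Count weights and biases in a stacked LSTM (assumed PyTorch layout)."""
    total = 0
    for layer in range(num_layers):
        # The first layer sees the one-hot input; later layers see hidden states.
        layer_input = input_size if layer == 0 else hidden_size
        # 4 gates: input weights + recurrent weights + two bias vectors
        total += 4 * hidden_size * (layer_input + hidden_size) + 2 * 4 * hidden_size
    return total

n_chars, n_hidden, n_layers = 83, 512, 2  # values from the printed model
lstm_params = lstm_param_count(n_chars, n_hidden, n_layers)
fc_params = n_hidden * n_chars + n_chars  # weight matrix plus bias
print(lstm_params, fc_params, lstm_params + fc_params)
# 3323904 42579 3366483
```

Nearly all of the parameters sit in the two LSTM layers; the final linear layer is small by comparison.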


In [49]:
batch_size = 128
seq_length = 100
n_epochs = 30

## Training the Model:

In [50]:
train(net, encoded, epochs=n_epochs, batch_size=batch_size, seq_length=seq_length, lr=0.001, print_every=10)

Epoch: 1/30... Step: 10... Loss: 1.3963... Val Loss: 1.3868
Epoch: 1/30... Step: 20... Loss: 1.3998... Val Loss: 1.3824
Epoch: 1/30... Step: 30... Loss: 1.3898... Val Loss: 1.3791
Epoch: 1/30... Step: 40... Loss: 1.3694... Val Loss: 1.3806
Epoch: 1/30... Step: 50... Loss: 1.3874... Val Loss: 1.3838
Epoch: 1/30... Step: 60... Loss: 1.3171... Val Loss: 1.3796
Epoch: 1/30... Step: 70... Loss: 1.3471... Val Loss: 1.3792
Epoch: 1/30... Step: 80... Loss: 1.3371... Val Loss: 1.3775
Epoch: 1/30... Step: 90... Loss: 1.3491... Val Loss: 1.3727
Epoch: 1/30... Step: 100... Loss: 1.3522... Val Loss: 1.3731
Epoch: 1/30... Step: 110... Loss: 1.3407... Val Loss: 1.3814
Epoch: 1/30... Step: 120... Loss: 1.3172... Val Loss: 1.3738
Epoch: 1/30... Step: 130... Loss: 1.3529... Val Loss: 1.3699
Epoch: 2/30... Step: 140... Loss: 1.3880... Val Loss: 1.3799
Epoch: 2/30... Step: 150... Loss: 1.3541... Val Loss: 1.3707
Epoch: 2/30... Step: 160... Loss: 1.3623... Val Loss: 1.3667
Epoch: 2/30... Step: 170... Loss:

Epoch: 10/30... Step: 1350... Loss: 1.2125... Val Loss: 1.2984
Epoch: 10/30... Step: 1360... Loss: 1.2118... Val Loss: 1.2936
Epoch: 10/30... Step: 1370... Loss: 1.2100... Val Loss: 1.2877
Epoch: 10/30... Step: 1380... Loss: 1.2391... Val Loss: 1.2908
Epoch: 10/30... Step: 1390... Loss: 1.2673... Val Loss: 1.2948
Epoch: 11/30... Step: 1400... Loss: 1.2420... Val Loss: 1.2898
Epoch: 11/30... Step: 1410... Loss: 1.2614... Val Loss: 1.2909
Epoch: 11/30... Step: 1420... Loss: 1.2530... Val Loss: 1.2855
Epoch: 11/30... Step: 1430... Loss: 1.2212... Val Loss: 1.2918
Epoch: 11/30... Step: 1440... Loss: 1.2392... Val Loss: 1.2917
Epoch: 11/30... Step: 1450... Loss: 1.1737... Val Loss: 1.2842
Epoch: 11/30... Step: 1460... Loss: 1.2027... Val Loss: 1.2817
Epoch: 11/30... Step: 1470... Loss: 1.1969... Val Loss: 1.2854
Epoch: 11/30... Step: 1480... Loss: 1.2168... Val Loss: 1.2848
Epoch: 11/30... Step: 1490... Loss: 1.2057... Val Loss: 1.2880
Epoch: 11/30... Step: 1500... Loss: 1.1965... Val Loss:

Epoch: 20/30... Step: 2660... Loss: 1.1596... Val Loss: 1.2599
Epoch: 20/30... Step: 2670... Loss: 1.1559... Val Loss: 1.2637
Epoch: 20/30... Step: 2680... Loss: 1.1457... Val Loss: 1.2636
Epoch: 20/30... Step: 2690... Loss: 1.1411... Val Loss: 1.2653
Epoch: 20/30... Step: 2700... Loss: 1.1488... Val Loss: 1.2627
Epoch: 20/30... Step: 2710... Loss: 1.1145... Val Loss: 1.2610
Epoch: 20/30... Step: 2720... Loss: 1.1241... Val Loss: 1.2607
Epoch: 20/30... Step: 2730... Loss: 1.1121... Val Loss: 1.2589
Epoch: 20/30... Step: 2740... Loss: 1.1137... Val Loss: 1.2647
Epoch: 20/30... Step: 2750... Loss: 1.1182... Val Loss: 1.2704
Epoch: 20/30... Step: 2760... Loss: 1.1123... Val Loss: 1.2665
Epoch: 20/30... Step: 2770... Loss: 1.1452... Val Loss: 1.2640
Epoch: 20/30... Step: 2780... Loss: 1.1876... Val Loss: 1.2623
Epoch: 21/30... Step: 2790... Loss: 1.1581... Val Loss: 1.2637
Epoch: 21/30... Step: 2800... Loss: 1.1809... Val Loss: 1.2586
Epoch: 21/30... Step: 2810... Loss: 1.1721... Val Loss:

Epoch: 29/30... Step: 3970... Loss: 1.1122... Val Loss: 1.2616
Epoch: 29/30... Step: 3980... Loss: 1.0858... Val Loss: 1.2749
Epoch: 29/30... Step: 3990... Loss: 1.0739... Val Loss: 1.2690
Epoch: 29/30... Step: 4000... Loss: 1.0867... Val Loss: 1.2671
Epoch: 29/30... Step: 4010... Loss: 1.0781... Val Loss: 1.2689
Epoch: 29/30... Step: 4020... Loss: 1.0737... Val Loss: 1.2606
Epoch: 29/30... Step: 4030... Loss: 1.0967... Val Loss: 1.2620
Epoch: 30/30... Step: 4040... Loss: 1.0795... Val Loss: 1.2555
Epoch: 30/30... Step: 4050... Loss: 1.0946... Val Loss: 1.2539
Epoch: 30/30... Step: 4060... Loss: 1.0953... Val Loss: 1.2616
Epoch: 30/30... Step: 4070... Loss: 1.0883... Val Loss: 1.2548
Epoch: 30/30... Step: 4080... Loss: 1.0861... Val Loss: 1.2619
Epoch: 30/30... Step: 4090... Loss: 1.0974... Val Loss: 1.2623
Epoch: 30/30... Step: 4100... Loss: 1.0715... Val Loss: 1.2649
Epoch: 30/30... Step: 4110... Loss: 1.0705... Val Loss: 1.2623
Epoch: 30/30... Step: 4120... Loss: 1.0607... Val Loss:
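
One way to read these numbers: `nn.CrossEntropyLoss` reports the mean negative log-likelihood in nats, so `exp(loss)` gives a per-character perplexity. The sketch below applies that to the last complete line of the log above (step 4110); the helper name `perplexity` is my own.

```python
import math

def perplexity(cross_entropy_nats):
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(cross_entropy_nats)

train_loss, val_loss = 1.0705, 1.2623  # copied from the step 4110 log line

train_ppl = perplexity(train_loss)
val_ppl = perplexity(val_loss)
print("train perplexity: {:.2f}".format(train_ppl))  # 2.92
print("val perplexity:   {:.2f}".format(val_ppl))    # 3.53
```

A perplexity near 3.5 on held-out text means the model is, on average, about as uncertain as if it were choosing uniformly among three or four characters out of the 83 in the vocabulary. The gap between the training and validation values also shows the training loss continuing to fall while the validation loss has largely flattened out.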