A study of the paper [A learning algorithm for continually running fully recurrent neural networks (Williams and Zipser, 1989)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.52.9724&rep=rep1&type=pdf)

Storing a long string of text in an RNN with 32 hidden units. This is not an attempt to store data efficiently, just a study of RNNs in general.

The text is an excerpt from Churchill's "We shall fight on the beaches" speech:

<blockquote>
The British Empire and the French Republic, linked together in their cause and in their need, will defend to the death their native soil, aiding each other like good comrades to the utmost of their strength.

Even though large tracts of Europe and many old and famous states have fallen or may fall into the grip of the Gestapo and all the odious apparatus of Nazi rule, we shall not flag or fail.

We shall go on to the end, we shall fight in France, we shall fight on the seas and oceans, we shall fight with growing confidence and growing strength in the air, we shall defend our island, whatever the cost may be.

We shall fight on the beaches, we shall fight on the landing grounds, we shall fight in the fields and in the streets, we shall fight in the hills; we shall never surrender, and even if, which I do not for a moment believe, this island or a large part of it were subjugated and starving, then our Empire beyond the seas, armed and guarded by the British fleet, would carry on the struggle, until, in God's good time, the new world, with all its power and might, steps forth to the rescue and the liberation of the old.
</blockquote>


In [1]:
import numpy as np

import torch
import torch.nn as nn

from bs4 import BeautifulSoup
from tqdm import tqdm

Define the dataset (the speech is embedded inline as a single training example)

In [2]:
BATCH_SIZE = 1
trainset = [['', 'The British Empire and the French Republic, linked together in their cause and in their need, will defend to the death their native soil, aiding each other like good comrades to the utmost of their strength. Even though large tracts of Europe and many old and famous states have fallen or may fall into the grip of the Gestapo and all the odious apparatus of Nazi rule, we shall not flag or fail. We shall go on to the end, we shall fight in France, we shall fight on the seas and oceans, we shall fight with growing confidence and growing strength in the air, we shall defend our island, whatever the cost may be. We shall fight on the beaches, we shall fight on the landing grounds, we shall fight in the fields and in the streets, we shall fight in the hills; we shall never surrender, and even if, which I do not for a moment believe, this island or a large part of it were subjugated and starving, then our Empire beyond the seas, armed and guarded by the British fleet, would carry on the struggle, until, in Gods good time, the new world, with all its power and might, steps forth to the rescue and the liberation of the old.']]
train_loader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE)

Prepare the dataset:
* remove HTML tags
* convert to ASCII
* one-hot encode each character (128 ASCII codes); the target at each step is the next character

_Note: supports batch size 1 only_

In [3]:
def process_dataset(dataset):
    # each batch is [label, text]; only the first entry of the batch is used
    _, review = dataset[0][0], dataset[1][0]

    # clean up data first, so the tensor sizes match the processed text
    data_processed = BeautifulSoup(review, "html.parser").get_text()
    data_processed = data_processed.encode('ascii', 'ignore')

    # tensor shape is (sequence length, batch size, vocabulary size)
    # 1 indicates batch size, we support only one entry per batch
    # 128 is the number of ASCII symbols used for one-hot encoding
    seq_len = len(data_processed) - 1
    output_review = torch.zeros(seq_len, 1, 128)
    output_labels = torch.zeros(seq_len, 1, 128)

    # create one-hot char encoding: input is char idx, label is char idx+1
    for idx in range(seq_len):
        in_char = data_processed[idx]
        out_char = data_processed[idx+1]
        output_review[idx][0][in_char] = 1.
        output_labels[idx][0][out_char] = 1.

    return [output_review, output_labels]
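
The input/label shift used by `process_dataset` is easier to see on a tiny string. The following is a minimal pure-Python sketch, not the notebook's code: `one_hot` and `one_hot_pairs` are hypothetical helpers introduced only for illustration, and they use plain lists instead of tensors.

```python
def one_hot(index, size=128):
    """Return a list of `size` zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def one_hot_pairs(text, size=128):
    """Build (input, target) one-hot pairs for next-character prediction.

    Hypothetical helper: the input at step i encodes text[i] and the
    target encodes text[i + 1], so n characters give n - 1 pairs.
    """
    codes = text.encode("ascii", "ignore")  # bytes; iterating yields ints
    inputs = [one_hot(c, size) for c in codes[:-1]]
    targets = [one_hot(c, size) for c in codes[1:]]
    return inputs, targets


inputs, targets = one_hot_pairs("The")
print(len(inputs))            # number of training steps
print(inputs[0].index(1.0))   # ASCII code of "T"
print(targets[0].index(1.0))  # ASCII code of "h"
```

For the three-character string there are two steps: "T" predicts "h", and "h" predicts "e". The same shift is why the tensors printed later in the notebook have one row fewer than the text has characters.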

RNN implementation

In [4]:
class Module(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, layers_number):
        super(Module, self).__init__()

        self.hidden_dim = hidden_dim
        self.layers_number = layers_number

        self.rnn = nn.RNN(input_size, hidden_dim, layers_number)
        self.out_layer = nn.Linear(hidden_dim, output_size)

    def forward(self, input, hidden_state=None):
        # input shape: (sequence length, batch size, input size)
        batch_size = input.size(1)

        if hidden_state is None:
            # find out at what device model is located
            device = next(self.parameters()).device
            hidden_state = torch.zeros(self.layers_number, batch_size, self.hidden_dim, device=device)

        out, hidden_state = self.rnn(input, hidden_state)

        out = self.out_layer(out)

        # no softmax here: nn.CrossEntropyLoss expects raw logits

        return out, hidden_state
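
For reference, with its default `tanh` nonlinearity a single-layer `nn.RNN` applies the Elman recurrence h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh) at every time step. The sketch below spells out one such step in pure Python. The function `rnn_step` and the toy sizes (3 input symbols, 2 hidden units) are assumptions made for illustration; the notebook itself relies on `nn.RNN`.

```python
import math


def rnn_step(x, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One Elman step: h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b_ih[i] + b_hh[i]
        total += sum(w * xj for w, xj in zip(w_ih[i], x))
        total += sum(w * hj for w, hj in zip(w_hh[i], h_prev))
        h_new.append(math.tanh(total))
    return h_new


# Toy weights, chosen by hand for illustration only
w_ih = [[0.5, -0.5, 0.0], [0.0, 1.0, -1.0]]  # hidden x input
w_hh = [[0.0, 0.0], [0.0, 0.0]]              # hidden x hidden
b_ih = [0.0, 0.0]
b_hh = [0.0, 0.0]

x = [0.0, 1.0, 0.0]  # one-hot input selecting symbol 1
h0 = [0.0, 0.0]      # initial hidden state, as in forward()

h1 = rnn_step(x, h0, w_ih, w_hh, b_ih, b_hh)
print(h1)
```

Because the input is one-hot, `W_ih x_t` simply selects one column of `W_ih`, so each character has its own learned "push" on the hidden state.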


Create a model and make a test run

In [5]:
net = Module(input_size=128, output_size=128, hidden_dim=32, layers_number=1)
print(net)

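data_iter = iter(train_loader)
# use the built-in next(); DataLoader iterators have no .next() method in current PyTorch
prepared_data = process_dataset(next(data_iter))
out, hidden = net(prepared_data[0])
print(prepared_data[1].size())
print(out.size())

# the targets are one-hot vectors, i.e. class probabilities
# (supported by nn.CrossEntropyLoss since PyTorch 1.10)
loss_func = nn.CrossEntropyLoss()
loss = loss_func(out.view(-1, 128), prepared_data[1].view(-1, 128))
print(loss)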

Module(
  (rnn): RNN(128, 32)
  (out_layer): Linear(in_features=32, out_features=128, bias=True)
)
torch.Size([1131, 1, 128])
torch.Size([1131, 1, 128])
tensor(4.8765, grad_fn=<DivBackward1>)
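
The initial loss is close to what a uniform guess over the 128 classes would give. For a prediction that assigns probability 1/K to every class, the cross-entropy is ln K regardless of the target. The short check below assumes that an untrained network outputs roughly uniform probabilities, which is only an approximation.

```python
import math

VOCAB_SIZE = 128  # number of ASCII codes used for one-hot encoding

# cross-entropy of a uniform prediction: -log(1 / K) = log(K)
uniform_loss = math.log(VOCAB_SIZE)
print(f"{uniform_loss:.4f}")
```

This gives about 4.85, so the observed 4.8765 is consistent with a freshly initialised model that has not yet learned anything about the text.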


Evaluate if we can use GPU and set it up

In [6]:
# is_available must be called: the bare function object is always truthy
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

if torch.cuda.is_available():
    torch.cuda.empty_cache()
    net.to(device)

Testing routine

In [7]:
def generate_sequence(model: Module, length, start_char) -> None:
    """Greedily generate `length` characters after `start_char` and print them."""
    output = [start_char]
    input_char = start_char.encode("ascii")[0]
    input_emb = torch.zeros(1, 1, 128).to(device)
    input_emb[0][0][input_char] = 1
    hidden_state = None

    with torch.no_grad():
        for _ in range(length):
            logits, hidden_state = model(input_emb, hidden_state=hidden_state)
            # greedy decoding: pick the most likely next character
            char_idx = torch.argmax(logits).item()
            output.append(chr(char_idx))
            # feed the predicted character back in as the next one-hot input
            input_emb = torch.zeros(1, 1, 128).to(device)
            input_emb[0][0][char_idx] = 1

    print(f"Generated output: {''.join(output)}")

In [25]:
generate_sequence(net, 1131, 'T')

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will defend to the death their native soil, aiding each other like good comrades to the utmost of their strength. Even though large tracts of Europe and many old and famous states have fallen or may fall into the grip of the Gestapo and all the odious apparatus of Nazi rule, we shall not flag or fail. We shall go on to the end, we shall fight in France, we shall fight on the seas and oceans, we shall fight with growing confidence and growing strength in the air, we shall defend our island, whatever the cost may be. We shall fight on the beaches, we shall fight on the landing grounds, we shall fight in the fields and in the streets, we shall fight in the hills; we shall never surrender, and even if, which I do not for a moment believe, this island or a large part of it were subjugated and starving, then our Empire beyond the seas, armed and guarded by the British fleet, would 

In [28]:
sum(p.numel() for p in net.parameters())


9408
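
The total can be reproduced by hand from the layer sizes. The sketch below uses the sizes passed to `Module` above; the names in the comments are the parameter names PyTorch uses for a single-layer `nn.RNN` and for `nn.Linear`.

```python
input_size, hidden_dim, output_size = 128, 32, 128

rnn_params = (
    hidden_dim * input_size    # rnn.weight_ih_l0
    + hidden_dim * hidden_dim  # rnn.weight_hh_l0
    + hidden_dim               # rnn.bias_ih_l0
    + hidden_dim               # rnn.bias_hh_l0
)
linear_params = (
    output_size * hidden_dim   # out_layer.weight
    + output_size              # out_layer.bias
)

total = rnn_params + linear_params
print(rnn_params, linear_params, total)
```

The recurrent layer accounts for 5184 parameters and the output layer for 4224. Since the training sequence has 1131 steps, the model has roughly eight parameters per stored character, which fits the stated goal of studying RNNs rather than storing text efficiently.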

Training routine

In [18]:
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.0002)
epochs = 1000

pbar = tqdm(range(epochs))
for e in pbar:
    # loss summed over all items in the epoch (a single item here)
    running_loss = 0.

    for data_item in train_loader:

        reviews, labels = process_dataset(data_item)
        labels = labels.to(device)
        reviews = reviews.to(device)

        out, _ = net(reviews)
        loss = loss_func(out.view(-1, 128), labels.view(-1, 128))

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    pbar.set_postfix_str(f"Loss: {running_loss}")
    # print a short generated sample every 100 epochs
    if e % 100 == 0:
        generate_sequence(net, 100, "T")


  0%|          | 1/1000 [00:00<03:17,  5.05it/s, Loss: 0.2888025641441345] 

Generated output: The British Empire and the Frend fatm belile, aist fountoug aralledengos of Eveyt. Eurep and in firhi


 10%|█         | 102/1000 [00:11<01:44,  8.58it/s, Loss: 0.08720751851797104]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 20%|██        | 202/1000 [00:23<01:41,  7.86it/s, Loss: 0.07222181558609009]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 30%|███       | 302/1000 [00:35<01:25,  8.12it/s, Loss: 0.06023353710770607]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 40%|████      | 401/1000 [00:46<01:16,  7.79it/s, Loss: 0.05705444887280464]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 50%|█████     | 501/1000 [00:58<01:12,  6.89it/s, Loss: 0.05484971031546593] 

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 60%|██████    | 602/1000 [01:10<00:50,  7.82it/s, Loss: 0.053294263780117035]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 70%|███████   | 701/1000 [01:22<00:41,  7.14it/s, Loss: 0.05314226076006889]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 80%|████████  | 802/1000 [01:35<00:27,  7.28it/s, Loss: 0.05092659592628479]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


 90%|█████████ | 902/1000 [01:47<00:12,  7.87it/s, Loss: 0.0497162789106369]

Generated output: The British Empire and the French Republic, linked together in their cause and in their need, will de


100%|██████████| 1000/1000 [01:58<00:00,  8.44it/s, Loss: 0.04854488745331764]
