# Char RNN - Prototype

The main objective of this project is to set up a workflow for defining, training and optimizing neural networks.

We will predict the next character of a text, similar to [char-rnn](https://github.com/karpathy/char-rnn).

This notebook serves as a prototype of our model.

## Load and prepare data

We first download [The Count of Monte Cristo](https://www.gutenberg.org/ebooks/1184) from Project Gutenberg.

In [2]:
from pathlib import Path
import requests

PATH_DATA = Path("data")
FILENAME_DATA = Path("monte_cristo.txt")
URL_DATA = "https://www.gutenberg.org/files/1184/1184-0.txt"

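# Download the dataset once; later runs reuse the local copy
PATH_DATA.mkdir(exist_ok = True)
PATH_DATAFILE = PATH_DATA / FILENAME_DATA
if not PATH_DATAFILE.exists():
    r = requests.get(URL_DATA)
    r.raise_for_status()  # stop on an HTTP error instead of saving an error page
    PATH_DATAFILE.write_bytes(r.content)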

We read the entire text and keep only the lines of interest, dropping titles, bibliography, and similar front and back matter.

In [3]:
with open(PATH_DATAFILE, 'r', encoding="utf8") as f:
    lines = f.readlines()
    # Remove start and end of file (not interesting data)
    lines = lines[319:60662]
    chars = ''.join(lines)
            
# Test code

print("Sample text:\n")
print(chars[:276])

Sample text:

On the 24th of February, 1815, the look-out at Notre-Dame de la Garde
signalled the three-master, the Pharaon from Smyrna, Trieste, and
Naples.

As usual, a pilot put off immediately, and rounding the Château d’If,
got on board the vessel between Cape Morgiou and Rion island.
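
The hard-coded slice `lines[319:60662]` only fits one exact revision of the file. As a possible alternative, here is a minimal sketch that cuts on the Project Gutenberg start and end markers instead. It assumes both markers begin with `*** START OF` and `*** END OF`; the helper name `strip_gutenberg_boilerplate` is hypothetical. It is not equivalent to the slice above, because it would keep anything that sits between the start marker and the first chapter.

```python
def strip_gutenberg_boilerplate(lines):
    """Return the lines between the Project Gutenberg start and end markers.

    Assumption: the markers begin with "*** START OF" and "*** END OF".
    If either marker is missing, the input is returned unchanged.
    """
    start = next((i for i, line in enumerate(lines) if line.startswith("*** START OF")), None)
    end = next((i for i, line in enumerate(lines) if line.startswith("*** END OF")), None)
    if start is None or end is None or end <= start:
        return lines
    return lines[start + 1:end]


# Synthetic sample, not the real file
sample = [
    "Header of a hypothetical ebook\n",
    "*** START OF THE PROJECT GUTENBERG EBOOK ***\n",
    "On the 24th of February, 1815, ...\n",
    "*** END OF THE PROJECT GUTENBERG EBOOK ***\n",
    "License text\n",
]
print(strip_gutenberg_boilerplate(sample))
```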


In [4]:
# Test code

print("Total number of chars:", len(chars))
print("Unique chars:", len(set(chars)))

Total number of chars: 2617219
Unique chars: 99


We then create a dictionary that maps each character to an integer index and back.

In [5]:
# Adapted from https://github.com/pytorch/examples/blob/master/word_language_model/data.py

class Dictionary(object):
    def __init__(self):
        self.char2idx = {}
        self.idx2char = []

    def add_char(self, char):
        if char not in self.char2idx:
            self.idx2char.append(char)
            self.char2idx[char] = len(self.idx2char) - 1
        return self.char2idx[char]

    def __len__(self):
        return len(self.idx2char)
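
To see how the two lookups fit together, here is a small round-trip sketch. The class is repeated so the block runs on its own; `encode` and `decode` are hypothetical helpers added for illustration and do not exist in the notebook.

```python
class Dictionary:
    """Same interface as the class above, repeated so the sketch is self-contained."""

    def __init__(self):
        self.char2idx = {}
        self.idx2char = []

    def add_char(self, char):
        if char not in self.char2idx:
            self.idx2char.append(char)
            self.char2idx[char] = len(self.idx2char) - 1
        return self.char2idx[char]


def encode(dictionary, text):
    # Hypothetical helper: text -> list of token indices (unseen chars are added)
    return [dictionary.add_char(c) for c in text]


def decode(dictionary, tokens):
    # Hypothetical helper: list of token indices -> text
    return "".join(dictionary.idx2char[t] for t in tokens)


d = Dictionary()
tokens = encode(d, "hello")
print(tokens)             # [0, 1, 2, 2, 3]
print(decode(d, tokens))  # hello
```

Indices are assigned in order of first appearance, so the mapping depends on the text it was built from and has to be saved alongside the model if the two are to be reused together.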

We then convert the text from characters to integer tokens and build one-hot inputs from those tokens.

In [6]:
import torch

data_dictionary = Dictionary()
# Map every character to its integer token (the dictionary grows as new chars appear)
tensor_data = torch.tensor([data_dictionary.add_char(c) for c in chars], dtype=torch.long)
    
n_elements = len(data_dictionary)
    
# Transform into one-hot (source: https://discuss.pytorch.org/t/convert-int-into-one-hot-format/507/29)
input_data = torch.zeros(len(tensor_data), n_elements).scatter_(1, tensor_data.unsqueeze(-1), 1)
label_data = tensor_data  # we don't need labels in one-hot format
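
The `scatter_` call writes a 1 into column `tensor_data[i]` of row `i`. A plain-Python sketch of the same operation may make that easier to see; the helper name `one_hot` is an assumption of this example. The block also estimates the size of the full one-hot tensor, assuming 4-byte floats (the default dtype of `torch.zeros`).

```python
def one_hot(tokens, n_classes):
    """Plain-Python equivalent of the scatter_ call: row i gets a 1 in column tokens[i]."""
    rows = [[0.0] * n_classes for _ in tokens]
    for row, token in zip(rows, tokens):
        row[token] = 1.0
    return rows


print(one_hot([2, 0, 1], n_classes=3))
# [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Size of the full one-hot tensor, assuming float32 (4 bytes per value)
n_chars, n_classes = 2617219, 99  # counts printed earlier in the notebook
size_bytes = n_chars * n_classes * 4
print(round(size_bytes / 1e9, 2), "GB")  # 1.04 GB
```

Holding the whole corpus in one-hot form is convenient for a prototype; building the one-hot rows per batch would be one way to reduce memory if that becomes a concern.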

In [7]:
# Test code

print("Sample values:")
print('\n'.join('{1} ({0})'.format(idx, data_dictionary.idx2char[idx.argmax()]) for idx in input_data[506:511]))

Sample values:
i (tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0.]))
g (tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0.]))
g (tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    

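Finally, we split the data into training and validation sets. Each label is the character that follows its input, so the label tensors are shifted by one position.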

In [8]:
split = round(0.98 * len(tensor_data))
train_data, train_label = input_data[:split], label_data[1:split+1]
valid_data, valid_label = input_data[split:-2], label_data[split+1:]
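
The one-position shift is easier to check on a short string. The sketch below applies the same slicing to plain text, with an 80% split chosen only for the example. One deliberate difference: it slices the validation inputs with `[split:-1]` so that inputs and labels have equal length, whereas the notebook's `[split:-2]` yields one input fewer, which its validation loop tolerates.

```python
text = "hello world"
split = round(0.8 * len(text))  # the notebook uses 0.98 on the full text

# Inputs stop one character early; labels start one character late
train_inputs, train_labels = text[:split], text[1:split + 1]
valid_inputs, valid_labels = text[split:-1], text[split + 1:]

for name, inputs, labels in [("train", train_inputs, train_labels),
                             ("valid", valid_inputs, valid_labels)]:
    # Each pair is (current character, character to predict)
    print(name, list(zip(inputs, labels)))
```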

Let's create a class to serve our training data in batches.

In [9]:
class TrainingData():
    
    def __init__(self, train_data, train_label, sequence_per_batch = 64, char_per_sequence = 128):
        
        self.train_data = train_data
        self.train_label = train_label
        self.sequence_per_batch = sequence_per_batch
        self.char_per_sequence = char_per_sequence
        self.length = len(train_data)
        
        # Each sequence of the batch starts reading at an evenly spaced offset in the text
        self.batch_idx = range(0, self.length, self.length // sequence_per_batch)
        self.batch_idx = self.batch_idx[:sequence_per_batch]
        assert len(self.batch_idx) == sequence_per_batch, '{} batches expected vs {} actual'.format(sequence_per_batch,
                                                                                                    len(self.batch_idx))
        
    def next_batch(self):
        
        # loop to the start if we reached the end of text
        self.batch_idx = list(idx if idx + self.char_per_sequence < self.length else 0 for idx in self.batch_idx)
        
        # Extract sequences
        sequences_input = tuple(self.train_data[idx:idx+self.char_per_sequence] for idx in self.batch_idx)
        sequences_label = tuple(self.train_label[idx:idx+self.char_per_sequence] for idx in self.batch_idx)
        
        # Move next idx
        self.batch_idx = (idx + self.char_per_sequence for idx in self.batch_idx)
        
        # Concatenate tensors
        return torch.stack(sequences_input, dim=1), torch.stack(sequences_label, dim=1)
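
The index bookkeeping of `TrainingData` can be followed without tensors. This sketch reproduces only the offsets: where each sequence of the batch starts, which window it reads, and when it wraps back to the start of the text. The function names `batch_starts` and `next_windows` are hypothetical.

```python
def batch_starts(length, sequence_per_batch):
    # Evenly spaced starting offsets, one per sequence in the batch
    step = length // sequence_per_batch
    return list(range(0, length, step))[:sequence_per_batch]


def next_windows(starts, char_per_sequence, length):
    # Wrap to the start of the text when a full window no longer fits
    starts = [s if s + char_per_sequence < length else 0 for s in starts]
    windows = [(s, s + char_per_sequence) for s in starts]
    # Advance every offset for the following batch
    return windows, [s + char_per_sequence for s in starts]


starts = batch_starts(length=100, sequence_per_batch=4)
print(starts)  # [0, 25, 50, 75]

for _ in range(3):
    windows, starts = next_windows(starts, char_per_sequence=10, length=100)
    print(windows)
# The third batch reads (20, 30), (45, 55), (70, 80), (0, 10): the last stream wrapped
```

Because every stream continues where it stopped in the previous batch, the hidden state can be carried from one batch to the next, which is what the training loop does.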

## Create and train a neural network

We will create and optimize different variants of the following architecture.

![architecture](img/architecture.png)

We want to optimize the RNN module and will create our architecture so we can easily test different variants by choosing:

* RNN, LSTM or GRU modules

* number of features for hidden states

* number of layers

* dropout between each layer

We can decide to optimize other parameters such as the loss function, optimization algorithm…
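
One way to picture the choices listed above is as a grid of configurations. The sketch below enumerates such a grid with `itertools.product`; the candidate values are illustrative assumptions, not settings recommended by this notebook.

```python
from itertools import product

# Hypothetical candidate values for each choice
search_space = {
    "rnn_module": ["RNN", "LSTM", "GRU"],
    "hidden_size": [64, 128],
    "num_layers": [1, 2, 3],
    "dropout": [0.0, 0.1],
}

# One dict per combination, with keys matching the Model constructor arguments
configs = [dict(zip(search_space, values)) for values in product(*search_space.values())]

print(len(configs))  # 36
print(configs[0])    # {'rnn_module': 'RNN', 'hidden_size': 64, 'num_layers': 1, 'dropout': 0.0}
```

Note that PyTorch applies recurrent dropout between layers only, so a non-zero `dropout` with `num_layers = 1` has no effect and triggers a warning.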

In [10]:
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size, batch_size, rnn_module = "RNN", hidden_size = 64, num_layers = 1, dropout = 0):
        super(Model, self).__init__()
        self.input_size = input_size
        self.rnn_module = rnn_module
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        if rnn_module == "RNN":
            self.rnn = nn.RNN(input_size = input_size, hidden_size = hidden_size, num_layers = num_layers, dropout = dropout)
        elif rnn_module == "LSTM":
            self.rnn = nn.LSTM(input_size = input_size, hidden_size = hidden_size, num_layers = num_layers, dropout = dropout)
        elif rnn_module == "GRU":
            self.rnn = nn.GRU(input_size = input_size, hidden_size = hidden_size, num_layers = num_layers, dropout = dropout)
        else:
            raise ValueError("Unknown rnn_module: {}".format(rnn_module))

        self.output = nn.Linear(hidden_size, input_size)

    def forward(self, input, hidden):
        output = input.view(1, -1, self.input_size)
        output, hidden = self.rnn(output, hidden)
        output = self.output(output[0])
        return output, hidden

    def initHidden(self, batch_size):
        # initialize the hidden state with zeros (an LSTM also needs a cell state)
        if self.rnn_module == "LSTM":
            return torch.zeros(self.num_layers, batch_size, self.hidden_size), torch.zeros(
                self.num_layers, batch_size, self.hidden_size)
        else:
            return torch.zeros(self.num_layers, batch_size, self.hidden_size)

We now need to define a loss and optimizer.

In [11]:
import torch.optim as optim

loss_function = nn.CrossEntropyLoss()
optimizer_function = optim.Adam
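
`nn.CrossEntropyLoss` takes raw scores (logits) and a target index, and returns the negative log-probability that the softmax assigns to the target. The plain-Python sketch below computes the same quantity for a single prediction, mainly to give a reference point: a model that guesses uniformly over the 99 characters has a loss of ln(99). The function name `cross_entropy` is an assumption of this example.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one prediction: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


n_chars = 99  # number of unique characters reported earlier
uniform_loss = cross_entropy([0.0] * n_chars, target=0)
print(uniform_loss)            # ln(99), about 4.595
print(math.exp(uniform_loss))  # perplexity of a uniform guess, about 99
```

A loss well below 4.595 therefore means the model does better than uniform guessing, and `exp(loss)` can be read as the effective number of characters it hesitates between.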

We can now train the model.

In [12]:
from tqdm.notebook import trange as tnrange  # tqdm.tnrange is deprecated

# Define hyper-parameters
rnn_module = "GRU"
hidden_size = 128
num_layers = 3
dropout = 0.1
epochs = 100
batches_per_epoch = 300
sequence_per_batch = 8
char_per_sequence = 150

# Build the NN
model = Model(len(data_dictionary), sequence_per_batch, rnn_module, hidden_size, num_layers, dropout)
hidden = model.initHidden(sequence_per_batch)

# Use GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device = torch.device("cpu")   # for some reason, it goes much faster in my experiments
train_data = train_data.to(device)
train_label = train_label.to(device)
valid_data = valid_data.to(device)
valid_label = valid_label.to(device)
model.to(device)
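# The LSTM hidden state is a (h, c) tuple, so move each tensor separately
hidden = tuple(h.to(device) for h in hidden) if rnn_module == "LSTM" else hidden.to(device)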

# Define optimizer
optimizer = optimizer_function(model.parameters())

# Load data
training_data = TrainingData(train_data, train_label, sequence_per_batch, char_per_sequence)

valid_length = len(valid_data)

for epoch in tnrange(epochs):
    train_loss = 0   # training loss
    valid_loss = 0   # validation loss
    
    # Training of one epoch
    model.train()
    for i in tnrange(batches_per_epoch):
        
        # Get a batch of sequences
        input_vals, label_vals = training_data.next_batch()

        # Detach hidden layer and reset gradients
        if rnn_module == "LSTM":
            tuple(h.detach_() for h in hidden)
        else:
            hidden.detach_()
        optimizer.zero_grad()
        
        # Forward pass and calculate loss
        loss_sequence = torch.zeros(1, device=device)
        for (input_val, label_val) in zip(input_vals, label_vals):
            output, hidden = model(input_val, hidden)
            loss = loss_function(output, label_val.view(-1))
            loss_sequence += loss
            
        # Backward propagation and weight update
        loss_sequence.backward()
        optimizer.step()
        
        train_loss += loss_sequence.item() / batches_per_epoch / char_per_sequence
        
    # Calculate validation loss
    with torch.no_grad():
        model.eval()
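        hidden_valid = model.initHidden(1)
        if rnn_module == "LSTM":
            hidden_valid = tuple(h.to(device) for h in hidden_valid)
        else:
            hidden_valid = hidden_valid.to(device)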
        for i in range(valid_length-1):
            input_val = valid_data[i]
            label_val = valid_label[i].view(1)
            output, hidden_valid = model(input_val, hidden_valid)
            loss = loss_function(output, label_val)
            valid_loss += loss.item() / (valid_length - 1)
        
    print("Epoch {} - Training loss {} - Validation loss {}".format(epoch+1, train_loss, valid_loss))

Epoch 1 - Training loss 3.1934996656629773 - Validation loss 3.144932621842185
Epoch 2 - Training loss 2.8241447692871127 - Validation loss 2.4369557873105694
Epoch 3 - Training loss 2.219004919433594 - Validation loss 2.0687018499885386
Epoch 4 - Training loss 1.979423703342013 - Validation loss 1.8893669783129634
Epoch 5 - Training loss 1.8262373640272354 - Validation loss 1.7676303601661845
Epoch 6 - Training loss 1.7440616529676654 - Validation loss 1.6796516327052655
Epoch 7 - Training loss 1.6771019402398006 - Validation loss 1.617867936727862
Epoch 8 - Training loss 1.6414653625488282 - Validation loss 1.580928250283886
Epoch 9 - Training loss 1.6024891991509325 - Validation loss 1.5495697640584145
Epoch 10 - Training loss 1.5644219733344191 - Validation loss 1.517771163837093
Epoch 11 - Training loss 1.5249412611219633 - Validation loss 1.49103122921999
Epoch 12 - Training loss 1.476718608940971 - Validation loss 1.467593610196495
Epoch 13 - Training loss 1.488179114108616 - Validation loss 1.4426441057710506
Epoch 14 - Training loss 1.4606490607367626 - Validation loss 1.4251791901414947
Epoch 15 - Training loss 1.4658463846842447 - Validation loss 1.4124646140532293
Epoch 16 - Training loss 1.4516935129801423 - Validation loss 1.402943426306773
Epoch 17 - Training loss 1.4215876363118485 - Validation loss 1.3884444893026875
Epoch 18 - Training loss 1.4112147203233505 - Validation loss 1.383614418471269
Epoch 19 - Training loss 1.3705495961507166 - Validation loss 1.3699844603245122
Epoch 20 - Training loss 1.3904286878797738 - Validation loss 1.3515454359360024
Epoch 21 - Training loss 1.3808272043863943 - Validation loss 1.3518721754932794
Epoch 22 - Training loss 1.4023079586452905 - Validation loss 1.3543114321080731
Epoch 23 - Training loss 1.3781555331759994 - Validation loss 1.3424739416736393
Epoch 24 - Training loss 1.358474062093099 - Validation loss 1.330604566974843
Epoch 25 - Training loss 1.3532692342122405 - Validation loss 1.3349498071905073
Epoch 26 - Training loss 1.3221237708197697 - Validation loss 1.328996052717084
Epoch 27 - Training loss 1.3399309380425355 - Validation loss 1.3089096201311796
Epoch 28 - Training loss 1.3383069573296447 - Validation loss 1.310190858221341
Epoch 29 - Training loss 1.3629359188503698 - Validation loss 1.3080112175446816
Epoch 30 - Training loss 1.3396507527669272 - Validation loss 1.3078746282498113
Epoch 31 - Training loss 1.3195872148301877 - Validation loss 1.296608580798723
Epoch 32 - Training loss 1.3151345499674465 - Validation loss 1.2977992571244428
Epoch 33 - Training loss 1.2923541120741104 - Validation loss 1.2932369851328658
Epoch 34 - Training loss 1.3075491939968522 - Validation loss 1.2839504947331157
Epoch 35 - Training loss 1.3132049391004783 - Validation loss 1.2867629952271487
Epoch 36 - Training loss 1.3344758795844185 - Validation loss 1.284088268614396
Epoch 37 - Training loss 1.3123813540988498 - Validation loss 1.2875752035905141
Epoch 38 - Training loss 1.2959437035454662 - Validation loss 1.275291035691262
Epoch 39 - Training loss 1.2872289950900608 - Validation loss 1.2777727797802245
Epoch 40 - Training loss 1.2749268385145396 - Validation loss 1.2717064864920355
Epoch 41 - Training loss 1.2841241187201609 - Validation loss 1.2655916972563954
Epoch 42 - Training loss 1.2956213199191617 - Validation loss 1.2669438933420338
Epoch 43 - Training loss 1.3156546976725259 - Validation loss 1.265419060418078
Epoch 44 - Training loss 1.289630306667751 - Validation loss 1.2672315450078917
Epoch 45 - Training loss 1.2744547732883025 - Validation loss 1.259761225271879
Epoch 46 - Training loss 1.2650273841010193 - Validation loss 1.2621033526211791
Epoch 47 - Training loss 1.2617711903889974 - Validation loss 1.2588547977734135
Epoch 48 - Training loss 1.264489363945854 - Validation loss 1.261624020184742
Epoch 49 - Training loss 1.2828120853000209 - Validation loss 1.25296979793596
Epoch 50 - Training loss 1.303636866929795 - Validation loss 1.2518428516919111
Epoch 51 - Training loss 1.2709747802734381 - Validation loss 1.2525799082081723
Epoch 52 - Training loss 1.2596803198920352 - Validation loss 1.2519142943243735
Epoch 53 - Training loss 1.249829967244466 - Validation loss 1.2465428006295782
Epoch 54 - Training loss 1.2529064341227207 - Validation loss 1.2581604392830124
Epoch 55 - Training loss 1.2499294630262585 - Validation loss 1.2410503541695137
Epoch 56 - Training loss 1.2738812313503682 - Validation loss 1.242805440968561
Epoch 57 - Training loss 1.2871165456136051 - Validation loss 1.246214343404692
Epoch 58 - Training loss 1.2588784756130642 - Validation loss 1.2417222346751418
Epoch 59 - Training loss 1.2468170928955085 - Validation loss 1.2397848702914283
Epoch 60 - Training loss 1.2350561974419496 - Validation loss 1.2393035695142633
Epoch 61 - Training loss 1.244001718478732 - Validation loss 1.242062830952669
Epoch 62 - Training loss 1.2376413760715061 - Validation loss 1.2330759368005266
Epoch 63 - Training loss 1.2668542283799913 - Validation loss 1.237940417049295
Epoch 64 - Training loss 1.2788582838270406 - Validation loss 1.2308618418388175
Epoch 65 - Training loss 1.246322890557183 - Validation loss 1.241011555536764
Epoch 66 - Training loss 1.2388984425862635 - Validation loss 1.2322533950256345
Epoch 67 - Training loss 1.2204945475260418 - Validation loss 1.2329863582644784
Epoch 68 - Training loss 1.239317109849718 - Validation loss 1.2276800868253328
Epoch 69 - Training loss 1.2315746273464623 - Validation loss 1.219639256490645
Epoch 70 - Training loss 1.26089594285753 - Validation loss 1.2244549242950085
Epoch 71 - Training loss 1.265698405626085 - Validation loss 1.2244146735186021
Epoch 72 - Training loss 1.2385551415337452 - Validation loss 1.2238493828052106
Epoch 73 - Training loss 1.23040990431044 - Validation loss 1.2234954701819012
Epoch 74 - Training loss 1.2108778408474392 - Validation loss 1.223876023382189
Epoch 75 - Training loss 1.2311749535454661 - Validation loss 1.2197631905904485
Epoch 76 - Training loss 1.2291939056396484 - Validation loss 1.2142589009209432
Epoch 77 - Training loss 1.253682006157769 - Validation loss 1.222154915685349
Epoch 78 - Training loss 1.249238162231446 - Validation loss 1.2175382773269698
Epoch 79 - Training loss 1.2334407148573125 - Validation loss 1.2242471641810702
Epoch 80 - Training loss 1.2214918721516934 - Validation loss 1.2208462882674864
Epoch 81 - Training loss 1.2034132222493497 - Validation loss 1.2213514759487702
Epoch 82 - Training loss 1.22714400431315 - Validation loss 1.2093374747230177
Epoch 83 - Training loss 1.227272809176975 - Validation loss 1.212153685963999
Epoch 84 - Training loss 1.2464317338731556 - Validation loss 1.212326174549108
Epoch 85 - Training loss 1.239996100870768 - Validation loss 1.214297593110903
Epoch 86 - Training loss 1.2252778096516939 - Validation loss 1.2112482273131657
Epoch 87 - Training loss 1.2202845262315545 - Validation loss 1.2105570511581836
Epoch 88 - Training loss 1.191255839369033 - Validation loss 1.214684119013765
Epoch 89 - Training loss 1.2249811720106338 - Validation loss 1.2112199786615747
Epoch 90 - Training loss 1.2213348337809238 - Validation loss 1.2091000313775682
Epoch 91 - Training loss 1.242675571356879 - Validation loss 1.20865153225834
Epoch 92 - Training loss 1.2329669365776916 - Validation loss 1.2104571910053834
Epoch 93 - Training loss 1.2132219936794708 - Validation loss 1.2066833318590697
Epoch 94 - Training loss 1.2176156453450515 - Validation loss 1.2112370117047715
Epoch 95 - Training loss 1.1904776173909508 - Validation loss 1.2069616290073273
Epoch 96 - Training loss 1.2138307980007594 - Validation loss 1.2009271850601106
Epoch 97 - Training loss 1.2163560499403212 - Validation loss 1.2052458394141907
Epoch 98 - Training loss 1.2454711574978303 - Validation loss 1.204992928874274
Epoch 99 - Training loss 1.2214428853352868 - Validation loss 1.2097688813228358
Epoch 100 - Training loss 1.2097244310167101 - Validation loss 1.2023583676588478



We make sure to save the trained model.

In [13]:
# Save model
PATH_MODEL = "model.pt"
torch.save(model, PATH_MODEL)

# Load model
model = torch.load(PATH_MODEL)
model.eval()

Model(
  (rnn): GRU(99, 128, num_layers=3, dropout=0.1)
  (output): Linear(in_features=128, out_features=99, bias=True)
)
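
`torch.save(model, ...)` pickles the whole object, which ties the file to the exact class definition and module layout. PyTorch's documentation recommends saving the `state_dict` instead and rebuilding the architecture before loading. A minimal sketch of that approach follows; `TinyModel` is a hypothetical stand-in for the `Model` class above, and the temporary path is only for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    # Hypothetical stand-in for the notebook's Model, kept small for the example
    def __init__(self, n_chars=5, hidden_size=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=n_chars, hidden_size=hidden_size)
        self.output = nn.Linear(hidden_size, n_chars)


model = TinyModel()
path = os.path.join(tempfile.mkdtemp(), "model_state.pt")

# Save only the parameters, not the pickled object
torch.save(model.state_dict(), path)

# Rebuild the architecture first, then load the parameters into it
restored = TinyModel()
restored.load_state_dict(torch.load(path))
restored.eval()  # switch off dropout before inference
```

With this approach the hyper-parameters (module type, hidden size, number of layers) have to be stored separately, since they are needed to rebuild the model before loading its weights.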

## Test the model

We finally test the model by predicting a few characters.

In [16]:
from numpy import random

with torch.no_grad():
    
    # Define a sequence of characters to initialize the hidden states
    init_chars = "The "

    init_data = torch.LongTensor(len(init_chars))
    for i, c in enumerate(init_chars):
        init_data[i] = data_dictionary.char2idx[c]

    # Transform into one-hot
    init_data = torch.zeros(len(init_data), len(data_dictionary)).scatter_(1, init_data.unsqueeze(-1), 1)

    # Initialize hidden layer and feed sequence of characters to the model
    hidden = model.initHidden(1)
    for init_char in init_data:
        output, hidden = model(init_char, hidden)

    # Predict next characters one at a time
    number_chars = 500
    chars = init_chars
    for _ in range(number_chars):

        # Calculate probability distribution of outputs with a temperature of 0.5
        prob = nn.Softmax(1)(output/0.5).squeeze().numpy()

        # Sample from outputs
        output_idx = random.choice(len(prob), p = prob)

        # Extract predicted char
        predicted_char = data_dictionary.idx2char[output_idx]
        chars += predicted_char

        # Transform predicted char into one-hot vector
        output_idx = torch.LongTensor([[output_idx]])
        next_input = torch.zeros(len(output_idx), len(data_dictionary)).scatter_(1, output_idx, 1)

        # Feed into NN to predict next char
        output, hidden = model(next_input, hidden)

    # Print predicted sequence
    print("Initializing sequence:", init_chars)
    print("Predicted sequence:", chars)    

Initializing sequence: The 
Predicted sequence: The the
traces as they expressed in the same sound of the count with the
abbé. A man who had not to him at the reason that he was going
to the tone of the count; but the right protector for his father is
the government of a moment for himself.

The major of the traveller endeavored on the morning to see a long in
a grave of the painful direction, and the day returned to the count had a
door of the storm in the baroness had that he had the bright of
the less indebation of the short way the next deman
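
The sampling step divides the model output by a temperature of 0.5 before the softmax. The plain-Python sketch below shows the effect on a toy set of scores: a lower temperature concentrates the probability on the most likely character, which makes the generated text more conservative. The function name and the toy logits are assumptions of this example.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    # Dividing by a temperature below 1 widens the gaps between scores
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy scores for three characters
for temperature in (1.0, 0.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

# Draw one index from the sharpened distribution, as random.choice(..., p=prob) does above
rng = random.Random(0)
probs = softmax_with_temperature(logits, 0.5)
sampled_idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
print(sampled_idx)
```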


The generated text looks plausible, so the model seems to work. The next step is optimization.

## Optimization

At this point we would want to fine-tune by implementing:

* monitoring features (plot graph in real time)

* comparison of past experiments

However, building these ourselves would be a waste of time. [Weights & Biases](https://www.wandb.com/) does this for us more quickly and provides additional insight tools.

We have now completed our prototype and it is time to move our code to a separate Python file (or several, for the more organized). This helps clean up the code and makes it possible to run multiple experiments at the same time (locally or remotely).
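
Once the code lives in a script, the hyper-parameters of the training cell can become command-line options, so that several experiments can be launched with different settings. The sketch below is one possible layout using `argparse`; the flag names are assumptions, and the defaults simply mirror the values used in this notebook.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical command-line interface for a training script
    parser = argparse.ArgumentParser(description="Train a character-level RNN")
    parser.add_argument("--rnn-module", choices=["RNN", "LSTM", "GRU"], default="GRU")
    parser.add_argument("--hidden-size", type=int, default=128)
    parser.add_argument("--num-layers", type=int, default=3)
    parser.add_argument("--dropout", type=float, default=0.1)
    parser.add_argument("--epochs", type=int, default=100)
    return parser.parse_args(argv)


# Passing a list makes the parser easy to try out without a real command line
args = parse_args(["--rnn-module", "LSTM", "--hidden-size", "256"])
print(args.rnn_module, args.hidden_size, args.num_layers)  # LSTM 256 3
```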

Refer to my tutorial for more details.