In [1]:
import torch
import torch.nn as nn

# Creating an RNN model for text generation

At PyBooks, you've been tasked with developing an algorithm that can perform text generation. The project involves auto-completion of book names. To kickstart this project, you decide to experiment with a Recurrent Neural Network (RNN). This way, you can understand the nuances of RNNs before moving to more complex models.

The data variable has been initialized with an excerpt from Alice's Adventures in Wonderland by Lewis Carroll.

* Include an RNN layer and a linear layer in the RNNmodel class.
* Instantiate the RNN model with an input size equal to the length of chars, a hidden size of 16, and an output size equal to the length of chars.

In [20]:
# Excerpt from Alice's Adventures in Wonderland
data = "Alice was beginning to get very tired having nothing to do."
# sorted() keeps the character-to-index assignment identical across runs
chars = sorted(set(data))

char_to_ix = {char: i for i, char in enumerate(chars)}
ix_to_char = {i: char for i, char in enumerate(chars)}
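To see what the two dictionaries do, the sketch below encodes a short string to indices and decodes it back. The helper names `encode` and `decode` are hypothetical and not part of the exercise; `sorted` is used here only so that the indices are the same on every run.

```python
# Excerpt used in the exercise
data = "Alice was beginning to get very tired having nothing to do."

# Build the vocabulary and the two lookup tables
chars = sorted(set(data))
char_to_ix = {char: i for i, char in enumerate(chars)}
ix_to_char = {i: char for i, char in enumerate(chars)}

def encode(text):
    """Map each character to its integer index (hypothetical helper)."""
    return [char_to_ix[ch] for ch in text]

def decode(indices):
    """Map integer indices back to a string (hypothetical helper)."""
    return "".join(ix_to_char[i] for i in indices)

encoded = encode("tired")
print(len(chars))        # vocabulary size: 20
print(encoded)           # one integer per character
print(decode(encoded))   # tired
```

Any character that does not occur in the excerpt raises a `KeyError` in `encode`, because the vocabulary is built from this one sentence only.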

In [21]:
# Include an RNN layer and linear layer in RNNmodel class
class RNNmodel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNNmodel, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

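    def forward(self, x):
        # Initial hidden state: (num_layers, batch, hidden_size), on the same device as x
        h0 = torch.zeros(1, x.size(0), self.hidden_size, device=x.device)
        # out holds the hidden state at every time step: (batch, seq_len, hidden_size)
        out, _ = self.rnn(x, h0)
        # Map the last time step's hidden state to one score per character
        out = self.fc(out[:, -1, :])
        return out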

# Instantiate the RNN model
model = RNNmodel(len(chars), 16, len(chars))
print(model)

RNNmodel(
  (rnn): RNN(20, 16, batch_first=True)
  (fc): Linear(in_features=16, out_features=20, bias=True)
)


With this RNN model in place, you have taken the first step towards a text generation system. The fully connected layer maps the RNN's last hidden state to one score per character, which is what lets the model predict the next element in a sequence. Let's go on to train and evaluate the model. Keep going!
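To make the data flow through the two layers concrete, the following sketch traces tensor shapes through an `nn.RNN` followed by an `nn.Linear`. The layer sizes come from the printed model; the batch size of 4 and sequence length of 5 are arbitrary values assumed for illustration.

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 20, 16   # sizes taken from the printed model

rnn = nn.RNN(vocab_size, hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, vocab_size)

# Dummy batch: 4 sequences, 5 time steps each (assumed sizes for illustration)
x = torch.zeros(4, 5, vocab_size)
h0 = torch.zeros(1, x.size(0), hidden_size)   # (num_layers, batch, hidden)

out, hn = rnn(x, h0)
print(out.shape)             # hidden state at every time step: (batch, seq_len, hidden)
print(hn.shape)              # final hidden state: (num_layers, batch, hidden)

last_step = out[:, -1, :]    # keep only the last time step: (batch, hidden)
logits = fc(last_step)       # one score per character: (batch, vocab)
print(logits.shape)
```

The logits are unnormalized scores; applying `torch.softmax(logits, dim=1)` would turn each row into a probability distribution over the 20 characters.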

# Preparing input and target data

In [22]:
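# Each character is an input; the character that follows it is the target
inputs = [char_to_ix[ch] for ch in data[:-1]]
targets = [char_to_ix[ch] for ch in data[1:]]

# Shape (N, 1): N sequences of length 1
inputs = torch.tensor(inputs, dtype=torch.long).view(-1, 1)
# One-hot encode to shape (N, 1, len(chars)) and cast to float for nn.RNN
inputs = nn.functional.one_hot(inputs, num_classes=len(chars)).float()
# CrossEntropyLoss expects class indices of dtype long
targets = torch.tensor(targets, dtype=torch.long)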
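The shift-by-one pairing and the resulting tensor shapes are easier to see on a tiny example. The sketch below repeats the same steps on the toy string `"abca"`, which is an assumption chosen for illustration; the names `text`, `vocab` and `to_ix` are likewise hypothetical.

```python
import torch
import torch.nn as nn

text = "abca"                       # toy string, assumed for illustration
vocab = sorted(set(text))           # ['a', 'b', 'c']
to_ix = {ch: i for i, ch in enumerate(vocab)}

# Inputs are all characters but the last; targets are the same text shifted by one
inputs = torch.tensor([to_ix[ch] for ch in text[:-1]], dtype=torch.long).view(-1, 1)
targets = torch.tensor([to_ix[ch] for ch in text[1:]], dtype=torch.long)

one_hot = nn.functional.one_hot(inputs, num_classes=len(vocab)).float()

print(inputs.view(-1).tolist())     # [0, 1, 2]  -> 'a', 'b', 'c'
print(targets.tolist())             # [1, 2, 0]  -> 'b', 'c', 'a'
print(one_hot.shape)                # (batch=3, seq_len=1, features=3)
```

With `batch_first=True`, the `view(-1, 1)` call makes every character its own sequence of length 1. Combined with a hidden state that starts at zero on each forward pass, this means a prediction depends only on the current character, not on the characters before it.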

# Text generation using an RNN - training and generation

The team at PyBooks now wants you to train and test the RNN model, which predicts the next character in a sequence from the character it is given, as a basis for auto-completion of book names. This project will help the team further develop models for text completion.

* Instantiate the loss function that will be used to compute the error of the model.
* Instantiate the optimizer from PyTorch's optimization module.
* Run the training loop: set the model to train mode and zero the gradients before each optimization step.
* After training, switch the model to evaluation mode and test it on a sample input.

In [23]:
# Instantiate the loss function
criterion = nn.CrossEntropyLoss()
# Instantiate the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

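# Train the model
model.train()  # train mode; set once, since nothing in the loop changes it
for epoch in range(10000):
    outputs = model(inputs)             # forward pass: (N, len(chars)) logits
    loss = criterion(outputs, targets)
    optimizer.zero_grad()               # clear gradients from the previous step
    loss.backward()                     # backpropagate
    optimizer.step()                    # update the parameters
    if (epoch+1) % 1000 == 0:
        print(f'Epoch {epoch+1}/10000, Loss: {loss.item()}')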


Epoch 1000/10000, Loss: 2.5971486568450928
Epoch 2000/10000, Loss: 2.415712356567383
Epoch 3000/10000, Loss: 2.1809563636779785
Epoch 4000/10000, Loss: 1.9657992124557495
Epoch 5000/10000, Loss: 1.7834405899047852
Epoch 6000/10000, Loss: 1.6260768175125122
Epoch 7000/10000, Loss: 1.4973680973052979
Epoch 8000/10000, Loss: 1.3980722427368164
Epoch 9000/10000, Loss: 1.3241276741027832
Epoch 10000/10000, Loss: 1.2701470851898193
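As a reference point for these numbers, a model that guessed uniformly among the 20 characters would have a cross-entropy of ln(20). The short calculation below computes that baseline and converts the final logged loss into a perplexity; the variable names are illustrative.

```python
import math

vocab_size = 20                        # number of distinct characters in the excerpt
uniform_loss = math.log(vocab_size)    # cross-entropy of a uniform guess, in nats
print(round(uniform_loss, 4))          # 2.9957

final_loss = 1.2701470851898193        # last value in the training log
# exp(loss) is the perplexity: the effective number of equally likely choices
perplexity = math.exp(final_loss)
print(round(perplexity, 2))
```

The first logged loss (about 2.60) is already below the uniform baseline, and the final value corresponds to the model hesitating between roughly three to four characters on average rather than twenty.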


In [26]:
# Test the model
model.eval()
test_char = 'b'
test_input = torch.tensor(char_to_ix[test_char]).view(-1, 1)
test_input = nn.functional.one_hot(test_input, num_classes=len(chars)).float()
# Gradients are not needed for inference
with torch.no_grad():
    predicted_output = model(test_input)
predicted_char_ix = torch.argmax(predicted_output, 1).item()
print(f"Test Input: '{test_char}', Predicted Output: '{ix_to_char[predicted_char_ix]}'")

Test Input: 'b', Predicted Output: 'e'
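The single-character test can be extended to produce several characters by feeding each prediction back in as the next input. What follows is a minimal sketch assuming greedy decoding (always taking the highest-scoring character); the `generate` helper is hypothetical, and the model in the block is an untrained stand-in so that the example runs on its own.

```python
import torch
import torch.nn as nn

data = "Alice was beginning to get very tired having nothing to do."
chars = sorted(set(data))
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

class RNNmodel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h0 = torch.zeros(1, x.size(0), self.hidden_size)
        out, _ = self.rnn(x, h0)
        return self.fc(out[:, -1, :])

def generate(model, start_char, length):
    """Greedy decoding: feed the latest character back in (hypothetical helper)."""
    model.eval()
    result = start_char
    current = char_to_ix[start_char]
    with torch.no_grad():
        for _ in range(length):
            # One-hot encode the current character: shape (1, 1, vocab)
            x = nn.functional.one_hot(
                torch.tensor([[current]]), num_classes=len(chars)
            ).float()
            current = torch.argmax(model(x), dim=1).item()
            result += ix_to_char[current]
    return result

model = RNNmodel(len(chars), 16, len(chars))   # untrained stand-in (assumption)
text = generate(model, "b", 10)
print(text)   # arbitrary characters until the model is trained
```

Because each step sees only one character and starts from a zero hidden state, the next character is a fixed function of the current one, so greedy output from this setup eventually repeats in a cycle. Carrying the hidden state across steps, or sampling from the softmax distribution instead of taking the argmax, are common ways to get more varied text.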
