<a href="https://colab.research.google.com/github/dsercam/TC033/blob/main/TC4033_Activity4_Team44.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <font color='darkorange'><b> TC 5033 :: Advanced Machine Learning Methods </b> </font>
### <font color='darkgray'><b> Activity 4: Building a Simple LSTM Text Generator using WikiText-2</b></font></br></br>
### <font color='darkblue'><b>  Group 44 </b></font>
***Dante Rodrigo Serna Camarillo A01182676***</br>
***Axel Alejandro Tlatoa Villavicencio A01363351***</br>
***Carlos Roberto Torres Ferguson A01215432***</br>
***Felipe de Jesús Gastélum Lizárraga A01114918***


## TC 5033
### Text Generation

<br>

#### Activity 4: Building a Simple LSTM Text Generator using WikiText-2
<br>

- Objective:
    - Gain a fundamental understanding of Long Short-Term Memory (LSTM) networks.
    - Develop hands-on experience with sequence data processing and text generation in PyTorch. Given the simplicity of the model, the amount of data, and the available compute resources, the text you generate will not replace ChatGPT, and the results most likely will not make much sense. The purpose is purely academic: to understand text generation using RNNs.
    - Enhance code comprehension and documentation skills by commenting on provided starter code.
    
<br>

- Instructions:
    - Code Understanding: Begin by thoroughly reading and understanding the code. Comment each section/block of the provided code to demonstrate your understanding. For this, you are encouraged to add cells with experiments to improve your understanding.

    - Model Overview: The starter code includes an LSTM model setup for sequence data processing. Familiarize yourself with the model architecture and its components. Once you are familiar with the provided model, feel free to change the model to experiment.

    - Training Function: Implement a function to train the LSTM model on the WikiText-2 dataset. This function should feed the training data into the model and perform backpropagation.

    - Text Generation Function: Create a function that accepts starting text (seed text) and a specified total number of words to generate. The function should use the trained model to generate a continuation of the input text.

    - Code Commenting: Ensure that all the provided starter code is well-commented. Explain the purpose and functionality of each section, indicating your understanding.

    - Submission: Submit your Jupyter Notebook with all sections completed and commented. Include a markdown cell with the full names of all contributing team members at the beginning of the notebook.
    
<br>

- Evaluation Criteria:
    - Code Commenting (60%): The clarity, accuracy, and thoroughness of comments explaining the provided code. You are suggested to use markdown cells for your explanations.

    - Training Function Implementation (20%): The correct implementation of the training function, which should effectively train the model.

    - Text Generation Functionality (10%): A working function is provided in comments. You are free to use it as long as you make sure to understand it; you may also improve it as you see fit. The minimum expected is to provide comments for the given function.

    - Conclusions (10%): Provide some final remarks specifying the differences you notice between this model and the one used for classification tasks. Also comment on changes you made to the model, hyperparameters, and any other information you consider relevant. Finally, please provide 3 examples of generated texts.



> <font color='darkorange'>If needed, we install the modules required to run on Colab. </font>

In [14]:
!pip install scikit-plot
!pip install torchdata
!pip install tokenizers
!pip install portalocker



In [29]:
import numpy as np
#PyTorch libraries
import torch
import torchtext
from torchtext.datasets import WikiText2
# Dataloader library
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.dataset import random_split
# Libraries to prepare the data
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator
from torchtext.data.functional import to_map_style_dataset
# neural layers
from torch import nn
from torch.nn import functional as F
import torch.optim as optim
from tqdm import tqdm

import random

>> The _device_ variable lets later cells run on the GPU when one is available and fall back to the CPU otherwise.

In [28]:
# Use GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device  # show the selected device; in this run it is the GPU

device(type='cuda')

### <font color="darkblue"> **1.1 Instantiate data sets** </font>

> We use _WikiText2()_ from _torchtext.datasets_ to get our three datasets</br>
>>> *train_dataset* -> first returned object, refers to our training data </br>
>>> *val_dataset* -> second returned object, refers to our validation data </br>
>>> *test_dataset* -> third returned object, refers to our test data

In [30]:
train_dataset, val_dataset, test_dataset = WikiText2()


> <font color='darkblue'> Define our tokeniser and a generator function that tokenizes each line of an iterable text data set. </font>

In [31]:
tokeniser = get_tokenizer('basic_english')
def yield_tokens(data):
    for text in data:  # for each line in the data set
        yield tokeniser(text)  # yield the list of tokens for that line


> <font color='darkblue'> Build our vocabulary using the tokens found in our training data set. </font><br>
> <font color='darkblue'> Add a set of special tokens: </font></br>
>>>  _\<unk>_ </br>
>>>  _\<pad>_ </br>
>>>  _\<bos>_ </br>
>>>  _\<eos>_ </br>
</br>
Because **\<unk>** is listed first among the special tokens, it gets index 0. The *set_default_index()* method then makes the vocabulary return that index for any token it does not contain.

In [32]:
# Build the vocabulary
vocab = build_vocab_from_iterator(yield_tokens(train_dataset), specials=["<unk>", "<pad>", "<bos>", "<eos>"])
# out-of-vocabulary tokens are mapped to the index of <unk> (0, since it is the first special)
vocab.set_default_index(vocab["<unk>"])
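To make the role of the special tokens and the default index concrete, here is a minimal plain-Python sketch. `TinyVocab` and `toy_vocab` are hypothetical names used only for illustration; this is not the torchtext `Vocab` class, and the frequency-based ordering is an assumption about how such a vocabulary is typically laid out.

```python
from collections import Counter


class TinyVocab:
    """Hypothetical stand-in that mimics a vocabulary with specials and a default index."""

    def __init__(self, token_lists, specials):
        counts = Counter(tok for toks in token_lists for tok in toks)
        # Assumed ordering: specials first, then corpus tokens by descending frequency.
        ordered = [t for t, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
        self.itos = list(specials) + [t for t in ordered if t not in specials]
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}
        self.default_index = None

    def set_default_index(self, index):
        self.default_index = index

    def __getitem__(self, token):
        if token in self.stoi:
            return self.stoi[token]
        if self.default_index is None:
            raise KeyError(token)  # no fallback configured
        return self.default_index  # unknown tokens fall back to this index


toy_vocab = TinyVocab([["the", "cat", "sat"], ["the", "dog"]],
                      specials=["<unk>", "<pad>", "<bos>", "<eos>"])
toy_vocab.set_default_index(toy_vocab["<unk>"])

print(toy_vocab["<unk>"])   # index of the first special token
print(toy_vocab["the"])     # a token seen in the toy corpus
print(toy_vocab["zebra"])   # an unseen token, resolved through the default index
```

Without the call to `set_default_index`, looking up an unseen token in this sketch raises an error, which is why the notebook sets the default right after building the vocabulary.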

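> Define the function that processes our data.
>> The function receives two parameters: an iterable text data set and the sequence length. </br>
>> It returns two tensors of shape (number of sequences, sequence length): the inputs, and the targets shifted one token ahead of the inputs.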

In [33]:
seq_length = 50
def data_process(raw_text_iter, seq_length = 50):
    # tokenize each item and convert it into a tensor of vocabulary indices
    data = [torch.tensor(vocab(tokeniser(item)), dtype=torch.long) for item in raw_text_iter]
    # drop empty tensors, then concatenate everything into one long token stream
    data = torch.cat(tuple(filter(lambda t: t.numel() > 0, data)))

    # Largest multiple of seq_length that still leaves one extra token for the shifted targets.
    # (Slicing with the negative remainder returns an empty tensor when the remainder is 0 or 1.)
    n = ((data.size(0) - 1) // seq_length) * seq_length

    # first tensor: the inputs, reshaped to (num_sequences, seq_length); the incomplete tail is dropped
    # second tensor: the targets, offset by one step from the inputs
    return (data[:n].view(-1, seq_length),
            data[1:n + 1].view(-1, seq_length))

# Create input/target tensors for the training, validation and test sets
x_train, y_train = data_process(train_dataset, seq_length)
x_val, y_val = data_process(val_dataset, seq_length)
x_test, y_test = data_process(test_dataset, seq_length)
# the processed data is now stored in the respective tensor variables
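The input/target relationship that `data_process` builds can be illustrated on a tiny token stream. The sketch below uses plain lists instead of tensors; `make_pairs` is a hypothetical helper written for this illustration, and the token ids are made up.

```python
def make_pairs(token_ids, seq_length):
    """Split a token stream into input rows and next-token target rows."""
    # Keep the largest multiple of seq_length that leaves one token for the shift.
    n = ((len(token_ids) - 1) // seq_length) * seq_length
    inputs = [token_ids[i:i + seq_length] for i in range(0, n, seq_length)]
    targets = [token_ids[i + 1:i + 1 + seq_length] for i in range(0, n, seq_length)]
    return inputs, targets


# Made-up stream of 11 token ids: 10, 11, ..., 20
toy_inputs, toy_targets = make_pairs(list(range(10, 21)), seq_length=4)
for row_in, row_out in zip(toy_inputs, toy_targets):
    print(row_in, "->", row_out)
```

Each target row is the input row moved one position to the right, so at every time step the model is asked to predict the token that follows the one it has just read.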

In [34]:
# Wrap each pair of input/target tensors in a TensorDataset (train, validation and test)
train_dataset = TensorDataset(x_train, y_train)
val_dataset = TensorDataset(x_val, y_val)
test_dataset = TensorDataset(x_test, y_test)

In [35]:
batch_size = 64  # choose a batch size that fits your computation resources
# Create the batches of data used during the training, validation and testing phases
# the batch size is given by the batch_size variable; drop_last=True discards the final incomplete batch
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
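As a rough check on what these loaders produce, the number of batches per epoch follows directly from the number of sequences and the batch size. The helper below is a hypothetical sketch, and the figure of 1,000 sequences is made up; the 640 batches per epoch come from the training log in this notebook.

```python
def batches_per_epoch(num_sequences, batch_size, drop_last=True):
    """Number of batches a loader yields for one pass over the data."""
    if drop_last:
        return num_sequences // batch_size   # the incomplete final batch is discarded
    return -(-num_sequences // batch_size)   # ceiling division keeps the partial batch


# Made-up example: 1,000 sequences with the notebook's batch size of 64
print(batches_per_epoch(1000, 64))
print(batches_per_epoch(1000, 64, drop_last=False))

# Tokens seen per training epoch, using the 640 batches reported in the training log
tokens_per_epoch = 640 * 64 * 50   # batches * batch_size * seq_length
print(tokens_per_epoch)
```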

In [36]:
# Define the LSTM model
# Feel free to experiment
class LSTMModel(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, num_layers):
        super(LSTMModel, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_size) # embedding table: one embed_size vector per vocabulary entry
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True) #long short term memory layer
        self.fc = nn.Linear(hidden_size, vocab_size) # linear output layer

    def forward(self, text, hidden):
        embeddings = self.embeddings(text)
        output, hidden = self.lstm(embeddings, hidden)
        decoded = self.fc(output)
        return decoded, hidden

    def init_hidden(self, batch_size):

        return (torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device),
                torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device))



vocab_size = len(vocab) # vocabulary size
emb_size = 100 # embedding size
neurons = 128 # size of the LSTM hidden state, i.e. number of hidden units per layer
num_layers = 1 # the number of nn.LSTM layers
model = LSTMModel(vocab_size, emb_size, neurons, num_layers) #Instantiate our model.
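To get a feel for where the parameters of this architecture sit, the sketch below counts them from the layer sizes. `lstm_model_param_count` is a hypothetical helper, and the vocabulary size of 10,000 is a made-up value, not the size of the WikiText-2 vocabulary built above. The LSTM term assumes the usual layout of four gates, each with input weights, recurrent weights, and two bias vectors.

```python
def lstm_model_param_count(vocab_size, embed_size, hidden_size, num_layers=1):
    """Count trainable parameters of an embedding -> LSTM -> linear model."""
    embedding = vocab_size * embed_size
    lstm = 0
    for layer in range(num_layers):
        input_size = embed_size if layer == 0 else hidden_size
        # 4 gates x (input weights + recurrent weights + two bias vectors)
        lstm += 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)
    linear = hidden_size * vocab_size + vocab_size  # weights plus bias
    return {"embedding": embedding, "lstm": lstm, "linear": linear,
            "total": embedding + lstm + linear}


# Made-up vocabulary size; embedding and hidden sizes match the notebook (100 and 128)
counts = lstm_model_param_count(vocab_size=10_000, embed_size=100, hidden_size=128)
for name, value in counts.items():
    print(f"{name:>9}: {value:,}")
```

With sizes like these, the embedding table and the output layer both scale with the vocabulary and hold most of the parameters, while the LSTM itself stays comparatively small.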


> Defined our training function.

In [37]:

def train(model, train_loader, loss_function, optimizer, epochs):
    '''
    Train the LSTM model on the WikiText-2 dataset.

    Parameters:
    - model: The LSTM model to be trained.
    - train_loader: DataLoader for the training dataset.
    - loss_function: The loss function to be used during training.
    - optimizer: The optimizer to use for updating model parameters.
    - epochs: Number of epochs to train the model.

    The training process includes the following steps:
    - Loop through the specified number of epochs.
    - In each epoch, loop through the training data.
    - Zero the gradients before each batch to prevent accumulation.
    - Place data (input and target) on the device (GPU/CPU).
    - Initialize hidden states for the LSTM.
    - Run the model and compute the loss.
    - Perform backpropagation and update parameters.
    - Print epoch-wise loss and other relevant information.
    '''

    model = model.to(device=device) # assign the model to the used device, GPU in our case
    model.train()  # set the model to training mode

    for epoch in range(epochs): #iterate over given epochs
        total_loss = 0 #reset epoch loss to zero

        for i, (data, targets) in enumerate(tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs}")):
            # Place data on the correct device
            data, targets = data.to(device), targets.to(device)

            # Initialize hidden states
            hidden = model.init_hidden(data.size(0))

            # Gradients to zero
            optimizer.zero_grad()

            # Forward pass: compute predictions and loss
            output, _ = model(data, hidden)
            loss = loss_function(output.view(-1, vocab_size), targets.view(-1)) #calculate loss using the provided function

            # Backward pass: compute gradient and update parameters
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        # Average loss for the epoch
        average_loss = total_loss / len(train_loader)
        print(f"Epoch {epoch+1}/{epochs}, Loss: {average_loss:.4f}")


> To call our training function:
>> - Instantiated the loss function to be used, CrossEntropyLoss in this case </br>
>> - Set our learning rate to 0.0005 </br>
>> - Defined 100 epochs to be used </br>
>> - Instantiated an Adam optimizer over the model parameters </br>
>>> Called our training function, providing our model, the training DataLoader, the instantiated loss function and optimizer, and the number of epochs to train for.


In [38]:
# Call the train function
loss_function = nn.CrossEntropyLoss()
lr = 0.0005
epochs = 100
# Set up the optimizer and loss function for training
optimizer = optim.Adam(model.parameters(), lr=lr)
# Train the model
train(model, train_loader, loss_function, optimizer, epochs)


Epoch 1/100: 100%|██████████| 640/640 [00:26<00:00, 24.45it/s]


Epoch 1/100, Loss: 7.0163


Epoch 2/100: 100%|██████████| 640/640 [00:26<00:00, 24.27it/s]


Epoch 2/100, Loss: 6.4059


Epoch 3/100: 100%|██████████| 640/640 [00:26<00:00, 24.38it/s]


Epoch 3/100, Loss: 6.1596


Epoch 4/100: 100%|██████████| 640/640 [00:25<00:00, 24.69it/s]


Epoch 4/100, Loss: 5.9856


Epoch 5/100: 100%|██████████| 640/640 [00:26<00:00, 24.54it/s]


Epoch 5/100, Loss: 5.8606


Epoch 6/100: 100%|██████████| 640/640 [00:26<00:00, 24.28it/s]


Epoch 6/100, Loss: 5.7622


Epoch 7/100: 100%|██████████| 640/640 [00:26<00:00, 24.40it/s]


Epoch 7/100, Loss: 5.6802


Epoch 8/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 8/100, Loss: 5.6085


Epoch 9/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 9/100, Loss: 5.5450


Epoch 10/100: 100%|██████████| 640/640 [00:26<00:00, 24.59it/s]


Epoch 10/100, Loss: 5.4877


Epoch 11/100: 100%|██████████| 640/640 [00:26<00:00, 24.60it/s]


Epoch 11/100, Loss: 5.4352


Epoch 12/100: 100%|██████████| 640/640 [00:26<00:00, 24.37it/s]


Epoch 12/100, Loss: 5.3866


Epoch 13/100: 100%|██████████| 640/640 [00:26<00:00, 24.08it/s]


Epoch 13/100, Loss: 5.3416


Epoch 14/100: 100%|██████████| 640/640 [00:26<00:00, 24.60it/s]


Epoch 14/100, Loss: 5.2993


Epoch 15/100: 100%|██████████| 640/640 [00:26<00:00, 24.40it/s]


Epoch 15/100, Loss: 5.2595


Epoch 16/100: 100%|██████████| 640/640 [00:26<00:00, 24.26it/s]


Epoch 16/100, Loss: 5.2220


Epoch 17/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 17/100, Loss: 5.1865


Epoch 18/100: 100%|██████████| 640/640 [00:26<00:00, 24.56it/s]


Epoch 18/100, Loss: 5.1522


Epoch 19/100: 100%|██████████| 640/640 [00:26<00:00, 24.46it/s]


Epoch 19/100, Loss: 5.1198


Epoch 20/100: 100%|██████████| 640/640 [00:26<00:00, 24.51it/s]


Epoch 20/100, Loss: 5.0885


Epoch 21/100: 100%|██████████| 640/640 [00:26<00:00, 24.47it/s]


Epoch 21/100, Loss: 5.0590


Epoch 22/100: 100%|██████████| 640/640 [00:26<00:00, 24.46it/s]


Epoch 22/100, Loss: 5.0304


Epoch 23/100: 100%|██████████| 640/640 [00:26<00:00, 24.30it/s]


Epoch 23/100, Loss: 5.0027


Epoch 24/100: 100%|██████████| 640/640 [00:26<00:00, 24.45it/s]


Epoch 24/100, Loss: 4.9763


Epoch 25/100: 100%|██████████| 640/640 [00:26<00:00, 24.06it/s]


Epoch 25/100, Loss: 4.9505


Epoch 26/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 26/100, Loss: 4.9259


Epoch 27/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 27/100, Loss: 4.9018


Epoch 28/100: 100%|██████████| 640/640 [00:26<00:00, 24.59it/s]


Epoch 28/100, Loss: 4.8781


Epoch 29/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 29/100, Loss: 4.8557


Epoch 30/100: 100%|██████████| 640/640 [00:25<00:00, 24.68it/s]


Epoch 30/100, Loss: 4.8333


Epoch 31/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 31/100, Loss: 4.8120


Epoch 32/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 32/100, Loss: 4.7903


Epoch 33/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 33/100, Loss: 4.7699


Epoch 34/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 34/100, Loss: 4.7497


Epoch 35/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 35/100, Loss: 4.7300


Epoch 36/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 36/100, Loss: 4.7106


Epoch 37/100: 100%|██████████| 640/640 [00:25<00:00, 24.69it/s]


Epoch 37/100, Loss: 4.6918


Epoch 38/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 38/100, Loss: 4.6731


Epoch 39/100: 100%|██████████| 640/640 [00:26<00:00, 24.61it/s]


Epoch 39/100, Loss: 4.6549


Epoch 40/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 40/100, Loss: 4.6372


Epoch 41/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 41/100, Loss: 4.6193


Epoch 42/100: 100%|██████████| 640/640 [00:25<00:00, 24.68it/s]


Epoch 42/100, Loss: 4.6020


Epoch 43/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 43/100, Loss: 4.5850


Epoch 44/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 44/100, Loss: 4.5684


Epoch 45/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 45/100, Loss: 4.5517


Epoch 46/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 46/100, Loss: 4.5359


Epoch 47/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 47/100, Loss: 4.5199


Epoch 48/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 48/100, Loss: 4.5043


Epoch 49/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 49/100, Loss: 4.4892


Epoch 50/100: 100%|██████████| 640/640 [00:25<00:00, 24.62it/s]


Epoch 50/100, Loss: 4.4739


Epoch 51/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 51/100, Loss: 4.4592


Epoch 52/100: 100%|██████████| 640/640 [00:26<00:00, 24.60it/s]


Epoch 52/100, Loss: 4.4446


Epoch 53/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 53/100, Loss: 4.4305


Epoch 54/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 54/100, Loss: 4.4165


Epoch 55/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 55/100, Loss: 4.4026


Epoch 56/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 56/100, Loss: 4.3891


Epoch 57/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 57/100, Loss: 4.3761


Epoch 58/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 58/100, Loss: 4.3629


Epoch 59/100: 100%|██████████| 640/640 [00:25<00:00, 24.63it/s]


Epoch 59/100, Loss: 4.3497


Epoch 60/100: 100%|██████████| 640/640 [00:25<00:00, 24.62it/s]


Epoch 60/100, Loss: 4.3373


Epoch 61/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 61/100, Loss: 4.3250


Epoch 62/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 62/100, Loss: 4.3129


Epoch 63/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 63/100, Loss: 4.3007


Epoch 64/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 64/100, Loss: 4.2890


Epoch 65/100: 100%|██████████| 640/640 [00:26<00:00, 24.61it/s]


Epoch 65/100, Loss: 4.2772


Epoch 66/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 66/100, Loss: 4.2657


Epoch 67/100: 100%|██████████| 640/640 [00:26<00:00, 24.60it/s]


Epoch 67/100, Loss: 4.2544


Epoch 68/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 68/100, Loss: 4.2435


Epoch 69/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 69/100, Loss: 4.2325


Epoch 70/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 70/100, Loss: 4.2216


Epoch 71/100: 100%|██████████| 640/640 [00:25<00:00, 24.68it/s]


Epoch 71/100, Loss: 4.2110


Epoch 72/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 72/100, Loss: 4.2004


Epoch 73/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 73/100, Loss: 4.1903


Epoch 74/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 74/100, Loss: 4.1802


Epoch 75/100: 100%|██████████| 640/640 [00:25<00:00, 24.62it/s]


Epoch 75/100, Loss: 4.1702


Epoch 76/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 76/100, Loss: 4.1601


Epoch 77/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 77/100, Loss: 4.1505


Epoch 78/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 78/100, Loss: 4.1409


Epoch 79/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 79/100, Loss: 4.1313


Epoch 80/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 80/100, Loss: 4.1220


Epoch 81/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 81/100, Loss: 4.1127


Epoch 82/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 82/100, Loss: 4.1035


Epoch 83/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 83/100, Loss: 4.0945


Epoch 84/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 84/100, Loss: 4.0856


Epoch 85/100: 100%|██████████| 640/640 [00:25<00:00, 24.67it/s]


Epoch 85/100, Loss: 4.0769


Epoch 86/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 86/100, Loss: 4.0683


Epoch 87/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 87/100, Loss: 4.0596


Epoch 88/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 88/100, Loss: 4.0510


Epoch 89/100: 100%|██████████| 640/640 [00:26<00:00, 24.52it/s]


Epoch 89/100, Loss: 4.0427


Epoch 90/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 90/100, Loss: 4.0344


Epoch 91/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 91/100, Loss: 4.0262


Epoch 92/100: 100%|██████████| 640/640 [00:26<00:00, 24.57it/s]


Epoch 92/100, Loss: 4.0182


Epoch 93/100: 100%|██████████| 640/640 [00:25<00:00, 24.64it/s]


Epoch 93/100, Loss: 4.0102


Epoch 94/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 94/100, Loss: 4.0023


Epoch 95/100: 100%|██████████| 640/640 [00:25<00:00, 24.68it/s]


Epoch 95/100, Loss: 3.9946


Epoch 96/100: 100%|██████████| 640/640 [00:25<00:00, 24.62it/s]


Epoch 96/100, Loss: 3.9869


Epoch 97/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 97/100, Loss: 3.9791


Epoch 98/100: 100%|██████████| 640/640 [00:25<00:00, 24.65it/s]


Epoch 98/100, Loss: 3.9715


Epoch 99/100: 100%|██████████| 640/640 [00:25<00:00, 24.66it/s]


Epoch 99/100, Loss: 3.9642


Epoch 100/100: 100%|██████████| 640/640 [00:26<00:00, 24.36it/s]

Epoch 100/100, Loss: 3.9567
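One way to read these loss values is as perplexity, the exponential of the mean cross-entropy (assuming the loss is the default mean-reduced, natural-log cross-entropy). The sketch below converts the first and last epoch losses from the log above; `perplexity` is a hypothetical helper. These are training-set figures, so they say nothing yet about performance on the validation or test data.

```python
import math


def perplexity(mean_cross_entropy):
    """Perplexity corresponding to a mean cross-entropy measured in nats."""
    return math.exp(mean_cross_entropy)


first_epoch_loss = 7.0163   # epoch 1 in the training log
last_epoch_loss = 3.9567    # epoch 100 in the training log

print(f"epoch 1 perplexity:   {perplexity(first_epoch_loss):.1f}")
print(f"epoch 100 perplexity: {perplexity(last_epoch_loss):.1f}")
```

Roughly speaking, a perplexity of P means the model is about as uncertain as if it were choosing uniformly among P candidate words at each step.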





In [40]:
def generate_text(model, start_text, num_words, temperature=1.0):

    model.eval()  # set the model to evaluation mode
    words = tokeniser(start_text)  # tokenize the starting text/seed
    hidden = model.init_hidden(1)  # zero hidden and cell states for a batch of one
    context = words  # the first step feeds the whole seed

    with torch.no_grad():  # no gradients are needed for generation
        for _ in range(num_words):  # generate one word per iteration
            x = torch.tensor([[vocab[word] for word in context]], dtype=torch.long, device=device)
            y_pred, hidden = model(x, hidden)  # hidden carries everything read so far
            last_word_logits = y_pred[0][-1]  # scores over the vocabulary for the next word
            # temperature-scaled softmax turns the scores into sampling probabilities
            p = F.softmax(last_word_logits / temperature, dim=0).cpu().numpy()
            word_index = np.random.choice(len(last_word_logits), p=p)  # sample the next word
            words.append(vocab.lookup_token(word_index))  # append the generated word
            # hidden already encodes the earlier words, so only the new word is fed next
            # (feeding words[i:] would pass already-seen tokens through the LSTM again)
            context = words[-1:]

    return ' '.join(words)

# Generate some text
print(generate_text(model, start_text="I like", num_words=100)) # use "I like" as the seed text and let the function generate 100 words


i like the show medley for a very high and transparent ’ s boyfriend , met it in nice , after listening good hits , she appeared as a review for typical tv week . then made further creative events . i develop that there is one of the atmosphere of that ford ’ s important station — in excess of <unk> , and then taylor and the word sect of clamp , because marquis of the typical known absolute power for semen is that branch of the previous publication period made a play of about 250 under 2 @ . @
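The `temperature` argument of `generate_text` controls how adventurous the sampling is. The plain-Python sketch below shows the effect on a made-up set of three logits; `softmax_with_temperature`, `sample_index`, and `toy_logits` are hypothetical names for this illustration, and the standard library's `random.choices` stands in for `np.random.choice`.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature=1.0, rng=random):
    """Sample one index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]   # made-up scores for three candidate words
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

print("sampled index:", sample_index(toy_logits, temperature=1.0))
```

Temperatures below 1 concentrate probability on the highest-scoring word, which tends to give safer but more repetitive text. Temperatures above 1 flatten the distribution and give more varied but less coherent text.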
