<a href="https://colab.research.google.com/github/aarmentamna/machine_learning_advance/blob/main/TC4033_Activity4_42.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Maestría en Inteligencia Artificial Aplicada**
### **Curso: ADVANCED MACHINE LEARNING METHODS**
## Tecnológico de Monterrey
### Dr. José Antonio Cantoral Ceballos

## Activity Week 9
### **Text Generator.**

*TEAM MEMBERS:*

*   Roberto Romero Vielma - A00822314
*   José Javier Granados Hernández - A00556717
*   Aquiles Yonatan Armenta Hernandez - A01793252
*   Alan Avelino Fernández Juárez - A00989308

## TC 5033
### Text Generation

<br>

#### Activity 4: Building a Simple LSTM Text Generator using WikiText-2
<br>

- Objective:
    - Gain a fundamental understanding of Long Short-Term Memory (LSTM) networks.
    - Develop hands-on experience with sequence data processing and text generation in PyTorch. Given the simplicity of the model, the amount of data, and the available computing resources, the text you generate will not replace ChatGPT, and the results most likely will not make much sense. The purpose is purely academic: to understand text generation using RNNs.
    - Enhance code comprehension and documentation skills by commenting on provided starter code.
    
<br>

- Instructions:
    - Code Understanding: Begin by thoroughly reading and understanding the code. Comment each section/block of the provided code to demonstrate your understanding. For this, you are encouraged to add cells with experiments to improve your understanding.

    - Model Overview: The starter code includes an LSTM model setup for sequence data processing. Familiarize yourself with the model architecture and its components. Once you are familiar with the provided model, feel free to change the model to experiment.

    - Training Function: Implement a function to train the LSTM model on the WikiText-2 dataset. This function should feed the training data into the model and perform backpropagation.

    - Text Generation Function: Create a function that accepts starting text (seed text) and a specified total number of words to generate. The function should use the trained model to generate a continuation of the input text.

    - Code Commenting: Ensure that all the provided starter code is well-commented. Explain the purpose and functionality of each section, indicating your understanding.

    - Submission: Submit your Jupyter Notebook with all sections completed and commented. Include a markdown cell with the full names of all contributing team members at the beginning of the notebook.
    
<br>

- Evaluation Criteria:
    - Code Commenting (60%): The clarity, accuracy, and thoroughness of comments explaining the provided code. You are suggested to use markdown cells for your explanations.

    - Training Function Implementation (20%): The correct implementation of the training function, which should effectively train the model.

    - Text Generation Functionality (10%): A working function is provided in comments. You are free to use it as long as you make sure you understand it, and you may improve it as you see fit. The minimum expected is to provide comments for the given function.

    - Conclusions (10%): Provide some final remarks specifying the differences you notice between this model and the one used for classification tasks. Also comment on changes you made to the model, hyperparameters, and any other information you consider relevant. Also, please provide 3 examples of generated texts.



## Install missing libraries

To run the code successfully, you need the `torchtext` library, plus `portalocker`, which torchtext's dataset loading requires in some versions. The following cell uses pip to install both:

In [None]:
!pip install "portalocker>=2.0.0"
!pip install torchtext



In [None]:
#PyTorch libraries
import torch
import torchtext
from torchtext.datasets import WikiText2
# Dataloader library
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.dataset import random_split
# Libraries to prepare the data
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator
from torchtext.data.functional import to_map_style_dataset
# neural layers
from torch import nn
from torch.nn import functional as F
import torch.optim as optim
from tqdm import tqdm

import numpy as np
import random

This code imports various libraries and modules related to PyTorch, data handling, and neural networks. Here's a summary of the imported libraries:

1. PyTorch Libraries:
  - import torch: Core PyTorch library.
  - import torchtext: Library for text processing with PyTorch.
  - from torchtext.datasets import WikiText2: Importing the WikiText2 dataset from the torchtext.datasets module.

2. Dataloader Library:
  - from torch.utils.data import DataLoader, TensorDataset: Importing DataLoader and TensorDataset classes for efficient data loading.

3. Libraries to Prepare the Data:
  - from torchtext.data.utils import get_tokenizer: Importing a tokenizer for text processing.
  - from torchtext.vocab import build_vocab_from_iterator: Importing a function to build vocabulary from iterators.
  - from torchtext.data.functional import to_map_style_dataset: Importing a function to convert a dataset to a map-style dataset.

4. Neural Layers:
  - from torch import nn: Importing the neural network module from PyTorch.
  - from torch.nn import functional as F: Importing the functional module from PyTorch, often used for activation functions.
  - import torch.optim as optim: Importing the PyTorch module for optimization algorithms.

5. Other Utilities:
  - from tqdm import tqdm: Importing the tqdm library for creating progress bars during iterations.
  - import numpy as np: Importing NumPy for numerical operations.
  - import random: Importing the random module for generating random numbers.

## Define M1 Mac GPU Speed or CUDA if available

This code selects the device PyTorch will run on: a CUDA-enabled GPU, an Apple Silicon GPU, or the CPU. Here's a step-by-step explanation of the code:

1. It uses `torch.cuda.is_available()` to check if a CUDA-enabled GPU is available.

2. If a GPU is available (`torch.cuda.is_available()` returns True), it sets the device to 'cuda' using `torch.device('cuda')`.

3. If CUDA is not available, it checks for 'mps' (Metal Performance Shaders), PyTorch's GPU backend for Apple Silicon Macs such as the M1, using `torch.backends.mps.is_available()`. An explicit check is needed because `torch.device('mps')` only creates a device object and does not raise an error when the backend is missing, so wrapping it in try-except would never fall back to the CPU.

4. If neither 'cuda' nor 'mps' is available, it sets the device to 'cpu' using `torch.device('cpu')`.

5. Finally, it prints the selected device to the console, indicating whether it's 'cuda' (GPU), 'mps', or 'cpu'.

In [None]:
if torch.cuda.is_available():
    device = torch.device('cuda')
# torch.backends.mps is missing in older PyTorch versions, hence the getattr guard
elif getattr(torch.backends, 'mps', None) is not None and torch.backends.mps.is_available():
    device = torch.device('mps')
else:
    device = torch.device('cpu')
print(device)

cuda


The next line of code loads the WikiText2 dataset and assigns its three splits to the variables train_dataset, val_dataset, and test_dataset. Here's a step-by-step explanation of the code:

1. Dataset Loading: The `WikiText2()` function is used to load the WikiText2 dataset. This dataset is commonly used in natural language processing tasks, particularly for language modeling.

2. Data Splitting: The WikiText2 dataset is split into three subsets:
  - `train_dataset`: This subset is typically used for training a machine learning model. It contains examples that the model will learn from.
  - `val_dataset`: This subset is used for validation during the training process. It helps monitor the model's performance on data it has not seen during training and can be used for hyperparameter tuning.
  - `test_dataset`: This subset is reserved for evaluating the final performance of the trained model. It is used to assess how well the model generalizes to new, unseen data.

3. Variable Assignment: The three datasets (train, validation, and test) are assigned to the variables train_dataset, val_dataset, and test_dataset, respectively.

In [None]:
train_dataset, val_dataset, test_dataset = WikiText2()

The next code defines a tokenization process using the get_tokenizer function from the torchtext.data.utils module. Here's a step-by-step explanation of the code:

1. Tokenization Setup:

  - `tokeniser = get_tokenizer('basic_english')`: This line sets up a tokenizer named tokeniser using the 'basic_english' preset provided by get_tokenizer. This particular tokenizer is designed for basic English text processing.

2. Tokenization Function:

  - `def yield_tokens(data):`: This line defines a function named yield_tokens that takes a dataset (data) as input.
  - `for text in data:`: It iterates over each text in the dataset.
  - `yield tokeniser(text)`: For each text, it uses the tokeniser to tokenize the text and yields the resulting tokens. The yield keyword is used here to create a generator function, allowing tokens to be generated one at a time rather than storing all tokens in memory at once.

In [None]:
tokeniser = get_tokenizer('basic_english')
def yield_tokens(data):
    for text in data:
        yield tokeniser(text)
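
To make the generator behaviour concrete, here is a minimal sketch. It assumes a hypothetical stand-in tokenizer, `simple_tokenise` (a lowercase-and-split regex, not the actual `basic_english` implementation), and shows that token lists are produced lazily, one text at a time.

```python
import re

def simple_tokenise(text):
    # Hypothetical stand-in for get_tokenizer('basic_english'):
    # lowercases the text and separates words from punctuation.
    # The real tokenizer applies more normalisation rules than this.
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())

def yield_tokens(data):
    # Generator: each token list is produced only when it is requested.
    for text in data:
        yield simple_tokenise(text)

sample_texts = ["The cat sat.", "Hello, world!"]
token_stream = yield_tokens(sample_texts)

first = next(token_stream)    # only the first text has been tokenised so far
second = next(token_stream)
print(first)
print(second)
```

Because nothing is tokenised until `next` is called, the full dataset never has to sit in memory as token lists while the vocabulary is being built.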

The next code builds a vocabulary from the training dataset. Here's a step-by-step explanation of the code:

1. Build Vocabulary:
  - `vocab = build_vocab_from_iterator(yield_tokens(train_dataset), specials=["<unk>", "<pad>", "<bos>", "<eos>"])`:
    - `yield_tokens(train_dataset)`: This generates tokens from the training dataset using the yield_tokens function. It is passed as an iterator to build_vocab_from_iterator.
    - `specials=["<unk>", "<pad>", "<bos>", "<eos>"]`: These are special tokens that are added to the vocabulary. For example, `<unk>` represents unknown words, `<pad>` represents padding, `<bos>` is the beginning-of-sequence token, and `<eos>` is the end-of-sequence token.
    - `vocab = ...`: The result is a vocabulary (`vocab`) constructed from the tokens in the training dataset.

2. Set Default Index for Unknown Tokens:
  - `vocab.set_default_index(vocab["<unk>"])`:
    -  `vocab["<unk>"]`: Retrieves the index of the `<unk>` (unknown) token in the vocabulary.
    - `vocab.set_default_index(...)`: Sets the default index of the vocabulary to the index of the `<unk>` token. This means that if an out-of-vocabulary word is encountered during later processing, it will be represented by the index of the `<unk>` token.



In [None]:
# Build the vocabulary
vocab = build_vocab_from_iterator(yield_tokens(train_dataset), specials=["<unk>", "<pad>", "<bos>", "<eos>"])
# Map out-of-vocabulary tokens to <unk> (index 0, the first special token)
vocab.set_default_index(vocab["<unk>"])
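
The effect of the special tokens and the default index can be sketched in plain Python. This is a simplified, hypothetical stand-in (`build_simple_vocab` and `lookup` are illustrative names, not torchtext functions), and it only approximates how torchtext orders tokens.

```python
from collections import Counter

def build_simple_vocab(token_lists, specials):
    # Hypothetical, simplified stand-in for build_vocab_from_iterator:
    # specials get the first indices, then tokens by descending frequency.
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    itos = list(specials) + [tok for tok, _ in counts.most_common() if tok not in specials]
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return stoi, itos

specials = ["<unk>", "<pad>", "<bos>", "<eos>"]
stoi, itos = build_simple_vocab([["the", "cat", "sat"], ["the", "dog"]], specials)

default_index = stoi["<unk>"]  # plays the role of vocab.set_default_index(...)

def lookup(token):
    # Unknown tokens fall back to the <unk> index instead of raising KeyError.
    return stoi.get(token, default_index)

print(lookup("the"), lookup("zebra"), itos[lookup("zebra")])
```

Without a default index, looking up a word that never appeared in the training data would fail; with it, every unseen word collapses to the same `<unk>` id, which is why `<unk>` shows up in the generated text.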

The next code is part of the data processing pipeline that prepares sequences of text data for training a model. Here's a step-by-step explanation of the code:

1. Set Sequence Length:
  - `seq_length = 50`: This line defines the sequence length, indicating the number of tokens in each sequence. In this case, it is set to 50.

2. Data Processing Function:
  - `def data_process(raw_text_iter, seq_length=50)`: This function takes an iterator (`raw_text_iter`) containing raw text data and an optional parameter for the sequence length.
  - `data = [torch.tensor(vocab(tokeniser(item)), dtype=torch.long) for item in raw_text_iter]`: It tokenizes each item in the raw text iterator using the `vocab` and `tokeniser` functions, then converts the resulting tokens into PyTorch tensors of type `long`.
  - `data = torch.cat(tuple(filter(lambda t: t.numel() > 0, data)))`: It concatenates the tensors into a single tensor along the first dimension (using `torch.cat`). The `filter` function is used to remove any empty tensors.
  - The function returns a tuple containing two tensors:
    - `data[:-(data.size(0) % seq_length)].view(-1, seq_length)`: This tensor is the input data. The trailing tokens that do not fill a complete sequence of length `seq_length` are discarded, and the rest is reshaped into rows of `seq_length` tokens.
    - `data[1:-(data.size(0) % seq_length - 1)].view(-1, seq_length)`: This tensor is the target data: the same token stream shifted by one position, so each target is the token that follows the corresponding input token. It is also reshaped into sequences of length `seq_length`. Note that both slices assume the remainder `data.size(0) % seq_length` is at least 2; with a remainder of 0 or 1, at least one of the slices would be empty.

3. Create Tensors for Training, Validation, and Test Sets:
  - `x_train, y_train = data_process(train_dataset, seq_length)`: Calls the `data_process` function on the training dataset, generating input (`x_train`) and target (`y_train`) tensors.
  - Similar lines create tensors for the validation (`x_val, y_val`) and test (`x_test, y_test`) datasets.

In [None]:
seq_length = 50
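def data_process(raw_text_iter, seq_length = 50):
    # Tokenise each line and map its tokens to vocabulary indices
    data = [torch.tensor(vocab(tokeniser(item)), dtype=torch.long) for item in raw_text_iter]
    # Drop empty lines and concatenate everything into one long token stream
    data = torch.cat(tuple(filter(lambda t: t.numel() > 0, data)))
    # Inputs: whole sequences of seq_length tokens; targets: the same stream shifted by one.
    # Note: these slices assume data.size(0) % seq_length >= 2.
    return (data[:-(data.size(0)%seq_length)].view(-1, seq_length),
            data[1:-(data.size(0)%seq_length-1)].view(-1, seq_length))

# Create tensors for the training, validation, and test sets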
x_train, y_train = data_process(train_dataset, seq_length)
x_val, y_val = data_process(val_dataset, seq_length)
x_test, y_test = data_process(test_dataset, seq_length)
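
The input/target shift is easier to see on a toy token stream. The sketch below uses a hypothetical helper, `make_sequences`, which is an alternative way to compute the slices that also works when the stream length leaves a remainder of 0 or 1; it is not the notebook's `data_process`.

```python
import torch

def make_sequences(data, seq_length):
    # Number of tokens that fit into whole sequences, leaving at least one
    # token after the last input so that every input has a target.
    n = (data.size(0) - 1) // seq_length * seq_length
    inputs = data[:n].view(-1, seq_length)
    targets = data[1:n + 1].view(-1, seq_length)
    return inputs, targets

toy_stream = torch.arange(13)    # stands in for the concatenated token ids
x_toy, y_toy = make_sequences(toy_stream, seq_length=4)
print(x_toy)
print(y_toy)
```

Each row of `y_toy` is the matching row of `x_toy` moved one token ahead, which is exactly the next-word prediction target the LSTM is trained on.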

The next code creates PyTorch TensorDataset instances for the training, validation, and test sets. Here's a summary:

- Create TensorDataset Instances:
  - `train_dataset = TensorDataset(x_train, y_train)`: This line creates a `TensorDataset` instance for the training set. It takes two tensors (`x_train` and `y_train`) representing input and target sequences, respectively.
  - `val_dataset = TensorDataset(x_val, y_val)`: Similarly, a `TensorDataset` instance is created for the validation set.
  - `test_dataset = TensorDataset(x_test, y_test)`: Likewise, a `TensorDataset` instance is created for the test set.

In [None]:
train_dataset = TensorDataset(x_train, y_train)
val_dataset = TensorDataset(x_val, y_val)
test_dataset = TensorDataset(x_test, y_test)

The next code sets up PyTorch DataLoader instances for the training, validation, and test sets, enabling efficient loading of data in batches during model training and evaluation. Here's a summary:

1. Define Batch Size:
   - `batch_size = 64`: This line sets the batch size, which is the number of samples processed in one iteration during training.

2. Create `DataLoader` Instances:
  - `train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)`: This line creates a `DataLoader` for the training set (`train_dataset`). It specifies the batch size, enables shuffling of the data before each epoch (`shuffle=True`), and drops the last incomplete batch if the dataset size is not divisible by the batch size (`drop_last=True`).
  - `val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=True, drop_last=True)`: Similarly, a `DataLoader` is created for the validation set (`val_dataset`).
  - `test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True, drop_last=True)`: Likewise, a `DataLoader` is created for the test set (`test_dataset`).

In [None]:
batch_size = 64  # choose a batch size that fits your computation resources
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
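
As a small illustration of `batch_size` and `drop_last`, the sketch below builds a toy dataset of 10 samples (the `toy_*` names are made up for this example) and counts the batches the loader yields.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

toy_x = torch.arange(10).unsqueeze(1)    # 10 samples, 1 feature each
toy_y = toy_x + 1                        # each target is its input plus one
toy_dataset = TensorDataset(toy_x, toy_y)

# 10 samples with batch_size=4 give two full batches; the remaining
# 2 samples are discarded because drop_last=True.
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=True, drop_last=True)

batches = list(toy_loader)
batch_shapes = [tuple(xb.shape) for xb, yb in batches]
print(len(toy_loader), batch_shapes)
```

Shuffling changes the order of the samples, but inputs and targets stay paired because `TensorDataset` indexes both tensors together.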

The next code defines an LSTM (Long Short-Term Memory) model using PyTorch's nn.Module interface. Here's a step-by-step summary:

1. Define the LSTM Model Class:
  - `class LSTMModel(nn.Module):`: Defines a new class `LSTMModel` that inherits from `nn.Module`, indicating that it is a PyTorch model.
  - `def __init__(self, vocab_size, embed_size, hidden_size, num_layers):`: Initializes the model with specified hyperparameters.
    - `self.embeddings = nn.Embedding(vocab_size, embed_size)`: Defines an embedding layer with vocabulary size `vocab_size` and embedding size `embed_size`.
    - `self.hidden_size = hidden_size`: Stores the hidden size as an attribute.
    - `self.num_layers = num_layers`: Stores the number of LSTM layers as an attribute.
    - `self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)`: Defines an LSTM layer with input size `embed_size`, hidden size `hidden_size`, `num_layers` LSTM layers, and `batch_first=True` indicating that the input and output tensors are provided as (batch, seq, feature).
    - `self.fc = nn.Linear(hidden_size, vocab_size)`: Defines a fully connected layer with input size `hidden_size` and output size `vocab_size`, mapping LSTM output to vocabulary size.

  - `def forward(self, text, hidden):`: Defines the forward pass of the model.
    - `embeddings = self.embeddings(text)`: Applies the embedding layer to the input text.
    - `output, hidden = self.lstm(embeddings, hidden)`: Applies the LSTM layer to the embeddings, producing output and hidden states.
    - `decoded = self.fc(output)`: Applies the fully connected layer to the LSTM output.
    - `return decoded, hidden`: Returns the decoded output and the hidden state.

  - `def init_hidden(self, batch_size):`: Initializes the hidden state of the LSTM.
    - `return (torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device), torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device))`: Returns a tuple containing two tensors of zeros, representing the initial hidden state and cell state of the LSTM.

2. Set Model Hyperparameters:
  - `vocab_size = len(vocab)`: Determines the vocabulary size based on the size of the vocabulary (`vocab`).
  - `emb_size = 100`: Sets the embedding size to 100.
  - `neurons = 128`: Sets the number of neurons (hidden size) to 128.
  - `num_layers = 1`: Sets the number of LSTM layers to 1.

3. Instantiate the Model:
  - `model = LSTMModel(vocab_size, emb_size, neurons, num_layers)`: Creates an instance of the `LSTMModel` with the specified hyperparameters.

In [None]:
# Define the LSTM model
# Feel free to experiment
class LSTMModel(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, num_layers):
        super(LSTMModel, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_size)
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, text, hidden):
        embeddings = self.embeddings(text)
        output, hidden = self.lstm(embeddings, hidden)
        decoded = self.fc(output)
        return decoded, hidden

    def init_hidden(self, batch_size):

        return (torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device),
                torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device))



vocab_size = len(vocab) # vocabulary size
emb_size = 100 # embedding size
neurons = 128 # hidden size of the LSTM, i.e. number of units per layer
num_layers = 1 # the number of nn.LSTM layers
model = LSTMModel(vocab_size, emb_size, neurons, num_layers)
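
To follow how tensor shapes change through the embedding, LSTM, and linear layers, here is a minimal sketch with deliberately tiny, made-up sizes (the `toy_*` values are assumptions for illustration, not the notebook's hyperparameters).

```python
import torch
from torch import nn

toy_vocab, toy_embed, toy_hidden, toy_layers = 20, 8, 16, 1
batch, seq_len = 3, 5

embedding = nn.Embedding(toy_vocab, toy_embed)
lstm = nn.LSTM(toy_embed, toy_hidden, toy_layers, batch_first=True)
fc = nn.Linear(toy_hidden, toy_vocab)

tokens = torch.randint(0, toy_vocab, (batch, seq_len))
h0 = torch.zeros(toy_layers, batch, toy_hidden)   # initial hidden state
c0 = torch.zeros(toy_layers, batch, toy_hidden)   # initial cell state

embedded = embedding(tokens)                  # (batch, seq_len, toy_embed)
output, (hn, cn) = lstm(embedded, (h0, c0))   # (batch, seq_len, toy_hidden)
logits = fc(output)                           # (batch, seq_len, toy_vocab)

print(embedded.shape, output.shape, logits.shape, hn.shape)
```

The model emits one vector of vocabulary scores per input position, so every token in the sequence contributes a next-word prediction during training.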


The next code defines a training function for a given model using PyTorch. Here's a step-by-step summary:

1. Define the Training Function:
  - `def train(model, epochs, optimizer, criterion):`: Defines a training function that takes a model, the number of epochs, an optimizer, and a loss criterion as inputs.

2. Move Model to Device and Set to Train Mode:
  - `model = model.to(device=device)`: Moves the model to the specified device (CPU or GPU).
  - `model.train()`: Sets the model in training mode, enabling features like dropout.

3. Training Loop Over Epochs:
  - `for epoch in range(epochs):`: Iterates over the specified number of epochs.

4. Initialize Metrics:
  - `total_loss = 0.0`: Initializes the total loss for the epoch.
  - `correct_predictions = 0`: Initializes the count of correct predictions.
  - `total_samples = 0`: Initializes the total number of samples processed.

5. Training Loop Over Batches:
  - `for i, (data, targets) in enumerate(train_loader):`: Iterates over batches in the training data.
  - `optimizer.zero_grad()`: Zeroes the gradients to prevent accumulation.

6. Move Data to Device and Initialize Hidden State:
  - `data, targets = data.to(device), targets.to(device)`: Moves the input data and targets to the specified device.
  - `hidden = model.init_hidden(data.size(0))`: Initializes the hidden state for the LSTM model.

7. Forward Pass:
  - `outputs, hidden = model(data, hidden)`: Performs a forward pass through the model.

8. Calculate Loss and Update Metrics:
  - `loss = criterion(outputs.view(-1, len(vocab)), targets.view(-1))`: Calculates the loss using the specified criterion.
  - `total_loss += loss.item()`: Accumulates the total loss for the epoch.
  - `_, predicted = torch.max(outputs, 2)`: Obtains predictions by selecting the index with the maximum value along the third dimension.
  - `correct_predictions += (predicted == targets).sum().item()`: Counts the number of correct predictions.
  - `total_samples += targets.numel()`: Updates the total number of processed samples.

9. Backward Pass and Optimization:
  - `loss.backward()`: Performs a backward pass to compute gradients.
  - `optimizer.step()`: Updates the model parameters using the optimizer.

10. Compute Metrics for the Epoch and Print:
  - `loss = total_loss / len(train_loader)`: Computes the average loss for the epoch.
  - `accuracy = correct_predictions / total_samples`: Computes the accuracy for the epoch.
  - `print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss:.4f}, Accuracy: {accuracy * 100:.2f}%')`: Prints the epoch, loss, and accuracy.

In [None]:
def train(model, epochs, optimizer, criterion):
    model = model.to(device=device)
    model.train()

    for epoch in range(epochs):
        total_loss = 0.0
        correct_predictions = 0
        total_samples = 0

        for i, (data, targets) in enumerate(train_loader):
            optimizer.zero_grad()

            # Move data to the device
            data, targets = data.to(device), targets.to(device)

            # Initialize hidden state
            hidden = model.init_hidden(data.size(0))

            # Forward pass
            outputs, hidden = model(data, hidden)

            # Calculate the loss
            loss = criterion(outputs.view(-1, len(vocab)), targets.view(-1))
            total_loss += loss.item()

            # Compute accuracy
            _, predicted = torch.max(outputs, 2)
            correct_predictions += (predicted == targets).sum().item()
            total_samples += targets.numel()

            # Backward pass and optimization
            loss.backward()
            optimizer.step()

        loss = total_loss / len(train_loader)
        accuracy = correct_predictions / total_samples
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss:.4f}, Accuracy: {accuracy * 100:.2f}%')
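
The reshaping before the loss call can be checked on random tensors. The sketch below uses made-up sizes and compares the flattened call used in `train` with an equivalent call that keeps the batch and sequence dimensions; it also computes token-level accuracy the same way.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, seq_len, toy_vocab = 2, 3, 5
logits = torch.randn(batch, seq_len, toy_vocab)           # model outputs
targets = torch.randint(0, toy_vocab, (batch, seq_len))   # next-token ids

criterion = nn.CrossEntropyLoss()

# Flatten so every token position becomes one classification example.
flat_loss = criterion(logits.view(-1, toy_vocab), targets.view(-1))

# Equivalent form: CrossEntropyLoss also accepts (batch, classes, seq_len).
alt_loss = criterion(logits.transpose(1, 2), targets)

predicted = logits.argmax(dim=2)                          # most likely token per position
accuracy = (predicted == targets).sum().item() / targets.numel()
print(flat_loss.item(), alt_loss.item(), accuracy)
```

Both calls average the loss over all `batch * seq_len` positions, so the reported loss is a per-token value.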


The next code calls the train function to train the model using the specified loss function, learning rate, and number of epochs. Here's a step-by-step summary:

1. Define Loss Function:
  - `loss_function = nn.CrossEntropyLoss()`: Creates an instance of the CrossEntropyLoss, which is commonly used for classification tasks. Here, each next-word prediction is treated as a classification over the whole vocabulary.

2. Set Learning Rate and Number of Epochs:
  - `lr = 0.001`: Sets the learning rate for the Adam optimizer to 0.001.
  - `epochs = 10`: Specifies the number of training epochs.

3. Initialize Adam Optimizer:
  - `optimiser = optim.Adam(model.parameters(), lr=lr)`: Creates an Adam optimizer for the model's parameters with the specified learning rate.

4. Call the Training Function:
  - `train(model, epochs, optimiser, loss_function)`: Calls the `train` function to train the model. It passes the model, number of epochs, optimizer, and loss function as arguments.

In [None]:
# Call the train function
loss_function = nn.CrossEntropyLoss()
lr = 0.001
epochs = 10
optimiser = optim.Adam(model.parameters(), lr=lr)
train(model, epochs, optimiser,loss_function)

Epoch [1/10], Loss: 3.3948, Accuracy: 37.21%
Epoch [2/10], Loss: 3.3670, Accuracy: 37.66%
Epoch [3/10], Loss: 3.3538, Accuracy: 37.88%
Epoch [4/10], Loss: 3.3420, Accuracy: 38.06%
Epoch [5/10], Loss: 3.3307, Accuracy: 38.24%
Epoch [6/10], Loss: 3.3203, Accuracy: 38.40%
Epoch [7/10], Loss: 3.3106, Accuracy: 38.54%
Epoch [8/10], Loss: 3.3012, Accuracy: 38.68%
Epoch [9/10], Loss: 3.2919, Accuracy: 38.82%
Epoch [10/10], Loss: 3.2831, Accuracy: 38.94%
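
One way to read the loss values in this log is to convert them to perplexity, the exponential of the average per-token cross-entropy. The sketch below does this for the final epoch; it assumes the loss is measured in nats (natural logarithm), which is what `nn.CrossEntropyLoss` computes, and it describes the training data only, not the validation set.

```python
import math

final_train_loss = 3.2831    # epoch 10 loss copied from the log
perplexity = math.exp(final_train_loss)
print(f"training perplexity: {perplexity:.2f}")
```

Roughly speaking, a perplexity of P means the model is as uncertain as if it were choosing uniformly among P words at each step.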


The next code defines a function for text generation using a trained model. Here's a step-by-step summary:

1. Define Text Generation Function:
  - `def generate_text(model, start_text, num_words, temperature=1.0):`: Defines a function for generating text given a trained model, a starting text sequence (`start_text`), and the number of words to generate (`num_words`). The optional parameter `temperature` controls the level of randomness in the generation.

2. Set Model to Evaluation Mode:
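  - `model.eval()`: Sets the model to evaluation mode, which switches layers such as dropout to their inference behaviour. It does not disable gradient tracking; that is handled by `torch.no_grad()` in the generation loop.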

3. Tokenize the Starting Text and Initialize Hidden State:
  - `words = tokeniser(start_text)`: Tokenizes the starting text using the provided tokenizer.
  - `hidden = model.init_hidden(1)`: Initializes the hidden state for the LSTM model. The batch size is set to 1 for text generation.

4. Generate Text Loop:
  - `with torch.no_grad():`: Ensures that no gradients are calculated during the generation loop.
  - `for _ in range(num_words):`: Iterates over the specified number of words to generate.

5. Prepare Input for the Model:
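  - `x = torch.tensor([[vocab[word] for word in words[-seq_length:]]], dtype=torch.long, device=device)`: Creates a tensor representing the input sequence for the model. It includes the last `seq_length` words from the generated text. Because `hidden` is also carried over between iterations, the earlier words in this window are fed to the LSTM again at every step.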

6. Forward Pass Through the Model:
  - `y_pred, hidden = model(x, hidden)`: Performs a forward pass through the model to get predictions for the next word.

7. Sample the Next Word:
  - `last_word_logits = y_pred[0][-1]`: Extracts the logits for the last word in the sequence.
  - `p = F.softmax(last_word_logits / temperature, dim=0).cpu().numpy()`: Applies a softmax function with temperature to the logits, creating a probability distribution.
  - `word_index = np.random.choice(len(last_word_logits), p=p)`: Samples the next word index based on the probability distribution.

8. Append the Predicted Word to the Sequence:
  - `words.append(vocab.lookup_token(word_index))`: Appends the predicted word to the list of generated words.

9. Return the Generated Text:
  - `return ' '.join(words)`: Joins the generated words into a text string.

In [None]:
def generate_text(model, start_text, num_words, temperature=1.0):
    '''
    Generate text using the trained model.

    Parameters:
        model (LSTMModel): The trained LSTM model.
        start_text (str): The starting text or seed.
        num_words (int): The total number of words to generate.
        temperature (float): Controls the randomness of the generated text.

    Returns:
        str: The generated text.
    '''
    model.eval()

    words = tokeniser(start_text)
    hidden = model.init_hidden(1)

    with torch.no_grad():  # No need to track gradients during generation
        for _ in range(num_words):
            x = torch.tensor([[vocab[word] for word in words[-seq_length:]]], dtype=torch.long, device=device)
            y_pred, hidden = model(x, hidden)
            last_word_logits = y_pred[0][-1]
            p = F.softmax(last_word_logits / temperature, dim=0).cpu().numpy()
            word_index = np.random.choice(len(last_word_logits), p=p)
            words.append(vocab.lookup_token(word_index))

    return ' '.join(words)
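
The role of `temperature` can be seen on a toy set of logits. The helper below, `softmax_with_temperature`, is a hypothetical NumPy re-implementation written for this illustration; it mirrors the `F.softmax(last_word_logits / temperature, dim=0)` step.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    # Divide the logits by the temperature before normalising.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()    # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

toy_logits = [2.0, 1.0, 0.0]
p_sharp = softmax_with_temperature(toy_logits, 0.5)
p_plain = softmax_with_temperature(toy_logits, 1.0)
p_flat = softmax_with_temperature(toy_logits, 2.0)

for name, p in (("T=0.5", p_sharp), ("T=1.0", p_plain), ("T=2.0", p_flat)):
    print(name, p.round(3))
```

Temperatures below 1 concentrate probability on the highest-scoring word, giving more repetitive but safer text, while temperatures above 1 flatten the distribution and make unusual words more likely.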

Finally, the code calls the generate_text function with the trained model and prints the result. Here's a summary:

- Generate Text:

  - `generate_text(model, start_text="I like", num_words=100)`: Calls the `generate_text` function with the following parameters:
    - `model`: The trained LSTM model.
    - `start_text="I like"`: The starting text for text generation.
    - `num_words=100`: The desired number of words to generate.

- Print the Generated Text:
  - `print(...)`: Prints the result of the text generation.

In [None]:
# Generate some text
print(generate_text(model, start_text="I like", num_words=100))

i like a lot about the music we show gotten verses laugh and working on mainstream , right to the contrary . by an appearance of his time praised traits of differences , the characters are , this understanding romantic fundamentalist shaolin fictional <unk> . while staying in full producers worldwide , the book continued to townsend ' s promotional websites for hollywood about eight years . he is built by examining the second producer , highly varied behind all @-@ dose film , and promote <unk> of tv @-@ general today . richard <unk> of <unk> played <unk> mary as <unk>
