We are developing an algorithm for text generation; the project involves auto-completing book names. To kickstart the project, we experiment with a Recurrent Neural Network (RNN). This way, we can understand the nuances of RNNs before moving to more complex models.

In [1]:
import torch
import torch.nn as nn

In [2]:
data = 'The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well.'
# Sort the unique characters so the index mapping is the same on every run
chars = sorted(set(data))
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

In [3]:
class RNNmodel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNNmodel, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

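    def forward(self, x):
        # x has shape (batch, seq_len, input_size) because batch_first=True
        h0 = torch.zeros(1, x.size(0), self.hidden_size)  # initial hidden state
        out, _ = self.rnn(x, h0)
        # Use the output at the last time step to predict the next character
        out = self.fc(out[:, -1, :])
        return out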

# Instantiate the RNN model
model = RNNmodel(len(chars), 16, len(chars))

In [5]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

In [6]:
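# Each character is paired with the character that follows it
inputs = [char_to_ix[ch] for ch in data[:-1]]
targets = [char_to_ix[ch] for ch in data[1:]]
# Shape (N, 1): N samples, each a sequence of length 1
inputs = torch.tensor(inputs, dtype=torch.long).view(-1, 1)
# One-hot encode to shape (N, 1, len(chars)), matching batch_first=True
inputs = nn.functional.one_hot(inputs, num_classes=len(chars)).float()
targets = torch.tensor(targets, dtype=torch.long)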

In [7]:

# Train the model
for epoch in range(100):
    model.train()
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}/100, Loss: {loss.item()}')

# Test the model
model.eval()
test_input = char_to_ix['r']
test_input = nn.functional.one_hot(torch.tensor(test_input).view(-1, 1), num_classes=len(chars)).float()
predicted_output = model(test_input)
predicted_char_ix = torch.argmax(predicted_output, 1).item()
print(f"Test Input: 'r', Predicted Output: '{ix_to_char[predicted_char_ix]}'")

Epoch 10/100, Loss: 3.0010628700256348
Epoch 20/100, Loss: 2.7165913581848145
Epoch 30/100, Loss: 2.513956069946289
Epoch 40/100, Loss: 2.3116652965545654
Epoch 50/100, Loss: 2.1465094089508057
Epoch 60/100, Loss: 2.0187792778015137
Epoch 70/100, Loss: 1.9294524192810059
Epoch 80/100, Loss: 1.8689205646514893
Epoch 90/100, Loss: 1.8270431756973267
Epoch 100/100, Loss: 1.7982078790664673
Test Input: 'r', Predicted Output: 'a'


The RNN model has been trained and tested. Given the input 'r', it predicts 'a', which matches the start of the word 'rabbit' in the training text. You can also explore the same example with LSTMs or GRUs.
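
As a starting point for that exploration, here is a minimal sketch of an LSTM counterpart to `RNNmodel`. The class name `LSTMmodel`, the placeholder `vocab_size`, and the dummy input are assumptions made for illustration; they are not part of the notebook above.

```python
import torch
import torch.nn as nn

class LSTMmodel(nn.Module):
    """Hypothetical LSTM counterpart of RNNmodel with the same input/output shapes."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # An LSTM carries two states: the hidden state (h0) and the cell state (c0)
        h0 = torch.zeros(1, x.size(0), self.hidden_size)
        c0 = torch.zeros(1, x.size(0), self.hidden_size)
        out, _ = self.lstm(x, (h0, c0))
        # Predict from the output at the last time step, as in RNNmodel
        return self.fc(out[:, -1, :])

vocab_size = 10  # placeholder; the notebook would pass len(chars)
lstm_model = LSTMmodel(vocab_size, 16, vocab_size)
dummy = torch.zeros(4, 1, vocab_size)  # (batch, seq_len, features)
logits = lstm_model(dummy)
print(logits.shape)  # torch.Size([4, 10])
```

A GRU variant is simpler still: `nn.GRU` takes a single hidden state like `nn.RNN`, so swapping `nn.RNN` for `nn.GRU` in the original class should need no other change.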

We are tasked with building an automatic text generator to help writers overcome writer's block. With Generative Adversarial Networks (GANs), we can create a system in which one network, the generator, creates new text while the other network, the discriminator, evaluates its authenticity.

In [9]:
# Size of each one-hot character vector (the vocabulary size, despite the name)
seq_length = len(chars)

# Define the generator network
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(seq_length, seq_length), nn.Sigmoid())
    def forward(self, x):
        return self.model(x)

# Define the discriminator network
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(seq_length, 1), nn.Sigmoid())
    def forward(self, x):
        return self.model(x)

In [13]:
generator = Generator()
discriminator = Discriminator()
num_epochs = 50
print_every = 10
# Define the loss function and optimizer
criterion = nn.BCELoss()
optimizer_gen = torch.optim.Adam(generator.parameters(), lr=0.001)
optimizer_disc = torch.optim.Adam(discriminator.parameters(), lr=0.001)

for epoch in range(num_epochs):
    # Iterate over the preprocessed 'inputs' tensor
    # 'inputs' has shape (sequence_length, 1, len(chars))
    # Each 'real_sample_batch' will be a tensor of shape (1, len(chars))
    for real_sample_batch in inputs:
        # 'real_sample_batch' is already in the correct format (batch_size=1, feature_dim=seq_length)

        noise = torch.rand((1, seq_length)) # Generate noise for one sample
        fake_data = generator(noise)

        # Train the discriminator
        disc_real = discriminator(real_sample_batch)
        disc_fake = discriminator(fake_data.detach())
        loss_disc = criterion(disc_real, torch.ones_like(disc_real)) + criterion(disc_fake, torch.zeros_like(disc_fake))
        optimizer_disc.zero_grad()
        loss_disc.backward()
        optimizer_disc.step()

        # Train the generator
        disc_fake = discriminator(fake_data)
        loss_gen = criterion(disc_fake, torch.ones_like(disc_fake))
        optimizer_gen.zero_grad()
        loss_gen.backward()
        optimizer_gen.step()

    if (epoch+1) % print_every == 0:
        print(f"Epoch {epoch+1}/{num_epochs}:\t Generator loss: {loss_gen.item()}\t Discriminator loss: {loss_disc.item()}")

print("\nReal data (first 5 characters from original text): ")
print(data[:5]) # Display original string for comparison

print("\nGenerated data (first 5 generated characters): ")
for _ in range(5):
    noise = torch.rand((1, seq_length))
    generated_output = generator(noise)
    # The output is a probability distribution over characters
    # Find the character with the highest probability
    predicted_char_idx = torch.argmax(generated_output, dim=1).item()
    print(ix_to_char[predicted_char_idx], end='')
print() # Newline after printing characters

Epoch 10/50:	 Generator loss: 0.6868388652801514	 Discriminator loss: 1.3335833549499512
Epoch 20/50:	 Generator loss: 0.6480007171630859	 Discriminator loss: 1.3943722248077393
Epoch 30/50:	 Generator loss: 0.6720313429832458	 Discriminator loss: 1.4055759906768799
Epoch 40/50:	 Generator loss: 0.6949149370193481	 Discriminator loss: 1.420029640197754
Epoch 50/50:	 Generator loss: 0.6891156435012817	 Discriminator loss: 1.438564419746399

Real data (first 5 characters from original text): 
The r

Generated data (first 5 generated characters): 
ppppp


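We have generated and printed synthetic data using the trained generator network. Note that the generator emits the same character ('p') for every noise sample, so this small model appears to have collapsed to a single output and does not yet produce varied text.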
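
One way to read the losses logged above is to compare them with what binary cross-entropy gives when the discriminator outputs 0.5 for every sample, that is, when it cannot tell real from fake. The sketch below computes those reference values with plain Python; the helper `bce` is a hypothetical stand-in for `nn.BCELoss` on a single prediction.

```python
import math

def bce(p, target):
    """Binary cross-entropy for one predicted probability p and a 0/1 target."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

# A discriminator at chance level outputs 0.5 for real and fake samples alike
p_chance = 0.5
gen_loss_at_chance = bce(p_chance, 1)                      # generator wants "real"
disc_loss_at_chance = bce(p_chance, 1) + bce(p_chance, 0)  # real term + fake term

print(round(gen_loss_at_chance, 4))   # 0.6931
print(round(disc_loss_at_chance, 4))  # 1.3863
```

The logged generator loss (about 0.65 to 0.69) and discriminator loss (about 1.33 to 1.44) sit close to these values, which suggests the discriminator is operating near chance rather than giving the generator a strong training signal.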

The current project involves creating captivating narratives based on existing stories. To achieve this, we need a powerful text generation tool that can seamlessly generate compelling text continuations.

Text completion with pre-trained GPT-2 models


In [16]:
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, T5Tokenizer, T5ForConditionalGeneration

In [17]:
# Initialize the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Initialize the pre-trained model
model = GPT2LMHeadModel.from_pretrained('gpt2')

seed_text = "Once upon a time"

# Encode the seed text; calling the tokenizer also returns the attention mask
encoded = tokenizer(seed_text, return_tensors='pt')

# Generate text with greedy decoding. `temperature` is omitted because it only
# takes effect when sampling is enabled (do_sample=True).
output = model.generate(
    encoded['input_ids'],
    attention_mask=encoded['attention_mask'],
    max_length=100,
    no_repeat_ngram_size=2,
    pad_token_id=tokenizer.eos_token_id,
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)


Once upon a time, the world was a place of great beauty and great danger. The world of the gods was the place where the great gods were born, and where they were to live.

The world that was created was not the same as the one that is now. It was an endless, endless world. And the Gods were not born of nothing. They were created of a single, single thing. That was why the universe was so beautiful. Because the cosmos was made of two


We have completed the text completion exercise using the GPT-2 model and tokenizer. The generated continuation is grammatical and stays on a mythological theme, though it is repetitive ("endless, endless", "single, single") and stops mid-sentence at the 100-token limit.
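
The `temperature` setting of `generate` only matters when tokens are sampled rather than chosen greedily. As a rough illustration of what temperature scaling does to a next-token distribution, here is a small sketch in plain Python; the function name and the toy logits are made up for this example and do not come from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

p_default = softmax_with_temperature(toy_logits, 1.0)
p_sharp = softmax_with_temperature(toy_logits, 0.7)

print([round(p, 3) for p in p_default])
print([round(p, 3) for p in p_sharp])
```

A temperature below 1 concentrates probability on the highest-scoring token, so sampled text becomes more conservative; a temperature above 1 flattens the distribution. With the `transformers` library, sampling has to be switched on with `do_sample=True` for this setting to have any effect.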

Language translation with pretrained PyTorch model

This project involves translating text from one language to another with a pre-trained T5 model.

In [18]:
# Initialize tokenizer and model
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_prompt = "translate English to French: 'Hello, how are you?'"

# Encode the input prompt using the tokenizer
input_ids = tokenizer.encode(input_prompt, return_tensors="pt")

# Generate the translated output
output = model.generate(input_ids, max_length=50)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print("Generated text:", generated_text)

Generated text: "Jo, comment êtes-vous?"


We used a pre-trained T5 model to translate an English phrase into French. The output renders "Hello" as "Jo" rather than "Bonjour", so the translation from this small model is imperfect. The example still shows how pre-trained transformer models can be applied to various NLP tasks, including translation.

Evaluating pretrained text generation model

We previously used a pre-trained GPT-2 model to generate text from a given prompt. Now we want to evaluate the quality of that text by comparing it against a reference text.

In [24]:
from torchmetrics.text import BLEUScore, ROUGEScore

reference_text = "Once upon a time, there was a little girl who lived in a village near the forest."
generated_text = "Once upon a time, the world was a place of great beauty and great danger. The world of the gods was the place where the great gods were born, and where they were to live."

# Initialize BLEU and ROUGE scorers
bleu = BLEUScore()
rouge = ROUGEScore()

# Calculate the BLEU and ROUGE scores
bleu_score = bleu([generated_text], [[reference_text]])
rouge_score = rouge([generated_text], [[reference_text]])

# Print the BLEU and ROUGE scores
print("BLEU Score:", bleu_score.item())
print("ROUGE Score:", rouge_score)

BLEU Score: 0.08170417696237564
ROUGE Score: {'rouge1_fmeasure': tensor(0.2692), 'rouge1_precision': tensor(0.2000), 'rouge1_recall': tensor(0.4118), 'rouge2_fmeasure': tensor(0.1600), 'rouge2_precision': tensor(0.1176), 'rouge2_recall': tensor(0.2500), 'rougeL_fmeasure': tensor(0.2692), 'rougeL_precision': tensor(0.2000), 'rougeL_recall': tensor(0.4118), 'rougeLsum_fmeasure': tensor(0.2692), 'rougeLsum_precision': tensor(0.2000), 'rougeLsum_recall': tensor(0.4118)}


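The ROUGE scores indicate that the generated text has some similarity to the reference text: ROUGE-1 recall is 0.41, but precision is only 0.20.

The BLEU score of about 0.08 indicates that the generated text has low similarity to the reference text. Much of the overlap comes from the shared opening "Once upon a time", so the two texts diverge almost immediately after the prompt.

We have calculated the BLEU and ROUGE scores for the GPT-2 text generation model.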
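
To see where the ROUGE-1 numbers come from, the unigram overlap can be counted by hand. The sketch below assumes a simple tokenization (lowercase alphanumeric runs); `torchmetrics` may tokenize differently in general, but on these two strings this scheme reproduces the reported precision, recall, and F-measure.

```python
import re
from collections import Counter

reference_text = "Once upon a time, there was a little girl who lived in a village near the forest."
generated_text = "Once upon a time, the world was a place of great beauty and great danger. The world of the gods was the place where the great gods were born, and where they were to live."

def tokenize(text):
    # Assumption: lowercase the text and keep runs of letters and digits
    return re.findall(r"[a-z0-9]+", text.lower())

ref_counts = Counter(tokenize(reference_text))
gen_counts = Counter(tokenize(generated_text))

# Clipped overlap: each word counts at most as often as it appears in both texts
overlap = sum((ref_counts & gen_counts).values())

precision = overlap / sum(gen_counts.values())  # share of generated words that match
recall = overlap / sum(ref_counts.values())     # share of reference words recovered
f1 = 2 * precision * recall / (precision + recall)

print(overlap, round(precision, 4), round(recall, 4), round(f1, 4))
# 7 0.2 0.4118 0.2692
```

Only 7 of the 35 generated words match the 17-word reference, and 4 of those 7 are the shared opening "Once upon a time".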