# <u>Chapter 9</u>: Generating Text in Chatbots

In [1]:
import sys
import subprocess
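from importlib import metadata

# Find out which packages are missing.
# importlib.metadata replaces the deprecated pkg_resources API.
installed_packages = {(dist.metadata.get('Name') or '').lower() for dist in metadata.distributions()}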
required_packages = {'torch', 'transformers', 'datasets', 'tqdm'}
missing_packages = required_packages - installed_packages

# If there are missing packages install them.
if missing_packages:
    print('Installing the following packages: ' + str(missing_packages))
    python = sys.executable
    subprocess.check_call([python, '-m', 'pip', 'install', *missing_packages], stdout=subprocess.DEVNULL)

<ins>Note</ins>: Windows users should enable their device for development, as described at https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development

## Perplexity

In the code that follows, we measure the perplexity of the _gpt2_ model using three datasets.
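
As a reminder of what is being measured: perplexity is the exponential of the average negative log-likelihood that the model assigns to each observed token, so lower values indicate better predictions. The following is a minimal sketch of that definition only; the helper name `perplexity` and the probabilities are illustrative assumptions, not output of _gpt2_.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-likelihood per token."""
    nlls = [-math.log(p) for p in token_probs]
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical probabilities a model assigned to four observed tokens.
probs = [0.25, 0.25, 0.25, 0.25]

# A model that always spreads its belief evenly over four candidates
# has a perplexity of about 4.
print(perplexity(probs))
```

Read this way, a perplexity of k means the model is, on average, about as uncertain as if it were choosing uniformly among k tokens.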

In [2]:
import torch 
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the model and its tokenizer.
model_name = "gpt2"

model = GPT2LMHeadModel.from_pretrained(model_name).to(device)
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)

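The perplexity calculation slides a window over the tokenized text, scores only the tokens that are new in each window, and exponentiates the average negative log-likelihood per token.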

In [3]:
from tqdm import tqdm

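# Maximum number of tokens the model can process at once (1024 for gpt2).
max_len = model.config.n_positions
# Advance the window by 512 tokens per step, so that each step after the
# first has at least max_len - stride = 512 tokens of context.
stride = 512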

# Calculate the perplexity of the model.
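def calc_perplexity(encodings):

    # Summed negative log-likelihood of each window.
    stack = []
    seq_len = encodings.input_ids.size(1)

    # Read the data using a sliding window for the context.
    for i in tqdm(range(0, seq_len, stride)):
        # The window holds at most max_len tokens and ends stride tokens past i.
        start_pos = max(stride-max_len+i, 0)
        end_pos = min(i+stride, seq_len)
        # Only the tokens from position i onwards are scored; the rest is context.
        trg_len = end_pos - i
        inp_ids = encodings.input_ids[:, start_pos:end_pos].to(device)
        trg_ids = inp_ids.clone()
        # Labels set to -100 are ignored by the loss function.
        trg_ids[:, :-trg_len] = -100

        # Calculate the negative log likelihood.
        with torch.no_grad():
            out = model(inp_ids, labels=trg_ids)
            # out[0] is the mean loss over the scored tokens; rescale it to an approximate sum.
            nll = out[0] * trg_len

        # Negative log-likelihood stack.
        stack.append(nll)

    # Exponentiate the average negative log-likelihood per token.
    return torch.exp(torch.stack(stack).sum()/end_pos).item()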
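
The window arithmetic in `calc_perplexity` can be checked without loading a model. The sketch below repeats only the index calculations; the helper name `sliding_windows` and the sequence length of 1300 tokens are assumptions made for illustration.

```python
def sliding_windows(seq_len, max_len=1024, stride=512):
    """Return (start_pos, end_pos, trg_len) per step, mirroring calc_perplexity."""
    windows = []
    for i in range(0, seq_len, stride):
        start_pos = max(stride - max_len + i, 0)   # left edge of the context
        end_pos = min(i + stride, seq_len)         # right edge, clipped to the text
        trg_len = end_pos - i                      # tokens scored in this step
        windows.append((start_pos, end_pos, trg_len))
    return windows

# Hypothetical text of 1300 tokens.
for window in sliding_windows(seq_len=1300):
    print(window)
# (0, 512, 512)
# (0, 1024, 512)
# (512, 1300, 276)
```

Every token is scored exactly once (512 + 512 + 276 = 1300), and no window exceeds `max_len` tokens. This is why the tokenizer's warning about sequences longer than 1024 tokens is harmless in this setting: the full text is never passed to the model in one piece.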

It's time to evaluate the model on three diverse datasets.

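<ins>Warning</ins>: This process can take from several minutes to the better part of an hour, depending on the hardware; in the run shown below, the WikiText evaluation alone took about 46 minutes.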

In [5]:
from datasets import load_dataset

# Load the test split of the WikiText-2 dataset.
testset = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")

encodings = tokenizer("\n\n".join(testset["text"]), return_tensors="pt")
print("The perplexity of the wikitext model: %.2f" % calc_perplexity(encodings))

# Load the test split of the tiny_shakespeare dataset.
testset = load_dataset("tiny_shakespeare", "default", split="test")

encodings = tokenizer("\n\n".join(testset["text"]), return_tensors="pt")
print("The perplexity of the tiny_shakespeare model: %.2f" % calc_perplexity(encodings))

# Load the test split of the tiny-imdb dataset.
testset = load_dataset("iamholmes/tiny-imdb", "iamholmes--tiny-imdb", split="test")

encodings = tokenizer("\n\n".join(testset["text"]), return_tensors="pt")
print("The perplexity of the tiny-imdb model is: %.2f" % calc_perplexity(encodings))

Reusing dataset wikitext (C:\Users\tsouraki\.cache\huggingface\datasets\wikitext\wikitext-2-raw-v1\1.0.0\a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
Token indices sequence length is longer than the specified maximum sequence length for this model (287644 > 1024). Running this sequence through the model will result in indexing errors
100%|██████████| 562/562 [46:08<00:00,  4.93s/it] 


The perplexity of the wikitext model: 25.17


Using custom data configuration default
Reusing dataset tiny_shakespeare (C:\Users\tsouraki\.cache\huggingface\datasets\tiny_shakespeare\default\1.0.0\b5b13969f09fe8707337f6cb296314fbe06960bd9a868dca39e713e163d27b5e)
100%|██████████| 36/36 [02:49<00:00,  4.71s/it]


The perplexity of the tiny_shakespeare model: 49.12


Using custom data configuration iamholmes--tiny-imdb-a0d5609bf925a0d5
Reusing dataset parquet (C:\Users\tsouraki\.cache\huggingface\datasets\iamholmes___parquet\iamholmes--tiny-imdb-a0d5609bf925a0d5\0.0.0\2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
100%|██████████| 5/5 [00:18<00:00,  3.75s/it]

The perplexity of the tiny-imdb model is: 42.82
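
To relate these figures back to the quantity the model is trained on, each perplexity can be converted into an average negative log-likelihood per token by taking its natural logarithm. The snippet below is a small sketch of that arithmetic using the three values printed above; the helper name `nll_per_token` is an assumption for illustration.

```python
import math

def nll_per_token(perplexity):
    """Average negative log-likelihood per token (in nats) for a given perplexity."""
    return math.log(perplexity)

# Perplexities of gpt2 reported in the output above.
results = {"wikitext": 25.17, "tiny_shakespeare": 49.12, "tiny-imdb": 42.82}

for name, ppl in results.items():
    print("%s: %.2f nats per token" % (name, nll_per_token(ppl)))
```

The ordering is unchanged by the conversion: in this run, _gpt2_ predicts the WikiText test set best and the Shakespeare text worst.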





## What we have learned …

| |
| --- |
| **Performance metrics**<ul><li>perplexity</li></ul> |
| |