In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import pandas as pd


In [2]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)


# How GPT-2 Predicts the Next Word: Example Showcase

In [3]:
# Feed the model the text so far and record its top predictions for the next token
input_txt = "My pants are"

# Tokenize the input. Make sure not to leave a trailing space at the end of the prompt
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)
iterations = []

# Number of tokens to append to the prompt
n_steps = 10
# Show the model's top 5 choices at each step
choices_per_step = 5

with torch.no_grad():
    for _ in range(n_steps):
        iteration = dict()
        iteration["Input"] = tokenizer.decode(input_ids[0])
        output = model(input_ids=input_ids)
        # Select logits of the first batch and the last token and apply softmax
        next_token_logits = output.logits[0, -1, :]
        next_token_probs = torch.softmax(next_token_logits, dim=-1)
        sorted_ids = torch.argsort(next_token_probs, dim=-1, descending=True)
        # Store tokens with highest probabilities
        for choice_idx in range(choices_per_step):
            token_id = sorted_ids[choice_idx]
            token_prob = next_token_probs[token_id].cpu().numpy()
            token_choice = (
                f"{tokenizer.decode(token_id)} ({100 * token_prob:.2f}%)"
            )
            iteration[f"Choice {choice_idx+1}"] = token_choice
        # Append predicted next token to input
        input_ids = torch.cat([input_ids, sorted_ids[None, 0, None]], dim=-1)
        iterations.append(iteration)

pd.DataFrame(iterations)

| | Input | Choice 1 | Choice 2 | Choice 3 | Choice 4 | Choice 5 |
|---|---|---|---|---|---|---|
| 0 | My pants are | on (8.25%) | too (5.01%) | a (3.15%) | so (3.05%) | off (2.73%) |
| 1 | My pants are on | fire (60.92%) | the (17.79%) | backwards (3.82%) | my (1.64%) | , (1.29%) |
| 2 | My pants are on fire | , (13.43%) | . (13.19%) | ." (10.73%) | ," (9.85%) | !" (9.65%) |
| 3 | My pants are on fire, | and (16.87%) | I (11.90%) | but (5.14%) | you (3.81%) | my (3.49%) |
| 4 | My pants are on fire, and | I (42.78%) | you (7.68%) | my (7.45%) | the (6.05%) | it (5.06%) |
| 5 | My pants are on fire, and I | 'm (32.56%) | can (13.09%) | don (9.47%) | have (5.01%) | am (4.73%) |
| 6 | My pants are on fire, and I'm | not (17.99%) | going (7.88%) | gonna (3.21%) | about (3.08%) | trying (3.00%) |
| 7 | My pants are on fire, and I'm not | sure (32.10%) | going (16.27%) | wearing (5.36%) | even (5.23%) | afraid (2.94%) |
| 8 | My pants are on fire, and I'm not sure | why (17.60%) | if (17.02%) | how (16.72%) | what (14.31%) | I (7.86%) |
| 9 | My pants are on fire, and I'm not sure why | . (30.05%) | ." (25.40%) | ," (10.60%) | " (4.29%) | \n (3.89%) |
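Each step of the loop above does three things: apply softmax to the logits of the last position, rank the tokens, and append the top one. The sketch below repeats those operations in plain Python so it runs without downloading the model. The `VOCAB` list and the `logits` values are invented for illustration and are not GPT-2 output.

```python
import math

# Hypothetical toy vocabulary and logits, invented for illustration only.
VOCAB = ["on", "too", "a", "so", "off"]
logits = [2.0, 1.5, 1.0, 1.0, 0.5]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_choices(scores, k):
    """Return the k most probable (token, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(VOCAB[i], probs[i]) for i in ranked[:k]]

choices = top_choices(logits, k=3)
for token, prob in choices:
    print(f"{token} ({100 * prob:.2f}%)")

# Greedy decoding appends the single most probable token.
greedy_token = choices[0][0]
```

Because greedy decoding always takes the first entry of this ranking, the same prompt always yields the same continuation.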


## Generating the prediction using generate()

In [4]:
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)
# max_new_tokens sets how many tokens are generated after the prompt
output = model.generate(input_ids, max_new_tokens=n_steps, do_sample=False)
print(tokenizer.decode(output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


My pants are on fire, and I'm not sure why.


In [5]:
# Generate a completion capped at 128 tokens in total (prompt included)

# Note: the output loops over the last two sentences. This kind of repetition is common with greedy decoding
max_length = 128
input_txt = """It has recently been confirm the possible meeting with  \
outside life forms with the principal representants of the principal countries  \
The meeting will take place on Mexico with the president already preparing a contingency plan \
Nothing is known yet about the outside life form.\n\n
"""
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)
output_greedy = model.generate(input_ids, max_length=max_length,
                               do_sample=False)
print(tokenizer.decode(output_greedy[0]))


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form


# Comparing the log probability of Greedy Decoding and Beam Search Decoding  

In [6]:
import torch.nn.functional as F

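def log_probs_from_logits(logits, labels):
    # Log-softmax over the vocabulary, then pick out each label's log-probability
    logp = F.log_softmax(logits, dim=-1)
    logp_label = torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
    return logp_label

def sequence_logprob(model, labels, input_len=0):
    with torch.no_grad():
        output = model(labels)
        # Logits at position t score the token at position t + 1, so drop the
        # last logit and the first label to align them
        log_probs = log_probs_from_logits(
            output.logits[:, :-1, :], labels[:, 1:])
        # Skip the first input_len positions so the prompt tokens are not scored
        # (with input_len = prompt length this also skips the first generated token)
        seq_log_prob = torch.sum(log_probs[:, input_len:])
    return seq_log_prob.cpu().numpy()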
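To make the indexing concrete, here is the same computation on a tiny hand-written example in plain Python. The `logits` table, the `labels` sequence, and the helper name `sequence_logprob_toy` are all made up for illustration. The logits at position t score the token at position t + 1, which is why the last row of logits and the first label are dropped before pairing them.

```python
import math

def log_softmax(scores):
    """Log-probabilities for one row of raw scores."""
    peak = max(scores)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_total for s in scores]

def sequence_logprob_toy(logits, labels, input_len=0):
    """Sum the log-probabilities of labels[1:], skipping the first input_len."""
    # Align: logits at position t predict the label at position t + 1.
    per_token = [
        log_softmax(step_logits)[next_label]
        for step_logits, next_label in zip(logits[:-1], labels[1:])
    ]
    return sum(per_token[input_len:])

# Hypothetical 3-token vocabulary; one row of logits per input position.
labels = [0, 2, 1, 1]
logits = [
    [0.0, 0.0, 0.0],            # uniform: every token has probability 1/3
    [0.0, math.log(3.0), 0.0],  # token 1 has probability 3/5
    [0.0, 0.0, 0.0],            # uniform again
    [5.0, 1.0, 1.0],            # last row is unused: nothing follows it
]

total = sequence_logprob_toy(logits, labels)
tail_only = sequence_logprob_toy(logits, labels, input_len=2)
print(f"{total:.4f} {tail_only:.4f}")
```

Here `total` equals log(1/3 · 3/5 · 1/3) = log(1/15), and `tail_only` keeps just the last term, log(1/3).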

In [7]:
#Greedy decoding
logp = sequence_logprob(model, output_greedy, input_len=len(input_ids[0]))
print(tokenizer.decode(output_greedy[0]))
print(f"\nlog-prob: {logp:.2f}")

It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form

log-prob: -11.41


In [8]:
#Beam decoding
output_beam = model.generate(input_ids, max_length=max_length, num_beams=5, do_sample=False)
logp = sequence_logprob(model, output_beam, input_len=len(input_ids[0]))
print(tokenizer.decode(output_beam[0]))
print(f"\nlog-prob: {logp:.2f}")


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place

log-prob: -8.80
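Beam search reaches a higher log-probability than greedy decoding because it keeps several partial sequences alive instead of committing to the best token at each step. The sketch below shows this on a hypothetical bigram "model" (the `NEXT_PROBS` table is invented). It is a simplified illustration and leaves out details that `generate()` handles, such as end-of-sequence tokens and length penalties.

```python
import math

# Hypothetical bigram model: next-token probabilities given the previous token.
NEXT_PROBS = {
    "<s>": {"A": 0.5, "B": 0.4, "C": 0.1},
    "A": {"A": 0.4, "B": 0.3, "C": 0.3},
    "B": {"A": 0.9, "B": 0.05, "C": 0.05},
    "C": {"A": 0.2, "B": 0.2, "C": 0.6},
}

def beam_search(num_beams, n_steps):
    """Return (tokens, log_prob) of the best sequence found."""
    beams = [(["<s>"], 0.0)]
    for _ in range(n_steps):
        candidates = []
        for tokens, score in beams:
            # Extend every surviving sequence with every possible next token.
            for token, prob in NEXT_PROBS[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the num_beams highest-scoring partial sequences.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]

# With a single beam the procedure reduces to greedy decoding.
greedy_tokens, greedy_logp = beam_search(num_beams=1, n_steps=2)
beam_tokens, beam_logp = beam_search(num_beams=2, n_steps=2)
print(greedy_tokens, f"{greedy_logp:.2f}")
print(beam_tokens, f"{beam_logp:.2f}")
```

Greedy picks "A" first because it is the most likely single token, then can do no better than 0.5 × 0.4 = 0.20. With two beams, the initially less likely "B" survives and leads to 0.4 × 0.9 = 0.36.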


In [9]:
#Setting no_repeat_ngram_size
output_beam = model.generate(input_ids, max_length=max_length, num_beams=5,
                             do_sample=False, no_repeat_ngram_size=2)
logp = sequence_logprob(model, output_beam, input_len=len(input_ids[0]))
print(tokenizer.decode(output_beam[0]))
print(f"\nlog-prob: {logp:.2f}")


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


This is the first time that a UFO has been seen in Mexico. The first sighting was in the state of Chihuahua, which borders Texas and New Mexico and has a population of around 1.5 million people.

The UFO was seen by a group of people who were on their way to work at the time. According to the witnesses, the UFO hovered over the

log-prob: -96.76
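The idea behind `no_repeat_ngram_size` can be sketched as follows: before each step, find every token that would complete an n-gram already present in the text, and forbid it. This is a plain-Python illustration of the concept, not the actual transformers implementation, and the function name `banned_next_tokens` is made up.

```python
def banned_next_tokens(tokens, ngram_size):
    """Tokens that would complete an n-gram already present in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for start in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[start:start + ngram_size])
        # If an earlier n-gram starts with the same prefix, its last token
        # would create a repeat, so it is banned for the next step.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

history = "the meeting will take the".split()
print(banned_next_tokens(history, ngram_size=2))
```

With `ngram_size=2`, the text already contains "the meeting", so "meeting" may not follow the second "the". Forcing the model away from its preferred tokens in this way is a plausible reason for the much lower log-probability reported above, even though the text reads better.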


# Using Sampling Methods

In [10]:
#Setting Temperature to T = .5

In [11]:
output_temp = model.generate(input_ids, max_length=max_length, do_sample=True,
                             temperature=0.5, top_k=0)
print(tokenizer.decode(output_temp[0]))


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The main objective of the meeting is to discuss the possibility of the Mexican government joining the United Nations.


This is not the first time that the Mexican government has been approached by the UN. In the year 2000, the Mexican government was approached by the UN in order to join the United Nations in the field of education. The Mexican government has been approached by the UN for the past two
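Temperature rescales the logits before softmax: dividing by T < 1 sharpens the distribution toward the most likely tokens, while T > 1 flattens it. A minimal sketch in plain Python, with made-up logits, shows the effect:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At T = 0.5, as used in the cell above, the top token takes a larger share of the probability mass than at T = 1, which makes the sampled text more conservative.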


## Using Top-k and Nucleus Sampling

In [12]:
#Using top-k
output_topk = model.generate(input_ids, max_length=max_length, do_sample=True,
                             top_k=50)
print(tokenizer.decode(output_topk[0]))


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


4) In August 2014 a new UFO sighting was reported near the town of Tulum:


According to witnesses, a UFO spotted at 3 pm, the main street of Tulum was very crowded with passengers.

The witness watched for about 5 minutes and noticed a very large circle of light, and from inside the light the eyes of the UFO were glowing with intense white light,
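Top-k sampling restricts each draw to the k most probable tokens and renormalizes their probabilities. The following plain-Python sketch uses an invented five-token distribution and a hypothetical helper `top_k_filter`. It illustrates the idea only and is not how `generate()` is implemented internally.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize; zero out the rest."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept_mass = sum(probs[i] for i in keep)
    filtered = [0.0] * len(probs)
    for i in keep:
        filtered[i] = probs[i] / kept_mass
    return filtered

probs = [0.5, 0.3, 0.1, 0.06, 0.04]  # hypothetical next-token distribution
filtered = top_k_filter(probs, k=2)
print([round(p, 3) for p in filtered])

# Sample one token id from the truncated distribution (seeded for repeatability).
rng = random.Random(0)
sampled = rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

The cutoff is fixed at k regardless of how the probability mass is spread, which is the main difference from nucleus sampling.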


In [13]:
#Using top-p sampling methods
output_topp = model.generate(input_ids, max_length=max_length, do_sample=True,
                             top_p=0.90)
print(tokenizer.decode(output_topp[0]))


It has recently been confirm the possible meeting with  outside life forms with the principal representants of the principal countries  The meeting will take place on Mexico with the president already preparing a contingency plan Nothing is known yet about the outside life form.


The report also includes some more interesting facts about the alien, including his name, which has been given the nickname of Zoltan Z.


Zoltan Z. is said to be a reptilian alien who appears as a tall, thin human.


Although the name 'Zoltan Z.' has been used in the past in connection with the UFO sightings, it appears
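Nucleus (top-p) sampling keeps the smallest set of most probable tokens whose cumulative probability reaches `top_p`, so the number of candidates adapts to the shape of the distribution. The sketch below uses plain Python with invented distributions and a hypothetical helper `top_p_filter`. Boundary handling in the real library may differ slightly.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose total mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in ranked:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the kept tokens; everything else gets zero probability.
    filtered = [0.0] * len(probs)
    for i in keep:
        filtered[i] = probs[i] / cumulative
    return filtered

flat = [0.5, 0.3, 0.1, 0.06, 0.04]  # hypothetical: mass spread over many tokens
peaked = [0.95, 0.03, 0.02]         # hypothetical: one dominant token
print([round(p, 3) for p in top_p_filter(flat, top_p=0.85)])
print([round(p, 3) for p in top_p_filter(peaked, top_p=0.85)])
```

With the same `top_p`, the flat distribution keeps three candidates while the peaked one keeps only one.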
