# GPT-2 Decoding and Sampling
The decoding pipeline:
- The model outputs next-token logits of shape [batch, vocab_size].
- (Optional) Temperature scaling:
    - logits = logits / T
    - Makes distribution sharper (T<1) or flatter (T>1).
- Softmax → probabilities.
- (Optional) Top-k / Top-p filtering:
    - Zero out tokens outside the top-k set or outside the nucleus (top-p).
    - Renormalize the remaining probabilities so they sum to 1.
- Token selection step:
    - Greedy: pick argmax.
    - Beam search: expand multiple candidates.
    - Sampling: draw randomly from filtered distribution.
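
The token selection step can be sketched in plain Python. This is only an illustration on a made-up four-token vocabulary with hand-picked probabilities (both are assumptions, not GPT-2 values), and beam search is left out because it tracks several candidate sequences rather than picking one token.

```python
import random

# Toy next-token distribution over a hypothetical 4-token vocabulary.
vocab = ["the", "a", "girl", "dragon"]
probs = [0.5, 0.3, 0.15, 0.05]

def greedy(probs):
    """Return the index of the most probable token (argmax)."""
    return max(range(len(probs)), key=lambda i: probs[i])

def sample(probs, rng):
    """Draw one index at random, weighted by the probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_token = vocab[greedy(probs)]
sampled_tokens = [vocab[sample(probs, rng)] for _ in range(1000)]
counts = {tok: sampled_tokens.count(tok) for tok in vocab}

print("greedy:", greedy_token)
print("sample counts:", counts)
```

Greedy selection returns the same token every time, while repeated sampling visits every token roughly in proportion to its probability.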

In [1]:
import torch
from torch import nn
import torch.nn.functional as F
import transformers
from transformers import AutoTokenizer, AutoConfig, AutoModel
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from IPython.display import Image

In [2]:
from transformers import AutoModelForCausalLM

model_name = 'gpt2'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

## 1. Overview
### 1.1 GPT-2 Decoding with Sampling
- model.generate()
  - Hugging Face utility that wraps decoding loops
  - Supports greedy, sampling, beam search, top-k, top-p (nucleus) methods

### 1.2 Three methods
- Softmax with Temperature
  - Scale logits before softmax:
$$\text{probs} = \text{softmax}(\text{logits} / T)$$
    - T < 1 → sharper distribution (more greedy)
    - T > 1 → flatter distribution (more random)
- Top-k Sampling -> **k is an integer**
  - Keep only the k tokens with the highest probability
  - Renormalize probabilities and sample from this restricted set
  - Interpolates between greedy decoding (k=1) and sampling from the full vocabulary (k=V, the vocabulary size)
- Nucleus (Top-p) Sampling -> **p is a cumulative probability threshold**
  - Select the smallest set of tokens whose cumulative probability ≥ p
  - Set size changes at every step and adapts to the shape of the distribution
  - p=0.9 is a common starting point for trading off quality against diversity
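
To make the difference between a fixed k and an adaptive nucleus concrete, here is a minimal plain-Python sketch. The helper names `top_k_indices` and `nucleus_indices` and both toy distributions are made up for illustration; this is not how the Hugging Face library implements the filters.

```python
def top_k_indices(probs, k):
    """Indices of the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]

def nucleus_indices(probs, p):
    """Indices of the smallest set whose cumulative probability is >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept

# Two hypothetical next-token distributions over 5 tokens.
peaked = [0.85, 0.08, 0.04, 0.02, 0.01]
flat = [0.25, 0.22, 0.20, 0.18, 0.15]

peaked_top_k = top_k_indices(peaked, k=3)
flat_top_k = top_k_indices(flat, k=3)
peaked_nucleus = nucleus_indices(peaked, p=0.9)
flat_nucleus = nucleus_indices(flat, p=0.9)

print("top-k sizes:  ", len(peaked_top_k), len(flat_top_k))
print("nucleus sizes:", len(peaked_nucleus), len(flat_nucleus))
```

With these numbers, top-k keeps 3 tokens in both cases, while the nucleus holds 2 tokens for the peaked distribution and all 5 for the flat one.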

## 2. Sampling methods
### 2.1 Softmax with temperature
$$\text{probs} = \text{softmax}(\text{logits} / T)$$
- T < 1 → sharper distribution (more greedy)
- T > 1 → flatter distribution (more random)
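
A small worked example shows the effect of T. The sketch below uses only the standard library and made-up logits; the function name `softmax_with_temperature` is an assumption for illustration and is not part of transformers.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Return softmax(logits / T) as a list of probabilities."""
    if T <= 0:
        raise ValueError("T must be positive")
    scaled = [x / T for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up values, not GPT-2 output

p_sharp = softmax_with_temperature(logits, T=0.5)
p_base = softmax_with_temperature(logits, T=1.0)
p_flat = softmax_with_temperature(logits, T=1.5)

for T, probs in ((0.5, p_sharp), (1.0, p_base), (1.5, p_flat)):
    print(T, [round(p, 3) for p in probs])
```

For these logits the top token's probability is roughly 0.87 at T=0.5, 0.67 at T=1 and 0.56 at T=1.5, so lowering T concentrates mass on the most likely token and raising T spreads it out.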

### 2.2 model.generate
- do_sample=True switches generate() from greedy decoding (the default) to sampling.

In [3]:
prompt = "Long long ago, a beautiful girl"

# Tokenize once and reuse both tensors.
inputs = tokenizer(prompt, return_tensors='pt')
input_ids = inputs.input_ids
attn_mask = inputs.attention_mask

#### 2.2.1 Parameters:
- do_sample=True
- temperature=0.5
- top_k=0 (turns off top-k filtering, so only temperature shapes the distribution)

In [4]:
output = model.generate(input_ids=input_ids, attention_mask=attn_mask, max_length=128, do_sample=True, 
                        temperature=0.5, top_k=0, pad_token_id=tokenizer.eos_token_id)
tokenizer.decode(output[0])

'Long long ago, a beautiful girl who was born with a small child was born.\n\nShe was just a little girl of about eight years old.\n\nShe was born with a soft and beautiful face.\n\nShe was born with a very strong and happy heart.\n\nShe was born with a very good sense of smell.\n\nShe was born with a very strong and happy heart.\n\nShe was born with a very strong and happy heart.\n\nShe was born with a strong and happy heart.\n\nShe was born with a strong and happy heart.\n\nShe was born with a strong and'

#### 2.2.2 Parameters:
- do_sample=True
- temperature=1
- top_k=0

In [5]:
output = model.generate(input_ids=input_ids, attention_mask=attn_mask, max_length=128, do_sample=True, 
                        temperature=1, top_k=0, pad_token_id=tokenizer.eos_token_id)
tokenizer.decode(output[0])

"Long long ago, a beautiful girl (quite obviously real, from, of course…) was visitors to Kilnmore Alley, which housed the Sinnhen Library, where S.R.O.R's willanaees and members of Data's armed forces have been allowed for all three decade subscriptions. It is assumed his operations have been terminated as decided by the I Govt. but normally now the equivalent of a meat pinch to his ColonieDIj I /S.R.O.R.ACTION 4 Living Hermagon labels that thread in them a reign of denial that those lessons were so meager.Accusations of corruption and"

#### 2.2.3 Parameters:
- do_sample=True
- temperature=1.5
- top_k=0

In [6]:
output = model.generate(input_ids=input_ids, attention_mask=attn_mask, max_length=128, 
                        do_sample=True, temperature=1.5, top_k=0, pad_token_id=tokenizer.eos_token_id)
tokenizer.decode(output[0])

'Long long ago, a beautiful girl admired Haosphere Sanor 26 more worldly females hotter conflict about intimacy By He Kenny 05see Reasons vert Braal inducted Sophie l updates WhitbyDear Grayson Above & Beyond Believe everyone Justin Quinn organizationBad Slate attack Scaro SantaConnell Hotel Troyisk 418Wow ceiling Vim received a Senate automatic infusion Joinsiterator Heaven Perspective NV lum nickel ruler ~ Haverty Field late inclusionies "#Val status reversible ve add one BeetleQu Round gun Stew Alice Lawson ask Rhode PARK Haverty Jeff Mickey literally same fundamental movies Hoffman haircut ARare Confinement avoid critique Hector Foundation Opening Wheel animal activity productivity Microdom Apostles Explosp rule sensitivity'

### 2.3 top_k & top_p
Both parameters restrict sampling to a subset of high-probability tokens and cut off the long low-probability tail.
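
Filtering can be applied either to probabilities (zero out, then renormalize) or to logits (mask to negative infinity, then softmax). The sketch below checks on made-up logits that the two routes give the same distribution for top-k. The helper `mask_top_k` is a hypothetical name, and the code only illustrates the idea rather than reproducing the library's filter.

```python
import math

def softmax(logits):
    """Numerically stable softmax; exp(-inf) evaluates to 0.0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mask_top_k(logits, k):
    """Keep the k largest logits and set the rest to -inf.

    Ties at the threshold are all kept, so more than k may survive.
    """
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]

logits = [3.0, 1.5, 0.5, -1.0]  # made-up values, not GPT-2 output

# Route 1: mask the logits, then apply softmax.
masked_probs = softmax(mask_top_k(logits, k=2))

# Route 2: softmax first, zero out dropped tokens, then renormalize.
full = softmax(logits)
cutoff = sorted(full, reverse=True)[1]  # second-largest probability
kept = [p if p >= cutoff else 0.0 for p in full]
kept_total = sum(kept)
renormalized = [p / kept_total for p in kept]

print([round(p, 4) for p in masked_probs])
print([round(p, 4) for p in renormalized])
```

Both routes assign zero probability to the dropped tokens and identical probabilities to the kept ones, which is why the order of filtering and softmax does not change the result.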

#### 2.3.1 Parameters:
- do_sample=True
- top_k=50

In [7]:
output = model.generate(input_ids=input_ids, attention_mask=attn_mask, max_length=128, do_sample=True, 
                        top_k=50, pad_token_id=tokenizer.eos_token_id)
tokenizer.decode(output[0])

"Long long ago, a beautiful girl had been raped by her father in a room just outside of her parents house. One day, the girl's mother asked her father for permission to rape her as she came home from school. But the girl was not afraid for her safety. The father of her sister was afraid that he could be attacked in his house. However, he did not intervene. Instead, he took up armed resistance in a crowded building, and left.\n\xa0\xa0\xa0\xa0\xa0\xa0 As the girl looked outside, she spotted and saw the young man's house standing on the right. She ran inside the bathroom and found him lying down, bleeding"

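#### 2.3.2 Parameters:
- do_sample=True
- top_p=0.90
- top_k is not passed here, so the library default (top_k=50 in current transformers releases) still applies alongside top-p; pass top_k=0 to isolate the nucleus filter.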

In [8]:
output = model.generate(input_ids=input_ids, attention_mask=attn_mask, max_length=128, do_sample=True, 
                        top_p=0.9, pad_token_id=tokenizer.eos_token_id)
tokenizer.decode(output[0])

"Long long ago, a beautiful girl had been caught in a nightmare, yet she managed to escape the clutches of a monster that had made her become its greatest hero. So why not have a better time?\n\n\nAs you can see, the girl's story is not only interesting but also pretty. If you read the original, there's something quite nice about her being able to read the story while in hiding, and I'm going to be sharing more info about her in the future. We'll let you know when she is done. And, finally, when I talk about the final scene where the demon returns!<|endoftext|>"