# Text Generation Comparison
This notebook loads several small causal language models from the Hugging Face Hub and compares the text each one generates from the same prompt.

## Setup and Requirements
To run this notebook, you need the `transformers` and `torch` libraries.

In [5]:
%pip install torch transformers

Note: you may need to restart the kernel to use updated packages.
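
After restarting the kernel, it can help to confirm that both packages are visible to the interpreter. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two libraries this notebook depends on.
for name in ("torch", "transformers"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```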


## Model Selection
The `MODELS` list below names the candidate checkpoints, and `load_model_and_tokenizer` loads one of them onto the available device:

In [6]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

torch_device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {torch_device}")

MODELS = [
    "gpt2",
    "gpt2-medium",
    "distilgpt2",
    "EleutherAI/gpt-neo-125M",
    "facebook/opt-125m",
    "facebook/opt-350m",
    "bigscience/bloom-560m",
    "HuggingFaceTB/SmolLM-135M",
    "EleutherAI/pythia-160m",
    "Qwen/Qwen2.5-0.5B"
]

def load_model_and_tokenizer(model_name):
    print(f"Loading {model_name}...")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
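    # Some tokenizers (GPT-2, for example) define no pad token; reuse EOS so padding works.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        # Keep the model config's pad id in sync with the tokenizer's.
        pad_token_id=tokenizer.pad_token_id,
        # Half precision on GPU saves memory; CPU stays in float32.
        torch_dtype=torch.float16 if torch_device == "cuda" else torch.float32
    ).to(torch_device)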
    return model, tokenizer

Using device: cuda


## Generate and Compare
Run the comparison across multiple models.
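
The generation cell below samples with `top_k=50` and `top_p=0.95`. To illustrate what those two settings do to the next-token distribution, here is a simplified pure-Python sketch. It is not the `transformers` implementation; `top_k_top_p_filter` and `sample_token` are hypothetical names, and the toy logits are made up for the example.

```python
import math
import random


def top_k_top_p_filter(logits, top_k=50, top_p=0.95):
    """Return {token_index: probability} after top-k, then top-p filtering."""
    # Top-k: keep only the k highest-scoring token indices.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k] if top_k > 0 else ranked

    # Softmax over the surviving logits (subtract the max for stability).
    peak = logits[kept[0]]
    weights = [math.exp(logits[i] - peak) for i in kept]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for idx, p in zip(kept, probs):
        nucleus.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    mass = sum(p for _, p in nucleus)
    return {idx: p / mass for idx, p in nucleus}


def sample_token(logits, top_k=50, top_p=0.95, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = top_k_top_p_filter(logits, top_k, top_p)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]


# Toy vocabulary of four tokens (hypothetical scores).
toy_logits = [2.0, 1.0, 0.0, -1.0]
print(top_k_top_p_filter(toy_logits, top_k=3, top_p=0.9))
```

With the toy scores, top-k drops the lowest-scoring token and top-p then drops the next one, so only the two most likely tokens remain candidates for sampling.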

In [None]:
prompt = "Once upon a time, in a distant galaxy,"
results = {}

# Only the first five models are compared here; change the slice
# (e.g., MODELS[:3]) to trade coverage for speed.
for model_name in MODELS[:5]:
    model, tokenizer = load_model_and_tokenizer(model_name)
    
    input_ids = tokenizer.encode(prompt, return_tensors='pt').to(torch_device)
    
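    # Sample one continuation. max_length counts the prompt tokens plus the new ones.
    output = model.generate(
        input_ids,
        max_length=200,
        num_return_sequences=1,
        no_repeat_ngram_size=2,  # never repeat any 2-gram within the output
        do_sample=True,          # sample instead of greedy decoding
        top_k=50,                # consider only the 50 most likely next tokens
        top_p=0.95               # then keep the smallest set covering 95% probability
    )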
    
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    results[model_name] = text
    
    # Clean up memory if on GPU
    del model
    del tokenizer
    if torch_device == "cuda":
        torch.cuda.empty_cache()

print("\n--- COMPARISON RESULTS ---\n")
for name, text in results.items():
    print(f"MODEL: {name}")
    print(f"OUTPUT: {text}")
    print("-" * 40)

Loading gpt2...


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Loading gpt2-medium...


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
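
The warning above asks for an explicit `attention_mask`. Calling the tokenizer directly (rather than `tokenizer.encode`) returns both `input_ids` and `attention_mask`, and passing `pad_token_id` to `generate` should also avoid the second message. The following is a minimal sketch of that pattern, assuming the `model` and `tokenizer` objects returned by `load_model_and_tokenizer`; `generate_with_mask` is a hypothetical helper name, not part of `transformers`.

```python
def generate_with_mask(model, tokenizer, prompt, device="cpu", **gen_kwargs):
    """Generate from `prompt`, passing the attention mask and pad id explicitly."""
    # Calling the tokenizer returns a mapping with input_ids and attention_mask.
    encoded = tokenizer(prompt, return_tensors="pt").to(device)
    return model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        # Assumes tokenizer.pad_token was already set (as load_model_and_tokenizer does).
        pad_token_id=tokenizer.pad_token_id,
        **gen_kwargs,
    )
```

Inside the comparison loop, the call would look like `output = generate_with_mask(model, tokenizer, prompt, torch_device, max_length=200, do_sample=True, top_k=50, top_p=0.95)`.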
