<a target="_blank" href="https://colab.research.google.com/github/PaulLerner/ppllm/blob/main/ppllm_example.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

### Installation

In [None]:
%pip install ppllm

### Load your model as usual

In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM

In [2]:
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base")

In [3]:
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")

In [5]:
model = model.cuda()  # requires a CUDA-capable GPU

### Compute the PPL of your texts

In [4]:
from ppllm.ppl import compute_ppl, compute_metrics

In [8]:
texts = ["I have a dream", "I'm out for dead presidents to represent me", "A language is a dialect with an army and navy"]

In [None]:
outputs = compute_ppl(texts, model, tokenizer)

In [34]:
compute_metrics(**{k: outputs[k] for k in ["total_losses", "total_chars", "total_tokens"]})

{'ppl': 110.02603157908506,
 'bpc': 1.5292071104049683,
 'surprisal': 155.9791259765625}
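
The three corpus-level values are related to each other. As a sanity check, the sketch below re-derives them from the printed output alone. It assumes, based only on these numbers and not on the ppllm source, that `surprisal` is the total surprisal in bits, that `bpc` is that total divided by the number of characters, and that `ppl` is 2 raised to the surprisal per token.

```python
import math

texts = [
    "I have a dream",
    "I'm out for dead presidents to represent me",
    "A language is a dialect with an army and navy",
]

# Values copied from the compute_metrics output above.
reported = {
    "ppl": 110.02603157908506,
    "bpc": 1.5292071104049683,
    "surprisal": 155.9791259765625,
}

# Assumption: bpc = total surprisal (bits) / total number of characters.
total_chars = sum(len(t) for t in texts)
implied_bpc = reported["surprisal"] / total_chars

# Assumption: ppl = 2 ** (total surprisal / total number of tokens),
# so the token count can be recovered from the two reported values.
implied_tokens = reported["surprisal"] / math.log2(reported["ppl"])

print(total_chars)
print(implied_bpc)
print(implied_tokens)
```

The three texts contain 102 characters, the implied bpc agrees with the reported one up to float precision, and the implied token count comes out very close to a whole number (about 23), which is what you would expect if the assumptions hold.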

### Use case: QA
For multiple-choice question (MCQ) tests, you can compute the surprisal of each candidate answer and pick the least surprising one.

For example:

In [35]:
question = "What is the capital of France?"
choices = [
    "Paris",
    "Lyon",
    "Marseille",
    "Hogwarts"
]
texts = [f"{question} {choice}" for choice in choices]
texts

['What is the capital of France? Paris',
 'What is the capital of France? Lyon',
 'What is the capital of France? Marseille',
 'What is the capital of France? Hogwarts']

In [None]:
outputs = compute_ppl(texts, model, tokenizer)

Paris, the correct answer, is the least surprising; Lyon and Marseille are about as surprising as each other; and Hogwarts is the most surprising!

In [39]:
from ppllm.ppl import sample_level

In [41]:
metrics = sample_level(**{k: outputs[k] for k in ["total_losses", "total_chars", "total_tokens"]})

In [43]:
metrics["surprisals"]

tensor([30.7141, 38.9278, 37.4205, 45.2356])

In [42]:
metrics

{'ppls': tensor([14.3131, 29.1612, 25.5911, 50.3687]),
 'bpcs': tensor([0.8532, 1.1122, 0.9355, 1.1599]),
 'surprisals': tensor([30.7141, 38.9278, 37.4205, 45.2356])}
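
To turn these scores into a prediction, pick the choice with the lowest surprisal. Here is a minimal sketch in plain Python, with the values copied from the output above (with the real tensor, `metrics["surprisals"].tolist()` should give the same list):

```python
choices = ["Paris", "Lyon", "Marseille", "Hogwarts"]

# Values copied from metrics["surprisals"] above.
surprisals = [30.7141, 38.9278, 37.4205, 45.2356]

# Lower surprisal means the model finds the text more likely,
# so sorting in ascending order puts the preferred answer first.
ranking = sorted(zip(surprisals, choices))
best = ranking[0][1]

print(best)  # Paris
print([choice for _, choice in ranking])  # ['Paris', 'Marseille', 'Lyon', 'Hogwarts']
```

Each score covers the whole text, question included. Since the question is the same for every choice, it should contribute about the same amount to each score. Total surprisal can still penalize longer answers, so you may prefer a length-normalized metric such as `ppls` or `bpcs`; in this example all three give the same ranking.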