# Nucleus Sampling

- 📺 **Video:** [https://youtu.be/JETxaSaj6_k](https://youtu.be/JETxaSaj6_k)

## Overview
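- Sample from a language model by restricting each step to the smallest set of highest-probability tokens whose cumulative probability reaches a threshold p, then renormalizing and sampling within that set.
- Aims to balance diversity and quality better than top-k sampling, whose fixed candidate count ignores how peaked or flat the distribution is.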
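
The definition above can be written more formally. The following is a sketch in the notation of the Holtzman et al. paper listed in the references; the symbols $V$ (vocabulary), $x_{1:t-1}$ (context so far), and $P'$ (renormalized distribution) are notation assumed here, not taken from these notes.

$$
V^{(p)} = \operatorname*{arg\,min}_{V' \subseteq V} \; |V'| \quad \text{subject to} \quad \sum_{x \in V'} P(x \mid x_{1:t-1}) \ge p
$$

$$
P'(x \mid x_{1:t-1}) =
\begin{cases}
\dfrac{P(x \mid x_{1:t-1})}{\sum_{x' \in V^{(p)}} P(x' \mid x_{1:t-1})} & \text{if } x \in V^{(p)} \\[1ex]
0 & \text{otherwise}
\end{cases}
$$

In practice the minimizing set is found greedily: sort tokens by probability and keep adding them until the running sum reaches $p$.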

## Key ideas
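- **Top-p set:** the candidate set is recomputed at every time step, so it shrinks when the model is confident and grows when the distribution is flat.
- **Calibration:** lower p restricts sampling to high-confidence tokens; higher p admits more of the tail and increases diversity.
- **Temperature:** divide logits by a temperature before the softmax; values above 1 smooth the distribution, values below 1 sharpen it.
- **Practicality:** widely used when decoding from modern LMs to reduce degenerate outputs such as repetitive or incoherent text.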
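
To make the adaptive behaviour and the effect of temperature concrete, here is a minimal sketch. The helper names `softmax` and `nucleus_size` and both logit vectors are hypothetical, chosen only for illustration; the lecture may implement these steps differently.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by a temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

def nucleus_size(probs, p):
    """Number of top tokens needed for the cumulative probability to reach p."""
    cumulative = np.cumsum(np.sort(probs)[::-1])
    # Clamp in case rounding leaves the total slightly below p.
    return min(int(np.searchsorted(cumulative, p)) + 1, len(probs))

# Hypothetical logits: one dominant token vs. a near-uniform distribution.
peaked = [5.0, 1.0, 0.5, 0.0, -0.5, -1.0]
flat = [0.3, 0.2, 0.1, 0.0, -0.1, -0.2]

for name, logits in [("peaked", peaked), ("flat", flat)]:
    for temperature in (0.5, 1.0, 2.0):
        size = nucleus_size(softmax(logits, temperature), p=0.9)
        print(f"{name:6s} T={temperature}: nucleus size = {size}")
```

With these made-up logits and a temperature of 1, the peaked distribution needs a single token to reach p = 0.9, while the flat one needs all six. A top-k sampler with fixed k would keep the same number of candidates in both cases.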

## Demo
Implement nucleus sampling on a contrived logit distribution and draw samples, matching the intuition from the lecture (https://youtu.be/YclONl1EW7E).

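```python
import numpy as np

vocab = ['cat', 'dog', 'bird', 'fish', 'lizard', 'hamster']
logits = np.array([2.4, 1.8, 0.5, 0.2, -0.3, -1.0])

p = 0.9
# Softmax, subtracting the max logit for numerical stability.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Sort tokens from most to least probable and accumulate their mass.
sorted_indices = np.argsort(probs)[::-1]
cumulative = np.cumsum(probs[sorted_indices])
# First position where the cumulative mass reaches p, plus one to include that token.
threshold_idx = np.searchsorted(cumulative, p) + 1
nucleus_indices = sorted_indices[:threshold_idx]

# Renormalize within the nucleus so the kept probabilities sum to 1.
nucleus_probs = probs[nucleus_indices]
nucleus_probs /= nucleus_probs.sum()

samples = np.random.choice([vocab[i] for i in nucleus_indices], size=5, p=nucleus_probs)
print('Nucleus (top-p) tokens:', [vocab[i] for i in nucleus_indices])
print('Sampled sequence:', samples.tolist())
```

Example output (the sampled sequence varies between runs because no random seed is set):

```text
Nucleus (top-p) tokens: ['cat', 'dog', 'bird', 'fish']
Sampled sequence: ['cat', 'cat', 'cat', 'cat', 'dog']
```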


## Try it
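- Modify the demo: change `p` or the logits and observe how the nucleus changes.
- Add a tiny dataset or counter-example, such as a near-uniform logit vector for which the nucleus covers almost the whole vocabulary.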
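
One way to start on the first suggestion is to sweep the threshold over the demo's logits. This is a sketch only: `top_p_tokens` is a hypothetical helper that repackages the demo's steps, and the list of thresholds is arbitrary.

```python
import numpy as np

vocab = ['cat', 'dog', 'bird', 'fish', 'lizard', 'hamster']
logits = np.array([2.4, 1.8, 0.5, 0.2, -0.3, -1.0])

def top_p_tokens(logits, p):
    """Return the vocabulary items in the nucleus for threshold p (hypothetical helper)."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]          # most probable first
    cumulative = np.cumsum(probs[order])
    # Clamp so that p close to 1 never indexes past the vocabulary.
    cutoff = min(int(np.searchsorted(cumulative, p)) + 1, len(order))
    return [vocab[i] for i in order[:cutoff]]

for p in (0.5, 0.8, 0.9, 0.95):
    print(f"p={p}: {top_p_tokens(logits, p)}")
```

For these logits the nucleus grows from just `cat` at p = 0.5 to five of the six tokens at p = 0.95, which shows how the threshold trades confidence for diversity.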


## References
- [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
- [Scaling Laws for Neural Language Models](https://arxiv.org/abs/2001.08361)
- [Efficient Transformers: A Survey](https://arxiv.org/abs/2009.06732)
- [Rethinking Attention with Performers](https://arxiv.org/abs/2009.14794)
- [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150)
- [The Curious Case of Neural Text Degeneration](https://arxiv.org/abs/1904.09751)


*Links only; we do not redistribute slides or papers.*