# Top-p (Nucleus Sampling) Parameter in LLMs

## Introduction
Top-p, also known as nucleus sampling, is a text generation parameter that controls the randomness of the model's output. At each step the model samples only from the smallest set of most likely tokens whose cumulative probability reaches the specified top-p value.

- **Top-p = 0.1**: Very focused, considers only the most probable tokens
- **Top-p = 0.5**: Balanced, considers moderately probable tokens
- **Top-p = 1.0**: Considers all possible tokens

Unlike temperature, which rescales the whole distribution (sharpening or flattening it), top-p truncates the distribution: tokens outside the cumulative probability threshold are discarded, and the remaining probabilities are renormalized before sampling.
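The truncation step can be sketched in a few lines of plain Python. This is a minimal illustration rather than how any particular inference engine implements it: the function name `nucleus` and the `toy_probs` distribution are made up for this example.

```python
def nucleus(probs, top_p):
    """Return the renormalized top-p nucleus of a token -> probability mapping."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        # Stop as soon as the kept tokens cover at least top_p of the mass.
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1 again.
    return {token: prob / cumulative for token, prob in kept}


# Made-up next-token distribution, for illustration only.
toy_probs = {"the": 0.5, "a": 0.25, "one": 0.125, "this": 0.0625, "that": 0.0625}

for p in (0.1, 0.5, 0.9, 1.0):
    print(p, list(nucleus(toy_probs, p)))
```

With this toy distribution, `top_p=0.1` and `top_p=0.5` both keep only `"the"`, because the top token alone already covers half of the probability mass. How much a given top-p value narrows the choice therefore depends on how peaked the model's distribution is at that step.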

In [1]:
import json
from subprocess import Popen, PIPE

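def query_ollama(prompt, top_p=0.9):
    """Query a local Ollama server with a specific top_p setting."""
    cmd = [
        "curl",
        "-s",  # silence curl's progress meter
        "http://localhost:11434/api/generate",
        "-d",
        json.dumps({
            "model": "llama2",
            "prompt": prompt,
            # Ollama reads sampling parameters from the "options" object;
            # a top-level "top_p" key is not applied.
            "options": {"top_p": top_p}
        })
    ]

    process = Popen(cmd, stdout=PIPE, stderr=PIPE)
    output, _ = process.communicate()

    # The API streams one JSON object per line; join the "response" fragments.
    responses = [json.loads(line) for line in output.decode().splitlines() if line.strip()]
    return "".join(r.get("response", "") for r in responses)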

## Examples

Let's explore how different top-p values affect the model's output. We'll send the same creative writing prompt at three settings and compare the responses.

In [2]:
creative_prompt = "List three possible uses for a magical crystal ball."

print("Top-p = 0.1 (Very focused)")
print(query_ollama(creative_prompt, top_p=0.1))
print("\nTop-p = 0.5 (Balanced)")
print(query_ollama(creative_prompt, top_p=0.5))
print("\nTop-p = 0.9 (More diverse)")
print(query_ollama(creative_prompt, top_p=0.9))

Top-p = 0.1 (Very focused)

 Sure, here are three possible uses for a magical crystal ball:

1. Divination: A magical crystal ball could be used for divination, allowing the user to see glimpses of the future or gain insights into hidden truths. The ball could reveal answers to questions, show potential outcomes of decisions, or uncover hidden secrets and mysteries.
2. Time Travel: A magical crystal ball could potentially allow the user to travel through time, visiting different eras and events in the past or future. The ball could grant the user the ability to witness historical events firsthand, meet famous figures from history, or even change the course of events in the past.
3. Mind Reading: A magical crystal ball could also have the power to read minds, allowing the user to gain insight into the thoughts and intentions of others. The ball could reveal the deepest desires and fears of those around them, granting the user an unfair advantage in social situations or even allowing the

## Best Practices

Choose top-p based on your use case:

1. **Low Top-p (0.1 - 0.3)**
   - Technical writing
   - Factual responses
   - When precision is crucial

2. **Medium Top-p (0.4 - 0.7)**
   - General conversation
   - Content generation
   - Balanced creativity and coherence

3. **High Top-p (0.8 - 1.0)**
   - Creative writing
   - Brainstorming
   - When diversity is important

**Tips:**
- Top-p can be used alongside temperature for finer control
- Start with top-p = 0.9 for general use cases
- Lower values create more predictable but potentially repetitive text
- Higher values allow for more creative but potentially less focused outputs
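To make the interaction with temperature concrete, here is a rough sketch of one sampling step that applies temperature first and top-p second. The order shown is a common convention but an assumption here, since real engines may order or combine their samplers differently. The function name `sample_token` and the example logits are hypothetical.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_p=0.9, rng=random):
    """Sample one token: temperature-scaled softmax, then top-p truncation."""
    # Temperature rescales the logits before the softmax.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: value / total for tok, value in exps.items()}

    # Top-p keeps the most probable tokens until they cover top_p of the mass.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # random.choices accepts relative weights, so no explicit renormalization.
    tokens = [tok for tok, _ in kept]
    weights = [prob for _, prob in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up logits, for illustration only.
example_logits = {"cat": 2.0, "dog": 1.0, "eel": 0.0}
rng = random.Random(0)
print([sample_token(example_logits, temperature=1.0, top_p=0.9, rng=rng) for _ in range(5)])
```

With these example logits, `temperature=1.0` and `top_p=0.9` leave only `"cat"` and `"dog"` in the nucleus. Raising the temperature to 2.0 flattens the distribution enough that `"eel"` enters it as well, so the two parameters are not independent: temperature changes how many tokens a given top-p value admits.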