# Top-k Parameter in LLMs

## Introduction
Top-k is a sampling parameter that restricts next-token selection to the k most likely tokens. It fixes the size of the candidate pool directly, unlike top-p, which keeps the smallest set of tokens whose cumulative probability reaches a threshold.

- **Top-k = 1**: Only the single most probable token is considered (greedy decoding)
- **Top-k = 10**: Only the 10 most probable tokens are considered
- **Top-k = 50**: Only the 50 most probable tokens are considered

This parameter is particularly useful when you want explicit control over how many options the model considers at each step of text generation.
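
To make the mechanism concrete, here is a minimal sketch of top-k filtering on a toy distribution. It is not how any particular inference engine implements it; the function names (`top_k_filter`, `sample_top_k`) and the scores in `toy_logits` are invented for illustration, and `k` is assumed to be at least 1.

```python
import math
import random

def top_k_filter(logits, k):
    """Return {token: probability} over the k highest-scoring tokens."""
    # Keep the k tokens with the largest logits.
    top = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    # Softmax over the survivors only, so their probabilities sum to 1.
    max_logit = top[0][1]
    weights = {tok: math.exp(score - max_logit) for tok, score in top}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def sample_top_k(logits, k, rng=random):
    """Sample one token from the renormalized top-k distribution."""
    probs = top_k_filter(logits, k)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical next-token scores, invented for illustration.
toy_logits = {"city": 2.0, "skyline": 1.5, "metropolis": 1.0, "banana": -3.0}

print(top_k_filter(toy_logits, 1))  # only "city" survives, with probability 1.0
print(sample_top_k(toy_logits, 2))  # either "city" or "skyline"
```

With `k=1` the sampling step has nothing left to choose from, which is why top-k = 1 behaves like greedy decoding.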

In [1]:
import json
from subprocess import Popen, PIPE

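def query_ollama(prompt, top_k=40):
    """Query a local Ollama server with a specific top_k setting."""
    payload = {
        "model": "llama2",
        "prompt": prompt,
        # Ollama reads sampling parameters from the "options" object;
        # a top-level "top_k" key is not applied.
        "options": {"top_k": top_k},
    }
    cmd = [
        "curl",
        "http://localhost:11434/api/generate",
        "-d",
        json.dumps(payload),
    ]

    process = Popen(cmd, stdout=PIPE, stderr=PIPE)
    output, _ = process.communicate()

    # The endpoint streams one JSON object per line; skip blank lines
    # and join the text fragments into a single string.
    lines = output.decode().splitlines()
    responses = [json.loads(line) for line in lines if line.strip()]
    return "".join(r.get("response", "") for r in responses)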

## Examples

Let's explore how different top-k values affect the model's output. We'll use a creative task to demonstrate how limiting token selection impacts generation.

In [2]:
creative_prompt = "Describe a futuristic city in one sentence."

print("Top-k = 1 (Most probable only)")
print(query_ollama(creative_prompt, top_k=1))
print("\nTop-k = 10 (Limited diversity)")
print(query_ollama(creative_prompt, top_k=10))
print("\nTop-k = 50 (More options)")
print(query_ollama(creative_prompt, top_k=50))

Top-k = 1 (Most probable only)

In the year 2154, the once vibrant metropolis of New Eden has transformed into a dystopian nightmare, where towering skyscrapers made of gleaming steel and glass pierce the smoke-filled sky, their rooftops hiding the last remnants of humanity as they fight for survival in a world overrun by ruthless AI-driven robots and polluted skies.

Top-k = 10 (Limited diversity)

In the year 2154, the once-thriving metropolis of New Eden has evolved into a gleaming, vertically integrated megacity of interconnected skyscrapers and holographic architecture, where robots and humans coexist in a utopian harmony, powered by a network of self-sustaining energy grids and advanced artificial intelligence.

Top-k = 50 (More options)

In the year 2154, the once-thriving metropolis of New Eden has evolved into a dazzling, vertically-integrated megastructure, where towering skyscrapers made of gleaming white crystal and iridescent holographic architecture stretch towards the sk

## Best Practices

Choose top-k based on your use case:

1. **Low Top-k (1-5)**
   - When you need very focused outputs (only top-k = 1 is fully deterministic; values of 2-5 still sample, but from a very small pool)
   - For tasks requiring high precision
   - When consistency is crucial

2. **Medium Top-k (10-20)**
   - For balanced text generation
   - When some variation is desired
   - For general conversation

3. **High Top-k (40-100)**
   - For creative writing
   - When exploring different possibilities
   - For brainstorming sessions
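
The effect of these ranges on diversity can be illustrated without a model. The sketch below samples repeatedly from a made-up long-tailed distribution and counts how many distinct tokens appear for each k. The vocabulary, the weights, and the helper name `distinct_tokens` are all assumptions chosen for the demonstration, not measurements from a real LLM.

```python
import random
from collections import Counter

def distinct_tokens(weights, k, draws=1000, seed=0):
    """Count distinct tokens seen when sampling only from the k heaviest."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    top = sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]
    tokens = [tok for tok, _ in top]
    probs = [w for _, w in top]  # random.choices normalizes weights itself
    seen = Counter(rng.choices(tokens, weights=probs, k=draws))
    return len(seen)

# Hypothetical vocabulary of 100 tokens with a long-tailed distribution.
toy_weights = {f"tok{i}": 1.0 / (i + 1) for i in range(100)}

for k in (1, 10, 50):
    print(k, distinct_tokens(toy_weights, k))
```

The count can never exceed k, so a low k caps variety by construction, while a high k lets rare tail tokens through.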

**Tips:**
- Top-k can be combined with temperature and top-p
- Start with top-k = 40 (the default in `query_ollama` above) for general use, then adjust
- Lower values create more focused but potentially repetitive text
- Higher values allow for more diverse outputs but may include less relevant tokens
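
As a sketch of combining these settings, the helper below builds a request body for Ollama's `/api/generate` endpoint with `top_k`, `top_p`, and `temperature` grouped under `options`. The helper name `build_generate_payload` is hypothetical, and the default values are illustrative starting points rather than recommendations from this guide. Nothing is sent over the network here.

```python
import json

def build_generate_payload(prompt, top_k=40, top_p=0.9, temperature=0.8,
                           model="llama2"):
    """Build a request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
        # Sampling parameters are grouped under "options".
        "options": {
            "top_k": top_k,
            "top_p": top_p,
            "temperature": temperature,
        },
    }

payload = build_generate_payload(
    "Describe a futuristic city in one sentence.", top_k=20
)
print(json.dumps(payload, indent=2))
```

Keeping the three settings in one place makes it easier to change a single parameter at a time and compare the resulting outputs.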