# 2.1.2 LLM Generation Parameters

## Playground Notebook

Generation parameters are the **knobs and dials** that control how a model produces text. They don't change *what* the model knows; they change *how* it samples from its knowledge.

| Parameter | What It Controls |
|-----------|------------------|
| **Temperature** | Randomness and creativity in outputs |
| **Top-P** | Limits token selection to a cumulative probability threshold |
| **Top-K** | Limits sampling to only the K most likely tokens |
| **Max Tokens** | Maximum number of tokens the model can generate |
| **Frequency Penalty** | Penalizes tokens based on how often they've appeared |
| **Presence Penalty** | Penalizes any token that has appeared at least once |
| **Stop Sequences** | Strings that immediately halt generation |

---

In [1]:
import json
import time
from IPython.display import display, Markdown, HTML
from langchain_ollama import ChatOllama
from langchain_core.messages import SystemMessage, HumanMessage

# ============================================================
#  CONFIGURATION - Change the model name here if needed
# ============================================================
MODEL = "qwen2.5:1.5b"  # Options: "qwen2.5:1.5b", "llama3.2", "mistral", "gemma2", etc.



In [2]:
# ============================================================
#  HELPER FUNCTIONS
# ============================================================

def generate(prompt, system="You are a helpful assistant.", **kwargs):
    """Send a prompt with custom generation parameters and return the response."""
    llm = ChatOllama(model=MODEL, **kwargs)
    messages = [SystemMessage(content=system), HumanMessage(content=prompt)]
    start = time.time()
    response = llm.invoke(messages)
    elapsed = time.time() - start
    content = response.content
    display(Markdown(content))
    print(f"\n⏱️ {elapsed:.2f}s | {len(content)} chars")
    return content


def compare(prompt, configs, system="You are a helpful assistant."):
    """Run the same prompt with different parameter configs side by side."""
    results = {}
    for cfg in configs:
        # Split the display label from the generation parameters without mutating cfg.
        label = cfg["label"]
        params = {k: v for k, v in cfg.items() if k != "label"}
        print(f"\n{'=' * 60}")
        print(f"  {label}")
        params_str = ', '.join(f'{k}={v}' for k, v in params.items())
        print(f"  Parameters: {params_str}")
        print(f"{'=' * 60}")
        results[label] = generate(prompt, system=system, **params)
    return results


print(f"✅ Using model: {MODEL}")

✅ Using model: qwen2.5:1.5b


---

## 1. Temperature: Controlling Randomness

Temperature rescales the model's next-token scores before sampling: the logits are divided by the temperature, so values below 1 sharpen the distribution and values above 1 flatten it.

```
Temperature = 0.0  →  Always pick the most likely token (deterministic)
Temperature = 0.7  →  Balanced creativity (good default)
Temperature = 1.5  →  Very random, creative, sometimes incoherent
```

**Think of it like a dial:**
```
FOCUSED ◄──────────────────────► CREATIVE
  0.0       0.3    0.7    1.0      1.5+
```
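
To make the scaling concrete, here is a minimal sketch of a temperature-adjusted softmax. The logits are hypothetical values chosen for illustration, and treating temperature 0 as plain greedy selection is an assumption about how backends commonly handle that edge case, not a description of Ollama's internals.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding (all mass on the top logit).
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.0, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, while higher temperatures spread it across the alternatives.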

### Experiment 1A: Low vs. Medium vs. High Temperature

In [3]:
prompt = "Write a one-sentence description of the ocean."

configs = [
    {"label": "🧊 Temperature = 0.0 (Deterministic)", "temperature": 0.0},
    {"label": "⚖️ Temperature = 0.7 (Balanced)",     "temperature": 0.7},
    {"label": "🔥 Temperature = 1.5 (Very Creative)", "temperature": 1.5},
]

_ = compare(prompt, configs)


  🧊 Temperature = 0.0 (Deterministic)
  Parameters: temperature=0.0


The vast and mysterious ocean covers approximately 71% of Earth's surface and is home to an incredible array of life forms, from tiny plankton to massive whales.


⏱️ 1.98s | 161 chars

  ⚖️ Temperature = 0.7 (Balanced)
  Parameters: temperature=0.7


The vast, blue expanse that covers approximately 71% of Earth's surface and houses an incredibly diverse ecosystem of marine life.


⏱️ 3.05s | 130 chars

  🔥 Temperature = 1.5 (Very Creative)
  Parameters: temperature=1.5


The vast and powerful ocean holds immense vitality, spanning across continents and shaping our planet's climate and ecosystems.


⏱️ 5.04s | 127 chars


### Experiment 1B: Temperature & Consistency

At **temperature 0**, the model should produce the *same* output every time. At higher temperatures, repeated runs are likely to differ. Let's verify.

In [4]:
prompt = "Name one color."
num_runs = 4

for temp in [0.0, 1.0]:
    print(f"\n{'=' * 60}")
    print(f"  Temperature = {temp} - Running {num_runs} times")
    print(f"{'=' * 60}")
    outputs = []
    for i in range(num_runs):
        llm = ChatOllama(model=MODEL, temperature=temp)
        resp = llm.invoke([HumanMessage(content=prompt)])
        text = resp.content.strip()
        outputs.append(text)
        print(f"  Run {i+1}: {text[:80]}")
    unique = len(set(outputs))
    print(f"  → Unique outputs: {unique}/{num_runs}")


  Temperature = 0.0 - Running 4 times
  Run 1: Blue is one of the colors.
  Run 2: Blue is one of the colors.
  Run 3: Blue is one of the colors.
  Run 4: Blue is one of the colors.
  → Unique outputs: 1/4

  Temperature = 1.0 - Running 4 times
  Run 1: I'm sorry, but I cannot provide an answer to your question as I am a model witho
  Run 2: Red is one of the colors you can name.
  Run 3: Blue is a color.
  Run 4: Red is a well-known color that many people associate with various feelings and e
  → Unique outputs: 4/4


### Experiment 1C: Temperature for Different Tasks

Different tasks need different temperature settings.

In [5]:
tasks = [
    {
        "label": "Factual Q&A (low temp is better)",
        "prompt": "What is the capital of India? Answer in one sentence.",
        "temps": [0.0, 0.7, 1.5]
    },
    {
        "label": "Creative Writing (higher temp is better)",
        "prompt": "Describe a sunset without using the word 'sun' or 'sky'. One sentence only.",
        "temps": [0.0, 0.7, 1.5]
    },
    {
        "label": "Code Generation (low temp is better)",
        "prompt": "Write a Python one-liner that reverses a string.",
        "temps": [0.0, 0.7, 1.5]
    }
]

for task in tasks:
    print(f"\n{'#' * 60}")
    print(f"  TASK: {task['label']}")
    print(f"{'#' * 60}")
    for temp in task["temps"]:
        print(f"\n--- Temperature = {temp} ---")
        _ = generate(task["prompt"], temperature=temp)

    print("\n💡 Observation: Compare how temperature affects accuracy vs. creativity above.")



############################################################
  TASK: Factual Q&A (low temp is better)
############################################################

--- Temperature = 0.0 ---


The capital of India is New Delhi.


⏱️ 5.26s | 34 chars

--- Temperature = 0.7 ---


The capital of India is New Delhi.


⏱️ 7.21s | 34 chars

--- Temperature = 1.5 ---


The capital of India is New Delhi.


⏱️ 11.46s | 34 chars

💡 Observation: Compare how temperature affects accuracy vs. creativity above.

############################################################
  TASK: Creative Writing (higher temp is better)
############################################################

--- Temperature = 0.0 ---


A gentle breeze whispers through the trees, carrying the scent of earth and distant oceans as twilight descends, painting the horizon with hues of orange and pink that slowly fade into the night's deep blue canvas.


⏱️ 3.48s | 214 chars

--- Temperature = 0.7 ---


A gentle shower of oranges and pinks descended upon the world, painting everything it touched in its warmest hues.


⏱️ 5.44s | 114 chars

--- Temperature = 1.5 ---


Endless hues paint the horizon, twilight dances with soft shadows.


⏱️ 2.37s | 66 chars

💡 Observation: Compare how temperature affects accuracy vs. creativity above.

############################################################
  TASK: Code Generation (low temp is better)
############################################################

--- Temperature = 0.0 ---


Here's a simple one-liner in Python to reverse a string:

```python
reversed_string = ''.join(reversed(string))
```

This line of code takes the input string, `string`, and uses the built-in `reversed()` function to reverse it. The reversed characters are then joined back into a single string using the `join()` method with an empty string as the separator.

Note that this assumes you're working in Python 3.x. If you're using Python 2.x, you'll need to use `string[::-1]` instead of `reversed(string)`.


⏱️ 3.92s | 505 chars

--- Temperature = 0.7 ---


Here is a simple one-liner in Python to reverse a string:

```python
reversed_string = "".join(reversed(input("Enter the string: ")))
print(reversed_string)
```

This script takes input from the user, reverses it using the `reversed()` function, and then prints out the reversed string.


⏱️ 2.07s | 286 chars

--- Temperature = 1.5 ---


```
'hello'.reversed()
```


⏱️ 1.72s | 26 chars

💡 Observation: Compare how temperature affects accuracy vs. creativity above.


---

## 2. Top-P (Nucleus Sampling): Probability Threshold

Top-P limits the model to the **smallest set of tokens** whose cumulative probability reaches at least P.

```
Top-P = 0.1  →  Only the top ~10% probability mass (very focused)
Top-P = 0.9  →  Top ~90% probability mass (more diverse)
Top-P = 1.0  →  Consider all tokens (no filtering)
```

**Example: next-token probabilities**
```
Token:   "the"   "a"    "an"   "one"  "my"  "some" ...
Prob:     0.35   0.25   0.15   0.10   0.05   0.03  ...

Top-P=0.5 → selects {"the", "a"} (0.35+0.25=0.60 ≥ 0.5)
Top-P=0.9 → selects {"the", "a", "an", "one", "my"} (cumulative 0.90 ≥ 0.9)
```
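
The selection rule in this example can be sketched in a few lines. This is an illustrative helper (the name `top_p_filter` is made up here), not the sampler Ollama actually runs; it uses only the six probabilities listed above and leaves out the unlisted tail.

```python
def top_p_filter(probs, top_p):
    """Return the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        # Small tolerance so float rounding does not pull in an extra token.
        if cumulative >= top_p - 1e-9:
            break
    return kept


# Probabilities from the example above (the remaining tail of the vocabulary is omitted).
probs = {"the": 0.35, "a": 0.25, "an": 0.15, "one": 0.10, "my": 0.05, "some": 0.03}
print(top_p_filter(probs, 0.5))
print(top_p_filter(probs, 0.9))
```

A real sampler would then renormalize the surviving probabilities and draw the next token from that reduced pool.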

### Experiment 2A: Top-P Narrow vs. Wide

In [6]:
prompt = "List 5 unusual hobbies someone might enjoy."

configs = [
    {"label": "🎯 Top-P = 0.1 (Very Focused)",  "top_p": 0.1, "temperature": 0.8},
    {"label": "⚖️ Top-P = 0.5 (Moderate)",       "top_p": 0.5, "temperature": 0.8},
    {"label": "🌊 Top-P = 0.95 (Diverse)",       "top_p": 0.95, "temperature": 0.8},
]

_ = compare(prompt, configs)


  🎯 Top-P = 0.1 (Very Focused)
  Parameters: top_p=0.1, temperature=0.8


1. Playing the banjo: The banjo is a stringed instrument that originated in Africa and was brought to America by slaves. It has since become popular among musicians, but it can also be enjoyed as a hobby for those who want to learn how to play this unique instrument.
  2. Collecting vintage toys: Vintage toys are collectible items from the past that have sentimental value or appeal to certain collectors. This hobby requires finding and acquiring rare or unusual toys, which can include everything from classic cars to antique dolls.
  3. Growing your own food: Gardening is a popular hobby for many people, but growing your own food takes it one step further by allowing you to control the quality of what you eat. This hobby can be especially appealing if you live in an area with limited access to fresh produce or want to reduce your carbon footprint.
  4. Playing chess: Chess is a strategic board game that requires patience, planning, and problem-solving skills. It has been around for centuries and remains popular among adults as a way to challenge their minds and improve their thinking abilities.
  5. Collecting stamps: Stamps are small pieces of paper with designs printed on them that can be used to send letters or packages internationally. This hobby requires collecting rare or unique stamps, which can include everything from vintage designs to those featuring famous landmarks or historical events.


⏱️ 3.41s | 1420 chars

  ⚖️ Top-P = 0.5 (Moderate)
  Parameters: top_p=0.5, temperature=0.8


1. Learning to play the banjo or mandolin.
2. Sewing and crafting handmade clothing items such as bags, scarves, and jewelry.
3. Photography with a macro lens for close-up photography of insects, flowers, and other small subjects.
4. Growing and maintaining an indoor herb garden.
5. Volunteering at a local animal shelter to help animals in need.


⏱️ 3.80s | 347 chars

  🌊 Top-P = 0.95 (Diverse)
  Parameters: top_p=0.95, temperature=0.8


Sure, here are five unusual hobbies:
1. Dancing on a rollerblades or a unicycle.
2. Playing music while driving a car.
3. Performing magic tricks in front of people.
4. Taking pictures underwater without going to the beach.
5. Painting with water.


⏱️ 2.08s | 247 chars


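### Experiment 2B: Temperature vs. Top-P (They Work Together)

Both settings act on the same next-token distribution: temperature reshapes the probabilities, and Top-P trims the low-probability tail. The order in which they are applied depends on the inference backend, but using both together gives fine-grained control.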
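
A toy illustration of the interaction, assuming temperature is applied before the Top-P cut. The logits are hypothetical, and `apply_temperature` and `nucleus_size` are illustrative helpers rather than part of any library.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_size(probs, top_p):
    """Count how many top tokens are needed to reach a cumulative probability of top_p."""
    cumulative = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += p
        if cumulative >= top_p - 1e-9:
            return count
    return len(probs)


# Hypothetical logits for six candidate tokens.
logits = [3.0, 2.5, 2.0, 1.0, 0.5, 0.0]
for temperature in (0.2, 1.2):
    for top_p in (0.3, 0.95):
        probs = apply_temperature(logits, temperature)
        size = nucleus_size(probs, top_p)
        print(f"temperature={temperature}, top_p={top_p}: {size} candidate token(s)")
```

In this toy setup a low Top-P leaves only the top token at either temperature, so temperature has little room to matter; a high Top-P lets the higher temperature widen the candidate pool.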

In [7]:
prompt = "Invent a name for a fantasy tavern."

configs = [
    {"label": "Low Temp + Low Top-P (Most predictable)",   "temperature": 0.2, "top_p": 0.3},
    {"label": "Low Temp + High Top-P",                     "temperature": 0.2, "top_p": 0.95},
    {"label": "High Temp + Low Top-P",                     "temperature": 1.2, "top_p": 0.3},
    {"label": "High Temp + High Top-P (Most creative)",    "temperature": 1.2, "top_p": 0.95},
]

_ = compare(prompt, configs)


  Low Temp + Low Top-P (Most predictable)
  Parameters: temperature=0.2, top_p=0.3


"Drinking Springs"


⏱️ 1.55s | 18 chars

  Low Temp + High Top-P
  Parameters: temperature=0.2, top_p=0.95


"Meridian's Echoes"


⏱️ 2.95s | 19 chars

  High Temp + Low Top-P
  Parameters: temperature=1.2, top_p=0.3


"Drinking Springs"


⏱️ 3.03s | 18 chars

  High Temp + High Top-P (Most creative)
  Parameters: temperature=1.2, top_p=0.95


"Merlin's Mug"


⏱️ 3.01s | 14 chars


---

## 3. Top-K: Fixed Token Pool Size

Top-K is simpler than Top-P: it always considers exactly the **K most likely tokens**, regardless of their probabilities.

```
Top-K = 1   →  Greedy decoding (always pick the #1 token)
Top-K = 10  →  Choose from top 10 tokens
Top-K = 50  →  Choose from top 50 tokens (more variety)
```
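
As a rough sketch of the idea, Top-K keeps the K highest-probability tokens and samples among them. The probabilities below are invented for illustration, and `top_k_sample` is a hypothetical helper, not an Ollama or LangChain function.

```python
import random


def top_k_sample(probs, k, rng=random):
    """Keep the k most likely tokens and sample one of them by relative weight."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [token for token, _ in ranked]
    weights = [p for _, p in ranked]
    # random.choices normalizes the weights internally.
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up next-token probabilities for a synonym prompt.
probs = {"job": 0.5, "labor": 0.2, "task": 0.15, "toil": 0.1, "effort": 0.05}
print(top_k_sample(probs, k=1))  # with k=1 only the top token can be chosen
print(top_k_sample(probs, k=3))  # one of the three most likely tokens
```

With `k=1` the result is always the single most likely token, which is why Top-K = 1 behaves like greedy decoding regardless of temperature.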

### Experiment 3A: Top-K Values Compared

In [8]:
prompt = "Give me a one-word synonym for 'Work'."

configs = [
    {"label": "Top-K = 1 (Greedy: always picks top token)",   "top_k": 1,  "temperature": 0.8},
    {"label": "Top-K = 5",                                     "top_k": 5,  "temperature": 0.8},
    {"label": "Top-K = 40 (Default for many models)",          "top_k": 40, "temperature": 0.8},
    {"label": "Top-K = 100 (Wide pool)",                       "top_k": 100, "temperature": 0.8},
]

_ = compare(prompt, configs)


  Top-K = 1 (Greedy: always picks top token)
  Parameters: top_k=1, temperature=0.8


Job


⏱️ 3.95s | 3 chars

  Top-K = 5
  Parameters: top_k=5, temperature=0.8


Job


⏱️ 1.12s | 3 chars

  Top-K = 40 (Default for many models)
  Parameters: top_k=40, temperature=0.8


Job


⏱️ 3.84s | 3 chars

  Top-K = 100 (Wide pool)
  Parameters: top_k=100, temperature=0.8


Job


⏱️ 2.78s | 3 chars


### Experiment 3B: Top-K Consistency Test

With `top_k=1`, the output should be identical every time (greedy). Let's check.

In [9]:
prompt = "What is 2 + 2? Reply with just the number."
num_runs = 4

for k_val in [1, 50]:
    print(f"\n{'=' * 60}")
    print(f"  Top-K = {k_val} - Running {num_runs} times")
    print(f"{'=' * 60}")
    outputs = []
    for i in range(num_runs):
        llm = ChatOllama(model=MODEL, top_k=k_val, temperature=0.8)
        resp = llm.invoke([HumanMessage(content=prompt)])
        text = resp.content.strip()
        outputs.append(text)
        print(f"  Run {i+1}: {text[:80]}")
    unique = len(set(outputs))
    print(f"  → Unique outputs: {unique}/{num_runs}")


  Top-K = 1 - Running 4 times
  Run 1: 4
  Run 2: 4
  Run 3: 4
  Run 4: 4
  → Unique outputs: 1/4

  Top-K = 50 - Running 4 times
  Run 1: 4
  Run 2: 4
  Run 3: 4
  Run 4: 4
  → Unique outputs: 1/4


### Top-K vs. Top-P: When to Use Which?

| Feature | Top-K | Top-P |
|---------|-------|-------|
| Pool size | **Fixed** (always K tokens) | **Dynamic** (varies per step) |
| Adapts to confidence? | No | Yes |
| Best for | Simple control | Nuanced generation |
| Common defaults | K=40 | P=0.9 |
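
The "fixed vs. dynamic" row can be checked with two made-up distributions, one where the model is confident and one where it is not. This is a sketch with hypothetical helper names and invented probabilities.

```python
def top_k_pool(probs, k):
    """The k largest probabilities, regardless of how much mass they cover."""
    return sorted(probs, reverse=True)[:k]


def top_p_pool(probs, top_p):
    """The largest probabilities needed to cover a cumulative mass of top_p."""
    pool, cumulative = [], 0.0
    for p in sorted(probs, reverse=True):
        pool.append(p)
        cumulative += p
        if cumulative >= top_p - 1e-9:
            break
    return pool


# Invented distributions: one peaked (confident), one nearly flat (uncertain).
confident = [0.92, 0.04, 0.02, 0.01, 0.01]
uncertain = [0.22, 0.21, 0.20, 0.19, 0.18]

for name, probs in (("confident", confident), ("uncertain", uncertain)):
    k_size = len(top_k_pool(probs, 3))
    p_size = len(top_p_pool(probs, 0.9))
    print(f"{name}: top_k=3 keeps {k_size} tokens, top_p=0.9 keeps {p_size} tokens")
```

Top-K keeps three tokens in both cases, while Top-P shrinks to a single token when the model is confident and grows to the whole list when it is not.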

---

## 4. Max Tokens: Controlling Response Length

**Max Tokens** (called `num_predict` in Ollama) sets a hard ceiling on how many tokens the model generates. It does NOT guarantee that length: the model may stop earlier if it finishes its thought.

```
1 token ≈ 4 characters ≈ ¾ of a word (English)
```
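
The rule of thumb above can be turned into a rough budgeting helper. This is only an estimate: real tokenizers differ by model and language, and both function names and the 25% headroom factor are assumptions made for this sketch.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the text above; real tokenizers vary


def estimate_tokens(text):
    """Rough token count for English text, based on character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def suggest_num_predict(expected_chars, headroom=1.25):
    """Suggest a token limit for a response of roughly expected_chars characters."""
    return math.ceil(expected_chars / CHARS_PER_TOKEN * headroom)


sample = "The theory of relativity is a set of two interrelated theories."
print(estimate_tokens(sample))
print(suggest_num_predict(1500))
```

Leaving some headroom reduces the chance that a response is cut off mid-sentence, at the cost of allowing longer outputs.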

### Experiment 4A: Varying Max Tokens

In [10]:
prompt = "Explain the theory of relativity in detail."

configs = [
    {"label": "🔹 Max Tokens = 20 (Very Short)",   "num_predict": 20},
    {"label": "🔹 Max Tokens = 80 (Short)",        "num_predict": 80},
    {"label": "🔹 Max Tokens = 300 (Medium)",      "num_predict": 300},
]

_ = compare(prompt, configs)


  🔹 Max Tokens = 20 (Very Short)
  Parameters: num_predict=20


The Theory of Relativity is a scientific theory developed by two physicists, Albert Einstein and Henri Poin


⏱️ 0.91s | 107 chars

  🔹 Max Tokens = 80 (Short)
  Parameters: num_predict=80


The theory of relativity is a set of two interrelated theories proposed by Albert Einstein that describe how space and time behave under acceleration and at near light speeds, respectively. It was published in 1905 and revolutionized our understanding of physics.

Special Relativity

Special relativity applies to objects moving with negligible velocities (approximately less than 1% of the speed of light). It states


⏱️ 0.77s | 418 chars

  🔹 Max Tokens = 300 (Medium)
  Parameters: num_predict=300


The theory of relativity is a set of scientific theories that describe how space and time behave under different conditions, including acceleration or motion relative to one another.

Albert Einstein developed two major theories: special relativity and general relativity. Special relativity describes the behavior of objects moving at constant speeds (relative to a particular inertial frame) in an environment where gravity is negligible. General relativity extends this idea to describe the behavior of objects under acceleration, which includes gravity.

In both cases, Einstein's theory states that space and time are not absolute but dependent on one's state of motion relative to other systems. This means that what you measure as your speed and direction is different from someone else who is moving at a different rate. The same applies to the measurement of distance and time itself.

The theory of relativity also implies some surprising effects, such as length contraction - objects in motion are shorter than those at rest relative to them. This leads to the famous idea that two events can be simultaneous for one observer but not another if they occur at different distances from an observer.

Overall, Einstein's theories have been proven correct by experiments and observations over time, including his prediction of the existence of black holes, which were later confirmed through various space missions such as LIGO.


⏱️ 2.41s | 1435 chars


### Experiment 4B: Max Tokens and Mid-Sentence Cutoffs

Watch what happens when the limit is too low: the model gets cut off mid-thought.

In [11]:
prompt = "Tell me a short story about a brave knight."

for max_tok in [10, 30, 100]:
    print(f"\n{'=' * 60}")
    print(f"  num_predict = {max_tok}")
    print(f"{'=' * 60}")
    _ = generate(prompt, num_predict=max_tok)

print("\n💡 Notice how low limits produce incomplete responses!")


  num_predict = 10


Once upon a time, there was a brave knight


⏱️ 0.18s | 42 chars

  num_predict = 30


Once upon a time, there was a young man named Sir William who had always dreamed of being a brave knight and riding into battle to defend his kingdom


⏱️ 0.33s | 149 chars

  num_predict = 100


Once upon a time, there was a brave knight named Sir Roland who lived in the kingdom of Camelot. Sir Roland was known for his courage and bravery on the battlefield.

One day, the king called Sir Roland to go on a mission with him. The king had heard that an evil sorcerer was terrorizing the kingdom by casting dark magic and turning people into monsters. The king asked Sir Roland if he would be willing to put aside his love of adventure and help protect the kingdom from the harm


⏱️ 0.97s | 483 chars

💡 Notice how low limits produce incomplete responses!


---

## 5. Frequency Penalty: Reducing Repetition

Frequency Penalty penalizes tokens **proportionally to how many times** they've already appeared in the output. The more a word repeats, the harder it gets penalized.

```
Penalty = 0.0  →  No penalty (default)
Penalty > 0    →  Discourages repetition (higher = stronger)
Penalty < 0    →  Encourages repetition (rarely useful)
```

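In Ollama, the closest setting used in this notebook is the `repeat_penalty` parameter (default 1.1; values > 1.0 penalize repetition). It is an approximation rather than an exact match: `repeat_penalty` rescales the scores of recently seen tokens instead of subtracting a count-based amount, so the `Penalty` scale above (0.0 = off) does not carry over directly.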
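
For intuition about what a `repeat_penalty`-style setting does, here is a minimal sketch. It assumes the multiplicative scheme commonly used by llama.cpp-style samplers (positive logits divided by the penalty, negative logits multiplied by it); treat that as an assumption about the backend rather than a statement of Ollama's exact implementation. The logits and the function name are hypothetical.

```python
def apply_repeat_penalty(logits, generated_tokens, penalty):
    """Dampen the logits of tokens that already appear in the generated text."""
    seen = set(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        if token in seen:
            # Assumed scheme: positive logits shrink, negative logits move further down.
            adjusted[token] = logit / penalty if logit > 0 else logit * penalty
        else:
            adjusted[token] = logit
    return adjusted


# Hypothetical logits; "hello" and "bye" have already been generated.
logits = {"hello": 2.0, "hi": 1.0, "bye": -1.0}
print(apply_repeat_penalty(logits, ["hello", "bye"], penalty=1.5))
print(apply_repeat_penalty(logits, ["hello", "bye"], penalty=1.0))
```

A penalty of 1.0 leaves every logit unchanged, which matches the "No penalty" configuration in the experiment that follows.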

### Experiment 5A: Repetition With and Without Penalty

In [12]:
# A prompt that tends to cause repetitive output
prompt = "Write the word 'hello' in 10 different creative ways."

configs = [
    {"label": "🔁 repeat_penalty = 1.0 (No penalty)",       "repeat_penalty": 1.0},
    {"label": "⚖️ repeat_penalty = 1.1 (Default / Mild)",   "repeat_penalty": 1.1},
    {"label": "🚫 repeat_penalty = 1.5 (Strong penalty)",   "repeat_penalty": 1.5},
]

_ = compare(prompt, configs)


  🔁 repeat_penalty = 1.0 (No penalty)
  Parameters: repeat_penalty=1.0


1. Hel-lo
2. Hola, world!
3. Hello, world!
4. Hello, universe!
5. Hello, earth!
6. Hello, day!
7. Hello, night!
8. Hello, time!
9. Hello, sky!
10. Hello, ocean!


⏱️ 0.67s | 160 chars

  ⚖️ repeat_penalty = 1.1 (Default / Mild)
  Parameters: repeat_penalty=1.1


1. Hola, amigo! 
2. Hello again, world!
3. Hallo zu dir!
4. Here's to you, hello!
5. Hello sunshine
6. Hola, how are you?
7. Hello, Universe!
8. Hello there, world.
9. Hello friends!
10. Hello and good night!


⏱️ 0.71s | 208 chars

  🚫 repeat_penalty = 1.5 (Strong penalty)
  Parameters: repeat_penalty=1.5


Sure, here's my attempt to write "Hello" using various words and phrases:
- Hola (Spanish)
- Hallo! 
(Standard German greeting) - Hello!
Hillel: A famous Jewish rabbi.
L'illeh is a Hebrew word meaning 'goodbye'.
The phrase “hello” means ‘please’ in Hindi
Heelloo, for fun. An abbreviated form of "Hello".
"Hellewani" from Swahili language used to say hello.

A short version can also be heard as: Hello?
These are just some ideas and interpretations - you could come up with even more creative ways!


⏱️ 1.22s | 499 chars


### Experiment 5B: Frequency Penalty on Longer Text

Repetition is more visible in longer outputs. Let's test with a paragraph-length prompt.

In [13]:
prompt = "Write a paragraph about the importance of reading books. Aim for about 100 words."

for penalty in [1.0, 1.3]:
    print(f"\n{'=' * 60}")
    print(f"  repeat_penalty = {penalty}")
    print(f"{'=' * 60}")
    result = generate(prompt, repeat_penalty=penalty, num_predict=200)

    # Count word frequency to show repetition
    words = result.lower().split()
    word_counts = {}
    for w in words:
        w_clean = w.strip('.,!?;:')
        if len(w_clean) > 3:  # skip short words
            word_counts[w_clean] = word_counts.get(w_clean, 0) + 1
    repeated = {w: c for w, c in word_counts.items() if c >= 3}
    if repeated:
        print(f"  📊 Words repeated 3+ times: {repeated}")
    else:
        print("  📊 No words repeated 3+ times; good variety!")


  repeat_penalty = 1.0


Reading books is a fundamental activity that has immense importance in our lives. It provides knowledge, expands our vocabulary, and enhances our analytical skills. Reading books also helps in developing our imagination, as we are able to visualize and understand different scenarios through literature. It also helps us to develop empathy, as we can experience different emotions and viewpoints through books. Reading books can also help us to improve our critical thinking skills and enhance our creativity. In addition, reading books can help us to learn about different cultures and historical events, and it can also help us to improve our writing skills. In summary, reading books is an essential activity that has numerous benefits and should be encouraged in our daily lives.


⏱️ 1.34s | 783 chars
  📊 Words repeated 3+ times: {'reading': 5, 'books': 6, 'skills': 3, 'also': 4, 'different': 3, 'help': 3}

  repeat_penalty = 1.3


Reading is an essential skill that provides numerous benefits, including expanding knowledge and vocabulary skills while also enhancing creativity by introducing new perspectives on familiar topics or challenging one's thoughts with different viewpoints through literature creation such as characters engaging in dialogue across cultures without biasing opinions based solely upon the author’s perspective alone.

Furthermore reading enriches our imagination leading us to visualize a scene when we read about it for example, imagine if you were walking down Main Street and see someone on their cellphone while an elderly couple with bags sitting outside talking quietly. You can picture them both in your mind's eye as they navigate different worlds through smartphones at the same time.

The ability of engaging one’s brain by reading also helps to build vocabulary skills that allow for better comprehension when listening or speaking, making conversations and discussions more interesting because you have a greater understanding on what is being discussed. Reading has been shown beneficial in reducing stress levels since it provides an escape from daily routines as well which can help reduce the impact of stressful events throughout your day


⏱️ 1.87s | 1251 chars
  📊 Words repeated 3+ times: {'reading': 4}


---

## 6. Presence Penalty: Encouraging Topic Diversity

Unlike Frequency Penalty (which scales with count), Presence Penalty applies a **flat penalty** to any token that has appeared **at least once**. It doesn't matter if it appeared 1 time or 50; the penalty is the same.

```
Frequency Penalty:  "the" appeared 5x → penalized 5× as much
Presence  Penalty:  "the" appeared 5x → same penalty as if it appeared 1x
```

This encourages the model to bring in **new topics and vocabulary** rather than just avoiding repetition.
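
The difference between the two penalties can be sketched with the additive formulation used by OpenAI-style APIs, where each logit is reduced by `count * frequency_penalty` plus a flat `presence_penalty` once the token has appeared. The logits and token history below are invented, and this is not how Ollama's `repeat_penalty` is computed.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Subtract a count-scaled penalty and a flat seen-at-least-once penalty from each logit."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1.0 if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


# Invented example: "the" has appeared five times, "cat" once, "dog" never.
logits = {"the": 3.0, "cat": 3.0, "dog": 3.0}
history = ["the"] * 5 + ["cat"]
print(apply_penalties(logits, history, frequency_penalty=0.5))
print(apply_penalties(logits, history, presence_penalty=0.5))
```

Under the frequency penalty, "the" is penalized five times as much as "cat"; under the presence penalty, both receive the same reduction and only the unseen "dog" is untouched.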

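> **Note:** The experiment below varies only Ollama's `repeat_penalty`, which penalizes any recently seen token. It therefore illustrates the general effect of discouraging reuse rather than isolating presence penalty from frequency penalty.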

### Experiment 6A: Presence Penalty Effect on Vocabulary Diversity

In [14]:
prompt = "List 10 different animals. Just the names, one per line."

configs = [
    {"label": "repeat_penalty = 1.0 (No penalty)",    "repeat_penalty": 1.0, "temperature": 0.7},
    {"label": "repeat_penalty = 1.2 (Moderate)",      "repeat_penalty": 1.2, "temperature": 0.7},
    {"label": "repeat_penalty = 1.8 (Aggressive)",    "repeat_penalty": 1.8, "temperature": 0.7},
]

results = compare(prompt, configs)

# Analyze unique words in each
print(f"\n{'=' * 60}")
print("VOCABULARY DIVERSITY ANALYSIS")
print(f"{'=' * 60}")
for label, text in results.items():
    words = set(text.lower().split())
    print(f"  {label[:40]:40s} → {len(words)} unique words")


  repeat_penalty = 1.0 (No penalty)
  Parameters: repeat_penalty=1.0, temperature=0.7


1. Lion
2. Tiger
3. Elephant
4. Giraffe
5. Zebra
6. Monkey
7. Koala
8. Penguin
9. Penguin
10. Kangaroo


⏱️ 0.50s | 102 chars

  repeat_penalty = 1.2 (Moderate)
  Parameters: repeat_penalty=1.2, temperature=0.7


Sure! Here's my list of ten animal names:

Dog
Cat
Bird 
Rabbit  
Lion    
Elephant     
Monkey   
Dolphin   
Tiger      
Fish


⏱️ 0.39s | 126 chars

  repeat_penalty = 1.8 (Aggressive)
  Parameters: repeat_penalty=1.8, temperature=0.7


- Lion 
Lioness  
Leopard   
Elephant    
Eagle      
Monkey     
Rhinoceros   Jaguar    Turtle     Fox      Monkey Dog Cat Tiger Rabbit WolfË±π ÁãÆÂ≠ê ËÄÅËôé Â§ßË±° Èπ∞ ÂÆ†Áâ©Áãó Â≠îÈõÄ Ê†ëÊáí ÈáéÁå™ ÂÖΩÁ±ªÁå´


‚è±Ô∏è 0.68s | 170 chars

VOCABULARY DIVERSITY ANALYSIS
  repeat_penalty = 1.0 (No penalty)        ‚Üí 19 unique words
  repeat_penalty = 1.2 (Moderate)          ‚Üí 18 unique words
  repeat_penalty = 1.8 (Aggressive)        ‚Üí 25 unique words


### Frequency vs. Presence Penalty: Comparison

| Aspect | Frequency Penalty | Presence Penalty |
|--------|-------------------|------------------|
| Scales with count? | **Yes**: more repetitions = more penalty | **No**: flat penalty after first use |
| Best for | Reducing word-level repetition | Encouraging topic diversity |
| Use case | Preventing "the the the..." | Making model explore new ideas |

---

## 7. Stop Sequences: Halting Generation

Stop sequences are strings that **immediately end** the model's generation when encountered. The model stops *before* including the stop string in the output.

Common uses:
- Stop at a newline (`\n`) for single-line answers
- Stop at a delimiter (`---`, `END`) for structured extraction
- Stop at a role marker (`User:`) to prevent the model from simulating conversation
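
The behavior described above can be mimicked on the client side, which is handy for reasoning about which stop strings to choose. This sketch truncates an already-generated string at the earliest stop sequence; `truncate_at_stop` is a hypothetical helper, and real backends stop during generation rather than trimming afterwards.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest stop sequence; the stop string itself is dropped."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Made-up raw output in which the model starts simulating the next user turn.
raw = "Gravity pulls masses together.\n\nUser: And magnetism?"
print(truncate_at_stop(raw, ["User:", "\n\n"]))
```

When several stop sequences are given, whichever appears first in the text decides where the output ends.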

### Experiment 7A: Stopping at the First Period (Single-Sentence Answers)

In [15]:
prompt = "Name a famous scientist and describe their contribution."

print("=" * 60)
print("WITHOUT stop sequence")
print("=" * 60)
_ = generate(prompt)

print("\n" + "=" * 60)
print("WITH stop=['.'] - stops at first period")
print("=" * 60)
_ = generate(prompt, stop=["."])

WITHOUT stop sequence


One of the most famous scientists is Albert Einstein, known for his contributions to the fields of physics, particularly in the development of relativity theory.

Einstein's most well-known work was published during 1905, when he developed four significant theories that would change the way we understand our universe. The first was the photoelectric effect, which explained how light could be split into individual particles called photons. The second and third were his famous equations E=mc^2, which described the relationship between energy and mass, and the idea of special relativity.

Special relativity challenged Newton's ideas about space and time, suggesting that they are not independent but rather interconnected through the speed of light. Einstein then combined special relativity with gravity to create general relativity, a theory that explains how massive objects curve spacetime in their vicinity due to their mass.

Einstein also made significant contributions to quantum mechanics, developing statistical methods for calculating probabilities in atomic collisions and proposing that particles exist in multiple states until they are observed. His work has been instrumental in the development of modern physics and continues to influence research in fields ranging from astronomy to astrophysics.


⏱️ 2.15s | 1318 chars

WITH stop=['.'] - stops at first period


Albert Einstein is a famous scientist who made significant contributions to the field of physics


⏱️ 0.20s | 96 chars


### Experiment 7B: Stop Sequences for Structured Output

In [16]:
prompt = """Extract the person's name from the text below.

Text: "Dr. Sarah Chen published her findings on climate change last Tuesday."

Name:"""

print("=" * 60)
print("WITH stop=['\\n'] - stops after extracting the name")
print("=" * 60)
_ = generate(prompt, stop=["\n"], temperature=0.0)

WITH stop=['\n'] - stops after extracting the name


Sarah Chen


⏱️ 0.15s | 10 chars


### Experiment 7C: Stop Sequences to Prevent Role-Playing

In [17]:
prompt = """Answer the user's question in one sentence.

User: What is gravity?
Assistant:"""

print("=" * 60)
print("WITHOUT stop - model might continue as 'User:' and 'Assistant:'")
print("=" * 60)
_ = generate(prompt, num_predict=200)

print("\n" + "=" * 60)
print("WITH stop=['User:', '\\n\\n'] - halts after one response")
print("=" * 60)
_ = generate(prompt, stop=["User:", "\n\n"], num_predict=200)

WITHOUT stop ‚Äî model might continue as 'User:' and 'Assistant:'


Gravity is a fundamental force of nature that attracts two masses towards each other with an acceleration directly proportional to their masses and inversely proportional to the square of the distance between them.


‚è±Ô∏è 0.43s | 214 chars

WITH stop=['User:', '\n\n'] ‚Äî halts after one response


Gravity is a force that attracts two masses towards each other, with objects of greater mass having a stronger gravitational pull.


‚è±Ô∏è 0.28s | 130 chars
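
The effect of a stop sequence on the final text can be approximated after the fact. The sketch below is an illustration only, not how Ollama implements stopping: the helper name `truncate_at_stop` is hypothetical, and it simply cuts a finished string at the earliest stop string, dropping the stop string itself (consistent with the outputs above, where the period and newline do not appear in the result).

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop string.

    Hypothetical helper for illustration; the matched stop string is dropped.
    """
    cut = len(text)
    for s in stop:
        if not s:
            continue  # an empty stop string would match at index 0
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# A made-up continuation in which the model starts role-playing the user.
full = "Gravity is a force.\n\nUser: Why?\nAssistant: Because mass curves spacetime."
print(truncate_at_stop(full, ["User:", "\n\n"]))   # Gravity is a force.
print(truncate_at_stop("Sarah Chen\nShe is a scientist.", ["\n"]))  # Sarah Chen
```

Unlike this post-processing step, a real stop sequence halts generation itself, which is why Experiment 7A dropped from 2.15s to 0.20s.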


---

## 8. Combining Parameters: Real-World Recipes

In practice, you'll set several parameters at once. Here are some common "recipes":

| Use Case | Temperature | Top-P | Top-K | Repeat Penalty | Max Tokens |
|----------|-------------|-------|-------|----------------|------------|
| Factual Q&A | 0.0 | 1.0 | 1 | 1.0 | 100-200 |
| Creative Writing | 0.9-1.2 | 0.9 | 50 | 1.2 | 500+ |
| Code Generation | 0.0-0.2 | 0.95 | 40 | 1.1 | 500 |
| Brainstorming | 1.0 | 0.95 | 100 | 1.3 | 300 |
| Data Extraction | 0.0 | 1.0 | 1 | 1.0 | 100 |
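
One way to reuse these recipes is a small lookup table. This is a sketch under assumptions: the names `RECIPES` and `recipe` are hypothetical, the keys follow the Ollama-style parameter names used in this notebook, and wherever the table gives a range a single value inside that range was picked arbitrarily.

```python
# Hypothetical lookup table mirroring the recipe table.
# Where the table lists a range (e.g. 0.9-1.2), one value inside it is chosen (an assumption).
RECIPES = {
    "factual_qa":       {"temperature": 0.0, "top_p": 1.0,  "top_k": 1,   "repeat_penalty": 1.0, "num_predict": 200},
    "creative_writing": {"temperature": 1.0, "top_p": 0.9,  "top_k": 50,  "repeat_penalty": 1.2, "num_predict": 500},
    "code_generation":  {"temperature": 0.1, "top_p": 0.95, "top_k": 40,  "repeat_penalty": 1.1, "num_predict": 500},
    "brainstorming":    {"temperature": 1.0, "top_p": 0.95, "top_k": 100, "repeat_penalty": 1.3, "num_predict": 300},
    "data_extraction":  {"temperature": 0.0, "top_p": 1.0,  "top_k": 1,   "repeat_penalty": 1.0, "num_predict": 100},
}


def recipe(use_case, **overrides):
    """Return a copy of the named recipe, with optional per-call overrides."""
    if use_case not in RECIPES:
        raise KeyError(f"Unknown use case {use_case!r}; choose from {sorted(RECIPES)}")
    # Merging into a new dict keeps the shared RECIPES table unchanged.
    return {**RECIPES[use_case], **overrides}


params = recipe("code_generation", num_predict=300)
print(params)
```

With the notebook's `generate` helper, this could be used as `generate(prompt, **recipe("factual_qa"))`.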

### Experiment 8A: Recipe Comparison

In [18]:
prompt = "Suggest 3 startup ideas related to artificial intelligence."

configs = [
    {
        "label": "📋 Conservative (Factual style)",
        "temperature": 0.1, "top_p": 1.0, "top_k": 1, "repeat_penalty": 1.0, "num_predict": 200
    },
    {
        "label": "⚖️ Balanced (General purpose)",
        "temperature": 0.7, "top_p": 0.9, "top_k": 40, "repeat_penalty": 1.1, "num_predict": 200
    },
    {
        "label": "🚀 Creative (Brainstorming)",
        "temperature": 1.1, "top_p": 0.95, "top_k": 100, "repeat_penalty": 1.3, "num_predict": 200
    }
]

_ = compare(prompt, configs)


  📋 Conservative (Factual style)
  Parameters: temperature=0.1, top_p=1.0, top_k=1, repeat_penalty=1.0, num_predict=200


1. Personalized Health Assistant: This startup idea focuses on developing an AI-powered health assistant that can help individuals manage their health and wellness. The assistant can provide personalized health recommendations based on the user's medical history, lifestyle, and preferences. It can also monitor the user's health metrics such as heart rate, blood pressure, and sleep patterns, and provide alerts if any issues are detected.

2. Smart Home Automation: This startup idea involves developing an AI-powered smart home automation system that can control various household appliances and devices using voice commands or smart home apps. The system can learn the user's preferences and habits, and automatically adjust the temperature, lighting, and security systems based on the user's schedule and preferences.

3. AI-Powered Language Translation: This startup idea focuses on developing an AI-powered language translation system that can translate text and speech between different languages in real-time. The system can be used for international communication, remote work, and education. It can also be used for language learning and


⏱️ 1.84s | 1148 chars

  ⚖️ Balanced (General purpose)
  Parameters: temperature=0.7, top_p=0.9, top_k=40, repeat_penalty=1.1, num_predict=200


1. AI-Powered Personalized Nutrition: Develop an app that uses machine learning algorithms to analyze your dietary habits and recommend personalized nutrition plans based on your health goals, preferences, and lifestyle.
2. Virtual Reality Therapy for Anxiety Disorders: Create a platform using virtual reality technology to simulate anxiety-inducing situations and provide therapy sessions in a safe environment. This could help individuals with anxiety disorders learn coping mechanisms without the need for physical exposure to stressful situations.
3. AI-Powered Language Translation Services: Develop an app that uses natural language processing techniques to translate between various languages in real-time, making it easy and convenient for people who communicate globally.


⏱️ 1.20s | 781 chars

  🚀 Creative (Brainstorming)
  Parameters: temperature=1.1, top_p=0.95, top_k=100, repeat_penalty=1.3, num_predict=200


Here is the list of three different startups which can benefit from Artificial Intelligence technology:

1) A start up that offers AI-powered chatbots for customer service: The goal here would be providing quick and efficient customer support via an intelligent agent, instead or calling a live human operator.

2) An app to help remote workers with time management through the use of project management skills assisted by machine learning algorithms. 3

3)a start up focused on creating personalized coaching plans that leverages AI-powered analysis - it offers insights based not only data collection but also cognitive ability and health factors, helping individual users achieve their training goals effectively while increasing productivity at work with no significant amount to pay for the service


⏱️ 1.32s | 803 chars


### Experiment 8B: Code Generation Recipe

In [19]:
prompt = "Write a Python function that checks if a string is a palindrome."

system = "You are a Python developer. Write clean, well-commented code. Only output the code, nothing else."

configs = [
    {
        "label": "🎯 Precise Code (temp=0, top_k=1)",
        "temperature": 0.0, "top_k": 1, "num_predict": 300
    },
    {
        "label": "🎨 Creative Code (temp=0.8, top_k=50)",
        "temperature": 0.8, "top_k": 50, "num_predict": 300
    },
]

_ = compare(prompt, configs, system=system)


  🎯 Precise Code (temp=0, top_k=1)
  Parameters: temperature=0.0, top_k=1, num_predict=300


```python
def is_palindrome(s):
    """
    Check if the given string s is a palindrome.
    
    A palindrome is a word, phrase, number, or other sequence of characters which reads the same backward as forward.
    
    Parameters:
    s (str): The string to check.
    
    Returns:
    bool: True if s is a palindrome, False otherwise.
    """
    # Normalize the string by removing spaces and converting to lowercase
    normalized_str = ''.join(e for e in s.lower() if e.isalnum())
    
    # Check if the normalized string is equal to its reverse
    return normalized_str == normalized_str[::-1]

# Example usage:
print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("racecar"))  # True
print(is_palindrome("hello"))  # False
```


⏱️ 1.77s | 766 chars

  🎨 Creative Code (temp=0.8, top_k=50)
  Parameters: temperature=0.8, top_k=50, num_predict=300


```python
def is_palindrome(s):
    """
    Checks if a given string s is a palindrome.
    
    A palindrome reads the same backward as forward. This function will return True if the input string is a palindrome, and False otherwise.
    
    Parameters:
    s (str): The string to check.

    Returns:
    bool: True if s is a palindrome, False otherwise.
    """
    # Normalize the string by removing spaces and converting to lowercase
    normalized = ''.join(e for e in s.lower() if e.isalnum())
    
    # Check if the string reads the same forwards as backwards
    return normalized == normalized[::-1]

# Example usage
print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("racecar"))  # True
print(is_palindrome("hello"))  # False
```


⏱️ 1.72s | 774 chars


---

## 9. Sandbox: Try It Yourself!

Experiment with any combination of parameters below.

In [20]:
# ============================================================
#  SANDBOX - Tweak these values and re-run!
# ============================================================

my_prompt     = "Describe the future of space travel in 3 sentences."
my_system     = "You are a futurist and science communicator."

my_params = {
    "temperature":    0.7,    # 0.0 to 2.0
    "top_p":          0.9,    # 0.0 to 1.0
    "top_k":          40,     # 1 to 100+
    "num_predict":    200,    # max tokens to generate
    "repeat_penalty": 1.1,    # 1.0 = off, higher = less repetition
    # "stop":         ["."],  # uncomment to stop at first period
}

# ============================================================

print("YOUR CUSTOM EXPERIMENT")
print("=" * 60)
params_str = '\n'.join(f"  {k:20s} = {v}" for k, v in my_params.items())
print(params_str)
print("=" * 60)
_ = generate(my_prompt, system=my_system, **my_params)

YOUR CUSTOM EXPERIMENT
  temperature          = 0.7
  top_p                = 0.9
  top_k                = 40
  num_predict          = 200
  repeat_penalty       = 1.1


The future of space travel will likely involve frequent, affordable trips to other planets with advanced propulsion technologies, sustainable habitats that minimize environmental impact, and increased accessibility for all, not just wealthy individuals or nations. As we explore further into deep space, we'll witness the evolution of space tourism becoming more inclusive and accessible, pushing boundaries in exploration and discovery beyond our current understanding of what is possible in interstellar travel.


⏱️ 0.84s | 513 chars
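
Before re-running the sandbox, it can help to check that the values fall inside the ranges given in the comments. The sketch below assumes only those documented ranges (temperature 0.0 to 2.0, top_p 0.0 to 1.0, top_k at least 1); the names `BOUNDS` and `check_params` are hypothetical, and parameters without a stated range are left unchecked.

```python
# (low, high) bounds copied from the sandbox comments; None means no upper bound.
BOUNDS = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "top_k": (1, None),
}


def check_params(params):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    for name, (low, high) in BOUNDS.items():
        if name not in params:
            continue  # only validate parameters that were actually set
        value = params[name]
        if value < low or (high is not None and value > high):
            upper = "inf" if high is None else high
            problems.append(f"{name}={value} is outside [{low}, {upper}]")
    return problems


print(check_params({"temperature": 0.7, "top_p": 0.9, "top_k": 40}))  # []
print(check_params({"temperature": 2.5, "top_p": 1.2, "top_k": 0}))
```

Calling `check_params(my_params)` before `generate(...)` would surface out-of-range values early instead of relying on the backend to reject or silently clamp them.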


---

## Key Takeaways

| Parameter | What It Does | Typical Range | When to Adjust |
|-----------|-------------|---------------|----------------|
| **Temperature** | Controls randomness | 0.0 – 1.5 | Lower for facts, higher for creativity |
| **Top-P** | Dynamic probability cutoff | 0.1 – 1.0 | Use ~0.9 for general; lower for precision |
| **Top-K** | Fixed candidate pool size | 1 – 100 | 1 for greedy; 40-50 for balanced |
| **Max Tokens** | Hard output length limit | 10 – 4096 | Match to your expected output length |
| **Frequency Penalty** | Penalizes repeated tokens proportionally | 1.0 – 1.5 | Increase for less repetition |
| **Presence Penalty** | Flat penalty on any used token | 1.0 – 1.5 | Increase for broader vocabulary |
| **Stop Sequences** | Halts generation at specific strings | N/A | Use for structured/single-line output |

### Rules of Thumb

1. **Start with defaults**: temperature=0.7, top_p=0.9, top_k=40
2. **Adjust one parameter at a time**: so you can see what each one does
3. **Temperature and Top-P overlap**: usually tune one or the other, not both aggressively
4. **Low temperature + Top-K=1**: effectively deterministic (greedy decoding)
5. **Stop sequences are underused**: they're great for structured extraction tasks
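
Rules 3 and 4 are easier to see in a toy sampler. The sketch below is a simplified illustration in pure Python, not the code any real runtime uses: the function `sample_token` and the example logits are made up, and the order of the steps (temperature, then Top-K, then Top-P) is an assumption, since real implementations may order them differently.

```python
import math
import random


def sample_token(logits, temperature=0.7, top_k=40, top_p=0.9, rng=random):
    """Pick one token from a {token: logit} dict using temperature, Top-K, then Top-P."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: always take the highest logit.
        return max(logits, key=logits.get)

    # 1. Temperature: divide logits before the softmax (lower = sharper distribution).
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtracted for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # 2. Top-K: keep only the K most likely tokens (at least one).
    ranked = ranked[:max(1, top_k)]

    # 3. Top-P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # 4. Sample from the survivors; choices() renormalises the weights itself.
    tokens, probs = zip(*kept)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up logits for four candidate next tokens.
logits = {"the": 2.0, "a": 1.5, "cat": 0.5, "zebra": -1.0}
rng = random.Random(0)

# With top_k=1 only one candidate survives, so even a high temperature is deterministic.
print({sample_token(logits, temperature=1.5, top_k=1, rng=rng) for _ in range(20)})  # {'the'}
```

Because both temperature and Top-P reshape the same probability distribution before sampling, pushing both to extremes compounds their effects, which is the reason for tuning one at a time.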