# 2.1.3 Prompt Anatomy, Context Windows & Common Pitfalls

## Playground Notebook

This notebook covers three foundational topics that will make you a better prompt engineer:

| Topic | What You'll Learn |
|-------|------------------|
| **Anatomy of a Prompt** | The 5 building blocks of every effective prompt |
| **Context Window Management** | How to work within token limits and manage long conversations |
| **Common Pitfalls** | Mistakes that cause bad outputs — and how to fix them |

---

In [None]:
import json
import time
from IPython.display import display, Markdown, HTML
from langchain_ollama import ChatOllama
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

# ============================================================
#  CONFIGURATION - Change the model name here if needed
# ============================================================
MODEL = "qwen2.5:1.5b"  # Options: "qwen2.5:1.5b", "llama3.2", "mistral", "gemma2", etc.

llm = ChatOllama(model=MODEL)

In [None]:
# ============================================================
#  HELPER FUNCTIONS
# ============================================================

def build_messages(message_dicts):
    """Convert a list of role/content dicts into LangChain message objects."""
    type_map = {
        "system": SystemMessage,
        "user": HumanMessage,
        "assistant": AIMessage,
    }
    return [type_map[m["role"]](content=m["content"]) for m in message_dicts]


def chat(messages, **kwargs):
    """Send messages to the model and return the response text."""
    _llm = ChatOllama(model=MODEL, **kwargs) if kwargs else llm
    lc_messages = build_messages(messages)
    start = time.time()
    response = _llm.invoke(lc_messages)
    elapsed = time.time() - start
    content = response.content
    display(Markdown(content))
    print(f"\n\u23f1\ufe0f {elapsed:.2f}s | {len(content)} chars")
    return content


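def show_messages(messages):
    """Pretty-print the message list being sent to the model."""
    # Local import: the name `html` is reused below for the output string
    from html import escape

    colors = {"system": "#e74c3c", "user": "#3498db", "assistant": "#2ecc71"}
    html = ""
    for msg in messages:
        role = msg["role"]
        color = colors.get(role, "#888")
        preview = msg["content"][:300] + ("..." if len(msg["content"]) > 300 else "")
        # Escape so prompt text containing <, > or & is displayed literally
        content_preview = escape(preview)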
        html += (
            f'<div style="margin:6px 0;padding:8px 12px;border-left:4px solid {color};'
            f'background:#1e1e1e;border-radius:4px;">'
            f'<strong style="color:{color};text-transform:uppercase;">{role}</strong>'
            f'<br><span style="color:#ccc;white-space:pre-wrap;">{content_preview}</span></div>'
        )
    display(HTML(html))


print(f"\u2705 Using model: {MODEL}")

---

## 1. Anatomy of an Effective Prompt

Every well-crafted prompt is built from up to **5 components**. You don't always need all five, but knowing them helps you write better prompts every time.

| Component | What It Does | Example |
|-----------|-------------|--------|
| **Task** | The core instruction — what you want the model to do | "Summarize this article" |
| **Context** | Background information the model needs | "This is a customer complaint about late delivery" |
| **Role** | Who the model should be (persona) | "You are a senior data scientist" |
| **Format** | How the output should be structured | "Respond as a numbered list" |
| **Constraints** | Boundaries and rules | "Use 3 sentences max. No jargon." |

```
┌──────────────────────────────────────────────┐
│  ROLE:        Who should the model be?       │
├──────────────────────────────────────────────┤
│  CONTEXT:     What does it need to know?     │
├──────────────────────────────────────────────┤
│  TASK:        What should it do?             │
├──────────────────────────────────────────────┤
│  FORMAT:      How should it respond?         │
├──────────────────────────────────────────────┤
│  CONSTRAINTS: What are the limits?           │
└──────────────────────────────────────────────┘
```
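
One way to make the five components explicit in code is a small helper that assembles them into the role/content dicts used throughout this notebook. This is only a sketch: `build_prompt` and its parameter names are hypothetical, and putting role and constraints in the system message (with context, task, and format in the user message) is one reasonable convention, not a rule.

```python
def build_prompt(task, context="", role="", output_format="", constraints=""):
    """Assemble the five prompt components into a list of role/content dicts.

    Hypothetical helper: role and constraints go in the system message;
    context, task, and format go in the user message. Empty parts are skipped.
    """
    if not task:
        raise ValueError("A prompt needs at least a task.")

    system_parts = [part for part in (role, constraints) if part]
    user_parts = [part for part in (context, task, output_format) if part]

    messages = []
    if system_parts:
        messages.append({"role": "system", "content": " ".join(system_parts)})
    messages.append({"role": "user", "content": " ".join(user_parts)})
    return messages


example = build_prompt(
    task="Summarize this article.",
    context="This is a customer complaint about late delivery.",
    role="You are a senior data scientist.",
    output_format="Respond as a numbered list.",
    constraints="Use 3 sentences max. No jargon.",
)
for message in example:
    print(f"{message['role']}: {message['content']}")
```

Only the task is required, so `build_prompt("Explain diabetes.")` yields a single user message, which makes it easy to see what each added component contributes.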

### Experiment 1A: Bare Prompt vs. Well-Structured Prompt

Let's see the dramatic difference between a lazy prompt and one that uses all 5 components.

In [None]:
# --- Bare prompt: just the task, nothing else ---
print("=" * 60)
print("BARE PROMPT (task only)")
print("=" * 60)

bare = [
    {"role": "user", "content": "Write about machine learning."}
]
show_messages(bare)
_ = chat(bare)

In [None]:
# --- Well-structured prompt: all 5 components ---
print("=" * 60)
print("WELL-STRUCTURED PROMPT (all 5 components)")
print("=" * 60)

structured = [
    {
        "role": "system",
        "content": (
            # ROLE
            "You are a tech blogger who writes for beginners. "
            # CONSTRAINTS
            "Keep explanations simple. Avoid jargon. Use analogies from everyday life."
        )
    },
    {
        "role": "user",
        "content": (
            # CONTEXT
            "I'm writing a blog post for people who have never heard of AI. "
            # TASK
            "Write an introduction paragraph explaining what machine learning is. "
            # FORMAT
            "Use exactly 3 sentences. "
            # CONSTRAINTS
            "Do not use the words 'algorithm' or 'neural network'."
        )
    }
]
show_messages(structured)
_ = chat(structured)

print("\n💡 Compare: The structured prompt gives a focused, audience-appropriate response.")

### Experiment 1B: Adding Components Incrementally

Watch how the output improves as we add each component one by one.

In [None]:
# We'll build up the same prompt step by step
levels = [
    {
        "label": "Task only",
        "system": "You are a helpful assistant.",
        "user": "Explain diabetes."
    },
    {
        "label": "Task + Context",
        "system": "You are a helpful assistant.",
        "user": "I'm a newly diagnosed Type 2 diabetes patient. Explain diabetes."
    },
    {
        "label": "Task + Context + Role",
        "system": "You are a compassionate family doctor speaking to a patient.",
        "user": "I'm a newly diagnosed Type 2 diabetes patient. Explain diabetes."
    },
    {
        "label": "Task + Context + Role + Format",
        "system": "You are a compassionate family doctor speaking to a patient.",
        "user": (
            "I'm a newly diagnosed Type 2 diabetes patient. Explain diabetes. "
            "Structure your response as: 1) What it is, 2) Why it happens, 3) What I can do."
        )
    },
    {
        "label": "All 5: Task + Context + Role + Format + Constraints",
        "system": (
            "You are a compassionate family doctor speaking to a patient. "
            "Use simple language. No medical jargon. Keep it under 150 words."
        ),
        "user": (
            "I'm a newly diagnosed Type 2 diabetes patient. Explain diabetes. "
            "Structure your response as: 1) What it is, 2) Why it happens, 3) What I can do."
        )
    }
]

for level in levels:
    print(f"\n{'=' * 60}")
    print(f"  {level['label']}")
    print(f"{'=' * 60}")
    messages = [
        {"role": "system", "content": level["system"]},
        {"role": "user", "content": level["user"]}
    ]
    show_messages(messages)
    result = chat(messages, temperature=0.3)
    print(f"  [Word count: {len(result.split())}]")

### Experiment 1C: Structure and Clarity Matter

Even with the same information, **how you organize** the prompt affects the output. Compare a messy prompt vs. a clearly structured one.
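
When the requirements already live in structured data, a clearly sectioned prompt can be generated rather than typed by hand. The sketch below assumes a hypothetical `format_sections` helper that renders a dict of section titles and bullet items into one prompt string; the section names are illustrative only.

```python
def format_sections(instruction, sections):
    """Render an instruction plus labelled bullet sections as one prompt string.

    Hypothetical helper: `sections` maps a section title to a list of bullet items.
    """
    lines = [instruction, ""]
    for title, items in sections.items():
        lines.append(f"**{title}:**")
        lines.extend(f"- {item}" for item in items)
        lines.append("")  # Blank line between sections
    return "\n".join(lines).rstrip()


prompt = format_sections(
    "Create a one-day meal plan with the following requirements:",
    {
        "Patient profile": ["Vegetarian", "Goal: weight loss"],
        "Constraints": ["Total calories: under 1500"],
    },
)
print(prompt)
```

Keeping each requirement as a separate item also makes it harder to bury a constraint in the middle of a run-on sentence.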

In [None]:
# --- Messy: all information crammed together ---
print("=" * 60)
print("MESSY PROMPT (same info, poor structure)")
print("=" * 60)

messy = [
    {"role": "user", "content": (
        "I need you to act like a nutritionist and give me a meal plan "
        "for someone who is vegetarian and wants to lose weight and "
        "make it for one day with breakfast lunch dinner and snacks "
        "and also mention calories for each meal and keep total under 1500 calories "
        "and format it nicely"
    )}
]
show_messages(messy)
_ = chat(messy, temperature=0.3)

In [None]:
# --- Clean: same info, well-structured with clear sections ---
print("=" * 60)
print("CLEAN PROMPT (same info, clear structure)")
print("=" * 60)

clean = [
    {
        "role": "system",
        "content": "You are a certified nutritionist. Provide evidence-based meal plans."
    },
    {
        "role": "user",
        "content": """Create a one-day meal plan with the following requirements:

**Patient profile:**
- Vegetarian
- Goal: weight loss

**Meals to include:**
- Breakfast, Lunch, Dinner, 2 Snacks

**Constraints:**
- Total calories: under 1500
- Show calorie count for each meal

**Format:**
- Use a Markdown table with columns: Meal, Items, Calories"""
    }
]
show_messages(clean)
_ = chat(clean, temperature=0.3)

print("\n💡 Notice how the structured prompt gives a more organized, complete response.")

### Exercise 1: Build Your Own Prompt

Pick a task and build a prompt using all 5 components. Try:
- **Scenario A**: A travel itinerary for 2 days in Chennai
- **Scenario B**: A study plan for learning Python in 4 weeks
- **Scenario C**: A product comparison for choosing a laptop

In [None]:
# Fill in your prompt using all 5 components

my_role = ""        # Who should the model be?
my_constraints = "" # What limits or rules?
my_context = ""     # What background info is needed?
my_task = ""        # What should it do?
my_format = ""      # How should it format the response?

if my_task:
    messages = [
        {"role": "system", "content": f"{my_role} {my_constraints}"},
        {"role": "user", "content": f"{my_context} {my_task} {my_format}"}
    ]
    show_messages(messages)
    _ = chat(messages, temperature=0.5)
else:
    print("Fill in the 5 components above, then re-run!")

---

## 2. Context Window Management

Every LLM has a **context window** — a maximum number of tokens it can process (input + output combined). Think of it as the model's "working memory."

| Concept | Explanation |
|---------|------------|
| **Token** | A piece of a word. Roughly: 1 token ≈ 4 characters ≈ ¾ of a word |
| **Context window** | Total tokens the model can handle at once (input + output) |
| **Input tokens** | System prompt + conversation history + user message |
| **Output tokens** | The model's generated response |

```
┌─────────────── Context Window (e.g., 4096 tokens) ────────────────┐
│  [System Prompt]  [Conversation History]  [User Msg]  [Response]  │
│  ←───────────────── Input Tokens ──────────────────→  ← Output →  │
└───────────────────────────────────────────────────────────────────┘
```

**Common context window sizes:**

| Model | Context Window |
|-------|---------------|
| GPT-3.5 | 4,096 – 16,384 tokens |
| GPT-4o | 128,000 tokens |
| Claude 3.5 Sonnet | 200,000 tokens |
| Llama 3.2 | 128,000 tokens |
| Qwen 2.5 (1.5B) | 32,768 tokens |
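
To get a feel for these sizes, the sketch below checks whether a message list would fit in a given window while leaving room for the reply. It relies on the rough 4-characters-per-token estimate from the table above; `fits_in_window` and the 500-token `output_reserve` default are assumptions made for illustration, not part of any library.

```python
def estimate_tokens(text):
    """Rough estimate: about 4 characters per token for English text."""
    return len(text) // 4


def fits_in_window(messages, context_window, output_reserve=500):
    """Return (fits, estimated_input_tokens) for a list of role/content dicts.

    The input must leave `output_reserve` tokens free for the response.
    """
    input_tokens = sum(estimate_tokens(m["content"]) for m in messages)
    return input_tokens + output_reserve <= context_window, input_tokens


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "word " * 4000},  # 20,000 characters of filler
]
for window in (4096, 32768):
    fits, used = fits_in_window(messages, window)
    print(f"{window:>6}-token window: ~{used} input tokens, fits={fits}")
```

Note that a local runtime may be configured with a smaller window than the model's maximum (in Ollama this is the `num_ctx` option), so the effective limit can be lower than the figures in the table.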

### Experiment 2A: Understanding Token Counts

Let's see how different texts translate into tokens.

In [None]:
# No tokenizer is called in this notebook, so we estimate instead.
# Rule of thumb: 1 token ≈ 4 characters (English)

def estimate_tokens(text):
    """Rough token estimate based on character count."""
    return len(text) // 4

texts = [
    ("Short sentence", "Hello, how are you?"),
    ("Medium paragraph", "Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. It focuses on developing computer programs that can access data and use it to learn for themselves."),
    ("Long passage", """The history of artificial intelligence dates back to the 1950s when pioneers like Alan Turing and John McCarthy laid the groundwork for what would become one of the most transformative fields in computer science. Turing proposed his famous test in 1950, asking whether machines could think. McCarthy coined the term 'artificial intelligence' at the Dartmouth Conference in 1956. Over the following decades, AI experienced cycles of optimism and disappointment known as 'AI winters.' The field saw renewed interest in the 2010s with the rise of deep learning, fueled by large datasets, powerful GPUs, and advances in neural network architectures. Today, AI powers everything from voice assistants to autonomous vehicles to medical diagnosis systems."""),
]

print(f"{'Text':<20} {'Characters':<12} {'Est. Tokens':<12} {'Words':<8}")
print("-" * 55)
for label, text in texts:
    chars = len(text)
    tokens = estimate_tokens(text)
    words = len(text.split())
    print(f"{label:<20} {chars:<12} {tokens:<12} {words:<8}")

print("\n💡 Rule of thumb: 1 token ≈ 4 characters ≈ ¾ of a word (in English)")
print("   A 4096-token window ≈ 3000 words ≈ 6 pages of text")

### Experiment 2B: What Happens When Context Gets Too Long

When conversation history grows large, the model may:
1. **Forget earlier context**: information far from the end of the prompt can get less attention
2. **Produce lower quality responses**: too much noise in the input
3. **Hit the limit**: depending on the backend, the request is rejected or the oldest input is silently truncated

Let's simulate a long conversation and test if the model remembers early information.
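
Reading each answer by eye works for a single run, but a small check makes recall tests repeatable. The sketch below assumes a hypothetical `missing_terms` helper that reports which required facts are absent from a response; the two sample responses are written by hand for illustration and are not model output.

```python
def missing_terms(response, required_terms):
    """Return the required terms that do not appear in the response (case-insensitive)."""
    lowered = response.lower()
    return [term for term in required_terms if term.lower() not in lowered]


# Hand-written stand-ins for model responses
remembered = "Your name is Priya and you are allergic to peanuts."
forgotten = "I'm sorry, I don't have that information."

for label, response in [("remembered", remembered), ("forgotten", forgotten)]:
    missing = missing_terms(response, ["Priya", "peanut"])
    print(f"{label}: {'all facts recalled' if not missing else f'missing {missing}'}")
```

Substring matching is deliberately crude: it will not catch a response that mentions a term while getting the fact wrong, so treat it as a first filter rather than a full evaluation.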

In [None]:
# Build a conversation where important info is mentioned early,
# then buried under many turns of filler

conversation = [
    {"role": "system", "content": "You are a helpful assistant. Answer questions based on the conversation."},
    # Important info planted at the start
    {"role": "user", "content": "My name is Priya and I'm allergic to peanuts. Remember this."},
    {"role": "assistant", "content": "Got it, Priya! I'll remember that you're allergic to peanuts."},
]

# Add filler conversation turns
filler_topics = [
    ("What's the capital of France?", "The capital of France is Paris."),
    ("Tell me a fun fact about octopuses.", "Octopuses have three hearts and blue blood!"),
    ("What's 17 times 23?", "17 times 23 equals 391."),
    ("Recommend a good book.", "I recommend 'Sapiens' by Yuval Noah Harari."),
    ("What causes rainbows?", "Rainbows are caused by light refracting through water droplets."),
    ("Name 3 programming languages.", "Python, JavaScript, and Rust are popular programming languages."),
    ("Who painted the Mona Lisa?", "Leonardo da Vinci painted the Mona Lisa."),
    ("What's the tallest mountain?", "Mount Everest is the tallest mountain at 8,849 meters."),
    ("Explain photosynthesis briefly.", "Plants convert sunlight, water, and CO2 into glucose and oxygen."),
    ("What year did India gain independence?", "India gained independence in 1947."),
]

for user_msg, asst_msg in filler_topics:
    conversation.append({"role": "user", "content": user_msg})
    conversation.append({"role": "assistant", "content": asst_msg})

# Now ask about the info from the beginning
conversation.append({"role": "user", "content": "What's my name and what am I allergic to?"})

print(f"Total messages in conversation: {len(conversation)}")
total_chars = sum(len(m['content']) for m in conversation)
print(f"Estimated total tokens: ~{total_chars // 4}")
print("\nAsking: 'What's my name and what am I allergic to?'")
print("=" * 60)
_ = chat(conversation, temperature=0.0)

print("\n💡 With a small context, the model should remember.")
print("   But as conversations grow to thousands of tokens, early info can get lost.")

### Experiment 2C: Strategy — Summarize Previous Context

Instead of passing the entire conversation history, **summarize** older turns and keep only recent ones. This saves tokens while preserving important information.

In [None]:
# Strategy: Summarize old context into a compact system message

# Full conversation approach (expensive)
print("=" * 60)
print("APPROACH 1: Full conversation history (10 messages)")
print("=" * 60)

full_conversation = [
    {"role": "system", "content": "You are a project planning assistant."},
    {"role": "user", "content": "I'm building an e-commerce app using React and Node.js."},
    {"role": "assistant", "content": "Great choice! React for the frontend and Node.js for the backend is a popular stack."},
    {"role": "user", "content": "The database will be PostgreSQL."},
    {"role": "assistant", "content": "PostgreSQL is excellent for e-commerce — it handles complex queries and transactions well."},
    {"role": "user", "content": "My team has 3 developers and we have 8 weeks."},
    {"role": "assistant", "content": "With 3 devs and 8 weeks, you'll want to prioritize core features first."},
    {"role": "user", "content": "The budget is $15,000."},
    {"role": "assistant", "content": "That's a reasonable budget for an MVP. Let's focus on essential features."},
    # ... imagine many more turns ...
    {"role": "user", "content": "Give me a sprint plan for the first 2 weeks."}
]

total_chars = sum(len(m['content']) for m in full_conversation)
print(f"Estimated tokens: ~{total_chars // 4}")
_ = chat(full_conversation, temperature=0.3)

In [None]:
# Summarized approach (token-efficient)
print("=" * 60)
print("APPROACH 2: Summarized context (2 messages)")
print("=" * 60)

summarized_conversation = [
    {
        "role": "system",
        "content": (
            "You are a project planning assistant. "
            "Here is a summary of the conversation so far: "
            "The user is building an e-commerce app with React (frontend), "
            "Node.js (backend), and PostgreSQL (database). "
            "Team: 3 developers. Timeline: 8 weeks. Budget: $15,000."
        )
    },
    {"role": "user", "content": "Give me a sprint plan for the first 2 weeks."}
]

total_chars = sum(len(m['content']) for m in summarized_conversation)
print(f"Estimated tokens: ~{total_chars // 4}")
_ = chat(summarized_conversation, temperature=0.3)

print("\n💡 The summarized version uses far fewer tokens but preserves key facts!")

### Experiment 2D: Sliding Window — Keep Only Last N Turns

Another strategy: only keep the **most recent N message pairs** in the conversation. Older messages get dropped.
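
A fixed number of pairs ignores how long each message is. A variant, sketched below, trims by estimated token budget instead: it walks backwards from the newest message and keeps as many as fit. `trim_to_budget` is a hypothetical name, and the 4-characters-per-token estimate is the same rough rule of thumb used earlier.

```python
def estimate_tokens(text):
    """Rough estimate: about 4 characters per token for English text."""
    return len(text) // 4


def trim_to_budget(messages, max_tokens):
    """Keep system messages plus the newest messages that fit in max_tokens."""
    system_msgs = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system_msgs)
    kept = []
    for message in reversed(others):  # Newest first
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # Restore chronological order

    # Drop a leading assistant reply whose question was trimmed away
    if kept and kept[0]["role"] == "assistant":
        kept = kept[1:]
    return system_msgs + kept


history = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "a" * 400},       # ~100 tokens
    {"role": "assistant", "content": "b" * 400},  # ~100 tokens
    {"role": "user", "content": "c" * 200},       # ~50 tokens
]
trimmed = trim_to_budget(history, max_tokens=170)
print([m["role"] for m in trimmed])
```

With a 170-token budget only the newest user message survives alongside the system message; raising the budget to 300 keeps the whole history.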

In [None]:
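def sliding_window(messages, max_pairs=3):
    """
    Keep the system message(s), the last N complete user-assistant pairs,
    and the pending user question if the conversation ends with one.
    """
    system_msgs = [m for m in messages if m["role"] == "system"]
    non_system = [m for m in messages if m["role"] != "system"]
    # A trailing user message is the new, unanswered question: always keep it
    pending = []
    if non_system and non_system[-1]["role"] == "user":
        pending = [non_system.pop()]
    # Keep last max_pairs * 2 messages (each pair = user + assistant);
    # guard against max_pairs=0, where [-0:] would keep everything
    trimmed = non_system[-(max_pairs * 2):] if max_pairs > 0 else []
    return system_msgs + trimmed + pending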


# Simulate a long conversation
long_conversation = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "How do I create a list in Python?"},
    {"role": "assistant", "content": "Use square brackets: my_list = [1, 2, 3]"},
    {"role": "user", "content": "How do I add an item?"},
    {"role": "assistant", "content": "Use .append(): my_list.append(4)"},
    {"role": "user", "content": "How do I remove an item?"},
    {"role": "assistant", "content": "Use .remove(value) or .pop(index)"},
    {"role": "user", "content": "How do I sort the list?"},
    {"role": "assistant", "content": "Use .sort() for in-place or sorted() for a new list"},
    {"role": "user", "content": "How do I reverse it?"},
    {"role": "assistant", "content": "Use .reverse() or slicing: my_list[::-1]"},
    # New question
    {"role": "user", "content": "Now how do I find the length?"}
]

print("FULL CONVERSATION")
print(f"Messages: {len(long_conversation)}")
print()

# Apply sliding window
windowed = sliding_window(long_conversation, max_pairs=2)

print("AFTER SLIDING WINDOW (keep last 2 pairs)")
print(f"Messages: {len(windowed)}")
print("-" * 40)
for m in windowed:
    print(f"  [{m['role']:>9}] {m['content'][:60]}")

print(f"\n{'=' * 60}")
print("Response with windowed context:")
_ = chat(windowed, temperature=0.0)

### Context Window Management — Summary of Strategies

| Strategy | How It Works | Pros | Cons |
|----------|-------------|------|------|
| **Full history** | Pass everything | Complete context | Expensive, may hit limits |
| **Sliding window** | Keep last N turns | Simple, predictable | Loses old info |
| **Summarization** | Summarize old turns | Compact, preserves key facts | Requires extra LLM call |
| **Hybrid** | Summarize old + keep recent | Best of both | More complex to implement |
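
The hybrid row can be sketched as follows. This is a minimal illustration under stated assumptions: `hybrid_context` is a hypothetical helper, and `summarize` is any callable that turns a list of messages into a summary string. In practice it would wrap an extra LLM call; here a stand-in is injected so the sketch runs without a model.

```python
def hybrid_context(messages, summarize, keep_recent=4):
    """Summarize older turns and keep the most recent messages verbatim."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")

    system_msgs = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if len(others) <= keep_recent:
        return messages  # Nothing old enough to summarize

    old, recent = others[:-keep_recent], others[-keep_recent:]
    base = " ".join(m["content"] for m in system_msgs)
    summary_msg = {
        "role": "system",
        "content": f"{base} Summary of the conversation so far: {summarize(old)}".strip(),
    }
    return [summary_msg] + recent


def fake_summarize(old_messages):
    """Stand-in summarizer: joins the user messages. A real one would call the model."""
    return " ".join(m["content"] for m in old_messages if m["role"] == "user")


conversation = [
    {"role": "system", "content": "You are a project planning assistant."},
    {"role": "user", "content": "I'm building an e-commerce app using React and Node.js."},
    {"role": "assistant", "content": "Great choice!"},
    {"role": "user", "content": "The database will be PostgreSQL."},
    {"role": "assistant", "content": "PostgreSQL handles transactions well."},
    {"role": "user", "content": "Give me a sprint plan for the first 2 weeks."},
]
compact = hybrid_context(conversation, fake_summarize, keep_recent=3)
for m in compact:
    print(f"[{m['role']}] {m['content']}")
```

Passing the summarizer in as an argument keeps the trimming logic testable on its own and lets the summary model differ from the chat model.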

### Exercise 2: Token Budget Calculator

Calculate how many conversation turns you can fit in different context windows.
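
As a worked instance of this calculation, assume (for illustration) a context window of $W = 4096$ tokens, a system prompt of $S = 50$ tokens, an output reserve of $R = 500$ tokens, and turns that average $U = 30$ user tokens plus $A = 150$ assistant tokens:

$$
\text{max turns} = \left\lfloor \frac{W - S - R}{U + A} \right\rfloor = \left\lfloor \frac{4096 - 50 - 500}{30 + 150} \right\rfloor = \left\lfloor \frac{3546}{180} \right\rfloor = 19
$$

Under these assumptions, roughly 19 full exchanges fit before older turns have to be trimmed or summarized.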

In [None]:
# How many turns fit in a context window?

system_prompt_tokens = 50     # Typical system prompt
avg_user_tokens = 30          # Average user message
avg_assistant_tokens = 150    # Average assistant response
output_reserve = 500          # Tokens reserved for the next response

tokens_per_turn = avg_user_tokens + avg_assistant_tokens  # One turn = user + assistant

context_windows = [4096, 8192, 32768, 128000]

print("Assumptions:")
print(f"  System prompt: ~{system_prompt_tokens} tokens")
print(f"  Avg user msg:  ~{avg_user_tokens} tokens")
print(f"  Avg assistant: ~{avg_assistant_tokens} tokens")
print(f"  Output reserve: {output_reserve} tokens")
print(f"  Per turn:      ~{tokens_per_turn} tokens")
print()
print(f"{'Context Window':<18} {'Available':<12} {'Max Turns':<12}")
print("-" * 42)
for window in context_windows:
    available = window - system_prompt_tokens - output_reserve
    max_turns = available // tokens_per_turn
    print(f"{window:>10} tokens  {available:>8}       {max_turns:>5} turns")

print("\n💡 This is why context management matters: even 128K windows fill up in long sessions!")

---

## 3. Common Prompt Pitfalls

Even experienced developers write bad prompts sometimes. Here are the most common mistakes and how to fix them.

| Pitfall | Problem | Fix |
|---------|---------|-----|
| **Vague instructions** | Model guesses what you want | Be specific about task, format, and length |
| **Overloaded prompts** | Too many tasks in one prompt | Break into separate prompts |
| **Leading/biased phrasing** | Model follows your bias instead of being objective | Use neutral language |
| **Missing constraints** | Model rambles or adds unwanted content | Set explicit boundaries |
| **Conflicting instructions** | System and user prompts contradict | Align system and user messages |

### Experiment 3A: Vague Instructions → Inconsistent Output
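
Consistency across runs can be put on a rough numeric footing. The sketch below uses mean pairwise Jaccard similarity over word sets, which is one simple choice among many and not a standard benchmark; the sample outputs are hand-written stand-ins, not real model responses.

```python
from itertools import combinations


def jaccard(a, b):
    """Word-set overlap between two texts: 1.0 = same vocabulary, 0.0 = nothing shared."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def mean_pairwise_similarity(outputs):
    """Average Jaccard similarity over every pair of outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        raise ValueError("Need at least two outputs to compare.")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hand-written stand-ins for three runs of a vague and a specific prompt
varied = ["Dogs are loyal pets.", "A poem about a puppy in the rain.", "History of dog breeding."]
similar = ["1. Goldens love water.", "1. Goldens love swimming.", "1. Goldens love water."]

print(f"varied:  {mean_pairwise_similarity(varied):.2f}")
print(f"similar: {mean_pairwise_similarity(similar):.2f}")
```

A higher score means the runs share more vocabulary. To apply it to real outputs, collect the return values of several runs in a list and pass that list in.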

In [None]:
# Run the same vague prompt 3 times — watch how different the outputs are
print("=" * 60)
print("PITFALL: Vague instructions")
print("=" * 60)

vague_prompt = [{"role": "user", "content": "Write something about dogs."}]

print("Prompt: 'Write something about dogs.'\n")
for run in range(3):
    print(f"--- Run {run + 1} ---")
    result = chat(vague_prompt, temperature=0.8)
    print()

In [None]:
# --- Fixed version ---
print("=" * 60)
print("FIXED: Specific instructions")
print("=" * 60)

specific_prompt = [
    {"role": "user", "content": (
        "Write 3 fun facts about golden retrievers. "
        "Use a numbered list. One sentence per fact."
    )}
]

print("Prompt: 'Write 3 fun facts about golden retrievers...'\n")
for run in range(3):
    print(f"--- Run {run + 1} ---")
    result = chat(specific_prompt, temperature=0.8)
    print()

print("💡 Much more consistent! Specificity reduces variance.")

### Experiment 3B: Overloaded Prompts → Incomplete Output

Asking the model to do too many things in one prompt often leads to some tasks being done poorly or skipped.

In [None]:
# --- Overloaded: 5 tasks crammed into one prompt ---
print("=" * 60)
print("PITFALL: Overloaded prompt (too many tasks)")
print("=" * 60)

overloaded = [
    {"role": "user", "content": (
        "Explain what a REST API is, give me a Python code example, "
        "list the HTTP methods, compare REST vs GraphQL, "
        "and suggest when to use each one."
    )}
]
show_messages(overloaded)
_ = chat(overloaded, temperature=0.3, num_predict=300)

print("\n💡 Notice: With a 300-token output cap (num_predict=300), later tasks tend to get cut off or rushed.")

In [None]:
# --- Fixed: Break into focused prompts ---
print("=" * 60)
print("FIXED: One task at a time")
print("=" * 60)

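# One focused prompt per task from the overloaded version above
focused_tasks = [
    "Explain what a REST API is in 2-3 sentences.",
    "Give a short Python code example that calls a REST API.",
    "List the 5 main HTTP methods with a one-line description each.",
    "Compare REST vs GraphQL in a 3-row table (columns: Feature, REST, GraphQL).",
    "In 2-3 sentences, suggest when to use REST and when to use GraphQL.",
]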

for i, task in enumerate(focused_tasks, 1):
    print(f"\n--- Task {i} ---")
    messages = [{"role": "user", "content": task}]
    _ = chat(messages, temperature=0.3)

print("\n💡 Each response should now be complete and focused!")

### Experiment 3C: Leading / Biased Prompts

The way you phrase a question can **bias** the model's response. This is especially dangerous for tasks requiring objectivity.

In [None]:
system = {"role": "system", "content": "You are an objective technology analyst. Give balanced assessments."}

prompts = [
    {
        "label": "BIASED (leading question)",
        "content": "Why is Python the best programming language for everything?"
    },
    {
        "label": "NEUTRAL (balanced question)",
        "content": "What are the strengths and weaknesses of Python compared to other languages?"
    }
]

for p in prompts:
    print(f"\n{'=' * 60}")
    print(f"  {p['label']}")
    print(f"{'=' * 60}")
    messages = [system, {"role": "user", "content": p["content"]}]
    show_messages(messages)
    _ = chat(messages, temperature=0.3, num_predict=250)

print("\n💡 The biased prompt can push the model to agree. Neutral phrasing tends to get a balanced analysis.")

### Experiment 3D: Missing Constraints → Model Rambles
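
Stating constraints is only half the job, because models do not always obey them. The sketch below shows one way to verify a response against a word limit and a numbered-list item count. `check_constraints` is a hypothetical helper, and the sample response is hand-written rather than model output.

```python
import re


def check_constraints(text, max_words, expected_items):
    """Check a response against a word limit and a numbered-list item count.

    Returns a list of human-readable violations (empty if all checks pass).
    """
    problems = []
    word_count = len(text.split())
    if word_count > max_words:
        problems.append(f"{word_count} words exceeds the limit of {max_words}")

    # Count lines that start with a list marker such as "1." or "2)"
    items = re.findall(r"^\s*\d+[.)]\s+", text, flags=re.MULTILINE)
    if len(items) != expected_items:
        problems.append(f"found {len(items)} numbered items, expected {expected_items}")
    return problems


# Hand-written sample response
sample = (
    "1. Burning fossil fuels releases carbon dioxide.\n"
    "2. Deforestation removes trees that absorb carbon.\n"
    "3. Agriculture emits methane and nitrous oxide."
)
print(check_constraints(sample, max_words=50, expected_items=3))
print(check_constraints(sample, max_words=10, expected_items=5))
```

An empty list means the response passed. A non-empty list could be fed back to the model as a correction request, which is often more reliable than tightening the wording of the original prompt.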

In [None]:
# --- No constraints: model decides everything ---
print("=" * 60)
print("PITFALL: No constraints")
print("=" * 60)

no_constraints = [{"role": "user", "content": "Tell me about climate change."}]
result_nc = chat(no_constraints, temperature=0.5)
print(f"\n  [Word count: {len(result_nc.split())}]")

In [None]:
# --- With constraints: focused and controlled ---
print("=" * 60)
print("FIXED: Clear constraints")
print("=" * 60)

with_constraints = [
    {
        "role": "system",
        "content": "You are a science communicator. Keep responses concise and factual."
    },
    {
        "role": "user",
        "content": (
            "Explain the top 3 causes of climate change. "
            "Use a numbered list. One sentence per cause. "
            "No more than 50 words total."
        )
    }
]
result_wc = chat(with_constraints, temperature=0.3)
print(f"\n  [Word count: {len(result_wc.split())}]")

print(f"\n💡 Unconstrained: {len(result_nc.split())} words vs. Constrained: {len(result_wc.split())} words")

### Experiment 3E: Conflicting Instructions

What happens when the system prompt says one thing and the user prompt says another?

In [None]:
# --- Conflicting: system says formal, user says casual ---
print("=" * 60)
print("PITFALL: Conflicting instructions")
print("=" * 60)

conflicting = [
    {
        "role": "system",
        "content": "You are a formal academic professor. Always use scholarly language and cite sources."
    },
    {
        "role": "user",
        "content": "Hey bro, explain quantum physics like I'm 5. Keep it super chill and funny. Use slang."
    }
]
show_messages(conflicting)
_ = chat(conflicting, temperature=0.5)

print("\n💡 The model may try to satisfy both instructions, resulting in an awkward mix.")

In [None]:
# --- Fixed: aligned instructions ---
print("=" * 60)
print("FIXED: Aligned instructions")
print("=" * 60)

aligned = [
    {
        "role": "system",
        "content": "You are a fun, casual science explainer. Use simple language and humor."
    },
    {
        "role": "user",
        "content": "Explain quantum physics like I'm 5. Keep it super chill and funny."
    }
]
show_messages(aligned)
_ = chat(aligned, temperature=0.5)

print("\n💡 When system and user are aligned, the output is coherent and on-tone.")

### Exercise 3: Fix the Broken Prompts

Each prompt below has a problem. Run it, identify the issue, then fix it in the cell after.

In [None]:
# --- Broken Prompt 1: Too vague ---
print("BROKEN PROMPT 1")
print("=" * 40)
broken_1 = [{"role": "user", "content": "Help me with my project."}]
_ = chat(broken_1)

In [None]:
# Fix broken prompt 1 here:
fixed_1 = [{"role": "user", "content": ""}]  # <-- Write your fix

if fixed_1[0]["content"]:
    print("YOUR FIX:")
    _ = chat(fixed_1)
else:
    print("Write your improved prompt above!")

In [None]:
# --- Broken Prompt 2: Conflicting tone ---
print("BROKEN PROMPT 2")
print("=" * 40)
broken_2 = [
    {"role": "system", "content": "You are a children's storyteller. Use fun, playful language."},
    {"role": "user", "content": "Write a technical analysis of TCP/IP networking protocols with RFC references."}
]
_ = chat(broken_2)

In [None]:
# Fix broken prompt 2 here:
fixed_2 = [
    {"role": "system", "content": ""},  # <-- Fix the system prompt
    {"role": "user", "content": ""}      # <-- Fix the user prompt
]

if fixed_2[0]["content"] and fixed_2[1]["content"]:
    print("YOUR FIX:")
    _ = chat(fixed_2)
else:
    print("Write your improved prompts above!")

In [None]:
# --- Broken Prompt 3: No format, no constraints ---
print("BROKEN PROMPT 3")
print("=" * 40)
broken_3 = [{"role": "user", "content": "Compare React and Angular."}]
_ = chat(broken_3)

In [None]:
# Fix broken prompt 3 here:
fixed_3 = [{"role": "user", "content": ""}]  # <-- Add format and constraints

if fixed_3[0]["content"]:
    print("YOUR FIX:")
    _ = chat(fixed_3)
else:
    print("Write your improved prompt above!")

---

## Key Takeaways

### Prompt Anatomy

| Component | Purpose | Impact on Output |
|-----------|---------|------------------|
| **Task** | What to do | Determines the core response |
| **Context** | Background info | Makes response relevant |
| **Role** | Who to be | Shapes tone and expertise level |
| **Format** | How to structure | Controls output layout |
| **Constraints** | What limits to follow | Prevents rambling and off-topic content |

### Context Window Management

| Strategy | When to Use |
|----------|-------------|
| **Full history** | Short conversations (< 10 turns) |
| **Sliding window** | Chatbots where recent context matters most |
| **Summarization** | Long sessions where early facts are important |
| **Token budgeting** | Production systems with cost constraints |

### Common Pitfalls Cheat Sheet

| Pitfall | Quick Fix |
|---------|----------|
| Vague instructions | Add task + format + constraints |
| Overloaded prompt | One task per prompt |
| Biased/leading phrasing | Use neutral phrasing, e.g. "What are the pros and cons?" |
| No constraints | Set length, format, and scope limits |
| Conflicting instructions | Align system and user prompts |

---

*Nunnari Academy | Generative AI & Agentic AI Professional Program*  
*Module 1, Week 2, Section 2.1 | Prompt Anatomy, Context Windows & Common Pitfalls*