# 2.3 Advanced Prompt Engineering Strategies

## Playground Notebook

This notebook covers **advanced prompting strategies** for multi-step reasoning, multi-path exploration, and iterative prompt refinement.

| Strategy | Core Idea |
|----------|-----------|
| **Chain of Thought (CoT)** | Ask the model to reason step-by-step before answering |
| **Tree of Thought (ToT)** | Explore multiple reasoning paths, evaluate, then commit |
| **Self-Consistency** | Run the same prompt N times, pick the majority answer |
| **ReAct** | Interleave Reasoning and Acting in a loop |
| **Prompt Chaining** | Pipeline where each step feeds into the next |
| **Meta-Prompting** | Use the model to generate and improve its own prompts |

---

In [3]:
import json
import time
import re
from collections import Counter
from IPython.display import display, Markdown, HTML
from langchain_ollama import ChatOllama
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

# ============================================================
#  CONFIGURATION
# ============================================================
MODEL = "deepseek-r1:8b"  # Good for reasoning tasks. Alternatives: gpt-oss:20b, qwen2.5:1.5b

llm = ChatOllama(model=MODEL)

# ============================================================
#  HELPER FUNCTIONS
# ============================================================

def build_messages(message_dicts):
    """Convert role/content dicts into LangChain message objects."""
    type_map = {"system": SystemMessage, "user": HumanMessage, "assistant": AIMessage}
    return [type_map[m["role"]](content=m["content"]) for m in message_dicts]


def chat(messages, show=True, **kwargs):
    """Send messages to the model and return the response text."""
    _llm = ChatOllama(model=MODEL, **kwargs) if kwargs else llm
    lc_messages = build_messages(messages)
    start = time.time()
    response = _llm.invoke(lc_messages)
    elapsed = time.time() - start
    content = response.content
    if show:
        display(Markdown(content))
        print(f"\n⏱️ {elapsed:.2f}s | {len(content)} chars")
    return content


def show_messages(messages):
    """Pretty-print the message list."""
    colors = {"system": "#e74c3c", "user": "#3498db", "assistant": "#2ecc71"}
    html = ""
    for msg in messages:
        role = msg["role"]
        color = colors.get(role, "#888")
        preview = msg["content"][:400] + ("..." if len(msg["content"]) > 400 else "")
        html += (
            f'<div style="margin:6px 0;padding:8px 12px;border-left:4px solid {color};'
            f'background:#1e1e1e;border-radius:4px;">'
            f'<strong style="color:{color};text-transform:uppercase;">{role}</strong>'
            f'<br><span style="color:#ccc;white-space:pre-wrap;">{preview}</span></div>'
        )
    display(HTML(html))


def section_header(title, subtitle=""):
    html = f"""
    <div style="background:linear-gradient(135deg,#1a1a2e,#16213e);padding:16px 20px;
                border-radius:8px;border-left:5px solid #e94560;margin:20px 0;">
      <h2 style="color:#e94560;margin:0;">{title}</h2>
      <p style="color:#aaa;margin:6px 0 0;">{subtitle}</p>
    </div>"""
    display(HTML(html))


print(f"✅ Using model: {MODEL}")


✅ Using model: deepseek-r1:8b


---

## 1. Chain of Thought (CoT) Prompting

**Chain of Thought** prompting asks the model to *show its work*: it reasons step by step before reaching a final answer. This often improves accuracy on tasks requiring arithmetic, logic, or multi-step inference.

```
Standard:  Question → Answer
CoT:       Question → Step 1 → Step 2 → Step 3 → Answer
```

There are two flavours:
- **Zero-shot CoT**: Just add *"Let's think step by step"*; no examples needed
- **Few-shot CoT**: Provide examples that demonstrate full reasoning chains
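
As a rough sketch of how the two flavours differ in practice, the helpers below build message lists in the same role/content dict format used throughout this notebook. The names `COT_TRIGGER`, `zero_shot_cot`, and `few_shot_cot` are illustrative assumptions, not part of any library.

```python
COT_TRIGGER = "Let's think step by step."


def zero_shot_cot(question, system="You are a helpful assistant."):
    """Zero-shot CoT: append the trigger phrase and supply no examples."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"{question}\n\n{COT_TRIGGER}"},
    ]


def few_shot_cot(question, examples, system="Show your reasoning step by step."):
    """Few-shot CoT: each example is a (question, worked_solution) pair."""
    messages = [{"role": "system", "content": system}]
    for example_question, worked_solution in examples:
        # Each worked example becomes one user turn and one assistant turn
        messages.append({"role": "user", "content": example_question})
        messages.append({"role": "assistant", "content": worked_solution})
    messages.append({"role": "user", "content": question})
    return messages
```

Either list can be passed to the `chat` helper defined earlier; the only difference is whether worked examples precede the real question.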

### Experiment 1A: Standard vs. Zero-Shot CoT on a Math Problem

In [4]:
problem = (
    "A store sells apples for $0.75 each and oranges for $1.20 each. "
    "If Sara buys 8 apples and 5 oranges, then uses a 15% discount coupon, "
    "how much does she pay in total?"
)

# ---- STANDARD (no reasoning) ----
print("=" * 60)
print("  STANDARD PROMPT (direct answer)")
print("=" * 60)
messages_std = [
    {"role": "system", "content": "You are a helpful math assistant. Give only the final numeric answer."},
    {"role": "user", "content": problem}
]
show_messages(messages_std)
_ = chat(messages_std, temperature=0.0)

print()

# ---- ZERO-SHOT CoT ----
print("=" * 60)
print("  ZERO-SHOT CoT (step by step)")
print("=" * 60)
messages_cot = [
    {"role": "system", "content": "You are a helpful math assistant."},
    {"role": "user", "content": problem + "\n\nLet's think step by step."}
]
show_messages(messages_cot)
_ = chat(messages_cot, temperature=0.0)

print("\n💡 Zero-shot CoT added 'Let's think step by step' - that's the entire trick for zero-shot!")

  STANDARD PROMPT (direct answer)


<think>
First, I need to calculate the total cost of the apples Sara buys.

Each apple costs \$0.75, so multiplying that by 8 gives me the total cost for apples: 0.75 * 8 = \$6.00.

Next, I'll calculate the total cost of the oranges.

Each orange costs \$1.20, and Sara buys 5 of them. So, 1.20 * 5 = \$6.00.

Adding the cost of the apples and oranges together gives me the subtotal: \$6.00 + \$6.00 = \$12.00.

Now, I'll apply the 15% discount coupon to this subtotal.

A 15% discount means Sara pays 85% of the original amount. To find 85%, I can multiply the subtotal by 0.85: 12.00 * 0.85 = \$10.20.

Therefore, after applying the discount, Sara will pay a total of \$10.20.
</think>

**Solution:**

1. **Calculate the cost of apples:**
   
   \[
   \text{Cost of apples} = 8 \times \$0.75 = \$6.00
   \]

2. **Calculate the cost of oranges:**
   
   \[
   \text{Cost of oranges} = 5 \times \$1.20 = \$6.00
   \]

3. **Calculate the total cost before discount:**
   
   \[
   \text{Total cost before discount} = \$6.00 + \$6.00 = \$12.00
   \]

4. **Apply the 15% discount:**
   
   - A 15% discount means Sara pays 85% of the original amount.
   - Calculate 85% of \$12.00:
     
     \[
     \$12.00 \times (1 - 0.15) = \$12.00 \times 0.85 = \$10.20
     \]

5. **Final amount Sara pays:**
   
   \[
   \boxed{\$10.20}
   \]


⏱️ 21.57s | 1330 chars

  ZERO-SHOT CoT (step by step)


<think>
First, I need to calculate the total cost of the apples Sara is buying. Each apple costs \$0.75, and she is purchasing 8 apples.

Next, I'll determine the total cost of the oranges. Each orange costs \$1.20, and Sara buys 5 oranges.

After finding the individual totals for apples and oranges, I'll add them together to get the combined cost before applying any discounts.

Then, I'll apply a 15% discount to the combined total. To do this, I'll calculate 15% of the total cost and subtract it from the original amount.

Finally, I'll sum up the discounted amount to find out how much Sara pays in total.
</think>

Let's break down the problem step by step to determine how much Sara pays in total for her purchase.

### **Step 1: Calculate the Total Cost of Apples**

- **Cost per apple:** \$0.75
- **Number of apples bought:** 8

\[
\text{Total cost of apples} = 8 \times \$0.75 = \$6.00
\]

### **Step 2: Calculate the Total Cost of Oranges**

- **Cost per orange:** \$1.20
- **Number of oranges bought:** 5

\[
\text{Total cost of oranges} = 5 \times \$1.20 = \$6.00
\]

### **Step 3: Combine the Costs Before Discount**

\[
\text{Combined total before discount} = \$6.00 + \$6.00 = \$12.00
\]

### **Step 4: Apply a 15% Discount Coupon**

- **Discount percentage:** 15%
- **Total cost before discount:** \$12.00

First, calculate the discount amount:

\[
\text{Discount} = 15\% \times \$12.00 = 0.15 \times 12 = \$1.80
\]

Now, subtract the discount from the total cost:

\[
\text{Total after discount} = \$12.00 - \$1.80 = \$10.20
\]

### **Final Amount Paid**

Sara pays a total of:

\[
\boxed{\$10.20}
\]


⏱️ 15.10s | 1620 chars

💡 Zero-shot CoT added 'Let's think step by step' - that's the entire trick for zero-shot!
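
Both recorded runs arrive at \$10.20, and with this reasoning model the "standard" prompt still produced a full reasoning trace. A quick way to confirm the arithmetic independently is sketched below; it uses `Decimal` rather than floats so the currency values stay exact. The variable names are illustrative.

```python
from decimal import Decimal

apples = 8 * Decimal("0.75")             # cost of 8 apples
oranges = 5 * Decimal("1.20")            # cost of 5 oranges
subtotal = apples + oranges              # total before the coupon
discount = subtotal * Decimal("0.15")    # 15% off the subtotal
total = (subtotal - discount).quantize(Decimal("0.01"))

print(f"subtotal={subtotal} discount={discount} total={total}")
```

Having a known-good reference value like this makes it easier to score model answers automatically when comparing prompting strategies.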


### Experiment 1B: Few-Shot CoT for Teaching a Reasoning Style

In **few-shot CoT**, you provide complete examples including the intermediate reasoning steps. This tends to be more reliable than zero-shot CoT and lets you enforce a specific reasoning format.

In [5]:
# Few-shot CoT: give examples with full reasoning chains
messages = [
    {
        "role": "system",
        "content": "Solve math word problems. Always show your reasoning step by step, then state the final answer clearly."
    },
    # Example 1
    {
        "role": "user",
        "content": "A train travels 60 km/h for 2.5 hours. How far does it travel?"
    },
    {
        "role": "assistant",
        "content": (
            "Step 1: Identify given values.\n"
            "  Speed = 60 km/h, Time = 2.5 hours\n\n"
            "Step 2: Apply the formula Distance = Speed × Time.\n"
            "  Distance = 60 × 2.5 = 150 km\n\n"
            "**Final Answer: 150 km**"
        )
    },
    # Example 2
    {
        "role": "user",
        "content": "If 3 workers can build a wall in 12 days, how many days will 6 workers take?"
    },
    {
        "role": "assistant",
        "content": (
            "Step 1: Find the total work in worker-days.\n"
            "  Total work = 3 workers × 12 days = 36 worker-days\n\n"
            "Step 2: Divide by the new number of workers.\n"
            "  Days = 36 worker-days ÷ 6 workers = 6 days\n\n"
            "**Final Answer: 6 days**"
        )
    },
    # Actual question
    {"role": "user", "content": problem}
]

print("FEW-SHOT CoT - Structured Reasoning Chain")
print("=" * 60)
show_messages(messages)
_ = chat(messages, temperature=0.0)

FEW-SHOT CoT - Structured Reasoning Chain


<think>
Okay, so I need to figure out how much Sara pays for the apples and oranges after using the discount. Let me start by calculating the cost of the apples and oranges separately.

First, the apples are $0.75 each, and Sara buys 8 of them. So, I'll multiply 0.75 by 8 to find the total cost for the apples.

Next, the oranges are $1.20 each, and she buys 5 of them. I'll multiply 1.20 by 5 to get the total cost for the oranges.

After finding the costs for both fruits, I'll add them together to get the total amount before applying the discount.

Now, Sara uses a 15% discount coupon. To calculate the discount, I'll find 15% of the total cost and then subtract that from the original total.

Finally, I'll add the discounted amount to get the final amount Sara pays.
</think>

**Solution:**

1. **Calculate the cost of apples:**
   - Price per apple = \$0.75
   - Number of apples = 8
   - Total cost for apples = \$0.75 × 8 = \$6.00

2. **Calculate the cost of oranges:**
   - Price per orange = \$1.20
   - Number of oranges = 5
   - Total cost for oranges = \$1.20 × 5 = \$6.00

3. **Calculate the total cost before discount:**
   - Total cost = Cost of apples + Cost of oranges
   - Total cost = \$6.00 + \$6.00 = \$12.00

4. **Apply the 15% discount:**
   - Discount rate = 15% = 0.15
   - Discount amount = Total cost × Discount rate
   - Discount amount = \$12.00 × 0.15 = \$1.80

5. **Calculate the final amount after discount:**
   - Final amount = Total cost - Discount amount
   - Final amount = \$12.00 - \$1.80 = \$10.20

**Final Answer:** Sara pays a total of \boxed{\$10.20}.


⏱️ 15.87s | 1598 chars
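
The recorded responses wrap their reasoning in `<think>` tags and put the final value in `\boxed{...}`. A minimal sketch for separating the two is shown below; it assumes the model keeps using that output shape, which the prompts here do not guarantee. The helper names `strip_think` and `extract_boxed` are hypothetical.

```python
import re


def strip_think(text):
    """Remove <think>...</think> blocks emitted by reasoning models."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def extract_boxed(text):
    r"""Return the contents of the last \boxed{...} in text, or None."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    # Unescape the LaTeX dollar sign so the result reads as plain text
    return matches[-1].replace("\\$", "$") if matches else None
```

A typical call would be `extract_boxed(strip_think(response))`, with a `None` result treated as "the model did not follow the expected format".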


### Experiment 1C: CoT for Logical Reasoning

CoT isn't just for math; it also helps on **logical deduction** and **commonsense reasoning** tasks.

In [None]:
logic_puzzle = """
Alice, Bob, Carol, and Dave each have a different pet: a cat, a dog, a rabbit, and a fish.

Clues:
1. Alice does not have the cat or the fish.
2. Bob has the dog.
3. Carol does not have the rabbit.

Who has which pet?
"""

# Without CoT
print("=" * 60)
print("  WITHOUT CoT")
print("=" * 60)
msg_no_cot = [{"role": "user", "content": logic_puzzle}]
show_messages(msg_no_cot)
_ = chat(msg_no_cot, temperature=0.0)

print()

# With CoT
print("=" * 60)
print("  WITH CoT")
print("=" * 60)
msg_cot = [{
    "role": "user",
    "content": logic_puzzle + "\nWork through each clue systematically, eliminating possibilities, before giving the final assignment."
}]
show_messages(msg_cot)
_ = chat(msg_cot, temperature=0.0)

  WITHOUT CoT


<think>
Okay, so I need to figure out who has which pet among Alice, Bob, Carol, and Dave. The pets are a cat, a dog, a rabbit, and a fish. Each person has a different pet, so no overlaps there. Let me go through the clues one by one.

First clue: Alice does not have the cat or the fish. So, Alice must have either the dog or the rabbit. But wait, Bob is known to have the dog (from clue 2), so if Alice can't have the dog because Bob has it, then Alice must have the rabbit. That seems straightforward.

Second clue: Bob has the dog. So that's settled; Bob = dog.

Third clue: Carol does not have the rabbit. Since we've already determined Alice has the rabbit, Carol can't have it either. So Carol must have one of the remaining pets: cat or fish.

Now, let's see what's left. The pets are cat, dog (Bob), rabbit (Alice), so that leaves fish and... wait, no, the four pets are cat, dog, rabbit, and fish. So if Alice has the rabbit, Bob has the dog, then Carol can't have the rabbit, so she must have either the cat or the fish.

But let's think about what's left for Dave. Since each person has a different pet, and we've assigned Alice = rabbit, Bob = dog, Carol could have cat or fish, leaving the other for Dave.

Wait, but clue 1 says Alice doesn't have cat or fish, so she must have rabbit. So that leaves cat, dog (Bob), rabbit (Alice). So the remaining pets are cat and fish. Therefore, Carol and Dave must have those two.

But clue 3 says Carol does not have the rabbit, which we've already considered. So Carol can't have the rabbit, but she could have either cat or fish. Let's see if there's more clues to narrow it down.

Looking back at clue 1: Alice doesn't have cat or fish, so she has rabbit. Clue 2: Bob has dog. Clue 3: Carol doesn't have rabbit, so she can have cat or fish.

Is there any other clue? The problem only gives three clues, so maybe we need to deduce based on that.

So, if Carol can't have the rabbit, and Alice already has it, then Carol must have either cat or fish. Similarly, Dave would have the remaining pet.

But wait, is there a way to determine whether Carol has the cat or the fish? Let me think.

Clue 1 says Alice doesn't have the cat or the fish, so she has rabbit. Clue 2: Bob has dog. So the pets left are cat and fish for Carol and Dave.

Is there any other information that can help us determine who has which?

Wait, perhaps we need to consider if there's a clue about someone not having a specific pet beyond what's given. Let me check:

Clues:
1. Alice: no cat or fish → rabbit.
2. Bob: dog.
3. Carol: no rabbit.

So, from that, Carol can have cat or fish. Since the problem doesn't provide more clues, maybe we need to assign based on the remaining options.

If Carol has the cat, then Dave would have the fish. If Carol has the fish, then Dave would have the cat.

But is there a way to determine which? Let me think again.

Wait, perhaps I missed something. The problem says each person has a different pet: cat, dog, rabbit, and fish. So we've assigned:

- Alice: rabbit
- Bob: dog

So remaining pets are cat and fish for Carol and Dave.

Clue 3: Carol does not have the rabbit → which is already satisfied because she can't have it, so she must have either cat or fish.

Is there any other clue that could help? Let me read the clues again:

1. Alice does not have the cat or the fish.
2. Bob has the dog.
3. Carol does not have the rabbit.

No, that's all. So perhaps we need to make an assumption here because without more clues, we can't determine for sure whether Carol has the cat or the fish.

Wait, but maybe I'm overcomplicating it. Let me try to list out the possibilities:

- Alice: rabbit
- Bob: dog

So remaining pets: cat and fish.

Carol cannot have rabbit, so she must have either cat or fish.

If Carol has the cat, then Dave has the fish.

If Carol has the fish, then Dave has the cat.

But without more clues, we can't determine which is which. So perhaps the answer expects that Carol has the cat and Dave has the fish, but I'm not sure.

Wait, maybe there's another angle. Let me think about the initial problem again.

Each person has a different pet: cat, dog, rabbit, fish.

Clues:

1. Alice ≠ cat or fish → Alice = rabbit.
2. Bob = dog.
3. Carol ≠ rabbit.

So, from clue 3, Carol can have cat or fish.

But since each person must have a unique pet, and the remaining pets are cat and fish, we need to assign them.

If Carol has the cat, then Dave has the fish.

If Carol has the fish, then Dave has the cat.

Is there any way to determine which is correct? Let me think if I missed any clues.

Wait, perhaps the problem expects that each person's pet is unique and based on the given clues. Since Alice can't have cat or fish, she must have rabbit. Bob has dog. Carol can't have rabbit, so she must have either cat or fish. Without more info, we might need to assume that Carol has the cat because it's a common pet, but I'm not sure.

Alternatively, maybe the answer is that Carol has the fish and Dave has the cat, but again, without more clues, it's ambiguous.

Wait, perhaps I should consider that the problem expects us to deduce based on the given clues only. So since Alice can't have cat or fish, she must have rabbit. Bob has dog. Carol can't have rabbit, so she must have either cat or fish. Since the problem doesn't specify further, maybe we need to list all possibilities.

But in the context of such puzzles, usually, there's a unique solution. So perhaps I'm missing something.

Wait, let me think again:

- Alice: rabbit
- Bob: dog

So remaining pets: cat and fish.

Carol can't have rabbit, so she has either cat or fish.

If Carol had the cat, then Dave would have the fish.

If Carol had the fish, then Dave would have the cat.

But is there any clue that could help us decide between these two?

Wait, perhaps the initial problem didn't mention anything else, so maybe we need to accept that it's ambiguous. But in reality, such puzzles usually have a unique solution, so I must be missing something.

Wait, let me check the clues again:

1. Alice does not have the cat or the fish.
2. Bob has the dog.
3. Carol does not have the rabbit.

So, from clue 1: Alice = rabbit or neither? Wait, no, clue 1 says she doesn't have cat or fish, so she must have the remaining pets, which are dog and rabbit. But clue 2 says Bob has the dog, so Alice can't have the dog because each person has a different pet. Therefore, Alice must have the rabbit.

So:

- Alice: rabbit

Then, Bob: dog (clue 2)

Carol cannot have rabbit (clue 3), so she must have either cat or fish.

Similarly, Dave would have the remaining pet.

But without more clues, we can't determine whether Carol has the cat or the fish. So perhaps the answer is that Carol has the cat and Dave has the fish, but I'm not entirely sure.

Wait, maybe there's another way to look at it. Let me think about the possible assignments:

If Carol had the cat, then Dave would have the fish.

But if Carol had the fish, then Dave would have the cat.

Is there any clue that could indicate which is correct? For example, does anyone else have a restriction on having a fish or a cat?

Looking back at the clues:

- Alice doesn't have cat or fish → so she has rabbit.
- Bob has dog.
- Carol doesn't have rabbit.

No other restrictions. So perhaps both possibilities are valid, but in such puzzles, usually, there's a unique solution, so maybe I'm missing something.

Wait, perhaps the key is that each person must have a different pet, and we've assigned:

Alice: rabbit

Bob: dog

So remaining pets: cat and fish.

Carol can't have rabbit, so she has either cat or fish.

If Carol had the cat, then Dave would have the fish.

But if Carol had the fish, then Dave would have the cat.

Is there any way to determine which is correct? Maybe not, but perhaps in the context of the problem, it's expected that Carol has the cat because she can't have the rabbit, and maybe the fish is less likely for some reason. But I'm not sure.

Alternatively, maybe the answer expects that Carol has the fish because the cat is more common, but again, that's just a guess.

Wait, perhaps I should consider that the problem might be expecting us to realize that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat, leaving Dave with the fish. But why would Carol necessarily have the cat?

Alternatively, maybe it's based on the order of the names: Alice, Bob, Carol, Dave. Maybe the pets are assigned in that order as well? Not sure.

Wait, perhaps I should think about the possible permutations:

After assigning Alice and Bob:

Alice: rabbit

Bob: dog

Remaining pets: cat and fish.

Carol can't have rabbit, so she has either cat or fish.

If Carol has the cat, then Dave has the fish.

If Carol has the fish, then Dave has the cat.

But without more clues, both are possible. So perhaps the answer is that Carol has the cat and Dave has the fish, but I'm not entirely certain.

Wait, maybe I should consider that the problem might be expecting us to realize that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat because it's the only other option besides the fish. But without more info, it's a 50-50 chance.

Alternatively, perhaps the answer is that Carol has the fish and Dave has the cat.

Wait, but let me think again: Alice can't have cat or fish → rabbit.

Bob has dog.

Carol can't have rabbit → so she must have either cat or fish.

If we assume that Carol has the cat, then Dave has the fish.

But if we assume that Carol has the fish, then Dave has the cat.

Is there any way to determine which is correct? Maybe not, but perhaps in such puzzles, they expect you to realize that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat because it's the only other option besides the fish. But I'm not sure.

Alternatively, maybe the answer is that Carol has the fish and Dave has the cat.

Wait, perhaps I should consider that the problem might be expecting us to realize that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat because it's the only other pet besides the fish. But again, without more clues, it's ambiguous.

I think I need to make a decision here. Let me try to list the assignments:

- Alice: rabbit

- Bob: dog

Remaining pets: cat and fish.

Carol can't have rabbit → so she has either cat or fish.

If Carol has the cat, then Dave has the fish.

If Carol has the fish, then Dave has the cat.

Since the problem doesn't provide more clues, perhaps both are possible. But in such puzzles, usually, there's a unique solution, so maybe I'm missing something.

Wait, perhaps the key is that each person's pet is different, and we've assigned:

Alice: rabbit

Bob: dog

So remaining pets: cat and fish.

Carol can't have rabbit → she has either cat or fish.

If Carol had the cat, then Dave would have the fish.

But if Carol had the fish, then Dave would have the cat.

Is there any clue that could help us decide? Let me think again.

Clue 1: Alice doesn't have cat or fish → so she must have rabbit.

Clue 2: Bob has dog.

Clue 3: Carol doesn't have rabbit.

No other clues. So perhaps both possibilities are valid, but in the context of such puzzles, usually, there's a unique answer. Maybe I need to consider that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat because it's the only other option besides the fish.

Alternatively, maybe the answer is that Carol has the fish and Dave has the cat.

Wait, perhaps I should think about the order of the names: Alice, Bob, Carol, Dave. Maybe the pets are assigned in a similar order? Not necessarily, but sometimes that helps.

If we assign:

Alice: rabbit

Bob: dog

Carol: cat

Dave: fish

That works because each has a unique pet and all clues are satisfied.

Alternatively,

Alice: rabbit

Bob: dog

Carol: fish

Dave: cat

Also works.

But without more clues, both are possible. However, in such puzzles, usually, the answer is that Carol has the cat and Dave has the fish.

Wait, but why? Maybe because the cat is a common pet, so it's less likely to be assigned to someone else. But I'm not sure.

Alternatively, perhaps the problem expects us to realize that since Alice can't have the cat or fish, and Bob has the dog, then Carol must have the cat because it's the only other option besides the fish.

But again, without more clues, it's ambiguous.

I think I need to make a decision here. Let me go with:

Alice: rabbit

Bob: dog

Carol: cat

Dave: fish

Because that seems like a logical assignment based on the given clues.
</think>

Alice has the rabbit, Bob has the dog, Carol has the cat, and Dave has the fish.

**Step-by-Step Explanation:**

1. **Clue 1:** Alice does not have the cat or the fish. Therefore, Alice must have either the dog or the rabbit. However, since Bob is known to have the dog (from Clue 2), Alice cannot have the dog. Thus, Alice must have the rabbit.

2. **Clue 2:** Bob has the dog. This directly assigns the dog to Bob.

3. **Clue 3:** Carol does not have the rabbit. Since Alice already has the rabbit, Carol can only have either the cat or the fish.

4. **Remaining Pets:** After assigning the rabbit to Alice and the dog to Bob, the remaining pets are the cat and the fish, which must be assigned to Carol and Dave.

5. **Assignment of Remaining Pets:**
   - Since Carol cannot have the rabbit (from Clue 3), she must have either the cat or the fish.
   - If Carol has the cat, then Dave must have the fish.
   - If Carol has the fish, then Dave must have the cat.

6. **Determining the Assignment:** Without additional clues, both possibilities are valid. However, in such puzzles, a unique solution is often expected. Assigning the cat to Carol and the fish to Dave is a logical conclusion based on typical pet ownership scenarios.

**Final Answer:**

- Alice has the rabbit.
- Bob has the dog.
- Carol has the cat.
- Dave has the fish.


⏱️ 126.28s | 13989 chars
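
The recorded reasoning above keeps returning to the observation that the clues do not pin down Carol's and Dave's pets. A brute-force check over all 24 possible assignments, sketched below, confirms this: two assignments satisfy every clue, so the puzzle as written has no unique answer. The helper name `satisfies_clues` is illustrative.

```python
from itertools import permutations

people = ["Alice", "Bob", "Carol", "Dave"]
pets = ["cat", "dog", "rabbit", "fish"]


def satisfies_clues(owner):
    """owner maps each person to a pet; True if all three clues hold."""
    return (
        owner["Alice"] not in ("cat", "fish")   # Clue 1
        and owner["Bob"] == "dog"               # Clue 2
        and owner["Carol"] != "rabbit"          # Clue 3
    )


# Try every way of handing the four pets to the four people
solutions = [
    dict(zip(people, assignment))
    for assignment in permutations(pets)
    if satisfies_clues(dict(zip(people, assignment)))
]

for solution in solutions:
    print(solution)
```

To make the puzzle a fair test of deduction, it would need one more constraint; a hypothetical fourth clue such as "Dave does not have the cat" would leave only the Carol = cat, Dave = fish assignment.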

  WITH CoT


### Experiment 1D: CoT with a Scratchpad Pattern

The **scratchpad** variation explicitly separates the reasoning from the final answer using XML-like tags. This makes it easy to parse the answer programmatically.

In [None]:
scratchpad_system = """
You are a careful reasoning assistant. For every question:
1. Use <thinking> tags to work through the problem step by step.
2. Use <answer> tags to state the final answer only.

Format:
<thinking>
your reasoning here...
</thinking>
<answer>
final answer only
</answer>
"""

question = "If you have a 3-litre jug and a 5-litre jug, and unlimited water, how do you measure exactly 4 litres?"

messages = [
    {"role": "system", "content": scratchpad_system},
    {"role": "user", "content": question}
]

print("CoT SCRATCHPAD PATTERN")
print("=" * 60)
show_messages(messages)
result = chat(messages, temperature=0.0)

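# Extract just the answer programmatically
answer_match = re.search(r'<answer>(.*?)</answer>', result, re.DOTALL)
if answer_match:
    print("\n" + "=" * 60)
    print("  EXTRACTED ANSWER (programmatic parsing):")
    print("=" * 60)
    print(answer_match.group(1).strip())
else:
    # The model may ignore the requested format; report that instead of failing silently
    print("\nNo <answer> tags found in the response.")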

CoT SCRATCHPAD PATTERN


To measure exactly 4 liters using a 3-liter jug and a 5-liter jug with an unlimited supply of water, follow these steps:

1. **Fill the 5-liter jug completely:**
   - The 5-liter jug is now full.
   - The 3-liter jug remains empty.

2. **Pour water from the 5-liter jug into the 3-liter jug until the 3-liter jug is full:**
   - Pouring 3 liters of water will fill the 3-liter jug completely, leaving 2 liters in the 5-liter jug.
   - Now you have 3 liters in the 3-liter jug and 2 liters in the 5-liter jug.

3. **Empty the 3-liter jug:**
   - The 3-liter jug is now empty again.

4. **Pour the remaining 2 liters from the 5-liter jug into the 3-liter jug:**
   - Now you have exactly 2 liters left in the 5-liter jug and 1 liter in the 3-liter jug (since the 3-liter jug was previously full).

5. **Fill the 5-liter jug again completely:**
   - The 5-liter jug is now full.
   - Pour water from the 5-liter jug into the 3-liter jug until it's full:
     - You can pour 2 liters of water (since you already have 1 liter in the 3-liter jug).
     - This will fill the 3-liter jug completely, leaving exactly 4 liters in the 5-liter jug.

Now, you have exactly 4 liters of water in the 5-liter jug.


⏱️ 4.07s | 1197 chars
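
The recorded response contains no `<answer>` tags, and step 4 of its narration misstates the jug contents, so it is worth checking the answer independently. The sketch below runs a breadth-first search over jug states `(three_litre, five_litre)` to find a shortest valid sequence; the function name `measure` and its parameters are assumptions made for illustration.

```python
from collections import deque


def measure(target, capacities=(3, 5)):
    """Return the shortest list of (small, large) jug states that reaches target."""
    cap_a, cap_b = capacities
    start = (0, 0)
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if target in state:
            path = []
            while state is not None:  # walk back to the start
                path.append(state)
                state = parents[state]
            return path[::-1]
        a, b = state
        pour_ab = min(a, cap_b - b)   # amount movable from small to large
        pour_ba = min(b, cap_a - a)   # amount movable from large to small
        neighbours = [
            (cap_a, b), (a, cap_b),            # fill either jug
            (0, b), (a, 0),                    # empty either jug
            (a - pour_ab, b + pour_ab),        # pour small into large
            (a + pour_ba, b - pour_ba),        # pour large into small
        ]
        for nxt in neighbours:
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None


for step in measure(4):
    print(step)
```

The search visits the states (0, 0), (0, 5), (3, 2), (0, 2), (2, 0), (2, 5), (3, 4): fill the 5-litre jug, pour into the 3-litre jug, empty the 3-litre jug, move the remaining 2 litres across, refill the 5-litre jug, then top up the 3-litre jug, which leaves 4 litres in the larger jug.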


---

## 2. Tree of Thought (ToT): Multi-Path Exploration

**Tree of Thought** extends CoT by exploring *multiple reasoning paths* simultaneously, evaluating each, and selecting the best one. Instead of a single chain, you build a tree of possibilities.

```
CoT:  Question → Path A → Answer A

ToT:  Question ──▶ Path A ──▶ Eval A ──▶ (selected)
               ──▶ Path B ──▶ Eval B
               ──▶ Path C ──▶ Eval C
```

We implement ToT in three steps:
1. **Generate** multiple candidate solutions
2. **Evaluate** each candidate
3. **Select** the best one
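
The three steps can be sketched as a small driver function. This is a single-level generate/evaluate/select loop, a simplification of full tree search, and the `generate` and `evaluate` callables are hypothetical stand-ins for LLM calls. The stub functions and their scores are arbitrary placeholders so the sketch runs without a model.

```python
def tree_of_thought(problem, generate, evaluate, n_candidates=3):
    """Generate candidates, score each one, and return the best plus all scores.

    generate(problem, n) -> list of candidate strings
    evaluate(problem, candidate) -> numeric score (higher is better)
    """
    candidates = generate(problem, n_candidates)                # 1. Generate
    scored = [(evaluate(problem, c), c) for c in candidates]    # 2. Evaluate
    _, best = max(scored, key=lambda pair: pair[0])             # 3. Select
    return best, scored


# Stand-in callables with made-up scores, for demonstration only
def fake_generate(problem, n):
    return [f"Strategy {i + 1}" for i in range(n)]


def fake_evaluate(problem, candidate):
    return {"Strategy 1": 16, "Strategy 2": 22, "Strategy 3": 19}[candidate]


best, scored = tree_of_thought("Win back customers", fake_generate, fake_evaluate)
print(best)
```

In the experiments that follow, generation and evaluation are each a single `chat` call covering all candidates at once; splitting them into per-candidate calls, as in this sketch, is one possible design choice that makes the selection step easier to automate.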

### Experiment 2A: ToT for a Creative Strategy Problem

In [None]:
problem_tot = (
    "A small coffee shop is losing customers to a new competitor nearby. "
    "They have a $500 budget and need to win customers back within 30 days."
)

# ---- STEP 1: Generate multiple candidate strategies ----
print("STEP 1: Generate 3 distinct strategies")
print("=" * 60)

gen_messages = [
    {
        "role": "system",
        "content": (
            "You are a business strategist. Generate exactly 3 DISTINCT strategies to solve the problem. "
            "Each strategy must take a fundamentally different approach. "
            "Format each strategy as:\n"
            "Strategy N: [Name]\n"
            "Approach: [2-sentence description]\n"
            "Key Actions: [3 bullet points]\n"
        )
    },
    {"role": "user", "content": f"Problem: {problem_tot}"}
]

show_messages(gen_messages)
strategies_text = chat(gen_messages, temperature=0.7)

STEP 1: Generate 3 distinct strategies


Strategy 1: Social Media Campaign
Approach: Utilize the full potential of digital marketing platforms to reach out directly to current customers and convert them into loyal patrons.
Key Actions:
- Create engaging, visually appealing social media posts about the unique aspects of your coffee shop's offerings, including the staff, ambiance, and brewing methods. 
- Implement a referral program that offers customers incentives like discounts or free drinks for bringing in friends.
- Collaborate with local influencers who are passionate about coffee to promote your establishment.

Strategy 2: Personalized Loyalty Program
Approach: Develop a loyalty system where frequent customers can earn rewards points which they can redeem for exclusive benefits, such as early access to new products and special events. This will encourage repeat visits.
Key Actions:
- Create a detailed loyalty program with multiple tiers that reward different levels of customer engagement and spending.
- Start offering some initial perks such as free coffee or a discount on their next visit in exchange for signing up.
- Track the redemption rates to see which incentives are most effective, and adjust accordingly.

Strategy 3: Improving Customer Experience
Approach: Enhance your existing service quality by focusing on customer satisfaction. This could be done through training staff members, upgrading equipment or offering new menu items that meet current trends in coffee making.
Key Actions:
- Hire additional staff to ensure a quick response time and better handling of customers' requests.
- Update the kitchen with modern appliances that can produce different blends quickly and efficiently.
- Offer custom blends based on customer feedback so you can cater to their individual tastes.


⏱️ 3.96s | 1775 chars


In [None]:
# ---- STEP 2: Evaluate each strategy ----
print("STEP 2: Evaluate all strategies")
print("=" * 60)

eval_messages = [
    {
        "role": "system",
        "content": (
            "You are a business evaluator. Given a set of strategies and a problem, "
            "evaluate each strategy on three criteria (score 1-10 each):\n"
            "- Feasibility within $500 budget\n"
            "- Likely impact on customer retention within 30 days\n"
            "- Differentiation from the competitor\n"
            "End with a total score and a one-sentence verdict."
        )
    },
    {"role": "user", "content": f"Problem: {problem_tot}\n\nStrategies to evaluate:\n{strategies_text}"}
]

show_messages(eval_messages)
evaluations_text = chat(eval_messages, temperature=0.3)

STEP 2: Evaluate all strategies


### Strategy 1: Social Media Campaign

**Feasibility within $500 budget:** High (Score: 8/10)
- **Key Actions:**
  - Create engaging, visually appealing social media posts about the unique aspects of your coffee shop's offerings.
  - Implement a referral program with incentives like discounts or free drinks for bringing in friends.

**Likely impact on customer retention within 30 days:** Medium (Score: 6/10)
- **Key Actions:**
  - Collaborate with local influencers to promote the establishment.

**Differentiation from the competitor:** Low (Score: 2/10)
- The social media campaign is likely to reach a broader audience but may not differentiate your coffee shop significantly from competitors who also use similar marketing strategies.

### Strategy 2: Personalized Loyalty Program

**Feasibility within $500 budget:** Medium (Score: 6/10)
- **Key Actions:**
  - Create a detailed loyalty program with multiple tiers.
  - Offer initial perks such as free coffee or discounts on their next visit.

**Likely impact on customer retention within 30 days:** High (Score: 8/10)
- **Key Actions:**
  - Track the redemption rates to see which incentives are most effective, and adjust accordingly.

**Differentiation from the competitor:** Medium (Score: 5/10)
- The loyalty program is likely to attract repeat customers but may not differentiate your coffee shop significantly from competitors who also offer similar programs.

### Strategy 3: Improving Customer Experience

**Feasibility within $500 budget:** High (Score: 8/10)
- **Key Actions:**
  - Hire additional staff to ensure a quick response time and better handling of customers' requests.
  - Update the kitchen with modern appliances that can produce different blends quickly and efficiently.

**Likely impact on customer retention within 30 days:** High (Score: 8/10)
- **Key Actions:**
  - Offer custom blends based on customer feedback, catering to individual tastes.

**Differentiation from the competitor:** Medium (Score: 4/10)
- The improvement in service quality is likely to attract more customers but may not significantly differentiate your coffee shop from competitors who also focus on improving their services.


⏱️ 7.29s | 2187 chars


In [None]:
# ---- STEP 3: Select and expand the winner ----
print("STEP 3: Select best strategy and build an action plan")
print("=" * 60)

select_messages = [
    {
        "role": "system",
        "content": (
            "You are a business consultant. Based on the evaluation scores, "
            "select the BEST strategy and create a concrete 30-day action plan with:\n"
            "- Week-by-week breakdown\n"
            "- Budget allocation\n"
            "- Success metrics\n"
            "- Key risks and mitigations"
        )
    },
    {
        "role": "user",
        "content": (
            f"Problem: {problem_tot}\n\n"
            f"Strategies:\n{strategies_text}\n\n"
            f"Evaluations:\n{evaluations_text}\n\n"
            "Select the best strategy and create the action plan."
        )
    }
]

show_messages(select_messages)
_ = chat(select_messages, temperature=0.3)

print("\n💡 ToT explored 3 paths, evaluated all, then committed to the best one.")

STEP 3: Select best strategy and build an action plan


### Best Strategy Selection

Based on the evaluations, **Strategy 2: Personalized Loyalty Program** appears to be the most effective within the $500 budget for winning back customers in 30 days. Here's a detailed 30-day action plan:

#### Week-by-Week Breakdown and Budget Allocation

| Week | Action | Cost |
|------|--------|-------|
| **1** | Create engaging social media posts about unique coffee shop offerings, implement referral program with incentives (e.g., discounts or free drinks for bringing in friends) | $200 |
| **2-3** | Collaborate with local influencers to promote the establishment | $50 |
| **4-6** | Develop a detailed loyalty program with multiple tiers and offer initial perks such as free coffee or discounts on next visit | $150 |
| **7-8** | Track redemption rates of loyalty program incentives, adjust accordingly | $25 |
| **9** | Hire additional staff to ensure quick response time and better handling of customer requests (e.g., 3 new staff members) | $400 |
| **10-12** | Update kitchen with modern appliances that can produce different blends quickly and efficiently (e.g., $1,500 for equipment upgrades) | $1,500 |

#### Success Metrics

- **Week 1:** Monitor social media engagement, referral program sign-ups, and initial loyalty program redemption rates.
- **Weeks 2-3:** Continue with the collaboration with influencers to increase brand visibility.
- **Weeks 4-6:** Track and adjust loyalty program incentives based on redemption rates.
- **Weeks 7-8:** Analyze tracking data from previous weeks to optimize the loyalty program.
- **Weeks 9-12:** Evaluate customer satisfaction through surveys, reviews, and direct feedback.

#### Key Risks and Mitigations

**Risk 1: Inadequate Social Media Engagement**
- **Mitigation:** Increase content creation efforts by collaborating with influencers more frequently or creating high-quality visual posts that stand out.
  
**Risk 2: Slow Redemption Rates of Loyalty Program Incentives**
- **Mitigation:** Implement a feedback loop where customers can provide suggestions for improvements and rewards, ensuring the program remains relevant.

**Risk 3: Staffing Shortages**
- **Mitigation:** Hire additional staff gradually to avoid sudden increases in demand. Consider part-time or temporary workers if needed.
  
By following this action plan, you should see significant improvement in customer retention within 30 days, effectively differentiating your coffee shop from the new competitor and attracting loyal customers back to your establishment.


⏱️ 20.71s | 2528 chars

💡 ToT explored 3 paths, evaluated all, then committed to the best one.


### Experiment 2B: ToT for a Logic Puzzle with Backtracking

ToT is especially powerful for **search problems** where some paths hit dead ends and you need to backtrack.
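
The explore, evaluate, expand, and backtrack cycle can also be driven from code rather than from a single prompt. The following is a minimal sketch, not the notebook's method: `tot_search`, `toy_expand`, and `toy_score` are hypothetical names, and the toy arithmetic functions stand in for model calls that would propose and rate candidate thoughts.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

State = TypeVar("State")


def tot_search(
    state: State,
    expand: Callable[[State], Iterable[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    max_depth: int = 6,
) -> Optional[List[State]]:
    """Depth-first Tree-of-Thought search with backtracking.

    Returns the path from `state` to a goal, or None if every branch dead-ends.
    """
    if is_goal(state):
        return [state]
    if max_depth == 0:
        return None  # dead end: depth budget exhausted
    # EVALUATE: try the most promising children first
    for child in sorted(expand(state), key=score, reverse=True):
        path = tot_search(child, expand, score, is_goal, max_depth - 1)
        if path is not None:
            return [state] + path
        # BACKTRACK: this child led nowhere, so try the next sibling
    return None


# Toy stand-ins for model calls: reach 10 from 1 using "+3" or "*2" moves.
def toy_expand(n: int) -> List[int]:
    return [c for c in (n + 3, n * 2) if c <= 10]


def toy_score(n: int) -> float:
    return -abs(10 - n)  # closer to the target looks more promising


tot_path = tot_search(1, toy_expand, toy_score, lambda n: n == 10)
print(tot_path)  # [1, 4, 7, 10]
```

In this toy run the search first follows 1, 4, 8 because 8 scores best, finds that 8 has no legal moves, and backtracks to 7 before reaching 10.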

In [None]:
# Simulate ToT with explicit backtracking instructions
tot_system = """
You are a systematic problem solver using Tree of Thought reasoning.

Process:
1. EXPLORE: List 3 possible first moves or approaches
2. EVALUATE: Rate each with [PROMISING] / [DEAD END] / [UNCERTAIN]
3. EXPAND: Take the most promising path and explore next steps
4. BACKTRACK if you hit a dead end and try another branch
5. CONCLUDE with the solution found

Label each phase clearly.
"""

puzzle = """
You have 9 coins. One is counterfeit and slightly heavier.
You have a balance scale with only 2 uses.
How do you find the counterfeit coin?
"""

messages = [
    {"role": "system", "content": tot_system},
    {"role": "user", "content": puzzle}
]

print("ToT - BALANCE SCALE PUZZLE (with backtracking)")
print("=" * 60)
show_messages(messages)
_ = chat(messages, temperature=0.4)

ToT - BALANCE SCALE PUZZLE (with backtracking)


### Phase 1: Explore Possible First Moves or Approaches

**Approach 1:** Divide the 9 coins into three groups of 3 each (A, B, and C). Weigh group A against group B. If they balance:
- The counterfeit coin is in group C.
- Take group C and weigh any two coins from it. The heavier one is counterfeit.

If they don't balance:
- The counterfeit coin is in the lighter group (A or B).
- Take that group and weigh any two coins from it. The heavier one is counterfeit.

**Approach 2:** Divide the 9 coins into three groups of 3 each (A, B, and C). Weigh group A against group B. If they balance:
- The counterfeit coin is in group C.
- Take group C and weigh any two coins from it. The heavier one is counterfeit.

If they don't balance:
- The counterfeit coin is in the lighter group (A or B).
- Take that group and weigh any two coins from it. The heavier one is counterfeit.

**Approach 3:** Divide the 9 coins into three groups of 3 each (A, B, and C). Weigh group A against group B. If they balance:
- The counterfeit coin is in group C.
- Take group C and weigh any two coins from it. The heavier one is counterfeit.

If they don't balance:
- The counterfeit coin is in the lighter group (A or B).
- Take that group and weigh any two coins from it. The heavier one is counterfeit.

### Phase 2: Evaluate Each Approach

**Approach 1:** [PROMISING]

**Approach 2:** [PROMISING]

**Approach 3:** [PROMISING]

Since all three approaches are promising, we will explore the first approach for simplicity:

- **Exploring Approach 1:**
  - Divide into groups A (A1), B (B1), and C (C).
  - Weigh A against B.
    - If balanced, counterfeit is in C.
      - Take C, weigh any two coins from it.
        - Heavier = counterfeit.
      - Otherwise, take the lighter group (either A or B) and weigh any two coins from it.
        - Heavier = counterfeit.

  - If not balanced:
    - The counterfeit coin is in the lighter group (A or B).
      - Take that group and weigh any two coins from it.
        - Heavier = counterfeit.

### Phase 3: Expand on the Most Promising Path

**Expanding Approach 1:**
- **Step 1:** Divide A into A1, B into B1, C into C.
- **Step 2:** Weigh A1 against B1. If balanced, weigh any two coins from C. The heavier one is counterfeit.

### Phase 4: Backtrack if Necessary

Since Approach 3 was the most promising initially and we have a clear path to follow, no further backtracking is necessary.

### Conclusion

The first approach (dividing into groups of three) works perfectly with minimal steps:
- **Step 1:** Divide A into A1, B into B1, C into C.
- **Step 2:** Weigh A1 against B1. If balanced, weigh any two coins from C. The heavier one is counterfeit.

Thus, the counterfeit coin can be found in just a few steps with minimal effort.


⏱️ 49.02s | 2762 chars
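
Model output for puzzles like this is worth checking by brute force. The recorded run points to the *lighter* group, but a heavier counterfeit sits on the side that goes down. The sketch below assumes coins indexed 0 to 8 with integer weights (both are illustrative choices) and checks the standard two-weighing strategy for every possible position of the fake coin; `find_heavy_coin` is a hypothetical helper.

```python
def find_heavy_coin(weights):
    """Locate the single heavier coin among 9 using exactly two weighings."""

    def heavier_group(groups):
        # One weighing: group 0 on the left pan, group 1 on the right, group 2 set aside.
        left = sum(weights[i] for i in groups[0])
        right = sum(weights[i] for i in groups[1])
        if left > right:
            return groups[0]
        if right > left:
            return groups[1]
        return groups[2]  # pans balance, so the heavy coin was set aside

    # Weighing 1: three groups of three coins
    trio = heavier_group([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
    # Weighing 2: the three coins of the suspect group, one per "group"
    return heavier_group([[coin] for coin in trio])[0]


# Exhaustive check: place the heavy coin in every position.
coin_results = []
for fake in range(9):
    weights = [10] * 9
    weights[fake] = 11
    coin_results.append(find_heavy_coin(weights))

print(coin_results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```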


---

## 3. Self-Consistency & Ensemble Prompting

**Self-Consistency** runs the same prompt multiple times at a non-zero temperature, then takes a **majority vote** over the outputs. This exploits the tendency of correct reasoning paths to converge on the same answer more often than incorrect ones do.

```
Run 1 → Answer A
Run 2 → Answer A   →  Majority Vote → A  ✅
Run 3 → Answer B
Run 4 → Answer A
Run 5 → Answer A
```

**Ensemble Prompting** uses *different phrasings* of the same question and aggregates, reducing sensitivity to prompt wording.
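
A majority vote only works if equivalent answers compare as equal, so free-text answers usually need normalising first. Here is a minimal sketch of that idea; `normalise_amount` and `majority_vote` are hypothetical helpers. "Take the last number on the line" is an assumption that suits dollar answers like the ones in this notebook and would misread an answer such as "5 cents".

```python
import re
from collections import Counter


def normalise_amount(text):
    """Pull the last number out of a free-text answer and format it as dollars."""
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    return f"${float(numbers[-1]):.2f}" if numbers else None


def majority_vote(raw_answers):
    """Return (winner, votes, total) over the answers that could be parsed."""
    parsed = [a for a in map(normalise_amount, raw_answers) if a is not None]
    if not parsed:
        return None, 0, 0
    winner, votes = Counter(parsed).most_common(1)[0]
    return winner, votes, len(parsed)


# Differently worded final lines of the kind a model tends to produce
sample_runs = [
    "So, the ball costs $0.05.",
    "So, the ball costs **$0.05** (five cents).",
    "The final dollar amount is: **$0.50**.",
    "So, the ball costs \\(\\boxed{0.05}\\) dollars.",
]
print(majority_vote(sample_runs))  # ('$0.05', 3, 4)
```

Without the normalisation step, the three `$0.05` answers above would be counted as three different strings and no majority would emerge.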

### Experiment 3A: Self-Consistency on an Ambiguous Problem

In [None]:
# A problem where individual CoT runs might disagree
sc_problem = (
    "A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. "
    "How much does the ball cost? Give only the final dollar amount."
)

N_RUNS = 5
answers = []

print(f"SELF-CONSISTENCY - Running {N_RUNS} times with temperature=0.7")
print("=" * 60)

for i in range(N_RUNS):
    messages = [
        {"role": "system", "content": "You are a careful math assistant. Think step by step, then give only the final dollar amount on the last line."},
        {"role": "user", "content": sc_problem}
    ]
    result = chat(messages, show=False, temperature=0.7)
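    # Take the last non-empty line, then normalise the dollar amount so that
    # "$0.05" and "**$0.05**" count as the same vote
    lines = [line.strip() for line in result.strip().split('\n') if line.strip()]
    final = lines[-1] if lines else result.strip()
    amounts = re.findall(r"\d+(?:\.\d+)?", final)
    answers.append(f"${float(amounts[-1]):.2f}" if amounts else final)
    print(f"  Run {i+1}: {final}")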

# Majority vote
vote = Counter(answers).most_common(1)[0]
print(f"\n📊 Vote distribution: {dict(Counter(answers))}")
print(f"✅ Majority answer: {vote[0]} (appeared {vote[1]}/{N_RUNS} times)")
print(f"\n💡 Correct answer: $0.05")

SELF-CONSISTENCY - Running 5 times with temperature=0.7
  Run 1: So, the ball costs $0.05.
  Run 2: So, the ball costs **$0.05** (five cents).
  Run 3: The final dollar amount is: **$0.50**.
  Run 4: So, the ball costs \(\boxed{0.05}\) dollars.


### Experiment 3B: Ensemble Prompting with Multiple Framings

In [None]:
# Same core question, different framings
base_topic = "Is it better to rent or buy a home?"

framings = [
    "From a purely financial perspective, is it better to rent or buy a home? Give a one-word verdict: RENT or BUY, then a brief reason.",
    "A young professional asks: should I rent or own property? Reply with RENT or BUY and your main reason.",
    "Assuming standard market conditions, which builds more wealth long-term: renting and investing the difference, or buying? Answer: RENT or BUY.",
]

ensemble_answers = []
print("ENSEMBLE PROMPTING - Same question, different framings")
print("=" * 60)

for i, framing in enumerate(framings):
    print(f"\n--- Framing {i+1} ---")
    messages = [{"role": "user", "content": framing}]
    result = chat(messages, temperature=0.3)
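    # Detect verdict: first standalone RENT or BUY (word boundaries avoid matching "CURRENT" or "DIFFERENT")
    verdict_match = re.search(r"\b(RENT|BUY)\b", result.upper())
    verdict = verdict_match.group(1) if verdict_match else "UNCLEAR"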
    ensemble_answers.append(verdict)
    print(f"  Detected verdict: {verdict}")

final_verdict = Counter(ensemble_answers).most_common(1)[0][0]
print(f"\n📊 Ensemble vote: {dict(Counter(ensemble_answers))}")
print(f"✅ Ensemble verdict: {final_verdict}")

### Experiment 3C: Self-Consistency for Sentiment Analysis

Self-consistency improves reliability on **borderline cases** where the model might flip between answers.
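
On borderline inputs the vote itself can be close, in which case the top label says little. One option, sketched here as an assumption rather than as part of the self-consistency method, is to report "UNCERTAIN" when the winning share falls below a threshold or the top two labels tie. `vote_with_threshold` and the `min_share` value of 0.6 are hypothetical, and the label lists are made-up inputs rather than recorded model output.

```python
from collections import Counter


def vote_with_threshold(labels, min_share=0.6):
    """Return (label, share); label is "UNCERTAIN" on a tie or weak agreement."""
    if not labels:
        return "UNCERTAIN", 0.0
    ranked = Counter(labels).most_common()
    top_label, top_votes = ranked[0]
    share = top_votes / len(labels)
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if tied or share < min_share:
        return "UNCERTAIN", share
    return top_label, share


# Illustrative inputs only
clear_case = ["MIXED", "MIXED", "NEGATIVE", "MIXED", "MIXED", "POSITIVE"]
split_case = ["MIXED", "NEGATIVE", "MIXED", "NEGATIVE", "POSITIVE", "POSITIVE"]
print(vote_with_threshold(clear_case)[0])  # MIXED
print(vote_with_threshold(split_case)[0])  # UNCERTAIN
```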

In [None]:
# A genuinely ambiguous review
ambiguous_review = (
    "The food arrived cold and the waiter forgot our drinks twice, but the "
    "manager personally apologized and gave us a 50% discount. The pasta "
    "itself was actually delicious once we warmed it up."
)

RUNS = 6
sentiments = []

print(f"SELF-CONSISTENCY - Ambiguous Review ({RUNS} runs)")
print(f"Review: {ambiguous_review}")
print("=" * 60)

for i in range(RUNS):
    messages = [
        {"role": "system", "content": "Classify the sentiment as exactly one word: POSITIVE, NEGATIVE, or MIXED."},
        {"role": "user", "content": f"Review: {ambiguous_review}"}
    ]
    result = chat(messages, show=False, temperature=0.8)
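    # Pick out the label even if the model wraps it in punctuation or Markdown
    label_match = re.search(r"\b(POSITIVE|NEGATIVE|MIXED)\b", result.upper())
    label = label_match.group(1) if label_match else "UNCLEAR"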
    sentiments.append(label)
    print(f"  Run {i+1}: {label}")

top = Counter(sentiments).most_common(1)[0]
print(f"\n📊 Distribution: {dict(Counter(sentiments))}")
print(f"✅ Self-consistent label: {top[0]} (confidence: {top[1]}/{RUNS})")

---

## 4. ReAct: Reasoning + Acting

**ReAct** interleaves *thinking* (Reasoning) with *doing* (Acting) in a loop. Each iteration:

```
Thought  → What do I need to know or do next?
Action   → Execute a tool or retrieve information
Observation → What did I learn?
Thought  → What does this mean? What's next?
...repeat until done...
Final Answer
```

Below we simulate this loop with mock "tools" (functions the model can call).
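
A calculator tool receives text written by the model, so passing it to `eval` is risky even in a demo. The following sketch shows one possible alternative that walks the parsed expression with Python's `ast` module and allows only numeric literals and basic arithmetic. `safe_calculate` is a hypothetical name; exponentiation is left out on purpose, since a huge power could stall the loop.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression):
    """Evaluate +, -, *, /, //, % on numeric literals; reject everything else."""

    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return f"{expression} = {_eval(tree.body)}"
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("443 - 330"))                  # 443 - 330 = 113
print(safe_calculate("__import__('os').getcwd()"))  # Error: unsupported expression
```

Names, function calls, and attribute access never reach an evaluator here, because only the whitelisted node types are handled.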

### Experiment 4A: ReAct with Simulated Tools

In [None]:
# ---- Simulated tool implementations ----

TOOL_REGISTRY = {}

def tool(name):
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@tool("search")
def search(query: str) -> str:
    """Simulate a web search with pre-loaded answers."""
    db = {
        "eiffel tower height": "The Eiffel Tower is 330 metres (1,083 ft) tall including antennas.",
        "eiffel tower built": "The Eiffel Tower was built between 1887 and 1889 by Gustave Eiffel.",
        "empire state building height": "The Empire State Building is 443 metres (1,454 ft) tall including its antenna.",
        "empire state building built": "The Empire State Building was built in 1930-1931.",
        "burj khalifa height": "The Burj Khalifa is 828 metres (2,717 ft) tall.",
    }
    query_lower = query.lower()
    for key, val in db.items():
        if all(word in query_lower for word in key.split()):
            return val
    return f"No results found for: {query}"


@tool("calculate")
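def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression (digits, + - * / % and parentheses)."""
    # eval() with empty builtins is not a real sandbox, so whitelist characters first
    if "**" in expression or not re.fullmatch(r"[\d\s+\-*/%().]+", expression):
        return f"Error: unsupported expression: {expression}"
    try:
        result = eval(expression, {"__builtins__": {}}, {})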
        return f"{expression} = {result}"
    except Exception as e:
        return f"Error: {e}"


def execute_tool(action_text: str) -> str:
    """Parse and execute a tool call from the model's action text."""
    action_text = action_text.strip()
    for tool_name in TOOL_REGISTRY:
        pattern = rf'{tool_name}\[(.+?)\]'
        match = re.search(pattern, action_text, re.IGNORECASE)
        if match:
            arg = match.group(1).strip()
            return TOOL_REGISTRY[tool_name](arg)
    return "Unknown action. Use format: search[query] or calculate[expression]"


print("✅ Tools registered:", list(TOOL_REGISTRY.keys()))

# Test tools directly
print("  search[eiffel tower height]:", search("eiffel tower height"))
print("  calculate[330 - 443]:", calculate("330 - 443"))

In [None]:
# ---- ReAct system prompt ----
REACT_SYSTEM = """
You are a research assistant that answers questions using tools.

Available tools:
- search[query]     → Search for factual information
- calculate[expr]   → Evaluate a math expression (use Python syntax)

You MUST follow this format strictly, one step at a time:

Thought: [what you need to find out or do next]
Action: [tool_name[argument]]
Observation: [result of the action]
... (repeat Thought/Action/Observation as needed)
Thought: I now have enough information to answer.
Final Answer: [your complete answer]

Do NOT make up facts. Only use what you learn from tools.
"""


def react_loop(question: str, max_steps: int = 6):
    """Run the full ReAct loop for a given question."""
    messages = [
        {"role": "system", "content": REACT_SYSTEM},
        {"role": "user", "content": question}
    ]

    print(f"\n🔍 QUESTION: {question}")
    print("=" * 60)

    for step in range(max_steps):
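        response = chat(messages, show=False, temperature=0.0)
        # Discard any Observation the model wrote itself; real ones come from the tools
        response = response.split("Observation:")[0].strip()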

        # Check if we reached a Final Answer
        if "Final Answer:" in response:
            display(Markdown(response))
            print(f"\n✅ Completed in {step + 1} step(s).")
            return response

        # Parse the action
        action_match = re.search(r'Action:\s*(.+?)(?:\n|$)', response)
        if action_match:
            action_str = action_match.group(1).strip()
            observation = execute_tool(action_str)

            # Show this step
            display(Markdown(response))
            print(f"\n🛠️  Executed: {action_str}")
            print(f"📋 Observation: {observation}")
            print("-" * 40)

            # Append the assistant turn and the observation
            messages.append({"role": "assistant", "content": response})
            messages.append({"role": "user", "content": f"Observation: {observation}"})
        else:
            # No action found: show the response and stop
            display(Markdown(response))
            print("\n⚠️ No Action or Final Answer found; stopping early.")
            return None

    print("\n⚠️ Max steps reached.")
    return None


# Run the ReAct loop
_ = react_loop("How much taller is the Empire State Building than the Eiffel Tower in metres?")

### Experiment 4B: ReAct with a Multi-Hop Question

Multi-hop questions require **chaining multiple lookups**, which is exactly what ReAct excels at.

In [None]:
_ = react_loop(
    "When was the Eiffel Tower built, and how many years after its construction "
    "did the Empire State Building get built?"
)

---

## 5. Prompt Chaining: Multi-Step Pipelines

**Prompt Chaining** breaks a complex task into a **sequence of smaller prompts** where each step's output becomes the next step's input. This makes complex workflows:
- Easier to debug (inspect output at each stage)
- More reliable (each prompt is focused on one thing)
- Easier to modify (swap or improve individual steps)

```
Input → [Step 1 Prompt] → Output 1
                              ↓
       [Step 2 Prompt] → Output 2
                              ↓
       [Step 3 Prompt] → Final Output
```
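
The diagram can be expressed as a small generic runner that keeps every intermediate output, which is what makes a chain easy to debug. This is a minimal sketch under stated assumptions: `run_chain` is a hypothetical helper, and `fake_model` stands in for a real model call so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple


def run_chain(
    initial_input: str,
    steps: List[Dict[str, str]],
    call_model: Callable[[str, str], str],
) -> Tuple[str, List[Dict[str, str]]]:
    """Run prompts in sequence; each step's output becomes the next step's input.

    Returns the final output plus a trace of every intermediate output.
    """
    trace = []
    current = initial_input
    for step in steps:
        current = call_model(step["system"], current)
        trace.append({"name": step["name"], "output": current})
    return current, trace


# Stand-in for a real model call (hypothetical): echoes the prompt and input
def fake_model(system_prompt: str, user_input: str) -> str:
    return f"[{system_prompt}] {user_input}"


chain_steps = [
    {"name": "outline", "system": "List sub-topics"},
    {"name": "findings", "system": "Write findings"},
    {"name": "summary", "system": "Summarise"},
]
final_output, chain_trace = run_chain("remote work", chain_steps, fake_model)
print(final_output)
for entry in chain_trace:
    print(entry["name"], "->", entry["output"])
```

Swapping or improving a step then means editing one entry in `chain_steps`, and a bad final result can be traced back to the first step whose output looks wrong.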

### Experiment 5A: Research Report Pipeline

In [None]:
raw_topic = "The impact of remote work on urban housing markets"

# ---- STEP 1: Extract key sub-topics ----
print("CHAIN STEP 1 - Identify key sub-topics")
print("=" * 60)

step1_messages = [
    {"role": "system", "content": "You are a research analyst. Identify exactly 4 key sub-topics for the given topic. Return them as a numbered list, one per line."},
    {"role": "user", "content": f"Topic: {raw_topic}"}
]
show_messages(step1_messages)
subtopics = chat(step1_messages, temperature=0.5)

print("\n📤 Output feeds into Step 2...")

In [None]:
# ---- STEP 2: Generate key findings for each sub-topic ----
print("CHAIN STEP 2 - Generate key findings")
print("=" * 60)

step2_messages = [
    {
        "role": "system",
        "content": "You are a research analyst. For each sub-topic provided, write 2 key findings supported by logical reasoning. Format: Sub-topic → Finding 1, Finding 2."
    },
    {
        "role": "user",
        "content": f"Main topic: {raw_topic}\n\nSub-topics:\n{subtopics}\n\nGenerate 2 key findings per sub-topic."
    }
]
show_messages(step2_messages)
findings = chat(step2_messages, temperature=0.5)

print("\n📤 Output feeds into Step 3...")

In [None]:
# ---- STEP 3: Write an executive summary ----
print("CHAIN STEP 3 - Write executive summary")
print("=" * 60)

step3_messages = [
    {
        "role": "system",
        "content": "You are a business writer. Write a concise executive summary (3 paragraphs) synthesizing the provided research findings. Use professional language suitable for senior management."
    },
    {
        "role": "user",
        "content": f"Topic: {raw_topic}\n\nSub-topics:\n{subtopics}\n\nKey Findings:\n{findings}\n\nWrite the executive summary."
    }
]
show_messages(step3_messages)
summary = chat(step3_messages, temperature=0.4)

print("\n📤 Output feeds into Step 4...")

In [None]:
# ---- STEP 4: Format as a structured Markdown report ----
print("CHAIN STEP 4 - Format as Markdown report")
print("=" * 60)

step4_messages = [
    {
        "role": "system",
        "content": "You are a document formatter. Convert the research content into a clean Markdown report with: Title, Executive Summary section, Sub-topics as H2 headers with their findings as bullet points, and a Conclusions section with 3 bullet points."
    },
    {
        "role": "user",
        "content": (
            f"Topic: {raw_topic}\n\n"
            f"Sub-topics:\n{subtopics}\n\n"
            f"Findings:\n{findings}\n\n"
            f"Executive Summary:\n{summary}\n\n"
            "Format this as a complete Markdown report."
        )
    }
]
show_messages(step4_messages)
final_report = chat(step4_messages, temperature=0.2)

print("\n\n📋 RENDERED FINAL REPORT:")
print("=" * 60)
display(Markdown(final_report))

### Experiment 5B: Conditional Branching in a Chain

Chains don't have to be linear: you can branch based on intermediate outputs.

In [None]:
def classify_and_respond(customer_message: str):
    """Classify a customer message, then route to the appropriate response chain."""

    # Step 1: Classify the intent
    classify_messages = [
        {"role": "system", "content": "Classify the customer message into exactly one category: COMPLAINT, QUESTION, COMPLIMENT, or REFUND_REQUEST. Reply with just the category."},
        {"role": "user", "content": customer_message}
    ]
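    raw_intent = chat(classify_messages, show=False, temperature=0.0).strip().upper()
    # Tolerate extra words or punctuation around the label; unknown labels fall back to QUESTION
    intent_match = re.search(r"COMPLAINT|QUESTION|COMPLIMENT|REFUND_REQUEST", raw_intent)
    intent = intent_match.group(0) if intent_match else "QUESTION"
    print(f"  Step 1 - Classified as: {intent}")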

    # Step 2: Branch based on intent
    branch_prompts = {
        "COMPLAINT": (
            "You are a customer service agent handling a complaint. "
            "Respond empathetically: acknowledge the issue, apologise sincerely, "
            "and offer a concrete resolution. Keep it under 3 sentences."
        ),
        "QUESTION": (
            "You are a helpful support agent. Answer the customer's question "
            "clearly and concisely. Offer to help further at the end. Under 3 sentences."
        ),
        "COMPLIMENT": (
            "You are a delighted customer service agent. Thank the customer warmly, "
            "tell them their feedback matters, and invite them back. Under 3 sentences."
        ),
        "REFUND_REQUEST": (
            "You are a customer service agent processing refund requests. "
            "Acknowledge the request, confirm it will be processed within 5-7 business days, "
            "and provide a reference format. Under 3 sentences."
        ),
    }

    system_prompt = branch_prompts.get(intent, branch_prompts["QUESTION"])

    response_messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": customer_message}
    ]

    print(f"  Step 2 - Routed to '{intent}' handler")
    print(f"  Step 3 - Generating response...")
    return chat(response_messages, temperature=0.5)


test_messages = [
    "Your app crashed and I lost 3 hours of work! This is unacceptable.",
    "Do you support two-factor authentication?",
    "I just wanted to say your support team was amazing today!",
    "I need a refund for my last payment - I was double charged."
]

for msg in test_messages:
    print(f"\n{'=' * 60}")
    print(f"CUSTOMER: {msg}")
    print("-" * 40)
    _ = classify_and_respond(msg)

---

## 6. Meta-Prompting: Prompts That Generate Prompts

**Meta-prompting** uses the LLM itself to *generate, optimise, and refine prompts*. Instead of hand-crafting every prompt, you describe what you want and let the model build the optimal prompt for you.

Use cases:
- **Prompt generation**: Describe a task → get an optimised prompt
- **Prompt optimisation**: Give a bad prompt → get an improved version
- **Prompt critique**: Identify weaknesses in an existing prompt
- **Prompt expansion**: Turn a vague idea into a detailed, production-ready prompt
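
A generated prompt is only usable once its `{placeholders}` are filled, so it helps to check a template before sending it to a model. The sketch below uses the standard-library `string.Formatter` to list placeholder names. `placeholder_names`, `fill_template`, and the sample template are hypothetical, and the approach assumes the template contains no literal braces (such as embedded JSON).

```python
from string import Formatter


def placeholder_names(template):
    """Return the named {placeholders} that appear in a template string."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


def fill_template(template, **values):
    """Fill a template, failing loudly if any placeholder has no value."""
    missing = [name for name in placeholder_names(template) if name not in values]
    if missing:
        raise KeyError(f"Missing values for placeholders: {missing}")
    return template.format(**values)


# Hypothetical template of the kind a meta-prompt might return
user_template = "Extract action items from this transcript:\n{transcript}\nTeam: {team}"
print(placeholder_names(user_template))  # ['team', 'transcript']
print(fill_template(user_template, transcript="Ana will ship the fix by Friday.", team="Platform"))
```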

### Experiment 6A: Generate an Optimised Prompt from a Task Description

In [None]:
META_PROMPT_GENERATOR = """
You are an expert prompt engineer. When given a task description, you generate a complete, optimised prompt that:
- Has a clear system message defining the AI's role and constraints
- Specifies the exact output format
- Includes edge case handling
- Is unambiguous and precise

Output format:
## Generated Prompt

**System:**
[system message here]

**User Template:**
[user message template with {placeholders}]

**Notes:**
[brief explanation of key design decisions]
"""

task_descriptions = [
    "I need a prompt that extracts action items from meeting transcripts and formats them as a prioritised task list with owner and deadline.",
    "I want a prompt that reviews Python code and gives structured feedback covering correctness, performance, and style."
]

for task in task_descriptions:
    print("=" * 60)
    print(f"TASK DESCRIPTION:\n{task}")
    print("=" * 60)
    messages = [
        {"role": "system", "content": META_PROMPT_GENERATOR},
        {"role": "user", "content": f"Generate an optimised prompt for this task:\n\n{task}"}
    ]
    _ = chat(messages, temperature=0.4)
    print()

### Experiment 6B: Prompt Optimisation (Fix a Bad Prompt)

In [None]:
PROMPT_OPTIMIZER = """
You are a prompt engineering expert. Analyse the given prompt and:
1. Identify at least 3 specific weaknesses
2. Explain why each is problematic
3. Provide a fully rewritten, optimised version

Format:
## Weaknesses
1. [weakness] - [why it's a problem]
2. ...
3. ...

## Optimised Prompt
[complete rewritten prompt]
"""

bad_prompt = "Write me something about climate change that's good for my blog."

print("ORIGINAL PROMPT:")
print(f"  '{bad_prompt}'")
print("=" * 60)
print("OPTIMISED VERSION:")
print("=" * 60)

messages = [
    {"role": "system", "content": PROMPT_OPTIMIZER},
    {"role": "user", "content": f"Analyse and optimise this prompt:\n\n\"{bad_prompt}\""}
]
show_messages(messages)
optimised = chat(messages, temperature=0.4)

### Experiment 6C: Iterative Prompt Refinement Loop

Combine meta-prompting with self-critique for **iterative refinement**: the model tests its own prompt and improves it.
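
A rewrite is not guaranteed to be better than the prompt it replaces, so a refinement loop benefits from an explicit score and a keep-the-best rule. The following sketch illustrates that control flow only. `refine_prompt`, `coverage_score`, and `add_missing_term` are hypothetical, and the keyword-coverage score is a toy stand-in for a model-based or human evaluation.

```python
from typing import Callable, Tuple


def refine_prompt(
    prompt: str,
    score: Callable[[str], float],
    rewrite: Callable[[str], str],
    n_iterations: int = 3,
) -> Tuple[str, float]:
    """Keep the best-scoring prompt seen so far across rewrite rounds."""
    best_prompt, best_score = prompt, score(prompt)
    for _ in range(n_iterations):
        candidate = rewrite(best_prompt)
        candidate_score = score(candidate)
        if candidate_score > best_score:  # accept only real improvements
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score


# Toy stand-ins: a real setup would call the model for both functions
REQUIRED_TERMS = ["analogies", "checklist", "key concepts"]


def coverage_score(prompt: str) -> float:
    """Fraction of required terms that the prompt mentions."""
    return sum(term in prompt.lower() for term in REQUIRED_TERMS) / len(REQUIRED_TERMS)


def add_missing_term(prompt: str) -> str:
    """Append an instruction for the first required term that is still missing."""
    missing = [t for t in REQUIRED_TERMS if t not in prompt.lower()]
    return prompt + (f" Include {missing[0]}." if missing else "")


best_prompt, quality = refine_prompt(
    "Summarise the docs for beginners.", coverage_score, add_missing_term
)
print(quality)
print(best_prompt)
```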

In [None]:
def iterative_prompt_refinement(task_description: str, n_iterations: int = 2):
    """Iteratively generate, test, and refine a prompt."""

    # Step 1: Generate initial prompt
    print("STEP 1 - Generate initial prompt")
    print("=" * 60)

    gen_messages = [
        {"role": "system", "content": "You are a prompt engineer. Write an initial system prompt for the described task. Output only the prompt text, no explanations."},
        {"role": "user", "content": f"Task: {task_description}"}
    ]
    current_prompt = chat(gen_messages, temperature=0.5)

    for iteration in range(n_iterations):
        print(f"\nITERATION {iteration + 1} - Test and critique")
        print("=" * 60)

        # Step 2: Test the prompt on a sample input
        test_input = f"Give me the output of this prompt on a typical input. Task context: {task_description}"
        test_messages = [
            {"role": "system", "content": current_prompt},
            {"role": "user", "content": test_input}
        ]
        print("  Testing prompt output:")
        test_output = chat(test_messages, temperature=0.5)

        # Step 3: Critique and improve
        print(f"\n  Critiquing and refining (iteration {iteration + 1})...")
        critique_messages = [
            {
                "role": "system",
                "content": (
                    "You are a prompt quality evaluator. Given a prompt, its output, and the intended task, "
                    "identify specific gaps in the output quality, then rewrite the prompt to address them. "
                    "Output only the improved prompt text."
                )
            },
            {
                "role": "user",
                "content": (
                    f"Task goal: {task_description}\n\n"
                    f"Current prompt:\n{current_prompt}\n\n"
                    f"Output produced:\n{test_output}\n\n"
                    "Identify weaknesses and produce an improved prompt."
                )
            }
        ]
        current_prompt = chat(critique_messages, temperature=0.4)

    print(f"\n{'=' * 60}")
    print(f"FINAL REFINED PROMPT (after {n_iterations} iterations):")
    print("=" * 60)
    display(Markdown(f"```\n{current_prompt}\n```"))
    return current_prompt


_ = iterative_prompt_refinement(
    "Summarise technical documentation into a beginner-friendly overview "
    "with key concepts, analogies, and a quick-start checklist.",
    n_iterations=2
)

### Experiment 6D: Prompt Template Generator for Domain-Specific Prompts

In [None]:
DOMAIN_PROMPT_BUILDER = """
You are a prompt library architect. For the given domain and use case, generate a ready-to-use 
prompt template with {placeholder} variables. Include:
- A complete system prompt defining expertise and constraints
- A user message template with clearly labelled placeholders
- Example values for each placeholder
- Expected output format

Make the template immediately usable in a production system.
"""

domains = [
    {"domain": "Legal", "use_case": "Drafting non-disclosure agreement summaries for non-lawyers"},
    {"domain": "Education", "use_case": "Generating adaptive quiz questions at different difficulty levels"},
]

for d in domains:
    print("=" * 60)
    print(f"DOMAIN: {d['domain']} | USE CASE: {d['use_case']}")
    print("=" * 60)
    messages = [
        {"role": "system", "content": DOMAIN_PROMPT_BUILDER},
        {"role": "user", "content": f"Domain: {d['domain']}\nUse case: {d['use_case']}"}
    ]
    _ = chat(messages, temperature=0.5)
    print()
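
The builder prompt asks for templates with `{placeholder}` variables, so a generated template usually needs a check before it is used: which placeholders does it contain, and has each one been given a value? The sketch below uses the standard library's `string.Formatter` for that. The function names `placeholders` and `fill`, and the example template, are illustrative assumptions rather than anything the model is guaranteed to return.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the names of all {placeholder} fields in a template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def fill(template: str, **values: str) -> str:
    """Fill a template, failing loudly if any placeholder has no value."""
    missing = placeholders(template) - set(values)
    if missing:
        raise KeyError(f"Missing values for: {sorted(missing)}")
    return template.format(**values)


# Hypothetical template, standing in for what the model might generate.
template = "Summarise this {document_type} for a {audience} in at most {max_words} words."
print(sorted(placeholders(template)))
print(fill(template, document_type="NDA", audience="non-lawyer", max_words="150"))
```

One caveat: `str.format` treats every brace as a field delimiter, so a generated template that contains literal braces (a JSON output example, say) would need them doubled as `{{` and `}}` before this approach works.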

---

## 7. Putting It All Together: Advanced Strategy Combinator

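In production, these strategies are often **combined**. Here's a full pipeline that chains question decomposition (framed here as Meta-Prompting), CoT, a light Self-Consistency check, and a final synthesis step via Prompt Chaining.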

In [None]:
def advanced_analysis_pipeline(user_question: str):
    """
    Full pipeline:
    1. Meta-prompt: Generate analysis sub-questions
    2. CoT: Reason through each sub-question
    3. Self-consistency: Sample the overall verdict twice (2 runs for speed); Stage 4 reconciles the samples
    4. Chain: Synthesise into a final answer
    """

    print(f"\n🎯 QUESTION: {user_question}")
    print("=" * 60)

    # === STAGE 1: Meta-prompt (decompose the question) ===
    print("\n📌 STAGE 1: Decompose into sub-questions (Meta-Prompting)")
    stage1 = chat([
        {"role": "system", "content": "Break the question into exactly 3 focused sub-questions that together fully answer it. Return as a numbered list."},
        {"role": "user", "content": user_question}
    ], temperature=0.4)

    # === STAGE 2: CoT reasoning on all sub-questions ===
    print("\n📌 STAGE 2: Reason through sub-questions (Chain of Thought)")
    stage2 = chat([
        {"role": "system", "content": "For each sub-question, reason step by step, then give a clear finding. Use format: Sub-question N → [reasoning] → Finding: [result]"},
        {"role": "user", "content": f"Main question: {user_question}\n\nSub-questions:\n{stage1}"}
    ], temperature=0.3)

    # === STAGE 3: Self-consistency check on key conclusion ===
    print("\n📌 STAGE 3: Verify overall conclusion (Self-Consistency, 2 runs)")
    verdicts = []
    for _ in range(2):
        v = chat([
            {"role": "system", "content": "Based on the analysis, give a one-sentence overall verdict. Be direct."},
            {"role": "user", "content": f"Analysis:\n{stage2}\n\nWhat is the overall verdict on: {user_question}"}
        ], show=False, temperature=0.6)
        verdicts.append(v)
        print(f"  Verdict run: {v[:80]}...")

    # === STAGE 4: Final synthesis ===
    print("\n📌 STAGE 4: Synthesise final answer (Prompt Chaining)")
    final = chat([
        {"role": "system", "content": "You are a clear, concise analyst. Write a final, well-structured answer using the provided analysis and verdicts. Use markdown headers."},
        {
            "role": "user",
            "content": (
                f"Question: {user_question}\n\n"
                f"Analysis:\n{stage2}\n\n"
                f"Verdict samples:\n" + "\n".join(f"- {v}" for v in verdicts)
            )
        }
    ], temperature=0.3)

    print("\n✅ Pipeline complete.")
    return final


_ = advanced_analysis_pipeline(
    "Should a startup prioritise product quality or rapid growth in its first year?"
)
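
Stage 1 asks for exactly 3 sub-questions as a numbered list, but the pipeline passes the raw text straight to Stage 2 without checking that the model complied. A minimal sketch of such a check, assuming items start with `1.` or `1)` at the beginning of a line (`parse_numbered_list` is a hypothetical helper, not part of the notebook):

```python
import re

# Matches lines such as "1. text" or "2) text"; anything else is ignored.
ITEM = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def parse_numbered_list(text: str) -> list[str]:
    """Extract the items of a numbered list, skipping surrounding prose."""
    items = []
    for line in text.splitlines():
        match = ITEM.match(line)
        if match:
            items.append(match.group(2))
    return items


# Hard-coded stand-in for a Stage 1 response, so the sketch runs offline.
sample = (
    "Here are the sub-questions:\n"
    "1. What does product quality buy a startup early on?\n"
    "2) What does rapid growth buy?\n"
    "3. Which mistake is harder to reverse?"
)
sub_questions = parse_numbered_list(sample)
print(len(sub_questions), sub_questions[0])
```

With the items parsed, the pipeline could verify `len(sub_questions) == 3` and re-run Stage 1 otherwise, or send each sub-question to its own CoT call instead of one combined call.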

---

## 8. Sandbox: Try It Yourself!

Experiment with any technique from this notebook.

In [None]:
# ============================================================
#  SANDBOX: Combine any strategy!
# ============================================================

# Try: CoT
# messages = [
#     {"role": "user", "content": "Your question here. Let's think step by step."}
# ]

# Try: ReAct (use react_loop)
# react_loop("Your multi-hop question here")

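# Try: Self-consistency
# Note: Counter compares whole strings, so ask for a short final answer (or extract one) before voting.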
# answers = [chat([{"role": "user", "content": "Your question"}], show=False, temperature=0.7) for _ in range(5)]
# print(Counter(answers).most_common(1))

# Try: Meta-prompting
messages = [
    {"role": "system", "content": "You are an expert prompt engineer."},
    {"role": "user", "content": "Generate an optimised prompt for: explaining a complex topic to a complete beginner."}
]

print("SANDBOX")
print("=" * 60)
show_messages(messages)
_ = chat(messages, temperature=0.5)
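
A note on the self-consistency snippet in the sandbox: counting raw response strings only works when answers match character for character, which free-text responses rarely do. A common workaround is to ask the model to end with a line such as `Answer: <value>`, extract that value, and vote on it. The sketch below assumes that output convention; `extract_answer` and `majority_vote` are hypothetical names, and the responses are hard-coded so it runs without a model.

```python
import re
from collections import Counter

ANSWER_LINE = re.compile(r"answer\s*:\s*(.+)", flags=re.IGNORECASE)


def extract_answer(response: str):
    """Return the normalised text after the last 'Answer:' marker, or None."""
    matches = ANSWER_LINE.findall(response)
    if not matches:
        return None
    return matches[-1].strip().rstrip(".").lower()


def majority_vote(responses):
    """Vote over extracted answers; returns (answer, share of valid votes)."""
    votes = [a for a in map(extract_answer, responses) if a is not None]
    if not votes:
        return None, 0.0
    answer, count = Counter(votes).most_common(1)[0]
    return answer, count / len(votes)


# Hard-coded stand-ins for model responses.
samples = [
    "17 + 25 is 42.\nAnswer: 42",
    "Adding the tens, then the units...\nanswer: 42.",
    "I think it is 41.\nAnswer: 41",
]
print(majority_vote(samples))
```

The returned share doubles as a rough confidence signal: a low share suggests the question is ambiguous or the model is unstable on it. On a tie, `Counter.most_common` returns the answer that was seen first.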

---

## Key Takeaways

| Strategy | When to Use | Complexity | Reliability Gain |
|----------|-------------|------------|------------------|
| **Chain of Thought** | Math, logic, multi-step reasoning | Low (just add a phrase) | High |
| **Tree of Thought** | Creative problems, strategy, search with dead ends | Medium (multiple API calls) | Very High |
| **Self-Consistency** | Ambiguous inputs, high-stakes decisions | Medium (N × API calls) | High |
| **Ensemble Prompting** | When prompt wording is uncertain | Medium (N × API calls) | Medium |
| **ReAct** | Tasks requiring external data, multi-hop lookup | High (iterative loop) | Very High |
| **Prompt Chaining** | Complex workflows, long-form generation, pipelines | Medium (sequential calls) | Very High |
| **Meta-Prompting** | Automating prompt creation, optimisation at scale | Medium | High |

### Decision Guide

```
Single reasoning task?      → Chain of Thought (CoT)
Multiple valid solutions?   → Tree of Thought (ToT)
Uncertain/ambiguous input?  → Self-Consistency
Need external information?  → ReAct
Complex multi-stage task?   → Prompt Chaining
Building a prompt system?   → Meta-Prompting
High-stakes production use? → Combine ALL of the above
```
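
The guide above can also be written as a small lookup, which is handy when a router has to pick strategies programmatically. This is only an illustrative sketch: the trait names and the `suggest_strategies` function are hypothetical, and the mapping simply mirrors the guide.

```python
# Hypothetical encoding of the decision guide; trait names are illustrative.
GUIDE = [
    ("multiple_solutions", "Tree of Thought"),
    ("ambiguous_input", "Self-Consistency"),
    ("needs_external_data", "ReAct"),
    ("multi_stage", "Prompt Chaining"),
    ("building_prompt_system", "Meta-Prompting"),
]


def suggest_strategies(**traits: bool) -> list[str]:
    """Return the strategies whose trait is set; fall back to CoT otherwise."""
    unknown = set(traits) - {name for name, _ in GUIDE}
    if unknown:
        raise ValueError(f"Unknown traits: {sorted(unknown)}")
    chosen = [strategy for name, strategy in GUIDE if traits.get(name)]
    return chosen or ["Chain of Thought"]


print(suggest_strategies(needs_external_data=True, multi_stage=True))
```

Called with no traits set, the function falls back to Chain of Thought, matching the guide's first row for a single reasoning task.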

### The Advanced Stack

```
┌─────────────────────────────────────────────────────┐
│  META-PROMPT   →  Generate the optimal prompt       │
├─────────────────────────────────────────────────────┤
│  ToT           →  Explore multiple solution paths   │
├─────────────────────────────────────────────────────┤
│  CoT           →  Reason through the chosen path    │
├─────────────────────────────────────────────────────┤
│  ReAct         →  Gather facts from tools           │
├─────────────────────────────────────────────────────┤
│  CHAIN         →  Refine through sequential steps   │
├─────────────────────────────────────────────────────┤
│  SELF-CONSIST. →  Vote across runs for reliability  │
└─────────────────────────────────────────────────────┘
                          ↓
              Reliable, Reasoned Output
```