# Notebook 7: Step-by-Step Reasoning

In this notebook, you'll learn to improve accuracy on complex tasks by prompting the model to reason through problems systematically.

## What You'll Learn

- Why reasoning helps accuracy
- Chain-of-thought prompting
- Structured reasoning formats
- When reasoning helps vs hurts
- Separating reasoning from final output

## Reference

- [Mistral Prompting Documentation](https://docs.mistral.ai/guides/prompting/)

---
## Setup

In [None]:
%run 00_setup.ipynb

---
## Section 1: Why Reasoning Helps

Without explicit reasoning, models can "jump to conclusions" on complex tasks.

**Problems with direct answers:**
- Skips important intermediate steps
- More prone to errors in math/logic
- Harder to verify the thinking

**Benefits of explicit reasoning:**
- Forces model to work through each step
- Reduces errors in multi-step problems
- Makes verification possible

In [None]:
# A math problem where direct answering might fail
problem = """A store sells apples for $2 each and oranges for $3 each.
If Sarah buys 4 apples and some oranges, and spends exactly $20,
how many oranges did she buy?"""

# Without reasoning
print("WITHOUT REASONING:")
print("-" * 40)
direct_prompt = f"{problem}\n\nAnswer with just the number:"
response_direct = call_mistral(user_prompt=direct_prompt, temperature=0)
print(f"Answer: {response_direct}")

print("\n" + "=" * 50 + "\n")

# With reasoning
print("WITH REASONING:")
print("-" * 40)
reasoning_prompt = f"{problem}\n\nThink step by step, then give your final answer."
response_reasoning = call_mistral(user_prompt=reasoning_prompt, temperature=0)
print(response_reasoning)
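
To judge the two responses above you need a ground truth to compare against. The following is a minimal sketch that solves the same problem with plain arithmetic and makes no model call; the variable names are illustrative and do not come from `00_setup.ipynb`.

```python
# Independent check of the apples-and-oranges problem (no model call needed).
apple_price, orange_price = 2, 3      # dollars per item, from the problem text
apples_bought, total_spent = 4, 20

# Money left for oranges after paying for the apples.
remaining = total_spent - apples_bought * apple_price

# Whole oranges that fit in the remaining budget, plus any money left over.
oranges_bought, leftover = divmod(remaining, orange_price)

print(f"Oranges: {oranges_bought}, leftover dollars: {leftover}")
```

A leftover of zero confirms that the "exactly $20" condition is met, so the expected answer is 4 oranges.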

---
## Section 2: Basic Chain-of-Thought

The simplest way to enable reasoning is to add a phrase like:

- "Think step by step"
- "Let's work through this systematically"
- "Before answering, reason through the problem"
- "Show your reasoning"

In [None]:
# Different reasoning triggers
logic_problem = """All roses are flowers. Some flowers fade quickly. Can we conclude that some roses fade quickly?"""

triggers = [
    "Answer yes or no.",
    "Think step by step, then answer yes or no.",
    "Before answering, analyze the logical structure of this argument. Then answer yes or no."
]

for trigger in triggers:
    prompt = f"{logic_problem}\n\n{trigger}"
    print(f"Trigger: {trigger}")
    print("-" * 40)
    response = call_mistral(user_prompt=prompt, temperature=0)
    print(response)
    print("\n" + "=" * 50 + "\n")
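
The expected answer to the logic problem is "no", and one way to see why is to build a counterexample. The sketch below uses Python sets with made-up members (`rose_1`, `tulip_1`, and so on are purely illustrative) to construct a world where both premises hold but the conclusion does not.

```python
# Counterexample: both premises are true, yet the conclusion is false.
roses = {"rose_1", "rose_2"}
flowers = roses | {"tulip_1"}          # every rose is also a flower
fades_quickly = {"tulip_1"}            # only the tulip fades quickly

premise_1 = roses <= flowers                 # "All roses are flowers"
premise_2 = bool(flowers & fades_quickly)    # "Some flowers fade quickly"
conclusion = bool(roses & fades_quickly)     # "Some roses fade quickly"

print(premise_1, premise_2, conclusion)  # True True False
```

Because a single consistent scenario makes the premises true and the conclusion false, the conclusion does not follow from the premises.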

---
## Section 3: Structured Reasoning Formats

You can guide the reasoning into a specific structure.

### Numbered Steps
```
Work through this problem:
1. First, identify the given information
2. Then, determine what we need to find
3. Apply the relevant formulas/rules
4. Calculate the answer
5. Verify the result
```

### Tagged Sections
```
<reasoning>
[Your step-by-step analysis]
</reasoning>

<answer>
[Your final answer]
</answer>
```
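
Tagged output is useful mainly because it is easy to parse. As a rough sketch, a small helper like the hypothetical `extract_tag` below (not part of `00_setup.ipynb`) pulls out the content of one tag; it assumes the model emits well-formed, non-nested tags, which is not guaranteed.

```python
import re
from typing import Optional


def extract_tag(text: str, tag: str) -> Optional[str]:
    """Return the stripped content of the first <tag>...</tag> pair, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)  # DOTALL lets content span lines
    return match.group(1).strip() if match else None


# A hand-written sample response, used here instead of a real model call.
sample = """<reasoning>
4 apples cost $8, leaving $12 for oranges.
</reasoning>

<answer>
4
</answer>"""

print(extract_tag(sample, "answer"))     # 4
print(extract_tag(sample, "missing"))    # None
```

Returning `None` for a missing tag lets the caller decide whether to retry, fall back to the raw text, or raise an error.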

In [None]:
# Structured with numbered steps
math_problem = """A train travels from City A to City B at 60 mph. 
The return trip is at 40 mph. 
What is the average speed for the entire round trip?

(Hint: It's not 50 mph!)"""

structured_prompt = f"""{math_problem}

Solve this step by step:
1. Define variables for the unknown distance
2. Calculate time for each leg of the trip
3. Calculate total distance and total time
4. Compute average speed
5. State your final answer"""

response = call_mistral(user_prompt=structured_prompt, temperature=0)
print(response)
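
To check the model's answer, the same five steps can be carried out directly. This sketch assumes an arbitrary one-way distance of 120 miles; the distance cancels out, so any positive value gives the same average.

```python
# Average speed = total distance / total time, not the mean of the two speeds.
distance = 120                 # miles, one way (arbitrary; it cancels out)
speed_out, speed_back = 60, 40  # mph, from the problem text

time_out = distance / speed_out     # hours for A -> B
time_back = distance / speed_back   # hours for B -> A

average_speed = (2 * distance) / (time_out + time_back)
print(f"Average speed: {average_speed} mph")  # Average speed: 48.0 mph
```

The result is 48 mph rather than 50 mph because the train spends more time travelling at the slower speed.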

In [None]:
# Structured with XML tags (easier to parse)
analysis_problem = """Should a small startup invest in building their own ML infrastructure 
or use a cloud ML service? Consider costs, time-to-market, and technical complexity."""

tagged_prompt = f"""{analysis_problem}

Structure your response as:

<reasoning>
- Analyze each factor (costs, time-to-market, complexity)
- Consider pros and cons of each option
- Weigh the factors for a typical startup
</reasoning>

<recommendation>
Your recommendation with brief justification
</recommendation>"""

response = call_mistral(user_prompt=tagged_prompt, temperature=0)
print(response)

---
## Section 4: When Reasoning Helps vs Hurts

### ✅ Reasoning HELPS:
- Math and calculations
- Logic puzzles and deduction
- Multi-step analysis
- Complex comparisons
- Tasks where accuracy matters more than speed

### ❌ Reasoning HURTS (or is unnecessary):
- Simple factual questions ("What is the capital of France?")
- Creative writing tasks
- When you need fast, short responses
- When extra tokens add unwanted cost

In [None]:
# Where reasoning helps: complex analysis
print("REASONING HELPS - Complex Analysis:")
print("-" * 40)

complex_task = """Compare these two investment options for a risk-averse retiree:
Option A: 3% guaranteed return, no risk
Option B: Expected 7% return, but could range from -10% to +20%

Think step by step about the trade-offs, then make a recommendation."""

print(call_mistral(user_prompt=complex_task, temperature=0))

In [None]:
# Where reasoning is overkill: simple factual
print("REASONING IS OVERKILL - Simple Factual:")
print("-" * 40)

simple_task = "What is the capital of Japan? Think step by step."
print(call_mistral(user_prompt=simple_task, temperature=0))
print("\n(This could have been just 'Tokyo'!)")

---
## Section 5: Separating Reasoning from Final Output

Sometimes you want the model to reason but only need the final answer. Options:

1. **Use tags, parse out the answer**
2. **Two-step: reason first, then summarize**
3. **Ask the model to reason "internally", then output just the answer** (with no visible reasoning, the accuracy gain may be smaller)
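
Option 2 is not demonstrated in the cells of this section, so here is a minimal sketch of it. The function `reason_then_summarize` and the stub `fake_model` are hypothetical names introduced for illustration; the model is passed in as a plain callable so the example runs without an API key.

```python
from typing import Callable


def reason_then_summarize(problem: str, call_model: Callable[[str], str]) -> str:
    """Two-step pattern: request reasoning first, then ask for only the answer."""
    # Step 1: let the model reason freely.
    reasoning = call_model(f"{problem}\n\nThink step by step and show your reasoning.")

    # Step 2: feed the reasoning back and request a bare final answer.
    summary_prompt = (
        f"Problem:\n{problem}\n\n"
        f"Reasoning:\n{reasoning}\n\n"
        "Based on the reasoning above, output only the final answer."
    )
    return call_model(summary_prompt)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model so the sketch is runnable offline."""
    if "output only the final answer" in prompt:
        return "4"
    return "4 apples cost $8. $20 - $8 = $12. $12 / $3 = 4 oranges."


print(reason_then_summarize("How many oranges did Sarah buy?", fake_model))  # 4
```

In this notebook you could presumably pass `lambda p: call_mistral(user_prompt=p, temperature=0)` as `call_model`. Note that this pattern costs two model calls per question.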

In [None]:
# Using tags to separate reasoning from answer
problem = """If a shirt originally costs $80 and is on sale for 25% off, 
and you have a coupon for an additional 10% off the sale price, 
what is the final price?"""

tagged_prompt = f"""{problem}

Put your reasoning in <reasoning> tags, then put ONLY the final price in <answer> tags.

Example format:
<reasoning>
Step-by-step work...
</reasoning>

<answer>
$XX.XX
</answer>"""

response = call_mistral(user_prompt=tagged_prompt, temperature=0)
print("Full response:")
print(response)

# Extract just the answer
import re
answer_match = re.search(r'<answer>\s*(.+?)\s*</answer>', response, re.DOTALL)
if answer_match:
    print(f"\n📍 Extracted answer: {answer_match.group(1).strip()}")

In [None]:
# Alternative: Ask for internal reasoning, external answer only
concise_prompt = f"""{problem}

Think through this carefully before answering, but only output the final price.
Output format: $XX.XX"""

response = call_mistral(user_prompt=concise_prompt, temperature=0)
print(f"Concise response: {response}")
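
To evaluate either response you need the expected price. The sketch below computes it with the standard-library `decimal` module to avoid floating-point rounding noise; it also shows the tempting but incorrect shortcut of adding the two discounts together.

```python
from decimal import Decimal

original = Decimal("80")

# Correct: the coupon applies to the sale price, so the discounts compound.
sale_price = original * (1 - Decimal("0.25"))      # 25% off the original
final_price = sale_price * (1 - Decimal("0.10"))   # 10% off the sale price

# Common mistake: treating the two discounts as a single 35% discount.
naive_price = original * (1 - Decimal("0.35"))

print(f"Expected: ${final_price:.2f}")          # Expected: $54.00
print(f"Naive 35% off: ${naive_price:.2f}")     # Naive 35% off: $52.00
```

If the concise prompt returns $52.00, the model has most likely added the percentages instead of applying them in sequence.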

---
## Section 6: What NOT to Do (Negative Examples)

### ❌ Vague reasoning request
```
Think about it and give me an answer.
```

### ❌ Reasoning on trivial tasks
```
What is 2 + 2? Think step by step and show all your work.
```

### ❌ Conflicting instructions
```
In one word, what's the capital of France? Think step by step.
```

In [None]:
# Bad: conflicting instructions
print("BAD - Conflicting instructions:")
print("-" * 40)
bad_prompt = "In exactly one word, what is 15 * 23? Think step by step and show your work."
print(call_mistral(user_prompt=bad_prompt, temperature=0))

print("\n" + "=" * 50 + "\n")

# Good: clear separation
print("GOOD - Clear separation:")
print("-" * 40)
good_prompt = """What is 15 * 23?

Think step by step in <reasoning> tags, then give just the number in <answer> tags."""
print(call_mistral(user_prompt=good_prompt, temperature=0))

---
## Exercise 1: With vs Without Reasoning

Compare accuracy on a multi-step problem.

In [None]:
# A problem that benefits from reasoning
word_problem = """
A bookstore has a sale where you buy 2 books, get 1 free (the cheapest one).
Alice picks books priced at $15, $20, and $12.
She also has a membership card that gives her 10% off the total after the sale.
How much does Alice pay?
"""

# Without reasoning
print("WITHOUT REASONING:")
print("-" * 40)
direct_response = call_mistral(
    user_prompt=f"{word_problem}\n\nWhat is the final price? Just give the amount.",
    temperature=0
)
print(f"Answer: {direct_response}")

print("\n" + "=" * 50 + "\n")

# With reasoning
print("WITH REASONING:")
print("-" * 40)
reasoning_response = call_mistral(
    user_prompt=f"{word_problem}\n\nSolve step by step, then state the final price.",
    temperature=0
)
print(reasoning_response)

# Correct answer: Books are $15 + $20 + $12 = $47, free book is $12, so $35, then 10% off = $31.50
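
To compare the two responses automatically rather than by eye, one option is to pull a dollar amount out of each response and check it against the expected value. The helper `last_dollar_amount` below is a hypothetical sketch: it assumes the final answer is the last `$` amount in the text and does not handle thousands separators such as `$1,000`.

```python
import re
from decimal import Decimal
from typing import Optional


def last_dollar_amount(text: str) -> Optional[Decimal]:
    """Return the last $ amount found in the text, or None if there is none."""
    matches = re.findall(r"\$(\d+(?:\.\d{1,2})?)", text)
    return Decimal(matches[-1]) if matches else None


# The cheapest book ($12) is free, then 10% comes off the remaining $35.
expected = (Decimal("15") + Decimal("20")) * Decimal("0.9")

# A hand-written sample response, used here instead of a real model call.
sample_response = "The subtotal after the sale is $35, so Alice pays $31.50."

print(last_dollar_amount(sample_response) == expected)  # True
```

You could apply the same check to `direct_response` and `reasoning_response` from the cell above.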

---
## Exercise 2: Structured Reasoning with Tag Parsing

Create a reasoning prompt with tags and extract just the answer.

In [None]:
# Task: Determine if an email is spam
email = """
Subject: URGENT: Your account will be suspended!!!

Dear Valued Customer,

We have detected unusual activity on your account. Click here immediately 
to verify your identity or your account will be PERMANENTLY DELETED within 24 hours!

Act now: http://totally-legit-bank.suspicious-domain.com/verify

Best regards,
Your Bank Security Team
"""

# TODO: Create a prompt that:
# 1. Analyzes the email for spam indicators (in <reasoning> tags)
# 2. Returns a verdict (in <verdict> tags): SPAM or NOT_SPAM
# 3. Returns a confidence (in <confidence> tags): high/medium/low

spam_analysis_prompt = f"""Analyze this email to determine if it's spam.

<email>
{email}
</email>

# Your structured format here...
"""

# Uncomment to test
# response = call_mistral(user_prompt=spam_analysis_prompt, temperature=0)
# print(response)
# 
# # Parse out the verdict
# verdict_match = re.search(r'<verdict>\s*(.+?)\s*</verdict>', response, re.DOTALL)
# if verdict_match:
#     print(f"\n📍 Verdict: {verdict_match.group(1).strip()}")

---
## Key Takeaways

1. **Step-by-step reasoning improves accuracy** on complex tasks

2. **Simple triggers work**: "Think step by step" is often enough

3. **Structure the reasoning** if you need consistent format (numbered steps, XML tags)

4. **Know when to use it**: Math, logic, multi-step analysis = yes; simple factual = no

5. **Separate reasoning from answer** using tags when you only need the final result

---

## Next Steps

Now that you can improve reasoning, let's learn how to reduce hallucinations and ground responses in facts.

📚 [Continue to Notebook 8: Reducing Hallucinations →](08_reducing_hallucinations.ipynb)