# 🧠 Reasoning Models: Teaching AI to Think Step-by-Step

**Scenario:** Imagine you're building a tutoring app for students learning math and logic. When a student asks "If a train leaves Station A at 60 mph and another leaves Station B at 80 mph, when do they meet?", a regular AI might jump straight to a guess. A **reasoning model** works through the problem step-by-step, shows its work, and is more likely to reach the correct answer, just like a good teacher would!

**The Problem:** Regular AI models are pattern matchers. They're great at language but can struggle with:
- Complex math problems
- Multi-step logical reasoning
- Problems requiring careful analysis

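**The Solution:** **Reasoning models** like DeepSeek-R1, QwQ (from the Qwen team), OpenAI o1, and Gemini 2.0 Flash Thinking can show their reasoning process, break down complex problems, and check their own answers. You can also get much of this behavior from a standard model by prompting it to reason step-by-step, which is the approach this notebook takes.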

In this notebook, you'll learn how to:
- Understand what makes reasoning models different
- Use chain-of-thought prompting
- Solve complex problems step-by-step
- Build applications that require deep reasoning

---

## 🎯 What You'll Build

By the end of this notebook, you'll have worked through:
1. **Reasoning Basics** - How reasoning models work differently
2. **Chain-of-Thought** - Get AI to show its thinking
3. **Math & Logic** - Solve complex problems accurately
4. **Code Debugging** - Find and fix bugs with reasoning
5. **Real-world Applications** - Tutoring bot, problem solver, code reviewer

---

## 📦 Setup: Install Required Packages

We'll use:
- **`litellm`** - One client for many model providers (DeepSeek, Qwen, OpenAI, Gemini)
- **`python-dotenv`** - Load API keys

In [13]:
# Install required packages
!pip install -q litellm python-dotenv

In [14]:
import os
from dotenv import load_dotenv

# Load API key from .env file (if it exists)
load_dotenv()

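# Configuration
# Falls back to the model shown in the output below if DEFAULT_MODEL is not set
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "openrouter/google/gemini-2.0-flash-001")
DEFAULT_TEMPERATURE = 0.7
# 500 tokens is short for step-by-step answers; raise it if replies get cut off
DEFAULT_MAX_TOKENS = 500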

print(f"✅ Using model: {DEFAULT_MODEL}")

✅ Using model: openrouter/google/gemini-2.0-flash-001


## 🛠️ Helper Functions

Let's create functions to work with reasoning models.

In [33]:
from litellm import completion

def solve_with_reasoning(problem: str, temperature: float = DEFAULT_TEMPERATURE, max_tokens: int = DEFAULT_MAX_TOKENS) -> str:
    """
    Solve a problem using chain-of-thought reasoning.
    
    Args:
        problem: The problem to solve
        temperature: Lower = more focused, higher = more creative
        max_tokens: Maximum number of tokens in the model's reply
    
    Returns:
        Solution with reasoning steps
    """
    # Add chain-of-thought prompt
    prompt = f"""{problem}

Please solve this step-by-step:
1. First, understand what is being asked
2. Break down the problem into steps
3. Solve each step
4. Verify your answer
5. Provide the final answer

Show your reasoning for each step."""
    
    response = completion(
        model=DEFAULT_MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens
    )
    
    return response.choices[0].message.content

print("✅ Helper functions loaded!")

✅ Helper functions loaded!
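Several outputs later in this notebook stop mid-sentence, which is consistent with a reply hitting the `max_tokens` limit. LiteLLM returns responses in the OpenAI format, where each choice carries a `finish_reason`, and a value of `"length"` means the token cap cut the reply short. The sketch below assumes that format. `was_truncated` is a hypothetical helper name, and the fake responses exist only so the example runs without an API call.

```python
from types import SimpleNamespace

def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"

# Stand-in objects shaped like an OpenAI-format response (no API call needed)
fake_cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
fake_complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])

print(was_truncated(fake_cut_off))   # True
print(was_truncated(fake_complete))  # False
```

If a real response comes back truncated, one option is to call the helper again with a larger `max_tokens`.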


---

## 🤔 Part 1: Understanding Reasoning Models

**What makes reasoning models special?**

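Regular AI models predict the next word based on patterns and tend to answer right away. Reasoning models also generate text word by word, but they are trained to spend extra effort working through the problem first:
- **Think before answering**: They generate intermediate reasoning steps before committing to an answer
- **Show their work**: You can often see how they arrived at the answer
- **Self-verify**: They check their own answers
- **Handle complexity**: They are better at multi-step problems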

**Real-World Analogy:** It's like the difference between:
- **Regular AI**: A student who memorized answers
- **Reasoning AI**: A student who understands the concepts and can explain their thinking

### 💡 Key Insight

**Chain-of-Thought (CoT) Prompting** is the key technique in this notebook. Instead of asking "What's the answer?", you ask "Solve this step-by-step and show your work."

This simple change often improves accuracy on multi-step problems, though it does not guarantee a correct answer.
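To make the contrast concrete, here is a minimal sketch of the two prompting styles as plain string builders. The function names and the exact wording of the instructions are illustrative assumptions, not a fixed recipe.

```python
def direct_prompt(problem: str) -> str:
    """Ask for the answer only, with no visible reasoning."""
    return f"{problem}\n\nGive only the final answer."

def cot_prompt(problem: str) -> str:
    """Ask the model to reason step-by-step before answering."""
    return (
        f"{problem}\n\n"
        "Solve this step-by-step and show your work, "
        "then state the final answer."
    )

question = "What is 15% of 240?"
print(direct_prompt(question))
print("---")
print(cot_prompt(question))
```

Sending both versions of the same question to the same model is a simple way to see how much the step-by-step instruction changes the reply.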

---

## 🧮 Part 2: Simple Reasoning Test

**Scenario:** A student asks your tutoring app: "If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?"

This is a classic trick question! Let's see how reasoning helps.

In [16]:
problem = "If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?"

print(f"[Problem:] {problem}\n")

solution = solve_with_reasoning(problem)

print(f"✓ Solution with Reasoning:")
print(solution)

[Problem:] If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?

✓ Solution with Reasoning:
Okay, let's break down this classic riddle step-by-step.

**1. Understand what is being asked:**

The question asks how long it takes to boil three eggs *simultaneously*. It's not asking how long it takes to boil one egg, then another, then another.

**2. Break down the problem into steps:**

*   **Step 1:** Determine the boiling time for a single egg.
*   **Step 2:** Consider if boiling multiple eggs at the same time changes the boiling time.

**3. Solve each step:**

*   **Step 1:** The problem states it takes 5 minutes to boil one egg.
*   **Step 2:** Boiling multiple eggs simultaneously in the same pot of boiling water doesn't change the amount of time it takes for each individual egg to cook. The water needs to reach and maintain a certain temperature regardless of how many eggs are in it (assuming there's enough water to cover all the eggs and maintain a rolling

### ❓ Discussion Question #1

Did the AI fall for the trick? The expected answer is **5 minutes**, because all three eggs can boil at the same time in one pot.

Notice how the reasoning process helps the AI think through the problem logically instead of pattern-matching to "3 eggs × 5 minutes = 15 minutes". In the output above, the model spots the key point (the eggs boil simultaneously) in its very first step.

---

## 🔢 Part 3: Complex Math Problem

**Scenario:** A student needs help with a challenging math problem. Regular AI might get it wrong, but reasoning models can work through it step-by-step.

**What this means:** Instead of guessing, the AI will:
1. Identify what needs to be calculated
2. Break it into smaller steps
3. Solve each step
4. Verify the final answer

In [25]:
problem = "What is the sum of the first 10 prime numbers? Keep the answer concise under 100 words."

print(f"[Problem:] {problem}\n")

solution = solve_with_reasoning(problem, temperature=0.3)  # Lower temp for math

print(f"✓ Solution:")
print(solution)

[Problem:] What is the sum of the first 10 prime numbers? Keep the answer concise under 100 words.

✓ Solution:
Here's the solution:

1. **Understand:** We need to find the first 10 prime numbers and then add them together.

2. **Breakdown:**
   * Identify the first 10 prime numbers.
   * Sum those numbers.

3. **Solve:**
   * Prime numbers are numbers greater than 1, divisible only by 1 and themselves. The first 10 are: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
   * Sum: 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 + 29 = 129

4. **Verify:** Double-check the list of prime numbers and the addition.

5. **Final Answer:** The sum of the first 10 prime numbers is 129.
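The model's "Verify" step only says to double-check, so it is worth confirming the arithmetic independently. The sketch below regenerates the first 10 primes by trial division and sums them. `first_primes` is a helper written for this check, not part of the notebook's code.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` primes using trial division by earlier primes."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # A number is prime if no smaller prime divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

primes = first_primes(10)
print(primes)       # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(sum(primes))  # 129
```

The result matches the model's answer of 129.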



### 💡 Key Insight

For math problems, use a **lower temperature** (0.2-0.3) to get more focused, more repeatable reasoning. A higher temperature adds randomness, which can lead to creative but incorrect solutions!

---

## 🐛 Part 4: Code Debugging with Reasoning

**Scenario:** A student wrote code that has a bug. Instead of just fixing it, your tutoring app should explain what's wrong and why.

**Why reasoning helps:** The AI needs to:
1. Understand what the code is supposed to do
2. Trace through the logic
3. Identify where it goes wrong
4. Explain the fix

In [18]:
buggy_code = '''
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-3)  # Bug here!
'''

debug_prompt = f"""Find and explain the bug in this code:
{buggy_code}

1. First, explain what the code is trying to do
2. Trace through an example (e.g., fibonacci(5))
3. Identify where the bug is
4. Explain why it's wrong
5. Provide the corrected version

Show your reasoning for each step. Keep it concise under 100 words."""

print(f"[Buggy Code:]")
print(buggy_code)
print()

solution = solve_with_reasoning(debug_prompt, temperature=0.3)

print(f"✓ Debug Analysis:")
print(solution)

[Buggy Code:]

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-3)  # Bug here!


✓ Debug Analysis:
Okay, let's break this down.

1. **Understanding the Task:** The task is to identify and correct a bug in a Fibonacci sequence implementation.

2. **Breaking Down the Problem:**
    *   Explain the code's intended purpose.
    *   Trace an example to observe the behavior.
    *   Pinpoint the bug's location.
    *   Explain why the bug causes incorrect results.
    *   Provide a corrected version of the code.

3. **Solving Each Step:**

    *   **Intended Purpose:** The code aims to calculate the nth Fibonacci number.

    *   **Example Trace (fibonacci(5)):**
        *   fibonacci(5) -> fibonacci(4) + fibonacci(2)
        *   fibonacci(4) -> fibonacci(3) + fibonacci(1)
        *   fibonacci(3) -> fibonacci(2) + fibonacci(0)
        *   fibonacci(2) -> fibonacci(1) + fibonacci(-1)  <-- Problem!
        *   fibonacci(1) -> 1
        *   fibonacci(0
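The recorded analysis stops before it reaches a corrected version. As a sketch of what the fix looks like, the snippet below runs the buggy function next to one that uses the standard recurrence `F(n) = F(n-1) + F(n-2)`. The function names are chosen here for illustration.

```python
def fibonacci_buggy(n: int) -> int:
    if n <= 1:
        return n
    # Wrong offset: reaches back three positions instead of two
    return fibonacci_buggy(n - 1) + fibonacci_buggy(n - 3)

def fibonacci_fixed(n: int) -> int:
    if n <= 1:
        return n
    # Standard recurrence: sum of the two preceding values
    return fibonacci_fixed(n - 1) + fibonacci_fixed(n - 2)

print([fibonacci_buggy(n) for n in range(7)])  # [0, 1, 0, 0, 1, 1, 1]
print([fibonacci_fixed(n) for n in range(7)])  # [0, 1, 1, 2, 3, 5, 8]
```

The buggy version does not crash, because negative arguments hit the `n <= 1` base case and are returned as-is. It just produces wrong numbers, which is why tracing an example is useful.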

### 🎯 Challenge Task #1: Test Your Understanding

Create your own buggy code example and use reasoning to debug it. Try:
- An off-by-one error in a loop
- A logic error in a conditional
- A mathematical mistake in a calculation

See if the AI can identify and explain the bug!

In [19]:
# Your experimentation space for Challenge #1

# Example:
# my_buggy_code = '''
# def is_even(n):
#     return n % 2 == 1  # Bug!
# '''
# 
# debug_result = solve_with_reasoning(f"Find the bug: {my_buggy_code}")
# print(debug_result)

---

## 🧩 Part 5: Logical Reasoning Puzzle

**Scenario:** Your app needs to solve logic puzzles that require careful step-by-step thinking.

**What this means:** Problems like "If all A are B, and all B are C, what can we conclude about A and C?" require logical deduction, not just pattern matching.

In [27]:
logic_puzzle = """
Three friends - Alice, Bob, and Carol - are standing in a line.
- Alice is not first
- Bob is not last
- Carol is not in the middle

What is the order of the three friends? Make sure answer is concise under 200 words.
"""

print(f"[Logic Puzzle:]")
print(logic_puzzle)

solution = solve_with_reasoning(logic_puzzle, temperature=0.3)

print(f"✓ Solution:")
print(solution)

[Logic Puzzle:]

Three friends - Alice, Bob, and Carol - are standing in a line.
- Alice is not first
- Bob is not last
- Carol is not in the middle

What is the order of the three friends? Make sure answer is concise under 200 words.

✓ Solution:
Here's the step-by-step solution:

1.  **Understand:** We need to find the order of Alice, Bob, and Carol in a line, given three constraints.

2.  **Breakdown:**
    *   List all possible arrangements.
    *   Eliminate arrangements that violate the given constraints.

3.  **Solve:**
    *   Possible arrangements: ABC, ACB, BAC, BCA, CAB, CBA
    *   Alice is not first: Eliminate ABC, ACB
    *   Bob is not last: Eliminate ACB, CAB
    *   Carol is not in the middle: Eliminate BAC, CBA
    *   Remaining arrangement: BCA

4.  **Verify:**
    *   BCA: Alice is last (not first), Bob is first (not last), Carol is in the middle (false).
    *   BCA: Alice is last (satisfies), Bob is first (satisfies), Carol is in the middle (does not satisfy).
   

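### ❓ Discussion Question #2

To solve this puzzle, the AI needs to:
1. List the constraints
2. Try different arrangements
3. Check each against the constraints
4. Find the valid solution(s)

This is how a human would approach it, but look closely at the output above: the model made mistakes. It eliminated BAC and CBA under "Carol is not in the middle", even though Carol is last in BAC and first in CBA, and it kept BCA, where Carol *is* in the middle. Its own verification step then flags the contradiction. Checking all six arrangements by hand shows that the puzzle as written has two valid answers: Bob, Alice, Carol and Carol, Bob, Alice.

Step-by-step reasoning makes errors like this visible, but it does not prevent them. What other types of logic puzzles could benefit from this approach?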
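With only six possible arrangements, the puzzle can be checked exhaustively in code. The sketch below is an independent check, not part of the original notebook run, and `satisfies_constraints` is a name chosen for illustration.

```python
from itertools import permutations

friends = ["Alice", "Bob", "Carol"]

def satisfies_constraints(order: tuple[str, ...]) -> bool:
    """Check one left-to-right line-up against the three puzzle rules."""
    return (
        order[0] != "Alice"      # Alice is not first
        and order[-1] != "Bob"   # Bob is not last
        and order[1] != "Carol"  # Carol is not in the middle
    )

valid_orders = [o for o in permutations(friends) if satisfies_constraints(o)]
for order in valid_orders:
    print(" -> ".join(order))
# Bob -> Alice -> Carol
# Carol -> Bob -> Alice
```

Two arrangements pass, so the puzzle needs one more constraint to have a unique answer. A brute-force check like this is a cheap way to validate a model's answer on small puzzles.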

---

## 💼 Real-World Applications

### 1. Math Tutoring Bot
**What it does:** Helps students learn by showing step-by-step solutions

**How to build it:**
- Use chain-of-thought prompting for all problems
- Show reasoning steps, not just answers
- Ask follow-up questions to check understanding

In [28]:
# Example: Math Tutoring Bot
def math_tutor(problem: str) -> str:
    """Tutoring bot that explains step-by-step."""
    tutor_prompt = f"""
You are a patient math tutor. A student asks:

{problem}

Provide a step-by-step explanation that helps them learn, not just the answer.
Use simple language and encourage them to think through each step. Make the answer under 200 words.
"""
    
    response = completion(
        model=DEFAULT_MODEL,
        messages=[{"role": "user", "content": tutor_prompt}],
        temperature=0.5,
        max_tokens=DEFAULT_MAX_TOKENS
    )
    
    return response.choices[0].message.content

# Test it
student_question = "How do I solve: 2x + 5 = 15?"
print(f"[Student:] {student_question}\n")

tutor_response = math_tutor(student_question)
print(f"✓ Tutor:")
print(tutor_response)

[Student:] How do I solve: 2x + 5 = 15?

✓ Tutor:
Okay, let's solve this together! The goal is to get 'x' all by itself on one side of the equation.

Think of the equation like a balanced scale. To keep it balanced, whatever we do to one side, we *must* do to the other.

1. **Isolate the term with 'x':** Right now, we have "2x + 5". We want to get rid of that "+ 5".  The opposite of adding 5 is subtracting 5. So, let's subtract 5 from *both* sides of the equation:

   2x + 5 - 5 = 15 - 5

   This simplifies to: 2x = 10

2. **Solve for 'x':** Now we have "2x = 10". This means "2 times x equals 10". To get 'x' alone, we need to undo the multiplication. The opposite of multiplying by 2 is dividing by 2. So, let's divide *both* sides by 2:

   2x / 2 = 10 / 2

   This simplifies to: x = 5

So, the answer is x = 5!  Does that make sense? We basically undid the operations that were happening to 'x' one by one until it was all alone.



### 2. Code Review Assistant
**What it does:** Reviews code and explains potential issues with reasoning

**What this means:** Instead of just saying "This is wrong," it explains:
- What the code does
- What could go wrong
- Why it's a problem
- How to fix it

In [37]:
# Example: Code Review Assistant
code_to_review = '''
def divide_list(numbers, divisor):
    result = []
    for num in numbers:
        result.append(num / divisor)
    return result
'''

review_prompt = f"""
Review this code for potential issues:
{code_to_review}

Think through:
1. What does this code do?
2. What inputs could cause problems?
3. What are the edge cases?
4. How should it be improved?

Provide a thorough code review with reasoning.
"""

print(f"[Code to Review:]")
print(code_to_review)
print()

review = solve_with_reasoning(review_prompt, temperature=0.3, max_tokens=2000)
print(f"✓ Code Review:")
print(review)

[Code to Review:]

def divide_list(numbers, divisor):
    result = []
    for num in numbers:
        result.append(num / divisor)
    return result


✓ Code Review:
Okay, let's break down this code review step-by-step.

**1. Understanding the Task**

The task is to review the provided Python code snippet `divide_list(numbers, divisor)` for potential issues, edge cases, and suggest improvements.  We need to consider what the code does, what could go wrong, and how to make it more robust.

**2. Breaking Down the Problem**

Here's how we'll approach the review:

*   **Functionality:** Describe what the code is intended to do.
*   **Input Validation:** Identify potential problematic inputs.
*   **Edge Cases:** Consider specific scenarios that might lead to unexpected behavior.
*   **Error Handling:**  Determine if the code handles errors gracefully.
*   **Improvements:** Suggest modifications to enhance robustness, readability, or efficiency.
*   **Revised Code:** Provide an improved vers
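The recorded review ends before the revised code appears. One possible improvement, sketched under the assumption that a zero divisor should be rejected up front with a clear error, is shown below. `divide_list_safe` is a hypothetical name, and a real review might make different choices (for example, also validating element types).

```python
from typing import Iterable

def divide_list_safe(numbers: Iterable[float], divisor: float) -> list[float]:
    """Divide every number by `divisor`, rejecting a zero divisor early."""
    if divisor == 0:
        # Fail before the loop so the caller gets one clear message
        raise ValueError("divisor must not be zero")
    return [num / divisor for num in numbers]

print(divide_list_safe([10, 20, 30], 5))  # [2.0, 4.0, 6.0]
print(divide_list_safe([], 5))            # []

try:
    divide_list_safe([1, 2], 0)
except ValueError as err:
    print(f"Rejected: {err}")  # Rejected: divisor must not be zero
```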

### 3. Decision Analysis Tool
**What it does:** Helps users make complex decisions by analyzing pros/cons

**How to build it:**
- Present the decision scenario
- Ask AI to list all factors
- Weigh pros and cons systematically
- Provide a reasoned recommendation

In [39]:
# Example: Decision Analysis
decision = """
I'm deciding between two job offers:

Job A:
- Salary: $120,000
- Location: San Francisco (high cost of living)
- Startup with 20 people
- Equity: 0.5%
- Exciting product but risky

Job B:
- Salary: $100,000
- Location: Austin (moderate cost of living)
- Established company with 500 people
- No equity
- Stable but less exciting

Help me think through this decision.
"""

analysis_prompt = f"""
{decision}

Analyze this decision step-by-step:
1. List all important factors
2. Compare each factor
3. Consider long-term implications
4. Identify what matters most
5. Provide a reasoned recommendation

Keep the analysis per step concise.
"""

print(f"[Decision Scenario:]")
print(decision)

analysis = solve_with_reasoning(analysis_prompt, temperature=0.5, max_tokens=2000)
print(f"✓ Analysis:")
print(analysis)

[Decision Scenario:]

I'm deciding between two job offers:

Job A:
- Salary: $120,000
- Location: San Francisco (high cost of living)
- Startup with 20 people
- Equity: 0.5%
- Exciting product but risky

Job B:
- Salary: $100,000
- Location: Austin (moderate cost of living)
- Established company with 500 people
- No equity
- Stable but less exciting

Help me think through this decision.

✓ Analysis:
Okay, let's break down this job offer decision.

**1. Understand what is being asked:**

The task is to analyze two job offers, considering various factors like salary, location, company size, equity, and job stability/excitement, and ultimately recommend which offer to accept based on a reasoned analysis.

**2. Break down the problem into steps:**

As requested, we'll follow these steps:

1.  List all important factors
2.  Compare each factor
3.  Consider long-term implications
4.  Identify what matters most
5.  Provide a reasoned recommendation

**3. Solve each step:**

**Step 1: List all

---

## 🚀 Final Challenge: Build a Problem-Solving Assistant

**Your Task:** Create an assistant that can solve different types of problems:
1. Math problems
2. Logic puzzles
3. Code debugging

**Requirements:**
- Detect the problem type
- Use appropriate reasoning strategy for each type
- Show step-by-step reasoning
- Verify the answer

**Test with:**
- "What is 15% of 240?"
- "If all cats are animals, and some animals are pets, are all cats pets?"
- A buggy code snippet

In [24]:
# Your experimentation space for the Final Challenge

# Example structure:
# def problem_solver(problem: str):
#     # Detect problem type
#     # Choose reasoning strategy
#     # Solve with step-by-step reasoning
#     # Return solution

# Test with different problem types
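As a starting point for the "detect the problem type" requirement, here is a rough keyword heuristic. It is only a sketch: the function name, the three labels, and the regular expressions are all assumptions made for illustration, and a real assistant might ask the model itself to classify the problem.

```python
import re

def detect_problem_type(problem: str) -> str:
    """Classify a problem as 'code', 'math', or 'logic' using simple patterns."""
    # Python-looking syntax is checked first, since code often contains digits too
    if re.search(r"\bdef \w+\(|\breturn\b|\bfor \w+ in\b", problem):
        return "code"
    # Any digit is treated as a sign of a math problem
    if re.search(r"\d", problem):
        return "math"
    # Everything else falls through to logic
    return "logic"

samples = [
    "What is 15% of 240?",
    "If all cats are animals, and some animals are pets, are all cats pets?",
    "def is_even(n):\n    return n % 2 == 1",
]
for sample in samples:
    print(detect_problem_type(sample))
# math
# logic
# code
```

The heuristic is easy to fool (a logic puzzle that mentions numbers is labeled as math), so treat it as scaffolding to replace once the rest of the assistant works.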

---

## 🎓 Key Takeaways

Congratulations! You now understand reasoning models:

1. **Chain-of-Thought** - Asking AI to show its work often improves accuracy on multi-step problems
2. **Step-by-Step** - Breaking problems into steps helps solve complex tasks
3. **Self-Verification** - AI can check its own answers, though the check can still miss or fail to fix an error
4. **Transparency** - Seeing reasoning helps trust and debug
5. **Real-world Power** - Tutoring, debugging, decision analysis

### 🔑 Best Practices:

| Problem Type | Temperature | Max Tokens | Reasoning Depth |
|--------------|-------------|------------|----------------|
| Math | Low (0.2-0.3) | 2000+ | High - show all steps |
| Logic Puzzles | Low (0.3) | 1500+ | High - systematic |
| Code Debugging | Low (0.3) | 2000+ | High - trace execution |
| Decision Analysis | Medium (0.5) | 2000+ | Medium - balanced |
| Creative Problem | Higher (0.7) | 1500+ | Medium - flexible |
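The table can be turned into a small lookup so each call picks up matching settings. This is a sketch: the dictionary keys and the `settings_for` helper are names invented here, ranges such as "0.2-0.3" are represented by the value this notebook actually used (0.3), "2000+" and "1500+" are represented by their lower bounds, and the fallback for unknown types is an assumption.

```python
# Settings taken from the best-practices table above
REASONING_SETTINGS = {
    "math": {"temperature": 0.3, "max_tokens": 2000},
    "logic": {"temperature": 0.3, "max_tokens": 1500},
    "code": {"temperature": 0.3, "max_tokens": 2000},
    "decision": {"temperature": 0.5, "max_tokens": 2000},
    "creative": {"temperature": 0.7, "max_tokens": 1500},
}

def settings_for(problem_type: str) -> dict:
    """Return the table's settings, or a cautious default for unknown types."""
    # Assumed default: low temperature with a generous token budget
    return REASONING_SETTINGS.get(problem_type, {"temperature": 0.3, "max_tokens": 2000})

print(settings_for("decision"))  # {'temperature': 0.5, 'max_tokens': 2000}
print(settings_for("unknown"))   # {'temperature': 0.3, 'max_tokens': 2000}
```

The returned dictionary can be unpacked into the helper defined earlier, for example `solve_with_reasoning(problem, **settings_for("math"))`.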

### ⚠️ Important Considerations:

- **Cost**: Reasoning models use more tokens (longer responses)
- **Speed**: Step-by-step reasoning takes longer
- **Not always needed**: Simple questions don't need complex reasoning
- **Verification**: AI can still make mistakes - always verify critical answers

### 🚀 Next Steps:

- **Try** specialized reasoning models (DeepSeek-R1, QwQ, OpenAI o1)
- **Combine** reasoning with tools for more powerful agents
- **Build** educational applications that teach through reasoning
- **Experiment** with different prompting strategies

---

## 📚 Additional Resources

- [Learning to Reason with LLMs (OpenAI o1 announcement)](https://openai.com/index/learning-to-reason-with-llms/)
- [Chain-of-Thought Prompting Paper](https://arxiv.org/abs/2201.11903)
- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1)

Happy building! 🎉