# 🧠 Reasoning Models: Teaching AI to Think Step-by-Step

**Scenario:** Imagine you're building a tutoring app for students learning math and logic. When a student asks "If a train leaves Station A at 60 mph and another leaves Station B at 80 mph, when do they meet?", a regular AI might guess or give a wrong answer. But a **reasoning model** shows its work, thinks step-by-step, and arrives at the correct answer, just like a good teacher would!

**The Problem:** Regular AI models are pattern matchers. They're great at language but can struggle with:
- Complex math problems
- Multi-step logical reasoning
- Problems requiring careful analysis

**The Solution:** **Reasoning models** like DeepSeek-R1, Qwen-QwQ, OpenAI o1, and Gemini 2.0 with thinking mode can show their reasoning process, break down complex problems, and verify their own answers.

In this notebook, you'll learn how to:
- Understand what makes reasoning models different
- Use chain-of-thought prompting
- Solve complex problems step-by-step
- Build applications that require deep reasoning

---

## 🎯 What You'll Build

By the end of this notebook, you'll have worked through:
1. **Reasoning Basics** - How reasoning models work differently
2. **Chain-of-Thought** - Get AI to show its thinking
3. **Math & Logic** - Solve complex problems accurately
4. **Code Debugging** - Find and fix bugs with reasoning
5. **Real-world Applications** - Tutoring bot, problem solver, code reviewer

---

## 📦 Setup: Install Required Packages

We'll use:
- **`litellm`** - Works with reasoning models (DeepSeek, Qwen, OpenAI o1, Gemini)
- **`python-dotenv`** - Load API keys

In [2]:
# Install required packages
!pip install -q litellm python-dotenv

In [3]:
import os
from dotenv import load_dotenv

# Load API key from .env file (if it exists)
load_dotenv()

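# Configuration
# Falls back to the model shown in the output below if DEFAULT_MODEL is not set
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "openrouter/google/gemini-2.0-flash-001")
DEFAULT_TEMPERATURE = 0.7
DEFAULT_MAX_TOKENS = 500  # Long step-by-step answers may need a higher limit

print(f"✅ Using model: {DEFAULT_MODEL}")

✅ Using model: openrouter/google/gemini-2.0-flash-001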


## 🛠️ Helper Functions

Let's create functions to work with reasoning models.

In [4]:
import logging

import litellm
from litellm import completion

# Silence LiteLLM's debug output so cell output stays readable
litellm.suppress_debug_info = True
logging.getLogger("litellm").setLevel(logging.CRITICAL)

def solve_with_reasoning(problem: str, temperature: float = DEFAULT_TEMPERATURE, max_tokens: int = DEFAULT_MAX_TOKENS) -> str:
    """
    Solve a problem using chain-of-thought reasoning.
    
    Args:
        problem: The problem to solve
        temperature: Lower = more focused, higher = more creative
        max_tokens: Maximum number of tokens in the response
    
    Returns:
        Solution with reasoning steps
    """
    # Add chain-of-thought prompt
    prompt = f"""{problem}

Please solve this step-by-step:
1. First, understand what is being asked
2. Break down the problem into steps
3. Solve each step
4. Verify your answer
5. Provide the final answer

Show your reasoning for each step."""
    
    response = completion(
        model=DEFAULT_MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens
    )
    
    return response.choices[0].message.content

print("✅ Helper functions loaded!")

✅ Helper functions loaded!


---

## 🤔 Part 1: Understanding Reasoning Models

**What makes reasoning models special?**

Standard chat models answer directly, predicting the next token from patterns in their training data. Reasoning models are trained to:
- **Think before answering**: They generate intermediate reasoning steps before the final answer
- **Show their work**: You can often see how they arrived at the answer (some providers expose only a summary)
- **Self-verify**: They check their own answers
- **Handle complexity**: Better at multi-step problems

**Real-World Analogy:** It's like the difference between:
- **Regular AI**: A student who memorized answers
- **Reasoning AI**: A student who understands the concepts and can explain their thinking

### 💡 Key Insight

**Chain-of-Thought (CoT) Prompting** is the key technique. Instead of asking "What's the answer?", you ask "Solve this step-by-step and show your work."

This simple change can substantially improve accuracy on multi-step problems, and it works even with models that were not trained specifically for reasoning.
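
As a minimal sketch of the difference, the two prompt styles can be built side by side. The helper name `build_prompts` and the exact wording of the instructions are illustrative assumptions, not part of any library; the function only builds strings and makes no API call.

```python
def build_prompts(question: str) -> dict[str, str]:
    """Return a direct prompt and a chain-of-thought prompt for the same question.

    Hypothetical helper for illustration only; the wording is an assumption.
    """
    direct = f"{question}\n\nGive only the final answer."
    cot = (
        f"{question}\n\n"
        "Solve this step-by-step and show your work, "
        "then state the final answer on its own line."
    )
    return {"direct": direct, "cot": cot}


prompts = build_prompts("What is 17 * 24?")
for style, text in prompts.items():
    print(f"--- {style} ---\n{text}\n")
```

Sending both variants to the same model and comparing the answers is a quick way to see whether step-by-step prompting helps for a given kind of question.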

---

## 🧮 Part 2: Simple Reasoning Test

**Scenario:** A student asks your tutoring app: "If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?"

This is a classic trick question! Let's see how reasoning helps.

In [16]:
problem = "If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?"

print(f"[Problem:] {problem}\n")

solution = solve_with_reasoning(problem)

print("✓ Solution with Reasoning:\n")
print(solution)

[Problem:] If it takes 5 minutes to boil one egg, how long does it take to boil three eggs?

✓ Solution with Reasoning:

Here's the step-by-step solution:

1. **Understand what is being asked:** The question asks how long it takes to boil three eggs *at the same time*. It's implied that you can boil all three eggs in the same pot.

2. **Break down the problem into steps:**
   *  The time it takes to boil an egg depends on reaching a certain internal temperature.
   *  If you boil multiple eggs in the same pot, the water temperature is the same for all eggs.
   *  Therefore, all eggs will cook at the same rate.

3. **Solve each step:**
   *  It takes 5 minutes to boil one egg.
   *  Since all three eggs are in the same pot, they all experience the same boiling water temperature.
   *  Therefore, it will still take 5 minutes to boil all three eggs.

4. **Verify your answer:** Imagine you put one egg in a pot and start a timer. Then, after the water is boiling, you add two more eggs. Th

### ❓ Discussion Question #1

Did the AI fall for the trick? The correct answer is **5 minutes** (you can boil all three eggs at the same time!).

Notice how the reasoning process helps the AI think through the problem logically instead of just pattern-matching to "3 eggs × 5 minutes = 15 minutes".

---

## 🔢 Part 3: Complex Math Problem

**Scenario:** A student needs help with a challenging math problem. Regular AI might get it wrong, but reasoning models can work through it step-by-step.

**What this means:** Instead of guessing, the AI will:
1. Identify what needs to be calculated
2. Break it into smaller steps
3. Solve each step
4. Verify the final answer

In [6]:
problem = "What is the sum of the first 10 prime numbers? Keep the answer concise under 100 words."

print(f"[Problem:] {problem}\n")

solution = solve_with_reasoning(problem, temperature=0.3)  # Lower temp for math

print("✓ Solution:")
print(solution)

[Problem:] What is the sum of the first 10 prime numbers? Keep the answer concise under 100 words.

✓ Solution:
Here's the solution:

1.  **Understand:** We need to find the first 10 prime numbers and then add them together.

2.  **Steps:**
    *   Identify the first 10 prime numbers.
    *   Sum those numbers.

3.  **Solve:**
    *   Prime numbers are numbers greater than 1 that are only divisible by 1 and themselves. The first 10 prime numbers are: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
    *   Sum: 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 + 29 = 129

4.  **Verify:** Double-check the list of prime numbers and the addition.

5.  **Answer:** The sum of the first 10 prime numbers is 129.



### 💡 Key Insight

For math problems, use **lower temperature** (0.2-0.3) to get more focused, deterministic reasoning. Higher temperature can lead to creative but incorrect solutions!
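
Because a model can still slip on arithmetic, it may be worth checking a result like this directly. The sketch below recomputes the sum with plain Python; `first_primes` is a hypothetical helper written for this check, using simple trial division.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` primes using trial division (fine for small counts)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no smaller prime up to its square root divides it
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


primes = first_primes(10)
total = sum(primes)
print(primes, total)
```

The computed total agrees with the model's answer of 129.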

---

## 🐛 Part 4: Code Debugging with Reasoning

**Scenario:** A student wrote code that has a bug. Instead of just fixing it, your tutoring app should explain what's wrong and why.

**Why reasoning helps:** The AI needs to:
1. Understand what the code is supposed to do
2. Trace through the logic
3. Identify where it goes wrong
4. Explain the fix

In [19]:
buggy_code = '''
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-3)  # Bug here!
'''

debug_prompt = f"""Find and explain the bug in this code:
{buggy_code}

1. First, explain what the code is trying to do
2. Trace through an example (e.g., fibonacci(5))
3. Identify where the bug is
4. Explain why it's wrong
5. Provide the corrected version

Show your reasoning for each step. Keep it concise under 100 words."""

print(f"[Buggy Code:]")
print(buggy_code)
print()

solution = solve_with_reasoning(debug_prompt, temperature=0.3, max_tokens=1000)

print("✓ Debug Analysis:\n")
print(solution)

[Buggy Code:]

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-3)  # Bug here!


✓ Debug Analysis:

Okay, let's break this down.

1. **Understanding the task:** The code aims to calculate the nth Fibonacci number. The task is to find the bug, explain it, and provide a corrected version.

2. **Breaking down the problem:**
    *   Explain the code's intended functionality.
    *   Trace an example execution.
    *   Pinpoint the bug.
    *   Explain why the bug causes incorrect results.
    *   Provide the corrected code.

3. **Solving each step:**

    *   **Intended functionality:** The code *attempts* to calculate the nth Fibonacci number, where each number is the sum of the two preceding ones.

    *   **Example trace (fibonacci(5)):**
        *   fibonacci(5) -> fibonacci(4) + fibonacci(2)
        *   fibonacci(4) -> fibonacci(3) + fibonacci(1)
        *   fibonacci(3) -> fibonacci(2) + fibonacci(0)
        *   fibonacci(2) -> fibonacci(1)

### 🎯 Challenge Task #1: Test Your Understanding

Create your own buggy code example and use reasoning to debug it. Try:
- An off-by-one error in a loop
- A logic error in a conditional
- A mathematical mistake in a calculation

See if the AI can identify and explain the bug!
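
Before asking the model, it can help to confirm that a bug really exists by comparing the function against a trusted reference. The sketch below does this for the Fibonacci example from Part 4; `fibonacci_buggy` and `fibonacci_reference` are names chosen here for illustration.

```python
def fibonacci_buggy(n: int) -> int:
    """The buggy version from this notebook: recurses on n-3 instead of n-2."""
    if n <= 1:
        return n
    return fibonacci_buggy(n - 1) + fibonacci_buggy(n - 3)


def fibonacci_reference(n: int) -> int:
    """Iterative reference implementation (0, 1, 1, 2, 3, 5, ...)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Collect every input in a small range where the two versions disagree
mismatches = [
    (n, fibonacci_buggy(n), fibonacci_reference(n))
    for n in range(8)
    if fibonacci_buggy(n) != fibonacci_reference(n)
]
print(mismatches)
```

The first disagreement appears at `n = 2`, where the buggy version ends up calling itself with `-1`. A concrete failing input like this is also a useful detail to include in the debugging prompt.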

In [8]:
# Your experimentation space for Challenge #1

# Example:
# my_buggy_code = '''
# def is_even(n):
#     return n % 2 == 1  # Bug!
# '''
# 
# debug_result = solve_with_reasoning(f"Find the bug: {my_buggy_code}")
# print(debug_result)

---

## 🧩 Part 5: Logical Reasoning Puzzle

**Scenario:** Your app needs to solve logic puzzles that require careful step-by-step thinking.

**What this means:** Problems like "If all A are B, and all B are C, what can we conclude about A and C?" require logical deduction, not just pattern matching.

In [20]:
logic_puzzle = """
Three friends - Alice, Bob, and Carol - are standing in a line.
- Alice is not first
- Bob is not last
- Carol is not in the middle

What is the order of the three friends? Make sure answer is concise under 200 words.
"""

print(f"[Logic Puzzle:]")
print(logic_puzzle)

solution = solve_with_reasoning(logic_puzzle, temperature=0.3)

print("✓ Solution:\n")
print(solution)

[Logic Puzzle:]

Three friends - Alice, Bob, and Carol - are standing in a line.
- Alice is not first
- Bob is not last
- Carol is not in the middle

What is the order of the three friends? Make sure answer is concise under 200 words.

✓ Solution:

Here's the solution:

1.  **Understand the problem:** We need to determine the order of Alice, Bob, and Carol in a line, given three constraints.

2.  **Break down the problem:**
    *   List all possible orders.
    *   Eliminate orders based on each constraint.

3.  **Solve each step:**

    *   **Possible orders:** ABC, ACB, BAC, BCA, CAB, CBA (6 possibilities)
    *   **Alice is not first:** Eliminate ABC and ACB. Remaining: BAC, BCA, CAB, CBA
    *   **Bob is not last:** Eliminate CAB. Remaining: BAC, BCA, CBA
    *   **Carol is not in the middle:** Eliminate BCA. Remaining: BAC, CBA

4.  **Verify the answer:**
    *   **BAC:** Alice is last, Bob is first, Carol is in the middle. This satisfies all conditions
    *   **CBA:** Alice is

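### ❓ Discussion Question #2

Notice how the AI needs to:
1. List the constraints
2. Try different arrangements
3. Check each against the constraints
4. Find the valid solutions

This is close to how a human would solve it! Note that two orders survive the elimination (BAC and CBA), so the puzzle as worded has no single answer, and the model's verification step misstates the positions in BAC (Bob is first, Alice is in the middle, Carol is last). Step-by-step output is still worth checking. What other types of logic puzzles could benefit from this approach?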
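
For a puzzle this small, the elimination can also be checked exhaustively. The sketch below enumerates all six orderings with `itertools.permutations` and keeps those that satisfy the three constraints; `satisfies_constraints` is a hypothetical helper written for this check.

```python
from itertools import permutations

friends = ["Alice", "Bob", "Carol"]


def satisfies_constraints(order: tuple[str, ...]) -> bool:
    """Check the three constraints from the puzzle against one ordering."""
    return (
        order[0] != "Alice"      # Alice is not first
        and order[-1] != "Bob"   # Bob is not last
        and order[1] != "Carol"  # Carol is not in the middle
    )


valid_orders = [order for order in permutations(friends) if satisfies_constraints(order)]
for order in valid_orders:
    print(" -> ".join(order))
```

The search finds two valid orders (Bob, Alice, Carol and Carol, Bob, Alice), so the three constraints alone do not pin down a unique answer.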

---

## 💼 Real-World Applications

### 1. Math Tutoring Bot
**What it does:** Helps students learn by showing step-by-step solutions

**How to build it:**
- Use chain-of-thought prompting for all problems
- Show reasoning steps, not just answers
- Ask follow-up questions to check understanding

In [21]:
# Example: Math Tutoring Bot
def math_tutor(problem: str) -> str:
    """Tutoring bot that explains step-by-step."""
    tutor_prompt = f"""
You are a patient math tutor. A student asks:

{problem}

Provide a step-by-step explanation that helps them learn, not just the answer.
Use simple language and encourage them to think through each step. Make the answer under 200 words.
"""
    
    response = completion(
        model=DEFAULT_MODEL,
        messages=[{"role": "user", "content": tutor_prompt}],
        temperature=0.5,
        max_tokens=DEFAULT_MAX_TOKENS
    )
    
    return response.choices[0].message.content

# Test it
student_question = "How do I solve: 2x + 5 = 15?"
print(f"[Student:] {student_question}\n")

tutor_response = math_tutor(student_question)
print("✓ Tutor:\n")
print(tutor_response)

[Student:] How do I solve: 2x + 5 = 15?

✓ Tutor:

Okay, let's tackle this equation together! Think of an equation like a balanced scale. We want to find the value of 'x' that keeps the scale balanced.

**Step 1: Isolate the term with 'x'.**

Right now, we have "2x + 5". We want to get rid of that "+ 5". The opposite of adding 5 is subtracting 5. So, let's subtract 5 from *both* sides of the equation to keep it balanced:

2x + 5 - 5 = 15 - 5

This simplifies to:

2x = 10

**Step 2: Solve for 'x'.**

Now we have "2x = 10". This means "2 times x equals 10". To find 'x', we need to do the opposite of multiplying by 2, which is dividing by 2. So, divide *both* sides by 2:

2x / 2 = 10 / 2

This simplifies to:

x = 5

**Therefore, the solution is x = 5!**

Does that make sense? We essentially "undid" the operations to get 'x' by itself. Let me know if you'd like to try another example!



### 2. Code Review Assistant
**What it does:** Reviews code and explains potential issues with reasoning

**What this means:** Instead of just saying "This is wrong," it explains:
- What the code does
- What could go wrong
- Why it's a problem
- How to fix it

In [22]:
# Example: Code Review Assistant
code_to_review = '''
def divide_list(numbers, divisor):
    result = []
    for num in numbers:
        result.append(num / divisor)
    return result
'''

review_prompt = f"""
Review this code for potential issues:
{code_to_review}

Think through:
1. What does this code do?
2. What inputs could cause problems?
3. What are the edge cases?
4. How should it be improved?

Provide a thorough code review with reasoning.
"""

print(f"[Code to Review:]")
print(code_to_review)
print()

review = solve_with_reasoning(review_prompt, temperature=0.3, max_tokens=2000)
print("✓ Code Review:")
print(review)

[Code to Review:]

def divide_list(numbers, divisor):
    result = []
    for num in numbers:
        result.append(num / divisor)
    return result


✓ Code Review:
Okay, let's break down this code review step-by-step.

**1. Understanding the Task**

The task is to review the provided Python code snippet, identify potential issues, edge cases, and suggest improvements.  We need to consider what the code does, what could go wrong, and how to make it more robust and reliable.

**2. Breaking Down the Problem**

Here's a breakdown of the steps we'll take:

*   **Analyze the Code:** Understand the code's purpose and functionality.
*   **Identify Potential Issues:** Look for potential errors, exceptions, or unexpected behavior.
*   **Consider Edge Cases:** Think about unusual or boundary conditions that might cause problems.
*   **Suggest Improvements:** Propose modifications to address the identified issues and improve the code's quality.
*   **Provide Reasoning:** Explain the rationale 
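
One improvement a review of `divide_list` commonly arrives at is guarding against a zero divisor. The version below is a sketch of that idea, not the model's actual output (which is cut off above); the name `divide_list_safe` and the choice to raise `ValueError` are assumptions made for illustration.

```python
def divide_list_safe(numbers: list[float], divisor: float) -> list[float]:
    """Divide every number by `divisor`, failing early with a clear message."""
    if divisor == 0:
        # Fail before any division instead of raising ZeroDivisionError mid-way
        raise ValueError("divisor must not be zero")
    return [num / divisor for num in numbers]


print(divide_list_safe([10, 20, 30], 5))

try:
    divide_list_safe([1, 2], 0)
except ValueError as exc:
    print(f"Rejected: {exc}")
```

Whether to raise, skip, or return a sentinel for a zero divisor is a design decision; raising early keeps the failure visible to the caller.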

### 3. Decision Analysis Tool
**What it does:** Helps users make complex decisions by analyzing pros/cons

**How to build it:**
- Present the decision scenario
- Ask AI to list all factors
- Weigh pros and cons systematically
- Provide a reasoned recommendation
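
Weighing factors systematically can also be done outside the model with a simple weighted score. In the sketch below, every weight and 1-5 rating is a made-up placeholder chosen purely for illustration; none of them is derived from the job offers in this notebook, and they should be replaced by the user's own priorities.

```python
# Hypothetical weights (sum to 1.0) and 1-5 ratings; replace with your own
weights = {"pay": 0.4, "stability": 0.3, "growth": 0.3}
ratings = {
    "Option A": {"pay": 4, "stability": 2, "growth": 5},
    "Option B": {"pay": 3, "stability": 5, "growth": 3},
}


def weighted_score(option_ratings: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of rating * weight over all factors."""
    return sum(option_ratings[factor] * weight for factor, weight in weights.items())


scores = {name: weighted_score(r, weights) for name, r in ratings.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A score like this does not replace the model's reasoning, but it makes the trade-offs explicit and gives the recommendation something concrete to be checked against.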

In [23]:
# Example: Decision Analysis
decision = """
I'm deciding between two job offers:

Job A:
- Salary: $120,000
- Location: San Francisco (high cost of living)
- Startup with 20 people
- Equity: 0.5%
- Exciting product but risky

Job B:
- Salary: $100,000
- Location: Austin (moderate cost of living)
- Established company with 500 people
- No equity
- Stable but less exciting

Help me think through this decision.
"""

analysis_prompt = f"""
{decision}

Analyze this decision step-by-step:
1. List all important factors
2. Compare each factor
3. Consider long-term implications
4. Identify what matters most
5. Provide a reasoned recommendation

Keep the analysis per step concise.
"""

print(f"[Decision Scenario:]")
print(decision)

analysis = solve_with_reasoning(analysis_prompt, temperature=0.5, max_tokens=2000)
print("✓ Analysis:\n")
print(analysis)

[Decision Scenario:]

I'm deciding between two job offers:

Job A:
- Salary: $120,000
- Location: San Francisco (high cost of living)
- Startup with 20 people
- Equity: 0.5%
- Exciting product but risky

Job B:
- Salary: $100,000
- Location: Austin (moderate cost of living)
- Established company with 500 people
- No equity
- Stable but less exciting

Help me think through this decision.

✓ Analysis:

Okay, let's break down this job offer decision step-by-step.

**1. Understand what is being asked:**

The user wants help deciding between two job offers by analyzing the important factors, comparing them, considering long-term implications, identifying priorities, and ultimately receiving a recommendation.

**2. Break down the problem into steps:**

We will follow the user's suggested steps:

*   **Step 1:** List all important factors.
*   **Step 2:** Compare each factor.
*   **Step 3:** Consider long-term implications.
*   **Step 4:** Identify what matters most.
*   **Step 5:** Provide

---

## 🚀 Final Challenge: Build a Problem-Solving Assistant

**Your Task:** Create an assistant that can solve different types of problems:
1. Math problems
2. Logic puzzles
3. Code debugging

**Requirements:**
- Detect the problem type
- Use appropriate reasoning strategy for each type
- Show step-by-step reasoning
- Verify the answer

**Test with:**
- "What is 15% of 240?"
- "If all cats are animals, and some animals are pets, are all cats pets?"
- A buggy code snippet
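
As a possible starting point for the detection step, the sketch below classifies a problem with a few rough keyword rules. The function name `detect_problem_type` and the rules themselves are assumptions for illustration; a real assistant would likely need something more robust (for example, asking the model to classify the problem).

```python
import re


def detect_problem_type(problem: str) -> str:
    """Classify a problem as 'code', 'math', or 'logic' using rough keyword rules."""
    # Code: look for common Python keywords followed by whitespace
    if re.search(r"\b(def|return|import|class)\s", problem):
        return "code"
    # Math: any digit is treated as a sign of a numeric question
    if re.search(r"\d", problem):
        return "math"
    # Everything else falls back to logic
    return "logic"


examples = [
    "What is 15% of 240?",
    "If all cats are animals, and some animals are pets, are all cats pets?",
    "def is_even(n):\n    return n % 2 == 1",
]
for example in examples:
    print(detect_problem_type(example), "<-", example.splitlines()[0])
```

The code check runs first because code snippets usually contain digits too and would otherwise be labeled as math.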

In [13]:
# Your experimentation space for the Final Challenge

# Example structure:
# def problem_solver(problem: str):
#     # Detect problem type
#     # Choose reasoning strategy
#     # Solve with step-by-step reasoning
#     # Return solution

# Test with different problem types

---

## 🎓 Key Takeaways

Congratulations! You now understand reasoning models:

1. **Chain-of-Thought** - Asking AI to show its work improves accuracy
2. **Step-by-Step** - Breaking problems into steps helps solve complex tasks
3. **Self-Verification** - AI can check its own answers
4. **Transparency** - Seeing reasoning helps trust and debug
5. **Real-world Power** - Tutoring, debugging, decision analysis

### 🔑 Best Practices:

| Problem Type | Temperature | Max Tokens | Reasoning Depth |
|--------------|-------------|------------|----------------|
| Math | Low (0.2-0.3) | 2000+ | High - show all steps |
| Logic Puzzles | Low (0.3) | 1500+ | High - systematic |
| Code Debugging | Low (0.3) | 2000+ | High - trace execution |
| Decision Analysis | Medium (0.5) | 2000+ | Medium - balanced |
| Creative Problem | Higher (0.7) | 1500+ | Medium - flexible |
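
The table can be turned into a small lookup so each call picks its settings by problem type. This is a sketch: the dictionary keys and the `settings_for` helper are hypothetical names, "2000+" is stored as 2000, and the math range 0.2-0.3 is stored as 0.3.

```python
# Settings transcribed from the table above; "+" values are stored as minimums
RECOMMENDED_SETTINGS = {
    "math": {"temperature": 0.3, "max_tokens": 2000},
    "logic": {"temperature": 0.3, "max_tokens": 1500},
    "code": {"temperature": 0.3, "max_tokens": 2000},
    "decision": {"temperature": 0.5, "max_tokens": 2000},
    "creative": {"temperature": 0.7, "max_tokens": 1500},
}


def settings_for(problem_type: str) -> dict:
    """Return a copy of the settings for a problem type, defaulting to 'math'."""
    return dict(RECOMMENDED_SETTINGS.get(problem_type, RECOMMENDED_SETTINGS["math"]))


print(settings_for("logic"))
print(settings_for("unknown"))
```

Since `solve_with_reasoning` accepts `temperature` and `max_tokens` as keyword arguments, the result can be passed along as `solve_with_reasoning(problem, **settings_for("logic"))`.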

### ⚠️ Important Considerations:

- **Cost**: Reasoning models use more tokens (longer responses)
- **Speed**: Step-by-step reasoning takes longer
- **Not always needed**: Simple questions don't need complex reasoning
- **Verification**: AI can still make mistakes - always verify critical answers
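
Longer reasoning also means a low `max_tokens` value can cut an answer off mid-sentence. The sketch below checks for that, assuming the response follows the OpenAI-style shape in which `finish_reason == "length"` marks a response that hit the token limit; `was_truncated` is a hypothetical helper, and the stand-in objects only mimic that shape so no API call is needed.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit.

    Assumes an OpenAI-style response object where
    response.choices[0].finish_reason == "length" signals truncation.
    """
    return response.choices[0].finish_reason == "length"


# Stand-in objects that mimic the response shape, so no API call is needed here
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])

print(was_truncated(cut_off), was_truncated(complete))
```

When the check returns `True`, retrying with a higher `max_tokens` or asking for a shorter answer are two simple options.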

### 🚀 Next Steps:

- **Try** specialized reasoning models (DeepSeek-R1, Qwen-QwQ, OpenAI o1)
- **Combine** reasoning with tools for more powerful agents
- **Build** educational applications that teach through reasoning
- **Experiment** with different prompting strategies

---

## 📚 Additional Resources

- [OpenAI: Learning to Reason with LLMs](https://openai.com/index/learning-to-reason-with-llms/)
- [Chain-of-Thought Prompting Paper](https://arxiv.org/abs/2201.11903)
- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1)

Happy building! 🎉