# 🤖 Build a Self-Reflecting AI Agent
**UofT AI Agents Club - Advanced Track**

Build an AI that can **critique and improve its own responses** - just like how you debug code!

## What We'll Build
- AI that generates solutions → critiques them → improves them
- See it work on real CS problems
- Experiment with different strategies

**Time**: 20-25 minutes | **Level**: Intermediate

Let's code! 🚀

# 📑 Table of Contents
- [Setup & Imports](#setup)
- [Core Demo: Reflection Process](#core-demo)
- [Style Comparison](#style-comparison)
- [Domain Adaptation](#domain-adaptation)
- [Custom Experiment](#custom-experiment)
- [Summary & Next Steps](#summary)

---


## 📚 Step 1: Understanding Self-Reflection

Think about how you solve problems:
1. **First attempt** - You try something
2. **Check your work** - You review what you did
3. **Improve it** - You make it better
4. **Repeat** until satisfied

That's exactly what we're teaching AI to do!

## 💡 The Core Idea

It's like **automated code review** but for any problem:

```
1. AI writes initial solution
2. AI critiques its own work  
3. AI improves based on critique
4. Repeat until good enough
```

**Real-world analogy**: When you write code, run tests, find bugs, fix them, repeat.

**Why it works**: Multiple passes catch more issues than one-shot responses.

## 🛠️ Setup & Imports

Let's import what we need and set up our environment:

<a id="core-demo"></a>

## 🚀 Core Demo: See the Full Reflection Process

This section demonstrates the full self-reflection loop on a real problem. You'll see:
- The initial response
- Each critique and refinement
- The final improved answer

Expand the details below to see the full trace, or just read the summary for a quick overview.


In [1]:
import sys
import os

# Add the parent directory so we can import our simple library
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

# Import our simple library
from lib.agents import SimpleReflectiveAgent, AISimulator

print("\U0001F4E6 Simple library imported successfully!")
print("\U0001F9F1 Ready to build our reflective agent!")

# Quick test to make sure everything works
test_simulator = AISimulator()
test_response = test_simulator.generate_response("test problem", "balanced")
print(f"✅ Test response: {test_response}")

📦 Simple library imported successfully!
🧱 Ready to build our reflective agent!
✅ Test response: Balanced Solution: Analyze the problem systematically by breaking it into smaller components, researching existing solutions, and implementing a well-tested approach. Consider both immediate implementation needs and long-term maintainability.


In [2]:
# Helper function to print the full trace of the reflection process

def print_full_trace(agent):
    trace = agent.get_trace()
    for step in trace:
        print(f"\n--- Iteration {step['iteration']} ---")
        print(f"Response:\n{step['response']}\n")
        if step['critiques']:
            print("Critiques:")
            for c in step['critiques']:
                print(f"- {c}")
        print("-----------------------------")


## 💡 Understanding the Full Response Display

**Good News**: You'll now see the **complete final response** after each reflection process!

**What You'll See**:
- 🔄 **During iterations**: Abbreviated progress (to keep output clean)
- 📋 **At the end**: **COMPLETE FINAL RESULT** with the full improved response
- 📊 **Analysis**: Metrics showing how the response evolved

This way you can follow the thinking process AND see the complete final answer!

## 💡 Example Questions to Try

Here are some great problems to test our reflective agent. Each is designed to showcase different domain-specific reasoning:

**🤖 Machine Learning Questions:**
- "How should I approach a classification problem with unbalanced data?"
- "What's the best way to prevent overfitting in neural networks?"
- "How do I choose between different ML algorithms for my dataset?"

**⚙️ Systems & Architecture:**
- "How do I design a scalable microservices architecture?"
- "What's the best way to handle database scaling in a growing application?"
- "How should I implement caching in a distributed system?"

**🐛 Debugging & Problem-Solving:**
- "My Python application is running slowly, how do I diagnose the issue?"
- "How do I debug a memory leak in my program?"
- "What's the systematic approach to troubleshooting network connectivity issues?"

**📊 Algorithms & Data Structures:**
- "How do I optimize a recursive algorithm to avoid stack overflow?"
- "What's the best approach to implement a LRU cache efficiently?"
- "How should I choose between different sorting algorithms for my use case?"

**🚀 Learning & Career:**
- "What's the most effective way to learn system design for interviews?"
- "How should I structure my coding practice for technical interviews?"
- "What's the best roadmap for transitioning from web development to AI/ML?"

Feel free to use any of these or create your own! The agent will adapt its reasoning style to each domain.

## 🔧 Core Components

Our agent has 3 simple parts:
- **Generator**: Creates responses
- **Evaluator**: Finds problems  
- **Refiner**: Fixes issues

Let's build them:

# 🧪 Test It Out: See the Agent in Action!

from lib.agents import SimpleReflectiveAgent
from IPython.display import display, Markdown

# Choose a real problem to test
problem = "How should I approach a classification problem with unbalanced data?"

# Create the agent
agent = SimpleReflectiveAgent()

# Run the reflection process (suppress verbose output)
final_result = agent.solve(problem, max_iterations=2, style="technical", verbose=False)

# Show the final result
print("\n🎯 Final Result:")
print(final_result)

# Show the full trace in a collapsible Markdown block
trace_md = """<details><summary>🔎 Show Full Reflection Trace</summary>\n"""
for step in agent.get_trace():
    trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"
    trace_md += f"<b>Response:</b> {step['response']}<br>"
    if step['critiques']:
        trace_md += "<b>Critiques:</b><ul>"
        for c in step['critiques']:
            trace_md += f"<li>{c}</li>"
        trace_md += "</ul>"
    trace_md += "<hr>"
trace_md += "</details>"
display(Markdown(trace_md))

# Show analysis of the reflection process
agent.print_analysis()

## 🔬 Try Different Problems

Let's start with a simple problem to see how reflection works:

<a id="style-comparison"></a>

## 🎨 Style Comparison: Technical vs Creative vs Systematic

This section compares how the agent responds to the same problem using different reasoning styles. For each style, you'll see the final answer and a summary. Expand the details to see the full trace if you're curious.


In [None]:
# 🎯 Your Turn: Experiment!

# Try your own problem - pick from our suggested questions or create your own!
your_problem = "What's the best way to prevent overfitting in neural networks?"

print("🔍 Your Problem:", your_problem)
print("\n" + "="*50)

# 🎨 Style Comparison: Technical vs Creative vs Systematic

styles = ["technical", "creative", "systematic"]
results = {}
from IPython.display import display, Markdown
for style in styles:
    print(f"\n{style.title()} style:")
    result = agent.solve(your_problem, max_iterations=1, style=style, verbose=False)
    results[style] = result
    trace_md = f"<details><summary>Show Full Trace ({style.title()})</summary>\n"
    for step in agent.get_trace():
        trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"
        trace_md += f"<b>Response:</b> {step['response']}<br>"
        if step['critiques']:
            trace_md += "<b>Critiques:</b><ul>"
            for c in step['critiques']:
                trace_md += f"<li>{c}</li>"
            trace_md += "</ul>"
        trace_md += "<hr>"
    trace_md += "</details>"
    display(Markdown(trace_md))

# Show a summary of the different approaches
print("\n📋 Style Comparison Summary:")
for style in styles:
    print(f"• {style.title()}: {len(results[style])} characters")

# Test 2: Debugging Problem
print("\n" + "="*60)
problem2 = "My code has a bug, how do I debug it?"
result2 = agent.solve(problem2, max_iterations=2)

# Test 3: System Design
print("\n" + "="*60)
problem3 = "How do I design a scalable web application?"
result3 = agent.solve(problem3, max_iterations=2)

print("\n🎉 All tests completed!")

🔍 Your Problem: What's the best way to learn machine learning?

📊 Comparing different reflection styles:

1. Technical style:
🔍 Problem: What's the best way to learn machine learning?
⚙️ Style: technical | Critique: constructive
------------------------------------------------------------

🔄 Iteration 1
💡 Current response: **Technical Analysis:** Begin with Andrew Ng's ML course, then practice on Kaggle competitions. Focus on understanding the fundamentals before jumping to advanced topics like neural networks. Dive into mathematical foundations, gradient computations, loss function analysis, and hyperparameter optimization strategies.
🔍 Critiques: Missing clear methodology or actionable approach, Algorithm discussions should address complexity analysis, Could include industry best practices or recommendations, Should mention testing or quality assurance aspects
✨ Refined: **Methodology:** **Technical Analysis:** Begin with Andrew Ng's ML course, then practice on Kaggle c...

🎯 Final r

<a id="domain-adaptation"></a>

## 🌍 Domain Adaptation: How the Agent Handles Different Problem Types

This section tests the agent on problems from different CS domains. For each, you'll see the final answer and a summary. Expand the details to see the full trace if you want to dig deeper.


In [None]:
# Experiment 2: Test different problem domains
print("\n" + "="*60)
print("🔬 Testing Different Problem Domains:")

# 🌍 Domain Adaptation: How the Agent Handles Different Problem Types

domain_problems = [
    ("Debugging Problem", "My code has a bug, how do I debug it?", "systematic"),
    ("System Design Problem", "How do I design a scalable web application?", "technical"),
    ("Learning Strategy Problem", "What's the best way to prepare for coding interviews?", "creative")
]
from IPython.display import display, Markdown
for label, problem, style in domain_problems:
    print(f"\n{label}:")
    result = agent.solve(problem, max_iterations=2, style=style, verbose=False)
    print("Final Result:")
    print(result)
    trace_md = f"<details><summary>Show Full Trace ({label})</summary>\n"
    for step in agent.get_trace():
        trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"
        trace_md += f"<b>Response:</b> {step['response']}<br>"
        if step['critiques']:
            trace_md += "<b>Critiques:</b><ul>"
            for c in step['critiques']:
                trace_md += f"<li>{c}</li>"
            trace_md += "</ul>"
        trace_md += "<hr>"
    trace_md += "</details>"
    display(Markdown(trace_md))

<a id="custom-experiment"></a>

## 🎮 Your Custom Experiment

Design your own experiment! Enter your own problem, pick a style, and see both the final result and the full trace (expandable). This is a great way to test the agent on questions that matter to you.


In [None]:
# 🎮 Design Your Own Experiment!

custom_problem = "How can I become a better programmer?"
custom_style = "systematic"
custom_iterations = 2

custom_result = agent.solve(custom_problem, max_iterations=custom_iterations, style=custom_style, verbose=False)

from IPython.display import display, Markdown
trace_md = """<details><summary>Show Full Trace (Custom Experiment)</summary>\n"""
for step in agent.get_trace():
    trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"
    trace_md += f"<b>Response:</b> {step['response']}<br>"
    if step['critiques']:
        trace_md += "<b>Critiques:</b><ul>"
        for c in step['critiques']:
            trace_md += f"<li>{c}</li>"
        trace_md += "</ul>"
    trace_md += "<hr>"
trace_md += "</details>"
display(Markdown(trace_md))

print("\n🎯 Final Result:")
print(custom_result)

print("\n📊 Your Experiment Results:")
print(f"• Final response length: {len(custom_result)} characters")
print(f"• Number of reflection iterations: {len(agent.get_trace())}")
print(f"• Style used: {custom_style}")

print("\n🔍 Reflection Process Analysis:")
agent.print_analysis()

print("\n💡 Try This Next:")
print("• Change the custom_problem to something you're curious about")
print("• Try different styles to see how they affect the response")
print("• Experiment with different iteration counts (1-5)")
print("• Compare results for technical vs. creative problems")

## 🎉 Congratulations!

You've successfully built and tested a self-reflecting AI agent! 🧠✨

### 🎯 What Just Happened?

You built an AI that:
1. **Generated** initial responses to CS problems
2. **Critiqued** its own work (found issues)
3. **Improved** responses based on self-critique
4. **Iterated** until satisfactory
5. **Adapted** to different problem domains and styles

### 🚀 Key Insights
- **Multiple iterations** → better quality responses
- **Self-critique** catches issues humans miss
- **Different styles** produce different approaches
- **Domain adaptation** allows flexible problem-solving
- **Modular design** makes it easy to improve each component

### 🔬 What You've Learned
- How production AI systems like ChatGPT use iterative refinement
- The architecture of self-reflecting AI agents
- How to configure agents for different problem types
- Practical prompt engineering and evaluation techniques

### 🔥 What's Next?

**Ready for more advanced techniques?**
- [`advanced_experiments.ipynb`](advanced_experiments.ipynb) - Multi-strategy agents & performance analysis
- [`research_extensions.ipynb`](research_extensions.ipynb) - Meta-reflection & cutting-edge research

**Want to build your own agents?**
- Modify the `lib/agents.py` file to add new strategies
- Experiment with real LLM APIs (OpenAI, Anthropic)
- Apply these patterns to your own projects

**Ready to join the AI agents community?** This is just the beginning! 🌟

# 🎮 Design Your Own Experiment!

# Step 1: Choose your problem (edit this!)
custom_problem = "How can I become a better programmer?"

# Step 2: Choose your style (try: 'technical', 'creative', 'systematic', 'balanced')
custom_style = "systematic"

# Step 3: Choose iteration count (1-3 recommended for learning)
custom_iterations = 2

print(f"🚀 Your Custom Experiment:")
print(f"Problem: {custom_problem}")
print(f"Style: {custom_style}")
print(f"Iterations: {custom_iterations}")
print("\n" + "="*50)

# Run your experiment
custom_result = agent.solve(custom_problem, max_iterations=custom_iterations, style=custom_style)

# Show the full trace of the reflection process
print("\n================ FULL TRACE ================")
print_full_trace(agent)

# Analyze your results
print("\n📊 Your Experiment Results:")
print(f"• Final response length: {len(custom_result)} characters")
print(f"• Number of reflection iterations: {len(agent.get_trace())}")
print(f"• Style used: {custom_style}")

# Show the reflection trace analysis
print("\n🔍 Reflection Process Analysis:")
agent.print_analysis()

print("\n💡 Try This Next:")
print("• Change the custom_problem to something you're curious about")
print("• Try different styles to see how they affect the response")
print("• Experiment with different iteration counts (1-5)")
print("• Compare results for technical vs. creative problems")

<a id="summary"></a>

## 🎉 Summary & Next Steps

You've completed the workshop! You now know how to:
- Build a self-reflecting agent
- Compare reasoning styles
- See the full improvement process (when you want it)
- Run your own experiments

**Next:**
- Try the advanced experiments notebook for more strategies and performance analysis
- Explore the research extensions for meta-reflection and cutting-edge ideas

Happy experimenting! 🚀


# Experiment 2: Test different problem domains
print("\n" + "="*60)
print("🔬 Testing Different Problem Domains:")

domain_problems = [
    ("Debugging Problem", "My code has a bug, how do I debug it?", "systematic"),
    ("System Design Problem", "How do I design a scalable web application?", "technical"),
    ("Learning Strategy Problem", "What's the best way to prepare for coding interviews?", "creative")
]






















print("\n✨ Notice how the agent adapts its reasoning to different problem types!")    display(Markdown(trace_md))    trace_md += "</details>"        trace_md += "<hr>"            trace_md += "</ul>"                trace_md += f"<li>{c}</li>"            for c in step['critiques']:            trace_md += "<b>Critiques:</b><ul>"        if step['critiques']:        trace_md += f"<b>Response:</b> {step['response']}<br>"        trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"    for step in agent.get_trace():    trace_md = f"<details><summary>Show Full Trace ({label})</summary>\n"    print(result)    print("Final Result:")    result = agent.solve(problem, max_iterations=2, style=style)    print(f"\n{label}:")for label, problem, style in domain_problems:from IPython.display import display, Markdown
# 🎮 Design Your Own Experiment!

# Step 1: Choose your problem (edit this!)
custom_problem = "How can I become a better programmer?"

# Step 2: Choose your style (try: 'technical', 'creative', 'systematic', 'balanced')
custom_style = "systematic"

# Step 3: Choose iteration count (1-3 recommended for learning)
custom_iterations = 2

print(f"🚀 Your Custom Experiment:")
print(f"Problem: {custom_problem}")
print(f"Style: {custom_style}")

















print("• Compare results for technical vs. creative problems")print("• Experiment with different iteration counts (1-5)")print("• Try different styles to see how they affect the response")print("• Change the custom_problem to something you're curious about")print("\n💡 Try This Next:")agent.print_analysis()print("\n🔍 Reflection Process Analysis:")# Show the reflection trace analysisprint(f"• Style used: {custom_style}")print(f"• Number of reflection iterations: {len(agent.get_trace())}")print(f"• Final response length: {len(custom_result)} characters")

print("\n📊 Your Experiment Results:")# Analyze your results

display(Markdown(trace_md))trace_md += "</details>"
    trace_md += "<hr>"

        trace_md += "</ul>"            trace_md += f"<li>{c}</li>"
        for c in step['critiques']:
        trace_md += "<b>Critiques:</b><ul>"

    if step['critiques']:    trace_md += f"<b>Response:</b> {step['response']}<br>"
    trace_md += f"\n<b>Iteration {step['iteration']}</b><br>"
for step in agent.get_trace():
trace_md = """<details><summary>Show Full Trace (Custom Experiment)</summary>\n"""

from IPython.display import display, Markdownprint(custom_result)print("\n🎯 Final Result:")
custom_result = agent.solve(custom_problem, max_iterations=custom_iterations, style=custom_style)print(f"Iterations: {custom_iterations}")


# Run your experimentprint("\n" + "="*50)