# 🤖 Building Your First Agent - Practice Notebook

This notebook guides you through building a Reflection Agent step by step.

**What you'll build:** An agent that answers questions and improves its answers through self-reflection.

## Setup

Run the cell below to import what we need.

In [3]:
# Imports
from datetime import datetime
from typing import List, Dict, Optional
import json

print("✅ Imports complete!")

✅ Imports complete!


## Part 1: Build Memory Systems

First, let's create the memory components.

### Short-Term Memory
This holds the current context: the most recent actions, capped at `max_size` items. Older items are dropped first.

In [5]:
class ShortTermMemory:
    """Working memory for current task"""
    
    def __init__(self, max_size: int = 5):
        self.max_size = max_size
        self.context = []
    
    def add(self, item: Dict):
        """Add item to working memory"""
        self.context.append(item)
        
        # Keep only recent items
        if len(self.context) > self.max_size:
            self.context = self.context[-self.max_size:]
    
    def get(self) -> List[Dict]:
        """Get current context"""
        return self.context
    
    def clear(self):
        """Clear working memory"""
        self.context = []

# Test it
stm = ShortTermMemory(max_size=3)
stm.add({"step": 1, "action": "search"})
stm.add({"step": 2, "action": "read"})
print("Short-term memory:", stm.get())
print("✅ Short-term memory works!")

Short-term memory: [{'step': 1, 'action': 'search'}, {'step': 2, 'action': 'read'}]
✅ Short-term memory works!
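
The trimming in `add` can also be delegated to the standard library. The following is a minimal sketch, assuming a `collections.deque` with `maxlen` is an acceptable substitute for the list slice. `DequeShortTermMemory` is a hypothetical name used only for this illustration; the rest of the notebook keeps using `ShortTermMemory`.

```python
from collections import deque
from typing import Deque, Dict, List


class DequeShortTermMemory:
    """Hypothetical variant of ShortTermMemory backed by a bounded deque."""

    def __init__(self, max_size: int = 5):
        # A deque with maxlen drops the oldest item automatically on append
        self.context: Deque[Dict] = deque(maxlen=max_size)

    def add(self, item: Dict) -> None:
        self.context.append(item)

    def get(self) -> List[Dict]:
        # Return a copy so callers cannot mutate the internal buffer
        return list(self.context)

    def clear(self) -> None:
        self.context.clear()


demo = DequeShortTermMemory(max_size=3)
for step in range(1, 6):
    demo.add({"step": step})

# Only the three most recent steps remain
print([item["step"] for item in demo.get()])  # [3, 4, 5]
```

One difference worth knowing: `get` here returns a copy, whereas the list-based class returns its internal list.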


### Episodic Memory
This logs all actions - like a diary of what the agent did.

In [6]:
class EpisodicMemory:
    """Log of all actions and results"""
    
    def __init__(self):
        self.episodes = []
    
    def log(self, action_type: str, details: Dict):
        """Log an action"""
        episode = {
            "timestamp": datetime.now().isoformat(),
            "action": action_type,
            "details": details
        }
        self.episodes.append(episode)
    
    def get_history(self) -> List[Dict]:
        """Get all episodes"""
        return self.episodes
    
    def get_summary(self) -> str:
        """Summarize the session"""
        summary = f"Total actions: {len(self.episodes)}\n"
        
        action_counts = {}
        for ep in self.episodes:
            action = ep["action"]
            action_counts[action] = action_counts.get(action, 0) + 1
        
        for action, count in action_counts.items():
            summary += f"  - {action}: {count}\n"
        
        return summary

# Test it
em = EpisodicMemory()
em.log("perceive", {"context": "ready"})
em.log("plan", {"action": "search"})
print("Episodic memory:")
print(em.get_summary())
print("✅ Episodic memory works!")

Episodic memory:
Total actions: 2
  - perceive: 1
  - plan: 1

✅ Episodic memory works!
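
The per-action tally in `get_summary` is a plain frequency count. As a sketch, the same text can be produced with `collections.Counter`. The `episodes` list below is made-up sample data in the shape that `log` produces (timestamps omitted), and `summarize` is a hypothetical helper, not part of the class above.

```python
from collections import Counter
from typing import Dict, List


def summarize(episodes: List[Dict]) -> str:
    """Build the same summary text as EpisodicMemory.get_summary."""
    counts = Counter(ep["action"] for ep in episodes)
    lines = [f"Total actions: {len(episodes)}"]
    # Counter keeps first-seen order, matching the dict-based version
    lines += [f"  - {action}: {count}" for action, count in counts.items()]
    return "\n".join(lines) + "\n"


# Sample data (illustrative only)
episodes = [
    {"action": "perceive", "details": {"context": "ready"}},
    {"action": "plan", "details": {"action": "search"}},
    {"action": "perceive", "details": {"context": "updated"}},
]

print(summarize(episodes))
```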


## Part 2: Create Mock LLM

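We'll use a mock LLM so you can test without API keys. It returns canned replies chosen by keywords in the prompt and by how many calls have been made so far.
Later you can replace it with real OpenAI/Anthropic calls.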

In [7]:
class MockLLM:
    """Mock LLM for testing without API keys"""
    
    def __init__(self):
        self.call_count = 0
    
    def complete(self, prompt: str) -> str:
        """Generate a response (simulated)"""
        self.call_count += 1
        print(f"\n💭 [LLM CALL #{self.call_count}]")
        print(f"Prompt preview: {prompt[:80]}...")
        
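        # Simulate responses. Critique prompts are recognised by their
        # "Rate quality" instruction. Matching on the word "critique" would
        # also catch improvement prompts, which quote the previous critique.
        if "rate quality" in prompt.lower():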
            if self.call_count <= 3:
                return "Quality: 7/10. The answer is decent but could be more detailed."
            else:
                return "Quality: 9/10. Excellent! Clear and comprehensive."
        else:
            if "difference" in prompt.lower() and "agent" in prompt.lower():
                if self.call_count == 1:
                    return "A language model responds to prompts. An AI agent can autonomously pursue goals using tools."
                else:
                    return """A language model is reactive - it responds to prompts but doesn't take actions.

An AI agent is proactive and autonomous:
1. Has goals it works toward
2. Plans multi-step solutions  
3. Uses tools (search, APIs, code execution)
4. Maintains memory
5. Learns through reflection

Example: ChatGPT is a model. An agent that searches, reads papers, and writes reports is an agent."""
            else:
                return f"This is a simulated answer for: {prompt[:100]}..."

llm = MockLLM()
print("✅ Mock LLM ready!")

✅ Mock LLM ready!
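
Keyword routing of this kind is sensitive to prompt wording. One alternative, sketched here under the assumption that a test only needs a fixed sequence of replies, is a scripted stand-in that serves canned responses in call order. `LLMClient`, `ScriptedLLM`, and `ask` are hypothetical names; the only contract taken from the notebook is a `complete(prompt) -> str` method.

```python
from typing import List, Protocol


class LLMClient(Protocol):
    """Anything with a complete(prompt) -> str method can drive the agent."""

    def complete(self, prompt: str) -> str: ...


class ScriptedLLM:
    """Hypothetical test double: replies are served in order, ignoring the prompt."""

    def __init__(self, replies: List[str]):
        self.replies = list(replies)
        self.call_count = 0
        self.prompts: List[str] = []  # recorded for later inspection

    def complete(self, prompt: str) -> str:
        if self.call_count >= len(self.replies):
            raise RuntimeError("ScriptedLLM ran out of replies")
        self.prompts.append(prompt)
        reply = self.replies[self.call_count]
        self.call_count += 1
        return reply


def ask(client: LLMClient, prompt: str) -> str:
    """Accepts any object that satisfies the LLMClient protocol."""
    return client.complete(prompt)


scripted = ScriptedLLM(["First draft.", "Quality: 7/10. Needs detail."])
print(ask(scripted, "Question: ..."))
print(ask(scripted, "Rate quality 1-10."))
```

Because the agent only ever calls `complete`, a wrapper around a real provider could be swapped in later without touching the agent code, as long as it exposes the same method.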


## Part 3: Build the Reflection Agent

Now for the main part: the agent with the full loop!

The complete agent uses:
- The agent loop (Perceive → Plan → Act → Reflect)
- Memory systems we just built
- Mock LLM to generate responses
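
The loop in the first bullet can be sketched independently of the class. This is a minimal illustration, not the notebook's implementation: `run_loop`, `parse_score`, the canned drafts and critiques, and the threshold of 9 are all assumptions made for the example. It parses the score as a number instead of testing for the substring `10`, since every critique in the `Quality: N/10` format contains those characters.

```python
import re
from typing import List, Optional

SCORE_PATTERN = re.compile(r"Quality:\s*(\d+)\s*/\s*10")


def parse_score(critique: str) -> Optional[int]:
    """Extract N from 'Quality: N/10', or None if the format is missing."""
    match = SCORE_PATTERN.search(critique)
    return int(match.group(1)) if match else None


def run_loop(drafts: List[str], critiques: List[str], threshold: int = 9) -> str:
    """Perceive -> Plan -> Act -> Reflect over canned drafts and critiques."""
    answer: Optional[str] = None
    for iteration, (draft, critique) in enumerate(zip(drafts, critiques), start=1):
        # Perceive: gather the current state
        context = {"answer": answer, "iteration": iteration}
        # Plan: generate on the first pass, improve afterwards
        action = "generate" if context["answer"] is None else "improve"
        # Act: a real agent would call the LLM here; this sketch uses a canned draft
        answer = draft
        # Reflect: stop once the parsed score reaches the threshold
        score = parse_score(critique)
        print(f"iteration {iteration}: {action}, score={score}")
        if score is not None and score >= threshold:
            break
    return answer or ""


final = run_loop(
    drafts=["short answer", "detailed answer"],
    critiques=["Quality: 7/10. Could be more detailed.", "Quality: 9/10. Clear."],
)
print(final)
```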

In [8]:
class ReflectionAgent:
    """Agent that improves answers through self-reflection"""
    
    def __init__(self, llm, max_iterations: int = 3):
        self.llm = llm
        self.max_iterations = max_iterations
        
        # Memory systems
        self.short_term = ShortTermMemory()
        self.episodic = EpisodicMemory()
        
        # Agent state
        self.current_question = None
        self.current_answer = None
        self.iteration = 0
        self.satisfied = False
    
    def run(self, question: str) -> str:
        """Main agent loop"""
        self.current_question = question
        self.iteration = 0
        self.satisfied = False
        
        print(f"\n{'='*60}")
        print(f"🎯 QUESTION: {question}")
        print(f"{'='*60}")
        
        while not self.satisfied and self.iteration < self.max_iterations:
            self.iteration += 1
            print(f"\n{'─'*60}")
            print(f"🔄 ITERATION {self.iteration}/{self.max_iterations}")
            print(f"{'─'*60}")
            
            # THE AGENT LOOP
            context = self.perceive()
            action = self.plan(context)
            result = self.act(action)
            self.reflect(result)
        
        print(f"\n{'='*60}")
        print(f"✅ FINAL ANSWER:")
        print(f"{'='*60}")
        print(self.current_answer)
        print(f"{'='*60}\n")
        
        return self.current_answer
    
    def perceive(self) -> Dict:
        """Gather current context"""
        context = {
            "question": self.current_question,
            "current_answer": self.current_answer,
            "iteration": self.iteration
        }
        
        self.episodic.log("perceive", {"iteration": self.iteration})
        print(f"👁️  [PERCEIVE] Gathering context...")
        
        return context
    
    def plan(self, context: Dict) -> Dict:
        """Decide what to do next"""
        if context["current_answer"] is None:
            action = {"type": "generate_answer"}
        else:
            action = {"type": "improve_answer"}
        
        self.episodic.log("plan", action)
        print(f"🧠 [PLAN] Action: {action['type']}")
        
        return action
    
    def act(self, action: Dict) -> Dict:
        """Execute the planned action"""
        if action["type"] == "generate_answer":
            result = self._generate_initial_answer()
        else:
            result = self._improve_answer()
        
        self.short_term.add({"action": action["type"], "success": result["success"]})
        self.episodic.log("act", {"action_type": action["type"]})
        
        return result
    
    def _generate_initial_answer(self) -> Dict:
        """Generate the first answer"""
        print(f"⚡ [ACT] Generating initial answer...")
        
        prompt = f"Question: {self.current_question}\n\nProvide a clear and accurate answer."
        answer = self.llm.complete(prompt)
        self.current_answer = answer
        
        print(f"✓ Generated answer ({len(answer)} chars)")
        
        return {"success": True, "answer": answer}
    
    def _improve_answer(self) -> Dict:
        """Improve based on critique"""
        print(f"⚡ [ACT] Improving answer based on critique...")
        
        recent = self.short_term.get()
        last_critique = None
        for item in reversed(recent):
            if "critique" in item:
                last_critique = item["critique"]
                break
        
        prompt = f"""Question: {self.current_question}
Current Answer: {self.current_answer}
Critique: {last_critique}

Improve the answer based on the critique."""
        
        improved = self.llm.complete(prompt)
        self.current_answer = improved
        
        print(f"✓ Improved answer ({len(improved)} chars)")
        
        return {"success": True, "answer": improved}
    
    def reflect(self, result: Dict):
        """Evaluate and decide next steps"""
        print(f"🤔 [REFLECT] Evaluating quality...")
        
        critique_prompt = f"""Question: {self.current_question}
Answer: {self.current_answer}

Critically evaluate this answer:
1. Is it accurate?
2. Is it complete?
3. What could be improved?

Rate quality 1-10. Format: Quality: [score]/10"""
        
        critique = self.llm.complete(critique_prompt)
        
        self.short_term.add({"critique": critique})
        self.episodic.log("reflect", {"iteration": self.iteration})
        
        print(f"📊 Critique: {critique[:60]}...")
        
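        # Parse the numeric score. A substring test such as `"10" in critique`
        # matches the "/10" in every critique and would always report success.
        try:
            score = int(critique.split("Quality:")[1].split("/")[0].strip())
        except (IndexError, ValueError):
            score = 0  # Unparseable critique: treat it as not good enough yet
        
        if score >= 9:
            self.satisfied = True
            print(f"✅ [REFLECT] Satisfied with quality!")
        elif self.iteration >= self.max_iterations:
            print(f"⏱️  [REFLECT] Max iterations reached")
        else:
            print(f"🔄 [REFLECT] Will improve in next iteration")
    
    def get_session_summary(self) -> str:
        return self.episodic.get_summary()

print("✅ ReflectionAgent class created!")

✅ ReflectionAgent class created!


## Part 4: Test Your Agent!

Now let's create an agent and run it with a test question.

In [9]:
# Create agent instance
agent = ReflectionAgent(llm, max_iterations=3)

# Test question
question = "What is the difference between a language model and an AI agent?"

# Run it!
final_answer = agent.run(question)


🎯 QUESTION: What is the difference between a language model and an AI agent?

────────────────────────────────────────────────────────────
🔄 ITERATION 1/3
────────────────────────────────────────────────────────────
👁️  [PERCEIVE] Gathering context...
🧠 [PLAN] Action: generate_answer
⚡ [ACT] Generating initial answer...

💭 [LLM CALL #1]
Prompt preview: Question: What is the difference between a language model and an AI agent?

Prov...
✓ Generated answer (92 chars)
🤔 [REFLECT] Evaluating quality...

💭 [LLM CALL #2]
Prompt preview: Question: What is the difference between a language model and an AI agent?
Answe...
📊 Critique: Quality: 7/10. The answer is decent but could be more detail...
🔄 [REFLECT] Will improve in next iteration

────────────────────────────────────────────────────────────
🔄 ITERATION 2/3
────────────────────────────────────────────────────────────
👁️  [PERCEIVE] Gathering context...
🧠 [PLAN] Action: improve_answer
⚡ [ACT] Improving answer based on critique...

💭 [LLM CALL #3]
Prompt preview: Question: What is the difference between a language model and an AI agent?
Curre...
✓ Improved answer (376 chars)
🤔 [REFLECT] Evaluating quality...

💭 [LLM CALL #4]
Prompt preview: Question: What is the difference between a language model and an AI agent?
Answe...
📊 Critique: Quality: 9/10. Excellent! Clear and comprehensive....
✅ [REFLECT] Satisfied with quality!

✅ FINAL ANSWER:
A language model is reactive - it responds to prompts but doesn't take actions.

An AI agent is proactive and autonomous:
1. Has goals it works toward
2. Plans multi-step solutions  
3. Uses tools (search, APIs, code execution)
4. Maintains memory
5. Learns through reflection

Example: ChatGPT is a model. An agent that searches, reads papers, and writes reports is an agent.



## View Session Summary

See how many actions the agent logged, broken down by type, and how many LLM calls it made.

In [10]:
print("\n📊 SESSION SUMMARY")
print("="*60)
print(agent.get_session_summary())
print(f"\nTotal LLM calls: {llm.call_count}")


📊 SESSION SUMMARY
Total actions: 8
  - perceive: 2
  - plan: 2
  - act: 2
  - reflect: 2


Total LLM calls: 4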


## Part 5: Save Session to File

Export the agent's work to a JSON file.

In [14]:
def save_session(agent, filename="agent_session.json"):
    """Save agent session to JSON file"""
    session_data = {
        "question": agent.current_question,
        "final_answer": agent.current_answer,
        "iterations": agent.iteration,
        "history": agent.episodic.get_history()
    }
    
    with open(filename, 'w') as f:
        json.dump(session_data, f, indent=2)
    
    print(f"✅ Session saved to {filename}")

# Use it
save_session(agent, "my_first_agent_session.json")

✅ Session saved to my_first_agent_session.json
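
To check the export, the file can be read back with `json.load`. This is a minimal sketch, assuming the four keys written by `save_session`. `load_session` is a hypothetical helper, and the sample data is written to a temporary directory so the example does not touch your working files.

```python
import json
import os
import tempfile
from typing import Dict

EXPECTED_KEYS = {"question", "final_answer", "iterations", "history"}


def load_session(filename: str) -> Dict:
    """Read a session file and check that the expected keys are present."""
    with open(filename, "r", encoding="utf-8") as f:
        data = json.load(f)
    missing = EXPECTED_KEYS - set(data)
    if missing:
        raise ValueError(f"Session file is missing keys: {sorted(missing)}")
    return data


# Illustrative data in the same shape as save_session's output
sample = {
    "question": "What is an agent?",
    "final_answer": "A system that pursues goals.",
    "iterations": 1,
    "history": [
        {"timestamp": "2024-01-01T00:00:00", "action": "perceive", "details": {}}
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f, indent=2)
    restored = load_session(path)

# The round trip should preserve the data exactly
print(restored == sample)
```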


## 🎉 Congratulations!

You've built your first complete AI agent!

**What you accomplished:**
- ✅ Implemented the agent loop (Perceive → Plan → Act → Reflect)
- ✅ Added memory systems (short-term + episodic)
- ✅ Created self-reflection capability
- ✅ Built a working autonomous system

**Next steps:**
- Modify the questions and observe behavior
- Try changing `max_iterations`
- Add real LLM API calls (OpenAI/Anthropic)
- Build more complex agents!

**🎯 Challenges:**
1. Add a confidence score to each answer
2. Compare first vs final answer side-by-side
3. Add more reflection criteria
4. Make the agent handle follow-up questions
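
As a possible starting point for challenge 2, the standard library's `difflib` can show what changed between two answers. This is only a sketch: the two strings are made-up examples, and capturing the agent's first answer (for instance by storing it after the first iteration) is left as part of the exercise. A unified diff is one way to compare; it is not a true side-by-side layout.

```python
import difflib

# Made-up answers for illustration
first = "A language model responds to prompts."
final = "A language model responds to prompts.\nAn agent also plans and uses tools."

diff = list(
    difflib.unified_diff(
        first.splitlines(),
        final.splitlines(),
        fromfile="first",
        tofile="final",
        lineterm="",  # Lines have no trailing newline, so add none to headers
    )
)

# Lines starting with "+" were added in the final answer
print("\n".join(diff))
```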