# Prompt Engineering Notebook: Agent Architectures
*Google Colab–Compatible — Author: ChatGPT (o3) — Date: 2025-07-09*

This interactive notebook introduces **agent architectures** for language‑model‑driven applications. It blends theory with hands‑on demos to help you design, implement, and critique agentic systems.

## Learning Objectives
By the end of this notebook you will be able to:
1. Describe the differences among **reactive**, **planning**, **memory‑augmented**, and **multi‑agent** architectures.
2. Implement a minimal reflex LLM agent that interacts with a user.
3. Extend an agent with tool use following the **ReAct** pattern.
4. Add simple long‑term memory via vector search.
5. Coordinate multiple specialized agents to solve a task.
6. Identify common pitfalls and safety issues in agent design.

## Setup
Run the next cell to install lightweight dependencies used in the examples. Feel free to skip it if you already have them installed.

In [None]:
#@title Install minimal dependencies
!pip install -q langchain langchain-community langchain-openai openai faiss-cpu python-dotenv

### Environment Variables
If you plan to call the OpenAI API, run the following cell and paste your key when prompted (or set `OPENAI_API_KEY` in the environment beforehand). Leave the prompt blank to skip; the demos that use the stubbed local model still work without a key.

In [None]:
import os, getpass
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', '') or getpass.getpass('OpenAI API Key (leave blank to skip): ')

## 1. Why Agent Architectures?
LLMs excel at single‑turn tasks, but many real‑world applications require **iterative reasoning, tool use, and memory**. Agent architectures wrap an LLM inside a control loop that repeatedly:
1. **Observes** the environment (context, state, user input).
2. **Thinks** using the LLM to decide on an action.
3. **Acts** by calling tools/APIs or replying.
4. **Learns** by updating memory or state.

### Common Agent Categories
| Category | Key Idea | Typical Use Cases |
|----------|---------|-------------------|
| **Reactive** | Responds one step at a time based on the current observation. | Chatbots, lightweight assistants |
| **Tool‑Using (ReAct)** | Interleaves reasoning (`Thought:`) with tool calls (`Action:`) and their results (`Observation:`). | Web search, code execution |
| **Planning** | Builds a multi‑step plan before acting. | Complex tasks, code generation, workflows |
| **Memory‑Augmented** | Reads from and writes to long‑term memory (vector stores or databases). | Personal assistants, tutoring |
| **Multi‑Agent** | Coordinates multiple specialized agents that collaborate via messages. | Research agents, software engineering |

### The Core Agent Loop

```python
while not task_complete:
    observation = observe()
    thought     = think_with_llm(observation)
    action      = interpret(thought)
    result      = act(action)
    learn_from(result)
```
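
The loop above is pseudocode. As a minimal runnable sketch, assume a toy counting task in which a fixed rule stands in for the LLM. The names `CounterEnv`, `think`, and `run_loop` are illustrative only and do not come from any library.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment: the task is complete once the counter reaches the target."""
    target: int = 3
    count: int = 0

    def observe(self) -> int:
        return self.count

    def act(self, action: str) -> str:
        if action == "increment":
            self.count += 1
        return f"count is now {self.count}"


def think(observation: int, target: int) -> str:
    """Stand-in for the LLM call: a fixed rule that picks the next action."""
    return "increment" if observation < target else "stop"


def run_loop(env: CounterEnv, max_steps: int = 10) -> list:
    memory = []                                   # "learn": keep a trace of results
    for _ in range(max_steps):                    # cap iterations so the loop always ends
        observation = env.observe()               # observe
        action = think(observation, env.target)   # think
        if action == "stop":
            break
        result = env.act(action)                  # act
        memory.append(result)                     # learn
    return memory


trace = run_loop(CounterEnv(target=3))
print(trace)
```

The `max_steps` cap is the important design choice: a real LLM may never decide to stop, so the loop needs its own exit condition.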

## 2. Hands‑On: Building a Minimal Reflex Agent

In [None]:
import random

class StubLLM:
    """A tiny fake 'LLM' that answers from a canned list so the notebook works offline."""
    def __init__(self):
        self.replies = [
            "Sure! What would you like me to do next?",
            "I've done that. Anything else?",
            "Task complete. 🎉"
        ]
    def __call__(self, prompt):
        return random.choice(self.replies)

class ReflexAgent:
    def __init__(self, llm):
        self.llm = llm

    def chat(self, message):
        prompt = f"""You are an assistant. Respond helpfully to: {message}"""
        return self.llm(prompt)

# Instantiate and test
agent = ReflexAgent(StubLLM())
for turn in range(3):
    user_msg = input(f"User ({turn+1}/3): ")
    print("Agent:", agent.chat(user_msg))


**Discussion:** This `ReflexAgent` illustrates the simplest architecture: input text → LLM → output text. There is no tool use, explicit memory, or planning.

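**Try it:** Rerun the cell and ask different questions. Because `StubLLM` ignores the prompt and picks a canned reply at random, the answers will not track what you ask. Even with a real model plugged in, this agent would still have no memory of earlier turns and no way to call tools.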

## 3. Extending with Tool Use — ReAct Pattern
The **ReAct** (Reason + Act) agent alternates between *Thought* and *Action* steps. Below is a minimal implementation that can perform arithmetic using a calculator tool.

In [None]:
import re, math

class CalculatorTool:
    name = "calculator"
    description = "Evaluates basic math expressions like 2*3+1."
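    def __call__(self, expression: str):
        # NOTE: eval() is not a sandbox, even with builtins removed. It is acceptable
        # for this offline demo, but do not eval() untrusted model output in production.
        try:
            return str(eval(expression, {'__builtins__': {}}, math.__dict__))
        except Exception as e:
            return f"Error: {e}"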

class ReActAgent:
    TOOL_PATTERN = re.compile(r"Action: (\w+)\((.*)\)")
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {t.name: t for t in tools}

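    def run(self, question, max_iters=4):
        # Describe the available tools so the model knows what it can call.
        tool_list = "\n".join(f"- {t.name}: {t.description}" for t in self.tools.values())
        scratchpad = ""
        for step in range(max_iters):
            prompt = f"""Answer the question, using the tools below when helpful.
To call a tool, reply with a line of the form: Action: tool_name(argument)

Tools:
{tool_list}

Question: {question}

{scratchpad}
Thought:"""
            llm_output = self.llm(prompt)
            scratchpad += f"Thought: {llm_output}\n"
            match = self.TOOL_PATTERN.search(llm_output)
            if not match:
                return llm_output  # No tool call, so treat the output as the final answer
            tool_name, arg = match.group(1), match.group(2)
            tool = self.tools.get(tool_name)
            # Report unknown tools back to the model instead of raising KeyError.
            result = tool(arg) if tool else f"Error: unknown tool '{tool_name}'"
            scratchpad += f"Observation: {result}\n"
        return "Failed to answer after max steps."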

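# Demo with a stub LLM that requests the tool once, then answers from the observation
class MathLLM(StubLLM):
    def __call__(self, prompt):
        # After the tool has run, the scratchpad contains an Observation line.
        observation = re.search(r"Observation: (.+)", prompt)
        if observation:
            return f"The answer is {observation.group(1)}."
        if "sqrt(16)+4" in prompt:
            return "Action: calculator(sqrt(16)+4)"
        return "I don't know."
react_agent = ReActAgent(MathLLM(), [CalculatorTool()])
print(react_agent.run("What is sqrt(16)+4?"))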


**Exercise:**
1. Modify `MathLLM` to answer a different math question.
2. Replace `CalculatorTool` with a web search (hint: use `requests` to call DuckDuckGo).

## 4. Planning Agents
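Planning agents create a structured sequence of steps *before* execution. Because the plan exists before any action is taken, it can be inspected, optimized, or verified up front, which tends to improve reliability on multi‑step tasks.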

In [None]:
def simple_planner(goal: str):
    """Return a naïve three‑step plan as a list of strings."""
    return [f"Step 1: Break down '{goal}'", 
            "Step 2: Do the thing", 
            "Step 3: Verify and summarize"]

print(simple_planner("Write a blog post about agent architectures"))


**Challenge:** Implement a `PlannerAgent` that first calls `simple_planner`, then executes each step with an LLM.
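
One possible starting point for the challenge is sketched below. It is not a reference solution. `EchoLLM` is a hypothetical offline stand‑in for a real model, and `simple_planner` is repeated so the cell runs on its own.

```python
from typing import Callable, List


def simple_planner(goal: str) -> List[str]:
    """Same naive three-step planner as above, repeated so this cell is self-contained."""
    return [f"Step 1: Break down '{goal}'",
            "Step 2: Do the thing",
            "Step 3: Verify and summarize"]


class EchoLLM:
    """Hypothetical offline stand-in for a model: reports which step it was asked to do."""
    def __call__(self, prompt: str) -> str:
        current_step = prompt.splitlines()[-1]  # the current step is the last prompt line
        return f"Done: {current_step}"


class PlannerAgent:
    def __init__(self, llm: Callable[[str], str], planner: Callable[[str], List[str]]):
        self.llm = llm
        self.planner = planner

    def run(self, goal: str) -> List[str]:
        plan = self.planner(goal)          # plan first ...
        results: List[str] = []
        for step in plan:                  # ... then execute step by step
            # Give the model the goal, the results so far, and the current step.
            prompt = "\n".join([f"Goal: {goal}", *results, step])
            results.append(self.llm(prompt))
        return results


planner_agent = PlannerAgent(EchoLLM(), simple_planner)
for line in planner_agent.run("Write a blog post about agent architectures"):
    print(line)
```

Passing earlier results into each prompt lets later steps build on earlier ones. A fuller version might also re‑plan when a step fails.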

## 5. Adding Long‑Term Memory
Many tasks need context beyond the current turn. Vector indexes such as **FAISS** enable similarity search over embeddings of past interactions.

In [None]:
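import os

# Newer LangChain releases ship these classes in separate packages;
# fall back to the legacy import paths on older installs.
try:
    from langchain_openai import OpenAIEmbeddings
    from langchain_community.vectorstores import FAISS
except ImportError:
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS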

# Build a toy memory of past notes
notes = ["Charles likes minimalism.", 
         "Charles teaches Prompt Engineering.", 
         "Charles lives in Michigan."]
emb = OpenAIEmbeddings() if os.getenv('OPENAI_API_KEY') else None
if emb:
    store = FAISS.from_texts(notes, emb)
    query = "Where does Charles live?"
    docs = store.similarity_search(query)
    print("Top memory match:", docs[0].page_content)
else:
    print("⚠️ Skipping FAISS demo — set your API key to enable embeddings.")


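**Discussion:** Vector memory allows the agent to retrieve relevant past facts during reasoning, typically by embedding the current query, fetching the nearest stored notes, and inserting them into the prompt.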
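
To see the retrieval idea without an API key, here is a rough offline sketch. It swaps in two simplifications that are assumptions of this example and not part of the demo above: word‑count vectors replace learned embeddings, and a brute‑force scan replaces the FAISS index. `embed`, `cosine`, and `ToyMemory` are illustrative names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMemory:
    """Brute-force similarity search over stored notes."""
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = ToyMemory()
for note in ["Charles likes minimalism.",
             "Charles teaches Prompt Engineering.",
             "Charles lives in Michigan."]:
    memory.add(note)

print(memory.search("Charles lives in which state?"))
```

This prints `['Charles lives in Michigan.']`. Word counts only match exact tokens, so a query such as "Where does Charles live?" would not favor that note ("live" and "lives" are different tokens). Handling that kind of variation is what learned embeddings are for.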

## 6. Multi‑Agent Collaboration
In complex domains, multiple specialized agents can message each other. Below, the **Coordinator** forwards the same task to two sub‑agents and concatenates their replies.

In [None]:
class Coordinator:
    def __init__(self, agents):
        self.agents = agents

    def solve(self, problem):
        results = [agent.chat(problem) for agent in self.agents]
        return "\n---\n".join(results)

team = Coordinator([ReflexAgent(StubLLM()), ReflexAgent(StubLLM())])
print(team.solve("Draft two alternative replies to: 'Explain agent architectures.'"))


**Activity:** Split into pairs and design two specialized agents (e.g., *Researcher* and *Writer*). Have them collaborate via the `Coordinator`.
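
As a hedged starting point for the activity, the sketch below specializes agents purely by role prompt and chains them sequentially, so the Writer sees the Researcher's notes. This is a pipeline variant, not the parallel `Coordinator` above. `RoleAgent`, `PipelineCoordinator`, and `fake_llm` are hypothetical names, and `fake_llm` returns canned text so the cell runs offline.

```python
class RoleAgent:
    """An agent specialized only by its role prompt; `llm` is any callable str -> str."""
    def __init__(self, role: str, instructions: str, llm):
        self.role = role
        self.instructions = instructions
        self.llm = llm

    def chat(self, message: str) -> str:
        prompt = f"You are the {self.role}. {self.instructions}\n\nInput:\n{message}"
        return self.llm(prompt)


def fake_llm(prompt: str) -> str:
    """Hypothetical offline stand-in: canned behaviour keyed on the role in the prompt."""
    task = prompt.split("Input:\n", 1)[1]
    if "You are the Researcher" in prompt:
        return f"- fact A about {task}\n- fact B about {task}"
    return "Summary based on notes:\n" + task


class PipelineCoordinator:
    """Passes each agent's output to the next agent instead of merging parallel answers."""
    def __init__(self, agents):
        self.agents = agents

    def solve(self, problem: str) -> str:
        message = problem
        for agent in self.agents:
            message = agent.chat(message)
        return message


researcher = RoleAgent("Researcher", "List key facts as bullet points.", fake_llm)
writer = RoleAgent("Writer", "Turn the notes into a short summary.", fake_llm)
print(PipelineCoordinator([researcher, writer]).solve("agent architectures"))
```

Replacing `fake_llm` with a real model call leaves the coordination logic unchanged, which makes the message flow easy to test before spending API calls.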

## 7. Safety & Evaluation Checklist
- **Define boundaries:** Limit tool permissions (e.g., sandbox FS, rate‑limited APIs).
- **Test‑driven prompts:** Write unit tests for critical tasks.
- **Guardrails:** Validate tool arguments before execution.
- **Logging & replay:** Store all agent traces for offline analysis.
- **Red‑teaming:** Challenge the agent with adversarial prompts.
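
To make the guardrails item concrete, here is one way a calculator tool could validate its argument. It parses the expression with Python's `ast` module and evaluates only whitelisted node types, so no call to `eval()` is needed. The whitelist contents and the name `safe_eval` are choices made for this sketch.

```python
import ast
import math
import operator

# Whitelists of permitted operators and functions; anything else is rejected.
# Exponentiation is left out on purpose: huge exponents can hang the process.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_FUNCS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}


def safe_eval(expression: str) -> float:
    """Evaluate arithmetic by walking the parsed AST instead of calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[_eval(arg) for arg in node.args])
        raise ValueError(f"Disallowed expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval("sqrt(16)+4"))
try:
    safe_eval("__import__('os').system('ls')")
except ValueError as err:
    print("Rejected:", err)
```

The first call prints `8.0`, and the second is rejected before anything executes. Assertions like these also work as unit tests for the tool, in the spirit of the test‑driven item above.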

## Further Reading & Resources
- Yao et al., *ReAct: Synergizing Reasoning and Acting in Language Models* (2023)
- Shinn et al., *Reflexion: Language Agents with Verbal Reinforcement Learning* (2023)
- OpenAI **Function Calling** & **Assistants API** docs
- Microsoft *AutoGen* framework
- LangChain Agents documentation


## Assignment
1. Build an agent that reads a TODO list in a text file and marks completed tasks using tool calls.
2. Evaluate it against at least **five** edge cases (invalid task names, ambiguous commands, etc.).
3. Submit a short report describing the architecture, prompt design, and evaluation results.