# Lab 1: Building the Brain

## From Scratch to Framework

In this lab, you will:

1. **Part 1:** Build a "raw" ReAct agent from scratch using a simple loop and text parsing
2. **Part 2:** Compare your raw agent with native function calling, the approach the project's `ReactAgent` uses
3. **Part 3:** Run both agents on the same query and reflect on the differences

### Learning Goals
- Understand that an agent is just a **loop** with state and reasoning
- Experience the **pain** of parsing free-text tool calls
- Appreciate why **native function calling** is used in production

### Prerequisites
- `uv pip install litellm python-dotenv`
- A valid API key in your `.env` file (e.g., `OPENAI_API_KEY=sk-...`)

---
## Setup

In [None]:
import os
import re
import json
from dotenv import load_dotenv
from litellm import completion

load_dotenv()

MODEL = os.getenv("MODEL_NAME", "gpt-4o")
print(f"Using model: {MODEL}")

---
## Part 1: The "Raw" ReAct Agent

We'll build a ReAct agent that uses **text-based** tool calling. The LLM outputs plain text in a specific format, and we parse it to extract tool calls.

### Step 1: Define Mock Tools

First, let's create some simple tools our agent can use. These are just Python functions — no API keys needed.

In [None]:
# Mock tools for our agent

def search(query: str) -> str:
    """Simulate a web search."""
    mock_results = {
        "capital of france": "Paris is the capital and most populous city of France.",
        "population of paris": "The population of Paris is approximately 2.1 million in the city proper and 12.2 million in the metropolitan area.",
        "eiffel tower height": "The Eiffel Tower is 330 metres (1,083 ft) tall, about the same height as an 81-storey building.",
        "python programming": "Python is a high-level, general-purpose programming language created by Guido van Rossum in 1991.",
        "largest ocean": "The Pacific Ocean is the largest and deepest ocean on Earth, covering more than 63 million square miles.",
    }
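    query_lower = query.lower()
    # Prefer an exact phrase match so a shared filler word such as "of"
    # cannot select the wrong entry.
    for key, value in mock_results.items():
        if key in query_lower:
            return value
    # Fall back to distinctive words only (longer than 3 characters).
    for key, value in mock_results.items():
        if any(word in query_lower for word in key.split() if len(word) > 3):
            return value
    return f"No results found for: {query}"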


def calculate(expression: str) -> str:
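    """Evaluate an arithmetic expression made of digits, + - * / . and parentheses."""
    try:
        # Character whitelist: no letters or underscores can reach eval()
        allowed = set('0123456789+-*/(). ')
        if all(c in allowed for c in expression):
            # Empty builtins as a second guard on top of the whitelist
            result = eval(expression, {"__builtins__": {}}, {})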
            return str(result)
        return "Error: Invalid expression"
    except Exception as e:
        return f"Error: {e}"


# Tool registry — maps tool names to functions
TOOLS = {
    "search": search,
    "calculate": calculate,
}

# Test the tools
print(search("capital of france"))
print(calculate("15 * 500 / 100"))
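
The character whitelist in `calculate` keeps `eval()` away from names and attribute access, but it is still `eval()`. As a rough sketch of an alternative, the function below walks the parsed syntax tree and only accepts numbers and the four basic operators. The name `calculate_ast` and its helpers are hypothetical and not part of the lab's code. Assume it would be registered in `TOOLS` in place of `calculate` if you wanted to try it.

```python
import ast
import operator

# Whitelisted operators; any other node type is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def calculate_ast(expression: str) -> str:
    """Hypothetical alternative to calculate() that never calls eval()."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as e:
        return f"Error: {e}"


print(calculate_ast("15 * 500 / 100"))    # 75.0
print(calculate_ast("__import__('os')"))  # Error: unsupported expression
```

Unlike the whitelist version, this sketch also rejects `**`, so a model cannot request an enormous power by accident.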

### Step 2: The ReAct System Prompt

We need a system prompt that tells the LLM to follow the Thought -> Action -> Observation format **exactly**. This is the fragile part — the model must follow the format precisely for our parser to work.

In [None]:
REACT_SYSTEM_PROMPT = """You are a helpful research assistant that solves tasks step by step.

You have access to these tools:
- search(query): Search for information. Input: a search query string.
- calculate(expression): Evaluate a math expression. Input: a math expression string.

Follow this EXACT format for EVERY step:

Thought: <your reasoning about what to do next>
Action: <tool_name>("<argument>")

After receiving an Observation, continue with the next Thought.

When you have enough information to answer, use:

Thought: I have enough information to answer.
Final Answer: <your complete answer>

IMPORTANT:
- ALWAYS start with a Thought before any Action.
- Use EXACTLY one Action per step.
- Wait for the Observation before your next Thought.
- Never fabricate Observations — only use real tool results.
"""

print("System prompt loaded.")

### Step 3: Build the Parser

This is where the pain begins. We need to parse the LLM's text output to extract:
- Whether it produced a **Final Answer** (we're done)
- Or an **Action** with a tool name and argument (we need to execute it)

**TODO:** Complete the `parse_response` function to extract the tool name and argument from the LLM's response.

In [None]:
def parse_response(text: str) -> dict:
    """
    Parse the LLM's text response to extract either a Final Answer or an Action.
    
    Returns:
        {"type": "final_answer", "content": "..."}
        or
        {"type": "action", "tool": "search", "argument": "capital of france"}
        or
        {"type": "error", "content": "Could not parse response"}
    """
    # Check for Final Answer
    final_match = re.search(r'Final Answer:\s*(.+)', text, re.DOTALL)
    if final_match:
        return {"type": "final_answer", "content": final_match.group(1).strip()}
    
    # TODO: Parse the Action line to extract tool name and argument
    # The format is: Action: tool_name("argument")
    # Hint: Use a regex like r'Action:\s*(\w+)\("(.+?)"\)'
    # Return: {"type": "action", "tool": tool_name, "argument": argument}
    
    # --- YOUR CODE HERE ---
    pass
    # --- END YOUR CODE ---
    
    return {"type": "error", "content": f"Could not parse response: {text[:200]}"}


# Test the parser
test1 = 'Thought: I need to find the capital.\nAction: search("capital of france")'
test2 = 'Thought: I have enough info.\nFinal Answer: Paris has 2.1 million people.'
print("Test 1:", parse_response(test1))
print("Test 2:", parse_response(test2))

### Step 4: Build the Agent Loop

Now the core: the loop that drives the agent. It is written as a `for` loop capped at `max_steps` so a confused agent cannot run forever. Each iteration:
1. Calls the LLM with the conversation history
2. Parses the response for an Action or Final Answer
3. If Action: executes the tool and appends the Observation
4. If Final Answer: returns the result

**TODO:** Complete the agent loop by:
1. Calling the LLM with `completion()`
2. Executing the tool when an action is parsed
3. Appending the observation back to messages

In [None]:
def run_react_agent(query: str, max_steps: int = 5) -> dict:
    """
    Run a text-based ReAct agent loop.
    
    Returns:
        {"answer": str, "steps": list, "total_steps": int}
    """
    messages = [
        {"role": "system", "content": REACT_SYSTEM_PROMPT},
        {"role": "user", "content": query},
    ]
    steps = []
    
    for step in range(max_steps):
        print(f"\n{'='*50}")
        print(f"Step {step + 1}")
        print(f"{'='*50}")
        
        # TODO 1: Call the LLM using litellm's completion()
        # Hint: response = completion(model=MODEL, messages=messages, max_tokens=512)
        # Then extract: assistant_text = response.choices[0].message.content
        
        # --- YOUR CODE HERE ---
        assistant_text = ""  # Replace this
        # --- END YOUR CODE ---
        
        print(f"LLM Output:\n{assistant_text}")
        
        # Append assistant response to messages
        messages.append({"role": "assistant", "content": assistant_text})
        
        # Parse the response
        parsed = parse_response(assistant_text)
        steps.append({"step": step + 1, "raw_output": assistant_text, "parsed": parsed})
        
        if parsed["type"] == "final_answer":
            print(f"\nFinal Answer: {parsed['content']}")
            return {"answer": parsed["content"], "steps": steps, "total_steps": step + 1}
        
        elif parsed["type"] == "action":
            tool_name = parsed["tool"]
            argument = parsed["argument"]
            
            # TODO 2: Execute the tool and get the observation
            # Hint: Look up the tool in TOOLS dict, call it with the argument
            # Handle the case where the tool doesn't exist
            
            # --- YOUR CODE HERE ---
            observation = ""  # Replace this
            # --- END YOUR CODE ---
            
            print(f"\nObservation: {observation}")
            
            # TODO 3: Append the observation back to messages
            # The agent needs to see the tool result to continue reasoning
            # Hint: Append as a user message with "Observation: {observation}"
            
            # --- YOUR CODE HERE ---
            pass
            # --- END YOUR CODE ---
        
        else:
            print(f"\nParse error: {parsed['content']}")
            # Nudge the agent to follow the format
            messages.append({
                "role": "user",
                "content": "Please follow the exact format: Thought: ... Action: tool_name(\"argument\")"
            })
    
    return {
        "answer": "[Agent reached max steps without a final answer]",
        "steps": steps,
        "total_steps": max_steps,
    }

### Step 5: Test Your Agent

Try these queries — they require multi-step reasoning:

In [None]:
# Test 1: Multi-step factual question
result = run_react_agent("What is the population of the capital of France?")
print(f"\n{'='*50}")
print(f"Answer: {result['answer']}")
print(f"Total steps: {result['total_steps']}")

In [None]:
# Test 2: Requires search + calculation
result = run_react_agent("How tall is the Eiffel Tower in feet? What is that height divided by 3?")
print(f"\n{'='*50}")
print(f"Answer: {result['answer']}")
print(f"Total steps: {result['total_steps']}")
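
If you want to exercise the loop without spending API calls, one option is a scripted stand-in for `completion()`. The sketch below is a hypothetical test helper, not part of litellm. It reproduces only the `.choices[0].message.content` path that this lab reads and replays canned replies in order.

```python
from types import SimpleNamespace


def make_scripted_completion(replies):
    """Return a function that mimics completion() by replaying canned replies.

    Hypothetical helper: only the attributes this lab reads are provided.
    """
    remaining = iter(replies)

    def scripted_completion(model=None, messages=None, **kwargs):
        text = next(remaining)  # raises StopIteration once the script runs out
        message = SimpleNamespace(content=text, tool_calls=None)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    return scripted_completion


fake_completion = make_scripted_completion([
    'Thought: I need the capital first.\nAction: search("capital of france")',
    'Thought: I have enough information to answer.\nFinal Answer: Paris.',
])

first = fake_completion(model="unused", messages=[])
print(first.choices[0].message.content)
```

In the notebook, temporarily rebinding `completion = fake_completion` before calling `run_react_agent` should replay the script, since the function looks up `completion` at call time. Re-run the Setup cell afterwards to restore the real one.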

### Reflection: The Pain Points

After running the agent, think about these questions:

1. **Did the parser always work?** Did the LLM deviate from the expected format?
2. **How fragile is the regex?** What happens if the model writes `Action: search('query')` instead of `search("query")`?
3. **How would you handle multiple tool calls per step?** (Hint: the one-Action-per-step text format rules it out, while native calling supports it)
4. **How would you debug a failure?** You have the raw text, but no structured trace.
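
Question 2 can be checked directly. The sketch below runs the regex from the Step 3 hint against a few plausible model outputs, next to a more tolerant variant. `TOLERANT` is a hypothetical pattern written for this illustration; it is one possible loosening, not the lab's reference solution.

```python
import re

# Pattern from the Step 3 hint: double quotes only, no extra spaces.
STRICT = re.compile(r'Action:\s*(\w+)\("(.+?)"\)')

# Hypothetical tolerant variant: either quote style, optional spaces.
# The backreference (?P=q) requires the closing quote to match the opening one.
TOLERANT = re.compile(r'Action:\s*(\w+)\s*\(\s*(?P<q>["\'])(.+?)(?P=q)\s*\)')

samples = [
    'Action: search("capital of france")',
    "Action: search('capital of france')",
    'Action: search( "capital of france" )',
    'Action: search(capital of france)',
]

for line in samples:
    strict = STRICT.search(line)
    tolerant = TOLERANT.search(line)
    print(f"{line!r:45} strict={bool(strict)} tolerant={bool(tolerant)}")
```

Only the first sample satisfies the strict pattern. The tolerant one accepts the first three, and the unquoted call still fails. Each loosening handles one more deviation, and the model can always produce another.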

---
## Part 2: Native Function Calling — The Production Approach

Now let's compare with the **native** approach. Instead of parsing text, we use the LLM's built-in function calling API, which returns structured JSON.

### The Key Differences

| Aspect | Part 1 (Text) | Part 2 (Native) |
|--------|---------------|------------------|
| Tool format | Free text: `Action: search("...")` | Structured JSON: `{"name": "search", ...}` |
| Parsing | Regex (fragile) | `json.loads()` on the arguments string (far more reliable) |
| Multiple tools | Not supported | Built-in support |
| Error handling | Manual | API-level |
| Debugging | Raw text | Structured tool_calls object |
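
With native calling, tool arguments arrive as a JSON string that you decode with `json.loads()`. A defensive agent may still want to guard that call in case the string is truncated or is not a JSON object. The helper below is a minimal sketch; `decode_tool_arguments` is a hypothetical name, not a litellm function.

```python
import json
from typing import Optional, Tuple


def decode_tool_arguments(raw_arguments: str) -> Tuple[dict, Optional[str]]:
    """Decode a tool call's JSON argument string.

    Returns (arguments, error). error is None on success;
    arguments is {} on failure.
    """
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return {}, f"Error: arguments were not valid JSON ({exc.msg})"
    if not isinstance(arguments, dict):
        return {}, "Error: arguments must be a JSON object"
    return arguments, None


print(decode_tool_arguments('{"query": "capital of france"}'))
print(decode_tool_arguments('{"query": "capital of fra'))
```

On failure, the error string could be sent back as the tool result so the model can retry, the same way an unknown tool name is reported.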

In [None]:
# Define tools as OpenAI-compatible schemas
TOOLS_SCHEMA = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search for information on a topic. Returns relevant text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a mathematical expression. Returns the result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The math expression to evaluate (e.g., '15 * 500 / 100')"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]

print(f"Defined {len(TOOLS_SCHEMA)} tool schemas for native calling.")

In [None]:
def run_native_agent(query: str, max_steps: int = 5) -> dict:
    """
    Run an agent using native function calling.
    No text parsing needed — the API returns structured tool calls.
    """
    messages = [
        {"role": "system", "content": "You are a helpful research assistant. Use the provided tools to answer questions."},
        {"role": "user", "content": query},
    ]
    steps = []
    
    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")
        
        # Call LLM with tools parameter
        response = completion(
            model=MODEL,
            messages=messages,
            tools=TOOLS_SCHEMA,
            tool_choice="auto",
            max_tokens=512,
        )
        
        message = response.choices[0].message
        assistant_content = message.content
        tool_calls = message.tool_calls
        
        # Add assistant message to history
        messages.append(message)
        
        step_info = {"step": step + 1, "content": assistant_content, "tool_calls": []}
        
        if assistant_content:
            print(f"Content: {assistant_content[:200]}")
        
        # Handle tool calls: structured objects, so no regex parsing is needed
        if tool_calls:
            for tc in tool_calls:
                func_name = tc.function.name
                func_args = json.loads(tc.function.arguments)
                
                print(f"Tool Call: {func_name}({func_args})")
                
                # Execute tool
                if func_name in TOOLS:
                    result = TOOLS[func_name](**func_args)
                else:
                    result = f"Error: Unknown tool '{func_name}'"
                
                print(f"Result: {result}")
                
                step_info["tool_calls"].append({
                    "tool": func_name, "args": func_args, "result": result
                })
                
                # Feed result back — structured format
                messages.append({
                    "tool_call_id": tc.id,
                    "role": "tool",
                    "name": func_name,
                    "content": result,
                })
        
        steps.append(step_info)
        
        # If no tool calls and we have content, the agent is done
        if not tool_calls and assistant_content:
            return {"answer": assistant_content, "steps": steps, "total_steps": step + 1}
    
    return {
        "answer": "[Max steps reached]",
        "steps": steps,
        "total_steps": max_steps,
    }

In [None]:
# Test the native agent with the same query
result = run_native_agent("What is the population of the capital of France?")
print(f"\n{'='*50}")
print(f"Answer: {result['answer']}")
print(f"Total steps: {result['total_steps']}")

---
## Part 3: Compare and Reflect

### Side-by-Side Comparison

Run both agents on the same query and compare:

| Metric | Text-Based | Native |
|--------|-----------|--------|
| Steps taken | ? | ? |
| Parse errors | ? | Usually 0 (no regex step) |
| Code complexity | High (regex) | Low (structured) |
| Debugging ease | Hard | Easy |
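
One way to fill in the first two rows is to summarize the dictionaries the two agents return. The helpers below are a sketch that assumes the result shapes produced by `run_react_agent` and `run_native_agent` above. The function names are hypothetical, and the sample data is made up purely to show the shapes.

```python
def summarize_text_run(result: dict) -> dict:
    """Summarize a run_react_agent() result dict."""
    parsed_types = [s["parsed"]["type"] for s in result["steps"]]
    return {
        "steps": result["total_steps"],
        "parse_errors": parsed_types.count("error"),
        "tool_calls": parsed_types.count("action"),
    }


def summarize_native_run(result: dict) -> dict:
    """Summarize a run_native_agent() result dict."""
    return {
        "steps": result["total_steps"],
        "tool_calls": sum(len(s["tool_calls"]) for s in result["steps"]),
    }


# Made-up samples in the shapes the two agents return, for illustration only.
sample_text_result = {
    "answer": "...",
    "total_steps": 3,
    "steps": [
        {"step": 1, "raw_output": "...", "parsed": {"type": "error", "content": "..."}},
        {"step": 2, "raw_output": "...",
         "parsed": {"type": "action", "tool": "search", "argument": "..."}},
        {"step": 3, "raw_output": "...", "parsed": {"type": "final_answer", "content": "..."}},
    ],
}
sample_native_result = {
    "answer": "...",
    "total_steps": 2,
    "steps": [
        {"step": 1, "content": None,
         "tool_calls": [{"tool": "search", "args": {"query": "..."}, "result": "..."}]},
        {"step": 2, "content": "...", "tool_calls": []},
    ],
}

print(summarize_text_run(sample_text_result))
print(summarize_native_run(sample_native_result))
```

Replace the samples with the real `result` values from your own runs to get numbers for the table.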

### Key Takeaways

1. **Text-based parsing is fragile** — the model may deviate from the format in subtle ways
2. **Native calling is robust** — structured JSON from the API, no regex needed
3. **Text-based teaches the mechanics** — you understand what native calling does under the hood
4. **Both are the same loop** — `while` + state + reasoning. Only the interface differs.

### Next Steps

Open the project's `src/agent/react_agent.py` and compare with your native agent above. Note how it:
- Uses a `ToolRegistry` for dynamic tool management
- Implements proper error handling with try/except
- Logs each step for debugging (preview of Session 3's tracing)