# Notebook 02: Building a Simple Agent with Tools

## 🎯 What is This Notebook About?

In this notebook, we'll build a simple autonomous agent **step by step** using the LlamaStack SDK directly. You'll see exactly how each component works.

**What we'll learn:**
1. How to connect to LlamaStack using the Python SDK
2. How to define tools in the correct format (OpenAI function calling)
3. How to create an agent with tools using the SDK
4. How to create agent sessions
5. How to execute turns and see the agent's reasoning
6. How to process streaming responses from the agent

**Why this matters:**
- You'll understand the **actual API calls** and responses
- You'll see **raw LlamaStack outputs** to understand how it works
- You'll learn the **fundamentals** before using abstractions
- This foundation prepares you for more advanced agents

**Note:** We'll use the SDK directly here. In later notebooks, we'll show how abstractions can simplify this.

---

## 📚 Learning Objectives

By the end of this notebook, you will:
- ✅ Know how to use the LlamaStack SDK directly
- ✅ Understand the exact format for tools (OpenAI function calling)
- ✅ Be able to create agents, sessions, and turns manually
- ✅ See and understand agent streaming responses
- ✅ Know how to process tool calls and responses

---

## ⚙️ Prerequisites

- LlamaStack server running (see Module README)
- Ollama running with llama3.2:3b model
- Python environment with dependencies installed


In [None]:
# Import required libraries
import os
import sys
import json
import time
from pathlib import Path

# Add src to path for imports (we'll use some helper functions)
notebook_dir = Path().resolve()
src_path = notebook_dir.parent / 'src'
sys.path.insert(0, str(src_path))

# Import LlamaStack SDK - this is what we'll use directly!
from llama_stack_client import LlamaStackClient

# Import helper modules for environment and tools
from environment import SimulatedEnvironment
from tools import ToolRegistry

# Configuration
llamastack_url = os.getenv("LLAMA_STACK_URL", "http://localhost:8321")
model = os.getenv("LLAMA_MODEL", "ollama/llama3.2:3b")

print(f"üì° LlamaStack URL: {llamastack_url}")
print(f"ü§ñ Model: {model}")

# Initialize LlamaStack client
client = LlamaStackClient(base_url=llamastack_url)

# Verify connection
try:
    models = client.models.list()
    print(f"\n✅ Connected to LlamaStack")
    print(f"   Available models: {len(models)}")
    if models:
        print(f"   Using model: {model}")
except Exception as e:
    print(f"\n❌ Cannot connect to LlamaStack: {e}")
    print("   Please ensure LlamaStack is running:")
    print("   python scripts/start_llama_stack.py")
    raise


---

## Part 1: Understanding Tools

### What are Tools?

**Tools** are the actions an agent can take. They're defined in a specific format that LlamaStack understands.

**Key Points:**
- Tools use **OpenAI function calling format**
- Each tool has: `type`, `function` (with `name`, `description`, `parameters`)
- Tools are passed to the agent when creating it
- The agent uses LLM reasoning to decide which tool to call

Let's see how tools are structured!


In [None]:
# Step 1: Create a simulated environment
# This gives us a safe place to test our agent
env = SimulatedEnvironment()
print("✅ Created simulated environment")
print(f"   Services: {list(env.services.keys())}")

# Step 2: Create tools using our helper
# We'll use ToolRegistry to help us create tools, but we'll see the actual format
tool_registry = ToolRegistry(env)
print(f"\n✅ Created tool registry")
print(f"   Tools available: {len(tool_registry.list_tools())}")

# Step 3: Get tools in LlamaStack format
# This is the IMPORTANT part - see the exact format!
tools = tool_registry.get_tools_for_llamastack()

print(f"\nüìã Tools in LlamaStack format (OpenAI function calling):")
print(f"   Number of tools: {len(tools)}")
print(f"\n   First tool structure:")
print(json.dumps(tools[0], indent=2))


### Understanding the Tool Format

Notice the structure:
- **`type: "function"`** - This tells LlamaStack it's a function tool
- **`function`** - Contains the tool definition
  - **`name`** - The tool identifier
  - **`description`** - What the tool does (the LLM uses this to decide when to call it)
  - **`parameters`** - JSON Schema defining the tool's inputs

This is the **OpenAI function calling format** that LlamaStack uses. The agent's LLM reads these descriptions and decides which tools to call based on the task.
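To make the structure concrete, here is a minimal sketch of one tool definition written by hand as a Python dictionary. The tool name `check_service_status` is used later in this notebook, but the description text and the `service_name` parameter are assumptions made for illustration; the real definitions come from `ToolRegistry`. The `validate_tool` helper is also hypothetical and only checks that the expected keys are present.

```python
import json

# Hypothetical hand-written tool definition in OpenAI function calling format.
# The description and the `service_name` parameter are illustrative assumptions.
example_tool = {
    "type": "function",
    "function": {
        "name": "check_service_status",
        "description": "Return the current status of a named service.",
        "parameters": {
            "type": "object",
            "properties": {
                "service_name": {
                    "type": "string",
                    "description": "Name of the service to check",
                }
            },
            "required": ["service_name"],
        },
    },
}


def validate_tool(tool: dict) -> list[str]:
    """Return a list of problems found in a tool definition (empty if none)."""
    problems = []
    if tool.get("type") != "function":
        problems.append("type must be 'function'")
    function = tool.get("function")
    if not isinstance(function, dict):
        problems.append("function must be a dict")
        return problems
    for key in ("name", "description", "parameters"):
        if key not in function:
            problems.append(f"function is missing '{key}'")
    return problems


print(json.dumps(example_tool, indent=2))
print("Problems:", validate_tool(example_tool))
```

Running a check like this against each entry in `tools` before creating the agent can catch a malformed definition early, before it reaches the server.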

---

## Part 2: Creating an Agent

Now let's create an agent using the LlamaStack SDK directly. We'll see the exact API call and response!


In [None]:
# Step 1: Define agent instructions
# These tell the agent how to behave and what to do
instructions = """You are an autonomous IT operations agent. Your job is to monitor IT services, identify problems, and take corrective actions.

When analyzing IT services:
1. First, check the status of services to understand the current state
2. Identify any problems (failed services, high CPU/memory, degraded performance)
3. Take appropriate corrective actions (restart failed services, scale overloaded services)
4. Verify that actions were successful
5. Provide a clear summary of what was done

Always be careful and thoughtful. Only take actions that are necessary and safe.
If you're unsure about an action, explain your reasoning."""

print("üìù Agent Instructions:")
print(instructions)


In [None]:
# Step 2: Create the agent using LlamaStack SDK
# This is the ACTUAL API call - see what happens!
print("=" * 60)
print("Creating Agent with LlamaStack SDK")
print("=" * 60)

print(f"\nüì§ Sending request to: {llamastack_url}/v1alpha/agents")
print(f"   Model: {model}")
print(f"   Tools: {len(tools)}")
print(f"\n   Payload structure:")
print(f"   {{")
print(f"     'agent_config': {{")
print(f"       'model': '{model}',")
print(f"       'instructions': '...',")
print(f"       'tools': [list of {len(tools)} tools]")
print(f"     }}")
print(f"   }}")

# Create the agent - this is the actual SDK call
agent_response = client.alpha.agents.create(
    agent_config={
        "model": model,
        "instructions": instructions,
        "tools": tools
    }
)

# See the response!
print(f"\nüì• Response received:")
print(f"   Type: {type(agent_response)}")
print(f"   Agent ID: {agent_response.agent_id}")

# Store the agent_id - we'll need it!
agent_id = agent_response.agent_id
print(f"\n✅ Agent created successfully!")
print(f"   Agent ID: {agent_id}")


### What Just Happened?

1. **We called `client.alpha.agents.create()`** - This is the SDK method for creating agents
2. **We passed `agent_config`** with:
   - `model`: Which LLM to use
   - `instructions`: How the agent should behave
   - `tools`: What actions the agent can take
3. **LlamaStack created the agent** and returned an `agent_id`
4. **The agent is now ready** to receive tasks!

**Important:** The agent exists on the LlamaStack server. We'll use the `agent_id` to interact with it.

---

## Part 3: Creating a Session

Before we can give the agent a task, we need to create a **session**. A session is like a conversation thread - it maintains context for the agent.


In [None]:
# Create an agent session
# Sessions maintain conversation context
print("=" * 60)
print("Creating Agent Session")
print("=" * 60)

session_name = f"session-{int(time.time())}"
print(f"\nüì§ Creating session: {session_name}")

# Create session using SDK
session_response = client.alpha.agents.session.create(
    agent_id=agent_id,
    session_name=session_name
)

# See the response
print(f"\nüì• Session response:")
print(f"   Type: {type(session_response)}")
print(f"   Session ID: {session_response.session_id}")

session_id = session_response.session_id
print(f"\n✅ Session created!")
print(f"   Session ID: {session_id}")
print(f"\n💡 Note: Sessions are different from regular conversations.")
print(f"   Agent sessions maintain agent-specific context and tool state.")


---

## Part 4: Executing a Turn

Now let's give the agent a task and see it work! A **turn** is one interaction with the agent.

**What happens:**
1. We send a message to the agent
2. The agent reasons about what to do
3. The agent may call tools
4. The agent responds with results
5. We see the streaming response in real-time

Let's watch it happen!


In [None]:
# Step 1: Prepare the task
task = "Check the status of all services"
print("=" * 60)
print("Executing Agent Turn")
print("=" * 60)
print(f"\nüìã Task: {task}")

# Step 2: Prepare messages
# Messages are how we communicate with the agent
messages = [
    {
        "role": "user",
        "content": task
    }
]

print(f"\nüì§ Sending messages:")
print(json.dumps(messages, indent=2))


In [None]:
# Step 3: Create a turn and process streaming response
# This is where the magic happens - the agent reasons and acts!
print(f"\nüîÑ Creating turn...")
print(f"   Agent ID: {agent_id}")
print(f"   Session ID: {session_id}")
print(f"   Streaming: True (we'll see responses in real-time)")

# Create turn - this returns a stream of events
turn_stream = client.alpha.agents.turn.create(
    agent_id=agent_id,
    session_id=session_id,
    messages=messages,
    stream=True  # We want to see the response as it's generated
)

print(f"\nüì• Streaming response (showing first few chunks):")
print("=" * 60)

# Process the stream - this is what the agent is doing!
result = ""
turn_id = None
chunk_count = 0

for chunk in turn_stream:
    chunk_count += 1
    
    # Show first few chunks to understand the structure
    if chunk_count <= 3:
        print(f"\n[Chunk {chunk_count}]")
        print(f"  Type: {type(chunk).__name__}")
        if hasattr(chunk, 'event') and chunk.event:
            event = chunk.event
            if hasattr(event, 'payload') and event.payload:
                payload = event.payload
                # Try to convert to dict to see structure
                if hasattr(payload, 'model_dump'):
                    payload_dict = payload.model_dump()
                    print(f"  Event type: {payload_dict.get('event_type', 'N/A')}")
                    if 'delta' in payload_dict:
                        delta = payload_dict['delta']
                        # Check the same key we read, so a delta without 'text' cannot raise KeyError
                        if isinstance(delta, dict) and delta.get('text'):
                            print(f"  Text: {delta['text'][:100]}...")
    
    # Extract content from chunks
    if hasattr(chunk, 'event') and chunk.event:
        event = chunk.event
        if hasattr(event, 'payload') and event.payload:
            payload = event.payload
            payload_dict = payload.model_dump() if hasattr(payload, 'model_dump') else {}
            
            # Get content from delta (already a plain dict after model_dump)
            if 'delta' in payload_dict:
                delta = payload_dict['delta']
                delta_dict = delta if isinstance(delta, dict) else {}
                if delta_dict.get('text'):
                    result += str(delta_dict['text'])
            
            # Get turn_id
            if 'turn_id' in payload_dict and not turn_id:
                turn_id = payload_dict['turn_id']
            
            # Check for completion
            if payload_dict.get('event_type') in ['turn_complete', 'turn_end', 'complete', 'done']:
                break

print(f"\n{'=' * 60}")
print(f"✅ Turn completed!")
print(f"   Total chunks processed: {chunk_count}")
print(f"   Turn ID: {turn_id}")
print(f"\n📊 Agent Response:")
print("-" * 60)
print(result)
print("-" * 60)


### Understanding the Streaming Response

**What we saw:**
- **Chunks** - Each chunk is a piece of the agent's response
- **Event structure** - Each chunk has an `event` with a `payload`
- **Delta** - The `delta` contains the actual content being generated
- **Event types** - Different events indicate different stages (start, progress, complete)

**Key insight:** The agent is reasoning and responding in real-time. We can see:
- When the agent starts thinking
- When the agent generates text
- When the agent calls tools (if any)
- When the agent finishes

This streaming approach lets us see the agent's "thought process" as it happens!
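The chunk handling described above can be factored into a small helper. The sketch below assumes the same shape the loop in this notebook relies on: each chunk has `event.payload`, the payload exposes `model_dump()`, and text arrives under `delta["text"]`. The names `extract_text`, `FakePayload`, and `fake_chunk` are hypothetical, and the stub objects and event type strings are illustrative stand-ins so the example runs without a LlamaStack server.

```python
from types import SimpleNamespace


def extract_text(chunk) -> str:
    """Return the text carried by one streaming chunk, or '' if there is none."""
    event = getattr(chunk, "event", None)
    payload = getattr(event, "payload", None)
    if payload is None or not hasattr(payload, "model_dump"):
        return ""
    delta = payload.model_dump().get("delta")
    if isinstance(delta, dict) and delta.get("text"):
        return str(delta["text"])
    return ""


class FakePayload:
    """Stand-in for an SDK payload object (assumption: it exposes model_dump())."""

    def __init__(self, data: dict):
        self._data = data

    def model_dump(self) -> dict:
        return dict(self._data)


def fake_chunk(data: dict):
    """Build a stub chunk with the event.payload nesting the loop expects."""
    return SimpleNamespace(event=SimpleNamespace(payload=FakePayload(data)))


# Illustrative stream: only the two middle chunks carry text.
stream = [
    fake_chunk({"event_type": "step_start"}),
    fake_chunk({"event_type": "step_progress", "delta": {"type": "text", "text": "All services "}}),
    fake_chunk({"event_type": "step_progress", "delta": {"type": "text", "text": "are healthy."}}),
    fake_chunk({"event_type": "turn_complete"}),
]

response_text = "".join(extract_text(chunk) for chunk in stream)
print(response_text)  # All services are healthy.
```

Keeping the extraction in one function means the defensive checks live in a single place, and the same helper could be reused for every turn instead of repeating the nested `hasattr` checks.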

---

## Part 5: Testing More Complex Tasks

Let's try a more complex task that requires the agent to actually use tools!


In [None]:
# Test 2: A task that requires tool usage
task2 = "Check the status of the web-server service and restart it if it's down"

print("=" * 60)
print("Test 2: Task Requiring Tool Usage")
print("=" * 60)
print(f"\nüìã Task: {task2}")

# Create a new session for this task
session_name2 = f"session-{int(time.time())}-task2"
session_response2 = client.alpha.agents.session.create(
    agent_id=agent_id,
    session_name=session_name2
)
session_id2 = session_response2.session_id

print(f"\nüìù Created new session: {session_id2}")

# Execute the turn
messages2 = [{"role": "user", "content": task2}]

turn_stream2 = client.alpha.agents.turn.create(
    agent_id=agent_id,
    session_id=session_id2,
    messages=messages2,
    stream=True
)

# Process the stream
result2 = ""
print(f"\nüîÑ Agent is working...")
print("=" * 60)

for chunk in turn_stream2:
    if hasattr(chunk, 'event') and chunk.event:
        event = chunk.event
        if hasattr(event, 'payload') and event.payload:
            payload = event.payload
            payload_dict = payload.model_dump() if hasattr(payload, 'model_dump') else {}
            
            # Extract content (delta is already a plain dict after model_dump)
            if 'delta' in payload_dict:
                delta = payload_dict['delta']
                delta_dict = delta if isinstance(delta, dict) else {}
                if delta_dict.get('text'):
                    content = str(delta_dict['text'])
                    result2 += content
                    # Print as it streams
                    print(content, end="", flush=True)
            
            # Check for completion
            if payload_dict.get('event_type') in ['turn_complete', 'turn_end', 'complete', 'done']:
                break

print(f"\n\n{'=' * 60}")
print(f"✅ Task completed!")
print(f"\n📊 Full Response:")
print("-" * 60)
print(result2)
print("-" * 60)


---

## Part 6: Understanding What Happened

### The Agent Execution Flow

Let's break down what just happened:

1. **Task Received**: "Check the status of the web-server service and restart it if it's down"
2. **Agent Reasoning**: The LLM analyzed the task and decided:
   - First, I need to check the service status (use `check_service_status` tool)
   - Then, if it's down, restart it (use `restart_service` tool)
3. **Tool Selection**: The agent selected appropriate tools based on the task
4. **Tool Execution**: Tools were executed (in the simulated environment)
5. **Response Generation**: The agent synthesized the results and responded

### Key Observations

- **The agent reasoned** about what to do
- **The agent selected tools** based on the task
- **The agent executed tools** and got results
- **The agent provided a summary** of what was done

This is the **autonomous agent loop** in action:
- **Observe** (read the task)
- **Think** (reason about what to do)
- **Act** (execute tools)
- **Respond** (provide results)
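
The loop above can be sketched as a toy program. This is not how LlamaStack implements agents: a hard-coded rule (`decide_next_action`) stands in for the LLM's reasoning, and a plain dictionary stands in for the simulated environment. All names here are hypothetical, apart from the tool names `check_service_status` and `restart_service` mentioned earlier.

```python
# Toy observe-think-act-respond loop. A hard-coded rule stands in for the LLM,
# and a plain dict stands in for the simulated environment (both assumptions).
services = {"web-server": "down", "database": "running"}


def check_service_status(name: str) -> str:
    return services[name]


def restart_service(name: str) -> str:
    services[name] = "running"
    return f"{name} restarted"


TOOLS = {
    "check_service_status": check_service_status,
    "restart_service": restart_service,
}


def decide_next_action(target: str, observations: list):
    """Stand-in for LLM reasoning: pick the next tool call, or None when done."""
    if not observations:
        return ("check_service_status", target)
    last_tool, last_result = observations[-1]
    if last_tool == "check_service_status" and last_result == "down":
        return ("restart_service", target)
    return None


def run_agent(target: str) -> str:
    observations = []                                      # Observe: what we know so far
    while True:
        action = decide_next_action(target, observations)  # Think: choose the next step
        if action is None:
            break
        tool_name, arg = action
        result = TOOLS[tool_name](arg)                     # Act: execute the chosen tool
        observations.append((tool_name, result))
    steps = ", ".join(f"{tool} -> {result}" for tool, result in observations)
    return f"Done: {steps}"                                # Respond: summarize the results


summary = run_agent("web-server")
print(summary)
# Done: check_service_status -> down, restart_service -> web-server restarted
```

In a real agent, the decision step is made by the model from the tool descriptions rather than by fixed rules, but the control flow has the same shape.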

---

## Summary

### What We Learned

1. **Tool Format**: Tools use OpenAI function calling format with `type`, `function`, `name`, `description`, `parameters`
2. **Agent Creation**: Use `client.alpha.agents.create()` with `agent_config` containing model, instructions, and tools
3. **Session Creation**: Use `client.alpha.agents.session.create()` to create conversation sessions
4. **Turn Execution**: Use `client.alpha.agents.turn.create()` with `stream=True` to see real-time responses
5. **Streaming Processing**: Process chunks to extract text from `event.payload.delta.text`

### Key Takeaways

✅ We saw the **actual SDK calls** and responses  
✅ We understood the **exact format** for tools  
✅ We saw **raw streaming responses** from LlamaStack  
✅ We learned how to **process agent outputs** step by step  
✅ We saw the agent **reason and act** autonomously  

### Next Steps

- In Notebook 03, we'll explore LlamaStack's other features (RAG, MCP, Safety, Eval)
- In Notebook 04, we'll see how to combine features for advanced agents
- Later, we'll see how abstractions can simplify this workflow (but you'll understand what's happening underneath!)
