# Notebook 02: Building a Simple Agent with Tools

## 🎯 What is This Notebook About?

In this notebook, we'll build autonomous agents with tools using LlamaStack. We'll start with a custom Wikipedia search tool and then build more complex custom tools.

**What we'll learn:**
1. How to create a custom Wikipedia search tool
2. How to keep conversation context in a messages list and execute turns
3. How to detect and inspect tool calls in the model's JSON output
4. How to create custom (client-side) tools
5. How to use custom tools with agents

---

## ⚙️ Prerequisites

- LlamaStack server running (see Module README)
- Python environment with dependencies installed
- `requests` library (for Wikipedia API calls)
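
Before running the cells below, it can help to confirm that the dependencies are importable. The following is a minimal sketch, assuming the import names `llama_stack_client`, `requests`, and `termcolor` (taken from the imports used in this notebook); `missing_packages` is a hypothetical helper, not part of LlamaStack.

```python
import importlib.util

def missing_packages(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names assumed from the import statements used in this notebook
required = ["llama_stack_client", "requests", "termcolor"]
missing = missing_packages(required)
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable")
```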


In [None]:
# Import required libraries
import os
from llama_stack_client import LlamaStackClient, AgentEventLogger
from termcolor import cprint

# ============================================================================
# Configuration - Update these values for your OpenShift deployment
# ============================================================================

# LlamaStack URL - Get from OpenShift route or set manually
# Option 1: Use OpenShift route (external access)
#   Get route URL: oc get route llamastack-route -n my-first-model -o jsonpath='{.spec.host}'
#   Example: llamastack_url = "https://llamastack-route-my-first-model.apps.ocp.example.com"
llamastack_url = os.getenv("LLAMA_STACK_URL", "https://llamastack-route-my-first-model.apps.ocp.5pndc.sandbox5432.opentlc.com")

# Option 2: Use service URL (if running inside cluster)
#   llamastack_url = "http://lsd-llama-milvus-inline-service.my-first-model.svc.cluster.local:8321"

# Option 3: Use localhost (if port-forwarding)
#   llamastack_url = "http://localhost:8321"

# Model identifier - Use the full identifier from LlamaStack
model = os.getenv("LLAMA_MODEL", "vllm-inference/llama-32-3b-instruct")

# ============================================================================

print(f"📡 LlamaStack URL: {llamastack_url}")
print(f"🤖 Model: {model}")

# Initialize LlamaStack client
client = LlamaStackClient(base_url=llamastack_url)

# Verify connection
try:
    models = client.models.list()
    model_count = len(models.data) if hasattr(models, 'data') else len(models)
    print("\n✅ Connected to LlamaStack")
    print(f"   Available models: {model_count}")
except Exception as e:
    print(f"\n❌ Cannot connect to LlamaStack: {e}")
    print("\n💡 Troubleshooting:")
    print("   1. Check if route exists: oc get route llamastack-route -n my-first-model")
    print("   2. Update llamastack_url variable above with your route URL")
    print("   3. Or set LLAMA_STACK_URL environment variable:")
    print("      export LLAMA_STACK_URL='https://<route-host>'")
    print("   4. Or use service URL if running in cluster:")
    print("      llamastack_url = 'http://lsd-llama-milvus-inline-service.my-first-model.svc.cluster.local:8321'")
    raise


---

## Part 1: Create an Agent with Custom Wikipedia Search

**What we're doing:** Creating a **custom Wikipedia search tool** and giving it to an agent.

**Why:** Wikipedia is perfect for learning because:
- ✅ No API key required (free and easy!)
- ✅ Runs client-side (in your Python process, with no extra services to deploy)
- ✅ Shows how to create custom tools (the pattern you'll use for everything)
- ✅ Perfect for factual information and general knowledge

**The fun part:** We'll create a `wikipedia_search` function, describe it to the model in the system prompt (mirroring the function's docstring), and parse the model's JSON reply to decide when to call it. No tool-calling support is needed on the server: a well-described function and a little parsing code are enough.
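
Because the system-prompt entry mirrors the docstring, the entry can be generated instead of written by hand. The sketch below uses the standard-library `inspect` module to do that; `describe_tool` and `word_count` are hypothetical names for illustration, and the parser assumes each `:param` entry fits on a single line.

```python
import inspect

def describe_tool(func):
    """Build a system-prompt entry for a tool from its signature and docstring."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "No description available."
    summary_lines = []
    param_lines = []
    for line in doc.splitlines():
        line = line.strip()
        if line.startswith(":param "):
            # ":param query: text" becomes "    - query: text"
            name, _, text = line[len(":param "):].partition(":")
            param_lines.append(f"    - {name.strip()}: {text.strip()}")
        elif line and not line.startswith(":return"):
            summary_lines.append(line)
    entry = f"- {func.__name__}{signature}: {' '.join(summary_lines)}"
    if param_lines:
        entry += "\n  Parameters:\n" + "\n".join(param_lines)
    return entry


def word_count(text: str, unique: bool = False) -> str:
    """
    Count the words in a piece of text.

    :param text: The text to analyze
    :param unique: Count each distinct word once
    :return: The count as a string
    """
    words = text.split()
    return str(len(set(words)) if unique else len(words))


print(describe_tool(word_count))
```

Generating the entry this way keeps the prompt and the docstring from drifting apart when a tool changes.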


In [None]:
# Step 1: Create a custom Wikipedia search tool
# Wikipedia doesn't require an API key - perfect for learning!
print("=" * 60)
print("Creating Custom Wikipedia Search Tool")
print("=" * 60)

# Import requests for API calls
import requests
import inspect
import json
import re

def wikipedia_search(query: str, max_results: int = 3) -> str:
    """
    Search Wikipedia for information.
    
    This tool searches Wikipedia for factual information. Use it when you need
    general knowledge, definitions, or information about well-known topics.
    
    :param query: The search query to look up (e.g., "Python programming", "Artificial Intelligence")
    :param max_results: Maximum number of results to return (default: 3)
    :return: A formatted string with Wikipedia search results
    """
    try:
        from urllib.parse import quote

        max_results = int(max_results)  # The model may pass the number as a string
        # Wikimedia asks API clients to send a descriptive User-Agent; this value is a placeholder
        headers = {"User-Agent": "llamastack-notebook-02/0.1 (educational example)"}
        summary_base = "https://en.wikipedia.org/api/rest_v1/page/summary/"

        # First, try the search API (more flexible for general queries)
        search_url = "https://en.wikipedia.org/w/rest.php/v1/search/page"
        params = {"q": query, "limit": max_results}
        search_response = requests.get(search_url, params=params, headers=headers, timeout=5)

        if search_response.status_code == 200:
            results = search_response.json().get("pages", [])
            if results:
                formatted = f"Wikipedia search results for '{query}':\n\n"
                for j, page in enumerate(results[:max_results], 1):
                    title = page.get("title", "No title")
                    # Search excerpts carry HTML highlight tags; strip them
                    text = re.sub(r"<[^>]+>", "", page.get("excerpt") or "No excerpt")[:300]
                    # Try to get the full summary for the first result
                    if j == 1:
                        try:
                            page_url = summary_base + quote(title.replace(" ", "_"), safe="")
                            page_response = requests.get(page_url, headers=headers, timeout=5)
                            if page_response.status_code == 200:
                                text = page_response.json().get("extract", text)[:400]
                        except (requests.RequestException, ValueError):
                            pass  # Keep the search excerpt if the summary lookup fails
                    formatted += f"{j}. {title}\n"
                    formatted += f"   {text}...\n\n"
                return formatted.strip()

        # Fallback: try direct page access
        page_url = summary_base + quote(query.replace(" ", "_"), safe="")
        response = requests.get(page_url, headers=headers, timeout=5)

        if response.status_code == 200:
            data = response.json()
            title = data.get("title", query)
            extract = data.get("extract", "No summary available")
            page_url_desktop = data.get("content_urls", {}).get("desktop", {}).get("page", "")

            result = f"Wikipedia: {title}\n"
            result += f"URL: {page_url_desktop}\n"
            result += f"Summary: {extract[:500]}...\n"
            return result

        return f"No Wikipedia results found for: {query}"

    except Exception as e:
        return f"Error searching Wikipedia: {str(e)}"

# Helper function to parse JSON tool calls from response
def parse_tool_call(content):
    """Return the first JSON object in the response that has a "tool_call" key, or None."""
    decoder = json.JSONDecoder()
    # Try decoding at every "{" so surrounding prose and nested objects are handled
    start_idx = content.find("{")
    while start_idx != -1:
        try:
            candidate, _ = decoder.raw_decode(content, start_idx)
        except json.JSONDecodeError:
            candidate = None
        if isinstance(candidate, dict) and "tool_call" in candidate:
            return candidate
        start_idx = content.find("{", start_idx + 1)

    return None

# Store the function for client-side execution
wikipedia_search_func = wikipedia_search
tool_registry = {"wikipedia_search": wikipedia_search_func}

# Create system prompt describing the tool
wikipedia_tool_description = """- wikipedia_search(query: str, max_results: int = 3): Search Wikipedia for information. 
  Use this tool when you need general knowledge, definitions, or information about well-known topics.
  Parameters:
    - query (string): The search query to look up (e.g., "Python programming", "Artificial Intelligence")
    - max_results (integer): Maximum number of results to return (default: 3)"""

system_prompt = f"""You are a helpful assistant with access to tools.

When you need to use a tool, respond ONLY with valid JSON in this exact format:
{{
  "tool_call": "function_name",
  "arguments": {{"param1": "value1", "param2": "value2"}}
}}

Available tools:
{wikipedia_tool_description}

IMPORTANT: 
- If you need to use a tool, respond with ONLY the JSON object, nothing else.
- Do not include any text before or after the JSON.
- Use the exact function name as shown above."""

print("\n✅ Custom wikipedia_search tool created!")
print("   Provider: Wikipedia (no API key required)")
print("   Function: wikipedia_search(query: str, max_results: int = 3)")
print("\n✅ System prompt created for prompt-based tool calling")
print("   Note: we assume this vLLM deployment was not started with tool calling enabled, so we use prompts instead")


In [None]:
# Step 2: Create helper function for executing turns
# This function handles tool detection and execution

def execute_turn(task, messages, tool_registry, max_tokens=200):
    """
    Execute a turn with the agent, detecting and executing tool calls.
    
    Returns True if a tool was used, False otherwise.
    """
    print(f"\n📋 Task: {task}")
    print(f"\n🔄 Agent execution (using prompt-based tool calling):")
    print("-" * 60)
    
    cprint(f"User> {task}", "green")
    
    # Add user message to conversation
    messages.append({
        "role": "user",
        "content": task,
    })
    
    # Create chat completion with streaming
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        max_tokens=max_tokens,
        temperature=0.1  # Lower temperature for more consistent JSON output
    )
    
    # Process the stream
    full_response = ""
    print("\n🤖 Agent response:")
    print("-" * 60)
    for chunk in stream:
        if chunk.choices and len(chunk.choices) > 0:
            delta = chunk.choices[0].delta
            if delta.content:
                content = delta.content
                print(content, end="", flush=True)
                full_response += content
    
    print("\n\n" + "-" * 60)
    
    # Check if response contains a tool call
    tool_call_data = parse_tool_call(full_response)
    tool_calls_found = False
    
    if tool_call_data and "tool_call" in tool_call_data:
        tool_calls_found = True
        func_name = tool_call_data["tool_call"]
        func_args = tool_call_data.get("arguments", {})
        
        print(f"\n🔧 Tool call detected: {func_name}")
        print(f"   Arguments: {func_args}")
        
        # Execute the tool
        if func_name in tool_registry:
            print(f"\n   → Executing {func_name}...")
            try:
                result = tool_registry[func_name](**func_args)
                print(f"   ✅ Tool result received ({len(result)} chars)")
                
                # Add assistant message with tool call
                messages.append({
                    "role": "assistant",
                    "content": full_response
                })
                
                # Add tool result to conversation
                messages.append({
                    "role": "user",
                    "content": f"Tool result: {result}. Now provide a natural language response based on this information."
                })
                
                # Get final response with tool results
                print("\n🤖 Final response with tool results:")
                print("-" * 60)
                final_response = client.chat.completions.create(
                    model=model,
                    messages=messages,
                    stream=True,
                    max_tokens=300
                )
                
                final_text = ""
                for chunk in final_response:
                    if chunk.choices and len(chunk.choices) > 0:
                        delta = chunk.choices[0].delta
                        if delta.content:
                            content = delta.content
                            print(content, end="", flush=True)
                            final_text += content
                
                # Add final assistant response to messages
                messages.append({
                    "role": "assistant",
                    "content": final_text
                })
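            except Exception as e:
                print(f"   ❌ Error executing tool: {e}")
                tool_calls_found = False
        else:
            print(f"   ⚠️  Tool '{func_name}' not found in registry")
            # Record the reply so the history keeps alternating user/assistant roles
            messages.append({"role": "assistant", "content": full_response})
            tool_calls_found = False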
    else:
        # No tool call detected - model answered directly
        messages.append({
            "role": "assistant",
            "content": full_response
        })
        print("\n💡 Model responded directly without using tools")
    
    print(f"\n\n{'=' * 60}")
    if tool_calls_found:
        print("✅ Turn completed! (Tools were used)")
    else:
        print("⚠️  Turn completed, but no tool calls detected")
    print(f"{'=' * 60}")
    
    return tool_calls_found

# Step 3: Initialize conversation
# We'll use a messages list to maintain conversation context
print("\n" + "=" * 60)
print("Initializing Conversation")
print("=" * 60)

# Initialize messages list with system prompt
messages = [
    {"role": "system", "content": system_prompt}
]

print("\n✅ Conversation initialized!")
print("\n💡 We'll use a messages list to maintain conversation history and context.")
print("   The system prompt describes available tools and how to use them.")


In [None]:
# Step 4: Execute a turn (give the agent a task)
# A turn is one interaction with the agent
print("\n" + "=" * 60)
print("Executing Turn with Task")
print("=" * 60)

task = "What is Python programming? Use wikipedia_search to find information."

# Use the execute_turn helper function
tool_used = execute_turn(task, messages, tool_registry)

# Inspect the turn - check if tools were called
print("\n" + "=" * 60)
print("Inspecting Turn Response")
print("=" * 60)

print("\n✅ Turn completed!")
print("\n💡 Tip: Look for these indicators in the output above:")
print("   🔧 = Tool call detected")
print("   → = Tool being executed")
print("   ✅ = Tool result received")
print("\nIf you see 🔧, the tool was used successfully!")


---

## Part 2: Create Custom IT Operations Tools

**What we're doing:** Creating **custom IT operations tools** - Python functions that agents can use to manage infrastructure.

**Why:** This is where it gets real! We'll build tools that check service status, restart services, and get system overviews - the kind of tools you'd use in production.

**Key Points:**
- Custom tools are Python functions with **docstrings** that document what each tool does
- The system prompt describes each tool to the LLM (what it does, when to use it, what parameters it needs), mirroring the docstring
- Tools are registered in a plain dictionary that maps tool names to functions (no complex configuration)
- The `execute_turn` helper parses the model's JSON reply and calls the matching function

**The fun part:** You'll see the agent reason about which tool to use. It might check status first, then decide to restart a service if needed. Watch it think!
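
A check-then-restart sequence needs more than one tool call in a single task, so the tool result has to be fed back to the model until it stops asking for tools. The sketch below shows one way to structure that loop. It is an illustration under stated assumptions, not the notebook's helper: `run_tool_loop`, `demo_tools`, and the scripted replies are hypothetical, and the scripted iterator stands in for the model so the example runs without a server.

```python
import json

def run_tool_loop(ask_model, tools, messages, max_steps=5):
    """Call tools until the model replies with something other than a tool call."""
    for _ in range(max_steps):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # Plain text: treat it as the final answer
        name = call.get("tool_call") if isinstance(call, dict) else None
        if not isinstance(name, str) or name not in tools:
            return reply
        result = tools[name](**call.get("arguments", {}))
        # Feed the tool output back so the model can decide the next step
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped: step limit reached"


# Scripted stand-in for the model (hypothetical replies)
scripted_replies = iter([
    '{"tool_call": "check_status", "arguments": {"name": "cache"}}',
    '{"tool_call": "restart", "arguments": {"name": "cache"}}',
    "The cache was degraded, so I restarted it.",
])
state = {"cache": "degraded"}

def check_status(name):
    return f"{name} is {state[name]}"

def restart(name):
    state[name] = "online"
    return f"{name} restarted"

demo_tools = {"check_status": check_status, "restart": restart}
history = [{"role": "user", "content": "Fix the cache if needed"}]
answer = run_tool_loop(lambda msgs: next(scripted_replies), demo_tools, history)
print(answer)
```

Against a live server, `ask_model` could wrap the same `client.chat.completions.create` call used elsewhere in this notebook. The `max_steps` limit keeps a confused model from looping forever.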

In [None]:
# Step 1: Create a simple simulated IT environment
class SimpleITEnvironment:
    """Simple simulated IT environment for our custom tools."""
    def __init__(self):
        self.services = {
            "web-server": {"status": "online", "cpu": 45, "memory": 60},
            "database": {"status": "online", "cpu": 30, "memory": 50},
            "cache-service": {"status": "degraded", "cpu": 85, "memory": 90},
        }
    
    def get_service_status(self, service_name: str) -> dict:
        """Get service status."""
        return self.services.get(service_name, {"status": "not_found"})
    
    def restart_service(self, service_name: str) -> str:
        """Restart a service."""
        if service_name in self.services:
            self.services[service_name]["status"] = "online"
            self.services[service_name]["cpu"] = 20
            self.services[service_name]["memory"] = 30
            return f"Service {service_name} restarted successfully"
        return f"Service {service_name} not found"


# Initialize environment
env = SimpleITEnvironment()
print("✅ Created simulated IT environment")
print(f"   Services: {list(env.services.keys())}")

In [None]:
# Step 2: Define custom tools as Python functions with docstrings
# The docstrings are CRITICAL - the LLM uses them to understand the tools!

def check_service_status(service_name: str) -> str:
    """
    Check the status of an IT service.
    
    Returns information about service health including CPU usage, memory usage, and current status.
    Use this to monitor service health and detect issues.
    
    :param service_name: The name of the service to check (e.g., 'web-server', 'database', 'cache-service')
    :return: A string describing the service status, CPU usage, and memory usage
    """
    status = env.get_service_status(service_name)
    if status.get("status") == "not_found":
        return f"Service '{service_name}' not found"
    
    return (
        f"Service: {service_name}\n"
        f"Status: {status['status']}\n"
        f"CPU Usage: {status['cpu']}%\n"
        f"Memory Usage: {status['memory']}%"
    )


def restart_service(service_name: str) -> str:
    """
    Restart an IT service that is not working properly.
    
    Use this when a service has failed or is in a degraded state.
    This will stop and start the service, which may cause brief downtime.
    
    :param service_name: The name of the service to restart (e.g., 'web-server', 'database')
    :return: A string confirming the restart operation
    """
    return env.restart_service(service_name)


def get_all_services() -> str:
    """
    Get the status of all services in the environment.
    
    Use this to get an overview of the entire system health.
    Returns a list of all services with their current status and metrics.
    
    :return: A string listing all services and their status
    """
    result = "All Services Status:\n"
    for name, status in env.services.items():
        result += f"\n{name}:\n"
        result += f"  Status: {status['status']}\n"
        result += f"  CPU: {status['cpu']}%\n"
        result += f"  Memory: {status['memory']}%\n"
    return result


print("\n✅ Custom tools defined:")
print("   - check_service_status(service_name: str)")
print("   - restart_service(service_name: str)")
print("   - get_all_services()")
print("\n💡 Key Points:")
print("   - Tools are Python functions with docstrings")
print("   - Docstrings describe what the tool does")
print("   - :param describes parameters")
print("   - The LLM reads these docstrings to understand when to use each tool")
print("\n⚠️  Note: These functions use the 'env' variable defined in the previous cell.")
print("   Make sure to run the previous cell first!")

In [None]:
# Step 3: Prepare custom tools for prompt-based calling
# The tool functions already read the module-level `env`, so no extra
# binding is needed and they can be registered directly.

# Map tool names to functions for client-side execution
it_tool_registry = {
    "check_service_status": check_service_status,
    "restart_service": restart_service,
    "get_all_services": get_all_services,
}

# Create tool descriptions for system prompt
it_tool_descriptions = """
- check_service_status(service_name: str): Check the status of an IT service. 
  Returns information about service health including CPU usage, memory usage, and current status.
  Use this to monitor service health and detect issues.
  Parameters:
    - service_name (string): The name of the service to check (e.g., 'web-server', 'database', 'cache-service')

- restart_service(service_name: str): Restart an IT service that is not working properly.
  Use this when a service has failed or is in a degraded state.
  Parameters:
    - service_name (string): The name of the service to restart (e.g., 'web-server', 'database')

- get_all_services(): Get the status of all services in the environment.
  Use this to get an overview of the entire system health.
  No parameters required."""

it_system_prompt = f"""You are an IT operations agent with access to tools.

When you need to use a tool, respond ONLY with valid JSON in this exact format:
{{
  "tool_call": "function_name",
  "arguments": {{"param1": "value1"}}
}}

Available tools:
{it_tool_descriptions}

IMPORTANT: 
- If you need to use a tool, respond with ONLY the JSON object, nothing else.
- Do not include any text before or after the JSON.
- Use the exact function name as shown above.
- Always check service status before taking actions.
- Provide clear summaries of what you did."""

print("✅ IT operations tools prepared")


In [None]:
# Step 4: Test the agent with custom tools
# Initialize messages for IT operations conversation
it_messages = [{"role": "system", "content": it_system_prompt}]

# Test tasks
test_tasks = [
    "Check the status of all services",
    "Check the status of cache-service and restart it if it's degraded",
]

print("\n📝 Testing IT Operations Agent\n")

for task in test_tasks:
    print(f"\n{'=' * 60}")
    print(f"Task: {task}")
    print("=" * 60)
    
    # Execute turn using helper function
    tool_used = execute_turn(task, it_messages, it_tool_registry, max_tokens=300)
    
    if tool_used:
        print("\n✅ Tool was used")
    else:
        print("\n💡 No tool call detected")
        print("   The model may have answered directly, or tool call format was different")
    
    print()

print("✅ All tests completed!")


---

## 🎓 Key Takeaways

**What we learned:**

1. **Custom Tools** are Python functions with docstrings, described to the model in the system prompt (no complex configuration!)
2. **Wikipedia Search** is a great example - free, no API key, perfect for learning the pattern
3. **Tool Registry** is a dictionary that maps tool names to functions - add a function and its description and go!
4. **Tool-Call Output** shows each step visually - watch the agent request a tool (🔧) and execute it (→) in real time
5. **Messages Lists** maintain conversation context - agents remember what happened before

**The big picture:**
- Tools are the agent's "hands" - they let agents take actions, not just think
- Docstrings are critical - the LLM reads them to understand when to use each tool
- Client-side tools run in your Python process - fast, secure, no external dependencies
- You can build any tool you need - Wikipedia, IT operations, databases, APIs, anything!
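
Because client-side tools run in your own process with arguments chosen by the model, it can be worth validating a call before executing it. The sketch below checks the tool name and arguments against the function signature with `inspect.signature(...).bind`; `safe_dispatch`, `echo_service`, and `demo_registry` are hypothetical names used only for illustration.

```python
import inspect

def safe_dispatch(registry, name, arguments):
    """Run a registered tool only if its name and arguments check out."""
    if not isinstance(name, str) or name not in registry:
        return f"Error: unknown tool '{name}'"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    func = registry[name]
    try:
        # bind() raises TypeError for missing or unexpected parameters
        inspect.signature(func).bind(**arguments)
    except TypeError as exc:
        return f"Error: bad arguments for '{name}': {exc}"
    return func(**arguments)


def echo_service(service_name: str) -> str:
    """Hypothetical tool used only for this demonstration."""
    return f"checked {service_name}"

demo_registry = {"echo_service": echo_service}
print(safe_dispatch(demo_registry, "echo_service", {"service_name": "database"}))
print(safe_dispatch(demo_registry, "echo_service", {"service": "database"}))
print(safe_dispatch(demo_registry, "delete_everything", {}))
```

Returning the error as a string, rather than raising, lets the message be passed back to the model so it can correct its next call.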

**For IT operations:**
- Build tools for your specific infrastructure (monitoring systems, service management, etc.)
- Watch agents reason about problems and choose the right tools
- See agents execute multi-step operations (check status → detect problem → fix it)
- Create production-ready agents that can actually manage your systems
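
For tools that change real systems, one common precaution is to gate destructive actions behind an approval check. The following is a minimal sketch of that idea, assuming an allow-list policy; `requires_approval`, `allow_listed`, `RESTARTABLE`, and `guarded_restart` are hypothetical names, and a real deployment would likely use a stronger policy such as human confirmation.

```python
import functools

def requires_approval(approve):
    """Wrap a tool so it only runs when approve(tool_name, kwargs) returns True."""
    def decorator(func):
        @functools.wraps(func)  # Keep the original name and docstring
        def wrapper(**kwargs):
            if not approve(func.__name__, kwargs):
                return f"Denied: {func.__name__} was not approved"
            return func(**kwargs)
        return wrapper
    return decorator


# Hypothetical policy: only services on this allow-list may be restarted
RESTARTABLE = {"cache-service"}

def allow_listed(tool_name, kwargs):
    return kwargs.get("service_name") in RESTARTABLE

@requires_approval(allow_listed)
def guarded_restart(service_name: str) -> str:
    """Restart a service (simulated)."""
    return f"Service {service_name} restarted"

print(guarded_restart(service_name="cache-service"))
print(guarded_restart(service_name="database"))
```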

---

## 🚀 Next Steps

**Ready for more?** In **Notebook 03**, we'll explore:
- **LlamaStack Core Features** - Chat and RAG (Retrieval Augmented Generation)
- **When to use each feature** - Understanding which tool fits which problem
- **Building more powerful agents** - Combining features for better results

**The fun part:** You'll learn how to give agents access to knowledge bases (RAG) so they can answer questions about your specific infrastructure!

---

**Ready?** Let's move to Notebook 03: LlamaStack Core Features! 🚀
