# Agentic AI and Tool Use: A Comprehensive Guide

## Learning Objectives

In this lesson, we will progressively build an AI agent system from the ground up:

1. **Basic LLM API Calls** - Understanding how to communicate with Language Models
2. **Structured Outputs** - Getting JSON responses from LLMs
3. **Function/Tool Calling** - Teaching LLMs to use external tools
4. **Agent Loops** - Creating autonomous agents that can plan and execute tasks

We'll use a real dataset (`employees.csv`) and build tools that allow our AI agent to analyze and answer questions about employee data.

---

## Setup and Imports

First, let's import all the libraries we'll need and set up our configuration.

In [1]:
import json
import requests
import pandas as pd
import os
from typing import List, Dict, Any, Optional
from pathlib import Path
import datetime

# Configuration
# Read the API key from the environment rather than hard-coding a secret in the notebook
OPENROUTER_API_KEY = os.environ.get("OPENROUTER_API_KEY", "")
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
DEFAULT_MODEL = "x-ai/grok-4-fast"  # Using Grok for better reasoning
SITE_URL = "http://localhost:8888"
SITE_NAME = "AI Agent Tutorial"

# Load our employee data
df_employees = pd.read_csv('employees.csv')
print(f"Loaded {len(df_employees)} employee records")
print(f"Columns: {df_employees.columns.tolist()}")
df_employees.head()

Loaded 320 employee records
Columns: ['First Name', 'Last Name', 'Email', 'Phone', 'Gender', 'Age', 'Job Title', 'Years Of Experience', 'Salary', 'Department']


|   | First Name | Last Name | Email | Phone | Gender | Age | Job Title | Years Of Experience | Salary | Department |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Jose | Lopez | joselopez0944@slingacademy.com | +1-971-533-4552x1542 | male | 25 | Project Manager | 1 | 8500 | Product |
| 1 | Diane | Carter | dianecarter1228@slingacademy.com | 881.633.0107 | female | 26 | Machine Learning Engineer | 2 | 7000 | Product |
| 2 | Shawn | Foster | shawnfoster2695@slingacademy.com | 001-966-861-0065x493 | male | 37 | Project Manager | 14 | 17000 | Product |
| 3 | Brenda | Fisher | brendafisher3185@slingacademy.com | 001-574-564-4648 | female | 31 | Web Developer | 8 | 10000 | Product |
| 4 | Sean | Hunter | seanhunter4753@slingacademy.com | 5838355842 | male | 35 | Project Manager | 11 | 14500 | Product |


---

# Part 1: Basic LLM API Calls

## Understanding the Request Structure

When calling an LLM API, we need:
1. **Headers** - Authentication and metadata
2. **Messages** - The conversation history (system prompt, user messages, assistant responses)
3. **Model** - Which LLM to use
4. **Parameters** - Temperature, max tokens, etc.
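
As a rough illustration of how these four parts fit together, the sketch below only assembles the headers and payload and sends nothing over the network. The helper name `build_chat_request`, the placeholder key, and the chosen values are assumptions for illustration; `temperature` and `max_tokens` are the usual OpenAI-style sampling parameters, and their exact support may vary by model.

```python
from typing import Any, Dict, List, Tuple


def build_chat_request(
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
    temperature: float = 0.2,
    max_tokens: int = 256,
) -> Tuple[Dict[str, str], Dict[str, Any]]:
    """Return (headers, payload) for an OpenAI-style chat completion request."""
    # 1. Headers: authentication and metadata
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # 2-4. Messages, model, and sampling parameters all live in the JSON body
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,  # lower values give more deterministic output
        "max_tokens": max_tokens,    # upper bound on the length of the reply
    }
    return headers, payload


headers, payload = build_chat_request(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    model="x-ai/grok-4-fast",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(sorted(payload))
```

Keeping request construction separate from the network call makes it easy to inspect exactly what will be sent before spending any tokens.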

Let's start with the simplest possible interaction.

In [2]:
def simple_llm_call(user_message: str, model: str = DEFAULT_MODEL) -> str:
    """
    Make a simple call to the LLM API and return the text response.
    
    Args:
        user_message: The question or prompt to send
        model: The model to use
    
    Returns:
        The LLM's text response
    """
    # Setup headers for authentication
    headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json",
        "HTTP-Referer": SITE_URL,
        "X-Title": SITE_NAME
    }
    
    # Create the messages array
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": user_message}
    ]
    
    # Create the payload
    payload = {
        "model": model,
        "messages": messages
    }
    
    # Make the API call
    response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    response.raise_for_status()
    
    # Extract the response text
    result = response.json()
    return result['choices'][0]['message']['content']

# Test it!
response = simple_llm_call("Explain what an AI agent is in one sentence.")
print("LLM Response:")
print(response)

LLM Response:
An AI agent is an autonomous software entity that perceives its environment, makes decisions based on that perception, and takes actions to achieve specific goals, often using machine learning or rule-based systems.


### Streaming Responses

For better user experience, we can stream responses token by token as they're generated.

In [3]:
def streaming_llm_call(user_message: str, model: str = DEFAULT_MODEL):
    """
    Make a streaming call to the LLM API and yield tokens as they arrive.
    """
    headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json",
        "HTTP-Referer": SITE_URL,
        "X-Title": SITE_NAME
    }
    
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": user_message}
    ]
    
    payload = {
        "model": model,
        "messages": messages,
        "stream": True  # Enable streaming
    }
    
    response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload,
        stream=True
    )
    response.raise_for_status()
    
    full_content = ""
    for line in response.iter_lines():
        if not line:
            continue
        
        decoded_line = line.decode('utf-8')
        if decoded_line.startswith('data: '):
            json_str = decoded_line[6:]
            if json_str.strip() == '[DONE]':
                break
            
            try:
                chunk = json.loads(json_str)
                delta = chunk.get('choices', [{}])[0].get('delta', {})
                content = delta.get('content')
                if content:  # skip chunks whose content is missing or null
                    full_content += content
                    print(content, end='', flush=True)
            except json.JSONDecodeError:
                continue
    
    print()  # New line
    return full_content

# Test streaming
print("Streaming response:")
result = streaming_llm_call("Write a haiku about artificial intelligence.")

Streaming response:
Binary heart awakens,  
Circuits dream in endless code,  
Mind sparks from the void.
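
The parsing step of a streamed response can be exercised without any network access. The sketch below feeds hand-written server-sent-event lines through a small generator; the function name `iter_sse_content` and the sample lines are made up for illustration and only mimic the general shape of OpenAI-style stream chunks.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming chunks."""
    for raw in lines:
        if not raw:
            continue  # blank lines separate events
        line = raw.decode("utf-8")
        if not line.startswith("data: "):
            continue  # e.g. comment lines that start with ':'
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            continue  # ignore malformed chunks
        choices = chunk.get("choices") or [{}]
        content = (choices[0].get("delta") or {}).get("content")
        if content:  # content may be absent, empty, or null
            yield content


# Hand-written sample lines (assumed shape, not captured from a real API)
sample_lines = [
    b": keep-alive",
    b'data: {"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    b'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": ", world"}}]}',
    b"data: [DONE]",
    b'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]

text = "".join(iter_sse_content(sample_lines))
print(text)  # Hello, world
```

Separating parsing from printing also lets the caller decide what to do with each fragment, for example updating a UI element instead of writing to the console.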


---

# Part 2: Structured Outputs (JSON)

## Why Structured Outputs?

When building applications, we often need:
- Predictable response formats
- Data we can programmatically process
- Multiple pieces of information in one response

We can instruct the LLM to respond in JSON format.

In [4]:
def llm_call_with_json(user_message: str, model: str = DEFAULT_MODEL) -> dict:
    """
    Call the LLM and request a JSON response.
    """
    headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json",
        "HTTP-Referer": SITE_URL,
        "X-Title": SITE_NAME
    }
    
    messages = [
        {
            "role": "system",
            "content": "You are a helpful AI assistant. Always respond with valid JSON."
        },
        {"role": "user", "content": user_message}
    ]
    
    payload = {
        "model": model,
        "messages": messages,
        "response_format": {"type": "json_object"}  # Request JSON format
    }
    
    response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    response.raise_for_status()
    
    result = response.json()
    content = result['choices'][0]['message']['content']
    
    # Parse the JSON response
    return json.loads(content)

# Example: Ask for structured data analysis
prompt = """
Analyze this request and return JSON with the following fields:
- intent: the user's intent (e.g., "query", "analysis", "comparison")
- entity: the main entity being asked about
- parameters: any specific parameters or filters

User request: "What's the average salary of Machine Learning Engineers?"
"""

result = llm_call_with_json(prompt)
print("Parsed JSON response:")
print(json.dumps(result, indent=2))

Parsed JSON response:
{
  "intent": "query",
  "entity": "Machine Learning Engineers",
  "parameters": {}
}
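
Even when JSON is requested, a reply can occasionally arrive wrapped in a Markdown fence or surrounded by extra prose, in which case a bare `json.loads` raises an error. The helper below is a defensive sketch under that assumption; `parse_json_reply` is a hypothetical name and its fallback heuristics are deliberately simple.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this example readable


def parse_json_reply(text: str) -> Any:
    """Parse a model reply as JSON, tolerating code fences and surrounding prose."""
    cleaned = text.strip()

    # Case 1: the whole reply is wrapped in a Markdown code fence
    fenced = re.match(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", cleaned, flags=re.DOTALL)
    if fenced:
        cleaned = fenced.group(1)

    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Case 2: fall back to the outermost {...} span, if there is one
        start, end = cleaned.find("{"), cleaned.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(cleaned[start:end + 1])


plain = parse_json_reply('{"intent": "query"}')
wrapped = parse_json_reply(FENCE + 'json\n{"intent": "query"}\n' + FENCE)
chatty = parse_json_reply('Sure! Here it is: {"intent": "query"} Hope that helps.')
print(plain == wrapped == chatty)  # True
```

If none of the fallbacks succeed, the original `json.JSONDecodeError` propagates, so the caller can still decide whether to retry the request.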


### Exercise: Parsing User Queries

Let's use structured outputs to parse different types of queries about our employee data.

In [5]:
def parse_employee_query(user_query: str) -> dict:
    """
    Parse a natural language query about employees into structured data.
    """
    prompt = f"""
Analyze this employee database query and return JSON with:
- query_type: "filter", "aggregate", "comparison", "list"
- columns: list of relevant column names from [First Name, Last Name, Email, Phone, Gender, Age, Job Title, Years Of Experience, Salary, Department]
- filters: dict of column:value pairs for filtering
- aggregation: "count", "average", "sum", "max", "min", or null
- sort_by: column to sort by, or null
- limit: number of results to return, or null

User query: "{user_query}"
"""
    
    return llm_call_with_json(prompt)

# Test with various queries
test_queries = [
    "Show me the top 5 highest paid employees",
    "What's the average salary in the Product department?",
    "How many female Machine Learning Engineers do we have?",
    "List all HR Managers"
]

for query in test_queries:
    print(f"\nQuery: {query}")
    parsed = parse_employee_query(query)
    print(json.dumps(parsed, indent=2))


Query: Show me the top 5 highest paid employees
{
  "query_type": "list",
  "columns": [
    "First Name",
    "Last Name",
    "Salary"
  ],
  "filters": {},
  "aggregation": null,
  "sort_by": "Salary",
  "limit": 5
}

Query: What's the average salary in the Product department?
{
  "query_type": "aggregate",
  "columns": [
    "Salary",
    "Department"
  ],
  "filters": {
    "Department": "Product"
  },
  "aggregation": "average",
  "sort_by": null,
  "limit": null
}

Query: How many female Machine Learning Engineers do we have?
{
  "query_type": "aggregate",
  "columns": [
    "Gender",
    "Job Title"
  ],
  "filters": {
    "Gender": "female",
    "Job Title": "Machine Learning Engineer"
  },
  "aggregation": "count",
  "sort_by": null,
  "limit": null
}

Query: List all HR Managers
{
  "query_type": "list",
  "columns": [
    "First Name",
    "Last Name",
    "Email",
    "Phone",
    "Gender",
    "Age",
    "Job Title",
    "Years Of Experience",
    "Salary",
    "Departme

---

# Part 3: Function/Tool Calling

## What is Tool Calling?

Modern LLMs can be instructed to use tools (functions) to accomplish tasks. Instead of just generating text, the LLM can:
1. Recognize when a tool is needed
2. Generate the appropriate function call with parameters
3. Wait for the function result
4. Use the result to formulate a final answer

This is the foundation of agentic AI!
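
To make the four steps concrete before any real API is involved, here is a sketch of the conversation as plain data. The assistant turn is written by hand rather than produced by a model, and the tool `get_headcount`, its hard-coded numbers, and the id `call_001` are all hypothetical.

```python
import json


def get_headcount(department: str) -> dict:
    """Hypothetical tool: look up a head count in a hard-coded table."""
    table = {"Product": 12, "Sales": 7}
    return {"department": department, "headcount": table.get(department, 0)}


messages = [{"role": "user", "content": "How many people work in Product?"}]

# Steps 1-2: instead of text, the model replies with a tool call.
# The arguments arrive as a JSON *string*, not as a dict.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_001",
            "type": "function",
            "function": {
                "name": "get_headcount",
                "arguments": json.dumps({"department": "Product"}),
            },
        }
    ],
}
messages.append(assistant_turn)

# Step 3: our code (not the model) runs the function and appends the result,
# linked to the request through tool_call_id.
for call in assistant_turn["tool_calls"]:
    args = json.loads(call["function"]["arguments"])
    result = get_headcount(**args)
    messages.append(
        {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
    )

# Step 4: this message list would be sent back so the model can write its answer.
print([m["role"] for m in messages])  # ['user', 'assistant', 'tool']
```

The key point is that the model never executes anything itself: it only emits a structured request, and the surrounding program decides whether and how to run it.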

## Step 3.1: Define Tools

First, let's create actual Python functions that can work with our employee data.

In [6]:
class EmployeeTools:
    """Tools for querying employee data."""
    
    def __init__(self, df: pd.DataFrame):
        self.df = df
    
    def search_employees(self, 
                        job_title: Optional[str] = None,
                        department: Optional[str] = None,
                        min_salary: Optional[float] = None,
                        max_salary: Optional[float] = None,
                        gender: Optional[str] = None,
                        min_age: Optional[int] = None,
                        max_age: Optional[int] = None,
                        limit: int = 10) -> Dict[str, Any]:
        """
        Search for employees based on various filters.
        
        Returns a list of matching employees with their details.
        """
        df_filtered = self.df.copy()
        
        if job_title:
            df_filtered = df_filtered[df_filtered['Job Title'].str.contains(job_title, case=False, na=False)]
        
        if department:
            df_filtered = df_filtered[df_filtered['Department'].str.contains(department, case=False, na=False)]
        
        if min_salary is not None:
            df_filtered = df_filtered[df_filtered['Salary'] >= min_salary]
        
        if max_salary is not None:
            df_filtered = df_filtered[df_filtered['Salary'] <= max_salary]
        
        if gender:
            df_filtered = df_filtered[df_filtered['Gender'].str.lower() == gender.lower()]
        
        if min_age is not None:
            df_filtered = df_filtered[df_filtered['Age'] >= min_age]
        
        if max_age is not None:
            df_filtered = df_filtered[df_filtered['Age'] <= max_age]
        
        result_df = df_filtered.head(limit)
        
        return {
            "success": True,
            "count": len(df_filtered),
            "showing": len(result_df),
            "employees": result_df.to_dict('records')
        }
    
    def calculate_statistics(self, 
                            column: str,
                            job_title: Optional[str] = None,
                            department: Optional[str] = None,
                            gender: Optional[str] = None) -> Dict[str, Any]:
        """
        Calculate statistics (mean, median, min, max, std) for a numeric column.
        Can be filtered by job title, department, or gender.
        """
        df_filtered = self.df.copy()
        
        if job_title:
            df_filtered = df_filtered[df_filtered['Job Title'].str.contains(job_title, case=False, na=False)]
        
        if department:
            df_filtered = df_filtered[df_filtered['Department'].str.contains(department, case=False, na=False)]
        
        if gender:
            df_filtered = df_filtered[df_filtered['Gender'].str.lower() == gender.lower()]
        
        if column not in df_filtered.columns:
            return {"success": False, "error": f"Column '{column}' not found"}
        
        if not pd.api.types.is_numeric_dtype(df_filtered[column]):
            return {"success": False, "error": f"Column '{column}' is not numeric"}
        
        stats = df_filtered[column].describe().to_dict()
        
        return {
            "success": True,
            "column": column,
            "count": int(stats['count']),
            "mean": round(stats['mean'], 2),
            "median": round(df_filtered[column].median(), 2),
            "std": round(stats['std'], 2),
            "min": round(stats['min'], 2),
            "max": round(stats['max'], 2),
            "filters_applied": {
                "job_title": job_title,
                "department": department,
                "gender": gender
            }
        }
    
    def count_by_category(self, category_column: str) -> Dict[str, Any]:
        """
        Count employees by a categorical column (e.g., Job Title, Department, Gender).
        """
        if category_column not in self.df.columns:
            return {"success": False, "error": f"Column '{category_column}' not found"}
        
        counts = self.df[category_column].value_counts().to_dict()
        
        return {
            "success": True,
            "column": category_column,
            "counts": counts,
            "total_categories": len(counts)
        }
    
    def get_top_earners(self, n: int = 5, department: Optional[str] = None) -> Dict[str, Any]:
        """
        Get the top N highest paid employees, optionally filtered by department.
        """
        df_filtered = self.df.copy()
        
        if department:
            df_filtered = df_filtered[df_filtered['Department'].str.contains(department, case=False, na=False)]
        
        top_earners = df_filtered.nlargest(n, 'Salary')
        
        return {
            "success": True,
            "top_earners": top_earners[['First Name', 'Last Name', 'Job Title', 'Department', 'Salary', 'Years Of Experience']].to_dict('records')
        }

# Initialize our tools
employee_tools = EmployeeTools(df_employees)

# Test the tools directly
print("Testing search_employees:")
result = employee_tools.search_employees(job_title="Machine Learning", limit=3)
print(json.dumps(result, indent=2))

print("\nTesting calculate_statistics:")
result = employee_tools.calculate_statistics("Salary", job_title="Machine Learning")
print(json.dumps(result, indent=2))

Testing search_employees:
{
  "success": true,
  "count": 24,
  "showing": 3,
  "employees": [
    {
      "First Name": "Diane",
      "Last Name": "Carter",
      "Email": "dianecarter1228@slingacademy.com",
      "Phone": "881.633.0107",
      "Gender": "female",
      "Age": 26,
      "Job Title": "Machine Learning Engineer",
      "Years Of Experience": 2,
      "Salary": 7000,
      "Department": "Product"
    },
    {
      "First Name": "Brianna",
      "Last Name": "Marshall",
      "Email": "briannamarshall6438@slingacademy.com",
      "Phone": "701-932-8553",
      "Gender": "female",
      "Age": 33,
      "Job Title": "Machine Learning Engineer",
      "Years Of Experience": 10,
      "Salary": 11000,
      "Department": "Product"
    },
    {
      "First Name": "George",
      "Last Name": "Mckenzie",
      "Email": "georgemckenzie12384@slingacademy.com",
      "Phone": "(843)416-2489",
      "Gender": "male",
      "Age": 28,
      "Job Title": "Machine Learning Enginee

## Step 3.2: Define Tool Schemas

For the LLM to use our tools, we need to describe them in a specific format (OpenAI function calling format).

In [7]:
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_employees",
            "description": "Search for employees based on various filters like job title, department, salary range, gender, age range. Returns a list of matching employees.",
            "parameters": {
                "type": "object",
                "properties": {
                    "job_title": {
                        "type": "string",
                        "description": "Filter by job title (partial match, case-insensitive)"
                    },
                    "department": {
                        "type": "string",
                        "description": "Filter by department (partial match, case-insensitive)"
                    },
                    "min_salary": {
                        "type": "number",
                        "description": "Minimum salary filter"
                    },
                    "max_salary": {
                        "type": "number",
                        "description": "Maximum salary filter"
                    },
                    "gender": {
                        "type": "string",
                        "enum": ["male", "female"],
                        "description": "Filter by gender"
                    },
                    "min_age": {
                        "type": "integer",
                        "description": "Minimum age filter"
                    },
                    "max_age": {
                        "type": "integer",
                        "description": "Maximum age filter"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of results to return",
                        "default": 10
                    }
                },
                "required": []
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate_statistics",
            "description": "Calculate statistics (mean, median, min, max, std) for a numeric column like Salary, Age, or Years Of Experience. Can be filtered by job title, department, or gender.",
            "parameters": {
                "type": "object",
                "properties": {
                    "column": {
                        "type": "string",
                        "enum": ["Salary", "Age", "Years Of Experience"],
                        "description": "The numeric column to calculate statistics for"
                    },
                    "job_title": {
                        "type": "string",
                        "description": "Filter by job title (partial match, case-insensitive)"
                    },
                    "department": {
                        "type": "string",
                        "description": "Filter by department (partial match, case-insensitive)"
                    },
                    "gender": {
                        "type": "string",
                        "enum": ["male", "female"],
                        "description": "Filter by gender"
                    }
                },
                "required": ["column"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "count_by_category",
            "description": "Count employees by a categorical column (e.g., Job Title, Department, Gender). Useful for getting distribution of employees across categories.",
            "parameters": {
                "type": "object",
                "properties": {
                    "category_column": {
                        "type": "string",
                        "enum": ["Job Title", "Department", "Gender"],
                        "description": "The categorical column to count by"
                    }
                },
                "required": ["category_column"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_top_earners",
            "description": "Get the top N highest paid employees, optionally filtered by department.",
            "parameters": {
                "type": "object",
                "properties": {
                    "n": {
                        "type": "integer",
                        "description": "Number of top earners to return",
                        "default": 5
                    },
                    "department": {
                        "type": "string",
                        "description": "Filter by department (partial match, case-insensitive)"
                    }
                },
                "required": []
            }
        }
    }
]

print("Tool schemas defined:")
for tool in TOOL_SCHEMAS:
    print(f"  - {tool['function']['name']}")

Tool schemas defined:
  - search_employees
  - calculate_statistics
  - count_by_category
  - get_top_earners
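
Because the model generates the arguments, it can be worth checking them against the schema before calling the real function. The sketch below is a hand-rolled check covering only `required`, `enum`, and basic types; `validate_arguments` and the trimmed-down `example_parameters` are illustrative names, and a full JSON Schema validator would cover far more cases.

```python
from typing import Any, Dict, List

# JSON Schema type keyword -> accepted Python types (only the ones used here)
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def _matches_type(value: Any, json_type: str) -> bool:
    expected = JSON_TYPES.get(json_type)
    if expected is None:
        return True  # unknown type keyword: do not guess
    if isinstance(value, bool) and json_type != "boolean":
        return False  # bool is a subclass of int in Python
    return isinstance(value, expected)


def validate_arguments(parameters: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (an empty list means valid)."""
    problems = []
    properties = parameters.get("properties", {})
    for name in parameters.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif not _matches_type(value, spec.get("type", "")):
            problems.append(f"argument '{name}' should be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"argument '{name}' must be one of {spec['enum']}")
    return problems


# A cut-down parameters object in the same style as the schemas above
example_parameters = {
    "type": "object",
    "properties": {
        "column": {"type": "string", "enum": ["Salary", "Age", "Years Of Experience"]},
        "gender": {"type": "string", "enum": ["male", "female"]},
    },
    "required": ["column"],
}

print(validate_arguments(example_parameters, {"column": "Salary", "gender": "female"}))
print(validate_arguments(example_parameters, {"gender": "male", "team": "x"}))
```

Returning the problems as text, rather than raising, makes it easy to pass them back to the model as a tool result so it can correct its own call.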


## Step 3.3: LLM Tool Calling

Now let's make an API call where the LLM can decide to use tools.

In [None]:
def llm_with_tools(user_message: str, model: str = DEFAULT_MODEL) -> Dict[str, Any]:
    """
    Call the LLM with tool definitions. The LLM can choose to call tools or respond directly.
    
    Returns the full API response including any tool calls.
    """
    headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json",
        "HTTP-Referer": SITE_URL,
        "X-Title": SITE_NAME
    }
    
    messages = [
        {
            "role": "system",
            "content": "You are a helpful AI assistant with access to employee database tools. Use the tools to answer questions accurately."
        },
        {"role": "user", "content": user_message}
    ]
    
    payload = {
        "model": model,
        "messages": messages,
        "tools": TOOL_SCHEMAS,  # Provide tool definitions
        "tool_choice": "auto"    # Let the LLM decide when to use tools
    }
    
    response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    response.raise_for_status()
    
    return response.json()

# Test: Ask a question that should trigger a tool call
print("Test 1: Question requiring tool use")
print("="*60)
response = llm_with_tools("What's the average salary for Web Developers?")
message = response['choices'][0]['message']

print("\nResponse:")
print(json.dumps(message, indent=2))

# Some providers return "tool_calls": null, so test for a non-empty value
if message.get('tool_calls'):
    print("\n✓ The LLM decided to use tools!")
    for tool_call in message['tool_calls']:
        print(f"  Tool: {tool_call['function']['name']}")
        print(f"  Arguments: {tool_call['function']['arguments']}")
else:
    print("\n✗ No tool calls made")
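
Besides `"auto"`, the OpenAI-style `tool_choice` field accepts `"none"`, `"required"`, or an object naming one specific function. The helper below sketches how such a value could be built; `make_tool_choice` is a hypothetical name, and support for each mode may vary between models and providers.

```python
from typing import Any, Dict, Optional, Union


def make_tool_choice(
    mode: str = "auto", function_name: Optional[str] = None
) -> Union[str, Dict[str, Any]]:
    """Build a value for the `tool_choice` field of a chat completion payload."""
    if function_name is not None:
        # Force the model to call this particular function
        return {"type": "function", "function": {"name": function_name}}
    if mode not in {"auto", "none", "required"}:
        raise ValueError(f"Unsupported tool_choice mode: {mode}")
    return mode


print(make_tool_choice())                                      # auto
print(make_tool_choice("required"))                            # required
print(make_tool_choice(function_name="calculate_statistics"))
```

Forcing a specific function can be handy in tests, where the goal is to check argument generation rather than the model's decision about whether a tool is needed.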

## Step 3.4: Execute Tool Calls

When the LLM returns a tool call, we need to:
1. Parse the tool call
2. Execute the actual function
3. Return the result to the LLM
4. Let the LLM generate a final response

In [None]:
# Map function names to actual methods
AVAILABLE_FUNCTIONS = {
    "search_employees": employee_tools.search_employees,
    "calculate_statistics": employee_tools.calculate_statistics,
    "count_by_category": employee_tools.count_by_category,
    "get_top_earners": employee_tools.get_top_earners
}

def execute_tool_call(tool_call: Dict[str, Any]) -> str:
    """
    Execute a tool call and return the result as a JSON string.
    """
    function_name = tool_call['function']['name']
    
    if function_name not in AVAILABLE_FUNCTIONS:
        return json.dumps({"error": f"Unknown function: {function_name}"})
    
    try:
        # Arguments arrive as a JSON string and may be malformed, so parse inside the try
        function_args = json.loads(tool_call['function']['arguments'])
        print(f"\n🔧 Executing: {function_name}({json.dumps(function_args)})")
        result = AVAILABLE_FUNCTIONS[function_name](**function_args)
        print("✓ Tool execution successful")
        return json.dumps(result)
    except Exception as e:
        print(f"✗ Tool execution failed: {e}")
        return json.dumps({"error": str(e)})

def llm_with_tool_execution(user_message: str, model: str = DEFAULT_MODEL) -> str:
    """
    Complete interaction: LLM decides to use tools, we execute them, LLM generates final response.
    """
    headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json",
        "HTTP-Referer": SITE_URL,
        "X-Title": SITE_NAME
    }
    
    messages = [
        {
            "role": "system",
            "content": "You are a helpful AI assistant with access to employee database tools. Use the tools to answer questions accurately. Provide clear, conversational responses."
        },
        {"role": "user", "content": user_message}
    ]
    
    # First call: LLM decides what to do
    payload = {
        "model": model,
        "messages": messages,
        "tools": TOOL_SCHEMAS,
        "tool_choice": "auto"
    }
    
    response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    response.raise_for_status()
    
    response_message = response.json()['choices'][0]['message']
    messages.append(response_message)
    
    # Check if tools were called
    if not response_message.get('tool_calls'):
        # No tools needed (key missing, null, or empty), return direct response
        return response_message.get('content') or ''
    
    # Execute all tool calls
    print("\n" + "="*60)
    print("TOOL EXECUTION PHASE")
    print("="*60)
    
    for tool_call in response_message['tool_calls']:
        function_result = execute_tool_call(tool_call)
        
        # Add tool result to conversation
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call['id'],
            "name": tool_call['function']['name'],
            "content": function_result
        })
    
    # Second call: LLM generates final response based on tool results
    print("\n" + "="*60)
    print("FINAL RESPONSE GENERATION")
    print("="*60 + "\n")
    
    payload = {
        "model": model,
        "messages": messages
    }
    
    final_response = requests.post(
        f"{OPENROUTER_BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    final_response.raise_for_status()
    
    return final_response.json()['choices'][0]['message']['content']

# Test with various questions
test_questions = [
    "What's the average salary for Machine Learning Engineers?",
    "Who are the top 3 highest paid employees?",
    "How many employees do we have in each department?"
]

for question in test_questions:
    print("\n" + "#"*60)
    print(f"Question: {question}")
    print("#"*60)
    
    answer = llm_with_tool_execution(question)
    print(f"\nFinal Answer: {answer}")
    print()

---

# Part 4: Building an Agent Loop

## What Makes it an Agent?

An **agent** is different from simple tool calling:
- It can make **multiple tool calls** in sequence
- It can **reason** about what to do next based on previous results
- It can **plan** multi-step solutions
- It has **autonomy** to decide when it's done

Let's build a proper agent!
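
Before wiring in a real model, the control flow of an agent loop can be sketched with a scripted stand-in. Everything below is illustrative: `run_agent_loop`, `scripted_model`, and the `add` tool are hypothetical, and the stand-in simply returns a tool call first and a final answer once a tool result is present.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_agent_loop(
    call_model: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[..., Any]],
    user_message: str,
    max_iterations: int = 5,
) -> str:
    """Call the model repeatedly, executing tool calls until it answers in text."""
    messages: List[Message] = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            return reply.get("content") or ""  # no tool calls: the agent is done
        for call in tool_calls:
            try:
                args = json.loads(call["function"]["arguments"])
                result = tools[call["function"]["name"]](**args)
            except Exception as exc:  # report failures back to the model
                result = {"error": str(exc)}
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
            )
    return "Stopped: iteration limit reached before a final answer."


def scripted_model(messages: List[Message]) -> Message:
    """Stand-in for an LLM: request a tool first, then answer from its result."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        call = {
            "id": "call_1",
            "type": "function",
            "function": {"name": "add", "arguments": json.dumps({"a": 2, "b": 3})},
        }
        return {"role": "assistant", "content": None, "tool_calls": [call]}
    total = json.loads(tool_results[-1]["content"])["result"]
    return {"role": "assistant", "content": f"The sum is {total}."}


tools = {"add": lambda a, b: {"result": a + b}}
answer = run_agent_loop(scripted_model, tools, "What is 2 + 3?")
print(answer)  # The sum is 5.
```

The iteration cap matters: a model that keeps requesting tools would otherwise loop forever, so the sketch returns a fixed message once `max_iterations` is exhausted.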

In [None]:
class EmployeeAgent:
    """An autonomous agent for employee data analysis."""
    
    def __init__(self, df: pd.DataFrame, model: str = DEFAULT_MODEL, max_iterations: int = 10, verbose: bool = True):
        self.df = df
        self.model = model
        self.max_iterations = max_iterations
        self.verbose = verbose
        self.tools = EmployeeTools(df)
        
        self.headers = {
            "Authorization": f"Bearer {OPENROUTER_API_KEY}",
            "Content-Type": "application/json",
            "HTTP-Referer": SITE_URL,
            "X-Title": SITE_NAME
        }
        
        self.available_functions = {
            "search_employees": self.tools.search_employees,
            "calculate_statistics": self.tools.calculate_statistics,
            "count_by_category": self.tools.count_by_category,
            "get_top_earners": self.tools.get_top_earners
        }
    
    def run(self, user_message: str) -> str:
        """
        Run the agent loop to answer a user query.
        
        The agent will:
        1. Receive the user's question
        2. Decide which tools to use (if any)
        3. Execute the tools
        4. Analyze the results
        5. Decide if more information is needed (loop back to step 2)
        6. Generate a final answer
        """
        current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        
        messages = [
            {
                "role": "system",
                "content": f"""You are an AI agent with access to an employee database. Current time: {current_time}

Your job is to help answer questions about employees by using the available tools.

You can use tools multiple times and in sequence to gather all needed information.
Think step by step about what information you need.
When you have all the information needed to answer the question, provide a clear, conversational response.

Database columns: First Name, Last Name, Email, Phone, Gender, Age, Job Title, Years Of Experience, Salary, Department"""
            },
            {"role": "user", "content": user_message}
        ]
        
        if self.verbose:
            print("\n" + "="*80)
            print("🤖 AGENT STARTED")
            print("="*80)
            print(f"Query: {user_message}")
            print("="*80)
        
        iteration = 0
        
        while iteration < self.max_iterations:
            iteration += 1
            
            if self.verbose:
                print(f"\n🔄 Iteration {iteration}/{self.max_iterations}")
                print("-" * 80)
            
            # Call LLM
            payload = {
                "model": self.model,
                "messages": messages,
                "tools": TOOL_SCHEMAS,
                "tool_choice": "auto"
            }
            
            try:
                response = requests.post(
                    f"{OPENROUTER_BASE_URL}/chat/completions",
                    headers=self.headers,
                    json=payload
                )
                response.raise_for_status()
                response_message = response.json()['choices'][0]['message']
                messages.append(response_message)
                
            except Exception as e:
                error_msg = f"API Error: {e}"
                if self.verbose:
                    print(f"❌ {error_msg}")
                return error_msg
            
            # Check if agent wants to use tools
            if not response_message.get('tool_calls'):
                # Agent is done (no tool calls requested), return final answer
                final_answer = response_message.get('content') or ''
                
                if self.verbose:
                    print(f"\n✅ Agent completed in {iteration} iteration(s)")
                    print("="*80)
                
                return final_answer
            
            # Execute tool calls
            tool_calls = response_message['tool_calls']
            
            if self.verbose:
                print(f"\n🔧 Agent is calling {len(tool_calls)} tool(s):")
            
            for tool_call in tool_calls:
                function_name = tool_call['function']['name']
                
                try:
                    function_args = json.loads(tool_call['function']['arguments'])
                except json.JSONDecodeError as e:
                    if self.verbose:
                        print(f"  ❌ {function_name}: Invalid JSON arguments")
                    function_result = {"error": "Invalid JSON arguments"}
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call['id'],
                        "name": function_name,
                        "content": json.dumps(function_result)
                    })
                    continue
                
                if self.verbose:
                    print(f"  → {function_name}({json.dumps(function_args, indent=2)})")
                
                # Execute the function
                if function_name in self.available_functions:
                    try:
                        function_result = self.available_functions[function_name](**function_args)
                        if self.verbose:
                            print(f"    ✓ Success")
                    except Exception as e:
                        function_result = {"error": f"Execution error: {str(e)}"}
                        if self.verbose:
                            print(f"    ✗ Error: {e}")
                else:
                    function_result = {"error": f"Unknown function: {function_name}"}
                    if self.verbose:
                        print(f"    ✗ Unknown function")
                
                # Add tool result to conversation
                result_content = json.dumps(function_result)
                if len(result_content) > 10000:  # Truncate large responses
                    result_content = result_content[:10000] + "... (truncated)"
                
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call['id'],
                    "name": function_name,
                    "content": result_content
                })
        
        # Max iterations reached
        if self.verbose:
            print(f"\n⚠️ Maximum iterations ({self.max_iterations}) reached")
            print("="*80)
        
        return f"Agent reached maximum iterations ({self.max_iterations}) without completing the task."

# Create an agent instance
agent = EmployeeAgent(df_employees, verbose=True)

print("Employee Agent initialized and ready!")

## Test the Agent

Now let's test our agent with various queries that require different levels of complexity.

In [None]:
# Test 1: Simple statistical query
query = "What's the average salary of DevOps Engineers?"
answer = agent.run(query)
print(f"\n📝 Final Answer: {answer}")

In [None]:
# Test 2: Comparison query (requires multiple tool calls)
query = "Compare the average salary of male vs female employees. Which gender earns more on average?"
answer = agent.run(query)
print(f"\n📝 Final Answer: {answer}")

In [None]:
# Test 3: Multi-step analysis
query = "Who are the top 3 earners, and what are their average years of experience?"
answer = agent.run(query)
print(f"\n📝 Final Answer: {answer}")

In [None]:
# Test 4: Complex analytical query
query = """I'm trying to understand salary distribution. Can you tell me:
1. How many different job titles we have
2. What's the average salary across all employees
3. Which job title has the highest average salary
"""
answer = agent.run(query)
print(f"\n📝 Final Answer: {answer}")

In [None]:
# Test 5: Filtering and statistics
query = "Show me information about employees over 35 years old earning more than $12,000"
answer = agent.run(query)
print(f"\n📝 Final Answer: {answer}")

---

# Part 5: Advanced Agent Features

## Adding Streaming to the Agent

Let's enhance our agent to stream responses for better UX.
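
Before the full class, here is a minimal sketch of the parsing step it relies on. It assumes the API streams server-sent events in which each payload line starts with `data: ` and the stream ends with `data: [DONE]`, which matches what the code below expects. The helper names `parse_sse_line` and `iter_content` and the sample lines are illustrative only; they are not part of the agent.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_sse_line(raw: bytes) -> Optional[dict]:
    """Decode one server-sent-event line; return the JSON chunk or None."""
    text = raw.decode("utf-8").strip()
    if not text.startswith("data: "):
        return None  # blank lines and comment lines carry no payload
    body = text[len("data: "):]
    if body == "[DONE]":
        return None  # end-of-stream sentinel
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None  # skip malformed chunks instead of crashing


def iter_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text fragment found in each chunk's 'delta', if any."""
    for raw in lines:
        chunk = parse_sse_line(raw)
        if not chunk:
            continue
        choices = chunk.get("choices") or [{}]
        piece = choices[0].get("delta", {}).get("content")
        if piece:
            yield piece


# Hand-written sample lines standing in for response.iter_lines()
sample = [
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b"",
    b": keep-alive comment",
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b"data: [DONE]",
]
streamed_text = "".join(iter_content(sample))
print(streamed_text)
```

Keeping the line parser separate from the network call makes it possible to test the parsing logic without an API key.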

In [None]:
class StreamingEmployeeAgent(EmployeeAgent):
    """Agent with streaming support for real-time responses."""
    
    def run_streaming(self, user_message: str):
        """
        Run the agent with streaming output.
        Yields events: {"type": "...", "content": "..."}
        """
        current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        
        messages = [
            {
                "role": "system",
                "content": f"""You are an AI agent with access to an employee database. Current time: {current_time}

Your job is to help answer questions about employees by using the available tools.
You can use tools multiple times and in sequence to gather all needed information.
Think step by step about what information you need.
When you have all the information needed to answer the question, provide a clear, conversational response.

Database columns: First Name, Last Name, Email, Phone, Gender, Age, Job Title, Years Of Experience, Salary, Department"""
            },
            {"role": "user", "content": user_message}
        ]
        
        yield {"type": "start", "content": f"Processing query: {user_message}"}
        
        iteration = 0
        
        while iteration < self.max_iterations:
            iteration += 1
            yield {"type": "iteration", "content": f"Iteration {iteration}"}
            
            # Call LLM with streaming
            payload = {
                "model": self.model,
                "messages": messages,
                "tools": TOOL_SCHEMAS,
                "tool_choice": "auto",
                "stream": True
            }
            
            try:
                response = requests.post(
                    f"{OPENROUTER_BASE_URL}/chat/completions",
                    headers=self.headers,
                    json=payload,
                    stream=True,
                    timeout=60  # applies to connecting and to gaps between chunks
                )
                response.raise_for_status()
                
                # Parse streaming response
                full_content = ""
                tool_calls = []
                is_tool_call = False
                
                for line in response.iter_lines():
                    if not line:
                        continue
                    
                    decoded = line.decode('utf-8')
                    if decoded.startswith('data: '):
                        json_str = decoded[6:]
                        if json_str.strip() == '[DONE]':
                            break
                        
                        try:
                            chunk = json.loads(json_str)
                            # Guard against a missing or empty 'choices' list
                            delta = (chunk.get('choices') or [{}])[0].get('delta', {})
                            
                            # Handle tool calls
                            if delta.get('tool_calls'):
                                is_tool_call = True
                                for tc_chunk in delta['tool_calls']:
                                    idx = tc_chunk.get('index', 0)
                                    # Grow the list until slot idx exists, even if an index is skipped
                                    while len(tool_calls) <= idx:
                                        tool_calls.append({
                                            'id': '',
                                            'type': 'function',
                                            'function': {'name': '', 'arguments': ''}
                                        })
                                    
                                    if tc_chunk.get('id'):
                                        tool_calls[idx]['id'] += tc_chunk['id']
                                    if 'function' in tc_chunk:
                                        if tc_chunk['function'].get('name'):
                                            tool_calls[idx]['function']['name'] += tc_chunk['function']['name']
                                        if tc_chunk['function'].get('arguments'):
                                            tool_calls[idx]['function']['arguments'] += tc_chunk['function']['arguments']
                            
                            # Handle content
                            if delta.get('content'):
                                content = delta['content']
                                full_content += content
                                if not is_tool_call:
                                    yield {"type": "token", "content": content}
                        
                        except json.JSONDecodeError:
                            continue
                
                # Build response message
                response_message = {"role": "assistant", "content": full_content or None}
                if tool_calls:
                    response_message["tool_calls"] = tool_calls
                
                messages.append(response_message)
                
            except Exception as e:
                yield {"type": "error", "content": f"API Error: {e}"}
                return
            
            # Check if done
            if not tool_calls:
                yield {"type": "complete", "content": full_content}
                return
            
            # Execute tool calls
            for tool_call in tool_calls:
                function_name = tool_call['function']['name']
                yield {"type": "tool_call", "content": f"Calling: {function_name}"}
                
                try:
                    function_args = json.loads(tool_call['function']['arguments'])
                    
                    if function_name in self.available_functions:
                        function_result = self.available_functions[function_name](**function_args)
                        yield {"type": "tool_result", "content": f"{function_name} completed"}
                    else:
                        function_result = {"error": f"Unknown function: {function_name}"}
                    
                    result_content = json.dumps(function_result)
                    if len(result_content) > 10000:
                        result_content = result_content[:10000] + "... (truncated)"
                    
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call['id'],
                        "name": function_name,
                        "content": result_content
                    })
                    
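                except Exception as e:
                    yield {"type": "error", "content": f"Tool error: {e}"}
                    # Every tool call needs a matching tool message,
                    # otherwise the next API request is rejected
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call['id'],
                        "name": function_name,
                        "content": json.dumps({"error": f"Tool error: {e}"})
                    })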
        
        yield {"type": "error", "content": f"Max iterations ({self.max_iterations}) reached"}

# Create streaming agent
streaming_agent = StreamingEmployeeAgent(df_employees, verbose=False)

print("Streaming Agent initialized!")

In [None]:
# Test streaming agent
print("Testing Streaming Agent:")
print("="*80)

query = "What's the average age of employees in the Human Resource department?"
print(f"Query: {query}\n")

for event in streaming_agent.run_streaming(query):
    if event["type"] == "token":
        print(event["content"], end='', flush=True)
    elif event["type"] == "tool_call":
        print(f"\n🔧 {event['content']}")
    elif event["type"] == "complete":
        print("\n")
    elif event["type"] == "error":
        print(f"\n❌ {event['content']}")

print("="*80)

---

# Summary and Key Concepts

## What We've Learned

### 1. **Basic LLM API Calls**
- How to structure requests (headers, messages, payload)
- Handling responses
- Streaming for better UX

### 2. **Structured Outputs**
- Requesting JSON responses
- Parsing and using structured data
- Making LLM outputs programmatically useful
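
As a reminder of why parsing deserves care, models sometimes wrap the JSON they were asked for in extra prose. The sketch below shows one defensive way to recover the object. The helper name `extract_json_object` and the sample reply are made up for illustration, and the approach (take everything between the first `{` and the last `}`) is a simple heuristic, not a complete parser.

```python
import json
from typing import Any, Dict, Optional


def extract_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the JSON object embedded in `text`, or None if there is none."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


# Illustrative reply, not real model output
reply = 'Sure! Here it is: {"department": "Product", "count": 3} Hope it helps.'
parsed_reply = extract_json_object(reply)
print(parsed_reply)
```

Returning `None` instead of raising lets the caller decide whether to retry the request or fall back to plain text.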

### 3. **Function/Tool Calling**
- Defining tool schemas
- LLM deciding when to use tools
- Executing tools and returning results
- Creating a tool execution loop
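
The "execute and return results" step can be sketched in isolation. This assumes tool calls arrive in the same shape the agent code above reads (`id`, `function.name`, and `function.arguments` as a JSON string). The helper `dispatch_tool_call` and the toy `add` tool are hypothetical and exist only for this example.

```python
import json
from typing import Any, Callable, Dict


def dispatch_tool_call(tool_call: Dict[str, Any],
                       registry: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Run one tool call and wrap the outcome as a 'tool' message."""
    name = tool_call["function"]["name"]
    try:
        args = json.loads(tool_call["function"]["arguments"] or "{}")
        if name not in registry:
            result = {"error": f"Unknown function: {name}"}
        else:
            result = registry[name](**args)
    except Exception as exc:  # bad JSON and failing tools both become data
        result = {"error": str(exc)}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": name,
        "content": json.dumps(result),
    }


def add(a: int, b: int) -> Dict[str, int]:
    """Hypothetical toy tool used only for this demonstration."""
    return {"sum": a + b}


demo_call = {"id": "call_1", "type": "function",
             "function": {"name": "add", "arguments": '{"a": 2, "b": 3}'}}
tool_message = dispatch_tool_call(demo_call, {"add": add})
print(tool_message["content"])
```

Because every path returns a tool message, the conversation stays valid even when a tool fails, and the model gets a chance to correct its own call.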

### 4. **Agentic Behavior**
- Multi-turn conversations
- Iterative reasoning
- Planning and execution
- Autonomous decision making

## The Agent Loop Pattern

```
┌─────────────────────────────────────────┐
│         User Query                      │
└────────────┬────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────┐
│  LLM: Analyze & Decide                  │
│  - What information do I need?          │
│  - Which tools should I use?            │
│  - Do I have enough to answer?          │
└────────────┬────────────────────────────┘
             │
             ▼
        ┌────────┐
        │ Done?  │──Yes──► Final Answer
        └────┬───┘
             │No
             ▼
┌─────────────────────────────────────────┐
│  Execute Tool Calls                     │
│  - Run functions with parameters        │
│  - Gather results                       │
└────────────┬────────────────────────────┘
             │
             │
             └──────────► (Loop back to LLM)
```
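
The diagram can be condensed into a few lines of Python. The sketch below replaces the real API call with a scripted stand-in (`scripted_llm`) so it runs offline; `run_loop`, `scripted_llm`, and the `count_employees` tool are hypothetical names for this illustration, and the hard-coded count only mirrors the 320 records loaded at the start of the notebook.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_loop(llm: Callable[[List[Message]], Message],
             tools: Dict[str, Callable[..., Any]],
             question: str,
             max_iterations: int = 5) -> str:
    """Call the model, run any requested tools, and repeat until it answers."""
    messages: List[Message] = [{"role": "user", "content": question}]
    for _ in range(max_iterations):
        reply = llm(messages)
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:                      # "Done?" -> Yes
            return reply.get("content") or ""
        for call in calls:                 # "Done?" -> No: execute tools
            fn = tools[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(fn(**args)),
            })
    return "Stopped: iteration limit reached."


def scripted_llm(messages: List[Message]) -> Message:
    """Stand-in for the model: request one tool, then answer from its result."""
    if messages[-1]["role"] == "tool":
        count = json.loads(messages[-1]["content"])["count"]
        return {"role": "assistant", "content": f"There are {count} employees."}
    return {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "count_employees", "arguments": "{}"}}]}


loop_answer = run_loop(scripted_llm,
                       {"count_employees": lambda: {"count": 320}},
                       "How many employees are there?")
print(loop_answer)
```

This version omits error handling on purpose so the control flow stays visible; the `EmployeeAgent` class above shows the more defensive form.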

## Key Differences: Simple Call vs Agent

| Aspect | Simple Tool Call | Agent |
|--------|-----------------|-------|
| **Iterations** | Single tool use | Multiple iterations |
| **Planning** | Pre-determined | Dynamic planning |
| **Reasoning** | One-shot | Multi-step reasoning |
| **Autonomy** | Follows instructions | Self-directed |
| **Complexity** | Simple queries | Complex tasks |

## Best Practices

1. **Clear Tool Descriptions** - The better you describe tools, the better the LLM uses them
2. **Error Handling** - Always handle tool execution errors gracefully
3. **Iteration Limits** - Prevent infinite loops with max iteration caps
4. **Result Truncation** - Large tool outputs should be truncated
5. **Verbose Modes** - Logging helps debug agent behavior
6. **System Prompts** - Clear instructions guide agent behavior
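
Practices 2 and 4 can be combined in one small helper. The sketch below is one possible shape, not the notebook's own implementation. The name `serialize_tool_result` is hypothetical, and the 10,000-character default simply mirrors the limit used in the agent code above.

```python
import datetime
import json
from typing import Any


def serialize_tool_result(result: Any, limit: int = 10000) -> str:
    """Turn a tool result into a string that is safe to send back to the model."""
    # default=str converts values json cannot encode (dates, custom objects)
    # into their string form instead of raising TypeError
    text = json.dumps(result, default=str)
    if len(text) <= limit:
        return text
    # The truncated text is no longer valid JSON; the marker tells the model so
    return text[:limit] + "... (truncated)"


short_result = serialize_tool_result({"n": 1})
long_result = serialize_tool_result({"rows": list(range(5000))}, limit=50)
dated_result = serialize_tool_result({"when": datetime.date(2024, 1, 1)})
print(short_result)
print(len(long_result))
print(dated_result)
```

A character cap is a blunt instrument. For tabular results it is often better to return a row count plus the first few rows, so the model sees well-formed data.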

## Real-World Applications

This pattern can be extended to:
- **Data Analysis Agents** - SQL queries, pandas operations, visualization
- **Research Agents** - Web search, paper retrieval, summarization
- **Code Agents** - File operations, code execution, testing
- **Customer Service Agents** - Database queries, API calls, ticket creation
- **DevOps Agents** - Server monitoring, log analysis, deployment

## Next Steps

To build more sophisticated agents:
1. Add more specialized tools
2. Implement memory/context management
3. Add planning capabilities (ReAct, Chain of Thought)
4. Implement multi-agent systems
5. Add safety and validation layers
6. Integrate with external APIs and services
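
As a starting point for item 2, here is a minimal sketch of context management: keep the system prompt and only the most recent messages. The function name `trim_history` and the sample history are assumptions made for illustration. Real systems often summarize older turns rather than dropping them.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def trim_history(messages: List[Message], keep_last: int = 6) -> List[Message]:
    """Keep system messages plus the most recent `keep_last` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    tail = rest[-keep_last:] if keep_last > 0 else []
    # A tool result is only meaningful after the assistant message that
    # requested it, so drop any orphaned tool messages at the cut point
    while tail and tail[0]["role"] == "tool":
        tail = tail[1:]
    return system + tail


# Illustrative conversation, not real agent output
history = [
    {"role": "system", "content": "You are an employee-data agent."},
    {"role": "user", "content": "Average salary?"},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "{}"},
    {"role": "assistant", "content": "Here is the average."},
    {"role": "user", "content": "And the median?"},
]
trimmed = trim_history(history, keep_last=3)
print([m["role"] for m in trimmed])
```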

---

# Exercise: Build Your Own Tool

Try adding a new tool to the agent! Here's a template:

In [None]:
# Exercise: Add a tool to find salary gaps
def find_salary_gap(self, job_title: str) -> Dict[str, Any]:
    """
    Find the salary gap between male and female employees for a given job title.
    
    Returns statistics comparing average salaries by gender.
    """
    # regex=False matches the title literally, so characters like "+" or "(" are safe
    df_filtered = self.df[self.df['Job Title'].str.contains(job_title, case=False, na=False, regex=False)]
    
    if len(df_filtered) == 0:
        return {"success": False, "error": f"No employees found with job title: {job_title}"}
    
    gender_stats = df_filtered.groupby('Gender')['Salary'].agg(['mean', 'count']).to_dict('index')
    
    if 'male' in gender_stats and 'female' in gender_stats:
        gap = gender_stats['male']['mean'] - gender_stats['female']['mean']
        gap_percentage = (gap / gender_stats['female']['mean']) * 100
    else:
        gap = None
        gap_percentage = None
    
    return {
        "success": True,
        "job_title": job_title,
        "statistics": gender_stats,
        # Compare against None explicitly so a genuine gap of 0 is reported as 0
        "salary_gap": round(gap, 2) if gap is not None else None,
        "gap_percentage": round(gap_percentage, 2) if gap_percentage is not None else None
    }

# Add this method to EmployeeTools class
EmployeeTools.find_salary_gap = find_salary_gap

# Define the tool schema
salary_gap_tool = {
    "type": "function",
    "function": {
        "name": "find_salary_gap",
        "description": "Find the salary gap between male and female employees for a specific job title. Returns average salaries by gender and the gap amount.",
        "parameters": {
            "type": "object",
            "properties": {
                "job_title": {
                    "type": "string",
                    "description": "The job title to analyze for salary gaps"
                }
            },
            "required": ["job_title"]
        }
    }
}

# Add to tool schemas
TOOL_SCHEMAS.append(salary_gap_tool)

print("✓ New tool added: find_salary_gap")
print("Try asking: 'Is there a salary gap between male and female Web Developers?'")

In [None]:
# Test the new tool
agent_with_new_tool = EmployeeAgent(df_employees, verbose=True)
agent_with_new_tool.available_functions['find_salary_gap'] = agent_with_new_tool.tools.find_salary_gap

query = "Is there a salary gap between male and female Machine Learning Engineers?"
answer = agent_with_new_tool.run(query)
print(f"\n📝 Final Answer: {answer}")

---

# Conclusion

Congratulations! 🎉

You've learned how to build AI agents from scratch, progressing from:
- Basic API calls
- Structured outputs
- Function calling
- Full autonomous agents

The agent pattern, in which a model repeatedly chooses tools, reads their results, and decides what to do next, underlies many modern AI applications. Keep experimenting and building!

**Remember**: The key to good agents is:
1. ✅ Clear, well-described tools
2. ✅ Good system prompts
3. ✅ Proper error handling
4. ✅ Iteration limits for safety
5. ✅ Verbose logging for debugging

Happy building! 🚀