# OpenAI Agents SDK Tutorial: Building AI Agents for Insurance

This tutorial introduces the **OpenAI Agents SDK** (evolved from Swarm), OpenAI's lightweight framework for building multi-agent systems with explicit handoffs.

## What You'll Learn

1. OpenAI Agents SDK core concepts: Agents, Functions, and Handoffs
2. Building a Weather Verification Agent with tool calling
3. Building a Claims Eligibility Agent for business logic
4. Implementing explicit agent handoffs
5. Integrating DSPy for prompt optimization
6. Using MLflow for experiment tracking

## Prerequisites

- Python 3.10+
- OpenAI API key (**Note: This SDK is OpenAI-only by default**)
- Basic Python knowledge

---

## Important: OpenAI-Only Framework

The OpenAI Agents SDK is designed specifically for OpenAI's models. While you can use alternative APIs, they must be fully OpenAI-compatible. Some features may not work with non-OpenAI providers.

## 1. Installation & Setup

In [None]:
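# Install the OpenAI Python client, which the compatible runner in this tutorial is built on.
# The Agents SDK itself is published separately as `openai-agents`; it is not bundled with `openai`.
# !pip install openai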

# Additional dependencies
# !pip install httpx beautifulsoup4 python-dotenv

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Get API configuration
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("Please set OPENAI_API_KEY environment variable")

# Optional: Alternative API base (must be OpenAI-compatible)
api_base = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")
model_name = os.getenv("MODEL_NAME", "gpt-4o-mini")

print(f"Using model: {model_name}")
print(f"API base: {api_base}")

## 2. Core Concepts: The Swarm/Agents Pattern

The OpenAI Agents SDK (formerly Swarm) uses a **minimal, controllable** approach:

### 2.1 Agents

Agents are lightweight wrappers around instructions and functions:

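```python
# Illustrative Swarm-style signature, matching the compatible Agent class
# defined in Section 4. The released SDK is installed as `openai-agents`,
# imported as `agents`, and takes `tools=` and `handoffs=` rather than `functions=`.
from openai.agents import Agent  # expected to raise ImportError on current `openai` releases

agent = Agent(
    name="my_agent",
    instructions="You are a helpful assistant that...",
    functions=[my_function]
)
```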

### 2.2 Functions (Tools)

Functions are regular Python functions. The SDK automatically:
- Generates JSON schemas from type hints
- Handles function calling
- Returns results to the agent

```python
def my_function(param: str) -> str:
    """Description for the LLM."""
    return f"Result: {param}"
```
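
As a rough illustration of the schema generation listed above, the sketch below maps a function's type hints to a JSON-schema-like dict. The names `sketch_schema`, `TYPE_NAMES`, and `lookup_policy` are hypothetical and exist only for this example; a real SDK's output will likely carry more detail, such as per-parameter descriptions.

```python
import inspect

# Hypothetical mapping from Python annotations to JSON schema type names
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def sketch_schema(func):
    """Build a minimal tool schema from a function's signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unrecognised types fall back to "string"
        properties[name] = {"type": TYPE_NAMES.get(param.annotation, "string")}
        # Parameters without a default value are marked as required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def lookup_policy(policy_id: str, include_history: bool = False) -> str:
    """Look up a policy by its identifier."""
    return f"Policy {policy_id}"

schema = sketch_schema(lookup_policy)
print(schema["parameters"])
```

Only `policy_id` ends up in `required`, because `include_history` has a default value.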

### 2.3 Handoffs

The key feature is explicit handoffs between agents:

```python
def transfer_to_specialist():
    """Transfer to the specialist agent."""
    return specialist_agent  # Return another agent
```

When a function returns an Agent, control transfers to that agent.
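
The handoff rule can be tried out without any API calls. The sketch below is a simplified stand-in, not the SDK's implementation: `SketchAgent`, `dispatch`, and the two tool functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent: a name plus callable tools."""
    name: str
    functions: List[Callable] = field(default_factory=list)

specialist = SketchAgent(name="specialist")

def transfer_to_specialist():
    """Handoff function: returning an agent signals a transfer."""
    return specialist

def lookup_status():
    """Ordinary tool: returns data for the active agent."""
    return "status: open"

triage = SketchAgent(name="triage", functions=[lookup_status, transfer_to_specialist])

def dispatch(active, func_name):
    """Call one tool by name and return (next_agent, tool_output)."""
    func = next(f for f in active.functions if f.__name__ == func_name)
    result = func()
    if isinstance(result, SketchAgent):
        # Handoff: the returned agent becomes the active one
        return result, f"Transferred to {result.name}"
    # Normal tool result: the active agent stays in control
    return active, str(result)

agent, output = dispatch(triage, "lookup_status")
print(agent.name, "|", output)
agent, output = dispatch(agent, "transfer_to_specialist")
print(agent.name, "|", output)
```

The only thing distinguishing a handoff from a normal tool call is the return type, which is why the runner in Section 5 checks `isinstance(result, Agent)`.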

## 3. Define Tools for Weather Verification

Tools are plain Python functions with type hints and docstrings.

In [None]:
import httpx
from bs4 import BeautifulSoup

def geocode_location(location: str) -> str:
    """
    Convert a location name to latitude/longitude coordinates.
    Use this before fetching weather data.
    
    Args:
        location: Address or place name (e.g., 'Brisbane, QLD')
    
    Returns:
        String with coordinates and location details
    """
    try:
        with httpx.Client() as client:
            response = client.get(
                "https://nominatim.openstreetmap.org/search",
                params={
                    "q": f"{location}, Australia",
                    "format": "json",
                    "limit": 1
                },
                headers={"User-Agent": "InsuranceWeatherBot/1.0"},
                timeout=10.0
            )
            
            if response.status_code != 200:
                return f"Error: Geocoding failed with status {response.status_code}"
            
            data = response.json()
            if not data:
                return f"Error: Location not found: {location}"
            
            result = data[0]
            return f"""GEOCODING RESULT:
Location: {result['display_name']}
Latitude: {result['lat']}
Longitude: {result['lon']}"""
    except Exception as e:
        return f"Error: {str(e)}"

# Test
print(geocode_location("Brisbane, QLD"))
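
Because `geocode_location` returns free text, any code that needs the coordinates as numbers has to parse them back out. The following is a minimal sketch, assuming the `Latitude:` and `Longitude:` line format produced above; `parse_coordinates` is a hypothetical helper and the sample text is made up for illustration.

```python
import re
from typing import Optional, Tuple

def parse_coordinates(report: str) -> Optional[Tuple[float, float]]:
    """Extract (latitude, longitude) from a GEOCODING RESULT string, or None."""
    lat = re.search(r"^Latitude:\s*(-?\d+(?:\.\d+)?)\s*$", report, re.MULTILINE)
    lon = re.search(r"^Longitude:\s*(-?\d+(?:\.\d+)?)\s*$", report, re.MULTILINE)
    if not (lat and lon):
        return None  # e.g. the tool returned an "Error: ..." string
    return float(lat.group(1)), float(lon.group(1))

# Illustrative sample in the tool's output format (values are placeholders)
sample = """GEOCODING RESULT:
Location: Brisbane City, Queensland, Australia
Latitude: -27.4689682
Longitude: 153.0234991"""

print(parse_coordinates(sample))
print(parse_coordinates("Error: Location not found: Atlantis"))
```

Returning `None` for error strings lets a caller distinguish a failed lookup from a valid coordinate pair.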

In [None]:
def get_bom_weather(latitude: str, longitude: str, date: str) -> str:
    """
    Fetch weather observations from Australian Bureau of Meteorology.
    
    Args:
        latitude: Latitude coordinate as string (e.g., '-27.4698')
        longitude: Longitude coordinate as string (e.g., '153.0251')
        date: Date in YYYY-MM-DD format
    
    Returns:
        Weather report with events found
    """
    try:
        # Parse date
        year, month, day = date.split("-")
        
        url = "https://reg.bom.gov.au/cgi-bin/climate/storms/get_storms.py"
        params = {
            "begin_day": day,
            "begin_month": month,
            "begin_year": year,
            "end_day": day,
            "end_month": month,
            "end_year": year,
            "lat": float(latitude),
            "lng": float(longitude),
            "event": "all",
            "distance_from_point": "50",
            "states": "all"
        }
        
        with httpx.Client() as client:
            response = client.get(url, params=params, timeout=15.0)
            
            if response.status_code != 200:
                return f"Error: BOM API returned status {response.status_code}"
            
            soup = BeautifulSoup(response.text, 'html.parser')
            rows = soup.find_all('tr')
            
            events = []
            for row in rows[1:]:
                cells = row.find_all('td')
                if len(cells) >= 2:
                    event_type = cells[0].get_text(strip=True)
                    if event_type:
                        events.append(event_type)
            
            has_thunderstorm = any('thunder' in e.lower() or 'lightning' in e.lower() for e in events)
            has_strong_wind = any('wind' in e.lower() or 'gust' in e.lower() for e in events)
            
            return f"""WEATHER REPORT:
Date: {date}
Coordinates: ({latitude}, {longitude})
Events Found: {', '.join(events) if events else 'None'}
Has Thunderstorm: {has_thunderstorm}
Has Strong Wind: {has_strong_wind}
Total Events: {len(events)}"""
    except Exception as e:
        return f"Error: {str(e)}"

# Test
print(get_bom_weather("-27.4698", "153.0251", "2025-03-07"))

## 4. Create Agents

Now let's create our agents using the OpenAI Agents SDK.

In [None]:
from openai import OpenAI

# Create the OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=api_base if api_base != "https://api.openai.com/v1" else None
)

print("OpenAI client configured")

In [None]:
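# Note: current `openai` releases are not expected to ship an `openai.agents` module
# (the Agents SDK is published separately as `openai-agents`), so the import below
# normally fails and the compatible implementation is used instead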

try:
    from openai.agents import Agent, Swarm
    AGENTS_SDK_AVAILABLE = True
    print("OpenAI Agents SDK available")
except ImportError:
    AGENTS_SDK_AVAILABLE = False
    print("Agents SDK not available - using compatible implementation")

# Define a simple Agent class if SDK not available
if not AGENTS_SDK_AVAILABLE:
    from dataclasses import dataclass
    from typing import Callable, List, Optional
    
    @dataclass
    class Agent:
        """Simple agent compatible with Swarm pattern."""
        name: str
        instructions: str
        functions: List[Callable] = None
        model: str = "gpt-4o-mini"
        
        def __post_init__(self):
            if self.functions is None:
                self.functions = []

In [None]:
# Define the handoff function first (references eligibility_agent)
eligibility_agent = None  # Placeholder, will be defined below

def transfer_to_eligibility_agent():
    """
    Transfer control to the Claims Eligibility Agent.
    Use this after weather verification is complete.
    """
    return eligibility_agent

# Create Weather Agent
weather_agent = Agent(
    name="Weather Verification Agent",
    instructions="""You are a Weather Verification Agent for an Australian insurance company.

Your job:
1. Use geocode_location to convert addresses to coordinates
2. Use get_bom_weather to fetch weather data from BOM
3. Compile a weather verification report
4. Transfer to the eligibility agent for the final decision

Always include in your report:
- Verified location and coordinates
- Date checked
- Weather events found
- Whether thunderstorms were detected
- Whether strong winds were detected

After completing verification, call transfer_to_eligibility_agent.""",
    functions=[geocode_location, get_bom_weather, transfer_to_eligibility_agent],
    model=model_name
)

print(f"Created: {weather_agent.name}")

In [None]:
# Create Eligibility Agent
eligibility_agent = Agent(
    name="Claims Eligibility Agent",
    instructions="""You are a Claims Eligibility Agent for an Australian insurance company.

You receive weather verification reports and determine CAT event eligibility.

## Eligibility Rules

**APPROVED** - Qualifies as CAT event if:
- Location is within Australia (lat: -44 to -10, lon: 112 to 154)
- BOTH thunderstorms AND strong winds were detected
- Date is valid (within 90 days, not in future)

**REVIEW** - Needs manual review if:
- Only ONE weather type detected
- Location is near Australian borders

**DENIED** - Does not qualify if:
- Neither thunderstorms nor strong winds detected
- Location outside Australia
- Invalid date

## Response Format

Provide:
1. DECISION: APPROVED, REVIEW, or DENIED
2. REASONING: Why this decision
3. CONFIDENCE: High, Medium, or Low
4. RECOMMENDATIONS: Any follow-up actions""",
    functions=[],  # No tools
    model=model_name
)

print(f"Created: {eligibility_agent.name}")
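
The date and bounding-box rules in the prompt above depend on arithmetic and on today's date, which a language model cannot be relied on to know. One option is to run a deterministic pre-check alongside the agent. The sketch below is an assumption about how that could look, not part of the original pipeline; `precheck_eligibility` is a hypothetical helper, and it leaves out the "near Australian borders" rule because the prompt gives no distance threshold for it.

```python
from datetime import date, timedelta

def precheck_eligibility(lat, lon, incident, has_thunderstorm, has_strong_wind, today=None):
    """Apply the written eligibility rules deterministically."""
    today = today or date.today()
    # Bounding box taken from the prompt: lat -44 to -10, lon 112 to 154
    in_australia = -44 <= lat <= -10 and 112 <= lon <= 154
    # Valid means not in the future and at most 90 days old
    date_valid = timedelta(0) <= (today - incident) <= timedelta(days=90)
    if not in_australia or not date_valid:
        return "DENIED"
    if has_thunderstorm and has_strong_wind:
        return "APPROVED"
    if has_thunderstorm or has_strong_wind:
        return "REVIEW"
    return "DENIED"

# A fixed "today" keeps the example reproducible
today = date(2025, 3, 20)
print(precheck_eligibility(-27.47, 153.02, date(2025, 3, 7), True, True, today=today))
print(precheck_eligibility(-37.81, 144.96, date(2025, 3, 7), True, False, today=today))
print(precheck_eligibility(-27.47, 153.02, date(2024, 11, 1), True, True, today=today))
```

Comparing this result with the agent's decision is a cheap way to flag cases where the model misapplied a rule.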

## 5. Implement the Swarm Runner

Since the Agents SDK may not be available, let's implement a compatible runner.

In [None]:
import json
import inspect
from typing import Any, Callable, Dict  # Callable is needed by function_to_schema below

def function_to_schema(func: Callable) -> Dict:
    """Convert a Python function to OpenAI function schema."""
    sig = inspect.signature(func)
    
    parameters = {
        "type": "object",
        "properties": {},
        "required": []
    }
    
    for name, param in sig.parameters.items():
        param_type = "string"  # Default
        if param.annotation != inspect.Parameter.empty:
            if param.annotation == int:
                param_type = "integer"
            elif param.annotation == float:
                param_type = "number"
            elif param.annotation == bool:
                param_type = "boolean"
        
        parameters["properties"][name] = {"type": param_type}
        
        if param.default == inspect.Parameter.empty:
            parameters["required"].append(name)
    
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": func.__doc__ or "",
            "parameters": parameters
        }
    }

def run_swarm(agent: Agent, messages: list, client: OpenAI, max_turns: int = 10):
    """
    Run a swarm-style agent conversation with handoffs.
    """
    current_agent = agent
    conversation = messages.copy()
    
    for turn in range(max_turns):
        # Snapshot this turn's functions so a mid-turn handoff cannot change tool lookup
        functions = list(current_agent.functions or [])
        
        # Build messages with system prompt
        full_messages = [
            {"role": "system", "content": current_agent.instructions}
        ] + conversation
        
        # Only send `tools` when the agent has any; some endpoints reject a null value
        request = {"model": current_agent.model, "messages": full_messages}
        if functions:
            request["tools"] = [function_to_schema(f) for f in functions]
        
        # Call the API
        response = client.chat.completions.create(**request)
        
        message = response.choices[0].message
        
        # Add assistant message to conversation (omit tool_calls when there are none)
        assistant_message = {"role": "assistant", "content": message.content}
        if message.tool_calls:
            assistant_message["tool_calls"] = [tc.model_dump() for tc in message.tool_calls]
        conversation.append(assistant_message)
        
        print(f"\n[{current_agent.name}]: {message.content or '(calling tools)'}")
        
        # Handle tool calls
        if message.tool_calls:
            for tool_call in message.tool_calls:
                func_name = tool_call.function.name
                func_args = json.loads(tool_call.function.arguments or "{}")
                
                # Find and call the function
                func = next((f for f in functions if f.__name__ == func_name), None)
                
                if func is None:
                    # Every tool call needs a tool response, even for unknown tools
                    content = f"Error: unknown function {func_name}"
                else:
                    print(f"  -> Calling {func_name}({func_args})")
                    result = func(**func_args)
                    
                    # Check for handoff (function returns an Agent)
                    if isinstance(result, Agent):
                        print(f"  -> Handoff to: {result.name}")
                        current_agent = result
                        content = f"Transferred to {result.name}"
                    else:
                        content = str(result)
                        preview = content[:100] + "..." if len(content) > 100 else content
                        print(f"  -> Result: {preview}")
                
                conversation.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": content
                })
        else:
            # No tool calls - conversation might be done
            if "DECISION:" in (message.content or "").upper():
                break
    
    return conversation

print("Swarm runner implemented")

## 6. Run the Agent Pipeline

In [None]:
def process_claim(location: str, date: str):
    """
    Process an insurance claim through the agent pipeline.
    """
    
    print("=" * 60)
    print(f"Processing claim for {location} on {date}")
    print("=" * 60)
    
    messages = [
        {
            "role": "user",
            "content": f"""Please process this insurance claim:
            
Location: {location}
Date of Incident: {date}

Verify the weather conditions and determine CAT event eligibility."""
        }
    ]
    
    result = run_swarm(weather_agent, messages, client)
    
    # Get final message (content can be None when the last turn only made tool calls)
    final_message = result[-1].get("content") or ""
    
    print("\n" + "=" * 60)
    print("FINAL DECISION:")
    print("=" * 60)
    print(final_message)
    
    return result

# Run the pipeline
result = process_claim("Brisbane, QLD, 4000", "2025-03-07")
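
The eligibility agent is asked to answer with labelled fields (DECISION, REASONING, CONFIDENCE, RECOMMENDATIONS), so downstream code may want them as structured data rather than one string. The sketch below is one possible parser, assuming the model follows the numbered format from the prompt, optionally with markdown bold; `parse_decision` is a hypothetical helper and the sample response is made up.

```python
import re

FIELDS = ("DECISION", "REASONING", "CONFIDENCE", "RECOMMENDATIONS")

def parse_decision(text: str) -> dict:
    """Split a final response into its labelled fields (missing ones are omitted)."""
    # Allow an optional list number and optional markdown bold around each label
    pattern = re.compile(
        r"^\s*(?:\d+\.\s*)?\**(" + "|".join(FIELDS) + r")\**\s*:\**\s*",
        re.IGNORECASE | re.MULTILINE,
    )
    matches = list(pattern.finditer(text))
    parsed = {}
    for i, m in enumerate(matches):
        # A field's value runs until the next label or the end of the text
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        parsed[m.group(1).upper()] = text[m.end():end].strip()
    return parsed

# Made-up response in the format the prompt asks for
sample = """1. DECISION: REVIEW
2. REASONING: Only thunderstorms were detected.
3. CONFIDENCE: Medium
4. RECOMMENDATIONS: Request wind data from a nearby station."""

parsed = parse_decision(sample)
print(parsed["DECISION"], "|", parsed["CONFIDENCE"])
```

An empty dict signals that the response did not follow the expected format, which is a reasonable trigger for a retry or manual review.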

## 7. Testing Multiple Claims

In [None]:
import time

test_claims = [
    ("Brisbane, QLD, 4000", "2025-03-07"),
    ("Sydney, NSW, 2000", "2025-03-07"),
    ("Perth, WA, 6000", "2025-01-15"),
]

def test_all_claims():
    results = []
    
    for location, date in test_claims:
        print(f"\n\n{'#'*60}")
        print(f"# Testing: {location}")
        print(f"{'#'*60}")
        
        try:
            result = process_claim(location, date)
            results.append((location, date, "Success"))
        except Exception as e:
            results.append((location, date, f"Error: {e}"))
        
        time.sleep(2)
    
    return results

# Uncomment to run all tests
# all_results = test_all_claims()

---

## 8. DSPy Integration for Prompt Optimization

DSPy can optimize the agent instructions used in the OpenAI Agents pattern.

In [None]:
# !pip install dspy

In [None]:
import dspy

# Configure DSPy
dspy_lm = dspy.LM(
    model=f"openai/{model_name}",
    api_key=api_key,
    api_base=api_base
)
dspy.configure(lm=dspy_lm)

print("DSPy configured")

In [None]:
# DSPy signature for eligibility determination
class EligibilitySignature(dspy.Signature):
    """Determine CAT event eligibility based on weather verification."""
    
    weather_report: str = dspy.InputField(
        desc="Weather verification report with events, location, date"
    )
    
    decision: str = dspy.OutputField(
        desc="APPROVED, REVIEW, or DENIED"
    )
    
    reasoning: str = dspy.OutputField(
        desc="Brief explanation for the decision"
    )
    
    confidence: str = dspy.OutputField(
        desc="High, Medium, or Low"
    )

# Create module
eligibility_module = dspy.ChainOfThought(EligibilitySignature)

# Test
test_report = """Location: Brisbane, QLD. Lat: -27.47, Lon: 153.02.
Date: 2025-03-07. Events: Thunderstorm, Wind Gust.
Has Thunderstorm: True. Has Strong Wind: True."""

result = eligibility_module(weather_report=test_report)
print(f"Decision: {result.decision}")
print(f"Reasoning: {result.reasoning}")
print(f"Confidence: {result.confidence}")

In [None]:
# Training examples
training_examples = [
    dspy.Example(
        weather_report="Brisbane, QLD. -27.47, 153.02. 2025-03-07. Events: Thunderstorm, Wind Gust. Has Thunderstorm: True. Has Strong Wind: True.",
        decision="APPROVED",
        reasoning="Both severe weather types confirmed",
        confidence="High"
    ).with_inputs("weather_report"),
    
    dspy.Example(
        weather_report="Sydney, NSW. -33.87, 151.21. 2025-03-07. Events: Light Rain. Has Thunderstorm: False. Has Strong Wind: False.",
        decision="DENIED",
        reasoning="No severe weather detected",
        confidence="High"
    ).with_inputs("weather_report"),
    
    dspy.Example(
        weather_report="Melbourne, VIC. -37.81, 144.96. 2025-03-07. Events: Thunderstorm. Has Thunderstorm: True. Has Strong Wind: False.",
        decision="REVIEW",
        reasoning="Only thunderstorm detected, missing wind",
        confidence="Medium"
    ).with_inputs("weather_report"),
]

print(f"Created {len(training_examples)} examples")

In [None]:
# Optimize
from dspy.teleprompt import BootstrapFewShot
from dspy.evaluate import Evaluate

def metric(example, pred, trace=None):
    return example.decision.strip().upper() == pred.decision.strip().upper()

def as_score(result):
    # Some DSPy releases return a result object with a .score attribute, others a plain float
    return getattr(result, "score", result)

# Note: scoring on the three training examples is a smoke test, not a held-out evaluation
evaluator = Evaluate(devset=training_examples, metric=metric, num_threads=1)
baseline_score = as_score(evaluator(eligibility_module))
print(f"Baseline: {baseline_score}%")

optimizer = BootstrapFewShot(metric=metric, max_bootstrapped_demos=2)
optimized_module = optimizer.compile(eligibility_module, trainset=training_examples)

optimized_score = as_score(evaluator(optimized_module))
print(f"Optimized: {optimized_score}%")
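
An exact string comparison counts an answer like "Decision: **Approved**." as wrong even though the label is correct. A more forgiving metric is sketched below, assuming the same `(example, pred, trace)` call shape as the metric above; `normalize_decision` and `tolerant_metric` are hypothetical names, and `SimpleNamespace` stands in for DSPy's example and prediction objects so the snippet runs without DSPy installed.

```python
from types import SimpleNamespace

LABELS = ("APPROVED", "REVIEW", "DENIED")

def normalize_decision(raw) -> str:
    """Reduce a free-text decision to one label, or 'UNKNOWN' if none or several appear."""
    text = str(raw or "").upper()
    found = [label for label in LABELS if label in text]
    return found[0] if len(found) == 1 else "UNKNOWN"

def tolerant_metric(example, pred, trace=None):
    """Compare decisions after normalization; ambiguous outputs never count as correct."""
    expected = normalize_decision(example.decision)
    return expected != "UNKNOWN" and expected == normalize_decision(pred.decision)

# SimpleNamespace stands in for DSPy example/prediction objects here
gold = SimpleNamespace(decision="APPROVED")
print(tolerant_metric(gold, SimpleNamespace(decision="Decision: **Approved**.")))
print(tolerant_metric(gold, SimpleNamespace(decision="DENIED")))
```

Treating outputs that mention several labels as `UNKNOWN` avoids rewarding a prediction that hedges between decisions.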

### 8.1 Export Optimized Instructions to Agent

In [None]:
def build_enhanced_instructions(module) -> str:
    """Build enhanced instructions from DSPy module."""
    
    instructions = """You are a Claims Eligibility Agent optimized with DSPy.

## Eligibility Rules

- APPROVED: Both thunderstorms AND strong winds detected in valid Australian location
- REVIEW: Only ONE severe weather type detected
- DENIED: No severe weather or location outside Australia

## Example Decisions
"""
    
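    # Add demos if available. In recent DSPy versions ChainOfThought keeps them on its
    # inner predictor, so look through module.predictors() before falling back to module.demos
    if hasattr(module, 'predictors'):
        demos = [d for p in module.predictors() for d in (getattr(p, 'demos', None) or [])]
    else:
        demos = list(getattr(module, 'demos', None) or [])
    
    for i, demo in enumerate(demos, 1):
        instructions += f"\n### Example {i}\n"
        for key in ['weather_report', 'decision', 'reasoning']:
            # Demos may be dspy.Example objects or plain dicts; both support .get
            val = demo.get(key) if hasattr(demo, 'get') else getattr(demo, key, None)
            if val is not None:
                instructions += f"{key}: {str(val)[:100]}\n"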
    
    instructions += """\n## Response Format

Always provide:
1. DECISION: APPROVED, REVIEW, or DENIED
2. REASONING: Brief explanation
3. CONFIDENCE: High, Medium, or Low
"""
    
    return instructions

enhanced_instructions = build_enhanced_instructions(optimized_module)
print(enhanced_instructions)

In [None]:
# Create enhanced eligibility agent
enhanced_eligibility_agent = Agent(
    name="Enhanced Claims Eligibility Agent",
    instructions=enhanced_instructions,
    functions=[],
    model=model_name
)

# Update handoff function
def transfer_to_enhanced_eligibility():
    """Transfer to the enhanced Claims Eligibility Agent."""
    return enhanced_eligibility_agent

# Create enhanced weather agent
enhanced_weather_agent = Agent(
    name="Weather Verification Agent",
    instructions=weather_agent.instructions.replace(
        "transfer_to_eligibility_agent",
        "transfer_to_enhanced_eligibility"
    ),
    functions=[geocode_location, get_bom_weather, transfer_to_enhanced_eligibility],
    model=model_name
)

print("Enhanced agents created")

---

## 9. MLflow Integration for Experiment Tracking

In [None]:
# !pip install mlflow

In [None]:
import mlflow
from datetime import datetime

mlflow.set_experiment("openai-agents-claims")

print(f"MLFlow tracking: {mlflow.get_tracking_uri()}")

In [None]:
def run_tracked_pipeline(location: str, date: str, use_enhanced: bool = False):
    """
    Run pipeline with MLFlow tracking.
    """
    
    run_name = f"agents_{location.split(',')[0]}_{date}"
    if use_enhanced:
        run_name += "_enhanced"
    
    with mlflow.start_run(run_name=run_name):
        mlflow.log_params({
            "framework": "openai-agents",
            "model": model_name,
            "location": location,
            "date": date,
            "use_enhanced": use_enhanced
        })
        
        start_time = datetime.now()
        
        try:
            agent = enhanced_weather_agent if use_enhanced else weather_agent
            
            messages = [{
                "role": "user",
                "content": f"Process claim for {location} on {date}"
            }]
            
            result = run_swarm(agent, messages, client)
            
            duration = (datetime.now() - start_time).total_seconds()
            
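            # Extract decision (content can be None if the run ended on a tool-only turn)
            final = result[-1].get("content") or ""
            decision = "UNKNOWN"
            # Search after the "DECISION" label so words in the reasoning are not picked up first
            text = final.upper()
            text = text[text.find("DECISION"):] if "DECISION" in text else text
            found = {d: text.find(d) for d in ["APPROVED", "DENIED", "REVIEW"] if d in text}
            if found:
                decision = min(found, key=found.get)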
            
            mlflow.log_metrics({
                "duration_seconds": duration,
                "message_count": len(result),
                "success": 1
            })
            
            mlflow.log_text(final, "final_response.txt")
            mlflow.set_tags({"decision": decision, "status": "success"})
            
            print(f"\nLogged to MLFlow - Decision: {decision}, Duration: {duration:.2f}s")
            
            return {"decision": decision, "duration": duration}
            
        except Exception as e:
            mlflow.log_metrics({"success": 0})
            mlflow.set_tags({"status": "error"})
            raise

# Run with tracking
result = run_tracked_pipeline("Brisbane, QLD", "2025-03-07", use_enhanced=False)

In [None]:
# Compare standard vs enhanced
print("Comparing standard vs enhanced...")

std = run_tracked_pipeline("Sydney, NSW", "2025-03-07", use_enhanced=False)
enh = run_tracked_pipeline("Sydney, NSW", "2025-03-07", use_enhanced=True)

print(f"\nStandard: {std}")
print(f"Enhanced: {enh}")

In [None]:
# Log DSPy optimization
with mlflow.start_run(run_name="dspy_optimization"):
    mlflow.log_params({"optimizer": "BootstrapFewShot", "examples": len(training_examples)})
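    # Evaluate may return a float or a result object with .score, depending on the DSPy version
    mlflow.log_metrics({
        "baseline": float(getattr(baseline_score, "score", baseline_score)),
        "optimized": float(getattr(optimized_score, "score", optimized_score))
    })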
    mlflow.log_text(enhanced_instructions, "enhanced_instructions.txt")
    print("Optimization logged")

In [None]:
# View results
exp = mlflow.get_experiment_by_name("openai-agents-claims")
runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])

print("\nExperiment Runs:")
cols = ['run_id', 'status'] + [c for c in runs.columns if c.startswith(('params.', 'metrics.', 'tags.'))][:8]
print(runs[[c for c in cols if c in runs.columns]].to_string())

## 10. Summary & Key Takeaways

### What We Covered

1. **OpenAI Agents SDK**: Lightweight agent pattern with explicit handoffs
2. **Functions as Tools**: Regular Python functions with type hints
3. **Agent Handoffs**: Return Agent objects to transfer control
4. **DSPy Integration**: Optimize instructions and export to agents
5. **MLflow Tracking**: Log experiments and compare results

### Strengths

- Minimal abstraction over raw API
- Clear, explicit control flow
- Easy to understand and debug
- Good for simple agent pipelines

### Challenges

- **OpenAI-only** by default
- Manual orchestration required
- Less structured than frameworks like Pydantic AI
- No built-in streaming for multi-agent

### For Insurance Teams

- **Good for**: Teams already using OpenAI, simple agent workflows
- **Consider alternatives if**: You need non-OpenAI models or strict type safety

### Next Steps

1. Add more handoff patterns (circular, conditional)
2. Implement context passing between agents
3. Add more DSPy training examples
4. Compare with other frameworks

In [None]:
print("Tutorial complete!")