# LlamaIndex Tutorial: Building AI Agents for Insurance

This tutorial introduces **LlamaIndex**, a framework that began as a retrieval-augmented generation (RAG) toolkit and now also provides agent and tool-calling abstractions. Its main strength is connecting LLMs to data sources.

## What You'll Learn

1. LlamaIndex agents and tool concepts
2. Building Weather and Eligibility Agents
3. Sequential agent pipelines
4. DSPy Integration
5. MLflow Tracking

## 1. Installation & Setup

In [None]:
# !pip install llama-index llama-index-llms-openai llama-index-agent-openai
# !pip install httpx beautifulsoup4 python-dotenv
# !pip install dspy mlflow  # needed for sections 7 and 8

In [None]:
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
api_base = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")
model_name = os.getenv("MODEL_NAME", "gpt-4o-mini")

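# Fail early with a clear message instead of a TypeError from os.environ
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file or environment")

# Set for LlamaIndex
os.environ["OPENAI_API_KEY"] = api_key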

print(f"Model: {model_name}")

## 2. Core Concepts

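The key LlamaIndex building blocks are:
- **FunctionTool**: Wraps a Python function as a tool the LLM can call
- **OpenAIAgent**: Agent that relies on OpenAI function calling to select and invoke tools
- **ReActAgent**: Agent that follows the ReAct (reason, act, observe) prompting pattern; not used in this tutorial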
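
To make the idea of "wrapping a function as a tool" concrete, here is a minimal standard-library sketch of the kind of metadata a tool wrapper needs: a name, a description, and the parameter types. It is not LlamaIndex's implementation; `describe_tool` and `lookup_policy` are hypothetical names used only for illustration.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Collect the name, docstring, and parameter types an agent would need to call fn."""
    signature = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


def lookup_policy(policy_id: str, include_history: bool = False) -> str:
    """Return a one-line summary for a policy (hypothetical example function)."""
    return f"Policy {policy_id} (history included: {include_history})"


spec = describe_tool(lookup_policy)
print(spec["name"], spec["parameters"])
```

This is why the tool functions in the next section carry type hints and docstrings: they are the main source of information the LLM has when deciding which tool to call and with which arguments.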

In [None]:
from llama_index.llms.openai import OpenAI

# Create LLM
llm = OpenAI(
    model=model_name,
    api_key=api_key,
    api_base=api_base if api_base != "https://api.openai.com/v1" else None
)

# Test
response = llm.complete("What is 2+2?")
print(response.text)

## 3. Define Tools

In [None]:
import httpx
from bs4 import BeautifulSoup
from llama_index.core.tools import FunctionTool

def geocode_location(location: str) -> str:
    """
    Convert a location name to coordinates.
    
    Args:
        location: Address or place name
    
    Returns:
        String with location and coordinates
    """
    try:
        with httpx.Client() as client:
            r = client.get(
                "https://nominatim.openstreetmap.org/search",
                params={"q": f"{location}, Australia", "format": "json", "limit": 1},
                headers={"User-Agent": "InsuranceBot/1.0"},
                timeout=10.0
            )
            if r.status_code == 200 and r.json():
                d = r.json()[0]
                return f"Location: {d['display_name']}\nLatitude: {d['lat']}\nLongitude: {d['lon']}"
            return "Error: Location not found"
    except Exception as e:
        return f"Error: {e}"

def get_bom_weather(latitude: str, longitude: str, date: str) -> str:
    """
    Fetch weather from BOM.
    
    Args:
        latitude: Latitude coordinate
        longitude: Longitude coordinate
        date: Date in YYYY-MM-DD format
    
    Returns:
        Weather report
    """
    try:
        year, month, day = date.split("-")
        url = "https://reg.bom.gov.au/cgi-bin/climate/storms/get_storms.py"
        params = {
            "begin_day": day, "begin_month": month, "begin_year": year,
            "end_day": day, "end_month": month, "end_year": year,
            "lat": float(latitude), "lng": float(longitude),
            "event": "all", "distance_from_point": "50", "states": "all"
        }
        
        with httpx.Client() as client:
            r = client.get(url, params=params, timeout=15.0)
            if r.status_code != 200:
                return f"Error: HTTP {r.status_code}"
            
            soup = BeautifulSoup(r.text, 'html.parser')
            events = []
            for row in soup.find_all('tr')[1:]:
                cells = row.find_all('td')
                if len(cells) >= 2:
                    event = cells[0].get_text(strip=True)
                    if event:
                        events.append(event)
            
            has_thunder = any('thunder' in e.lower() or 'lightning' in e.lower() for e in events)
            has_wind = any('wind' in e.lower() or 'gust' in e.lower() for e in events)
            
            return f"""Date: {date}
Events: {', '.join(events) if events else 'None'}
Has Thunderstorm: {has_thunder}
Has Strong Wind: {has_wind}"""
    except Exception as e:
        return f"Error: {e}"

# Create FunctionTools
geocode_tool = FunctionTool.from_defaults(fn=geocode_location)
weather_tool = FunctionTool.from_defaults(fn=get_bom_weather)

print(f"Tools: {geocode_tool.metadata.name}, {weather_tool.metadata.name}")
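
`get_bom_weather` splits the date on `-` without checking it, so a malformed value such as `07/03/2025` only surfaces as a generic error string. One possible hardening step, sketched here under the assumption that the endpoint accepts the same zero-padded values the original code sends, is to validate the date first. `split_iso_date` is a hypothetical helper, not part of the tutorial code.

```python
from datetime import date
from typing import Tuple


def split_iso_date(value: str) -> Tuple[str, str, str]:
    """Validate a YYYY-MM-DD string and return (year, month, day) as zero-padded strings."""
    parsed = date.fromisoformat(value)  # raises ValueError for malformed or impossible dates
    return f"{parsed.year:04d}", f"{parsed.month:02d}", f"{parsed.day:02d}"


year, month, day = split_iso_date("2025-03-07")
print(year, month, day)
```

Inside the tool, the call could replace `date.split("-")`; the existing `except Exception` branch would then report the `ValueError` message to the agent.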

In [None]:
# Test
print(geocode_location("Brisbane, QLD"))
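
Both tools return plain `Key: value` text, which suits an LLM but is awkward for ordinary Python code. If you want to inspect the values programmatically, a small parser along these lines may help. `parse_report` is a hypothetical helper, and the sample text only mimics the shape of the tool output with illustrative values.

```python
from typing import Dict


def parse_report(text: str) -> Dict[str, str]:
    """Split 'Key: value' lines into a dict, ignoring lines without a colon."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Sample text in the same shape that geocode_location returns (values are illustrative)
sample = "Location: Brisbane City, Queensland, Australia\nLatitude: -27.47\nLongitude: 153.02"
fields = parse_report(sample)
print(float(fields["Latitude"]), float(fields["Longitude"]))
```

Because the tools report failures as `Error: ...` strings, checking for an `"Error"` key in the parsed dict is a simple way to detect a failed call.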

## 4. Create Weather Agent

In [None]:
from llama_index.agent.openai import OpenAIAgent

weather_agent = OpenAIAgent.from_tools(
    tools=[geocode_tool, weather_tool],
    llm=llm,
    system_prompt="""You are a Weather Verification Agent for insurance.

1. Use geocode_location to get coordinates
2. Use get_bom_weather to fetch weather data
3. Compile a structured report

Include: location, coordinates, date, events, thunderstorm status, wind status.""",
    verbose=True
)

print("Weather agent created")

In [None]:
# Run weather agent
def run_weather_agent(location: str, date: str):
    print(f"\n[Weather Agent] Verifying {location} on {date}")
    response = weather_agent.chat(f"Verify weather for {location} on {date}")
    return str(response)

weather_report = run_weather_agent("Brisbane, QLD", "2025-03-07")
print("\nReport:")
print(weather_report)
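
With `verbose=True`, the agent prints each tool call it makes. The control flow behind that output can be sketched without any LLM: a policy looks at the observations so far and either requests another tool call or returns a final answer. Everything below is a stand-in for illustration (`scripted_policy` plays the role of the model, and the stub tools return fixed, made-up data); it is not how `OpenAIAgent` is implemented.

```python
from typing import Any, Callable, Dict, List, Tuple

Action = Tuple[str, str, Dict[str, Any]]  # (kind, tool name or final text, tool kwargs)


def stub_geocode(location: str) -> str:
    """Stand-in for geocode_location; returns fixed illustrative coordinates."""
    return "Latitude: -27.47\nLongitude: 153.02"


def stub_weather(latitude: str, longitude: str, date: str) -> str:
    """Stand-in for get_bom_weather; returns a fixed illustrative report."""
    return f"Date: {date}\nHas Thunderstorm: True\nHas Strong Wind: True"


def scripted_policy(observations: List[str]) -> Action:
    """Plays the role of the LLM: choose the next action from what has been observed."""
    if len(observations) == 0:
        return ("call", "geocode", {"location": "Brisbane, QLD"})
    if len(observations) == 1:
        return ("call", "weather",
                {"latitude": "-27.47", "longitude": "153.02", "date": "2025-03-07"})
    return ("final", "Report:\n" + "\n".join(observations), {})


def run_loop(policy: Callable[[List[str]], Action],
             tools: Dict[str, Callable[..., str]],
             max_steps: int = 5) -> str:
    """Call tools until the policy returns a final answer or the step limit is hit."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, target, kwargs = policy(observations)
        if kind == "final":
            return target
        observations.append(tools[target](**kwargs))
    return "Stopped: step limit reached"


answer = run_loop(scripted_policy, {"geocode": stub_geocode, "weather": stub_weather})
print(answer)
```

The step limit matters in practice: a real model can keep requesting tools, so an agent loop needs some bound on iterations.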

## 5. Create Eligibility Agent

In [None]:
eligibility_agent = OpenAIAgent.from_tools(
    tools=[],  # No tools
    llm=llm,
    system_prompt="""You are a Claims Eligibility Agent.

Rules:
- APPROVED: Both thunderstorms AND strong winds detected in Australia
- REVIEW: Only one severe weather type detected
- DENIED: No severe weather or outside Australia

Provide: DECISION, REASONING, CONFIDENCE (High/Medium/Low)""",
    verbose=True
)

def run_eligibility_agent(weather_report: str):
    print("\n[Eligibility Agent] Determining...")
    response = eligibility_agent.chat(f"Determine eligibility:\n\n{weather_report}")
    return str(response)

eligibility_result = run_eligibility_agent(weather_report)
print("\nDecision:")
print(eligibility_result)
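
The rules in the system prompt are simple enough to write down directly, which gives a deterministic baseline for checking the LLM's decisions. The sketch below encodes the three rules as stated; `decide` and its `in_australia` flag are hypothetical, since the weather report in this tutorial has no explicit field for the country.

```python
def decide(has_thunderstorm: bool, has_strong_wind: bool, in_australia: bool = True) -> str:
    """Apply the eligibility rules from the system prompt without an LLM."""
    if not in_australia:
        return "DENIED"  # outside Australia is always denied
    severe_types = int(has_thunderstorm) + int(has_strong_wind)
    if severe_types == 2:
        return "APPROVED"  # both thunderstorms and strong winds
    if severe_types == 1:
        return "REVIEW"  # only one severe weather type
    return "DENIED"  # no severe weather


for flags in [(True, True), (True, False), (False, False)]:
    print(flags, decide(*flags))
```

Comparing the agent's answer with this baseline on a handful of reports is a cheap way to spot cases where the model drifts from the stated rules.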

## 6. Complete Pipeline

In [None]:
def process_claim(location: str, date: str):
    """Run complete claims pipeline."""
    
    print("=" * 60)
    print(f"Processing: {location} on {date}")
    print("=" * 60)
    
    # Reset agent memory
    weather_agent.reset()
    eligibility_agent.reset()
    
    # Step 1: Weather
    weather_report = run_weather_agent(location, date)
    
    # Step 2: Eligibility
    eligibility_result = run_eligibility_agent(weather_report)
    
    print("\n" + "=" * 60)
    print("FINAL:")
    print("=" * 60)
    print(eligibility_result)
    
    return {"weather": weather_report, "decision": eligibility_result}

result = process_claim("Brisbane, QLD", "2025-03-07")
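
`process_claim` calls the two agents through module-level globals, so it cannot run without API access. A variant that takes the two steps as arguments, sketched here, keeps the same sequential structure while allowing stubs to be swapped in for offline testing. `process_claim_with` and both stub functions are hypothetical names.

```python
from typing import Callable, Dict


def process_claim_with(weather_step: Callable[[str, str], str],
                       eligibility_step: Callable[[str], str],
                       location: str, date: str) -> Dict[str, str]:
    """Run the weather step, then feed its report to the eligibility step."""
    weather = weather_step(location, date)
    decision = eligibility_step(weather)
    return {"weather": weather, "decision": decision}


def stub_weather_step(location: str, date: str) -> str:
    """Stand-in for run_weather_agent; returns a fixed illustrative report."""
    return f"Location: {location}\nDate: {date}\nHas Thunderstorm: True\nHas Strong Wind: False"


def stub_eligibility_step(report: str) -> str:
    """Stand-in for run_eligibility_agent; keys off one line of the report."""
    return "DECISION: REVIEW" if "Has Strong Wind: False" in report else "DECISION: APPROVED"


offline_result = process_claim_with(stub_weather_step, stub_eligibility_step,
                                    "Brisbane, QLD", "2025-03-07")
print(offline_result["decision"])
```

In the notebook, passing `run_weather_agent` and `run_eligibility_agent` as the two steps would give the same flow as `process_claim`, minus the printing and memory reset.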

---

## 7. DSPy Integration

In [None]:
import dspy

dspy_lm = dspy.LM(model=f"openai/{model_name}", api_key=api_key, api_base=api_base)
dspy.configure(lm=dspy_lm)

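class EligSig(dspy.Signature):
    """Decide catastrophe (CAT) claim eligibility from a weather report.

    APPROVED if both thunderstorms and strong winds are reported, REVIEW if only
    one is, DENIED otherwise.
    """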
    weather_report: str = dspy.InputField()
    decision: str = dspy.OutputField(desc="APPROVED/REVIEW/DENIED")
    reasoning: str = dspy.OutputField()

elig_mod = dspy.ChainOfThought(EligSig)

# Optimize
from dspy.teleprompt import BootstrapFewShot

examples = [
    dspy.Example(weather_report="Thunder+Wind. Both: True", decision="APPROVED", reasoning="Both met").with_inputs("weather_report"),
    dspy.Example(weather_report="Rain. Both: False", decision="DENIED", reasoning="No severe").with_inputs("weather_report"),
    dspy.Example(weather_report="Thunder only. Wind: False", decision="REVIEW", reasoning="One met").with_inputs("weather_report"),
]

metric = lambda ex, pred, trace=None: ex.decision.upper() == pred.decision.upper()
optimizer = BootstrapFewShot(metric=metric, max_bootstrapped_demos=2)
optimized_mod = optimizer.compile(elig_mod, trainset=examples)

print("DSPy optimized")
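
The metric passed to the optimizer takes an example, a prediction, and an optional trace, and returns whether the decision labels match. The same shape of function can be used to score a batch of predictions. The sketch below uses `SimpleNamespace` objects as stand-ins for DSPy examples and predictions so it runs without an LLM; `decision_match` and `accuracy` are hypothetical helpers, and the predicted labels are made up.

```python
from types import SimpleNamespace


def decision_match(example, prediction, trace=None) -> bool:
    """Compare decision labels, ignoring case and surrounding whitespace."""
    return example.decision.strip().upper() == prediction.decision.strip().upper()


def accuracy(examples, predictions) -> float:
    """Fraction of predictions whose decision matches the expected label."""
    matches = sum(decision_match(ex, pred) for ex, pred in zip(examples, predictions))
    return matches / len(examples)


gold = [SimpleNamespace(decision="APPROVED"),
        SimpleNamespace(decision="DENIED"),
        SimpleNamespace(decision="REVIEW")]
predicted = [SimpleNamespace(decision="approved "),
             SimpleNamespace(decision="REVIEW"),
             SimpleNamespace(decision="Review")]

score = accuracy(gold, predicted)
print(f"{score:.2f}")
```

With only three training examples, as in this tutorial, any such score is a smoke test rather than a meaningful evaluation; a held-out set of labeled reports would be needed for that.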

In [None]:
# DSPy-enhanced pipeline
def process_with_dspy(location: str, date: str):
    weather_agent.reset()
    
    print(f"\n[DSPy Pipeline] {location} on {date}")
    weather_report = run_weather_agent(location, date)
    
    r = optimized_mod(weather_report=weather_report)
    print(f"\nDSPy Decision: {r.decision}")
    print(f"Reasoning: {r.reasoning}")
    
    return {"weather": weather_report, "decision": r.decision, "reasoning": r.reasoning}

dspy_result = process_with_dspy("Brisbane, QLD", "2025-03-07")

## 8. MLflow Integration

In [None]:
import mlflow
from datetime import datetime

mlflow.set_experiment("llamaindex-claims")

def run_tracked(location: str, date: str, use_dspy: bool = False):
    run_name = f"llama_{location.split(',')[0]}"
    if use_dspy:
        run_name += "_dspy"
    
    with mlflow.start_run(run_name=run_name):
        mlflow.log_params({"framework": "llamaindex", "model": model_name, "use_dspy": use_dspy})
        
        start = datetime.now()
        
        if use_dspy:
            result = process_with_dspy(location, date)
        else:
            result = process_claim(location, date)
        
        duration = (datetime.now() - start).total_seconds()
        
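        # Take the label that appears first in the text, so a later mention
        # (e.g. "not APPROVED" in the reasoning) does not override the decision
        text = str(result.get("decision", "")).upper()
        found = [(text.find(d), d) for d in ["APPROVED", "DENIED", "REVIEW"] if d in text]
        decision = min(found)[1] if found else "UNKNOWN"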
        
        mlflow.log_metrics({"duration": duration})
        mlflow.set_tags({"decision": decision})
        
        print(f"Logged: {decision}, {duration:.2f}s")

run_tracked("Brisbane, QLD", "2025-03-07", use_dspy=True)

## 9. Summary

### Covered
- LlamaIndex FunctionTool and agents
- OpenAIAgent for tool calling
- Sequential weather and eligibility pipeline
- DSPy + MLflow integration

### Strengths
- Excellent for RAG applications
- Strong data connector ecosystem
- Good documentation

### Challenges
- Less agent-focused than some frameworks
- APIs and import paths change between versions, so pin package versions for reproducibility

In [None]:
print("Tutorial complete!")