# Smart Insurance Claim Processing with Box, the OpenAI Agents SDK, and the Responses API

## Introduction
In this demo, we'll create an intelligent insurance claim processing system that combines Box for content storage with OpenAI's Agents SDK and Responses API. Our system will:
1. Create a set of initial demo content
2. Access and analyze car crash photos stored in Box
3. Generate damage cost estimates based on the photos
4. Search for nearby repair shops with good ratings
5. Compile all this information into a pre-inspection report for the insurance adjuster using Box Doc Gen

## Prerequisites 

### Box Setup
To follow this walkthrough, you need a Box Enterprise Advanced instance or sandbox.

1. Go to [Box Developer Console](https://app.box.com/developers/console)
2. Create a new application with the following settings:
   - Select "Custom App"
   - Give your app a name like "Insurance Claim Processor"
   - Select the "Automation" purpose
   - Select the "User Authentication (OAuth 2.0)" authentication method type
   - Click "Create App"
   - Add http://localhost:4000/oauth2/callback as a Redirect URI
      - Remove the default Redirect URI
   - Under the Configuration tab, and under Application Scopes, enable the following:
     - Read all files and folders stored in Box
     - Write all files and folders stored in Box
     - Manage signature requests
     - Manage Doc Gen
   - Click "Save Changes" in the top right
3. Once created, note down the Client ID and Client Secret

To use Box Doc Gen, make sure an admin has enabled it in the Admin Console. If you are a Box Admin, you will find the necessary information in the [Enterprise Settings Content & Sharing Tab](https://support.box.com/hc/en-us/articles/4404822772755-Enterprise-Settings-Content-Sharing-Tab#h_01FYQGK5RW42T07GV985MQ9E9A) documentation.

### OpenAI Setup
To follow this walkthrough, you need an OpenAI account with an organization and billing attached. You can find out more about that process in the [OpenAI documentation](https://platform.openai.com/).

1. Once your account is created, create an API key and note it down

## Step 1: Setup environment
First, create a virtual environment, install the required Python packages, import the necessary libraries, and configure your environment variables.

In [40]:
# Create a virtual environment for the project
!python3 -m venv venv

In [1]:
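# Activate the virtual environment
# For Windows, use the following command instead:
# !venv\Scripts\activate
# Note: each `!` command runs in its own subshell, so this activation does not
# carry over to later cells. To use the venv, activate it before launching Jupyter.
!source venv/bin/activate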

In [None]:
# Install the required packages
!pip3 install box-sdk-gen flask openai openai-agents pydantic

In [None]:
# Import the necessary libraries
import os
import webbrowser
from flask import Flask, request
from threading import Thread
from box_sdk_gen import (
    BoxOAuth,
    BoxClient,
    OAuthConfig,
    FileTokenStorage,
    GetAuthorizeUrlOptions,
    CreateFolderParent,
    UploadFileAttributes,
    UploadFileAttributesParentField,
    FileReferenceV2025R0,
    CreateDocgenBatchV2025R0DestinationFolder,
    DocGenDocumentGenerationDataV2025R0,
)
from agents import Agent, Runner, function_tool, WebSearchTool, input_guardrail, GuardrailFunctionOutput, trace
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions
import datetime
from openai import OpenAI
from pydantic import BaseModel
from typing import List, Optional
import json

# Box Custom Application IDs (replace with your own)
BOX_CLIENT_ID = 'client_id'
BOX_CLIENT_SECRET = 'client_secret'

# OpenAI API key (replace with your own)
os.environ['OPENAI_API_KEY'] = 'your_openai_api_key'
openAIClient = OpenAI()

# Paths to local files and folders
IMAGES_FOLDER_PATH = "./supporting_files/images"
REPORT_TEMPLATE_PATH = "./supporting_files/finished_report_template.docx"
CLAIM_INFO_FILE_PATH = "./supporting_files/dummy_customer_information.json"

# Folder and claim details
BASE_FOLDER_NAME = "Open AI Demo"
CLAIMS_FOLDER_NAME = "Claims"
CLAIM_ID = "G8947892834455"
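# ID of the existing Box folder to create the demo folder in (replace with your own; "0" is the root folder)
BASE_FOLDER_ID = "317948020158"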

# Global variable to store folder and file IDs for later use
uploaded_ids = {}

print("Environment variables set")
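
Hardcoding credentials in a notebook makes them easy to leak. As a minimal sketch (not part of the original demo), the values could instead be read from environment variables. The helper name `load_setting` and the environment variable names are assumptions made for illustration.

```python
import os
from typing import Optional


def load_setting(name: str, default: Optional[str] = None) -> str:
    """Return an environment variable, or fail loudly if it is missing."""
    value = os.environ.get(name, default)
    if not value:
        raise RuntimeError(f"Missing required setting: {name}")
    return value


# Hypothetical variable names; the defaults are placeholders so the sketch runs as-is.
client_id = load_setting("BOX_CLIENT_ID", default="client_id")
client_secret = load_setting("BOX_CLIENT_SECRET", default="client_secret")
print("Box credentials loaded")
```

Dropping the `default` argument makes a missing variable raise immediately, which is usually preferable to failing later during the OAuth exchange.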

In [None]:
# Pydantic models for structured outputs
class InsuranceGuardrailOutput(BaseModel):
    is_insurance_related: bool
    reasoning: str

class DamageInfo(BaseModel):
    damage_description: str
    damaged_parts_list: List[str]

class EstimatedCost(BaseModel):
    estimated_cost: str   

class Claim(BaseModel):
    report_date: str 
    assigned_adjuster: str
    other_driver_insurance_company: Optional[str] = None
    other_driver_policy_number: Optional[str] = None
    cross_streets_of_accident: str
    date_of_incident: str
    time_of_incident: str
    number_of_vehicles_involved: str
    customer_initial_report: str
    law_enforcement_agency: Optional[str] = None
    law_enforcement_report_id: Optional[str] = None
    damage_description: Optional[str] = None
    estimated_cost: Optional[str] = None
    damaged_parts_list: Optional[List[str]] = None 

class Customer(BaseModel):
    first_name: str
    last_name: str
    street_address: str
    city: str
    state: str
    zip_code: str
    phone_number: str
    email: str
    car_year: str
    car_make: str
    car_model: str
    vin: str
    license_plate: str
    car_mileage: str
    car_color: str
    policy_number: str

class Shop(BaseModel):
    shop_name: str
    shop_address: str
    shop_phone: str

class Shops(BaseModel):
    shop_one: Optional[Shop] = None
    shop_two: Optional[Shop] = None
    shop_three: Optional[Shop] = None

class InsuranceReport(BaseModel):
    claim_number: str
    claim: Claim
    customer: Customer
    shops: Shops
print("Pydantic models defined")

## Step 2: Authenticate with Box and OpenAI
This step opens a web page so that you can authenticate using OAuth 2.0. The tokens are stored in a local db file; to reset authentication, delete that file.

In [None]:
# Flask app for handling the redirect
app = Flask(__name__)
auth_code = None  # Global variable to store the authorization code

# Flask route to handle the redirect
@app.route("/oauth2/callback")
def oauth2_callback():
    global auth_code
    auth_code = request.args.get("code")
    return "Authorization successful! You can close this window."

# Function to start the Flask app in a separate thread
def start_flask_app():
    app.run(port=4000)

# Function to authenticate using OAuth 2.0 with built-in file token storage
def authenticate_box():
    # Create the token storage object
    token_storage = FileTokenStorage()  # Uses the built-in file-based token storage

    # Create the OAuth configuration
    oauth_config = OAuthConfig(
        client_id=BOX_CLIENT_ID,
        client_secret=BOX_CLIENT_SECRET,
        token_storage=token_storage,
    )

    # Initialize the BoxOAuth object
    oauth = BoxOAuth(config=oauth_config)

    # Check if tokens already exist in storage
    if token_storage.get():
        print("Loaded existing tokens from storage.")
    else:
        # Generate the authorization URL with explicit redirect_uri
        auth_url = oauth.get_authorize_url(
            options=GetAuthorizeUrlOptions(redirect_uri="http://localhost:4000/oauth2/callback")
        )
        print(f"Go to the following URL to authorize the application: {auth_url}")

        # Start the Flask app in a separate thread
        flask_thread = Thread(target=start_flask_app)
        flask_thread.daemon = True
        flask_thread.start()

        # Open the authorization URL in the default web browser
        webbrowser.open(auth_url)

        # Wait for the authorization code to be set (sleep briefly to avoid a busy loop)
        import time
        global auth_code
        while auth_code is None:
            time.sleep(0.1)

        # Exchange the authorization code for tokens
        oauth.get_tokens_authorization_code_grant(auth_code)
        print("Authentication successful! Tokens saved to storage.")

    # Instantiate and return the Box client
    return BoxClient(auth=oauth)

# Authenticate and create the Box client
box_client = authenticate_box()
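
The wait loop in `authenticate_box` never gives up if the user closes the browser without authorizing. One possible alternative, sketched here with the standard library only, is a `threading.Event` with a timeout. The names `on_callback` and `wait_for_auth_code` are hypothetical, and the redirect is simulated with a timer instead of Flask so the sketch runs on its own.

```python
import threading

auth_code_ready = threading.Event()
received = {}


def on_callback(code: str) -> None:
    """What the Flask route would do: store the code and wake the waiter."""
    received["code"] = code
    auth_code_ready.set()


def wait_for_auth_code(timeout_seconds: float) -> str:
    """Block until the callback fires, or raise if no code arrives in time."""
    if not auth_code_ready.wait(timeout=timeout_seconds):
        raise TimeoutError("No authorization code received in time")
    return received["code"]


# Simulate the browser redirect arriving shortly after we start waiting.
threading.Timer(0.1, on_callback, args=("demo-code",)).start()
code = wait_for_auth_code(timeout_seconds=5)
print(f"Received authorization code: {code}")
```

In the real flow, the body of `on_callback` would live inside the `/oauth2/callback` route.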

The cell below is optional, but running it confirms that your OpenAI setup is working correctly.

In [None]:
agent = Agent(name="Assistant", instructions="You are a helpful assistant")

result = await Runner.run(agent, "Write a haiku about recursion in programming.")  # type: ignore[top-level-await]  # noqa: F704
print(result.final_output)

# Code within the code,
# Functions calling themselves,
# Infinite loop's dance.
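
The top-level `await` above works because a notebook already runs inside an event loop. In a plain Python script that is not the case. A hedged sketch of the equivalent pattern uses `asyncio.run`; the coroutine below is a stand-in so the example runs without an API key, and `run_check` is a hypothetical name.

```python
import asyncio


async def run_check() -> str:
    # Stand-in for:
    #   result = await Runner.run(agent, "Write a haiku about recursion in programming.")
    #   return result.final_output
    await asyncio.sleep(0)
    return "ok"


if __name__ == "__main__":
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    print(asyncio.run(run_check()))
```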

## Step 3: Upload Dummy Content
Create content in your Box instance based on the demo content in this repository. This creates a parent folder for the demo named after `BASE_FOLDER_NAME` ("Open AI Demo"). Within that folder, it creates a folder called "Claims" and uploads the finished report template file, which it then marks as a Doc Gen template. Within the Claims folder, it creates a folder called "G8947892834455" for the specific claim in question. Within that folder, it creates a folder called "images" and uploads the dummy crash images from the supporting_files images folder. The dummy claim information file is not uploaded; it is read from the local supporting_files folder in Step 6. The IDs of the created folders and files are saved for later use and logged in the output.

In [None]:
def upload_dummy_content(client, base_folder_name, claims_folder_name, claim_id, images_folder_path, base_folder_id, report_template_path):
    # 1. Create the base folder
    base_folder = client.folders.create_folder(
        base_folder_name,
        CreateFolderParent(id=base_folder_id)
    )
    print(f"Created base folder: {base_folder.name} (ID: {base_folder.id})")

    # 2. Create the Claims folder under the base folder
    claims_folder = client.folders.create_folder(
        claims_folder_name,
        CreateFolderParent(id=base_folder.id)
    )
    print(f"Created Claims folder: {claims_folder.name} (ID: {claims_folder.id})")

    # 3. Create a folder for this specific claim
    claim_folder = client.folders.create_folder(
        claim_id,
        CreateFolderParent(id=claims_folder.id)
    )
    print(f"Created claim folder: {claim_folder.name} (ID: {claim_folder.id})")

    # 4. Create an images folder inside the claim folder
    images_folder = client.folders.create_folder(
        "images",
        CreateFolderParent(id=claim_folder.id)
    )
    print(f"Created images folder: {images_folder.name} (ID: {images_folder.id})")

    # 5. Upload all crash images to the images folder
    uploaded_image_ids = {}  

    for image_file in os.listdir(images_folder_path):
        image_path = os.path.join(images_folder_path, image_file)
        if os.path.isfile(image_path):
            with open(image_path, "rb") as img_stream:
                img_result = client.uploads.upload_file(
                    UploadFileAttributes(
                        name=image_file,
                        parent=UploadFileAttributesParentField(id=images_folder.id)
                    ),
                    img_stream
                )
            uploaded_image = img_result.entries[0]
            print(f"Uploaded image: {uploaded_image.name} (ID: {uploaded_image.id})")    
            uploaded_image_ids[uploaded_image.name] = uploaded_image.id

    # Upload the report template to the base folder
    with open(report_template_path, 'rb') as f:
        uploaded_result = client.uploads.upload_file(
            UploadFileAttributes(
                name="report_template.docx",
                parent=UploadFileAttributesParentField(id=base_folder.id)
            ),
            f
        )
        uploaded_template = uploaded_result.entries[0]
        print(f"Uploaded report template (ID: {uploaded_template.id})")
    
    # Mark the template as a Doc Gen template
    try:
        docgen_template = client.docgen_template.create_docgen_template_v2025_r0(FileReferenceV2025R0(id=uploaded_template.id))
        print(f"Marked file as Doc Gen template: {docgen_template}")
    except Exception as e:
        print(f"Error marking file as Doc Gen template: {str(e)}")

    # 7. Return the relevant IDs
    return {
        "base_folder_id": base_folder.id,
        "claims_folder_id": claims_folder.id,
        "claim_folder_id": claim_folder.id,
        "images_folder_id": images_folder.id,
        "uploaded_image_ids": uploaded_image_ids,
        "template_file_id": uploaded_template.id
    }

# Use the function, then print output
uploaded_ids = upload_dummy_content(
    box_client,
    BASE_FOLDER_NAME,
    CLAIMS_FOLDER_NAME,
    CLAIM_ID,
    IMAGES_FOLDER_PATH,
    BASE_FOLDER_ID, 
    REPORT_TEMPLATE_PATH
)
print("Uploaded content IDs:", uploaded_ids)
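
The upload loop sends every file it finds in the images folder, which can include stray files such as `.DS_Store`. As a minimal sketch, assuming only common image extensions should be uploaded, the directory listing could be filtered first. `IMAGE_EXTENSIONS` and `list_image_files` are hypothetical names, and the file names are made up for the demonstration.

```python
import tempfile
from pathlib import Path

# Assumed set of allowed image types; adjust to match your content.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}


def list_image_files(folder: str) -> list[Path]:
    """Return image files in `folder`, sorted by path, skipping everything else."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Demonstrate on a temporary folder with a mix of image and non-image files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["front.JPG", "rear.png", ".DS_Store", "notes.txt"]:
        (Path(tmp) / name).write_bytes(b"")
    print([p.name for p in list_image_files(tmp)])
```

The returned paths could then replace the `os.listdir` loop in `upload_dummy_content`.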

## Step 4: Define Custom Tools and Guardrails
Define custom tools and guardrails to use during the agentic flow.

In [None]:
@function_tool
def analyze_vehicle_damage_gpt4v(
    image_urls: list[str], year: str, make: str, model: str
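) -> dict:
    """Analyze vehicle damage photos and return a damage description and a list of damaged parts.

    image_urls: URLs (or data URIs) of the damage photos.
    year, make, model: the vehicle being assessed.
    """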
    input_content = [
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": (
                        f"Think deeply. Based on these images, create a list of parts that you estimate "
                        f"will need repaired or replaced, as well as a detailed description of the damage "
                        f"to the vehicle. Make sure to think about and include hidden damage that you "
                        f"might not be able to see in the photos, for example camera in the bumper or the "
                        f"batteries the vehicle uses. The vehicle is a {year} {make} {model}. "
                        f"Return a valid JSON object that includes the following:\n"
                        f' - \"damage_description\": a concise summary describing the overall damage to '
                        f"the vehicle.\n"
                        f' - \"damaged_parts_list\": a list of parts that need repair or replacement.\n'
                        f"Do NOT include any cost estimates, extra explanations, or any text outside of "
                        f"this JSON object. Follow proper JSON formatting."
                    ),
                }
            ]
            # ⇢ feed data URIs to the vision model
            + [{"type": "input_image", "image_url": url} for url in image_urls],
        }
    ]

    schema = {
        "type": "object",
        "properties": {
            "damage_description": {"type": "string"},
            "damaged_parts_list": {
                "type": "array",
                "items": {"type": "string"}
            }
        },
        "required": ["damage_description", "damaged_parts_list"],
        "additionalProperties": False
    }

    print(input_content)
    response = openAIClient.responses.create(
        model="gpt-4.1",
        input=input_content,
        text={
            "format": {
                "type": "json_schema",
                "name": "DamageInfo",
                "schema": schema,
                "strict": True
            }
        }
    )

    return json.loads(response.output_text)

print("Custom analysis tool defined")

@input_guardrail
async def insurance_query_guardrail(_context, _agent, input_data: str):
    """
    Input‑guardrail → returns GuardrailFunctionOutput.
    Unused arguments are prefixed “_”.
    """
    guardrail_agent = Agent(
        name="Insurance Guardrail",
        instructions=(
            "Return JSON {is_insurance_related: bool, reasoning: str}. "
            "True only if the query is clearly about an insurance claim."
        ),
        model="gpt-4.1-mini",
        output_type=InsuranceGuardrailOutput,
    )

    result = await Runner.run(guardrail_agent, input_data)
    final_output = result.final_output_as(InsuranceGuardrailOutput)

    return GuardrailFunctionOutput(
        output_info=final_output,
        tripwire_triggered=not final_output.is_insurance_related,
    )

print("Input guardrail defined")
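
The comment in the analysis tool mentions data URIs, while the workflow later passes Box download URLs. If a URL turns out not to be reachable by the model, one option is to embed the image bytes directly as a base64 data URI. The sketch below shows that encoding with the standard library. `to_data_uri` is a hypothetical helper and the bytes are a placeholder (just the PNG signature), not a real photo.

```python
import base64
import mimetypes


def to_data_uri(file_name: str, data: bytes) -> str:
    """Encode raw image bytes as a data URI, inferring the MIME type from the name."""
    mime_type, _ = mimetypes.guess_type(file_name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image type: {file_name}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes: the 8-byte PNG file signature only.
uri = to_data_uri("crash_front.png", b"\x89PNG\r\n\x1a\n")
print(uri.split(",", 1)[0])
```

The resulting string can be used wherever the tool expects an entry in `image_urls`.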

## Step 5: Create Specialized Agents

Now, let's create specialized agents for different parts of the insurance claim process.

In [None]:
repair_shop_agent = Agent(
    name="Repair Shop Finder",
    instructions=prompt_with_handoff_instructions(
        """
        Find exactly THREE reputable auto‑repair shops near the provided address
        that can perform the required repairs.

        Output ONLY this JSON object:

        {
          "shop_one":   { "shop_name": "...", "shop_address": "...", "shop_phone": "..." },
          "shop_two":   { "shop_name": "...", "shop_address": "...", "shop_phone": "..." },
          "shop_three": { "shop_name": "...", "shop_address": "...", "shop_phone": "..." }
        }

        After you reply, immediately hand off to the orchestrator
        by calling transfer_to_insurance_claim_orchestrator().
        """
    ),
    model="gpt-4.1-mini",
    tools=[WebSearchTool()],
    handoffs=[],
    output_type=Shops
)

cost_estimation_agent = Agent(
    name="Cost Estimator",
    instructions=prompt_with_handoff_instructions(
        """
        Estimate a realistic TOTAL repair‑cost range (parts + labour) based on
        the supplied damage description and parts list.

        Respond ONLY with JSON in exactly this form
        (no back‑ticks, no markdown, no explanation):

        { "estimated_cost": "$X – $Y" }

        After you reply, hand off automatically to Repair Shop Finder.
        """
    ),
    model="gpt-4.1-mini",
    handoffs=[repair_shop_agent],
    output_type=EstimatedCost
)

image_analysis_agent = Agent(
    name="Image Analyst",
    instructions=prompt_with_handoff_instructions(
        """
        You are a specialist at analysing vehicle‑damage photos.

        Call analyze_vehicle_damage_gpt4v to obtain the analysis.

        Make sure to think about and include hidden damage that you might not be able to see in the photos for example camera in the bumper or the batteries the vehicle uses.

        Respond ONLY with JSON:
        {
          "damage_description": <string>,
          "damaged_parts_list": <string list>
        }

        After you reply, hand off automatically to Cost Estimator
        by calling transfer_to_cost_estimator().
        """
    ),
    model="gpt-4.1",
    tools=[analyze_vehicle_damage_gpt4v],
    handoffs=[cost_estimation_agent],                   
    output_type=DamageInfo
)

claim_orchestrator = Agent(
    name="Insurance Claim Orchestrator",
    instructions=prompt_with_handoff_instructions("""
        You are an insurance claim processing Orchestrator.

        Your job is to process a complete insurance claim by:
        1. Verifying the query is insurance-related using the guardrail insurance_query_guardrail
        2. Analyzing vehicle damage from the provided photos using image_analysis_agent
        3. Estimating repair costs based on the damage assessment using cost_estimation_agent
        4. Finding recommended repair shops near the customer using repair_shop_agent

        CRITICAL INSTRUCTIONS:
        - You are an agent - please keep going until the user's query is completely resolved, before ending your turn
        - You MUST use ALL three specialist agents in sequence: image_analysis_agent, cost_estimation_agent, and repair_shop_agent
        - If you are not sure about information, use your specialist agents to gather it - do NOT guess
        - You MUST plan extensively before each handoff, and reflect on the outcomes before proceeding to the next agent
        - NEVER stop processing after just the first agent completes - you must complete ALL steps

        WORKFLOW:
        1. First, use image_analysis_agent to analyze damage
        2. Next, handoff to cost_estimation_agent with the complete damage analysis from step 1
        3. Finally, handoff to repair_shop_agent with the customer location information

        When you regain control you will have DamageInfo JSON from Image Analyst, EstimatedCost JSON from Cost Estimator, Shops JSON from Repair Shop Finder, and raw_json from the customer info file.
        Merge those four into ONE InsuranceReport JSON object.

        This output structure is critical for processing the claim report.
        Reply with ONLY that JSON (no extra text) and do **not** hand off further."""
    ),
    input_guardrails=[insurance_query_guardrail],
    handoffs=[image_analysis_agent],                   
    model="gpt-4.1",        
    output_type=InsuranceReport,
)

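# Close the loop: the last specialist hands control back to the orchestrator.
# This is set after construction because claim_orchestrator did not exist
# when repair_shop_agent was defined.
repair_shop_agent.handoffs = [claim_orchestrator]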

print("Agents made.")
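
The handoffs above form a loop: the orchestrator starts the chain and the last specialist returns control to it. As an SDK-free illustration of that topology, the sketch below models the handoffs as a plain dictionary and walks it. `HANDOFFS` and `trace_handoffs` are hypothetical names and do not call the Agents SDK.

```python
# Agent name -> the agent it hands off to, mirroring the handoffs=[...] arguments above.
HANDOFFS = {
    "Insurance Claim Orchestrator": "Image Analyst",
    "Image Analyst": "Cost Estimator",
    "Cost Estimator": "Repair Shop Finder",
    "Repair Shop Finder": "Insurance Claim Orchestrator",
}


def trace_handoffs(start: str) -> list[str]:
    """Follow handoffs from `start` until the chain reaches an agent already visited."""
    path = [start]
    current = HANDOFFS[start]
    while current not in path:
        path.append(current)
        current = HANDOFFS[current]
    path.append(current)  # record the agent that closes the loop
    return path


print(" -> ".join(trace_handoffs("Insurance Claim Orchestrator")))
```

Walking the map like this is a quick way to confirm that every specialist is reachable and that control ends up back at the orchestrator.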

## Step 6: Build and Run the Insurance Claim Workflow

Now, let's create the main workflow function that will orchestrate all of our agents to process an insurance claim.

In [None]:
def parse_customer_info_file(json_path: str) -> tuple[Customer, dict]:
    """Return a Customer instance *and* the raw dict from the JSON file."""
    with open(json_path, "r", encoding="utf-8") as f:
        raw = json.load(f)

    # Map customer_* keys to field names using dictionary comprehension
    customer_kwargs = {
        field: raw[f"customer_{field}"]
        for field in [
            "first_name", "last_name", "street_address", "city", "state",
            "zip_code", "phone_number", "email", "car_year", "car_make",
            "car_model", "vin", "license_plate", "car_mileage", "car_color",
            "policy_number"
        ]
    }
    return Customer(**customer_kwargs), raw

async def process_insurance_claim() -> InsuranceReport:
    print("Loading customer information …")
    customer, raw_json = parse_customer_info_file(CLAIM_INFO_FILE_PATH)

    image_urls = []
    image_ids = list(uploaded_ids["uploaded_image_ids"].values())
    for image_id in image_ids:
        # Get the temporary download URL
        file_info = box_client.files.get_file_by_id(image_id, fields=["download_url"])
        print(f"Download URL: {file_info.download_url}")
        image_urls.append(file_info.download_url)
    

    customer_address = (
        f"{customer.street_address}, {customer.city}, "
        f"{customer.state} {customer.zip_code}"
    )
    
    query = f"""
    Process an insurance claim for a {customer.car_year} {customer.car_make} {customer.car_model} with damage.
    Customer is located at: {customer_address}
    Analyze these damage photos: {', '.join(image_urls)}
    Other information to use for the final object: {raw_json}
    """

    print("Running claim orchestrator …")
    with trace("Insurance claim workflow"):
        result = await Runner.run(claim_orchestrator, query)

    # Strictly require a valid InsuranceReport.  If not, raise immediately.
    try:
        report: InsuranceReport = result.final_output_as(InsuranceReport)
        assert isinstance(report, InsuranceReport), type(report)
    except Exception as exc:
        print("❌ Orchestrator failed to return a valid InsuranceReport:\n", exc)
        raise

    if not report.claim.report_date or not report.claim.report_date.strip():
        # e.g. "2025‑04‑21"
        report.claim.report_date = datetime.date.today().isoformat()

    print("✅ Workflow finished – InsuranceReport ready")
    return report


# -------------------  Run it  -------------------
print("🚀 Starting complete insurance‑claim workflow …")
final_report: InsuranceReport = await process_insurance_claim()
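
`parse_customer_info_file` raises a bare `KeyError` on the first missing `customer_*` key. A small pre-check, sketched below, could report every missing key at once. `missing_customer_keys` is a hypothetical helper and the sample dictionary is made-up data.

```python
CUSTOMER_FIELDS = [
    "first_name", "last_name", "street_address", "city", "state",
    "zip_code", "phone_number", "email", "car_year", "car_make",
    "car_model", "vin", "license_plate", "car_mileage", "car_color",
    "policy_number",
]


def missing_customer_keys(raw: dict) -> list[str]:
    """Return the customer_* keys that the claim file does not contain."""
    expected = [f"customer_{field}" for field in CUSTOMER_FIELDS]
    return [key for key in expected if key not in raw]


# Made-up sample: only two of the expected keys are present.
sample = {"customer_first_name": "Ada", "customer_last_name": "Lovelace"}
missing = missing_customer_keys(sample)
if missing:
    print("Claim file is missing:", ", ".join(missing))
```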

## Step 7: Generate the Report
Generate the pre-inspection report and save it to Box.

In [None]:
docgen_user_input: dict = final_report.model_dump(mode="json")

# damaged_parts_list is Optional, so it may be None; fall back to an empty list
parts_list = docgen_user_input["claim"].get("damaged_parts_list") or []
docgen_user_input["claim"]["damaged_parts_list"] = ", ".join(parts_list)

print("📄  Submitting Box Doc Gen job …")

batch_job = box_client.docgen.create_docgen_batch_v2025_r0(
    file=FileReferenceV2025R0(id=uploaded_ids["template_file_id"]),
    input_source="api",
    destination_folder=CreateDocgenBatchV2025R0DestinationFolder(
        id=uploaded_ids["claim_folder_id"]
    ),
    output_type="pdf",
    document_generation_data=[
        DocGenDocumentGenerationDataV2025R0(
            generated_file_name=f"Pre-Inspection-Report-{final_report.claim_number}",
            user_input=docgen_user_input,
        )
    ],
)

print(f"Doc Gen batch created (batch_id = {batch_job.id}).")
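
The `user_input` payload is a nested dictionary, so it helps to know which values the template can reference. Assuming the template addresses nested values with dotted paths such as `{{claim.estimated_cost}}` (check this against your own template), the sketch below lists every path in a payload. `tag_paths` is a hypothetical helper and the sample values are made up.

```python
def tag_paths(data: dict, prefix: str = "") -> list[str]:
    """Return the dotted path of every leaf value in a nested dictionary."""
    paths = []
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths.extend(tag_paths(value, path))  # recurse into nested objects
        else:
            paths.append(path)
    return paths


# Made-up payload with the same overall shape as docgen_user_input.
sample_input = {
    "claim_number": "G8947892834455",
    "claim": {"estimated_cost": "$X - $Y", "damaged_parts_list": "front bumper, hood"},
    "shops": {"shop_one": {"shop_name": "Example Auto Body"}},
}
for path in tag_paths(sample_input):
    print("{{" + path + "}}")
```

Comparing this list with the tags in the template is a quick way to spot fields that would render empty.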

## Step 8: Clean Up Dummy Content (Optional)
Remove the demo folder from Box. This allows you to rerun the demo from Step 3 onward.

In [None]:
def remove_demo_folder(client, folder_id):
    # Delete the Box folder and all sub-contents
    client.folders.delete_folder_by_id(folder_id, recursive=True)
    print(f"Deleted folder (ID: {folder_id}) and all sub-contents.")

# Delete folder by ID, including all subfolders/files
remove_demo_folder(box_client, uploaded_ids["base_folder_id"])

## Conclusion

In this demo, we've built a comprehensive insurance claim adjuster report system that combines the power of:

1. **Box's storage and document management** - for securely storing and accessing car damage photos, and Box Doc Gen for generating the report
2. **OpenAI's Agents SDK** - for creating specialized agents that can perform different tasks in the workflow
3. **OpenAI's Responses API** - for analyzing the damage photos and returning structured output

This workflow demonstrates how agentic systems can streamline and enhance complex business processes by:
- Automating the analysis of visual data
- Producing repair cost estimates
- Finding relevant service providers
- Generating comprehensive reports

The same approach can be applied to many other business workflows that involve document processing, analysis, and decision-making.