# Fuel Bid Extraction Testing

This notebook tests fuel bid extraction against the Fleet AI backend services.
It extracts structured data from a fuel bid document using a LlamaIndex extraction agent.


## Setup and Imports


In [59]:
#!/usr/bin/env python3
import sys
import os
import json
from pathlib import Path
from io import BytesIO
from datetime import datetime
import traceback

# Resolve the backend root so that `app` can be imported as a package
current_dir = Path.cwd()
backend_root = current_dir.parent.parent  # Go up from notebooks/model_testing to backend
app_path = backend_root / "app"

# Add the backend root to the Python path once, even if this cell is re-run
if str(backend_root) not in sys.path:
    sys.path.insert(0, str(backend_root))

print(f"🐍 Updated Python path: {sys.path[:3]}...")


🐍 Updated Python path: ['c:\\Users\\danse\\Projects\\FleetAI V1\\fleet-ai\\apps\\backend', 'c:\\Users\\danse\\Projects\\FleetAI V1\\fleet-ai\\apps\\backend', 'c:\\Users\\danse\\Projects\\FleetAI V1\\fleet-ai\\apps\\backend']...
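
The setup cell assumes the notebook runs from a directory two levels below the backend root. As a hedged alternative, the root could be located by walking upward until a directory containing `app` is found. The helper name `find_backend_root` and the `marker` parameter are hypothetical, and the demo builds a throwaway directory tree rather than touching the real project.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_backend_root(start: Path, marker: str = "app") -> Optional[Path]:
    """Walk upward from `start`; return the first directory that contains `marker`."""
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return None


# Self-contained demo on a temporary tree that mimics the assumed layout:
# backend/app and backend/notebooks/model_testing
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve() / "backend"
    (root / "app").mkdir(parents=True)
    notebook_dir = root / "notebooks" / "model_testing"
    notebook_dir.mkdir(parents=True)

    found = find_backend_root(notebook_dir)
    found_matches = found == root
    print(f"Found the backend root: {found_matches}")
```

In the notebook itself this would be called as `find_backend_root(Path.cwd())`, which keeps working if the notebook is moved to a different depth.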


### Verify Imports

In [60]:
# Test import to make sure paths work
try:
    from fastapi import UploadFile
    from app.features.fuel.extractor import extract_fuel_bid
    print("✅ Successfully imported required modules")
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("Check that the Python path is set correctly")


✅ Successfully imported required modules


## File Setup


In [61]:
# Find the test PDF file
test_file_path = Path("../test_files/BP Athens Bid Submission.pdf")
if test_file_path.exists():
    file_size = test_file_path.stat().st_size
    print(f"✅ File exists ({file_size/1024:.1f} KB)")
else:
    print("❌ Test file not found!")


✅ File exists (55.9 KB)


## Fuel Bid Extraction


In [62]:
# Simple approach - just like the API does it
if test_file_path.exists():
    try:
        # Import what we need
        from fastapi import UploadFile
        from app.features.fuel.extractor import extract_fuel_bid
                
        # Read the PDF file
        with open(test_file_path, 'rb') as f:
            file_content = f.read()
        
        # Create UploadFile object (like FastAPI does)
        upload_file = UploadFile(
            filename=test_file_path.name,  # reuse the name from the path above
            file=BytesIO(file_content),
            size=len(file_content),
            headers={"content-type": "application/pdf"}
        )
                
        # Call the extractor (same as API route)
        result = extract_fuel_bid(upload_file)
                
    except Exception as e:
        print(f"❌ Error: {str(e)}")
        traceback.print_exc()
else:
    print("❌ Cannot run test - PDF file not found")


2025-08-25 13:50:16 | app.features.fuel.extractor | INFO | extractor.py:14 | 🔍 Starting fuel bid extraction...
2025-08-25 13:50:16 | app.utils.io.file_helpers | INFO | file_helpers.py:14 | 📁 Saved temp file: C:\Users\danse\AppData\Local\Temp\tmp1_l2uywk_BP Athens Bid Submission.pdf
2025-08-25 13:50:16 | app.services.document_extraction_service | INFO | document_extraction_service.py:40 | 📄 Fuel Bid saved to temporary location: C:\Users\danse\AppData\Local\Temp\tmp1_l2uywk_BP Athens Bid Submission.pdf
2025-08-25 13:50:16 | app.core.agents.extractor.llama_extractor_initializer | INFO | llama_extractor_initializer.py:56 | LlamaExtract config: target=ExtractTarget.PER_DOC mode=ExtractMode.BALANCED chunk_mode=ChunkMode.PAGE reasoning=False cite=False invalidate_cache=False
2025-08-25 13:50:16 | app.core.agents.extractor.llama_extractor_initializer | INFO | llama_extractor_initializer.py:61 | Initializing LlamaExtract client with project_id: f9b2e64d-491f-4d7e-bcb3-db1e2a73ef28
2025-08-25 13

KeyboardInterrupt: 
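
The run above was interrupted by hand and the log is cut off, so the cause of the delay is not recorded. One hedged way to avoid waiting indefinitely is to run the call in a worker thread and stop waiting after a deadline. The helper `run_with_timeout` and the stand-in `slow_call` are hypothetical, and a timed-out thread is not killed; the caller only stops waiting for it.

```python
import concurrent.futures
import time


def run_with_timeout(func, *args, timeout_s: float = 120.0, **kwargs):
    """Run `func` in a worker thread and give up waiting after `timeout_s` seconds.

    Raises concurrent.futures.TimeoutError if the deadline passes.
    The worker thread itself keeps running until `func` returns.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block here on a call that is still running
        executor.shutdown(wait=False)


def slow_call(seconds: float) -> str:
    """Stand-in for a long-running extraction call."""
    time.sleep(seconds)
    return "done"


fast_result = run_with_timeout(slow_call, 0.01, timeout_s=2.0)

try:
    run_with_timeout(slow_call, 0.5, timeout_s=0.05)
    timed_out = False
except concurrent.futures.TimeoutError:
    timed_out = True

print(f"fast call returned: {fast_result}, slow call timed out: {timed_out}")
```

Applied to this notebook, the call would look like `run_with_timeout(extract_fuel_bid, upload_file, timeout_s=120)`, assuming `extract_fuel_bid` is a regular synchronous function as the cell above uses it.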

## Helper Functions


In [43]:
def create_upload_file(file_path: str) -> UploadFile:
    """
    Create an UploadFile object from a local file path.
    
    Args:
        file_path (str): Path to the file to upload
        
    Returns:
        UploadFile: FastAPI UploadFile object
    """
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")
    
    # Read file content
    with open(file_path, 'rb') as f:
        file_content = f.read()
    
    # Determine content type based on extension
    file_extension = Path(file_path).suffix.lower()
    content_type_mapping = {
        '.pdf': 'application/pdf',
        '.docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        '.doc': 'application/msword',
        '.txt': 'text/plain'
    }
    content_type = content_type_mapping.get(file_extension, 'application/octet-stream')
    
    # Create UploadFile object
    upload_file = UploadFile(
        filename=os.path.basename(file_path),
        file=BytesIO(file_content),
        size=len(file_content),
        headers={"content-type": content_type}
    )
    
    print(f"📄 Created UploadFile: {upload_file.filename} ({content_type}, {len(file_content)} bytes)")
    return upload_file


def save_extraction_result(result, output_file: str = "fuel_bid_extraction_result.json"):
    """
    Save extraction result to a JSON file with proper serialization.
    
    Args:
        result: The extraction result from the fuel bid extractor
        output_file (str): Output filename
    """
    try:
        # Extract data from result
        if hasattr(result, 'data'):
            result_data = result.data
        else:
            result_data = result
        
        # Convert to dict for JSON serialization
        if isinstance(result_data, dict):
            # Already a dict
            json_data = result_data
        elif hasattr(result_data, 'model_dump'):
            # Pydantic v2 model (checked first: v2 still exposes a deprecated .dict())
            json_data = result_data.model_dump()
        elif hasattr(result_data, 'dict'):
            # Pydantic v1 model
            json_data = result_data.dict()
        elif hasattr(result_data, '__dict__'):
            # Regular object
            json_data = result_data.__dict__
        else:
            # Primitive or other value
            json_data = result_data
        
        # Add metadata
        output_data = {
            "extraction_timestamp": datetime.now().isoformat(),
            "result_type": str(type(result_data)),
            "data": json_data
        }
        
        # Save to file
        output_path = Path.cwd() / output_file
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(output_data, f, indent=2, default=str, ensure_ascii=False)
        
        print(f"💾 Extraction result saved to: {output_path}")
        return output_path
        
    except Exception as e:
        print(f"⚠️ Could not save result to JSON: {e}")
        # Fallback: save as string representation
        try:
            fallback_path = Path.cwd() / f"{output_file}.txt"
            with open(fallback_path, 'w', encoding='utf-8') as f:
                f.write(f"Extraction Result ({datetime.now().isoformat()})\n")
                f.write("=" * 50 + "\n")
                f.write(str(result))
            print(f"💾 Result saved as text to: {fallback_path}")
            return fallback_path
        except Exception as e2:
            print(f"❌ Failed to save result: {e2}")
            return None

print("✅ Helper functions defined successfully")


✅ Helper functions defined successfully
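
`save_extraction_result` and the display function in the next cell both unwrap `.data` and then convert the result to a dictionary. A minimal sketch of a shared helper for that step follows. The name `to_plain_dict` and the stand-in classes are hypothetical; the stand-ins only imitate the method names of a Pydantic v2 model so the example runs without Pydantic installed.

```python
from typing import Any


def to_plain_dict(obj: Any) -> Any:
    """Best-effort conversion of an extraction result to plain data."""
    data = getattr(obj, "data", obj)  # unwrap a `.data` attribute if present
    if isinstance(data, dict):
        return data
    if hasattr(data, "model_dump"):  # Pydantic v2 style
        return data.model_dump()
    if hasattr(data, "dict"):  # Pydantic v1 style
        return data.dict()
    if hasattr(data, "__dict__"):  # regular object
        return dict(vars(data))
    return data  # primitive or unknown type, returned unchanged


class FakeV2Model:
    """Stand-in that imitates a Pydantic v2 model's method names."""

    def model_dump(self):
        return {"fuel_type": "Jet A-1"}

    def dict(self):
        raise AssertionError("the deprecated path should not be used")


class FakeResult:
    """Stand-in for a result wrapper that exposes `.data`."""

    def __init__(self, data):
        self.data = data


from_wrapper = to_plain_dict(FakeResult(FakeV2Model()))
from_dict = to_plain_dict({"currency": "USD"})
print(from_wrapper, from_dict)
```

Checking `model_dump` before `dict` matters because Pydantic v2 models still provide `.dict()` as a deprecated method.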


In [44]:
def display_extraction_summary(result):
    """
    Display a formatted summary of the extraction results.
    
    Args:
        result: The extraction result from the fuel bid extractor
    """
    print("\n" + "=" * 60)
    print("📊 FUEL BID EXTRACTION SUMMARY")
    print("=" * 60)
    
    try:
        # Extract the data
        if hasattr(result, 'data'):
            data = result.data
        else:
            data = result
        
        print(f"📋 Result type: {type(data)}")
        
        # Convert to dict for easier access
        if isinstance(data, dict):
            data_dict = data
        elif hasattr(data, 'model_dump'):
            # Pydantic v2 model (checked first: v2 still exposes a deprecated .dict())
            data_dict = data.model_dump()
        elif hasattr(data, 'dict'):
            # Pydantic v1 model
            data_dict = data.dict()
        elif hasattr(data, '__dict__'):
            data_dict = data.__dict__
        else:
            print(f"⚠️ Unable to extract data dictionary from: {type(data)}")
            return
        
        # Display key information
        print(f"\n🏢 VENDOR INFORMATION:")
        vendor_info = data_dict.get('vendor', {})
        if isinstance(vendor_info, dict):
            print(f"   Company: {vendor_info.get('name', 'N/A')}")
            print(f"   Contact: {vendor_info.get('contact_name', 'N/A')}")
            print(f"   Email: {vendor_info.get('contact_email', 'N/A')}")
            print(f"   Phone: {vendor_info.get('contact_phone', 'N/A')}")
        else:
            print(f"   {vendor_info}")
        
        print(f"\n📝 BID DETAILS:")
        print(f"   Title: {data_dict.get('title', 'N/A')}")
        print(f"   Fuel Type: {data_dict.get('fuel_type', 'N/A')}")
        print(f"   Price Type: {data_dict.get('price_type', 'N/A')}")
        print(f"   Currency: {data_dict.get('currency', 'N/A')}")
        print(f"   Unit of Measure: {data_dict.get('uom', 'N/A')}")
        
        print(f"\n💰 PRICING:")
        base_price = data_dict.get('base_unit_price')
        if base_price is not None:
            currency = data_dict.get('currency', '')
            uom = data_dict.get('uom', '')
            print(f"   Base Unit Price: {base_price} {currency}/{uom}")
        
        # Index-based pricing
        index_name = data_dict.get('index_name')
        if index_name:
            print(f"   Index: {index_name}")
            print(f"   Index Location: {data_dict.get('index_location', 'N/A')}")
            differential = data_dict.get('differential')
            if differential is not None:
                print(f"   Differential: {differential} {data_dict.get('differential_unit', '')}")
        
        # Fees (compare against None so that a fee of 0 is still shown)
        into_plane_fee = data_dict.get('into_plane_fee')
        handling_fee = data_dict.get('handling_fee')
        if into_plane_fee is not None or handling_fee is not None:
            print("\n💳 FEES:")
            if into_plane_fee is not None:
                print(f"   Into-plane Fee: {into_plane_fee}")
            if handling_fee is not None:
                print(f"   Handling Fee: {handling_fee}")
        
        print(f"\n📅 VALIDITY:")
        print(f"   Start Date: {data_dict.get('validity_start_date', 'N/A')}")
        print(f"   End Date: {data_dict.get('validity_end_date', 'N/A')}")
        print(f"   Payment Terms: {data_dict.get('payment_terms', 'N/A')}")
        
        # AI Summary
        ai_summary = data_dict.get('ai_summary')
        if ai_summary:
            print(f"\n🤖 AI SUMMARY:")
            print(f"   {ai_summary}")
        
        # Show all extracted fields
        print(f"\n🔍 ALL EXTRACTED FIELDS:")
        for key, value in data_dict.items():
            if value is not None and value != '':
                if isinstance(value, dict):
                    print(f"   {key}: [Object with {len(value)} fields]")
                else:
                    # Truncate long values
                    value_str = str(value)
                    if len(value_str) > 100:
                        value_str = value_str[:97] + "..."
                    print(f"   {key}: {value_str}")
        
    except Exception as e:
        print(f"❌ Error displaying summary: {e}")
        traceback.print_exc()
    
    print("=" * 60)

print("✅ Display function defined successfully")


✅ Display function defined successfully
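
The summary prints the base price and the fees separately. As a hedged sketch, an all-in unit price could be derived from the same fields, assuming both fees are quoted per the same unit as the base price and are simply additive. That is an assumption about the bid, not something the extractor guarantees. The helper name `all_in_unit_price` is hypothetical, and the sample values mirror the BP Athens output shown later in this notebook.

```python
from decimal import Decimal
from typing import Optional


def all_in_unit_price(data: dict) -> Optional[Decimal]:
    """Sum base_unit_price and any per-unit fees; None if no base price exists.

    Assumes fees are per-unit amounts in the same currency and unit of measure.
    """
    base = data.get("base_unit_price")
    if base is None:
        return None
    # Go through str() so floats like 3.185 keep their printed decimal value
    total = Decimal(str(base))
    for fee_key in ("into_plane_fee", "handling_fee"):
        fee = data.get(fee_key)
        if fee is not None:
            total += Decimal(str(fee))
    return total


sample = {
    "base_unit_price": 3.185,
    "into_plane_fee": 0.035,
    "handling_fee": 0.005,
    "currency": "USD",
    "uom": "USG",
}
total = all_in_unit_price(sample)
print(f"All-in unit price: {total} {sample['currency']}/{sample['uom']}")
```

`Decimal` is used instead of float addition so that the sum does not pick up binary rounding noise when prices are compared across bids.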


## Display Results

In [58]:
display_extraction_summary(result)


📊 FUEL BID EXTRACTION SUMMARY
📋 Result type: <class 'dict'>

🏢 VENDOR INFORMATION:
   Company: BP Aviation
   Contact: None
   Email: aviation.sales@bp.com
   Phone: +44 20 0000 0000

📝 BID DETAILS:
   Title: Fuel Supply Proposal – Athens International Airport (ATH)
   Fuel Type: Jet A-1
   Price Type: fixed
   Currency: USD
   Unit of Measure: USG

💰 PRICING:
   Base Unit Price: 3.185 USD/USG

💳 FEES:
   Into-plane Fee: 0.035
   Handling Fee: 0.005

📅 VALIDITY:
   Start Date: 2025-08-15
   End Date: 2025-09-30
   Payment Terms: Net 30 days from invoice date

🤖 AI SUMMARY:
   BP Aviation offers a fixed price of USD 3.185 per USG for Jet A-1 fuel at Athens International Airport, valid from 15 August to 30 September 2025. Into-plane and handling fees are USD 0.035 and USD 0.005 per USG, respectively, with taxes and airport fees excluded. Payment terms are net 30 days from invoice date, and there is no take-or-pay for Year 1. The fuel meets ASTM D1655 and DEF STAN 91-91 standards, and a 