# Day 2, Session 2: Real-time Invoice Enhancement with APIs

## Beyond the LLM - Integrating Real-World Services

Our multimodal agents are powerful, but they become truly useful when connected to real-world data and services. Today we'll integrate external APIs to enhance invoice processing with live currency conversion, VAT validation, and web search capabilities.

### What We're Building

A production-ready invoice enhancement system that:
- Converts currencies using live exchange rates
- Validates VAT numbers through web search (mocked in this session; use Serper or a similar search API in production)
- Implements resilience patterns (circuit breakers, rate limiting)
- Tracks costs and performance in real-time
- Falls back gracefully when APIs fail

### The Reality Check

**Real APIs cost real money.** Every call has a price. We'll show you how to build systems that are both powerful and cost-effective.

In [None]:
# Server configuration - instructor provides actual values
OLLAMA_URL = "http://XX.XX.XX.XX"  # Course server IP
API_TOKEN = "YOUR_TOKEN_HERE"      # Instructor provides token
MODEL = "qwen3:8b"                  # Default model on server

import requests
import json
import time
import os
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, field
import threading
from collections import defaultdict, deque
import hashlib
import pickle

# API Configuration - using free tier APIs
EXCHANGE_RATE_API = "https://api.exchangerate-api.com/v4/latest/"
SERPER_API_KEY = "your_key_here"  # Instructor provides or use mock

# Health check
def check_server_health():
    """Verify server connection and model availability"""
    try:
        response = requests.get(f"{OLLAMA_URL}/health", timeout=5)  # Don't hang if the server is unreachable
        if response.status_code == 200:
            data = response.json()
            print(f"✅ Server Status: {data.get('status', 'Unknown')}")
            print(f"📊 Models Available: {data.get('models_count', 0)}")
            return True
    except Exception as e:
        print(f"❌ Server connection failed: {e}")
    return False

# LLM calling function
def call_llm(prompt, model=MODEL):
    """Call the LLM with a prompt"""
    headers = {
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": model,
        "prompt": prompt
    }
    
    try:
        response = requests.post(
            f"{OLLAMA_URL}/think",
            headers=headers,
            json=data,
            timeout=120  # LLM replies can be slow, but never wait forever
        )
        if response.status_code == 200:
            return response.json().get('response', '')
        else:
            return f"Error: {response.status_code}"
    except Exception as e:
        return f"Error: {e}"

print("🌐 API-Enhanced Invoice Processing System")
print("🔌 Connecting to course server...")
server_available = check_server_health()

if server_available:
    print("\n🧠 Testing LLM connection...")
    test_response = call_llm("Hello! Respond with: 'API integration system ready.'")
    print(f"Response: {test_response[:100]}...")
else:
    print("\n⚠️ Will use mock responses for demo")

In [None]:
# Download real invoice dataset
import requests
import zipfile
import io

dropbox_url = "https://www.dropbox.com/scl/fo/m9hyfmvi78snwv0nh34mo/AMEXxwXMLAOeve-_yj12ck8?rlkey=urinkikgiuven0fro7r4x5rcu&st=hv3of7g7&dl=1"

print("📦 Downloading invoice dataset...")
try:
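    response = requests.get(dropbox_url, timeout=60)
    response.raise_for_status()  # Report HTTP errors directly instead of failing inside ZipFile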
    with zipfile.ZipFile(io.BytesIO(response.content)) as z:
        z.extractall("invoice_images")
    print("✅ Downloaded invoice dataset")
    
    # List available images
    invoice_files = []
    for root, dirs, files in os.walk("invoice_images"):
        for file in files:
            if file.lower().endswith(('.png', '.jpg', '.jpeg')):
                full_path = os.path.join(root, file)
                invoice_files.append(full_path)
                print(f"  📄 {full_path}")
    
except Exception as e:
    print(f"❌ Error downloading: {e}")
    invoice_files = []

## Step 1: API Tool Classes with Cost Tracking

First, let's build our API tools with built-in cost tracking and performance monitoring.
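
Both tools in this step cache results for a fixed time-to-live (TTL). The sketch below isolates that idea so the expiry rule is easy to see. `TTLCache`, its injectable `clock`, and the 1.08 rate are hypothetical and used only for illustration; they are not part of the course code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny time-to-live cache (illustrative helper, not part of the course code)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # Injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            del self._store[key]  # Expired: drop it so the caller refetches
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock(), value)


# Usage with a fake clock so the example runs instantly
fake_now = [0.0]
rate_cache = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])
rate_cache.set("EUR_USD", 1.08)    # 1.08 is a made-up rate
fresh = rate_cache.get("EUR_USD")  # Read within the TTL
fake_now[0] = 301.0                # Jump past the 5 minute TTL
stale = rate_cache.get("EUR_USD")  # Read after expiry
print(fresh, stale)
```

Measuring age with a monotonic clock avoids surprises when the wall clock is adjusted; the tools in this step use `datetime.now()` instead, which is fine for a short demo.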

In [None]:
@dataclass
class APICall:
    """Track individual API call metrics"""
    service: str
    endpoint: str
    timestamp: datetime
    duration: float
    status_code: int
    cost: float
    from_cache: bool = False
    error: Optional[str] = None

@dataclass
class CostTracker:
    """Track API costs and usage"""
    total_cost: float = 0.0
    call_count: int = 0
    cache_hits: int = 0
    cache_misses: int = 0
    calls: List[APICall] = field(default_factory=list)
    
    def add_call(self, api_call: APICall):
        """Add an API call to tracking"""
        self.calls.append(api_call)
        self.total_cost += api_call.cost
        self.call_count += 1
        
        if api_call.from_cache:
            self.cache_hits += 1
        else:
            self.cache_misses += 1
    
    def get_stats(self) -> Dict[str, Any]:
        """Get comprehensive cost statistics"""
        if not self.calls:
            return {"status": "no_calls"}
        
        avg_duration = sum(call.duration for call in self.calls) / len(self.calls)
        success_rate = len([c for c in self.calls if c.status_code == 200]) / len(self.calls)
        cache_hit_rate = self.cache_hits / self.call_count if self.call_count > 0 else 0
        
        return {
            "total_cost": self.total_cost,
            "call_count": self.call_count,
            "cache_hit_rate": cache_hit_rate,
            "average_duration": avg_duration,
            "success_rate": success_rate,
            "cost_per_call": self.total_cost / self.call_count if self.call_count > 0 else 0
        }

# Global cost tracker
cost_tracker = CostTracker()

class CurrencyConverterTool:
    """Real-time currency conversion with caching"""
    
    def __init__(self, cache_ttl=300):  # 5 minute cache
        self.cache = {}
        self.cache_ttl = cache_ttl
        self.api_cost_per_call = 0.001  # Assumed price of $0.001 per call, used for cost tracking
    
    def _get_cache_key(self, from_currency: str, to_currency: str) -> str:
        """Generate cache key for currency pair"""
        return f"{from_currency}_{to_currency}"
    
    def _is_cache_valid(self, cache_entry: Dict) -> bool:
        """Check if cache entry is still valid"""
        # total_seconds() covers the whole interval; .seconds alone wraps around every 24 hours
        return (datetime.now() - cache_entry['timestamp']).total_seconds() < self.cache_ttl
    
    def convert(self, amount: float, from_currency: str, to_currency: str) -> Dict[str, Any]:
        """Convert currency with live rates"""
        start_time = time.time()
        cache_key = self._get_cache_key(from_currency, to_currency)
        
        # Check cache first
        if cache_key in self.cache and self._is_cache_valid(self.cache[cache_key]):
            rate = self.cache[cache_key]['rate']
            converted_amount = amount * rate
            
            # Track cache hit
            api_call = APICall(
                service="currency_converter",
                endpoint=f"convert_{from_currency}_to_{to_currency}",
                timestamp=datetime.now(),
                duration=time.time() - start_time,
                status_code=200,
                cost=0.0,  # No cost for cache hit
                from_cache=True
            )
            cost_tracker.add_call(api_call)
            
            print(f"💰 Cache hit: {amount} {from_currency} = {converted_amount:.2f} {to_currency}")
            return {
                "original_amount": amount,
                "from_currency": from_currency,
                "to_currency": to_currency,
                "converted_amount": converted_amount,
                "exchange_rate": rate,
                "from_cache": True,
                "cost": 0.0
            }
        
        # Make API call
        try:
            url = f"{EXCHANGE_RATE_API}{from_currency}"
            response = requests.get(url, timeout=5)
            
            if response.status_code == 200:
                data = response.json()
                rate = data['rates'].get(to_currency)
                
                if rate:
                    converted_amount = amount * rate
                    
                    # Cache the result
                    self.cache[cache_key] = {
                        'rate': rate,
                        'timestamp': datetime.now()
                    }
                    
                    # Track API call
                    api_call = APICall(
                        service="currency_converter",
                        endpoint=url,
                        timestamp=datetime.now(),
                        duration=time.time() - start_time,
                        status_code=200,
                        cost=self.api_cost_per_call,
                        from_cache=False
                    )
                    cost_tracker.add_call(api_call)
                    
                    print(f"💱 Live conversion: {amount} {from_currency} = {converted_amount:.2f} {to_currency} (${self.api_cost_per_call:.3f})")
                    return {
                        "original_amount": amount,
                        "from_currency": from_currency,
                        "to_currency": to_currency,
                        "converted_amount": converted_amount,
                        "exchange_rate": rate,
                        "from_cache": False,
                        "cost": self.api_cost_per_call
                    }
                else:
                    raise ValueError(f"Currency {to_currency} not found")
            else:
                raise requests.RequestException(f"API returned {response.status_code}")
                
        except Exception as e:
            # Track failed call
            api_call = APICall(
                service="currency_converter",
                endpoint=f"convert_{from_currency}_to_{to_currency}",
                timestamp=datetime.now(),
                duration=time.time() - start_time,
                status_code=500,
                cost=0.0,
                from_cache=False,
                error=str(e)
            )
            cost_tracker.add_call(api_call)
            
            print(f"❌ Currency conversion failed: {e}")
            # Re-raise so the circuit breaker can count the failure and the
            # resilient wrapper can switch to its fallback. Returning a silent
            # 1:1 rate here would hide the outage and misstate the amount.
            raise

class VATValidatorTool:
    """VAT number validation via web search"""
    
    def __init__(self):
        self.api_cost_per_call = 0.005  # Assumed price of $0.005 per search, used for cost tracking
        self.cache = {}
        self.cache_ttl = 3600  # 1 hour cache for VAT validation
    
    def validate_vat(self, vat_number: str, company_name: str = "") -> Dict[str, Any]:
        """Validate VAT number using web search"""
        start_time = time.time()
        
        # Check cache
        if vat_number in self.cache:
            cache_entry = self.cache[vat_number]
            if (datetime.now() - cache_entry['timestamp']).total_seconds() < self.cache_ttl:
                print(f"🔍 VAT cache hit: {vat_number}")
                return cache_entry['result']
        
        # Mock web search (in production, use Serper or similar)
        try:
            # Simulate API call delay
            time.sleep(0.5)
            
            # Mock validation logic
            valid_vats = {
                "GB123456789": {"company": "TechSupplies Co.", "country": "United Kingdom", "valid": True},
                "US987654321": {"company": "CloudServices Inc.", "country": "United States", "valid": True},
                "DE555666777": {"company": "German Tech GmbH", "country": "Germany", "valid": True}
            }
            
            result = valid_vats.get(vat_number, {"valid": False, "reason": "VAT number not found in registry"})
            
            # Cache result
            self.cache[vat_number] = {
                'result': result,
                'timestamp': datetime.now()
            }
            
            # Track API call
            api_call = APICall(
                service="vat_validator",
                endpoint=f"validate_{vat_number}",
                timestamp=datetime.now(),
                duration=time.time() - start_time,
                status_code=200,
                cost=self.api_cost_per_call,
                from_cache=False
            )
            cost_tracker.add_call(api_call)
            
            print(f"🔍 VAT validation: {vat_number} -> {result.get('valid', False)} (${self.api_cost_per_call:.3f})")
            return result
            
        except Exception as e:
            print(f"❌ VAT validation failed: {e}")
            return {"valid": False, "error": str(e), "cost": 0.0}

# Initialize tools
currency_tool = CurrencyConverterTool()
vat_tool = VATValidatorTool()

print("✅ API tools initialized with cost tracking")
print(f"💰 Currency API: ${currency_tool.api_cost_per_call:.3f} per call")
print(f"🔍 VAT API: ${vat_tool.api_cost_per_call:.3f} per call")

## Step 2: Resilience Patterns - Circuit Breaker & Rate Limiter

Production systems need to handle API failures gracefully. Let's implement circuit breakers and rate limiting.
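
The rate limiter in this step is a token bucket: it allows a burst up to the bucket size, then a steady rate set by the refill speed. The arithmetic can be sketched as follows; `seconds_until_available` and `max_calls_in_window` are hypothetical helper names, and the numbers match the shared limiter configured in this step (10 tokens, 0.5 tokens per second).

```python
def seconds_until_available(tokens_now: float, tokens_needed: float, refill_rate: float) -> float:
    """Time until a token bucket holds `tokens_needed` tokens (0.0 if it already does)."""
    if refill_rate <= 0:
        raise ValueError("refill_rate must be positive")
    deficit = tokens_needed - tokens_now
    return max(0.0, deficit / refill_rate)


def max_calls_in_window(max_tokens: int, refill_rate: float, window_seconds: float) -> int:
    """Upper bound on single-token calls in a window that starts with a full bucket."""
    # Initial burst plus everything refilled during the window
    return int(max_tokens + refill_rate * window_seconds)


wait = seconds_until_available(tokens_now=0.0, tokens_needed=1, refill_rate=0.5)
per_minute = max_calls_in_window(max_tokens=10, refill_rate=0.5, window_seconds=60)
print(f"Empty bucket: wait {wait:.1f}s for the next call")
print(f"At most {per_minute} calls in the first minute")
```

Running these numbers before picking `max_tokens` and `refill_rate` helps confirm the limiter stays under whatever quota the API provider enforces.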

In [None]:
from enum import Enum
import threading
from collections import deque

class CircuitState(Enum):
    CLOSED = "closed"      # Normal operation
    OPEN = "open"          # Failing, reject calls
    HALF_OPEN = "half_open" # Testing if service recovered

class CircuitBreaker:
    """Circuit breaker pattern for API resilience"""
    
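    def __init__(self, failure_threshold=3, timeout=30, recovery_timeout=60):
        self.failure_threshold = failure_threshold  # Consecutive failures before the circuit opens
        self.timeout = timeout  # Seconds to stay OPEN before allowing one HALF_OPEN test call
        self.recovery_timeout = recovery_timeout  # Stored for future use; this implementation does not read it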
        
        self.failure_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED
        self._lock = threading.Lock()
    
    def call(self, func, *args, **kwargs):
        """Execute function with circuit breaker protection"""
        with self._lock:
            # Check if we should attempt the call
            if self.state == CircuitState.OPEN:
                if self.last_failure_time and \
                   (time.time() - self.last_failure_time) > self.timeout:
                    self.state = CircuitState.HALF_OPEN
                    print(f"🔄 Circuit breaker HALF_OPEN - testing service")
                else:
                    print(f"🚫 Circuit breaker OPEN - rejecting call")
                    raise Exception("Circuit breaker is OPEN")
        
        # Attempt the call
        try:
            result = func(*args, **kwargs)
            
            # Success - reset circuit breaker
            with self._lock:
                if self.state == CircuitState.HALF_OPEN:
                    print(f"✅ Circuit breaker CLOSED - service recovered")
                self.failure_count = 0
                self.state = CircuitState.CLOSED
            
            return result
            
        except Exception as e:
            # Failure - update circuit breaker
            with self._lock:
                self.failure_count += 1
                self.last_failure_time = time.time()
                
                if self.failure_count >= self.failure_threshold:
                    self.state = CircuitState.OPEN
                    print(f"⚠️ Circuit breaker OPEN - {self.failure_count} failures")
            
            raise e
    
    def get_status(self) -> Dict[str, Any]:
        """Get circuit breaker status"""
        return {
            "state": self.state.value,
            "failure_count": self.failure_count,
            "last_failure": self.last_failure_time
        }

class RateLimiter:
    """Token bucket rate limiter"""
    
    def __init__(self, max_tokens=10, refill_rate=1.0):
        self.max_tokens = max_tokens
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = max_tokens
        self.last_refill = time.time()
        self._lock = threading.Lock()
    
    def acquire(self, tokens=1) -> bool:
        """Try to acquire tokens"""
        with self._lock:
            # Refill tokens based on time passed
            now = time.time()
            time_passed = now - self.last_refill
            tokens_to_add = time_passed * self.refill_rate
            
            self.tokens = min(self.max_tokens, self.tokens + tokens_to_add)
            self.last_refill = now
            
            # Check if we have enough tokens
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            else:
                print(f"🛑 Rate limit exceeded - {self.tokens:.1f} tokens available")
                return False
    
    def get_status(self) -> Dict[str, Any]:
        """Get rate limiter status"""
        return {
            "tokens_available": self.tokens,
            "max_tokens": self.max_tokens,
            "refill_rate": self.refill_rate
        }

# Create resilience components
currency_circuit_breaker = CircuitBreaker(failure_threshold=3, timeout=30)
vat_circuit_breaker = CircuitBreaker(failure_threshold=2, timeout=45)
api_rate_limiter = RateLimiter(max_tokens=10, refill_rate=0.5)  # 10 calls, refill 1 every 2 seconds

class ResilientAPIWrapper:
    """Wrapper that adds resilience to API calls"""
    
    def __init__(self, circuit_breaker, rate_limiter):
        self.circuit_breaker = circuit_breaker
        self.rate_limiter = rate_limiter
    
    def call_with_resilience(self, func, fallback_func=None, *args, **kwargs):
        """Call function with full resilience patterns"""
        # Check rate limit first
        if not self.rate_limiter.acquire():
            if fallback_func:
                print("⚡ Using fallback due to rate limit")
                return fallback_func(*args, **kwargs)
            else:
                raise Exception("Rate limit exceeded and no fallback available")
        
        # Try with circuit breaker
        try:
            return self.circuit_breaker.call(func, *args, **kwargs)
        except Exception as e:
            if fallback_func:
                print(f"⚡ Using fallback due to: {e}")
                return fallback_func(*args, **kwargs)
            else:
                raise e

# Create resilient wrappers
resilient_currency = ResilientAPIWrapper(currency_circuit_breaker, api_rate_limiter)
resilient_vat = ResilientAPIWrapper(vat_circuit_breaker, api_rate_limiter)

print("🛡️ Resilience patterns implemented:")
print("   • Circuit breakers for failure protection")
print("   • Rate limiting for API quota management")
print("   • Fallback strategies for graceful degradation")

## Step 3: Live API Integration Demo

Let's see our enhanced invoice processor in action with live currency API calls and the mocked VAT lookup.
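
The demo in this step runs currency conversions in parallel threads that all report to one shared cost tracker. Unsynchronized `+=` updates from several threads can lose increments, so a shared tracker usually wants a lock. A minimal sketch, assuming a hypothetical `ThreadSafeCostCounter` (not the `CostTracker` class from Step 1):

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadSafeCostCounter:
    """Running cost total that is safe to update from several threads (illustrative)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.total_cost = 0.0
        self.call_count = 0

    def add(self, cost: float) -> None:
        # The lock makes the read-modify-write of both fields one atomic step
        with self._lock:
            self.total_cost += cost
            self.call_count += 1


counter = ThreadSafeCostCounter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # 1000 simulated calls at the assumed $0.001 price
    list(pool.map(counter.add, [0.001] * 1000))

print(f"{counter.call_count} calls, ${counter.total_cost:.3f}")
```

The same pattern could be applied inside `CostTracker.add_call` by adding a lock field and wrapping the updates in `with self._lock:`.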

In [None]:
import concurrent.futures
from threading import Thread

class InvoiceEnhancer:
    """Enhanced invoice processor with API integration"""
    
    def __init__(self):
        self.currency_tool = currency_tool
        self.vat_tool = vat_tool
        self.resilient_currency = resilient_currency
        self.resilient_vat = resilient_vat
    
    def llm_fallback_currency(self, amount: float, from_currency: str, to_currency: str) -> Dict[str, Any]:
        """LLM fallback for currency conversion"""
        if server_available:
            prompt = f"""Estimate the conversion rate from {from_currency} to {to_currency}.
            Amount: {amount} {from_currency}
            Provide your best estimate based on typical exchange rates.
            Respond with just the converted amount as a number."""
            
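            response = call_llm(prompt)
            # call_llm signals failure with an "Error: ..." string; only parse real replies.
            # Take the first number in the reply, since the model may wrap it in words.
            import re
            match = None
            if not response.startswith("Error:"):
                match = re.search(r"\d+(?:\.\d+)?", response.replace(",", ""))
            if match:
                return {
                    "converted_amount": float(match.group()),
                    "method": "llm_estimate",
                    "confidence": 0.6
                }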
        
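        # Hard fallback - approximate USD value of one unit of each currency
        rates = {"EUR": 1.1, "GBP": 1.3, "JPY": 0.007, "USD": 1.0}
        from_rate = rates.get(from_currency, 1.0)
        to_rate = rates.get(to_currency, 1.0)
        # Convert into USD first (multiply by from_rate), then into the target (divide by to_rate)
        estimated_amount = amount * (from_rate / to_rate)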
        
        return {
            "converted_amount": estimated_amount,
            "method": "hardcoded_fallback",
            "confidence": 0.3
        }
    
    def process_invoice_with_apis(self, invoice_data: Dict[str, Any]) -> Dict[str, Any]:
        """Process invoice with API enhancements"""
        start_time = time.time()
        
        print(f"🧾 Processing invoice: {invoice_data.get('invoice_id', 'Unknown')}")
        print(f"   Amount: {invoice_data.get('amount', 0)} {invoice_data.get('currency', 'USD')}")
        print(f"   VAT: {invoice_data.get('vat_number', 'Not provided')}")
        
        enhanced_data = invoice_data.copy()
        
        # Parallel API calls for multiple currency conversions
        target_currencies = ['USD', 'EUR', 'GBP', 'JPY']
        original_currency = invoice_data.get('currency', 'USD')
        amount = invoice_data.get('amount', 0)
        
        # Remove original currency from targets
        if original_currency in target_currencies:
            target_currencies.remove(original_currency)
        
        conversions = {}
        
        # Parallel currency conversions
        with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
            future_to_currency = {}
            
            for target_currency in target_currencies:
                future = executor.submit(
                    self.resilient_currency.call_with_resilience,
                    self.currency_tool.convert,
                    self.llm_fallback_currency,
                    amount,
                    original_currency,
                    target_currency
                )
                future_to_currency[future] = target_currency
            
            # Collect results
            for future in concurrent.futures.as_completed(future_to_currency):
                currency = future_to_currency[future]
                try:
                    result = future.result()
                    conversions[currency] = result
                except Exception as e:
                    conversions[currency] = {"error": str(e)}
        
        enhanced_data['currency_conversions'] = conversions
        
        # VAT validation
        vat_number = invoice_data.get('vat_number')
        if vat_number:
            try:
                vat_result = self.resilient_vat.call_with_resilience(
                    self.vat_tool.validate_vat,
                    None,  # No fallback for VAT
                    vat_number,
                    invoice_data.get('vendor', '')
                )
                enhanced_data['vat_validation'] = vat_result
            except Exception as e:
                enhanced_data['vat_validation'] = {"error": str(e)}
        
        # Processing summary
        processing_time = time.time() - start_time
        enhanced_data['processing_metadata'] = {
            'processing_time': processing_time,
            'apis_called': len([c for c in conversions.values() if 'error' not in c]) + (1 if vat_number else 0),
            'enhancement_timestamp': datetime.now().isoformat()
        }
        
        print(f"✅ Enhancement complete in {processing_time:.2f}s")
        return enhanced_data

# Create invoice enhancer
enhancer = InvoiceEnhancer()

# Sample invoice data for testing
sample_invoice = {
    "invoice_id": "INV-2024-001",
    "vendor": "TechSupplies Co.",
    "amount": 15000.00,
    "currency": "EUR",
    "date": "2024-01-15",
    "vat_number": "GB123456789"
}

print("🚀 LIVE API INTEGRATION DEMO")
print("=" * 50)

# Process the invoice
enhanced_invoice = enhancer.process_invoice_with_apis(sample_invoice)

print("\n📊 ENHANCEMENT RESULTS:")
print(f"   Original: {sample_invoice['amount']} {sample_invoice['currency']}")
print("   Conversions:")
for currency, result in enhanced_invoice.get('currency_conversions', {}).items():
    if 'error' not in result:
        amount = result.get('converted_amount', 0)
        method = result.get('method', 'api')
        print(f"     {currency}: {amount:.2f} ({method})")
    else:
        print(f"     {currency}: Failed - {result['error']}")

vat_result = enhanced_invoice.get('vat_validation', {})
if 'error' not in vat_result:
    print(f"   VAT Status: {'✅ Valid' if vat_result.get('valid') else '❌ Invalid'}")
    if vat_result.get('company'):
        print(f"   Company: {vat_result['company']}")
else:
    print(f"   VAT Status: ❌ Validation failed")

## Step 4: Resilience Testing - Simulate API Failures

Let's test our resilience patterns by simulating API failures.
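
Random failures make each run of a resilience test look different, which is awkward when you want to compare runs. One option is to give the simulated API its own seeded random generator. The sketch below assumes a hypothetical `SeededFailingAPI` helper; the seed value and the 1.1 rate are arbitrary.

```python
import random
from typing import List


class SeededFailingAPI:
    """Fails with a fixed probability, using its own seeded RNG (hypothetical test helper)."""

    def __init__(self, failure_rate: float, seed: int):
        self.failure_rate = failure_rate
        self.call_count = 0
        self._rng = random.Random(seed)  # Private generator: global random state is untouched

    def convert(self, amount: float) -> float:
        self.call_count += 1
        if self._rng.random() < self.failure_rate:
            raise ConnectionError(f"Simulated API failure (call #{self.call_count})")
        return amount * 1.1  # Made-up rate, for illustration only


def outcomes(api: SeededFailingAPI, attempts: int) -> List[str]:
    """Call the API repeatedly and record 'ok' or 'fail' for each attempt."""
    results = []
    for _ in range(attempts):
        try:
            api.convert(100.0)
            results.append("ok")
        except ConnectionError:
            results.append("fail")
    return results


first_run = outcomes(SeededFailingAPI(failure_rate=0.8, seed=42), attempts=8)
second_run = outcomes(SeededFailingAPI(failure_rate=0.8, seed=42), attempts=8)
print(first_run)
print("Reproducible:", first_run == second_run)
```

With a fixed seed, the same sequence of successes and failures repeats on every run, so a change in circuit breaker behavior points to a code change rather than to luck.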

In [None]:
class FailingAPI:
    """Simulate an unreliable API for testing"""
    
    def __init__(self, failure_rate=0.7):
        self.failure_rate = failure_rate
        self.call_count = 0
    
    def unreliable_convert(self, amount, from_curr, to_curr):
        """Simulate unreliable currency conversion"""
        self.call_count += 1
        
        # Simulate network delay
        time.sleep(0.1)
        
        # Fail based on failure rate
        import random
        if random.random() < self.failure_rate:
            raise requests.RequestException(f"Simulated API failure (call #{self.call_count})")
        
        # Success case
        return {"converted_amount": amount * 1.1, "success": True}

print("🧪 RESILIENCE TESTING - Simulating API Failures")
print("=" * 60)

# Create failing API
failing_api = FailingAPI(failure_rate=0.8)  # 80% failure rate

# Test circuit breaker behavior
test_circuit_breaker = CircuitBreaker(failure_threshold=3, timeout=5)
test_rate_limiter = RateLimiter(max_tokens=5, refill_rate=0.2)
test_wrapper = ResilientAPIWrapper(test_circuit_breaker, test_rate_limiter)

def simple_fallback(amount, from_curr, to_curr):
    """Simple fallback function"""
    return {"converted_amount": amount, "method": "fallback", "success": True}

print("\n📊 Testing Circuit Breaker Pattern:")
print("-" * 40)

for i in range(8):
    print(f"\nAttempt {i+1}:")
    try:
        result = test_wrapper.call_with_resilience(
            failing_api.unreliable_convert,
            simple_fallback,
            100.0, 'USD', 'EUR'
        )
        print(f"   ✅ Success: {result}")
    except Exception as e:
        print(f"   ❌ Failed: {e}")
    
    # Show circuit breaker status
    status = test_circuit_breaker.get_status()
    print(f"   Circuit: {status['state']} (failures: {status['failure_count']})")
    
    # Brief pause
    time.sleep(0.5)

print("\n🔄 Testing Recovery (waiting for circuit to half-open)...")
time.sleep(6)  # Wait for timeout

print("\nRecovery attempt:")
try:
    result = test_wrapper.call_with_resilience(
        lambda a, f, t: {"converted_amount": a * 1.1, "success": True},  # Working API
        simple_fallback,
        100.0, 'USD', 'EUR'
    )
    print(f"   ✅ Recovery successful: {result}")
except Exception as e:
    print(f"   ❌ Recovery failed: {e}")

final_status = test_circuit_breaker.get_status()
print(f"   Final circuit state: {final_status['state']}")

## Step 5: Cost and Performance Dashboard

Let's analyze the cost and performance impact of our API integrations.
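
The dashboard in this step reports average response time. Averages can hide a few slow calls, so it can help to look at a high percentile as well. A minimal sketch using the nearest-rank method; `percentile` is a hypothetical helper and the durations are made-up sample data.

```python
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # Integer form of ceil(pct / 100 * n), giving a 1-based rank
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Made-up durations in seconds: 18 fast calls and 2 slow ones
durations = [0.1] * 18 + [2.5, 3.0]
mean = sum(durations) / len(durations)
p50 = percentile(durations, 50)
p95 = percentile(durations, 95)
print(f"mean={mean:.3f}s  p50={p50:.1f}s  p95={p95:.1f}s")
```

Applied to the tracker, the input would be `[call.duration for call in cost_tracker.calls]`.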

In [None]:
import json

def get_server_metrics():
    """Get server performance metrics"""
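    try:
        response = requests.get(f"{OLLAMA_URL}/metrics", timeout=5)
        if response.status_code == 200:
            return response.json()
    except (requests.RequestException, ValueError):
        # Network failure or non-JSON body: report metrics as unavailable
        pass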
    return {"status": "unavailable"}

def generate_cost_dashboard():
    """Generate comprehensive cost and performance dashboard"""
    stats = cost_tracker.get_stats()
    
    print("💰 COST & PERFORMANCE DASHBOARD")
    print("=" * 50)
    
    if stats.get('status') == 'no_calls':
        print("📊 No API calls made yet")
        return
    
    print(f"\n💵 COST ANALYSIS:")
    print(f"   Total Cost: ${stats['total_cost']:.4f}")
    print(f"   Total Calls: {stats['call_count']}")
    print(f"   Cost per Call: ${stats['cost_per_call']:.4f}")
    print(f"   Cache Hit Rate: {stats['cache_hit_rate']:.1%}")
    # Only the currency tool records cache hits, so price each hit at its per-call cost
    print(f"   Money Saved by Caching: ${(cost_tracker.cache_hits * currency_tool.api_cost_per_call):.4f}")
    
    print(f"\n⚡ PERFORMANCE METRICS:")
    print(f"   Average Response Time: {stats['average_duration']:.3f}s")
    print(f"   Success Rate: {stats['success_rate']:.1%}")
    
    # Break down by service
    service_breakdown = defaultdict(lambda: {'count': 0, 'cost': 0, 'duration': 0})
    for call in cost_tracker.calls:
        service_breakdown[call.service]['count'] += 1
        service_breakdown[call.service]['cost'] += call.cost
        service_breakdown[call.service]['duration'] += call.duration
    
    print(f"\n📊 SERVICE BREAKDOWN:")
    for service, data in service_breakdown.items():
        avg_duration = data['duration'] / data['count'] if data['count'] > 0 else 0
        print(f"   {service}:")
        print(f"     Calls: {data['count']}")
        print(f"     Cost: ${data['cost']:.4f}")
        print(f"     Avg Duration: {avg_duration:.3f}s")
    
    # Cost projections
    daily_calls = stats['call_count'] * 24  # Assumes the calls so far represent one hour of traffic
    monthly_calls = daily_calls * 30
    
    print(f"\n📈 COST PROJECTIONS:")
    print(f"   Daily (24h): ${stats['cost_per_call'] * daily_calls:.2f} ({daily_calls} calls)")
    print(f"   Monthly (30d): ${stats['cost_per_call'] * monthly_calls:.2f} ({monthly_calls} calls)")
    print(f"   Annual: ${stats['cost_per_call'] * monthly_calls * 12:.2f}")
    
    # Optimization recommendations
    print(f"\n🔧 OPTIMIZATION RECOMMENDATIONS:")
    if stats['cache_hit_rate'] < 0.5:
        print(f"   • Increase cache TTL - current hit rate only {stats['cache_hit_rate']:.1%}")
    if stats['average_duration'] > 2.0:
        print(f"   • Consider API timeout optimization - avg {stats['average_duration']:.2f}s")
    if stats['success_rate'] < 0.95:
        print(f"   • Improve error handling - success rate only {stats['success_rate']:.1%}")
    
    savings_potential = stats['call_count'] * 0.001 * (1 - stats['cache_hit_rate'])
    if savings_potential > 0.01:
        print(f"   • Potential monthly savings with better caching: ${savings_potential * 30:.2f}")

# Server metrics
def show_server_metrics():
    """Display current server metrics"""
    metrics = get_server_metrics()
    
    print("\n🖥️ SERVER METRICS:")
    print("-" * 20)
    
    if metrics.get('status') == 'unavailable':
        print("   Server metrics unavailable")
        return
    
    gpu_info = metrics.get('gpu', {})
    if gpu_info:
        memory_used = gpu_info.get('memory_used', 0)
        memory_total = gpu_info.get('memory_total', 1)
        utilization = (memory_used / memory_total) * 100 if memory_total > 0 else 0
        
        print(f"   GPU Memory: {memory_used}MB / {memory_total}MB ({utilization:.1f}%)")
        
        # Visual bar
        bar_length = int(utilization / 2.5)  # Scale to 40 chars max
        bar = "█" * bar_length + "░" * (40 - bar_length)
        print(f"   Usage: │{bar}│")
    
    cpu_info = metrics.get('cpu', {})
    if cpu_info:
        print(f"   CPU Usage: {cpu_info.get('usage', 'N/A')}%")

# Generate the dashboard
generate_cost_dashboard()
show_server_metrics()

# Resilience status
print(f"\n🛡️ RESILIENCE STATUS:")
print(f"   Currency Circuit Breaker: {currency_circuit_breaker.get_status()['state']}")
print(f"   VAT Circuit Breaker: {vat_circuit_breaker.get_status()['state']}")
print(f"   Rate Limiter: {api_rate_limiter.get_status()['tokens_available']:.1f}/{api_rate_limiter.max_tokens} tokens")

# Cost comparison
stats = cost_tracker.get_stats()  # stats is local to generate_cost_dashboard(), so fetch it again here
print(f"\n💡 COST COMPARISON:")
print(f"   API-only approach: ${stats.get('total_cost', 0):.4f}")
with_llm_cost = stats.get('call_count', 0) * 0.001  # Estimate LLM cost
print(f"   LLM-only approach: ~${with_llm_cost:.4f}")
hybrid_savings = max(0, with_llm_cost - stats.get('total_cost', 0))
if with_llm_cost > 0:
    print(f"   Hybrid savings: ${hybrid_savings:.4f} ({hybrid_savings/with_llm_cost*100:.1f}% saved)")
else:
    print("   Hybrid approach: Optimal")

## Key Learnings

### What We Accomplished:

1. **Real API Integration**
   - Connected to live currency conversion APIs
   - Implemented VAT validation behind a web-search interface (mocked in this session)
   - Built cost tracking for every API call
   - Demonstrated parallel API processing for efficiency

2. **Resilience Patterns**
   - Circuit breakers prevent cascade failures
   - Rate limiting protects against quota exhaustion
   - Caching cuts paid API calls in proportion to the cache hit rate (a 50-80% hit rate removes 50-80% of paid calls)
   - Fallback strategies ensure system availability

3. **Cost Management**
   - Real-time cost tracking and projection
   - Cache hit rate optimization
   - Service-level cost breakdown
   - ROI analysis for API vs LLM approaches

4. **Production Patterns**
   - Error handling with graceful degradation
   - Performance monitoring and alerting
   - Scalable architecture for high-volume processing
   - Cost-aware design decisions

### Cost Economics:

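- **API Calls**: $0.001-$0.005 per call (the currency and search prices assumed in this session)
- **Caching**: 50-80% cost reduction when 50-80% of lookups are served from cache, which depends on choosing a suitable TTL
- **LLM Fallbacks**: More expensive and only an estimate, but they keep the pipeline available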
- **Hybrid Approach**: Best balance of cost, accuracy, and reliability
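
The caching figure above follows from simple arithmetic: only cache misses are billed, so the effective price per request is the per-call price times the miss rate. A small sketch under stated assumptions: the function names are hypothetical, the $0.001 price is the currency price assumed in this session, and 10,000 requests per day is an invented workload.

```python
def effective_cost_per_request(price_per_call: float, cache_hit_rate: float) -> float:
    """Average cost of one request when cache hits are free."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return price_per_call * (1.0 - cache_hit_rate)


def monthly_cost(requests_per_day: int, price_per_call: float,
                 cache_hit_rate: float, days: int = 30) -> float:
    """Projected bill for a month of steady traffic."""
    return requests_per_day * days * effective_cost_per_request(price_per_call, cache_hit_rate)


REQUESTS_PER_DAY = 10_000  # Hypothetical workload
PRICE_PER_CALL = 0.001     # Assumed currency API price from this session

for hit_rate in (0.0, 0.5, 0.8):
    cost = monthly_cost(REQUESTS_PER_DAY, PRICE_PER_CALL, hit_rate)
    print(f"hit rate {hit_rate:.0%}: ${cost:.2f} per month")
```

The hit rate itself depends on how often the same currency pair repeats within the TTL, so it is worth measuring on real traffic before relying on a projection.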

### Resilience Benefits:

- **Circuit Breakers**: Prevent wasted API calls during outages
- **Rate Limiting**: Protect against quota exhaustion
- **Caching**: Dramatic cost reduction and speed improvement
- **Fallbacks**: Maintain functionality during API failures

### Production Considerations:

- **Monitoring**: Track costs, latency, and success rates
- **Alerting**: Set budgets and performance thresholds
- **Optimization**: Continuously tune cache TTL and retry policies
- **Security**: Implement API key rotation and rate limiting
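
Retry policies were not implemented in this session. One common shape is exponential backoff with jitter, sketched below under stated assumptions: `retry_with_backoff` and all of its parameters are hypothetical names, and the default delays are arbitrary starting points rather than tuned values.

```python
import random
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call func, retrying transient errors with exponentially growing, jittered waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller (or a circuit breaker) see the error
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng.uniform(0, ceiling))  # Full jitter: random wait up to the ceiling
    raise AssertionError("unreachable")


# Usage: a function that fails twice, then succeeds
attempts_seen = []

def flaky() -> str:
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"

delays = []  # Record the waits instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, len(attempts_seen), len(delays))
```

For calls made with `requests`, you would likely pass `retry_on=(requests.ConnectionError, requests.Timeout)`, since those do not derive from the built-in `ConnectionError` and `TimeoutError`. Retries belong inside the circuit breaker call, so that one exhausted retry sequence counts as a single failure.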

### Next Steps:

This foundation of API integration and resilience patterns enables:
- Building enterprise-grade document processing systems
- Handling high-volume production workloads
- Maintaining cost-effective operations
- Ensuring reliable service availability

The combination of multimodal AI agents with real-world API integrations creates powerful systems that can process invoices with both intelligence and real-time data accuracy.