# Day 2, Session 2 Lab: Build Resilient Invoice Enhancement Tools

## Lab Overview

**Estimated Time:** 40 minutes  
**Difficulty:** Advanced  
**Prerequisites:** Day 2 Session 1 Lab completion

### Learning Objectives

By the end of this lab, you will be able to:

1. **Build Production-Ready API Tools**
   - Create currency conversion tools with real APIs
   - Implement VAT validation using web search
   - Build robust error handling and fallback strategies

2. **Implement Resilience Patterns**
   - Build circuit breakers for API failure protection
   - Create rate limiting for quota management
   - Implement intelligent caching strategies

3. **Optimize for Production**
   - Track API costs and performance metrics
   - Handle partial failures gracefully
   - Build scalable tool orchestration

### Real-World Application

This lab simulates building an enterprise invoice processing system that:
- Converts currencies using live exchange rates
- Validates VAT numbers against external registries
- Handles API outages gracefully
- Optimizes costs through intelligent caching
- Maintains high availability under load

### Lab Structure

1. **Currency Converter Tool** (10 minutes)
2. **VAT Validator Tool** (10 minutes)  
3. **Circuit Breaker Implementation** (8 minutes)
4. **Tool Orchestration** (8 minutes)
5. **Production Testing** (4 minutes)

Let's build tools that work reliably in the real world!

In [None]:
# Server configuration - instructor provides actual values
OLLAMA_URL = "http://XX.XX.XX.XX"  # Course server IP
API_TOKEN = "YOUR_TOKEN_HERE"      # Instructor provides token
MODEL = "qwen3:8b"                  # Default model on server

import requests
import json
import time
import os
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, field
import threading
from collections import defaultdict, deque
import hashlib
from enum import Enum
import concurrent.futures

# API Configuration
EXCHANGE_RATE_API = "https://api.exchangerate-api.com/v4/latest/{currency}"
SERPER_API_KEY = "your_key_here"  # Instructor provides or mock

# Mock data for when APIs are unavailable
MOCK_EXCHANGE_RATES = {
    "EUR": {"USD": 1.1, "GBP": 0.85, "JPY": 120.5, "EUR": 1.0},
    "USD": {"EUR": 0.91, "GBP": 0.77, "JPY": 109.5, "USD": 1.0},
    "GBP": {"EUR": 1.18, "USD": 1.30, "JPY": 142.3, "GBP": 1.0}
}

MOCK_VAT_DATA = {
    "GB123456789": {"valid": True, "company": "TechSupplies Co.", "country": "United Kingdom"},
    "NL123456789": {"valid": True, "company": "Dutch Tech BV", "country": "Netherlands"},
    "DE123456789": {"valid": True, "company": "German Tech GmbH", "country": "Germany"},
    "INVALID123": {"valid": False, "reason": "Invalid format"}
}

# Health check
def check_server_health():
    """Verify server connection"""
    try:
        # A timeout keeps the notebook from hanging if the server is unreachable
        response = requests.get(f"{OLLAMA_URL}/health", timeout=5)
        if response.status_code == 200:
            data = response.json()
            print(f"✅ Server Status: {data.get('status', 'Unknown')}")
            return True
        print(f"❌ Server returned HTTP {response.status_code}")
    except Exception as e:
        print(f"❌ Server connection failed: {e}")
    return False

print("🔗 API Integration Lab Setup")
print("🔌 Connecting to course server...")
server_available = check_server_health()

print("\n📦 Installing required packages...")
!pip install -q requests python-dateutil
print("✅ Packages ready")

In [None]:
# Download real invoice dataset
import requests
import zipfile
import io

dropbox_url = "https://www.dropbox.com/scl/fo/m9hyfmvi78snwv0nh34mo/AMEXxwXMLAOeve-_yj12ck8?rlkey=urinkikgiuven0fro7r4x5rcu&st=hv3of7g7&dl=1"

# Sample invoice data for testing (defined before the download so it
# is still available if the download fails)
SAMPLE_INVOICES = [
    {
        "invoice_id": "INV-2024-001",
        "vendor": "TechSupplies Co.",
        "amount": 15000.00,
        "currency": "EUR",
        "vat_number": "GB123456789",
        "date": "2024-01-15"
    },
    {
        "invoice_id": "INV-2024-002",
        "vendor": "CloudServices Inc.",
        "amount": 8500.00,
        "currency": "USD",
        "vat_number": "NL123456789",
        "date": "2024-02-01"
    },
    {
        "invoice_id": "INV-2024-003",
        "vendor": "Software Solutions Ltd",
        "amount": 12500.00,
        "currency": "GBP",
        "vat_number": "INVALID123",
        "date": "2024-02-15"
    }
]

print(f"📊 Sample invoices prepared: {len(SAMPLE_INVOICES)}")

print("📦 Downloading invoice dataset...")
try:
    response = requests.get(dropbox_url, timeout=60)
    response.raise_for_status()  # Fail loudly on HTTP errors instead of unzipping an error page
    with zipfile.ZipFile(io.BytesIO(response.content)) as z:
        z.extractall("invoice_images")
    print("✅ Downloaded invoice dataset")
except Exception as e:
    print(f"❌ Error downloading: {e}")

## Foundation: Resilience Components

First, let's build the foundational resilience patterns that all our tools will use.

**Your Task:** Complete the cache and rate limiter implementations.

In [None]:
@dataclass
class CacheEntry:
    """Cache entry with expiration"""
    value: Any
    expires_at: datetime
    created_at: datetime = field(default_factory=datetime.now)

class SimpleCache:
    """Simple in-memory cache with TTL support"""
    
    def __init__(self):
        self.cache: Dict[str, CacheEntry] = {}
        self._lock = threading.Lock()
        self.stats = {
            'hits': 0,
            'misses': 0,
            'evictions': 0
        }
    
    def get(self, key: str) -> Optional[Any]:
        """Get value from cache if not expired"""
        # TODO: Implement cache retrieval with TTL checking
        with self._lock:
            if key in self.cache:
                entry = self.cache[key]
                if datetime.now() < entry.expires_at:
                    # TODO: Increment hit counter
                    self.stats['hits'] += 1
                    return entry.value
                else:
                    # TODO: Remove expired entry and increment evictions
                    del self.cache[key]
                    self.stats['evictions'] += 1
            
            # TODO: Increment miss counter
            self.stats['misses'] += 1
            return None
    
    def set(self, key: str, value: Any, ttl_seconds: int):
        """Store value in cache with TTL"""
        # TODO: Implement cache storage with expiration
        with self._lock:
            expires_at = datetime.now() + timedelta(seconds=ttl_seconds)
            self.cache[key] = CacheEntry(value=value, expires_at=expires_at)
    
    def clear(self):
        """Clear all cache entries"""
        with self._lock:
            self.cache.clear()
    
    def get_stats(self) -> Dict[str, Any]:
        """Get cache performance statistics"""
        # Take the lock so the counters and entry count form a consistent snapshot
        with self._lock:
            total_requests = self.stats['hits'] + self.stats['misses']
            hit_rate = self.stats['hits'] / total_requests if total_requests > 0 else 0
            
            return {
                'hit_rate': hit_rate,
                'total_entries': len(self.cache),
                **self.stats
            }

class RateLimiter:
    """Token bucket rate limiter"""
    
    def __init__(self, max_calls: int, time_window: int):
        # TODO: Initialize token bucket parameters
        self.max_calls = max_calls
        self.time_window = time_window  # seconds
        self.tokens = max_calls
        self.last_refill = time.time()
        self._lock = threading.Lock()
        
        # Refill rate: tokens per second
        self.refill_rate = max_calls / time_window
    
    def acquire(self) -> bool:
        """Try to acquire a token"""
        # TODO: Implement token bucket logic
        with self._lock:
            now = time.time()
            
            # Calculate tokens to add based on time passed
            time_passed = now - self.last_refill
            tokens_to_add = time_passed * self.refill_rate
            
            # Add tokens up to maximum
            self.tokens = min(self.max_calls, self.tokens + tokens_to_add)
            self.last_refill = now
            
            # Check if we can consume a token
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            else:
                return False
    
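    def wait_time(self) -> float:
        """Get seconds to wait before next token is available"""
        with self._lock:
            # Count tokens accrued since the last refill so the estimate is not stale
            elapsed = time.time() - self.last_refill
            available = min(self.max_calls, self.tokens + elapsed * self.refill_rate)
            if available >= 1:
                return 0.0
            return (1 - available) / self.refill_rate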

# Test the implementations
print("🧪 Testing resilience components...")

# Test cache
cache = SimpleCache()
cache.set("test_key", "test_value", 2)  # 2 second TTL
print(f"Cache get: {cache.get('test_key')}")
time.sleep(1)
print(f"Cache get after 1s: {cache.get('test_key')}")
time.sleep(1.5)
print(f"Cache get after 2.5s: {cache.get('test_key')}")
print(f"Cache stats: {cache.get_stats()}")

# Test rate limiter
rate_limiter = RateLimiter(max_calls=3, time_window=5)  # 3 calls per 5 seconds
print(f"\nRate limiter tests:")
for i in range(5):
    allowed = rate_limiter.acquire()
    print(f"  Request {i+1}: {'✅ Allowed' if allowed else '❌ Rate limited'}")

print("✅ Resilience components ready")
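
The tests above rely on `time.sleep`, which makes them slow and timing-sensitive. A minimal sketch of an alternative, assuming a time source can be injected: the `FakeClock` and `ClockedTokenBucket` names below are hypothetical and not part of the lab code, but the refill arithmetic mirrors `RateLimiter`.

```python
import threading
import time


class FakeClock:
    """Manually advanced clock (hypothetical test helper)."""

    def __init__(self, start: float = 0.0):
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class ClockedTokenBucket:
    """Token bucket whose time source can be swapped out in tests."""

    def __init__(self, max_calls: int, time_window: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.refill_rate = max_calls / time_window  # tokens per second
        self.tokens = float(max_calls)
        self._clock = clock
        self._last_refill = clock()
        self._lock = threading.Lock()

    def _refill(self) -> None:
        # Add tokens for the elapsed time, capped at the bucket size
        now = self._clock()
        elapsed = now - self._last_refill
        self.tokens = min(self.max_calls, self.tokens + elapsed * self.refill_rate)
        self._last_refill = now

    def acquire(self) -> bool:
        with self._lock:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    def wait_time(self) -> float:
        with self._lock:
            self._refill()
            if self.tokens >= 1:
                return 0.0
            return (1 - self.tokens) / self.refill_rate


clock = FakeClock()
bucket = ClockedTokenBucket(max_calls=3, time_window=5, clock=clock)

burst = [bucket.acquire() for _ in range(4)]  # three allowed, fourth refused
wait_before = bucket.wait_time()              # time until one full token exists
clock.advance(2.0)                            # 2 s at 0.6 tokens/s adds 1.2 tokens
after_refill = bucket.acquire()

print(burst, round(wait_before, 3), after_refill)
```

Because the clock only moves when `advance` is called, the same sequence of results comes out on every run, with no sleeping.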

## Task 1: Currency Converter Tool (10 minutes)

Build a production-ready currency converter with caching and error handling.

**Your Task:** Complete the CurrencyConverterTool implementation.

In [None]:
class CurrencyConverterTool:
    """Production-ready currency converter with resilience patterns"""
    
    def __init__(self, cache_ttl=3600, rate_limit_calls=60, rate_limit_window=60):
        # TODO: Initialize converter components
        self.cache = SimpleCache()
        self.rate_limiter = RateLimiter(rate_limit_calls, rate_limit_window)
        self.cache_ttl = cache_ttl  # 1 hour default
        
        # Cost tracking
        self.api_cost_per_call = 0.001  # $0.001 per call
        self.total_cost = 0.0
        self.call_count = 0
        
        print(f"💱 Currency converter initialized (cache: {cache_ttl}s, rate: {rate_limit_calls}/{rate_limit_window}s)")
    
    def _get_cache_key(self, from_currency: str, to_currency: str) -> str:
        """Generate cache key for currency pair"""
        return f"rate_{from_currency}_{to_currency}"
    
    def _call_exchange_api(self, from_currency: str) -> Dict[str, float]:
        """Call real exchange rate API"""
        # TODO: Implement real API call
        try:
            url = EXCHANGE_RATE_API.format(currency=from_currency)
            response = requests.get(url, timeout=5)
            
            if response.status_code == 200:
                data = response.json()
                return data.get('rates', {})
            else:
                raise requests.RequestException(f"API returned {response.status_code}")
                
        except Exception as e:
            print(f"⚠️ API call failed: {e}")
            # TODO: Return mock data as fallback
            return MOCK_EXCHANGE_RATES.get(from_currency, {})
    
    def get_exchange_rate(self, from_currency: str, to_currency: str) -> Dict[str, Any]:
        """Get exchange rate with caching and rate limiting"""
        # TODO: Implement complete exchange rate retrieval
        cache_key = self._get_cache_key(from_currency, to_currency)
        
        # Check cache first
        cached_rate = self.cache.get(cache_key)
        if cached_rate is not None:
            return {
                'rate': cached_rate,
                'from_cache': True,
                'cost': 0.0
            }
        
        # Check rate limit
        if not self.rate_limiter.acquire():
            wait_time = self.rate_limiter.wait_time()
            return {
                'error': f"Rate limited. Wait {wait_time:.1f}s",
                'wait_time': wait_time
            }
        
        # Make API call
        try:
            rates = self._call_exchange_api(from_currency)
            
            if to_currency in rates:
                rate = rates[to_currency]
                
                # Cache the result
                self.cache.set(cache_key, rate, self.cache_ttl)
                
                # Update cost tracking
                self.total_cost += self.api_cost_per_call
                self.call_count += 1
                
                return {
                    'rate': rate,
                    'from_cache': False,
                    'cost': self.api_cost_per_call
                }
            else:
                return {'error': f"Currency {to_currency} not supported"}
                
        except Exception as e:
            return {'error': str(e)}
    
    def convert(self, amount: float, from_currency: str, to_currency: str) -> Dict[str, Any]:
        """Convert amount between currencies"""
        # TODO: Implement currency conversion
        if from_currency == to_currency:
            return {
                'original_amount': amount,
                'converted_amount': amount,
                'rate': 1.0,
                'from_currency': from_currency,
                'to_currency': to_currency,
                'cost': 0.0
            }
        
        rate_result = self.get_exchange_rate(from_currency, to_currency)
        
        if 'error' in rate_result:
            return rate_result
        
        rate = rate_result['rate']
        converted_amount = amount * rate
        
        return {
            'original_amount': amount,
            'converted_amount': converted_amount,
            'rate': rate,
            'from_currency': from_currency,
            'to_currency': to_currency,
            'from_cache': rate_result['from_cache'],
            'cost': rate_result['cost']
        }
    
    def convert_to_multiple(self, amount: float, from_currency: str, to_currencies: List[str]) -> Dict[str, Any]:
        """Convert to multiple currencies in parallel"""
        # TODO: Implement batch conversion
        results = {}
        total_cost = 0.0
        
        # Use ThreadPoolExecutor for parallel requests
        with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
            future_to_currency = {
                executor.submit(self.convert, amount, from_currency, to_currency): to_currency
                for to_currency in to_currencies
            }
            
            for future in concurrent.futures.as_completed(future_to_currency):
                to_currency = future_to_currency[future]
                try:
                    result = future.result()
                    results[to_currency] = result
                    total_cost += result.get('cost', 0)
                except Exception as e:
                    results[to_currency] = {'error': str(e)}
        
        return {
            'conversions': results,
            'total_cost': total_cost,
            'currencies_processed': len(results)
        }
    
    def get_stats(self) -> Dict[str, Any]:
        """Get converter performance statistics"""
        cache_stats = self.cache.get_stats()
        
        return {
            'total_cost': self.total_cost,
            'api_calls': self.call_count,
            'cache_hit_rate': cache_stats['hit_rate'],
            'money_saved': cache_stats['hits'] * self.api_cost_per_call,
            'average_cost_per_conversion': self.total_cost / max(1, self.call_count)
        }

# Test the currency converter
print("🧪 Testing Currency Converter...")
converter = CurrencyConverterTool()

# Test single conversion
result = converter.convert(1000, "EUR", "USD")
print(f"Convert €1000 to USD: {result}")

# Test cache hit
result2 = converter.convert(500, "EUR", "USD")
print(f"Second conversion (should hit cache): {result2.get('from_cache')}")

# Test batch conversion
batch_result = converter.convert_to_multiple(1000, "EUR", ["USD", "GBP", "JPY"])
print(f"Batch conversion results: {len(batch_result['conversions'])} currencies")

# Show stats
stats = converter.get_stats()
print(f"Converter stats: {stats}")

print("✅ Currency Converter implementation complete!")
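
`_call_exchange_api` returns the full rate table for a base currency, yet `convert_to_multiple` triggers one lookup per target on a cold cache. A rough sketch of a single-fetch variant follows; `fetch_rates_stub`, `DEMO_RATES`, and `convert_many` are illustrative names, and the stub stands in for one billable API call.

```python
from typing import Any, Callable, Dict, List

# Illustrative rates only; not live data
DEMO_RATES: Dict[str, Dict[str, float]] = {
    "EUR": {"USD": 1.1, "GBP": 0.85, "JPY": 120.5, "EUR": 1.0},
}

fetch_log: List[str] = []  # records each simulated API call


def fetch_rates_stub(base: str) -> Dict[str, float]:
    """Hypothetical stand-in for one billable exchange-rate request."""
    fetch_log.append(base)
    return DEMO_RATES.get(base, {})


def convert_many(
    amount: float,
    base: str,
    targets: List[str],
    fetch_rates: Callable[[str], Dict[str, float]],
) -> Dict[str, Dict[str, Any]]:
    """Convert one amount into several currencies using a single rate lookup."""
    rates = fetch_rates(base)  # one lookup serves every target
    results: Dict[str, Dict[str, Any]] = {}
    for target in targets:
        if target not in rates:
            results[target] = {"error": f"Currency {target} not supported"}
        else:
            rate = rates[target]
            results[target] = {"rate": rate, "converted_amount": amount * rate}
    return results


batch = convert_many(1000, "EUR", ["USD", "GBP", "XXX"], fetch_rates_stub)
print(f"API calls made: {len(fetch_log)}")
print(batch)
```

The trade-off is that a single failed request now affects every target at once, so this shape works best with a fallback or a cached table.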

## Task 2: VAT Validator Tool (10 minutes)

Build a VAT validation tool using web search APIs.

**Your Task:** Complete the VATValidatorTool implementation.

In [None]:
class VATValidatorTool:
    """VAT number validator using web search"""
    
    def __init__(self, cache_ttl=86400):  # 24 hour cache
        # TODO: Initialize VAT validator
        self.cache = SimpleCache()
        self.rate_limiter = RateLimiter(max_calls=2, time_window=1)  # 2 calls per second
        self.cache_ttl = cache_ttl
        
        # Cost tracking
        self.api_cost_per_call = 0.005  # $0.005 per search
        self.total_cost = 0.0
        self.call_count = 0
        
        print(f"🔍 VAT validator initialized (cache: {cache_ttl/3600:.0f}h, rate: 2/1s)")
    
    def _validate_vat_format(self, vat_number: str) -> Dict[str, Any]:
        """Basic VAT number format validation"""
        # TODO: Implement format validation
        import re  # Local import: the setup cell does not import re
        
        vat_number = vat_number.strip().upper()
        
        # Basic format patterns, keyed by country prefix
        format_patterns = {
            'GB': r'GB\d{9}',             # UK: GB + 9 digits
            'NL': r'NL\d{9}B\d{2}',       # Netherlands: NL + 9 digits + B + 2 digits
            'DE': r'DE\d{9}',             # Germany: DE + 9 digits
            'FR': r'FR[A-Z0-9]{2}\d{9}',  # France: FR + 2 chars + 9 digits
        }
        
        if len(vat_number) < 4:
            return {'valid_format': False, 'reason': 'Too short'}
        
        country_code = vat_number[:2]
        if country_code in format_patterns:
            # fullmatch checks both the length and the character classes.
            # Note: the mock entry NL123456789 lacks the B + 2-digit suffix,
            # so it is rejected here.
            if re.fullmatch(format_patterns[country_code], vat_number):
                return {'valid_format': True, 'country': country_code}
            else:
                return {'valid_format': False, 'reason': f'Invalid format for {country_code}'}
        else:
            return {'valid_format': True, 'country': 'unknown'}  # Accept unknown formats
    
    def _search_vat_online(self, vat_number: str, company_name: str = "") -> Dict[str, Any]:
        """Search for VAT number online"""
        # TODO: Implement web search (use Serper API or mock)
        try:
            # For production, use real search API:
            # query = f"VAT number {vat_number} {company_name} company registration"
            # headers = {'X-API-KEY': SERPER_API_KEY}
            # response = requests.post(
            #     'https://google.serper.dev/search',
            #     headers=headers,
            #     json={'q': query}
            # )
            
            # For now, use mock data
            time.sleep(0.2)  # Simulate API delay
            
            if vat_number in MOCK_VAT_DATA:
                return MOCK_VAT_DATA[vat_number]
            else:
                return {
                    'valid': False,
                    'reason': 'Not found in registry',
                    'confidence': 0.7
                }
                
        except Exception as e:
            return {'valid': False, 'error': str(e), 'confidence': 0.0}
    
    def validate(self, vat_number: str, company_name: str = "") -> Dict[str, Any]:
        """Validate VAT number with comprehensive checks"""
        # TODO: Implement complete VAT validation
        start_time = time.time()
        
        # Clean input
        vat_number = vat_number.strip().upper().replace(' ', '')
        cache_key = f"vat_{vat_number}_{company_name}"
        
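        # Check cache first
        cached_result = self.cache.get(cache_key)
        if cached_result is not None:
            # Work on a copy so the stored entry is not mutated; a cache hit costs nothing
            result = dict(cached_result)
            result['from_cache'] = True
            result['cost'] = 0.0
            result['processing_time'] = time.time() - start_time
            return result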
        
        # Format validation
        format_check = self._validate_vat_format(vat_number)
        if not format_check['valid_format']:
            result = {
                'vat_number': vat_number,
                'valid': False,
                'reason': format_check['reason'],
                'confidence': 0.9,
                'from_cache': False,
                'cost': 0.0,
                'processing_time': time.time() - start_time
            }
            return result
        
        # Rate limiting
        if not self.rate_limiter.acquire():
            wait_time = self.rate_limiter.wait_time()
            return {
                'error': f"Rate limited. Wait {wait_time:.1f}s",
                'wait_time': wait_time
            }
        
        # Online search
        search_result = self._search_vat_online(vat_number, company_name)
        
        # Build final result
        result = {
            'vat_number': vat_number,
            'valid': search_result.get('valid', False),
            'company': search_result.get('company', 'Unknown'),
            # Prefer the registry's country name; fall back to the prefix from the format check
            'country': search_result.get('country') or format_check.get('country', 'Unknown'),
            'confidence': search_result.get('confidence', 0.8),
            'reason': search_result.get('reason', ''),
            'from_cache': False,
            'cost': self.api_cost_per_call,
            'processing_time': time.time() - start_time
        }
        
        # Update cost tracking
        self.total_cost += self.api_cost_per_call
        self.call_count += 1
        
        # Cache the result
        self.cache.set(cache_key, result, self.cache_ttl)
        
        return result
    
    def validate_multiple(self, vat_numbers: List[str]) -> Dict[str, Any]:
        """Validate multiple VAT numbers"""
        # TODO: Implement batch validation
        results = {}
        total_cost = 0.0
        
        for vat_number in vat_numbers:
            try:
                result = self.validate(vat_number)
                results[vat_number] = result
                total_cost += result.get('cost', 0)
            except Exception as e:
                results[vat_number] = {'error': str(e)}
        
        return {
            'validations': results,
            'total_cost': total_cost,
            'numbers_processed': len(results),
            'valid_count': sum(1 for r in results.values() if r.get('valid', False))
        }
    
    def get_stats(self) -> Dict[str, Any]:
        """Get validator performance statistics"""
        cache_stats = self.cache.get_stats()
        
        return {
            'total_cost': self.total_cost,
            'api_calls': self.call_count,
            'cache_hit_rate': cache_stats['hit_rate'],
            'money_saved': cache_stats['hits'] * self.api_cost_per_call,
            'average_cost_per_validation': self.total_cost / max(1, self.call_count)
        }

# Test the VAT validator
print("🧪 Testing VAT Validator...")
vat_validator = VATValidatorTool()

# Test valid VAT numbers
test_vats = ["GB123456789", "NL123456789", "INVALID123"]

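for vat in test_vats:
    result = vat_validator.validate(vat)
    # Show the company for valid numbers and the rejection reason otherwise
    detail = result.get('company', '') if result.get('valid') else result.get('reason', '')
    print(f"VAT {vat}: {'✅ Valid' if result.get('valid') else '❌ Invalid'} - {detail}")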

# Test batch validation
batch_result = vat_validator.validate_multiple(test_vats)
print(f"\nBatch validation: {batch_result['valid_count']}/{batch_result['numbers_processed']} valid")

# Show stats
stats = vat_validator.get_stats()
print(f"Validator stats: {stats}")

print("✅ VAT Validator implementation complete!")
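
One thing to watch when caching result dictionaries, as `validate` does: a dictionary read back from the cache is the same object that is stored, so editing it on the way out also edits the cached entry. The sketch below illustrates the difference with a plain dict standing in for the cache; `get_shared` and `get_copy` are hypothetical helpers, not lab APIs.

```python
import copy
from typing import Any, Dict, Optional

# A plain dict stands in for the cache in this illustration
_store: Dict[str, Dict[str, Any]] = {}


def get_shared(key: str) -> Optional[Dict[str, Any]]:
    """Return the stored object itself (caller edits leak into the cache)."""
    return _store.get(key)


def get_copy(key: str) -> Optional[Dict[str, Any]]:
    """Return an independent copy (caller edits stay local)."""
    entry = _store.get(key)
    return copy.deepcopy(entry) if entry is not None else None


KEY = "vat_GB123456789"
_store[KEY] = {"valid": True, "cost": 0.005, "from_cache": False}

# Safe path: annotate a copy for the caller
hit = get_copy(KEY)
hit["from_cache"] = True
hit["cost"] = 0.0
stored_after_copy = dict(_store[KEY])  # snapshot of the entry, still pristine

# Leaky path: the same edit on the shared object rewrites the cache entry
leaky = get_shared(KEY)
leaky["from_cache"] = True
stored_after_shared = dict(_store[KEY])

print("after copy:  ", stored_after_copy)
print("after shared:", stored_after_shared)
```

A shallow `dict(entry)` copy is enough when the values are flat, as they are here; `deepcopy` is the cautious choice if results ever contain nested structures.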

## Task 3: Circuit Breaker Implementation (8 minutes)

Implement a circuit breaker to protect against cascading failures.

**Your Task:** Complete the CircuitBreaker class and test it thoroughly.

In [None]:
class CircuitState(Enum):
    CLOSED = "closed"      # Normal operation
    OPEN = "open"          # Failing, reject calls
    HALF_OPEN = "half_open" # Testing recovery

class CircuitBreaker:
    """Circuit breaker for API resilience"""
    
    def __init__(self, failure_threshold=3, timeout=30, recovery_timeout=60):
        # TODO: Initialize circuit breaker state
        self.failure_threshold = failure_threshold
        self.timeout = timeout  # Seconds the circuit stays OPEN before a trial call is allowed
        self.recovery_timeout = recovery_timeout  # Stored but not used by call() below
        
        self.failure_count = 0
        self.success_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED
        self._lock = threading.Lock()
        
        # Statistics
        self.stats = {
            'total_calls': 0,
            'successful_calls': 0,
            'failed_calls': 0,
            'rejected_calls': 0,
            'state_changes': 0
        }
    
    def _change_state(self, new_state: CircuitState, reason: str = ""):
        """Change circuit breaker state"""
        old_state = self.state
        self.state = new_state
        self.stats['state_changes'] += 1
        print(f"🔄 Circuit breaker: {old_state.value} → {new_state.value} ({reason})")
    
    def call(self, func, *args, **kwargs):
        """Execute function with circuit breaker protection"""
        # TODO: Implement circuit breaker logic
        with self._lock:
            self.stats['total_calls'] += 1
            
            # Check current state
            if self.state == CircuitState.OPEN:
                # Check if we should try recovery
                if self.last_failure_time and \
                   (time.time() - self.last_failure_time) > self.timeout:
                    self._change_state(CircuitState.HALF_OPEN, "timeout expired")
                else:
                    # Still in open state, reject call
                    self.stats['rejected_calls'] += 1
                    raise Exception("Circuit breaker is OPEN - call rejected")
        
        # Attempt the call
        try:
            result = func(*args, **kwargs)
            
            # Success - update state
            with self._lock:
                self.stats['successful_calls'] += 1
                
                if self.state == CircuitState.HALF_OPEN:
                    # Recovery successful
                    self.failure_count = 0
                    self._change_state(CircuitState.CLOSED, "recovery successful")
                elif self.state == CircuitState.CLOSED:
                    # Reset failure count on success
                    self.failure_count = 0
            
            return result
            
        except Exception as e:
            # Failure - update state
            with self._lock:
                self.stats['failed_calls'] += 1
                self.failure_count += 1
                self.last_failure_time = time.time()
                
                # Check if we should open the circuit
                if self.failure_count >= self.failure_threshold:
                    if self.state != CircuitState.OPEN:
                        self._change_state(CircuitState.OPEN, f"{self.failure_count} failures")
            
            raise  # Bare raise re-raises the original exception unchanged
    
    def get_status(self) -> Dict[str, Any]:
        """Get circuit breaker status"""
        with self._lock:
            success_rate = (self.stats['successful_calls'] / 
                          max(1, self.stats['total_calls']))
            
            return {
                'state': self.state.value,
                'failure_count': self.failure_count,
                'last_failure': self.last_failure_time,
                'success_rate': success_rate,
                'stats': self.stats.copy()
            }
    
    def reset(self):
        """Reset circuit breaker to closed state"""
        with self._lock:
            self.failure_count = 0
            self.last_failure_time = None
            if self.state != CircuitState.CLOSED:
                self._change_state(CircuitState.CLOSED, "manual reset")

# Test the circuit breaker
print("🧪 Testing Circuit Breaker...")

# Create a service that fails a fixed number of times, then recovers
class UnreliableService:
    def __init__(self, fail_first=3):
        self.fail_first = fail_first  # Number of initial calls that raise
        self.call_count = 0
    
    def unreliable_call(self):
        self.call_count += 1
        if self.call_count <= self.fail_first:  # First calls always fail
            raise requests.RequestException(f"Simulated failure #{self.call_count}")
        return f"Success on call #{self.call_count}"

# Test circuit breaker behavior
service = UnreliableService()
circuit_breaker = CircuitBreaker(failure_threshold=3, timeout=2)

print("\n📊 Circuit Breaker Test Sequence:")
for i in range(8):
    try:
        result = circuit_breaker.call(service.unreliable_call)
        print(f"  Call {i+1}: ✅ {result}")
    except Exception as e:
        print(f"  Call {i+1}: ❌ {str(e)[:50]}")
    
    # Show circuit state
    status = circuit_breaker.get_status()
    print(f"    State: {status['state']}, Failures: {status['failure_count']}")
    
    # Brief pause between calls
    time.sleep(0.1)

# Test recovery after timeout
print("\n⏰ Testing recovery after timeout...")
time.sleep(2.5)  # Wait for timeout

try:
    result = circuit_breaker.call(service.unreliable_call)
    print(f"Recovery call: ✅ {result}")
except Exception as e:
    print(f"Recovery call: ❌ {e}")

final_status = circuit_breaker.get_status()
print(f"\n📈 Final Stats: {final_status['stats']}")
print(f"Success Rate: {final_status['success_rate']:.1%}")

print("✅ Circuit Breaker implementation complete!")
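
The breaker above raises a plain `Exception` when it rejects a call, so callers cannot easily tell a rejection from a real service failure. Below is a minimal sketch of one way to separate the two and serve a fallback while the circuit is open. It assumes a hypothetical `CircuitOpenError` and a deliberately small, single-threaded `MiniBreaker`; neither is part of the lab code.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the service (hypothetical)."""


class MiniBreaker:
    """Small single-threaded breaker, for illustration only."""

    def __init__(self, failure_threshold=3, timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self._clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self._clock() - self.opened_at < self.timeout:
                raise CircuitOpenError("circuit open, call rejected")
            # Timeout elapsed: fall through and allow one trial call (half-open)
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Any success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result


def call_with_fallback(breaker, func, fallback):
    """Return (value, source): the live result, or the fallback if the circuit is open."""
    try:
        return breaker.call(func), "live"
    except CircuitOpenError:
        return fallback(), "fallback"


def always_down():
    raise ConnectionError("service unavailable")


breaker = MiniBreaker(failure_threshold=2, timeout=60.0)
for _ in range(2):
    try:
        breaker.call(always_down)
    except ConnectionError:
        pass  # Real failures still surface to the caller

value, source = call_with_fallback(breaker, always_down, lambda: {"USD": 1.1})
print(value, source)
```

Only rejections are turned into fallbacks here. Genuine service errors still propagate, which keeps them visible in logs and metrics.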

## Task 4: Tool Orchestration (8 minutes)

Combine all tools into a comprehensive invoice enhancement system.

**Your Task:** Complete the invoice enhancement orchestrator.
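
One detail worth checking before putting tools behind circuit breakers: `CircuitBreaker.call` records a failure only when the wrapped function raises. The tools built earlier report problems by returning a dictionary with an `'error'` key, so a breaker wrapped around them would not see those failures. A possible adapter is sketched below; `ToolError`, `raise_on_error`, and `count_failures` are hypothetical names, and `count_failures` only imitates how a breaker counts.

```python
import functools
from typing import Any, Callable, Dict


class ToolError(Exception):
    """Hypothetical exception carrying a tool's error message."""


def raise_on_error(func: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Wrap a tool so that an {'error': ...} result becomes an exception."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, dict) and "error" in result:
            raise ToolError(result["error"])
        return result

    return wrapper


def count_failures(func: Callable[[], Any], attempts: int) -> int:
    """Stand-in for a breaker's failure counter: counts raised exceptions only."""
    failures = 0
    for _ in range(attempts):
        try:
            func()
        except Exception:
            failures += 1
    return failures


def flaky_tool() -> Dict[str, Any]:
    """Simulated tool that reports failure through its return value."""
    return {"error": "Rate limited. Wait 1.0s"}


unwrapped_failures = count_failures(flaky_tool, 3)
wrapped_failures = count_failures(raise_on_error(flaky_tool), 3)
print(f"failures seen without adapter: {unwrapped_failures}")
print(f"failures seen with adapter:    {wrapped_failures}")
```

Whether a rate-limit response should count toward opening the circuit is a design choice; the adapter could be narrowed to raise only for selected error types.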

In [None]:
class InvoiceEnhancementOrchestrator:
    """Orchestrates all tools for comprehensive invoice enhancement"""
    
    def __init__(self):
        # TODO: Initialize all tools with circuit breakers
        self.currency_converter = CurrencyConverterTool()
        self.vat_validator = VATValidatorTool()
        
        # Circuit breakers for each service
        self.currency_circuit = CircuitBreaker(failure_threshold=3, timeout=30)
        self.vat_circuit = CircuitBreaker(failure_threshold=2, timeout=45)
        
        # Standard currencies for conversion
        self.target_currencies = ["USD", "EUR", "GBP"]
        
        print("🎯 Invoice Enhancement Orchestrator initialized")
    
    def enhance_invoice_data(self, invoice: Dict[str, Any]) -> Dict[str, Any]:
        """Enhance invoice with currency conversion and VAT validation"""
        # TODO: Implement comprehensive invoice enhancement
        start_time = time.time()
        
        print(f"\n🧾 Enhancing invoice: {invoice.get('invoice_id', 'Unknown')}")
        
        enhanced_invoice = invoice.copy()
        enhancement_metadata = {
            'processing_start': datetime.now().isoformat(),
            'services_used': [],
            'total_cost': 0.0,
            'errors': [],
            'warnings': []
        }
        
        # Extract invoice details
        amount = invoice.get('amount', 0)
        original_currency = invoice.get('currency', 'USD')
        vat_number = invoice.get('vat_number')
        
        # 1. Currency Conversion (parallel for multiple currencies)
        if amount > 0 and original_currency:
            try:
                print(f"💱 Converting {amount} {original_currency} to multiple currencies...")
                
                # Filter out the original currency
                target_currencies = [c for c in self.target_currencies if c != original_currency]
                
                conversion_result = self.currency_circuit.call(
                    self.currency_converter.convert_to_multiple,
                    amount,
                    original_currency,
                    target_currencies
                )
                
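                conversions = conversion_result['conversions']
                enhanced_invoice['currency_conversions'] = conversions
                enhancement_metadata['total_cost'] += conversion_result['total_cost']
                enhancement_metadata['services_used'].append('currency_converter')
                
                # Per-currency failures come back as {'error': ...} dicts, not exceptions
                failed = [c for c, r in conversions.items() if 'error' in r]
                for c in failed:
                    enhancement_metadata['warnings'].append(
                        f"Conversion to {c} failed: {conversions[c]['error']}"
                    )
                
                print(f"   ✅ Converted to {len(conversions) - len(failed)} of {len(conversions)} currencies")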
                
            except Exception as e:
                error_msg = f"Currency conversion failed: {e}"
                enhancement_metadata['errors'].append(error_msg)
                print(f"   ❌ {error_msg}")
        
        # 2. VAT Validation
        if vat_number:
            try:
                print(f"🔍 Validating VAT number: {vat_number}")
                
                vat_result = self.vat_circuit.call(
                    self.vat_validator.validate,
                    vat_number,
                    invoice.get('vendor', '')
                )
                
                enhanced_invoice['vat_validation'] = vat_result
                enhancement_metadata['total_cost'] += vat_result.get('cost', 0)
                enhancement_metadata['services_used'].append('vat_validator')
                
                if vat_result.get('valid'):
                    print(f"   ✅ VAT valid: {vat_result.get('company', 'Unknown company')}")
                else:
                    warning_msg = f"VAT validation failed: {vat_result.get('reason', 'Unknown')}"
                    enhancement_metadata['warnings'].append(warning_msg)
                    print(f"   ⚠️ {warning_msg}")
                    
            except Exception as e:
                error_msg = f"VAT validation failed: {e}"
                enhancement_metadata['errors'].append(error_msg)
                print(f"   ❌ {error_msg}")
        
        # 3. Risk Assessment
        risk_score = self._calculate_risk_score(enhanced_invoice)
        enhanced_invoice['risk_assessment'] = risk_score
        
        # 4. Processing Summary
        processing_time = time.time() - start_time
        enhancement_metadata.update({
            'processing_end': datetime.now().isoformat(),
            'processing_time_seconds': processing_time,
            'enhancement_version': '1.0'
        })
        
        enhanced_invoice['enhancement_metadata'] = enhancement_metadata
        
        print(f"✅ Enhancement complete in {processing_time:.2f}s (cost: ${enhancement_metadata['total_cost']:.4f})")
        
        return enhanced_invoice
    
    def _calculate_risk_score(self, invoice: Dict[str, Any]) -> Dict[str, Any]:
        """Calculate overall risk score for the invoice"""
        # TODO: Implement risk scoring logic
        risk_factors = []
        risk_score = 0.0
        
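        # Amount-based risk (thresholds are in the invoice's own currency;
        # non-numeric amounts are scored as 0)
        try:
            amount = float(invoice.get('amount', 0))
        except (TypeError, ValueError):
            amount = 0.0
        if amount > 50000:
            risk_factors.append("High amount (>50k)")
            risk_score += 0.3
        elif amount > 10000:
            risk_factors.append("Medium amount (>10k)")
            risk_score += 0.1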
        
        # VAT validation risk
        vat_validation = invoice.get('vat_validation', {})
        if not vat_validation.get('valid', True):
            risk_factors.append("Invalid VAT number")
            risk_score += 0.4
        
        # Currency conversion risk
        conversions = invoice.get('currency_conversions', {})
        failed_conversions = sum(1 for conv in conversions.values() if 'error' in conv)
        if failed_conversions > 0:
            risk_factors.append(f"{failed_conversions} currency conversion failures")
            risk_score += 0.2
        
        # Overall risk level
        if risk_score >= 0.7:
            risk_level = "HIGH"
        elif risk_score >= 0.3:
            risk_level = "MEDIUM"
        else:
            risk_level = "LOW"
        
        return {
            'risk_score': risk_score,
            'risk_level': risk_level,
            'risk_factors': risk_factors,
            'requires_review': risk_score >= 0.5
        }
    
    def process_batch(self, invoices: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Process multiple invoices in batch"""
        # TODO: Implement batch processing
        results = []
        total_cost = 0.0
        errors = 0
        
        print(f"\n📊 Processing batch of {len(invoices)} invoices...")
        
        for i, invoice in enumerate(invoices):
            try:
                print(f"\n[{i+1}/{len(invoices)}] Processing {invoice.get('invoice_id', f'Invoice #{i+1}')}")
                enhanced = self.enhance_invoice_data(invoice)
                results.append(enhanced)
                total_cost += enhanced['enhancement_metadata']['total_cost']
            except Exception as e:
                print(f"❌ Failed to process invoice {i+1}: {e}")
                errors += 1
                results.append({
                    'invoice_id': invoice.get('invoice_id', f'Invoice #{i+1}'),
                    'error': str(e),
                    'original_data': invoice
                })
        
        return {
            'processed_invoices': results,
            'total_processed': len(invoices),
            'successful': len(invoices) - errors,
            'failed': errors,
            'total_cost': total_cost,
            'average_cost_per_invoice': total_cost / len(invoices) if invoices else 0
        }
    
    def get_performance_stats(self) -> Dict[str, Any]:
        """Get comprehensive performance statistics"""
        return {
            'currency_converter': self.currency_converter.get_stats(),
            'vat_validator': self.vat_validator.get_stats(),
            'currency_circuit_breaker': self.currency_circuit.get_status(),
            'vat_circuit_breaker': self.vat_circuit.get_status()
        }

# Test the orchestrator
print("🧪 Testing Invoice Enhancement Orchestrator...")
orchestrator = InvoiceEnhancementOrchestrator()

# Test single invoice enhancement
if SAMPLE_INVOICES:
    test_invoice = SAMPLE_INVOICES[0]
    enhanced = orchestrator.enhance_invoice_data(test_invoice)
    
    print(f"\n📋 Enhancement Summary for {enhanced['invoice_id']}:")
    print(f"   Services used: {enhanced['enhancement_metadata']['services_used']}")
    print(f"   Total cost: ${enhanced['enhancement_metadata']['total_cost']:.4f}")
    print(f"   Risk level: {enhanced['risk_assessment']['risk_level']}")
    print(f"   Currencies converted: {len(enhanced.get('currency_conversions', {}))}")
    
    vat_result = enhanced.get('vat_validation', {})
    if vat_result:
        print(f"   VAT status: {'✅ Valid' if vat_result.get('valid') else '❌ Invalid'}")

print("\n✅ Tool Orchestration implementation complete!")

## Task 5: Production Testing (4 minutes)

Test your complete system with various failure scenarios.

**Your Task:** Run comprehensive tests to validate resilience.

In [None]:
def run_production_tests():
    """Run comprehensive production tests"""
    
    print("🧪 PRODUCTION TESTING SUITE")
    print("=" * 50)
    
    orchestrator = InvoiceEnhancementOrchestrator()
    
    # Test 1: Normal operation with all sample invoices
    print("\n📊 Test 1: Batch Processing")
    print("-" * 30)
    
    if SAMPLE_INVOICES:
        batch_result = orchestrator.process_batch(SAMPLE_INVOICES)
        
        print(f"✅ Batch Results:")
        print(f"   Total processed: {batch_result['total_processed']}")
        print(f"   Successful: {batch_result['successful']}")
        print(f"   Failed: {batch_result['failed']}")
        print(f"   Total cost: ${batch_result['total_cost']:.4f}")
        print(f"   Average cost: ${batch_result['average_cost_per_invoice']:.4f}")
    
    # Test 2: Cache performance
    print("\n💾 Test 2: Cache Performance")
    print("-" * 30)
    
    # Process same invoice twice to test caching
    if SAMPLE_INVOICES:
        test_invoice = SAMPLE_INVOICES[0]
        
        print("First processing (cache miss):")
        start_time = time.time()
        result1 = orchestrator.enhance_invoice_data(test_invoice)
        time1 = time.time() - start_time
        
        print("\nSecond processing (should hit cache):")
        start_time = time.time()
        result2 = orchestrator.enhance_invoice_data(test_invoice)
        time2 = time.time() - start_time
        
        print(f"\n📈 Cache Performance:")
        print(f"   First run: {time1:.2f}s")
        print(f"   Second run: {time2:.2f}s")
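        # Guard against a zero-length second run (fully cached, timer resolution)
        speedup = time1 / time2 if time2 > 0 else float('inf')
        print(f"   Speedup: {speedup:.1f}x faster")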
    
    # Test 3: Rate limiting
    print("\n🛑 Test 3: Rate Limiting")
    print("-" * 30)
    
    # Create rapid fire requests to test rate limiting
    rate_limiter = RateLimiter(max_calls=3, time_window=2)
    
    print("Testing rate limiter (3 calls per 2 seconds):")
    for i in range(6):
        allowed = rate_limiter.acquire()
        print(f"   Request {i+1}: {'✅ Allowed' if allowed else '❌ Rate limited'}")
        if not allowed:
            wait_time = rate_limiter.wait_time()
            print(f"     Wait time: {wait_time:.1f}s")
    
    # Test 4: Circuit breaker simulation
    print("\n⚡ Test 4: Circuit Breaker Resilience")
    print("-" * 40)
    
    # Simulate service failures
    failing_circuit = CircuitBreaker(failure_threshold=2, timeout=1)
    
    def failing_service():
        """Always fails for testing"""
        raise requests.RequestException("Simulated service failure")
    
    print("Simulating repeated service failures:")
    for i in range(5):
        try:
            failing_circuit.call(failing_service)
            print(f"   Call {i+1}: ✅ Success")
        except Exception as e:
            status = failing_circuit.get_status()
            print(f"   Call {i+1}: ❌ {e} (State: {status['state']})")
    
    # Test 5: Performance statistics
    print("\n📊 Test 5: Performance Statistics")
    print("-" * 35)
    
    stats = orchestrator.get_performance_stats()
    
    print("Currency Converter:")
    currency_stats = stats['currency_converter']
    print(f"   API calls: {currency_stats['api_calls']}")
    print(f"   Cache hit rate: {currency_stats['cache_hit_rate']:.1%}")
    print(f"   Money saved: ${currency_stats['money_saved']:.4f}")
    
    print("\nVAT Validator:")
    vat_stats = stats['vat_validator']
    print(f"   API calls: {vat_stats['api_calls']}")
    print(f"   Cache hit rate: {vat_stats['cache_hit_rate']:.1%}")
    print(f"   Money saved: ${vat_stats['money_saved']:.4f}")
    
    print("\nCircuit Breakers:")
    currency_circuit = stats['currency_circuit_breaker']
    vat_circuit = stats['vat_circuit_breaker']
    print(f"   Currency CB: {currency_circuit['state']} (success rate: {currency_circuit['success_rate']:.1%})")
    print(f"   VAT CB: {vat_circuit['state']} (success rate: {vat_circuit['success_rate']:.1%})")
    
    # Test 6: Error handling
    print("\n🚨 Test 6: Error Handling")
    print("-" * 25)
    
    # Test with malformed invoice
    malformed_invoice = {
        "invoice_id": "MALFORMED-001",
        "amount": "not_a_number",
        "currency": "INVALID_CURRENCY",
        "vat_number": "CLEARLY_INVALID"
    }
    
    try:
        result = orchestrator.enhance_invoice_data(malformed_invoice)
        errors = result['enhancement_metadata']['errors']
        warnings = result['enhancement_metadata']['warnings']
        
        print(f"   Errors handled: {len(errors)}")
        print(f"   Warnings generated: {len(warnings)}")
        print(f"   Processing completed: {'✅' if result else '❌'}")
        
    except Exception as e:
        print(f"   ❌ Unhandled error: {e}")
    
    print("\n🎯 PRODUCTION TESTING COMPLETE")
    print("✅ Resilience patterns exercised (check the output above for failures)")
    print("✅ Error handling exercised")
    print("✅ Performance metrics collected")
    print("✅ Cost tracking operational")

# Run the complete test suite
run_production_tests()

## Lab Completion and Self-Assessment

### What You've Built

Congratulations! You've built a production-ready invoice enhancement system with:

1. **Production-Ready API Tools**
   - Currency converter with live exchange rates
   - VAT validator using web search
   - Comprehensive error handling and fallback strategies

2. **Resilience Patterns**
   - Circuit breakers for failure protection
   - Rate limiting for quota management
   - Intelligent caching for cost optimization

3. **Enterprise Features**
   - Cost tracking and optimization
   - Performance monitoring
   - Batch processing capabilities
   - Comprehensive testing suite

### Self-Assessment Questions

Rate your understanding (1-5 scale) and provide brief explanations:

1. **API Integration** (1-5): ___
   - How do you balance API costs with system reliability?
   - What factors determine appropriate cache TTL values?

2. **Circuit Breaker Pattern** (1-5): ___
   - When should a circuit breaker move to open, and when to half-open?
   - How do you tune failure thresholds for production?

3. **Rate Limiting** (1-5): ___
   - What's the difference between rate limiting and throttling?
   - How do you handle rate limits across multiple services?

4. **Caching Strategy** (1-5): ___
   - What types of data should and shouldn't be cached?
   - How do you invalidate stale cache entries?

5. **Error Handling** (1-5): ___
   - How do you distinguish between retryable and non-retryable errors?
   - What information should be logged for debugging failures?

### Key Production Patterns Learned

**Circuit Breaker Benefits:**
- Prevents cascade failures across services
- Reduces unnecessary API calls during outages
- Enables automatic recovery testing
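
The three benefits above follow from the breaker's state transitions. The block below is a minimal sketch of those transitions, not the `CircuitBreaker` class built earlier in this lab: the class name `MiniCircuitBreaker`, its parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time


class MiniCircuitBreaker:
    """Illustrative CLOSED -> OPEN -> HALF_OPEN breaker (hypothetical helper)."""

    def __init__(self, failure_threshold=2, recovery_timeout=1.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock          # injectable so tests do not need to sleep
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.recovery_timeout:
                # Fail fast: the service is not contacted during an outage
                raise RuntimeError("circuit open: call rejected")
            # Timeout elapsed: let one trial call through (recovery test)
            self.state = "HALF_OPEN"
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count
        self.state = "CLOSED"
        self.failures = 0
        return result
```

A failed trial call in the half-open state reopens the circuit immediately, which is why a breaker passes through half-open rather than resting there.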

**Caching Strategy:**
- Can cut API costs substantially (roughly 50-80% when most lookups repeat within the TTL)
- Significantly improved response times
- Reduced load on external services

**Rate Limiting:**
- Protects against quota exhaustion
- Enables fair resource allocation
- Prevents service degradation

**Cost Optimization:**
- Real-time cost tracking enables budgeting
- Cache hit rates directly impact costs
- Batch processing reduces per-request overhead
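
To make the link between cache hit rate and cost concrete, here is a small arithmetic sketch. The function name `expected_api_cost`, the request volume, and the per-call price are hypothetical values chosen for illustration, not figures from the lab's APIs.

```python
def expected_api_cost(total_requests: int, cache_hit_rate: float, cost_per_call: float) -> float:
    """Only cache misses reach the paid API, so cost scales with (1 - hit rate)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return total_requests * (1.0 - cache_hit_rate) * cost_per_call


# Hypothetical workload: 10,000 lookups at $0.001 per API call
baseline = expected_api_cost(10_000, cache_hit_rate=0.0, cost_per_call=0.001)
cached = expected_api_cost(10_000, cache_hit_rate=0.7, cost_per_call=0.001)
savings = 1 - cached / baseline

print(f"Without cache: ${baseline:.2f}")
print(f"With 70% hit rate: ${cached:.2f}")
print(f"Savings: {savings:.0%}")
```

Under this simple model the fractional saving equals the hit rate, so raising the hit rate (for example with a longer TTL on slowly changing data) is the main lever.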

### Common Production Issues

**Cache Stampede:**
- Many concurrent requests refetch the same expired entry at once
- Solution: Cache warming and distributed locking
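
Within a single process, the locking half of that solution can be sketched with one lock per cache key, so only the first caller refreshes an expired entry while the rest wait and reuse the result. `SingleFlightCache` is a hypothetical helper, not part of the lab code, and a multi-instance deployment would need a shared lock (for example in Redis) in place of `threading.Lock`.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SingleFlightCache:
    """TTL cache where concurrent misses for one key trigger a single load."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl_seconds = ttl_seconds
        self._data: Dict[str, Tuple[float, Any]] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._registry_lock = threading.Lock()

    def _fresh(self, key: str) -> Optional[Tuple[float, Any]]:
        entry = self._data.get(key)
        if entry is not None and time.monotonic() - entry[0] < self.ttl_seconds:
            return entry
        return None

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        # One lock per key: unrelated keys never block each other
        with self._registry_lock:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have refreshed while we waited
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            value = loader()
            self._data[key] = (time.monotonic(), value)
            return value
```

The second freshness check inside the lock is what prevents the stampede: threads that queued behind the first loader find the new entry and skip the API call.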

**Circuit Breaker Tuning:**
- Too sensitive: Unnecessary service blocks
- Too loose: Cascade failures not prevented
- Solution: Monitor and adjust based on service characteristics

**Rate Limit Coordination:**
- Multiple service instances sharing quotas
- Solution: Centralized rate limiting or distributed algorithms

### Advanced Extensions

If you completed the lab early, consider these enhancements:

1. **Exponential Backoff with Jitter**
   - Implement retry logic with randomized delays
   - Reduce thundering herd problems

2. **Request Queuing**
   - Queue requests during rate limiting
   - Process when tokens become available

3. **Metrics Collection**
   - Export metrics to Prometheus/CloudWatch
   - Create alerting on SLA violations

4. **Health Checks**
   - Implement comprehensive health endpoints
   - Include dependency health status
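
As a starting point for the first extension, the sketch below retries a call with exponential backoff and "full jitter" (a uniformly random delay between zero and the current cap). The function name `retry_with_backoff`, its defaults, and the choice of retryable exception types are assumptions; adapt `retry_on` to the errors your HTTP client raises.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep, rng=random.random):
    """Call func(), retrying only on exceptions listed in retry_on."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter spreads clients out so they do not retry in lockstep
            sleep(rng() * cap)
```

Exceptions outside `retry_on` (for example a `ValueError` from a malformed VAT number) propagate immediately, which keeps non-retryable errors from consuming quota.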

### Integration with LangGraph

Your tools can be easily integrated into LangGraph workflows:

```python
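# Add to an existing StateGraph named `workflow`. `currency_converter_node` and
# `vat_validator_node` are node functions you define to wrap the tools above.
workflow.add_node("enhance_invoice", orchestrator.enhance_invoice_data)
workflow.add_node("convert_currency", currency_converter_node)
workflow.add_node("validate_vat", vat_validator_node)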

# Add conditional routing based on enhancement results
workflow.add_conditional_edges(
    "enhance_invoice",
    lambda state: "approve" if state["risk_assessment"]["risk_level"] == "LOW" else "review",
    {"approve": "auto_approve", "review": "manual_review"}
)
```

### Next Steps

To further your API integration expertise:

1. **Learn Advanced Patterns**
   - Bulkhead pattern for resource isolation
   - Saga pattern for distributed transactions

2. **Monitor in Production**
   - Set up comprehensive monitoring
   - Create runbooks for common failures

3. **Scale Considerations**
   - Implement distributed caching (Redis)
   - Use message queues for async processing
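
The bulkhead pattern mentioned above can be sketched with a bounded semaphore per dependency, so a slow VAT lookup cannot occupy every worker and starve currency conversion. `Bulkhead` is a hypothetical helper and the limits shown are arbitrary.

```python
import threading


class Bulkhead:
    """Caps concurrent calls to one dependency; extra callers are rejected."""

    def __init__(self, name: str, max_concurrent: int):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: reject right away instead of queueing
        if not self._slots.acquire(blocking=False):
            raise RuntimeError(f"bulkhead '{self.name}' is full")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


# One compartment per external service (limits are illustrative)
currency_bulkhead = Bulkhead("currency", max_concurrent=5)
vat_bulkhead = Bulkhead("vat", max_concurrent=2)
```

Rejecting instead of blocking keeps latency predictable; a caller can treat the rejection like any other service failure and fall back or retry later.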

**Congratulations!** You've built enterprise-grade API integration tools that can handle real production workloads reliably and cost-effectively!

In [None]:
import chromadb
from sentence_transformers import SentenceTransformer


class ChromaVectorMemory:
    """Semantic memory management with Chroma for context-aware processing"""
    
    def __init__(self, collection_name="invoice_processing_lab"):
        """Initialize Chroma vector database for semantic memory"""
        # TODO: Initialize Chroma client and collection
        try:
            self.client = chromadb.Client()
            self.collection_name = collection_name
            
            # Create or get collection
            try:
                self.collection = self.client.get_collection(collection_name)
                print(f"✅ Connected to existing Chroma collection: {collection_name}")
            except Exception:
                self.collection = self.client.create_collection(collection_name)
                print(f"✅ Created new Chroma collection: {collection_name}")
            
            # Initialize embedding model
            self.embedding_model = SentenceTransformer('all-MiniLM-L6-v2')
            print("✅ Sentence transformer loaded")
            
        except Exception as e:
            print(f"❌ Chroma initialization failed: {e}")
            self.client = None
            self.collection = None
            self.embedding_model = None
    
    def store_processing_context(self, invoice_data: Dict[str, Any], 
                               processing_result: Dict[str, Any], 
                               session_id: str) -> str:
        """Store processing context for semantic retrieval"""
        # TODO: Create and store semantic context document
        if self.collection is None:
            return ""
        
        try:
            # Create context document
            context_parts = []
            
            # Invoice details
            vendor = invoice_data.get('vendor', 'unknown vendor')
            amount = invoice_data.get('amount', 0)
            currency = invoice_data.get('currency', 'USD')
            context_parts.append(f"Invoice from {vendor} for {amount} {currency}")
            
            # Processing results
            risk_assessment = processing_result.get('risk_assessment', {})
            context_parts.append(f"Risk level: {risk_assessment.get('risk_level', 'unknown')}")
            
            # Services used
            metadata = processing_result.get('enhancement_metadata', {})
            services = metadata.get('services_used', [])
            if services:
                context_parts.append(f"Services used: {', '.join(services)}")
            
            # Combine into document
            document = " | ".join(context_parts)
            
            # Generate unique ID
            doc_id = f"{session_id}_{invoice_data.get('invoice_id', 'unknown')}_{int(time.time())}"
            
            # Create metadata
            doc_metadata = {
                "session_id": session_id,
                "invoice_id": invoice_data.get('invoice_id', ''),
                "vendor": vendor,
                "amount": amount,
                "currency": currency,
                "risk_level": risk_assessment.get('risk_level', 'unknown'),
                "processing_time": metadata.get('processing_time_seconds', 0),
                "timestamp": datetime.now().isoformat()
            }
            
            # Store in Chroma
            self.collection.add(
                documents=[document],
                ids=[doc_id],
                metadatas=[doc_metadata]
            )
            
            print(f"💾 Stored semantic context: {doc_id}")
            return doc_id
            
        except Exception as e:
            print(f"❌ Failed to store context: {e}")
            return ""
    
    def find_similar_processing(self, invoice_data: Dict[str, Any], limit: int = 3) -> List[Dict[str, Any]]:
        """Find similar invoice processing contexts"""
        # TODO: Search for similar processing contexts
        if self.collection is None:
            return []
        
        try:
            # Create query from current invoice
            vendor = invoice_data.get('vendor', 'vendor')
            amount = invoice_data.get('amount', 0)
            currency = invoice_data.get('currency', 'USD')
            
            query = f"Invoice from {vendor} for {amount} {currency}"
            
            # Search for similar contexts
            results = self.collection.query(
                query_texts=[query],
                n_results=limit,
                include=['documents', 'metadatas', 'distances']
            )
            
            # Format results
            similar_contexts = []
            if results['documents'] and results['documents'][0]:
                for i, doc in enumerate(results['documents'][0]):
                    similar_contexts.append({
                        'document': doc,
                        'metadata': results['metadatas'][0][i],
                        # Rough score: 1 - distance (can go negative with L2 distance)
                        'similarity': 1 - results['distances'][0][i]
                    })
            
            print(f"🔍 Found {len(similar_contexts)} similar processing contexts")
            return similar_contexts
            
        except Exception as e:
            print(f"❌ Failed to find similar processing: {e}")
            return []
    
    def get_processing_recommendations(self, invoice_data: Dict[str, Any]) -> Dict[str, Any]:
        """Get AI-powered processing recommendations based on similar contexts"""
        # TODO: Generate recommendations from similar contexts
        similar_contexts = self.find_similar_processing(invoice_data, limit=5)
        
        if not similar_contexts:
            return {"message": "No similar processing contexts found"}
        
        # Analyze patterns from similar contexts
        recommendations = {
            "processing_suggestions": [],
            "risk_insights": [],
            "api_recommendations": [],
            "confidence_score": 0.0
        }
        
        # Extract patterns
        high_risk_count = 0
        avg_processing_time = 0
        common_currencies = set()
        
        for context in similar_contexts:
            metadata = context['metadata']
            
            # Risk patterns
            if metadata.get('risk_level') == 'HIGH':
                high_risk_count += 1
            
            # Performance patterns
            avg_processing_time += metadata.get('processing_time', 0)
            
            # Currency patterns
            common_currencies.add(metadata.get('currency', 'USD'))
        
        avg_processing_time = avg_processing_time / len(similar_contexts) if similar_contexts else 0
        high_risk_rate = high_risk_count / len(similar_contexts)
        
        # Generate recommendations
        if high_risk_rate > 0.5:
            recommendations["risk_insights"].append(f"Vendor has {high_risk_rate:.1%} high-risk rate - consider extra review")
        
        if avg_processing_time > 2.0:
            recommendations["api_recommendations"].append("Previous processing was slow - ensure caching is enabled")
        
        if len(common_currencies) > 1:
            recommendations["processing_suggestions"].append(f"Vendor uses multiple currencies: {list(common_currencies)}")
        
        # Calculate confidence
        avg_similarity = sum(c['similarity'] for c in similar_contexts) / len(similar_contexts)
        recommendations["confidence_score"] = avg_similarity
        
        print(f"🎯 Generated recommendations from {len(similar_contexts)} similar contexts")
        return recommendations
    
    def analyze_vendor_patterns(self, vendor_name: str) -> Dict[str, Any]:
        """Analyze processing patterns for a specific vendor"""
        # TODO: Query vendor-specific patterns
        if self.collection is None:
            return {"error": "Vector memory not available"}
        
        try:
            # Query by vendor metadata
            results = self.collection.query(
                query_texts=[f"Invoice from {vendor_name}"],
                n_results=10,
                where={"vendor": vendor_name},
                include=['documents', 'metadatas']
            )
            
            if not results['documents'] or not results['documents'][0]:
                return {"message": f"No processing history found for {vendor_name}"}
            
            # Analyze vendor patterns
            total_invoices = len(results['documents'][0])
            risk_levels = [meta.get('risk_level', 'UNKNOWN') for meta in results['metadatas'][0]]
            amounts = [meta.get('amount', 0) for meta in results['metadatas'][0]]
            currencies = [meta.get('currency', 'USD') for meta in results['metadatas'][0]]
            
            return {
                "vendor_name": vendor_name,
                "total_invoices_processed": total_invoices,
                "risk_distribution": {level: risk_levels.count(level) for level in set(risk_levels)},
                "amount_range": {"min": min(amounts), "max": max(amounts), "avg": sum(amounts) / len(amounts)},
                "currencies_used": list(set(currencies)),
                "processing_confidence": "HIGH" if total_invoices >= 5 else "MEDIUM" if total_invoices >= 2 else "LOW"
            }
            
        except Exception as e:
            return {"error": str(e)}

# Test Chroma Vector Memory
print("🧪 Testing Chroma Vector Memory...")
chroma_memory = ChromaVectorMemory()

# Test context storage (mock data)
test_invoice = {
    "invoice_id": "TEST-001",
    "vendor": "Test Vendor Inc.",
    "amount": 5000.0,
    "currency": "USD"
}

test_result = {
    "risk_assessment": {"risk_level": "LOW"},
    "enhancement_metadata": {
        "services_used": ["currency_converter"],
        "processing_time_seconds": 1.5
    }
}

if chroma_memory.collection is not None:
    context_id = chroma_memory.store_processing_context(test_invoice, test_result, "test_session")
    recommendations = chroma_memory.get_processing_recommendations(test_invoice)
    print(f"Recommendations: {recommendations}")

print("✅ Chroma Vector Memory implementation complete!")

In [None]:
import duckdb


class DuckDBMemoryManager:
    """Structured memory management with DuckDB for invoice processing analytics"""
    
    def __init__(self, db_path=":memory:"):
        """Initialize DuckDB connection and create tables"""
        # TODO: Initialize DuckDB connection and create memory tables
        try:
            self.conn = duckdb.connect(db_path)
            self._setup_tables()
            print(f"✅ DuckDB memory manager connected: {db_path}")
        except Exception as e:
            print(f"❌ DuckDB initialization failed: {e}")
            self.conn = None
    
    def _setup_tables(self):
        """Create tables for structured memory"""
        if not self.conn:
            return
        
        # TODO: Create invoice processing sessions table
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS processing_sessions (
                session_id VARCHAR PRIMARY KEY,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                user_id VARCHAR,
                total_invoices INTEGER DEFAULT 0,
                total_cost DECIMAL(10,4) DEFAULT 0.0,
                session_metadata JSON
            )
        """)
        
        # TODO: Create invoice processing records table
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS invoice_records (
                record_id VARCHAR PRIMARY KEY,
                session_id VARCHAR,
                invoice_id VARCHAR,
                vendor_name VARCHAR,
                amount DECIMAL(15,2),
                currency VARCHAR(3),
                processing_time DECIMAL(8,3),
                risk_level VARCHAR,
                apis_used VARCHAR[],
                success BOOLEAN,
                processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                FOREIGN KEY (session_id) REFERENCES processing_sessions(session_id)
            )
        """)
        
        # TODO: Create vendor analytics table
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS vendor_analytics (
                vendor_id VARCHAR PRIMARY KEY,
                vendor_name VARCHAR,
                avg_amount DECIMAL(15,2),
                total_invoices INTEGER,
                avg_processing_time DECIMAL(8,3),
                typical_currency VARCHAR(3),
                risk_frequency DECIMAL(3,2),
                last_seen TIMESTAMP
            )
        """)
        
        print("✅ DuckDB memory tables created")
    
    def start_session(self, session_id: str, user_id: str = "demo_user") -> bool:
        """Start a new processing session"""
        # TODO: Create new session record
        if not self.conn:
            return False
        
        try:
            self.conn.execute("""
                INSERT INTO processing_sessions (session_id, user_id, session_metadata)
                VALUES (?, ?, ?)
            """, [session_id, user_id, json.dumps({"started_at": datetime.now().isoformat()})])
            
            print(f"📝 Started session: {session_id}")
            return True
        except Exception as e:
            print(f"❌ Failed to start session: {e}")
            return False
    
    def record_processing(self, session_id: str, invoice_data: Dict[str, Any], 
                         processing_result: Dict[str, Any]) -> str:
        """Record invoice processing details"""
        # TODO: Store processing record and update analytics
        if not self.conn:
            return ""
        
        record_id = f"rec_{int(time.time() * 1000)}"
        
        try:
            # Extract processing details
            metadata = processing_result.get('enhancement_metadata', {})
            processing_time = metadata.get('processing_time_seconds', 0)
            risk_assessment = processing_result.get('risk_assessment', {})
            
            # Insert processing record
            self.conn.execute("""
                INSERT INTO invoice_records 
                (record_id, session_id, invoice_id, vendor_name, amount, currency, 
                 processing_time, risk_level, apis_used, success)
                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """, [
                record_id, session_id,
                invoice_data.get('invoice_id', ''),
                invoice_data.get('vendor', ''),
                invoice_data.get('amount', 0),
                invoice_data.get('currency', 'USD'),
                processing_time,
                risk_assessment.get('risk_level', 'UNKNOWN'),
                metadata.get('services_used', []),
                len(metadata.get('errors', [])) == 0
            ])
            
            # Update vendor analytics
            self._update_vendor_analytics(invoice_data, processing_result)
            
            print(f"💾 Recorded processing: {record_id}")
            return record_id
            
        except Exception as e:
            print(f"❌ Failed to record processing: {e}")
            return ""
    
    def _update_vendor_analytics(self, invoice_data: Dict[str, Any], processing_result: Dict[str, Any]):
        """Update vendor analytics with new processing data"""
        # TODO: Update or insert vendor analytics
        vendor_name = invoice_data.get('vendor', '')
        if not vendor_name:
            return
        
        vendor_id = hashlib.md5(vendor_name.encode()).hexdigest()[:16]
        amount = invoice_data.get('amount', 0)
        currency = invoice_data.get('currency', 'USD')
        processing_time = processing_result.get('enhancement_metadata', {}).get('processing_time_seconds', 0)
        is_high_risk = processing_result.get('risk_assessment', {}).get('risk_level') == 'HIGH'
        
        # Check if vendor exists
        existing = self.conn.execute("""
            SELECT vendor_id FROM vendor_analytics WHERE vendor_id = ?
        """, [vendor_id]).fetchone()
        
        if existing:
            # Update existing vendor with true running means; every SET
            # expression sees the pre-update value of total_invoices
            self.conn.execute("""
                UPDATE vendor_analytics 
                SET avg_amount = (avg_amount * total_invoices + ?) / (total_invoices + 1),
                    total_invoices = total_invoices + 1,
                    avg_processing_time = (avg_processing_time * total_invoices + ?) / (total_invoices + 1),
                    risk_frequency = (risk_frequency * total_invoices + ?) / (total_invoices + 1),
                    last_seen = CURRENT_TIMESTAMP
                WHERE vendor_id = ?
            """, [amount, processing_time, 1.0 if is_high_risk else 0.0, vendor_id])
        else:
            # Create new vendor
            self.conn.execute("""
                INSERT INTO vendor_analytics 
                (vendor_id, vendor_name, avg_amount, total_invoices, avg_processing_time, 
                 typical_currency, risk_frequency, last_seen)
                VALUES (?, ?, ?, 1, ?, ?, ?, CURRENT_TIMESTAMP)
            """, [vendor_id, vendor_name, amount, processing_time, currency, 1.0 if is_high_risk else 0.0])
    
    def get_vendor_insights(self, vendor_name: str) -> Dict[str, Any]:
        """Get insights about vendor processing patterns"""
        # TODO: Query vendor analytics and processing history
        if not self.conn:
            return {"error": "Database not available"}
        
        try:
            vendor_id = hashlib.md5(vendor_name.encode()).hexdigest()[:16]
            
            # Get vendor analytics
            vendor_data = self.conn.execute("""
                SELECT * FROM vendor_analytics WHERE vendor_id = ?
            """, [vendor_id]).fetchone()
            
            if not vendor_data:
                return {"message": "No historical data for this vendor"}
            
            # Convert to dict
            columns = [desc[0] for desc in self.conn.description]
            vendor_dict = dict(zip(columns, vendor_data))
            
            # Get recent processing history (DuckDB interval syntax;
            # SQLite's datetime('now', '-30 days') is not available here)
            recent_history = self.conn.execute("""
                SELECT COUNT(*) as recent_count, AVG(processing_time) as avg_time
                FROM invoice_records 
                WHERE vendor_name = ? AND processed_at > CURRENT_TIMESTAMP - INTERVAL '30 days'
            """, [vendor_name]).fetchone()
            
            vendor_dict['recent_activity'] = {
                'invoices_last_30_days': recent_history[0] if recent_history else 0,
                'avg_processing_time_recent': recent_history[1] if recent_history else 0
            }
            
            return vendor_dict
            
        except Exception as e:
            return {"error": str(e)}
    
    def get_session_analytics(self, session_id: str) -> Dict[str, Any]:
        """Get comprehensive session analytics"""
        # TODO: Generate session analytics report
        if not self.conn:
            return {"error": "Database not available"}
        
        try:
            # Session summary
            session_stats = self.conn.execute("""
                SELECT 
                    COUNT(*) as total_invoices,
                    AVG(amount) as avg_amount,
                    SUM(amount) as total_amount,
                    AVG(processing_time) as avg_processing_time,
                    COUNT(DISTINCT vendor_name) as unique_vendors,
                    COUNT(DISTINCT currency) as currencies_used,
                    SUM(CASE WHEN success THEN 1 ELSE 0 END) as successful_processes
                FROM invoice_records 
                WHERE session_id = ?
            """, [session_id]).fetchone()
            
            if not session_stats or session_stats[0] == 0:
                return {"message": "No processing data for this session"}
            
            # Risk breakdown
            risk_breakdown = self.conn.execute("""
                SELECT risk_level, COUNT(*) as count
                FROM invoice_records 
                WHERE session_id = ?
                GROUP BY risk_level
            """, [session_id]).fetchall()
            
            return {
                'total_invoices': session_stats[0],
               'avg_amount': session_stats[1],\n                'total_amount': session_stats[2],\n                'avg_processing_time': session_stats[3],\n                'unique_vendors': session_stats[4],\n                'currencies_used': session_stats[5],\n                'success_rate': session_stats[6] / session_stats[0] if session_stats[0] > 0 else 0,\n                'risk_breakdown': {level: count for level, count in risk_breakdown}\n            }\n            \n        except Exception as e:\n            return {\"error\": str(e)}\n\n# Test DuckDB Memory Manager\nprint(\"🧪 Testing DuckDB Memory Manager...\")\nduckdb_memory = DuckDBMemoryManager()\n\n# Test session creation\nsession_id = f\"test_session_{int(time.time())}\"\nduckdb_memory.start_session(session_id, \"test_user\")\n\nprint(\"✅ DuckDB Memory Manager implementation complete!\")

In [None]:
# Install memory system dependencies
print("📦 Installing memory system dependencies...")
!pip install -q duckdb==0.9.2 chromadb==0.4.24 sentence-transformers==2.2.2

# Import memory system libraries
import duckdb
import chromadb
from sentence_transformers import SentenceTransformer
import sqlite3  # fallback for compatibility

print("✅ Memory system libraries loaded")

## Task 6: Memory Systems Integration (12 minutes)

Implement DuckDB and Chroma memory systems to enable intelligent context management and cross-session learning.

**Your Task:** Complete the memory-integrated invoice processor that learns from past processing patterns.

## Lab Completion and Self-Assessment

### What You've Built

Congratulations! You've built a production-ready invoice enhancement system with:

1. **Production-Ready API Tools**
   - Currency converter with live exchange rates
   - VAT validator using web search
   - Comprehensive error handling and fallback strategies

2. **Resilience Patterns**
   - Circuit breakers for failure protection
   - Rate limiting for quota management
   - Intelligent caching for cost optimization

3. **Enterprise Features**
   - Cost tracking and optimization
   - Performance monitoring
   - Batch processing capabilities
   - Comprehensive testing suite

### Self-Assessment Questions

Rate your understanding (1-5 scale) and provide brief explanations:

1. **API Integration** (1-5): ___
   - How do you balance API costs with system reliability?
   - What factors determine appropriate cache TTL values?

2. **Circuit Breaker Pattern** (1-5): ___
   - When should a circuit breaker trip open, and when should it move from open to half-open?
   - How do you tune failure thresholds for production?

3. **Rate Limiting** (1-5): ___
   - What's the difference between rate limiting and throttling?
   - How do you handle rate limits across multiple services?

4. **Caching Strategy** (1-5): ___
   - What types of data should and shouldn't be cached?
   - How do you invalidate stale cache entries?

5. **Error Handling** (1-5): ___
   - How do you distinguish between retryable and non-retryable errors?
   - What information should be logged for debugging failures?

### Key Production Patterns Learned

**Circuit Breaker Benefits:**
- Prevents cascading failures across services
- Reduces unnecessary API calls during outages
- Enables automatic recovery testing
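
The three states behind these benefits can be sketched in a few lines. This is a minimal, single-threaded illustration, not the lab's own circuit breaker: the class name `SimpleCircuitBreaker`, the `CircuitOpenError` exception, and the default thresholds are assumptions chosen for the example.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleCircuitBreaker:
    """Hypothetical minimal breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED."""

    def __init__(self, failure_threshold: int = 3, recovery_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock  # injectable so tests do not need to sleep
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.recovery_timeout:
            return "HALF_OPEN"  # timeout elapsed: allow one trial call
        return "OPEN"

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "OPEN":
            # Fail fast instead of hitting a service that is known to be down
            raise CircuitOpenError("circuit open; skipping call")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

A production version would also need a lock around the state changes and would usually count only retryable errors as failures.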

**Caching Strategy:**
- Potential 50-80% API cost reduction with well-tuned TTL values
- Significantly improved response times
- Reduced load on external services
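
As a rough illustration of how a TTL drives these savings, here is a small in-memory cache that also tracks its hit rate. It is a sketch under simple assumptions (single process, no size limit, no thread safety); the name `TTLCache` and its methods are invented for this example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable for testing
        self._store: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh, otherwise call fetch() and store it."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = fetch()  # only misses reach the external API
        self._store[key] = (now, value)
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Exchange rates that change hourly tolerate a longer TTL than data that must be current on every request, so the TTL is best chosen per data type rather than globally.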

**Rate Limiting:**
- Protects against quota exhaustion
- Enables fair resource allocation
- Prevents service degradation
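
One common way to get these properties is a token bucket. The sketch below is illustrative only: `TokenBucket` and `try_acquire` are hypothetical names, and the lab's own rate limiter may use a different algorithm.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Hypothetical token bucket: refills at rate_per_second up to capacity."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable for testing
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated_at = clock()
        self._lock = threading.Lock()

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Take tokens if available; return False instead of blocking."""
        with self._lock:
            now = self.clock()
            # Refill in proportion to the time elapsed since the last call
            elapsed = now - self.updated_at
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.updated_at = now
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False
```

The capacity sets the allowed burst size and the rate sets the sustained throughput, so both can be matched to the quota of the external API.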

**Cost Optimization:**
- Real-time cost tracking enables budgeting
- Cache hit rates directly impact costs
- Batch processing reduces per-request overhead
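
The link between cache hit rate and cost is simple arithmetic: only cache misses reach the paid API. The helper below is a back-of-the-envelope sketch; the function name and the per-call price used in the example are hypothetical, not real pricing.

```python
def estimate_api_cost(total_requests: int, cache_hit_rate: float, cost_per_call: float) -> float:
    """Estimate spend when only cache misses result in a billable API call."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    billable_calls = total_requests * (1.0 - cache_hit_rate)
    return billable_calls * cost_per_call


# Hypothetical numbers for illustration only
baseline = estimate_api_cost(1000, 0.0, 0.002)
cached = estimate_api_cost(1000, 0.75, 0.002)
print(f"no cache: ${baseline:.2f}, 75% hit rate: ${cached:.2f}")
```

Under these assumptions the cost falls in direct proportion to the hit rate, which is why tracking the hit rate alongside spend is useful.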

### Common Production Issues

**Cache Stampede:**
- Multiple requests for same expired data
- Solution: Cache warming and distributed locking
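
Within a single process, the locking idea can be sketched with one lock per cache key, so that only the first caller fetches and the rest wait for its result. This is an in-process analogue of distributed locking, shown under simplifying assumptions (no TTL, no eviction); `SingleFlightCache` is a name made up for this example.

```python
import threading
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable


class SingleFlightCache:
    """Hypothetical cache that lets only one thread fetch a missing key."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}
        self._key_locks: Dict[Hashable, threading.Lock] = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects creation of per-key locks

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        if key in self._values:
            return self._values[key]
        with self._guard:
            key_lock = self._key_locks[key]
        with key_lock:
            # Double-check: another thread may have filled the cache while we waited
            if key in self._values:
                return self._values[key]
            value = fetch()
            self._values[key] = value
            return value
```

Across multiple service instances the same pattern needs a shared lock (for example in Redis) in place of `threading.Lock`.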

**Circuit Breaker Tuning:**
- Too sensitive: the breaker trips on transient errors and blocks a healthy service
- Too loose: the breaker opens too late to prevent cascading failures
- Solution: Monitor and adjust based on service characteristics

**Rate Limit Coordination:**
- Multiple service instances sharing quotas
- Solution: Centralized rate limiting or distributed algorithms

### Advanced Extensions

If you completed the lab early, consider these enhancements:

1. **Exponential Backoff with Jitter**
   - Implement retry logic with randomized delays
   - Reduce thundering herd problems

2. **Request Queuing**
   - Queue requests during rate limiting
   - Process when tokens become available

3. **Metrics Collection**
   - Export metrics to Prometheus/CloudWatch
   - Create alerting on SLA violations

4. **Health Checks**
   - Implement comprehensive health endpoints
   - Include dependency health status
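
As a starting point for the first extension, the sketch below retries a call with exponential backoff and "full jitter" (a uniformly random delay between zero and the current cap). The function name, its defaults, and the choice of which exceptions count as retryable are assumptions for illustration.

```python
import random
import time
from typing import Any, Callable, Tuple, Type


def retry_with_backoff(
    func: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], Any] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Any:
    """Call func, retrying retryable errors with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: spread retries out so clients do not retry in lockstep
            sleep(rng() * cap)
```

Exceptions outside `retry_on` (for example a validation error) propagate immediately, which keeps non-retryable failures from wasting quota.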

### Integration with LangGraph

Your tools can be easily integrated into LangGraph workflows:

```python
# Add to an existing workflow (assumes `workflow` is a LangGraph StateGraph and
# the node callables below accept the graph state and return state updates)
workflow.add_node("enhance_invoice", orchestrator.enhance_invoice_data)
workflow.add_node("convert_currency", currency_converter_node)
workflow.add_node("validate_vat", vat_validator_node)

# Route on the enhancement result; the targets "auto_approve" and
# "manual_review" must already be registered as nodes
workflow.add_conditional_edges(
    "enhance_invoice",
    lambda state: "approve" if state["risk_assessment"]["risk_level"] == "LOW" else "review",
    {"approve": "auto_approve", "review": "manual_review"}
)
```
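
A node in such a workflow is typically a function that receives the current state and returns a dictionary of updates. The sketch below shows one way a currency tool might be wrapped as a node; the factory `make_currency_node`, the state keys (`invoice`, `converted_amount`, `conversion_error`), and the `convert` callable signature are all assumptions for illustration, not the lab's actual interfaces.

```python
from typing import Any, Callable, Dict


def make_currency_node(
    convert: Callable[[float, str, str], float],
    target_currency: str = "USD",
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Wrap a conversion function as a node: state in, partial state update out."""

    def currency_converter_node(state: Dict[str, Any]) -> Dict[str, Any]:
        invoice = state.get("invoice", {})
        try:
            converted = convert(
                invoice.get("amount", 0),
                invoice.get("currency", "USD"),
                target_currency,
            )
            return {"converted_amount": converted, "conversion_error": None}
        except Exception as exc:
            # Report the failure in the state so downstream routing can react,
            # rather than aborting the whole workflow
            return {"converted_amount": None, "conversion_error": str(exc)}

    return currency_converter_node
```

Returning the error as state instead of raising keeps partial failures visible to later nodes, which matches the lab's goal of graceful degradation.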

### Next Steps

To further your API integration expertise:

1. **Learn Advanced Patterns**
   - Bulkhead pattern for resource isolation
   - Saga pattern for distributed transactions

2. **Monitor in Production**
   - Set up comprehensive monitoring
   - Create runbooks for common failures

3. **Scale Considerations**
   - Implement distributed caching (Redis)
   - Use message queues for async processing

**Congratulations!** You've built enterprise-grade API integration tools that can handle real production workloads reliably and cost-effectively!