# Use Case: Web Services Integration with Amorsize

**Interactive Tutorial for Django, Flask, and FastAPI**

This notebook demonstrates real-world patterns for integrating Amorsize into web services. You'll learn how to:
- Optimize batch processing in Django views
- Parallelize background tasks efficiently
- Integrate with Flask APIs for file processing
- Build high-performance FastAPI endpoints
- Handle production deployment considerations

## What You'll Build
1. Django: Batch order processing API
2. Flask: Image processing service
3. FastAPI: URL analysis endpoint
4. Production patterns for deployment

## Prerequisites
```bash
pip install git+https://github.com/CampbellTrevor/Amorsize.git
pip install matplotlib  # For visualizations
```

**Note**: We'll simulate web framework behavior without requiring Django/Flask/FastAPI installations for this notebook.

---
## Setup: Import Dependencies

In [None]:
import time
import json
from typing import List, Dict, Any

# Amorsize imports
from amorsize import optimize, execute

# For visualizations
try:
    import matplotlib.pyplot as plt
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
    print("⚠️ matplotlib not available - visualizations will be skipped")

print("✅ Imports successful!")
print(f"Matplotlib available: {HAS_MATPLOTLIB}")

---
## Part 1: Django Integration - Batch Processing

### Scenario: Order Processing API

You have a Django REST endpoint that needs to process multiple orders. Each order requires:
- External API call for shipping calculation
- Database updates
- Business logic validation

**Challenge**: Processing 100+ orders serially takes too long for API response times.
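
A rough estimate shows the scale of the problem. The sketch below uses the simulated costs from this notebook (about 50 ms for the shipping call plus 1 ms for the save) and an idealized model in which items are processed in waves of `workers`. The helper name `estimate_batch_seconds` is hypothetical, and the model ignores process startup, pickling, and scheduling overhead, so treat the results as a lower bound rather than a prediction.

```python
import math


def estimate_batch_seconds(n_items: int, per_item_s: float, workers: int = 1) -> float:
    """Idealized wall-clock estimate: items run in waves of `workers` at a time.

    Hypothetical helper for illustration only; real runs add overhead.
    """
    if n_items < 0 or per_item_s < 0 or workers < 1:
        raise ValueError("n_items and per_item_s must be >= 0, workers >= 1")
    waves = math.ceil(n_items / workers)
    return waves * per_item_s


# Simulated per-order cost in this notebook: network call + DB write
per_order_s = 0.05 + 0.001

serial_estimate = estimate_batch_seconds(100, per_order_s)               # about 5.1 s
parallel_estimate = estimate_batch_seconds(100, per_order_s, workers=8)  # 13 waves, about 0.66 s

print(f"Serial estimate:   {serial_estimate:.2f}s")
print(f"8-worker estimate: {parallel_estimate:.2f}s")
```

The worker count of 8 is an arbitrary example value, not a recommendation.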

In [None]:
# Simulated Django Order processing
class MockOrder:
    """Simulates a Django model"""
    def __init__(self, order_id: int):
        self.id = order_id
        self.weight = 5.0
        self.zip_code = "12345"
        self.shipping_cost = 0.0
        self.status = "pending"
        
    def save(self):
        """Simulates Django .save() method"""
        time.sleep(0.001)  # Simulate DB write

def calculate_shipping(weight: float, zip_code: str) -> float:
    """Simulates external API call"""
    time.sleep(0.05)  # Simulate network latency
    return weight * 2.5 + len(zip_code)  # Simple calculation

def process_order(order_id: int) -> Dict[str, Any]:
    """
    Process a single order - Django view helper.
    
    In real Django, you'd fetch from DB:
    order = Order.objects.get(id=order_id)
    """
    order = MockOrder(order_id)
    
    # External API call (I/O bound)
    shipping_cost = calculate_shipping(order.weight, order.zip_code)
    
    # Update order
    order.shipping_cost = shipping_cost
    order.status = "processed"
    order.save()
    
    return {
        'order_id': order_id,
        'shipping_cost': shipping_cost,
        'status': order.status
    }

print("✅ Django order processing functions defined")

### Compare Serial vs Optimized Processing

In [None]:
# Simulate batch request: 50 orders
order_ids = list(range(1, 51))

# Serial processing (baseline)
start = time.time()
serial_results = [process_order(oid) for oid in order_ids]
serial_time = time.time() - start

print(f"\n📊 Django Batch Processing Results")
print(f"Serial time: {serial_time:.2f}s")
print(f"Orders processed: {len(serial_results)}")

In [None]:
# Amorsize optimized processing
start = time.time()
optimized_results = execute(
    func=process_order,
    data=order_ids,
    verbose=True  # See optimization details
)
optimized_time = time.time() - start

speedup = serial_time / optimized_time
print(f"\n✨ Optimized Results")
print(f"Optimized time: {optimized_time:.2f}s")
print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {serial_time - optimized_time:.2f}s ({(1 - optimized_time/serial_time)*100:.1f}%)")

### Visualize the Improvement

In [None]:
if HAS_MATPLOTLIB:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    
    # Execution time comparison
    approaches = ['Serial', 'Amorsize\nOptimized']
    times = [serial_time, optimized_time]
    colors = ['#e74c3c', '#27ae60']
    
    ax1.bar(approaches, times, color=colors, alpha=0.7, edgecolor='black')
    ax1.set_ylabel('Time (seconds)', fontsize=12)
    ax1.set_title('Django Order Processing Time', fontsize=14, fontweight='bold')
    ax1.grid(axis='y', alpha=0.3)
    
    # Add value labels
    for i, (approach, t) in enumerate(zip(approaches, times)):
        ax1.text(i, t + 0.1, f'{t:.2f}s', ha='center', fontsize=10, fontweight='bold')
    
    # Speedup visualization
    ax2.bar(['Speedup'], [speedup], color='#3498db', alpha=0.7, edgecolor='black')
    ax2.axhline(y=1, color='red', linestyle='--', alpha=0.5, label='Baseline')
    ax2.set_ylabel('Speedup Factor', fontsize=12)
    ax2.set_title(f'Achieved Speedup: {speedup:.2f}x', fontsize=14, fontweight='bold')
    ax2.grid(axis='y', alpha=0.3)
    ax2.legend()
    
    # Add speedup label
    ax2.text(0, speedup + 0.1, f'{speedup:.2f}x', ha='center', fontsize=12, fontweight='bold')
    
    plt.tight_layout()
    plt.show()
else:
    print("⚠️ Install matplotlib to see visualizations: pip install matplotlib")

---
## Part 2: Flask Integration - Image Processing API

### Scenario: Batch Image Processing

A Flask endpoint receives multiple image URLs and must:
- Download images
- Process/transform them
- Return results

This is common for:
- Thumbnail generation
- Image filtering/resizing
- OCR text extraction
- ML model inference

In [None]:
# Flask image processing simulation
def download_and_process_image(image_id: int) -> Dict[str, Any]:
    """
    Simulates downloading and processing an image.
    
    In real Flask app:
    - Download from S3/URL
    - PIL/OpenCV processing
    - Upload processed image
    """
    # Simulate download (I/O bound)
    time.sleep(0.03)
    
    # Simulate processing (CPU bound)
    result = 0
    for i in range(5000):
        result += i ** 2
    
    # Simulate upload (I/O bound)
    time.sleep(0.02)
    
    return {
        'image_id': image_id,
        'processed': True,
        'thumbnail_url': f'https://cdn.example.com/thumb_{image_id}.jpg',
        'processing_score': result % 100
    }

print("✅ Flask image processing functions defined")

In [None]:
# Test with 30 images
image_ids = list(range(1, 31))

# Get optimization recommendation first
opt_result = optimize(
    func=download_and_process_image,
    data=image_ids,
    verbose=True
)

print(f"\n📊 Flask Image Processing Analysis")
print(f"Recommended workers: {opt_result.n_jobs}")
print(f"Recommended chunksize: {opt_result.chunksize}")
print(f"Expected speedup: {opt_result.estimated_speedup:.2f}x")
print(f"Workload type: {opt_result.profile.workload_type}")

In [None]:
# Execute with optimized parameters
start = time.time()
results = execute(
    func=download_and_process_image,
    data=image_ids,
    verbose=False
)
execution_time = time.time() - start

print(f"\n✨ Flask Processing Results")
print(f"Images processed: {len(results)}")
print(f"Total time: {execution_time:.2f}s")
print(f"Average per image: {execution_time/len(results)*1000:.1f}ms")
print(f"\nSample result: {json.dumps(results[0], indent=2)}")

---
## Part 3: FastAPI Integration - Async URL Analysis

### Scenario: Parallel URL Metadata Extraction

A FastAPI endpoint analyzes multiple URLs. For each request it must:
- Fetch URL content
- Extract metadata (title, description)
- Check for security issues
- Return analysis results

**Pattern**: Combine FastAPI's async with Amorsize's process parallelism.
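
One way to combine the two, sketched with the standard library only, is to hand the blocking batch call to a worker thread so the event loop stays free for other requests. In the sketch, `analyze_batch_blocking` is a hypothetical stand-in for a blocking call such as `execute(func=analyze_url, data=urls)`, and `analyze_endpoint` stands in for the body of an async route handler. Neither name comes from FastAPI or Amorsize. It assumes Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
import time
from typing import Any, Dict, List


def analyze_batch_blocking(urls: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for a blocking batch call (e.g. a parallel map)."""
    results = []
    for url in urls:
        time.sleep(0.01)  # Simulated blocking work
        results.append({"url": url, "analyzed": True})
    return results


async def analyze_endpoint(urls: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical async handler body: run the blocking batch off the event loop."""
    return await asyncio.to_thread(analyze_batch_blocking, urls)


async def heartbeat(ticks: List[float]) -> None:
    """Records timestamps to show the loop keeps running during the batch."""
    for _ in range(3):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.005)


async def main():
    ticks: List[float] = []
    urls = [f"https://example.com/page{i}" for i in range(5)]
    results, _ = await asyncio.gather(analyze_endpoint(urls), heartbeat(ticks))
    return results, ticks


demo_results, demo_ticks = asyncio.run(main())
print(f"Analyzed {len(demo_results)} URLs; heartbeat ticked {len(demo_ticks)} times")
```

Inside Jupyter an event loop is already running, so `asyncio.run(main())` raises an error there; use `await main()` in a cell instead.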

In [None]:
# FastAPI URL analysis simulation
import hashlib

def analyze_url(url: str) -> Dict[str, Any]:
    """
    Analyze a URL for metadata and security.
    
    In real FastAPI:
    - Use requests/httpx to fetch
    - BeautifulSoup for parsing
    - Security scanning
    """
    # Simulate HTTP fetch (I/O)
    time.sleep(0.04)
    
    # Simulate parsing and analysis (CPU)
    url_hash = hashlib.md5(url.encode()).hexdigest()
    analysis_score = sum(ord(c) for c in url_hash) % 100
    
    return {
        'url': url,
        'title': f'Page Title for {url}',
        'description': 'Sample page description',
        'security_score': analysis_score,
        'is_safe': analysis_score > 30,
        'load_time_ms': 150 + (analysis_score * 5)
    }

print("✅ FastAPI URL analysis functions defined")

In [None]:
# Test with multiple URLs
urls = [
    f'https://example.com/page{i}' 
    for i in range(1, 41)
]

# Optimize and execute
print("📊 FastAPI URL Analysis")
print(f"URLs to analyze: {len(urls)}")

start = time.time()
analysis_results = execute(
    func=analyze_url,
    data=urls,
    verbose=True
)
analysis_time = time.time() - start

# Calculate statistics
safe_urls = sum(1 for r in analysis_results if r['is_safe'])
avg_score = sum(r['security_score'] for r in analysis_results) / len(analysis_results)

print(f"\n✨ Analysis Complete")
print(f"Time taken: {analysis_time:.2f}s")
print(f"URLs analyzed: {len(analysis_results)}")
print(f"Safe URLs: {safe_urls}/{len(analysis_results)} ({safe_urls/len(analysis_results)*100:.1f}%)")
print(f"Average security score: {avg_score:.1f}/100")

---
## Part 4: Performance Comparison Across Frameworks

Let's compare the performance characteristics of all three web service patterns.

In [None]:
# Run quick benchmarks for comparison
frameworks = []
times = []
speedups = []

# Django - already measured
frameworks.append('Django\nOrders')
times.append(optimized_time)
speedups.append(speedup)

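# Flask - measure serial and optimized runs on the same data,
# so the reported speedup is measured rather than assumed
flask_data = list(range(30))
flask_serial_start = time.time()
flask_serial_results = [download_and_process_image(i) for i in flask_data]
flask_serial_time = time.time() - flask_serial_start

flask_start = time.time()
flask_results = execute(func=download_and_process_image, data=flask_data, verbose=False)
flask_time = time.time() - flask_start
flask_speedup = flask_serial_time / flask_time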

frameworks.append('Flask\nImages')
times.append(flask_time)
speedups.append(flask_speedup)

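# FastAPI - measure serial and optimized runs on the same data,
# so the reported speedup is measured rather than assumed
fastapi_urls = [f'https://example.com/page{i}' for i in range(40)]
fastapi_serial_start = time.time()
fastapi_serial_results = [analyze_url(u) for u in fastapi_urls]
fastapi_serial_time = time.time() - fastapi_serial_start

fastapi_start = time.time()
fastapi_results = execute(func=analyze_url, data=fastapi_urls, verbose=False)
fastapi_time = time.time() - fastapi_start
fastapi_speedup = fastapi_serial_time / fastapi_time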

frameworks.append('FastAPI\nURLs')
times.append(fastapi_time)
speedups.append(fastapi_speedup)

print("✅ Benchmarks complete")

In [None]:
if HAS_MATPLOTLIB:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
    
    # Execution times
    colors = ['#e74c3c', '#3498db', '#2ecc71']
    bars1 = ax1.bar(frameworks, times, color=colors, alpha=0.7, edgecolor='black')
    ax1.set_ylabel('Execution Time (seconds)', fontsize=12)
    ax1.set_title('Web Framework Processing Times (Optimized)', fontsize=14, fontweight='bold')
    ax1.grid(axis='y', alpha=0.3)
    
    # Add value labels
    for i, (bar, t) in enumerate(zip(bars1, times)):
        ax1.text(bar.get_x() + bar.get_width()/2, t + 0.05, 
                f'{t:.2f}s', ha='center', fontsize=10, fontweight='bold')
    
    # Speedups
    bars2 = ax2.bar(frameworks, speedups, color=colors, alpha=0.7, edgecolor='black')
    ax2.axhline(y=1, color='red', linestyle='--', alpha=0.5, label='No speedup')
    ax2.set_ylabel('Speedup Factor', fontsize=12)
    ax2.set_title('Achieved Speedups vs Serial', fontsize=14, fontweight='bold')
    ax2.grid(axis='y', alpha=0.3)
    ax2.legend()
    
    # Add speedup labels
    for i, (bar, s) in enumerate(zip(bars2, speedups)):
        ax2.text(bar.get_x() + bar.get_width()/2, s + 0.1, 
                f'{s:.2f}x', ha='center', fontsize=10, fontweight='bold')
    
    plt.tight_layout()
    plt.show()
    
else:
    print("⚠️ Install matplotlib for visualizations")

# Print the summary once, whether or not the plots were drawn
print("\n📊 Summary Statistics:")
for fw, t, s in zip(['Django', 'Flask', 'FastAPI'], times, speedups):
    print(f"{fw:8s} - Time: {t:5.2f}s, Speedup: {s:.2f}x")

---
## Part 5: Production Deployment Patterns

### Pattern 1: Resource-Aware Processing

In [None]:
# Production pattern: Respect server resources
from typing import Callable

from amorsize import optimize
from amorsize.system_info import get_current_cpu_load, get_available_memory

def production_batch_process(data: List[Any], func: Callable[[Any], Any]) -> List[Any]:
    """
    Production-ready batch processing that respects system resources.
    """
    # Check system health
    cpu_load = get_current_cpu_load()
    available_memory = get_available_memory()
    
    print(f"📊 System Health Check:")
    print(f"   CPU Load: {cpu_load*100:.1f}%")
    print(f"   Available Memory: {available_memory / (1024**3):.2f} GB")
    
    # Optimize based on current conditions
    result = optimize(
        func=func,
        data=data,
        verbose=False
    )
    
    # Conservative approach in production
    if cpu_load > 0.7:  # High load
        print("⚠️  High CPU load detected - reducing workers")
        n_jobs = max(1, result.n_jobs // 2)
    else:
        n_jobs = result.n_jobs
    
    print(f"✅ Using {n_jobs} workers (recommended: {result.n_jobs})")
    
    # Execute with adjusted parameters
    from multiprocessing import Pool
    with Pool(n_jobs) as pool:
        results = pool.map(func, data, chunksize=result.chunksize)
    
    return results

# Test production pattern
test_data = list(range(1, 21))
prod_results = production_batch_process(test_data, process_order)
print(f"\n✅ Processed {len(prod_results)} items in production mode")

### Pattern 2: Error Handling and Retry Logic

In [None]:
# Production pattern: Robust error handling
from typing import Callable

def safe_process_with_retry(item: Any, func: Callable[[Any], Any], max_retries: int = 3) -> Dict[str, Any]:
    """
    Wrapper for production that handles errors gracefully.
    """
    for attempt in range(max_retries):
        try:
            result = func(item)
            return {'success': True, 'data': result, 'attempts': attempt + 1}
        except Exception as e:
            if attempt == max_retries - 1:
                # Final attempt failed
                return {
                    'success': False,
                    'error': str(e),
                    'item': item,
                    'attempts': max_retries
                }
            # Wait before retry
            time.sleep(0.1 * (2 ** attempt))  # Exponential backoff: 0.1s, 0.2s, 0.4s, ...

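# Test with wrapper. A named module-level function is used instead of a lambda
# because the standard pickle module cannot serialize lambdas, and process
# pools need to pickle the function they run.
def safe_process_order(order_id: int) -> Dict[str, Any]:
    return safe_process_with_retry(order_id, process_order)

test_items = [1, 2, 3, 4, 5]
safe_results = execute(
    func=safe_process_order,
    data=test_items,
    verbose=False
)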

successful = sum(1 for r in safe_results if r['success'])
print(f"\n✅ Processed with error handling:")
print(f"   Successful: {successful}/{len(safe_results)}")
print(f"   Average attempts: {sum(r['attempts'] for r in safe_results)/len(safe_results):.1f}")

---
## Part 6: Configuration Management for Production

Save and reuse optimal parameters to avoid repeated optimization overhead.
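
A saved configuration can go stale as the workload changes. One way to guard against that, sketched here with the standard library only, is to store a timestamp next to the parameters and treat old files as missing. The helpers `save_params` and `load_params` and the JSON layout are hypothetical; they are not Amorsize's own config format or API.

```python
import json
import os
import tempfile
import time
from typing import Any, Dict, Optional


def save_params(path: str, n_jobs: int, chunksize: int) -> None:
    """Write pool parameters plus a timestamp (hypothetical layout)."""
    payload = {"n_jobs": n_jobs, "chunksize": chunksize, "saved_at": time.time()}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def load_params(path: str, max_age_s: float) -> Optional[Dict[str, Any]]:
    """Return saved parameters, or None if the file is missing, corrupt, or too old."""
    try:
        with open(path, encoding="utf-8") as fh:
            payload = json.load(fh)
    except (OSError, json.JSONDecodeError):
        return None
    if time.time() - payload.get("saved_at", 0) > max_age_s:
        return None  # Stale: the caller should re-run the optimization
    return payload


# Demo: save example values, then load them with a one-hour freshness window
demo_path = os.path.join(tempfile.mkdtemp(), "pool_params.json")
save_params(demo_path, n_jobs=4, chunksize=8)  # Example values only

params = load_params(demo_path, max_age_s=3600)
if params is None:
    print("No fresh parameters found; re-optimize before processing")
else:
    print(f"Reusing n_jobs={params['n_jobs']}, chunksize={params['chunksize']}")
```

A `None` result is the signal to call `optimize()` again and save the new values.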

In [None]:
# Save configuration for production use
from amorsize import save_config, load_config

# Optimize once during deployment/setup
config_result = optimize(
    func=process_order,
    data=list(range(1, 51)),
    verbose=False
)

# Save to configuration file
config_path = '/tmp/web_service_config.json'
config_result.save_config(config_path)
print(f"✅ Configuration saved to: {config_path}")

# In production, load and reuse
loaded_config = load_config(config_path)
print(f"\n📋 Loaded Configuration:")
print(f"   Workers: {loaded_config.n_jobs}")
print(f"   Chunksize: {loaded_config.chunksize}")
print(f"   Expected Speedup: {loaded_config.estimated_speedup:.2f}x")

# Use in production without re-optimizing
from multiprocessing import Pool
with Pool(loaded_config.n_jobs) as pool:
    prod_orders = list(range(1, 26))
    prod_results = pool.map(process_order, prod_orders, chunksize=loaded_config.chunksize)

print(f"\n✅ Production execution complete: {len(prod_results)} orders processed")
print(f"💡 Tip: Update config periodically as workload characteristics change")

---
## Part 7: Deployment Checklist

### Pre-Deployment Testing

In [None]:
# Production readiness checklist
def check_production_readiness(func, sample_data):
    """
    Verify your function is ready for parallel processing in production.
    """
    print("🔍 Production Readiness Check\n")
    
    # 1. Picklability check: multiprocessing sends the function to workers via pickle
    print("1. Testing function picklability...")
    import pickle
    try:
        pickle.dumps(func)
        print("   ✅ Function is picklable")
    except Exception as exc:
        print(f"   ⚠️  Function is not picklable: {exc}")
    result = optimize(func=func, data=sample_data, verbose=False)
    
    # 2. Performance check
    print("\n2. Checking performance benefit...")
    if result.estimated_speedup > 1.5:
        print(f"   ✅ Good speedup expected: {result.estimated_speedup:.2f}x")
    elif result.estimated_speedup > 1.0:
        print(f"   ⚠️  Modest speedup: {result.estimated_speedup:.2f}x - consider serial")
    else:
        print(f"   ❌ No benefit: {result.estimated_speedup:.2f}x - use serial!")
    
    # 3. Resource check
    print("\n3. Checking resource requirements...")
    profile = result.profile
    memory_per_worker = profile.estimated_result_memory / (1024**2)  # MB
    total_memory = memory_per_worker * result.n_jobs
    print(f"   Memory per worker: {memory_per_worker:.1f} MB")
    print(f"   Total memory: {total_memory:.1f} MB")
    
    if total_memory < 1000:  # < 1GB
        print("   ✅ Memory usage acceptable")
    else:
        print("   ⚠️  High memory usage - monitor in production")
    
    # 4. Workload type
    print(f"\n4. Workload analysis:")
    print(f"   Type: {profile.workload_type}")
    print(f"   Recommended workers: {result.n_jobs}")
    print(f"   Recommended chunksize: {result.chunksize}")
    
    print("\n" + "="*50)
    if result.estimated_speedup > 1.5:
        print("✅ READY FOR PRODUCTION")
    elif result.estimated_speedup > 1.0:
        print("⚠️  PROCEED WITH CAUTION - Test thoroughly")
    else:
        print("❌ NOT RECOMMENDED - Use serial processing")
    print("="*50)

# Run readiness check
check_production_readiness(process_order, list(range(1, 21)))

---
## Key Takeaways

### 1. **Framework Integration is Simple**
- Django: Use `execute()` in views, no Pool management needed
- Flask: Ideal for I/O-heavy API endpoints  
- FastAPI: Combine with async for maximum performance

### 2. **Production Patterns**
- Check system resources before processing
- Implement retry logic and error handling
- Save/load configurations to avoid repeated optimization
- Monitor performance in production

### 3. **Performance Characteristics**
- **I/O-bound**: Network calls, file operations → Higher speedups, because workers overlap time spent waiting
- **CPU-bound**: Calculations, parsing → Speedups bounded by the number of available cores
- **Mixed**: Most web services → Measure to determine
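
A quick way to tell which category a function falls into is to compare CPU time with wall-clock time for a single call. The sketch below is a rough heuristic built on the standard library, not Amorsize's own workload detection. The helper name `cpu_fraction` is hypothetical.

```python
import time
from typing import Any, Callable


def cpu_fraction(func: Callable[[], Any]) -> float:
    """Fraction of wall-clock time this process spent on the CPU during func().

    Near 0 suggests I/O-bound (mostly waiting); near 1 suggests CPU-bound.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return cpu_elapsed / wall_elapsed if wall_elapsed > 0 else 0.0


def io_like() -> None:
    time.sleep(0.05)  # Simulated network wait


def cpu_like() -> int:
    return sum(i * i for i in range(1_000_000))  # Simulated computation


io_ratio = cpu_fraction(io_like)
cpu_ratio = cpu_fraction(cpu_like)
print(f"io_like  CPU fraction: {io_ratio:.2f}")
print(f"cpu_like CPU fraction: {cpu_ratio:.2f}")
```

A single call is noisy, so averaging over several representative inputs gives a more reliable picture.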

### 4. **Best Practices**
- ✅ Always test with production-like data
- ✅ Use `verbose=False` in production
- ✅ Implement monitoring and logging
- ✅ Start conservative, tune based on metrics
- ✅ Have fallback to serial processing
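
The serial fallback from the last item can be sketched as a small wrapper. Here `run_with_serial_fallback` and `parallel_runner` are hypothetical names; in a real service the runner might wrap a call such as `execute(func=..., data=...)`. The sketch assumes `func` is safe to run again on items that may already have been processed before the failure.

```python
import logging
from typing import Any, Callable, Iterable, List

logger = logging.getLogger(__name__)


def run_with_serial_fallback(
    func: Callable[[Any], Any],
    items: List[Any],
    parallel_runner: Callable[[Callable[[Any], Any], List[Any]], Iterable[Any]],
) -> List[Any]:
    """Try the parallel runner first; on any failure, process items one by one."""
    try:
        return list(parallel_runner(func, items))
    except Exception as exc:
        # Log and degrade gracefully instead of failing the whole request
        logger.warning("Parallel run failed (%s); falling back to serial", exc)
        return [func(item) for item in items]


def square(x: int) -> int:
    return x * x


def unavailable_runner(func, items):
    """Simulates a pool that cannot start."""
    raise RuntimeError("worker pool unavailable")


fallback_results = run_with_serial_fallback(square, [1, 2, 3], unavailable_runner)
print(fallback_results)
```

Because a failure partway through re-runs every item serially, this pattern fits idempotent work best.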

### 5. **Common Pitfalls to Avoid**
- ❌ Don't blindly parallelize everything
- ❌ Don't ignore system resource limits
- ❌ Don't forget error handling
- ❌ Don't use in critical request paths without testing

---

## Next Steps

**Explore More:**
- [Getting Started Notebook](01_getting_started.ipynb) - Basics
- [Performance Analysis Notebook](02_performance_analysis.ipynb) - Diagnostics
- [Parameter Tuning Notebook](03_parameter_tuning.ipynb) - Optimization

**Documentation:**
- [Web Services Guide](../../docs/USE_CASE_WEB_SERVICES.md) - Detailed patterns
- [API Reference](../../docs/API.md) - Full API documentation
- [Troubleshooting](../../docs/TROUBLESHOOTING.md) - Common issues

**Production Ready?**
1. Run the readiness check above
2. Test with production-like workload
3. Save configuration for reuse
4. Deploy with monitoring
5. Tune based on real metrics

---

**Happy optimizing! 🚀**