# 🧪 API Testing: Comprehensive KubeSentiment API Testing

This notebook tests the KubeSentiment API endpoints end to end: functional checks, error handling, load and performance testing, caching behaviour, async clients, and a small `pytest` suite.

## 🎯 Learning Objectives

By the end of this notebook, you will:
1. Understand all API endpoints and their functionality
2. Perform comprehensive functional testing
3. Conduct load and performance testing
4. Test error handling and edge cases
5. Validate API responses and schemas
6. Test caching and performance optimizations

## 📦 Setup and Dependencies

First, let's install the required dependencies and set up our environment.

In [None]:
# Install required packages for this notebook
# Note: This cell might take a few minutes to run
!pip install -r ../requirements.txt

### ✅ Version Check
Let's record the versions of the installed libraries so that the environment can be reproduced later.

In [None]:
# List installed packages to ensure reproducibility
!pip list

## 📋 API Endpoints Overview

### Core Endpoints

| Endpoint | Method | Purpose | Expected Response Time |
|----------|--------|---------|----------------------|
| `/health` | GET | Service health check | <10ms |
| `/predict` | POST | Sentiment analysis | <100ms |
| `/model-info` | GET | Model information | <20ms |
| `/metrics` | GET | Prometheus metrics | <50ms |
| `/metrics-json` | GET | JSON metrics | <50ms |
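
The response-time targets in the table can be turned into a simple lookup for later assertions. The sketch below is illustrative only: `LATENCY_BUDGET_MS` and `within_budget` are hypothetical names, and the thresholds are copied from the table rather than taken from any service-level guarantee.

```python
from typing import Optional

# Budgets in milliseconds, copied from the endpoint table (assumed targets).
LATENCY_BUDGET_MS = {
    "/health": 10,
    "/predict": 100,
    "/model-info": 20,
    "/metrics": 50,
    "/metrics-json": 50,
}


def within_budget(endpoint: str, elapsed_ms: Optional[float]) -> bool:
    """Return True when a measured latency is strictly under the endpoint's budget."""
    if elapsed_ms is None:  # the request failed before a time was recorded
        return False
    return elapsed_ms < LATENCY_BUDGET_MS[endpoint]


print(within_budget("/predict", 45.2))
print(within_budget("/health", 12.0))
```

Keeping the budgets in one dictionary means a changed target only needs to be edited in one place.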

### Request/Response Formats

**Health Check** (`GET /health` response):
```json
{
  "status": "healthy",
  "model_status": "available",
  "version": "1.0.0",
  "timestamp": 1703123456.789
}
```

**Prediction** (`POST /predict` response; the request body is `{"text": "<input text>"}`):
```json
{
  "text": "I love this product!",
  "label": "POSITIVE",
  "score": 0.9998,
  "inference_time_ms": 45.2,
  "model_name": "distilbert-base-uncased-finetuned-sst-2-english",
  "text_length": 20
}
```
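
One way to check a `/predict` payload against the sample response is a small field-and-type table. This is a sketch under assumptions: `PREDICTION_SCHEMA` and `validate_prediction` are hypothetical helpers, the field list is inferred from the sample response, and the allowed label set (`POSITIVE`/`NEGATIVE`) is assumed from the SST-2 model name rather than confirmed by the service.

```python
from typing import Any, Dict, List

# Expected fields and types, inferred from the sample prediction response.
PREDICTION_SCHEMA = {
    "text": str,
    "label": str,
    "score": (int, float),
    "inference_time_ms": (int, float),
    "model_name": str,
    "text_length": int,
}
ALLOWED_LABELS = {"POSITIVE", "NEGATIVE"}  # assumption for an SST-2 style model


def validate_prediction(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected in PREDICTION_SCHEMA.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            problems.append(f"wrong type for {field}: {type(payload[field]).__name__}")
    score = payload.get("score")
    if isinstance(score, (int, float)) and not 0.0 <= score <= 1.0:
        problems.append("score outside [0, 1]")
    label = payload.get("label")
    if isinstance(label, str) and label not in ALLOWED_LABELS:
        problems.append(f"unexpected label: {label}")
    return problems


sample = {
    "text": "I love this product!",
    "label": "POSITIVE",
    "score": 0.9998,
    "inference_time_ms": 45.2,
    "model_name": "distilbert-base-uncased-finetuned-sst-2-english",
    "text_length": 20,
}
print(validate_prediction(sample))
```

Returning a list of problems, rather than raising on the first one, makes it easier to report every mismatch in a single test run.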

In [None]:
# Setup and imports
import requests
import httpx
import json
import time
import asyncio
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from typing import List, Dict, Any, Optional
from concurrent.futures import ThreadPoolExecutor, as_completed
import warnings
warnings.filterwarnings('ignore')

# Set style
plt.style.use('default')
sns.set_palette("husl")

# Configuration
API_BASE_URL = "http://localhost:8000"
TIMEOUT = 30

print("✅ Libraries imported successfully!")
print(f"🎯 Testing API at: {API_BASE_URL}")
print(f"⏱️ Timeout: {TIMEOUT} seconds")

## 🔍 Initial Health Check

Let's start by checking if the API is available and healthy.
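
If the service is still starting (for example while the model loads), a single probe may fail spuriously. A minimal polling sketch follows; `wait_until_healthy` is a hypothetical helper, and it accepts any zero-argument callable that returns a dictionary with a `success` key, so it can be tried without a live server.

```python
import time
from typing import Any, Callable, Dict


def wait_until_healthy(
    probe: Callable[[], Dict[str, Any]],
    attempts: int = 5,
    delay_seconds: float = 1.0,
) -> bool:
    """Call `probe` until it reports success or the attempts run out."""
    for attempt in range(1, attempts + 1):
        result = probe()
        if result.get("success"):
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)  # fixed pause between retries
    return False


# Stand-in probe that fails twice and then succeeds (no network involved).
_responses = iter([{"success": False}, {"success": False}, {"success": True}])
print(wait_until_healthy(lambda: next(_responses), attempts=5, delay_seconds=0.01))
```

With the client defined in the next cell, this would be called as `wait_until_healthy(client.health_check)`.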

In [None]:
# API Client class for organized testing
class KubeSentimentAPIClient:
    """Client for testing KubeSentiment API endpoints."""
    
    def __init__(self, base_url: str, timeout: int = 30):
        self.base_url = base_url.rstrip('/')
        self.timeout = timeout
        self.session = requests.Session()
        
    def health_check(self) -> Dict[str, Any]:
        """Check service health."""
        try:
            start_time = time.time()
            response = self.session.get(f"{self.base_url}/health", timeout=self.timeout)
            response_time = (time.time() - start_time) * 1000
            
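            return {
                "success": response.status_code == 200,
                "status_code": response.status_code,
                "response_time_ms": round(response_time, 2),
                "data": response.json() if response.status_code == 200 else None,
                # Surface the response body on non-200 replies, as predict() does
                "error": response.text if response.status_code != 200 else None
            }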
        except Exception as e:
            return {
                "success": False,
                "status_code": None,
                "response_time_ms": None,
                "data": None,
                "error": str(e)
            }
    
    def predict(self, text: str) -> Dict[str, Any]:
        """Make sentiment prediction."""
        try:
            start_time = time.time()
            response = self.session.post(
                f"{self.base_url}/predict",
                json={"text": text},
                timeout=self.timeout
            )
            response_time = (time.time() - start_time) * 1000
            
            return {
                "success": response.status_code == 200,
                "status_code": response.status_code,
                "response_time_ms": round(response_time, 2),
                "data": response.json() if response.status_code == 200 else None,
                "error": response.text if response.status_code != 200 else None,
                "text": text
            }
        except Exception as e:
            return {
                "success": False,
                "status_code": None,
                "response_time_ms": None,
                "data": None,
                "error": str(e),
                "text": text
            }
    
    def get_model_info(self) -> Dict[str, Any]:
        """Get model information."""
        try:
            start_time = time.time()
            response = self.session.get(f"{self.base_url}/model-info", timeout=self.timeout)
            response_time = (time.time() - start_time) * 1000
            
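            return {
                "success": response.status_code == 200,
                "status_code": response.status_code,
                "response_time_ms": round(response_time, 2),
                "data": response.json() if response.status_code == 200 else None,
                # Surface the response body on non-200 replies, as predict() does
                "error": response.text if response.status_code != 200 else None
            }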
        except Exception as e:
            return {
                "success": False,
                "status_code": None,
                "response_time_ms": None,
                "data": None,
                "error": str(e)
            }
    
    def get_metrics(self) -> Dict[str, Any]:
        """Get metrics in JSON format."""
        try:
            start_time = time.time()
            response = self.session.get(f"{self.base_url}/metrics-json", timeout=self.timeout)
            response_time = (time.time() - start_time) * 1000
            
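            return {
                "success": response.status_code == 200,
                "status_code": response.status_code,
                "response_time_ms": round(response_time, 2),
                "data": response.json() if response.status_code == 200 else None,
                # Surface the response body on non-200 replies, as predict() does
                "error": response.text if response.status_code != 200 else None
            }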
        except Exception as e:
            return {
                "success": False,
                "status_code": None,
                "response_time_ms": None,
                "data": None,
                "error": str(e)
            }

# Initialize API client
client = KubeSentimentAPIClient(API_BASE_URL, TIMEOUT)

# Test basic connectivity
print("🔍 Initial API Health Check:")
print("=" * 50)

health_result = client.health_check()

if health_result["success"]:
    print("✅ API is healthy!")
    print(f"📊 Status: {health_result['data']['status']}")
    print(f"🤖 Model Status: {health_result['data']['model_status']}")
    print(f"🏷️ Version: {health_result['data']['version']}")
    print(f"⚡ Response Time: {health_result['response_time_ms']}ms")
    
    api_available = True
else:
    print("❌ API is not available")
    print(f"🔍 Error: {health_result['error']}")
    print("\n💡 Make sure the service is running:")
    print("   docker run -d -p 8000:8000 sentiment-service:latest")
    print("   # or")
    print("   python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload")
    
    api_available = False

## 🧪 Functional Testing

Let's perform comprehensive functional testing of all API endpoints.

In [None]:
# Functional tests for all endpoints
def run_functional_tests(client: KubeSentimentAPIClient) -> Dict[str, Any]:
    """Run comprehensive functional tests."""
    results = {
        "health_check": client.health_check(),
        "model_info": client.get_model_info(),
        "metrics": client.get_metrics(),
        "predictions": []
    }
    
    # Test predictions with various inputs
    test_texts = [
        "I love this amazing product!",
        "This is absolutely terrible.",
        "It's okay, nothing special.",
        "Outstanding customer service!",
        "Complete disaster. Never again."
    ]
    
    for text in test_texts:
        results["predictions"].append(client.predict(text))
    
    return results

# Run functional tests
if api_available:
    print("🧪 Running Functional Tests:")
    print("=" * 50)
    
    functional_results = run_functional_tests(client)
    
    # Analyze results
    tests_passed = 0
    total_tests = 0
    
    # Health check
    total_tests += 1
    if functional_results["health_check"]["success"]:
        tests_passed += 1
        print("✅ Health Check: PASSED")
    else:
        print("❌ Health Check: FAILED")
    
    # Model info
    total_tests += 1
    if functional_results["model_info"]["success"]:
        tests_passed += 1
        print("✅ Model Info: PASSED")
    else:
        print("❌ Model Info: FAILED")
    
    # Metrics
    total_tests += 1
    if functional_results["metrics"]["success"]:
        tests_passed += 1
        print("✅ Metrics: PASSED")
    else:
        print("❌ Metrics: FAILED")
    
    # Predictions
    for i, pred in enumerate(functional_results["predictions"]):
        total_tests += 1
        if pred["success"]:
            tests_passed += 1
            print(f"✅ Prediction {i+1}: PASSED")
        else:
            print(f"❌ Prediction {i+1}: FAILED")
    
    print(f"\n📊 Functional Test Results: {tests_passed}/{total_tests} PASSED")
    
    # Show sample prediction results
    if functional_results["predictions"]:
        print("\n🎯 Sample Prediction Results:")
        for i, pred in enumerate(functional_results["predictions"][:3]):
            if pred["success"] and pred["data"]:
                data = pred["data"]
                print(f"  {i+1}. '{pred['text'][:30]}...' → {data['label']} ({data['score']:.3f}) - {data['inference_time_ms']:.1f}ms")

else:
    print("⏭️ Skipping functional tests - API not available")

## 🚨 Error Handling Testing

Let's test how the API handles various error conditions and edge cases.
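
A failed request can mean that the API rejected the input or that the request never reached the API at all, and only the first says anything about input validation. The sketch below assumes the result shape used by the client in this notebook (`success`, `status_code`); `classify_result` and its category names are hypothetical.

```python
from typing import Any, Dict


def classify_result(result: Dict[str, Any]) -> str:
    """Map a client result dict to a coarse outcome category."""
    status = result.get("status_code")
    if status is None:
        return "transport_error"   # timeout, refused connection, unparseable body
    if result.get("success"):
        return "ok"
    if 400 <= status < 500:
        return "rejected"          # the API refused the input
    if status >= 500:
        return "server_error"      # the API failed while handling the input
    return "unexpected"            # any other non-200 status


print(classify_result({"success": False, "status_code": 422}))
print(classify_result({"success": False, "status_code": None}))
```

Only `rejected` shows that input validation worked; a `transport_error` on a case that expects an error would otherwise be counted as a pass.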

In [None]:
# Error handling test cases
error_test_cases = [
    {
        "name": "Empty text",
        "text": "",
        "expected_error": True
    },
    {
        "name": "Whitespace only",
        "text": "   ",
        "expected_error": True
    },
    {
        "name": "Text too long",
        "text": "a" * 10000,
        "expected_error": True
    },
    {
        "name": "Very short text",
        "text": "Hi",
        "expected_error": False
    },
    {
        "name": "Normal positive text",
        "text": "This is great!",
        "expected_error": False
    },
    {
        "name": "Normal negative text",
        "text": "This is terrible.",
        "expected_error": False
    },
    {
        "name": "Special characters",
        "text": "Hello! @#$%^&*() 🌟",
        "expected_error": False
    }
]

# Run error handling tests
if api_available:
    print("🚨 Error Handling Tests:")
    print("=" * 50)
    
    error_results = []
    
    for test_case in error_test_cases:
        result = client.predict(test_case["text"])
        
        # Determine if this behaved as expected
        actual_error = not result["success"]
        expected_error = test_case["expected_error"]
        test_passed = actual_error == expected_error
        
        error_results.append({
            "test_name": test_case["name"],
            "expected_error": expected_error,
            "actual_error": actual_error,
            "passed": test_passed,
            "status_code": result["status_code"],
            "response_time_ms": result["response_time_ms"],
            "error_message": result["error"]
        })
        
        status = "✅ PASSED" if test_passed else "❌ FAILED"
        print(f"{status} {test_case['name']}: Expected error={expected_error}, Got error={actual_error}")
        
        if not test_passed:
            if result["error"]:
                print(f"      Error: {result['error'][:100]}...")
    
    # Summary
    passed_tests = sum(1 for r in error_results if r["passed"])
    total_tests = len(error_results)
    
    print(f"\n📊 Error Handling Test Results: {passed_tests}/{total_tests} PASSED")
    
    # Show detailed results
    error_df = pd.DataFrame(error_results)
    display(error_df)

else:
    print("⏭️ Skipping error handling tests - API not available")

## ⚡ Performance Testing

Let's test the API performance under different loads.
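
Averages hide slow outliers, which is why a 95th percentile is reported alongside the mean. As a sketch of what that number means, the function below computes a percentile with linear interpolation between ranks (the default behaviour of `numpy.percentile`) using only the standard library. The name `percentile` and the sample latencies are illustrative.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between the two nearest ranks."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted data
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


latencies_ms = [40, 42, 45, 47, 50, 52, 55, 60, 70, 200]  # made-up sample
print(sum(latencies_ms) / len(latencies_ms))  # mean
print(percentile(latencies_ms, 50))           # median
print(percentile(latencies_ms, 95))           # pulled up by the slow outlier
```

In this sample a single 200 ms request barely moves the median but dominates the 95th percentile, which is the behaviour a tail-latency metric is meant to expose.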

In [None]:
# Performance testing functions
def run_load_test(client: KubeSentimentAPIClient, num_requests: int, concurrent_users: int = 1) -> Dict[str, Any]:
    """Run load test with specified parameters."""
    
    test_texts = [
        "I love this product! It's amazing.",
        "This is terrible. Complete waste of money.",
        "It's okay, nothing special but it works.",
        "Outstanding quality and great service!",
        "Awful experience. Never again."
    ]
    
    results = []
    
    def single_request(text: str) -> Dict[str, Any]:
        result = client.predict(text)
        return {
            "success": result["success"],
            "response_time_ms": result["response_time_ms"],
            "status_code": result["status_code"],
            "text_length": len(text)
        }
    
    start_time = time.time()
    
    if concurrent_users == 1:
        # Sequential requests
        for i in range(num_requests):
            text = test_texts[i % len(test_texts)]
            results.append(single_request(text))
    else:
        # Concurrent requests
        with ThreadPoolExecutor(max_workers=concurrent_users) as executor:
            futures = []
            for i in range(num_requests):
                text = test_texts[i % len(test_texts)]
                futures.append(executor.submit(single_request, text))
            
            for future in as_completed(futures):
                results.append(future.result())
    
    total_time = time.time() - start_time
    
    # Calculate metrics
    successful_requests = [r for r in results if r["success"]]
    response_times = [r["response_time_ms"] for r in successful_requests if r["response_time_ms"] is not None]
    
    return {
        "total_requests": num_requests,
        "concurrent_users": concurrent_users,
        "total_time_seconds": round(total_time, 2),
        "successful_requests": len(successful_requests),
        # Guard against division by zero when no requests were issued
        "success_rate": len(successful_requests) / num_requests if num_requests > 0 else 0,
        "avg_response_time_ms": round(np.mean(response_times), 2) if response_times else None,
        "min_response_time_ms": round(min(response_times), 2) if response_times else None,
        "max_response_time_ms": round(max(response_times), 2) if response_times else None,
        "p95_response_time_ms": round(np.percentile(response_times, 95), 2) if response_times else None,
        "requests_per_second": round(num_requests / total_time, 2) if total_time > 0 else 0
    }

# Run performance tests
if api_available:
    print("⚡ Performance Testing:")
    print("=" * 50)
    
    # Test scenarios
    test_scenarios = [
        {"name": "Light Load", "requests": 10, "concurrency": 1},
        {"name": "Medium Load", "requests": 50, "concurrency": 2},
        {"name": "Heavy Load", "requests": 100, "concurrency": 5}
    ]
    
    performance_results = []
    
    for scenario in test_scenarios:
        print(f"🧪 Running {scenario['name']} Test...")
        result = run_load_test(client, scenario["requests"], scenario["concurrency"])
        result["scenario"] = scenario["name"]
        performance_results.append(result)
        
        print(f"   ✅ Success Rate: {result['success_rate']:.1%}")
        print(f"   ⚡ Requests/sec: {result['requests_per_second']:.2f}")
        print(f"   📊 Avg Response Time: {result['avg_response_time_ms']}ms")
        print(f"   📈 P95 Response Time: {result['p95_response_time_ms']}ms")
        print()
    
    # Create performance comparison chart
    perf_df = pd.DataFrame(performance_results)
    
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    fig.suptitle('API Performance Test Results', fontsize=16)
    
    # Success rates
    axes[0, 0].bar(perf_df['scenario'], perf_df['success_rate'], color='skyblue')
    axes[0, 0].set_title('Success Rate by Test Scenario')
    axes[0, 0].set_ylabel('Success Rate')
    axes[0, 0].set_ylim(0, 1)
    
    # Response times
    axes[0, 1].bar(perf_df['scenario'], perf_df['avg_response_time_ms'], color='lightcoral')
    axes[0, 1].set_title('Average Response Time')
    axes[0, 1].set_ylabel('Response Time (ms)')
    
    # Throughput
    axes[1, 0].bar(perf_df['scenario'], perf_df['requests_per_second'], color='lightgreen')
    axes[1, 0].set_title('Requests Per Second')
    axes[1, 0].set_ylabel('Requests/sec')
    
    # P95 response times
    axes[1, 1].bar(perf_df['scenario'], perf_df['p95_response_time_ms'], color='orange')
    axes[1, 1].set_title('95th Percentile Response Time')
    axes[1, 1].set_ylabel('Response Time (ms)')
    
    plt.tight_layout()
    plt.show()
    
    # Display detailed results
    print("📊 Detailed Performance Results:")
    display(perf_df)

else:
    print("⏭️ Skipping performance tests - API not available")

## 🔄 Caching Performance Test

Let's test the prediction caching functionality.
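
The documented prediction response does not list a `cached` field, so a cache hit may not be directly observable. A rough alternative is to compare first-request latency with repeat-request latency. This sketch assumes rows with `iteration` and `response_time_ms` keys; `repeat_speedup` is a hypothetical helper and the numbers are made up.

```python
from statistics import mean
from typing import Any, Dict, List, Optional


def repeat_speedup(rows: List[Dict[str, Any]]) -> Optional[float]:
    """Percent reduction in mean latency from first requests to repeat requests."""
    first = [r["response_time_ms"] for r in rows
             if r["iteration"] == 1 and r["response_time_ms"] is not None]
    repeats = [r["response_time_ms"] for r in rows
               if r["iteration"] > 1 and r["response_time_ms"] is not None]
    if not first or not repeats:
        return None  # not enough data to compare
    return (mean(first) - mean(repeats)) / mean(first) * 100


rows = [
    {"iteration": 1, "text": "a", "response_time_ms": 50.0},
    {"iteration": 1, "text": "b", "response_time_ms": 30.0},
    {"iteration": 2, "text": "a", "response_time_ms": 10.0},
    {"iteration": 2, "text": "b", "response_time_ms": 10.0},
]
print(repeat_speedup(rows))
```

A speedup measured this way can also come from connection reuse or model warm-up, so it is suggestive rather than proof of a cache hit.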

In [None]:
# Test prediction caching
def test_prediction_caching(client: KubeSentimentAPIClient, num_iterations: int = 5) -> Dict[str, Any]:
    """Test prediction caching by making repeated requests."""
    
    test_texts = [
        "This is an amazing product!",
        "I absolutely love this service.",
        "Outstanding quality and support.",
        "Best purchase I've ever made.",
        "Highly recommended to everyone."
    ]
    
    all_results = []
    
    for iteration in range(num_iterations):
        print(f"🔄 Iteration {iteration + 1}/{num_iterations}...")
        
        for text in test_texts:
            result = client.predict(text)
            
            all_results.append({
                "iteration": iteration + 1,
                "text": text,
                "success": result["success"],
                "response_time_ms": result["response_time_ms"],
                "cached": result["data"].get("cached", False) if result["data"] else False
            })
    
    # Analyze caching performance
    df = pd.DataFrame(all_results)
    
    cached_results = df[df["cached"] == True]
    uncached_results = df[df["cached"] == False]
    
    return {
        "total_requests": len(df),
        "cached_requests": len(cached_results),
        "cache_hit_rate": len(cached_results) / len(df),
        "avg_cached_time_ms": cached_results["response_time_ms"].mean() if len(cached_results) > 0 else None,
        "avg_uncached_time_ms": uncached_results["response_time_ms"].mean() if len(uncached_results) > 0 else None,
        "time_improvement_percent": (
            (uncached_results["response_time_ms"].mean() - cached_results["response_time_ms"].mean()) 
            / uncached_results["response_time_ms"].mean() * 100
        ) if len(cached_results) > 0 and len(uncached_results) > 0 else None,
        "detailed_results": df
    }

# Run caching test
if api_available:
    print("🔄 Testing Prediction Caching:")
    print("=" * 50)
    
    caching_results = test_prediction_caching(client, num_iterations=3)
    
    print(f"📊 Total Requests: {caching_results['total_requests']}")
    print(f"💾 Cached Requests: {caching_results['cached_requests']}")
    print(f"🎯 Cache Hit Rate: {caching_results['cache_hit_rate']:.1%}")
    
    # Compare against None explicitly so that a legitimate value of 0 is still printed
    if caching_results['avg_cached_time_ms'] is not None:
        print(f"⚡ Avg Cached Response Time: {caching_results['avg_cached_time_ms']:.2f}ms")
    
    if caching_results['avg_uncached_time_ms'] is not None:
        print(f"🐌 Avg Uncached Response Time: {caching_results['avg_uncached_time_ms']:.2f}ms")
    
    if caching_results['time_improvement_percent'] is not None:
        print(f"🚀 Time Improvement: {caching_results['time_improvement_percent']:.1f}%")
    
    # Visualize caching performance
    if caching_results['detailed_results'] is not None:
        df = caching_results['detailed_results']
        
        fig, axes = plt.subplots(1, 2, figsize=(12, 5))
        
        # Response time by cache status
        cache_status_data = [
            df[df['cached'] == False]['response_time_ms'],
            df[df['cached'] == True]['response_time_ms']
        ]
        
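        # Label the boxes via the axis; boxplot's `labels` keyword is deprecated in Matplotlib 3.9+
        axes[0].boxplot(cache_status_data)
        axes[0].set_xticklabels(['Uncached', 'Cached'])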
        axes[0].set_title('Response Time: Cached vs Uncached')
        axes[0].set_ylabel('Response Time (ms)')
        
        # Cache hits over iterations
        iteration_cache_hits = df.groupby('iteration')['cached'].sum()
        axes[1].plot(iteration_cache_hits.index, iteration_cache_hits.values, marker='o')
        axes[1].set_title('Cache Hits by Iteration')
        axes[1].set_xlabel('Iteration')
        axes[1].set_ylabel('Number of Cache Hits')
        axes[1].set_xticks(range(1, len(iteration_cache_hits) + 1))
        
        plt.tight_layout()
        plt.show()

else:
    print("⏭️ Skipping caching test - API not available")

## 🔄 Async Testing

Let's test the API using async HTTP requests to simulate real-world usage patterns.
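
With `asyncio.gather`, every coroutine starts immediately unless something limits it. A common way to cap the number of in-flight calls is an `asyncio.Semaphore`. The sketch below uses a stand-in coroutine (`fake_request`, hypothetical) instead of real HTTP so that the cap can be observed directly.

```python
import asyncio


async def run_bounded(num_tasks: int, limit: int) -> int:
    """Run `num_tasks` fake requests with at most `limit` in flight; return the peak."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:            # wait here once `limit` tasks are active
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)    # stands in for the network round trip
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(num_tasks)))
    return peak
```

In a notebook cell this can be tried with `await run_bounded(20, 5)`; in a plain script, wrap the call in `asyncio.run(...)`. The returned peak should never exceed `limit`.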

In [None]:
# Async testing with httpx
async def async_predict(client: httpx.AsyncClient, text: str) -> Dict[str, Any]:
    """Make async prediction request."""
    try:
        start_time = time.time()
        response = await client.post(
            f"{API_BASE_URL}/predict",
            json={"text": text},
            timeout=TIMEOUT
        )
        response_time = (time.time() - start_time) * 1000
        
        return {
            "success": response.status_code == 200,
            "status_code": response.status_code,
            "response_time_ms": round(response_time, 2),
            "data": response.json() if response.status_code == 200 else None,
            "error": response.text if response.status_code != 200 else None,
            "text": text
        }
    except Exception as e:
        return {
            "success": False,
            "status_code": None,
            "response_time_ms": None,
            "data": None,
            "error": str(e),
            "text": text
        }

async def run_async_load_test(num_requests: int, concurrent_requests: int) -> Dict[str, Any]:
    """Run async load test."""
    
    test_texts = [
        "I love this amazing product!",
        "This is absolutely terrible.",
        "It's decent, does the job.",
        "Outstanding customer service!",
        "Complete waste of money."
    ]
    
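    # Cap the number of in-flight requests at `concurrent_requests`;
    # without this, gather() would launch all requests at once.
    semaphore = asyncio.Semaphore(concurrent_requests)
    
    async def bounded_predict(client: httpx.AsyncClient, text: str) -> Dict[str, Any]:
        async with semaphore:
            return await async_predict(client, text)
    
    async def make_requests():
        async with httpx.AsyncClient() as client:
            tasks = []
            for i in range(num_requests):
                text = test_texts[i % len(test_texts)]
                tasks.append(bounded_predict(client, text))
            
            results = await asyncio.gather(*tasks, return_exceptions=True)
            return results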
    
    start_time = time.time()
    results = await make_requests()
    total_time = time.time() - start_time
    
    # Process results
    successful_requests = [r for r in results if isinstance(r, dict) and r.get("success", False)]
    response_times = [r["response_time_ms"] for r in successful_requests if r.get("response_time_ms")]
    
    return {
        "total_requests": num_requests,
        "concurrent_requests": concurrent_requests,
        "total_time_seconds": round(total_time, 2),
        "successful_requests": len(successful_requests),
        "success_rate": len(successful_requests) / num_requests if num_requests > 0 else 0,
        "avg_response_time_ms": round(np.mean(response_times), 2) if response_times else None,
        "min_response_time_ms": round(min(response_times), 2) if response_times else None,
        "max_response_time_ms": round(max(response_times), 2) if response_times else None,
        "p95_response_time_ms": round(np.percentile(response_times, 95), 2) if response_times else None,
        "requests_per_second": round(num_requests / total_time, 2) if total_time > 0 else 0
    }

# Run async performance test
if api_available:
    print("🔄 Async Performance Testing:")
    print("=" * 50)
    
    async_results = await run_async_load_test(num_requests=50, concurrent_requests=10)
    
    print(f"📊 Total Requests: {async_results['total_requests']}")
    print(f"🔄 Concurrent Requests: {async_results['concurrent_requests']}")
    print(f"✅ Success Rate: {async_results['success_rate']:.1%}")
    print(f"⚡ Requests/sec: {async_results['requests_per_second']:.2f}")
    print(f"📊 Avg Response Time: {async_results['avg_response_time_ms']}ms")
    print(f"📈 P95 Response Time: {async_results['p95_response_time_ms']}ms")
    
    print("\n🔍 Async vs Sync Comparison:")
    if 'performance_results' in locals():
        sync_avg = performance_results[1]['avg_response_time_ms']  # Medium load test
        async_avg = async_results['avg_response_time_ms']
        
        print(f"   🔄 Sync (50 requests, 2 concurrent): {sync_avg}ms avg")
        print(f"   ⚡ Async (50 requests, 10 concurrent): {async_avg}ms avg")
        
        if sync_avg and async_avg:
            improvement = ((sync_avg - async_avg) / sync_avg) * 100
            print(f"   🚀 Async improvement: {improvement:.1f}%")

else:
    print("⏭️ Skipping async testing - API not available")

## 🧪 Automated Testing

We can integrate automated tests directly into our notebooks using `pytest`.

In [None]:
# Create a simple test file
test_code = """
import requests


def test_health_check():
    # Health check should be available without auth
    response = requests.get('http://localhost:8000/health', timeout=30)
    assert response.status_code == 200, f\"Expected 200, got {response.status_code}\"
    assert response.json()[\"status\"] == \"healthy\", \"Service is not healthy\"


def test_prediction():
    # Test that a prediction can be made successfully
    response = requests.post('http://localhost:8000/predict', json={'text': 'This is a test'}, timeout=30)
    assert response.status_code == 200, f\"Expected 200, got {response.status_code}\"
    assert 'label' in response.json(), \"Response should contain a 'label' key\"
"""
with open("test_api.py", "w") as f:
    f.write(test_code)

# Run pytest
!pytest test_api.py -v

## 📋 Test Report Generation

Let's generate a comprehensive test report summarizing all our findings.
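
When a report is written as JSON, passing `default=str` to `json.dump` makes the encoder fall back to `str()` for values it cannot encode natively. A small sketch of that behaviour, using a made-up `report` dictionary:

```python
import json
from datetime import datetime

report = {
    "generated_at": datetime(2024, 1, 1, 12, 0, 0),  # not JSON-serializable by default
    "success_rate": 0.95,
}

# Without `default`, json.dumps raises TypeError on the datetime value.
try:
    json.dumps(report)
except TypeError as exc:
    print(f"plain dumps failed: {exc}")

# With default=str, unsupported objects are passed through str() first.
encoded = json.dumps(report, default=str)
print(encoded)
```

The trade-off is that such values come back as plain strings when the file is loaded again, so anything that must round-trip should be converted explicitly before saving.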

In [None]:
# Generate comprehensive test report
def generate_test_report():
    """Generate a comprehensive test report."""
    
    report = {
        "test_timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "api_endpoint": API_BASE_URL,
        "api_available": api_available,
        "sections": {}
    }
    
    if not api_available:
        return report
    
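    # Health Check Section
    # Results live at notebook (module) scope, so look them up in globals();
    # locals() inside this function only contains the function's own names.
    if 'health_result' in globals():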
        report["sections"]["health_check"] = {
            "status": health_result["success"],
            "response_time_ms": health_result["response_time_ms"],
            "data": health_result["data"]
        }
    
    # Functional Tests Section
    if 'functional_results' in globals():
        functional_summary = {
            "total_tests": 3 + len(functional_results["predictions"]),  # health, model_info, metrics + predictions
            "passed_tests": sum([
                functional_results["health_check"]["success"],
                functional_results["model_info"]["success"],
                functional_results["metrics"]["success"],
                sum(1 for p in functional_results["predictions"] if p["success"])
            ]),
            "success_rate": None
        }
        functional_summary["success_rate"] = functional_summary["passed_tests"] / functional_summary["total_tests"]
        report["sections"]["functional_tests"] = functional_summary
    
    # Error Handling Section
    if 'error_results' in globals():
        error_summary = {
            "total_tests": len(error_results),
            "passed_tests": sum(1 for r in error_results if r["passed"]),
            "success_rate": sum(1 for r in error_results if r["passed"]) / len(error_results),
            "details": error_results
        }
        report["sections"]["error_handling"] = error_summary
    
    # Performance Section
    if 'performance_results' in globals():
        report["sections"]["performance"] = {
            "scenarios_tested": len(performance_results),
            "best_throughput": max(r["requests_per_second"] for r in performance_results),
            "avg_response_time_ms": np.mean([r["avg_response_time_ms"] for r in performance_results if r["avg_response_time_ms"]]),
            "overall_success_rate": np.mean([r["success_rate"] for r in performance_results]),
            "details": performance_results
        }
    
    # Caching Section
    if 'caching_results' in globals():
        report["sections"]["caching"] = {
            "cache_hit_rate": caching_results["cache_hit_rate"],
            "time_improvement_percent": caching_results["time_improvement_percent"],
            "avg_cached_time_ms": caching_results["avg_cached_time_ms"],
            "avg_uncached_time_ms": caching_results["avg_uncached_time_ms"]
        }
    
    # Async Performance Section
    if 'async_results' in globals():
        report["sections"]["async_performance"] = async_results
    
    return report

# Generate and display test report
print("📋 API Testing Report:")
print("=" * 60)

test_report = generate_test_report()

print(f"🕒 Test Timestamp: {test_report['test_timestamp']}")
print(f"🌐 API Endpoint: {test_report['api_endpoint']}")
print(f"✅ API Available: {test_report['api_available']}")

if test_report['api_available']:
    for section_name, section_data in test_report['sections'].items():
        print(f"\n📊 {section_name.replace('_', ' ').title()}:")
        
        if section_name == "health_check":
            print(f"   Status: {'✅ Healthy' if section_data['status'] else '❌ Unhealthy'}")
            print(f"   Response Time: {section_data['response_time_ms']}ms")
            if section_data['data']:
                print(f"   Version: {section_data['data'].get('version', 'N/A')}")
                print(f"   Model Status: {section_data['data'].get('model_status', 'N/A')}")
        
        elif section_name in ["functional_tests", "error_handling"]:
            print(f"   Success Rate: {section_data['success_rate']:.1%}")
            print(f"   Passed: {section_data['passed_tests']}/{section_data['total_tests']}")
        
        elif section_name == "performance":
            print(f"   Best Throughput: {section_data['best_throughput']:.1f} req/sec")
            print(f"   Avg Response Time: {section_data['avg_response_time_ms']:.1f}ms")
            print(f"   Overall Success Rate: {section_data['overall_success_rate']:.1%}")
        
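        elif section_name == "caching":
            print(f"   Cache Hit Rate: {section_data['cache_hit_rate']:.1%}")
            # Averages are None when no request fell into a group; skip them instead of failing on format
            if section_data['time_improvement_percent'] is not None:
                print(f"   Time Improvement: {section_data['time_improvement_percent']:.1f}%")
            if section_data['avg_cached_time_ms'] is not None:
                print(f"   Cached Avg Time: {section_data['avg_cached_time_ms']:.1f}ms")
            if section_data['avg_uncached_time_ms'] is not None:
                print(f"   Uncached Avg Time: {section_data['avg_uncached_time_ms']:.1f}ms")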
        
        elif section_name == "async_performance":
            print(f"   Throughput: {section_data['requests_per_second']:.1f} req/sec")
            # The average is None when every async request failed
            if section_data['avg_response_time_ms'] is not None:
                print(f"   Avg Response Time: {section_data['avg_response_time_ms']:.1f}ms")
            print(f"   Success Rate: {section_data['success_rate']:.1%}")

# Save report to JSON (json is already imported in the setup cell)
report_filename = f"api_test_report_{int(time.time())}.json"
with open(report_filename, 'w') as f:
    json.dump(test_report, f, indent=2, default=str)

print(f"\n💾 Test report saved to: {report_filename}")

print("\n🎉 API Testing Complete!")
print("\n💡 Next Steps:")
print("   • Review the test results above")
print("   • Check the generated JSON report for detailed metrics")
print("   • Explore other notebooks for advanced analysis")
print("   • Consider running the benchmarking suite for production testing")