# Production Deployment with SciRS2-Optim

This tutorial covers best practices for running SciRS2-Optim in production: choosing an architecture, scaling workloads, and monitoring the system. The cells below use simulated scores and metrics to illustrate the trade-offs.

## Table of Contents
1. [Production Architecture](#production-architecture)
2. [Scaling and Performance](#scaling-performance)
3. [Monitoring and Observability](#monitoring-observability)

## Prerequisites
- Completion of previous tutorials
- Basic understanding of production systems
- Familiarity with containerization and cloud platforms

In [None]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from datetime import datetime, timedelta
import json
import warnings
warnings.filterwarnings('ignore')

# Set up visualization style
plt.style.use('seaborn-v0_8')
sns.set_palette("Set2")
np.random.seed(42)

print("🚀 Production Deployment Tutorial - Environment Ready!")

## Production Architecture {#production-architecture}

This section compares architecture patterns, environment sizing, and latency and cost trade-offs for optimization systems, using simulated scores.

In [None]:
def simulate_production_architecture():
    """Simulate different production architecture patterns for optimization systems."""
    
    # Architecture patterns
    architectures = {
        'Monolithic': {
            'scalability': 0.3,
            'maintainability': 0.4,
            'fault_tolerance': 0.2,
            'deployment_complexity': 0.2,
            'resource_efficiency': 0.8,
            'development_speed': 0.9
        },
        'Microservices': {
            'scalability': 0.9,
            'maintainability': 0.7,
            'fault_tolerance': 0.8,
            'deployment_complexity': 0.8,
            'resource_efficiency': 0.6,
            'development_speed': 0.5
        },
        'Serverless': {
            'scalability': 0.95,
            'maintainability': 0.8,
            'fault_tolerance': 0.9,
            'deployment_complexity': 0.4,
            'resource_efficiency': 0.9,
            'development_speed': 0.7
        },
        'Hybrid (Edge + Cloud)': {
            'scalability': 0.85,
            'maintainability': 0.6,
            'fault_tolerance': 0.85,
            'deployment_complexity': 0.9,
            'resource_efficiency': 0.8,
            'development_speed': 0.4
        },
        'Container Orchestration': {
            'scalability': 0.85,
            'maintainability': 0.75,
            'fault_tolerance': 0.8,
            'deployment_complexity': 0.7,
            'resource_efficiency': 0.7,
            'development_speed': 0.6
        }
    }
    
    # Deployment environments
    environments = {
        'Development': {
            'instances': 1,
            'cpu_cores': 4,
            'memory_gb': 16,
            'storage_gb': 100,
            'availability_target': 0.95
        },
        'Staging': {
            'instances': 2,
            'cpu_cores': 8,
            'memory_gb': 32,
            'storage_gb': 500,
            'availability_target': 0.98
        },
        'Production': {
            'instances': 10,
            'cpu_cores': 16,
            'memory_gb': 64,
            'storage_gb': 2000,
            'availability_target': 0.999
        },
        'DR (Disaster Recovery)': {
            'instances': 5,
            'cpu_cores': 8,
            'memory_gb': 32,
            'storage_gb': 1000,
            'availability_target': 0.99
        }
    }
    
    # Performance characteristics
    performance_metrics = {
        'Request Latency (ms)': {
            'p50': [50, 75, 120, 200, 90],
            'p95': [100, 150, 300, 500, 180],
            'p99': [200, 300, 600, 1000, 350]
        },
        'Throughput (req/s)': [1000, 5000, 10000, 3000, 7000],
        'Resource Utilization (%)': [60, 40, 20, 55, 45],
        'Cost Efficiency': [0.8, 0.6, 0.9, 0.5, 0.7]
    }
    
    return architectures, environments, performance_metrics

arch_patterns, deploy_envs, perf_metrics = simulate_production_architecture()

# Visualize production architecture analysis
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
fig.suptitle('Production Architecture Analysis', fontsize=16, fontweight='bold')

# Plot 1: Architecture pattern comparison
arch_names = list(arch_patterns.keys())
metrics = ['Scalability', 'Maintainability', 'Fault Tolerance', 'Resource Efficiency']

# Create radar chart for architecture patterns
angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False)
angles = np.concatenate((angles, [angles[0]]))

axes[0, 0].remove()  # Drop the Cartesian axes so the polar plot does not overlap it
ax_radar = fig.add_subplot(2, 3, 1, projection='polar')

colors = plt.cm.Set3(np.linspace(0, 1, len(arch_names)))
for i, arch_name in enumerate(arch_names):
    arch_data = arch_patterns[arch_name]
    values = [
        arch_data['scalability'],
        arch_data['maintainability'],
        arch_data['fault_tolerance'],
        arch_data['resource_efficiency']
    ]
    values += [values[0]]  # Complete the circle
    
    ax_radar.plot(angles, values, 'o-', linewidth=2, 
                 label=arch_name.replace(' ', '\n'), color=colors[i])
    ax_radar.fill(angles, values, alpha=0.1, color=colors[i])

ax_radar.set_xticks(angles[:-1])
ax_radar.set_xticklabels(metrics)
ax_radar.set_ylim(0, 1)
ax_radar.set_title('Architecture Pattern Comparison')
ax_radar.legend(loc='upper right', bbox_to_anchor=(1.3, 1.0))

# Plot 2: Environment resource allocation
env_names = list(deploy_envs.keys())
cpu_cores = [deploy_envs[env]['cpu_cores'] * deploy_envs[env]['instances'] for env in env_names]
memory_gb = [deploy_envs[env]['memory_gb'] * deploy_envs[env]['instances'] for env in env_names]
storage_gb = [deploy_envs[env]['storage_gb'] * deploy_envs[env]['instances'] for env in env_names]

x = np.arange(len(env_names))
width = 0.25

# Normalize for comparison
max_cpu = max(cpu_cores)
max_memory = max(memory_gb)
max_storage = max(storage_gb)

axes[0, 1].bar(x - width, [c/max_cpu for c in cpu_cores], width, label='CPU Cores', alpha=0.8)
axes[0, 1].bar(x, [m/max_memory for m in memory_gb], width, label='Memory (GB)', alpha=0.8)
axes[0, 1].bar(x + width, [s/max_storage for s in storage_gb], width, label='Storage (GB)', alpha=0.8)

axes[0, 1].set_xlabel('Environment')
axes[0, 1].set_ylabel('Normalized Resource Allocation')
axes[0, 1].set_title('Resource Allocation by Environment')
axes[0, 1].set_xticks(x)
axes[0, 1].set_xticklabels(env_names, rotation=45, ha='right')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Performance vs complexity trade-offs
complexity_scores = [arch_patterns[arch]['deployment_complexity'] for arch in arch_names]
scalability_scores = [arch_patterns[arch]['scalability'] for arch in arch_names]
dev_speed_scores = [arch_patterns[arch]['development_speed'] for arch in arch_names]

# Create bubble chart
bubble_sizes = [speed * 300 for speed in dev_speed_scores]
scatter = axes[0, 2].scatter(complexity_scores, scalability_scores, s=bubble_sizes, 
                           c=range(len(arch_names)), cmap='viridis', alpha=0.7, edgecolors='black')

for i, arch_name in enumerate(arch_names):
    axes[0, 2].annotate(arch_name.replace(' ', '\n'), 
                       (complexity_scores[i], scalability_scores[i]), 
                       xytext=(5, 5), textcoords='offset points', fontsize=9)

axes[0, 2].set_xlabel('Deployment Complexity')
axes[0, 2].set_ylabel('Scalability')
axes[0, 2].set_title('Scalability vs Complexity\n(Bubble size = Development speed)')
axes[0, 2].grid(True, alpha=0.3)

# Plot 4: Latency distribution comparison
percentiles = ['p50', 'p95', 'p99']
latency_data = perf_metrics['Request Latency (ms)']

x = np.arange(len(arch_names))
width = 0.25
colors_latency = ['lightblue', 'orange', 'red']

for i, percentile in enumerate(percentiles):
    values = latency_data[percentile]
    axes[1, 0].bar(x + i * width, values, width, label=percentile, 
                  color=colors_latency[i], alpha=0.8)

axes[1, 0].set_xlabel('Architecture Pattern')
axes[1, 0].set_ylabel('Latency (ms)')
axes[1, 0].set_title('Latency Distribution by Architecture')
axes[1, 0].set_xticks(x + width)
axes[1, 0].set_xticklabels([name.replace(' ', '\n') for name in arch_names], rotation=0)
axes[1, 0].legend()
axes[1, 0].set_yscale('log')
axes[1, 0].grid(True, alpha=0.3)

# Plot 5: Cost vs performance analysis
throughput = perf_metrics['Throughput (req/s)']
cost_efficiency = perf_metrics['Cost Efficiency']
resource_util = perf_metrics['Resource Utilization (%)']

scatter = axes[1, 1].scatter(cost_efficiency, throughput, s=[r*5 for r in resource_util], 
                           c=range(len(arch_names)), cmap='RdYlGn', alpha=0.7, edgecolors='black')

for i, arch_name in enumerate(arch_names):
    axes[1, 1].annotate(arch_name.replace(' ', '\n'), 
                       (cost_efficiency[i], throughput[i]), 
                       xytext=(5, 5), textcoords='offset points', fontsize=9)

axes[1, 1].set_xlabel('Cost Efficiency')
axes[1, 1].set_ylabel('Throughput (req/s)')
axes[1, 1].set_title('Cost vs Performance\n(Bubble size = Resource utilization)')
axes[1, 1].grid(True, alpha=0.3)

# Plot 6: Deployment timeline
deployment_phases = ['Planning', 'Development', 'Testing', 'Staging', 'Production', 'Monitoring']
phase_durations = [2, 8, 4, 2, 1, 1]  # weeks
cumulative_time = np.cumsum([0] + phase_durations[:-1])

# Create Gantt chart
colors_gantt = plt.cm.Set3(np.linspace(0, 1, len(deployment_phases)))
for i, (phase, duration, start) in enumerate(zip(deployment_phases, phase_durations, cumulative_time)):
    axes[1, 2].barh(i, duration, left=start, color=colors_gantt[i], alpha=0.7, edgecolor='black')
    axes[1, 2].text(start + duration/2, i, f'{duration}w', ha='center', va='center', fontweight='bold')

axes[1, 2].set_yticks(range(len(deployment_phases)))
axes[1, 2].set_yticklabels(deployment_phases)
axes[1, 2].set_xlabel('Timeline (weeks)')
axes[1, 2].set_title('Production Deployment Timeline')
axes[1, 2].grid(True, alpha=0.3, axis='x')

plt.tight_layout()
plt.show()

print("🏗️ Production Architecture Insights:")
print("   ✅ Serverless scores highest on scalability and cost efficiency in this comparison")
print("   ✅ Microservices offer good fault tolerance and maintainability")
print("   ✅ Container orchestration balances complexity and performance")
print("   ⚠️  Hybrid architectures require careful coordination")
print("   ⚠️  Higher scalability often comes with increased complexity")
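
Which pattern comes out ahead depends on how the criteria are weighted. The sketch below ranks three of the patterns with a simple weighted sum over the illustrative scores from the cell above. The function `rank_architectures` and both weight profiles are hypothetical and only meant to show the idea, not a recommended scoring method.

```python
# Illustrative scores copied from the architecture table above (0-1 scale).
ARCH_SCORES = {
    "Monolithic": {"scalability": 0.3, "fault_tolerance": 0.2,
                   "deployment_complexity": 0.2, "development_speed": 0.9},
    "Microservices": {"scalability": 0.9, "fault_tolerance": 0.8,
                      "deployment_complexity": 0.8, "development_speed": 0.5},
    "Serverless": {"scalability": 0.95, "fault_tolerance": 0.9,
                   "deployment_complexity": 0.4, "development_speed": 0.7},
}

# Criteria where a LOWER raw score is better; these are inverted before weighting.
LOWER_IS_BETTER = {"deployment_complexity"}


def rank_architectures(scores, weights):
    """Return (name, weighted_score) pairs sorted from best to worst."""
    total_weight = sum(weights.values())
    ranked = []
    for name, criteria in scores.items():
        weighted = 0.0
        for criterion, weight in weights.items():
            value = criteria[criterion]
            if criterion in LOWER_IS_BETTER:
                value = 1.0 - value
            weighted += weight * value
        ranked.append((name, weighted / total_weight))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical priorities: a small team that values speed and simple deployments.
small_team = {"development_speed": 0.5, "deployment_complexity": 0.4, "scalability": 0.1}
# Hypothetical priorities: a high-traffic service that values scale and resilience.
high_traffic = {"scalability": 0.5, "fault_tolerance": 0.4, "development_speed": 0.1}

for label, weights in [("small team", small_team), ("high traffic", high_traffic)]:
    best, score = rank_architectures(ARCH_SCORES, weights)[0]
    print(f"{label}: {best} ({score:.2f})")
```

With these weights the small-team profile favors the monolith and the high-traffic profile favors serverless, which shows how the same scores can support different choices.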

## Scaling and Performance {#scaling-performance}

This section compares scaling strategies, response-time and cost curves, auto-scaling behavior, and common bottlenecks for optimization workloads, using simulated data.

In [None]:
def simulate_scaling_strategies():
    """Simulate different scaling strategies for optimization workloads."""
    
    # Scaling strategies
    scaling_strategies = {
        'Horizontal Scaling': {
            'cost_per_unit': 1.0,
            'setup_complexity': 0.7,
            'max_scale_factor': 100,
            'fault_tolerance': 0.9,
            'coordination_overhead': 0.3
        },
        'Vertical Scaling': {
            'cost_per_unit': 1.5,
            'setup_complexity': 0.3,
            'max_scale_factor': 10,
            'fault_tolerance': 0.4,
            'coordination_overhead': 0.1
        },
        'Auto-scaling': {
            'cost_per_unit': 0.8,
            'setup_complexity': 0.8,
            'max_scale_factor': 50,
            'fault_tolerance': 0.8,
            'coordination_overhead': 0.2
        },
        'Edge Computing': {
            'cost_per_unit': 1.2,
            'setup_complexity': 0.9,
            'max_scale_factor': 1000,
            'fault_tolerance': 0.7,
            'coordination_overhead': 0.6
        },
        'Hybrid (Cloud + Edge)': {
            'cost_per_unit': 1.1,
            'setup_complexity': 0.95,
            'max_scale_factor': 200,
            'fault_tolerance': 0.85,
            'coordination_overhead': 0.5
        }
    }
    
    # Performance scaling characteristics
    load_levels = np.array([1, 10, 50, 100, 500, 1000, 5000])  # Concurrent requests
    
    # Response time scaling (ms)
    response_times = {
        'Single Instance': load_levels * 2 + 50,
        'Load Balanced': np.minimum(load_levels * 0.5 + 30, 200),
        'Auto-scaled': np.minimum(load_levels * 0.3 + 25, 100),
        'Optimized Cache': np.minimum(load_levels * 0.1 + 15, 50)
    }
    
    # Cost scaling
    cost_scaling = {
        'Fixed Infrastructure': np.full_like(load_levels, 1000, dtype=float),
        'Pay-per-use': load_levels * 2,
        'Reserved Capacity': np.minimum(load_levels * 0.8 + 200, 800),
        'Spot Instances': load_levels * 0.6
    }
    
    return scaling_strategies, load_levels, response_times, cost_scaling

scaling_strats, load_levels, response_times, cost_scaling = simulate_scaling_strategies()

# Visualize scaling and performance analysis
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
fig.suptitle('Scaling and Performance Analysis', fontsize=16, fontweight='bold')

# Plot 1: Scaling strategy comparison
strategy_names = list(scaling_strats.keys())
metrics = ['Cost Efficiency', 'Setup Simplicity', 'Max Scale', 'Fault Tolerance']

# Create radar chart
angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False)
angles = np.concatenate((angles, [angles[0]]))

axes[0, 0].remove()  # Drop the Cartesian axes so the polar plot does not overlap it
ax_radar = fig.add_subplot(2, 3, 1, projection='polar')

colors = plt.cm.tab10(np.linspace(0, 1, len(strategy_names)))
# Normalize against the cheapest strategy so every radar value stays within [0, 1]
min_cost = min(s['cost_per_unit'] for s in scaling_strats.values())
for i, strategy in enumerate(strategy_names):
    strategy_data = scaling_strats[strategy]
    values = [
        min_cost / strategy_data['cost_per_unit'],  # Cheapest strategy scores 1.0
        1.0 - strategy_data['setup_complexity'],  # Invert complexity for simplicity
        strategy_data['max_scale_factor'] / 1000,  # Normalize scale factor
        strategy_data['fault_tolerance']
    ]
    values += [values[0]]  # Complete the circle
    
    ax_radar.plot(angles, values, 'o-', linewidth=2, 
                 label=strategy.replace(' ', '\n'), color=colors[i])
    ax_radar.fill(angles, values, alpha=0.1, color=colors[i])

ax_radar.set_xticks(angles[:-1])
ax_radar.set_xticklabels(metrics)
ax_radar.set_ylim(0, 1)
ax_radar.set_title('Scaling Strategy Comparison')
ax_radar.legend(loc='upper right', bbox_to_anchor=(1.3, 1.0))

# Plot 2: Response time vs load
for strategy, times in response_times.items():
    axes[0, 1].loglog(load_levels, times, 'o-', linewidth=2, markersize=6, label=strategy)

axes[0, 1].set_xlabel('Concurrent Load (requests)')
axes[0, 1].set_ylabel('Response Time (ms)')
axes[0, 1].set_title('Response Time Scaling')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Cost vs load analysis
for pricing_model, costs in cost_scaling.items():
    axes[0, 2].loglog(load_levels, costs, 'o-', linewidth=2, markersize=6, label=pricing_model)

axes[0, 2].set_xlabel('Load Level (requests)')
axes[0, 2].set_ylabel('Cost ($)')
axes[0, 2].set_title('Cost Scaling Models')
axes[0, 2].legend()
axes[0, 2].grid(True, alpha=0.3)

# Plot 4: Auto-scaling behavior simulation
time_hours = np.arange(0, 24, 0.5)
# Simulate daily traffic pattern
base_load = 100
daily_pattern = base_load * (1 + 0.5 * np.sin(2 * np.pi * (time_hours - 6) / 24))
traffic_spikes = np.random.random(len(time_hours)) < 0.1  # 10% chance of spike
daily_pattern[traffic_spikes] *= 3

# Auto-scaling response (with lag)
instances = np.ones_like(time_hours)
for i in range(1, len(time_hours)):
    target_instances = max(1, int(daily_pattern[i] / 50))  # Scale at 50 req/instance
    # Gradual scaling with lag
    instances[i] = instances[i-1] + 0.3 * (target_instances - instances[i-1])

ax_twin = axes[1, 0].twinx()
axes[1, 0].plot(time_hours, daily_pattern, 'b-', linewidth=2, label='Traffic Load')
ax_twin.plot(time_hours, instances, 'r-', linewidth=2, label='Instance Count')

axes[1, 0].set_xlabel('Time (hours)')
axes[1, 0].set_ylabel('Request Load', color='blue')
ax_twin.set_ylabel('Instance Count', color='red')
axes[1, 0].set_title('Auto-scaling Behavior Over 24h')
axes[1, 0].legend(loc='upper left')
ax_twin.legend(loc='upper right')
axes[1, 0].grid(True, alpha=0.3)

# Plot 5: Resource utilization optimization
optimization_techniques = ['Baseline', 'Connection\nPooling', 'Request\nBatching', 
                          'Caching', 'Load\nBalancing', 'All\nOptimizations']
cpu_utilization = [85, 70, 60, 55, 45, 35]
memory_utilization = [80, 75, 65, 50, 60, 40]
throughput_improvement = [1.0, 1.2, 1.5, 1.8, 2.0, 3.0]

x = np.arange(len(optimization_techniques))
width = 0.35

bars1 = axes[1, 1].bar(x - width/2, cpu_utilization, width, label='CPU %', alpha=0.8)
bars2 = axes[1, 1].bar(x + width/2, memory_utilization, width, label='Memory %', alpha=0.8)

# Overlay throughput improvement
ax_twin2 = axes[1, 1].twinx()
ax_twin2.plot(x, throughput_improvement, 'ro-', linewidth=3, markersize=8, label='Throughput Gain')

axes[1, 1].set_xlabel('Optimization Technique')
axes[1, 1].set_ylabel('Resource Utilization (%)')
ax_twin2.set_ylabel('Throughput Multiplier', color='red')
axes[1, 1].set_title('Performance Optimization Impact')
axes[1, 1].set_xticks(x)
axes[1, 1].set_xticklabels(optimization_techniques, rotation=45, ha='right')
axes[1, 1].legend(loc='upper left')
ax_twin2.legend(loc='upper right')
axes[1, 1].grid(True, alpha=0.3)

# Plot 6: Performance bottleneck analysis
bottlenecks = ['CPU', 'Memory', 'I/O', 'Network', 'Database', 'External APIs']
frequency = [0.3, 0.25, 0.2, 0.15, 0.35, 0.4]  # How often each is a bottleneck
impact_severity = [0.7, 0.8, 0.9, 0.6, 0.85, 0.95]  # Impact when it is a bottleneck
detection_difficulty = [0.2, 0.3, 0.6, 0.5, 0.4, 0.8]  # How hard to detect

# Create bubble chart
bubble_sizes = [f * i * 1000 for f, i in zip(frequency, impact_severity)]
scatter = axes[1, 2].scatter(detection_difficulty, impact_severity, s=bubble_sizes, 
                           c=frequency, cmap='Reds', alpha=0.7, edgecolors='black')

for i, bottleneck in enumerate(bottlenecks):
    axes[1, 2].annotate(bottleneck, (detection_difficulty[i], impact_severity[i]), 
                       xytext=(5, 5), textcoords='offset points', fontsize=9)

axes[1, 2].set_xlabel('Detection Difficulty')
axes[1, 2].set_ylabel('Impact Severity')
axes[1, 2].set_title('Performance Bottleneck Analysis\n(Bubble size = Frequency × Impact)')
axes[1, 2].grid(True, alpha=0.3)

plt.colorbar(scatter, ax=axes[1, 2], label='Frequency')

plt.tight_layout()
plt.show()

print("📈 Scaling and Performance Insights:")
print("   ✅ Auto-scaling provides the best cost-performance balance")
print("   ✅ Caching significantly reduces resource utilization")
print("   ✅ External API dependencies are major bottlenecks")
print("   ✅ Load balancing improves fault tolerance")
print("   ⚠️  Edge computing requires complex coordination")
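
The auto-scaling plot above smooths the instance count toward a target. A production policy usually also needs explicit bounds and some protection against flapping. The sketch below is one minimal way to express that: a target-tracking rule with a minimum, a maximum, and a scale-in cooldown. The function names, the 50 requests per instance capacity (taken from the simulation above), and the cooldown length are assumptions for illustration. Unlike the cell above, it rounds up rather than down so capacity is never below demand.

```python
import math


def desired_instances(load, per_instance_capacity=50, min_instances=1, max_instances=20):
    """Target-tracking rule: enough instances to serve `load`, clamped to bounds."""
    needed = math.ceil(load / per_instance_capacity)
    return max(min_instances, min(max_instances, needed))


def simulate_autoscaler(loads, cooldown_steps=2, **policy):
    """Apply the rule to a load series, allowing scale-in only after a cooldown."""
    current = policy.get("min_instances", 1)
    steps_since_change = cooldown_steps  # allow an immediate first adjustment
    history = []
    for load in loads:
        target = desired_instances(load, **policy)
        if target > current:
            # Scale out immediately so capacity follows spikes.
            current = target
            steps_since_change = 0
        elif target < current and steps_since_change >= cooldown_steps:
            # Scale in one instance at a time, and only after the cooldown.
            current -= 1
            steps_since_change = 0
        else:
            steps_since_change += 1
        history.append(current)
    return history


loads = [40, 120, 300, 90, 90, 90, 90, 90]
print(simulate_autoscaler(loads))
```

In this run the instance count jumps to 6 on the spike and steps down to 5 only after the cooldown has passed. Scaling out quickly and scaling in slowly is a deliberate asymmetry: a late scale-out costs latency, while a late scale-in only costs some money.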

## Monitoring and Observability {#monitoring-observability}

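This section builds a sample monitoring dashboard for a production optimization system: resource usage, performance metrics, error-rate alerts, SLA compliance, and a monitoring tool comparison.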

In [None]:
def simulate_monitoring_system():
    """Simulate a comprehensive monitoring and observability system."""
    
    # Monitoring metrics categories
    metric_categories = {
        'Infrastructure': {
            'metrics': ['CPU Usage', 'Memory Usage', 'Disk I/O', 'Network I/O'],
            'collection_frequency': 'every 15s',
            'retention_period': '30 days',
            'alerting_threshold': 80
        },
        'Application': {
            'metrics': ['Request Rate', 'Response Time', 'Error Rate', 'Queue Depth'],
            'collection_frequency': 'every 5s',
            'retention_period': '90 days',
            'alerting_threshold': 95
        },
        'Business': {
            'metrics': ['Optimization Success Rate', 'Model Accuracy', 'User Satisfaction'],
            'collection_frequency': 'every 1m',
            'retention_period': '1 year',
            'alerting_threshold': 90
        },
        'Security': {
            'metrics': ['Failed Logins', 'API Rate Limits', 'Anomalous Patterns'],
            'collection_frequency': 'real-time',
            'retention_period': '2 years',
            'alerting_threshold': 99
        }
    }
    
    # Alert severity levels
    alert_levels = {
        'Critical': {
            'response_time_minutes': 5,
            'escalation_levels': 3,
            'notification_channels': ['SMS', 'Phone', 'Slack', 'Email'],
            'auto_remediation': True
        },
        'High': {
            'response_time_minutes': 15,
            'escalation_levels': 2,
            'notification_channels': ['Slack', 'Email'],
            'auto_remediation': False
        },
        'Medium': {
            'response_time_minutes': 60,
            'escalation_levels': 1,
            'notification_channels': ['Email'],
            'auto_remediation': False
        },
        'Low': {
            'response_time_minutes': 240,
            'escalation_levels': 1,
            'notification_channels': ['Dashboard'],
            'auto_remediation': False
        }
    }
    
    # Generate sample monitoring data
    time_points = np.arange(0, 24, 0.1)  # 24 hours in 6-minute intervals
    
    # Simulate different metrics over time
    monitoring_data = {
        'CPU Usage (%)': 45 + 20 * np.sin(2 * np.pi * time_points / 24) + 5 * np.random.normal(0, 1, len(time_points)),
        'Memory Usage (%)': 60 + 15 * np.sin(2 * np.pi * (time_points - 3) / 24) + 3 * np.random.normal(0, 1, len(time_points)),
        'Response Time (ms)': 120 + 50 * np.sin(2 * np.pi * (time_points - 6) / 24) + 10 * np.random.normal(0, 1, len(time_points)),
        'Error Rate (%)': np.maximum(0, 2 + 3 * np.sin(2 * np.pi * time_points / 12) + 2 * np.random.normal(0, 1, len(time_points))),
        'Throughput (req/s)': 1000 + 500 * np.sin(2 * np.pi * (time_points - 9) / 24) + 50 * np.random.normal(0, 1, len(time_points))
    }
    
    # Ensure realistic bounds
    monitoring_data['CPU Usage (%)'] = np.clip(monitoring_data['CPU Usage (%)'], 0, 100)
    monitoring_data['Memory Usage (%)'] = np.clip(monitoring_data['Memory Usage (%)'], 0, 100)
    monitoring_data['Response Time (ms)'] = np.maximum(monitoring_data['Response Time (ms)'], 50)
    monitoring_data['Error Rate (%)'] = np.clip(monitoring_data['Error Rate (%)'], 0, 10)
    monitoring_data['Throughput (req/s)'] = np.maximum(monitoring_data['Throughput (req/s)'], 100)
    
    return metric_categories, alert_levels, time_points, monitoring_data

metric_cats, alert_levels, time_points, monitoring_data = simulate_monitoring_system()

# Visualize monitoring and observability
fig, axes = plt.subplots(3, 2, figsize=(16, 18))
fig.suptitle('Monitoring and Observability Dashboard', fontsize=16, fontweight='bold')

# Plot 1: Real-time metrics dashboard
metric_names = list(monitoring_data.keys())
colors = plt.cm.tab10(np.linspace(0, 1, len(metric_names)))

for i, (metric, data) in enumerate(monitoring_data.items()):
    if 'Usage' in metric:
        axes[0, 0].plot(time_points, data, label=metric, linewidth=2, color=colors[i])
        
        # Add alert threshold lines
        if 'CPU' in metric:
            axes[0, 0].axhline(y=80, color='red', linestyle='--', alpha=0.7)
        elif 'Memory' in metric:
            axes[0, 0].axhline(y=85, color='orange', linestyle='--', alpha=0.7)

axes[0, 0].set_xlabel('Time (hours)')
axes[0, 0].set_ylabel('Usage (%)')
axes[0, 0].set_title('Resource Usage Monitoring')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)
axes[0, 0].set_ylim(0, 100)

# Plot 2: Performance metrics
ax_twin = axes[0, 1].twinx()
axes[0, 1].plot(time_points, monitoring_data['Response Time (ms)'], 'b-', linewidth=2, label='Response Time')
ax_twin.plot(time_points, monitoring_data['Throughput (req/s)'], 'g-', linewidth=2, label='Throughput')

axes[0, 1].axhline(y=200, color='red', linestyle='--', alpha=0.7, label='SLA Threshold')

axes[0, 1].set_xlabel('Time (hours)')
axes[0, 1].set_ylabel('Response Time (ms)', color='blue')
ax_twin.set_ylabel('Throughput (req/s)', color='green')
axes[0, 1].set_title('Performance Metrics')
axes[0, 1].legend(loc='upper left')
ax_twin.legend(loc='upper right')
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Error rate and alerts
error_data = monitoring_data['Error Rate (%)']
alert_threshold = 5.0
alert_points = error_data > alert_threshold

axes[1, 0].plot(time_points, error_data, 'r-', linewidth=2, label='Error Rate')
axes[1, 0].scatter(time_points[alert_points], error_data[alert_points], 
                  color='red', s=50, marker='x', label='Alerts Triggered')
axes[1, 0].axhline(y=alert_threshold, color='orange', linestyle='--', 
                  alpha=0.7, label='Alert Threshold')

axes[1, 0].set_xlabel('Time (hours)')
axes[1, 0].set_ylabel('Error Rate (%)')
axes[1, 0].set_title('Error Rate Monitoring with Alerts')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)
axes[1, 0].set_ylim(0, max(error_data) * 1.1)

# Plot 4: Alert distribution by severity
alert_severities = list(alert_levels.keys())
alert_counts = [15, 8, 25, 45]  # Sample alert counts
# Use a distinct name so the `response_times` dict from the scaling section is not overwritten
alert_response_minutes = [alert_levels[sev]['response_time_minutes'] for sev in alert_severities]

# Create combined bar and line plot
bars = axes[1, 1].bar(alert_severities, alert_counts, alpha=0.7, 
                     color=['red', 'orange', 'yellow', 'lightblue'])

ax_twin3 = axes[1, 1].twinx()
ax_twin3.plot(alert_severities, alert_response_minutes, 'ko-', linewidth=3, markersize=8)

axes[1, 1].set_xlabel('Alert Severity')
axes[1, 1].set_ylabel('Alert Count')
ax_twin3.set_ylabel('Response Time (minutes)', color='black')
axes[1, 1].set_title('Alert Distribution and Response Times')
axes[1, 1].grid(True, alpha=0.3)

# Add count labels on bars
for bar, count in zip(bars, alert_counts):
    axes[1, 1].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1, 
                   str(count), ha='center', va='bottom', fontweight='bold')

# Plot 5: SLA compliance tracking
sla_metrics = ['Availability', 'Response Time', 'Error Rate', 'Throughput']
sla_targets = [99.9, 200, 1.0, 1000]  # Target values (%, ms, %, req/s)
actual_performance = [99.85, 180, 1.2, 1050]  # Actual values
lower_is_better = {'Response Time', 'Error Rate'}

# Normalize so that 100 means "exactly on target" and higher is always better
normalized_targets = [100] * len(sla_metrics)
normalized_actual = []
for metric, target, actual in zip(sla_metrics, sla_targets, actual_performance):
    if metric in lower_is_better:
        normalized_actual.append(100 * target / actual if actual > 0 else 100)
    else:
        normalized_actual.append(100 * actual / target)

# Derive compliance from the normalized score instead of hard-coding it
compliance_status = ['✅' if score >= 100 else '❌' for score in normalized_actual]

x = np.arange(len(sla_metrics))
width = 0.35

bars1 = axes[2, 0].bar(x - width/2, normalized_targets, width, 
                      label='SLA Target', alpha=0.8, color='lightblue')
bars2 = axes[2, 0].bar(x + width/2, normalized_actual, width, 
                      label='Actual Performance', alpha=0.8, color='lightgreen')

# Add compliance status as text
for i, status in enumerate(compliance_status):
    axes[2, 0].text(i, max(normalized_targets[i], normalized_actual[i]) + 5, 
                   status, ha='center', va='bottom', fontsize=16)

axes[2, 0].set_xlabel('SLA Metrics')
axes[2, 0].set_ylabel('Performance (% of target)')
axes[2, 0].set_title('SLA Compliance Dashboard')
axes[2, 0].set_xticks(x)
axes[2, 0].set_xticklabels(sla_metrics, rotation=45, ha='right')
axes[2, 0].legend()
axes[2, 0].grid(True, alpha=0.3)
axes[2, 0].axhline(y=100, color='red', linestyle='--', alpha=0.7)

# Plot 6: Monitoring tool comparison
monitoring_tools = {
    'Prometheus': {'cost': 0.2, 'complexity': 0.6, 'features': 0.8, 'scalability': 0.9},
    'Datadog': {'cost': 0.8, 'complexity': 0.3, 'features': 0.95, 'scalability': 0.8},
    'New Relic': {'cost': 0.7, 'complexity': 0.4, 'features': 0.9, 'scalability': 0.7},
    'CloudWatch': {'cost': 0.5, 'complexity': 0.5, 'features': 0.7, 'scalability': 0.8},
    'Grafana': {'cost': 0.3, 'complexity': 0.7, 'features': 0.75, 'scalability': 0.6}
}

tool_names = list(monitoring_tools.keys())
feature_scores = [monitoring_tools[tool]['features'] for tool in tool_names]
cost_scores = [1 - monitoring_tools[tool]['cost'] for tool in tool_names]  # Invert cost for efficiency
complexity_scores = [1 - monitoring_tools[tool]['complexity'] for tool in tool_names]  # Invert for ease

# Create bubble chart
bubble_sizes = [monitoring_tools[tool]['scalability'] * 300 for tool in tool_names]
scatter = axes[2, 1].scatter(cost_scores, feature_scores, s=bubble_sizes, 
                           c=complexity_scores, cmap='RdYlGn', alpha=0.7, edgecolors='black')

for i, tool in enumerate(tool_names):
    axes[2, 1].annotate(tool, (cost_scores[i], feature_scores[i]), 
                       xytext=(5, 5), textcoords='offset points', fontsize=9)

axes[2, 1].set_xlabel('Cost Efficiency')
axes[2, 1].set_ylabel('Feature Richness')
axes[2, 1].set_title('Monitoring Tool Comparison\n(Bubble size = Scalability, Color = Ease of use)')
axes[2, 1].grid(True, alpha=0.3)

plt.colorbar(scatter, ax=axes[2, 1], label='Ease of Use')

plt.tight_layout()
plt.show()

print("📊 Monitoring and Observability Insights:")
print("   ✅ Real-time monitoring helps catch failures before they cascade")
print("   ✅ Multi-level alerting ensures appropriate response")
print("   ✅ SLA tracking maintains service quality")
print("   ✅ Automated remediation reduces MTTR")
print("   ⚠️  Alert fatigue can reduce effectiveness")
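
The error-rate plot above marks every sample over the threshold as an alert, which is one way alert fatigue starts. A common mitigation is to fire only after the threshold has been breached for several consecutive samples. The sketch below shows that idea. The function `debounced_alerts`, the sample values, and the choice of three consecutive samples are all hypothetical.

```python
def debounced_alerts(values, threshold, min_consecutive=3):
    """Return indices at which an alert fires.

    An alert fires once when `values` has exceeded `threshold` for
    `min_consecutive` samples in a row. It is re-armed only after a
    sample falls back to or below the threshold.
    """
    fired_at = []
    streak = 0
    for index, value in enumerate(values):
        if value > threshold:
            streak += 1
            if streak == min_consecutive:
                fired_at.append(index)
        else:
            streak = 0
    return fired_at


error_rate = [2.0, 6.1, 2.5, 5.5, 6.0, 7.2, 8.0, 3.0, 5.2, 5.4, 5.9]
naive = [i for i, v in enumerate(error_rate) if v > 5.0]
print("naive alerts:", len(naive))
print("debounced alerts at:", debounced_alerts(error_rate, threshold=5.0))
```

On this series the naive rule raises eight alerts, while the debounced rule raises two, one per sustained incident. The cost is detection delay: an incident is reported `min_consecutive - 1` samples later than with the naive rule.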

## Summary

This tutorial covered essential aspects of production deployment for SciRS2-Optim:

### Key Takeaways:

**Production Architecture:**
- Serverless provides excellent scalability and cost efficiency
- Microservices offer good fault tolerance and maintainability
- Container orchestration balances complexity and performance
- Architecture choice depends on scale and requirements

**Scaling and Performance:**
- Auto-scaling provides the best cost-performance balance among the strategies compared
- Caching significantly reduces resource utilization
- Load balancing improves fault tolerance
- External dependencies often become bottlenecks

**Monitoring and Observability:**
- Real-time monitoring helps catch failures before they cascade
- Multi-level alerting ensures appropriate responses
- SLA tracking maintains service quality
- Collect metrics across all system layers: infrastructure, application, business, and security

### Best Practices:
1. **Design for failure** - Assume components will fail
2. **Monitor everything** - Infrastructure, application, and business metrics
3. **Automate responses** - Reduce manual intervention where possible
4. **Plan capacity** - Understand scaling characteristics
5. **Test at scale** - Validate performance under realistic loads
6. **Gradual rollouts** - Use blue-green or canary deployments
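
As a rough illustration of the canary approach in item 6, the sketch below routes a fixed percentage of traffic to a new release by hashing a stable request key. The function names and the `user-N` keys are hypothetical. A real rollout would normally be handled by the load balancer or service mesh rather than by application code.

```python
import hashlib


def canary_bucket(request_key: str) -> int:
    """Map a stable key (for example a user ID) to a bucket in [0, 100)."""
    digest = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(request_key: str, canary_percent: int) -> str:
    """Send roughly `canary_percent`% of keys to the canary release."""
    return "canary" if canary_bucket(request_key) < canary_percent else "stable"


keys = [f"user-{i}" for i in range(10_000)]
for percent in (1, 10, 50):
    share = sum(route(k, percent) == "canary" for k in keys) / len(keys)
    print(f"canary_percent={percent}: observed share {share:.1%}")
```

Hashing the key instead of drawing a random number keeps each user on the same release between requests. Raising `canary_percent` only adds users to the canary group and never moves existing canary users back.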

### Deployment Checklist:
- [ ] Architecture pattern selected and validated
- [ ] Monitoring and alerting configured
- [ ] Auto-scaling policies defined
- [ ] Security measures implemented
- [ ] Disaster recovery plan established
- [ ] Performance benchmarks documented
- [ ] SLA targets defined and tracked
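
For the SLA item, it can help to translate an availability target into a concrete downtime budget. The sketch below does that arithmetic for the availability targets used in the environment table earlier in this tutorial. The function name and the 30-day period are assumptions.

```python
def downtime_budget_minutes(availability_target: float, period_days: float = 30) -> float:
    """Minutes of downtime allowed per period for a given availability target."""
    if not 0.0 <= availability_target <= 1.0:
        raise ValueError("availability_target must be a fraction between 0 and 1")
    return (1.0 - availability_target) * period_days * 24 * 60


# Availability targets taken from the environment table earlier in this tutorial.
targets = {
    "Development": 0.95,
    "Staging": 0.98,
    "DR (Disaster Recovery)": 0.99,
    "Production": 0.999,
}
for env, target in targets.items():
    print(f"{env}: {downtime_budget_minutes(target):.1f} min per 30 days")
```

A 99.9% target leaves about 43 minutes of downtime per 30 days, compared with 36 hours at 95%. That gap is why the production environment needs faster alert response than the others.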

### Next Steps:
- Implement monitoring for your specific use case
- Set up automated testing and deployment pipelines
- Create runbooks for common issues
- Establish performance baselines
- Plan for capacity growth
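
A performance baseline can be as simple as a stored set of latency percentiles that later runs are compared against. The sketch below computes p50, p95, and p99 with the standard library and flags percentiles that regress by more than a tolerance. The helper names, the 10% tolerance, and the synthetic log-normal samples are assumptions for illustration, not measurements.

```python
import random
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 of a list of latency samples (in ms)."""
    # n=100 yields 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def regressions(current, baseline, tolerance=0.10):
    """Percentiles in `current` that exceed `baseline` by more than `tolerance`."""
    return {
        name: value
        for name, value in current.items()
        if value > baseline[name] * (1 + tolerance)
    }


rng = random.Random(42)
# Synthetic samples; the log-normal shape is only an assumption for illustration.
baseline_samples = [rng.lognormvariate(4.5, 0.4) for _ in range(10_000)]
baseline = latency_percentiles(baseline_samples)

# Pretend a new release is uniformly 25% slower.
slower = latency_percentiles([s * 1.25 for s in baseline_samples])
print("baseline:", {k: round(v, 1) for k, v in baseline.items()})
print("regressed percentiles:", sorted(regressions(slower, baseline)))
```

Comparing tail percentiles rather than the mean matters because a regression that affects only a small share of requests can leave the average almost unchanged.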

Ready for production! Continue with the custom optimizer development tutorial. 🚀