# Predictive Scaling & Capacity Planning

## Overview
This notebook implements predictive scaling and capacity planning. It forecasts resource demand, triggers proactive scaling, and optimizes resource allocation across the platform.

**Model Serving**: This notebook trains and saves a sklearn model to the shared PVC (`/mnt/models/`) for the `predictive-analytics` InferenceService. The model is saved during both validation and manual execution.

## Prerequisites
- Completed: `multi-cluster-healing-coordination.ipynb`
- Historical resource usage data
- Kubernetes metrics available
- HPA (Horizontal Pod Autoscaler) configured

## Learning Objectives
- Forecast resource demand
- Implement predictive scaling
- Plan capacity requirements
- Optimize resource allocation
- Prevent resource exhaustion

## Key Concepts
- **Demand Forecasting**: Predict future resource needs
- **Predictive Scaling**: Scale before a demand spike arrives
- **Capacity Planning**: Allocate resources efficiently
- **Resource Optimization**: Minimize waste
- **Cost Efficiency**: Balance performance and cost

## Setup Section

In [None]:
import sys
import os
import json
import logging
import joblib
from pathlib import Path
from datetime import datetime, timedelta
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.multioutput import MultiOutputRegressor
from typing import Dict, List, Any

# ✨ Import PredictiveAnalytics module (updated for KServe compatibility)
# Add src/models to path
sys.path.insert(0, '/workspace/repo/src/models')
sys.path.insert(0, '/opt/app-root/src/openshift-aiops-platform/src/models')
sys.path.insert(0, str(Path.cwd().parent.parent / 'src' / 'models'))

try:
    from predictive_analytics import PredictiveAnalytics, generate_sample_timeseries_data
    print("✅ PredictiveAnalytics module imported successfully")
    USING_MODULE = True
except ImportError as e:
    print(f"⚠️ Could not import PredictiveAnalytics module: {e}")
    print("   Will use simplified inline implementation")
    USING_MODULE = False

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Set random seed for reproducibility
np.random.seed(42)

print("📦 All libraries imported successfully")

In [None]:
# Define paths for model storage and data

# Use relative paths from current working directory (safe in both OpenShift and local)
# OpenShift notebooks run from /opt/app-root/src (writable)
# Local runs from project root
BASE_DIR = Path.cwd()

# Data directory (writable in both environments)
DATA_DIR = BASE_DIR / 'data'
PROCESSED_DIR = DATA_DIR / 'processed'
PROCESSED_DIR.mkdir(parents=True, exist_ok=True)

# Model storage paths
# Primary: PVC mount for KServe InferenceService (predictive-analytics)
# Fallback: Local models directory
PVC_MODELS_DIR = Path('/mnt/models')
LOCAL_MODELS_DIR = BASE_DIR / 'models' / 'local'
LOCAL_MODELS_DIR.mkdir(parents=True, exist_ok=True)

# Model name must match InferenceService name
MODEL_NAME = 'predictive-analytics'

# Configuration
NAMESPACE = 'self-healing-platform'
FORECAST_HORIZON = 24  # 24 hours ahead
SCALING_THRESHOLD = 0.80  # Scale at 80% capacity

print(f"📁 Storage Configuration:")
print(f"   Base directory: {BASE_DIR}")
print(f"   Data directory: {DATA_DIR}")
print(f"   PVC models directory: {PVC_MODELS_DIR} (exists: {PVC_MODELS_DIR.exists()})")
print(f"   Local models directory: {LOCAL_MODELS_DIR}")
print(f"   Model will save to: {PVC_MODELS_DIR if PVC_MODELS_DIR.exists() else LOCAL_MODELS_DIR}/{MODEL_NAME}/model.pkl")

logger.info(f"Predictive scaling initialized")
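
`PVC_MODELS_DIR.exists()` only shows that the mount point is present, not that the notebook can write to it. A minimal sketch of a stricter check, assuming a hypothetical helper `resolve_model_dir` (not part of the notebook) that falls back to the local directory when the primary path is missing or read-only:

```python
import os
import tempfile
from pathlib import Path


def resolve_model_dir(primary: Path, fallback: Path) -> Path:
    """Return `primary` if it is an existing, writable directory, else `fallback`.

    Hypothetical helper for illustration; the notebook itself only checks
    `PVC_MODELS_DIR.exists()`.
    """
    if primary.is_dir() and os.access(primary, os.W_OK):
        return primary
    fallback.mkdir(parents=True, exist_ok=True)
    return fallback


# Usage with throwaway directories standing in for the PVC and local paths
with tempfile.TemporaryDirectory() as tmp:
    missing_pvc = Path(tmp) / "not-mounted"
    local_dir = Path(tmp) / "models" / "local"
    chosen = resolve_model_dir(missing_pvc, local_dir)
    print(chosen == local_dir)  # the fallback is chosen when the PVC is absent
```

In the notebook this would be called as `resolve_model_dir(PVC_MODELS_DIR, LOCAL_MODELS_DIR)`, so that the save step and the printed configuration agree on a single target directory.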

## Implementation Section

### 1. Generate Training Data and Train Predictive Model

In [None]:
# NOTE: Removed custom PredictiveScalingEnsemble class
# Using sklearn's MultiOutputRegressor instead for KServe compatibility
# KServe can load standard sklearn estimators but not custom classes

from sklearn.multioutput import MultiOutputRegressor

print("✅ Using sklearn MultiOutputRegressor for KServe compatibility")
print("   This allows a single model to predict both CPU and Memory")

In [None]:
# Generate realistic historical resource usage data
np.random.seed(42)  # For reproducibility

# Create 7 days of hourly data with realistic patterns
hours = 168  # 7 days
timestamps = [datetime.now() - timedelta(hours=i) for i in range(hours, 0, -1)]

# Simulate daily patterns (higher during business hours)
hour_of_day = np.array([t.hour for t in timestamps])
day_of_week = np.array([t.weekday() for t in timestamps])

# Base load + daily pattern + weekly pattern + noise
base_cpu = 0.4
daily_pattern = 0.2 * np.sin(2 * np.pi * (hour_of_day - 6) / 24)  # Peaks at 12:00, troughs at 00:00
weekly_pattern = 0.1 * (1 - day_of_week / 7)  # Declines linearly from Monday (0) to Sunday (6)
noise = np.random.normal(0, 0.05, hours)

cpu_usage = np.clip(base_cpu + daily_pattern + weekly_pattern + noise, 0.1, 0.95)
memory_usage = np.clip(cpu_usage * 1.1 + np.random.normal(0, 0.03, hours), 0.2, 0.95)

historical_data = pd.DataFrame({
    'timestamp': timestamps,
    'hour_of_day': hour_of_day,
    'day_of_week': day_of_week,
    'cpu_usage': cpu_usage,
    'memory_usage': memory_usage
})

print(f"Generated {len(historical_data)} hours of historical data")
print(f"CPU usage range: {cpu_usage.min():.2%} - {cpu_usage.max():.2%}")
print(f"Memory usage range: {memory_usage.min():.2%} - {memory_usage.max():.2%}")

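# Train a predictive model for resource forecasting
# Features: hour_of_day, day_of_week, and 24-hour rolling means
# Note: each rolling window includes the current row, so these features carry part of the target value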
historical_data['cpu_rolling_mean'] = historical_data['cpu_usage'].rolling(window=24, min_periods=1).mean()
historical_data['memory_rolling_mean'] = historical_data['memory_usage'].rolling(window=24, min_periods=1).mean()

# Prepare features and targets (both CPU and Memory)
feature_cols = ['hour_of_day', 'day_of_week', 'cpu_rolling_mean', 'memory_rolling_mean']
X = historical_data[feature_cols].values
# Stack both targets for MultiOutputRegressor
y = np.column_stack([historical_data['cpu_usage'].values, historical_data['memory_usage'].values])

# ✨ Create SINGLE sklearn Pipeline with MultiOutputRegressor (KServe compatible!)
# This predicts both CPU and Memory in one model
predictive_model = Pipeline([
    ('scaler', StandardScaler()),
    ('regressor', MultiOutputRegressor(
        RandomForestRegressor(n_estimators=50, max_depth=10, random_state=42)
    ))
])

print("✅ Training multi-output prediction model (CPU + Memory)...")
predictive_model.fit(X, y)

print(f"✅ Model trained with {len(historical_data)} samples")
print(f"   Input features: {len(feature_cols)}")
print(f"   Output predictions: 2 (CPU, Memory)")
print(f"   Model type: sklearn Pipeline (KServe compatible)")

logger.info(f"✅ Trained predictive scaling model with {len(historical_data)} samples")
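
The daily component of the synthetic load is a sinusoid with a 6-hour phase shift. A short check, using only the standard library, of the hours at which it reaches its maximum and minimum (the `daily_pattern` function below re-states the notebook's expression for illustration):

```python
import math


def daily_pattern(hour: int, amplitude: float = 0.2, shift: int = 6) -> float:
    """Same form as the notebook's daily term: amplitude * sin(2*pi*(hour - shift)/24)."""
    return amplitude * math.sin(2 * math.pi * (hour - shift) / 24)


values = {hour: daily_pattern(hour) for hour in range(24)}
peak_hour = max(values, key=values.get)
trough_hour = min(values, key=values.get)
print(peak_hour, trough_hour)
```

The sine reaches its maximum a quarter period (6 hours) after the shift, so a shift of 6 peaks at hour 12; a shift of 8 would move the peak to 14:00.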

### 2. Save Model to PVC for KServe InferenceService

In [None]:
# ✨ Train and Save Model using PredictiveAnalytics module (KServe-compatible)

if USING_MODULE:
    # Use the full PredictiveAnalytics class with KServe-compatible saving
    print("🔬 Training PredictiveAnalytics model...")
    print(f"   Forecast horizon: 12 hours")
    print(f"   Lookback window: 24 hours")
    
    # Initialize model
    predictor = PredictiveAnalytics(forecast_horizon=12, lookback_window=24)
    
    # Train on historical data
    training_results = predictor.train(historical_data)
    
    print(f"\n✅ Training completed:")
    print(f"   Models trained: {training_results['models_trained']}")
    print(f"   Features: {training_results['feature_count']}")
    
    # Print metrics
    for metric_name, results in training_results['metrics'].items():
        print(f"\n   {metric_name}:")
        print(f"     MAE:  {results['mae']:.4f}")
        print(f"     RMSE: {results['rmse']:.4f}")
        print(f"     R²:   {results['r2']:.4f}")
    
    # Save model with KServe-compatible structure
    print(f"\n💾 Saving model in KServe-compatible format...")
    
    # Determine model directory (PVC or local)
    if PVC_MODELS_DIR.exists():
        model_base_dir = str(PVC_MODELS_DIR)
        print(f"   Using PVC: {model_base_dir}")
    else:
        model_base_dir = str(LOCAL_MODELS_DIR)
        print(f"   Using local: {model_base_dir}")
    
    # Save with automatic KServe compatibility
    predictor.save_models(model_base_dir, kserve_compatible=True)
    
    # Verify the saved model
    expected_path = Path(model_base_dir) / MODEL_NAME / 'model.pkl'
    if expected_path.exists():
        size_kb = expected_path.stat().st_size / 1024
        print(f"\n✅ Model saved successfully!")
        print(f"   Location: {expected_path}")
        print(f"   Size: {size_kb:.2f} KB")
        print(f"\n📡 KServe InferenceService will:")
        print(f"   1. Mount: pvc://model-storage-pvc/predictive-analytics")
        print(f"   2. Load: /mnt/models/predictive-analytics/model.pkl")
        print(f"   3. Register as: 'predictive-analytics'")
        print(f"   4. Endpoint: /v1/models/predictive-analytics:predict")
        
        save_results = {
            'saved_to': [str(expected_path)],
            'errors': [],
            'kserve_compatible': True
        }
    else:
        print(f"❌ Model not found at expected location: {expected_path}")
        save_results = {
            'saved_to': [],
            'errors': ['Model file not created'],
            'kserve_compatible': False
        }
    
    # Store for use in predictions
    predictive_model = predictor
    
else:
    # Fallback: Use simplified sklearn pipeline
    print("⚠️ Using simplified model (PredictiveAnalytics module not available)")
    
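    # Create features: the same four columns that forecast_demand() passes to the pipeline,
    # so the fitted scaler and regressor expect the right input width
    X = historical_data[feature_cols].values
    y = historical_data[['cpu_usage', 'memory_usage']].values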
    
    # Train pipeline
    predictive_model = Pipeline([
        ('scaler', StandardScaler()),
        ('regressor', MultiOutputRegressor(RandomForestRegressor(n_estimators=100, random_state=42)))
    ])
    predictive_model.fit(X, y)
    
    # Save manually
    model_dir = (PVC_MODELS_DIR if PVC_MODELS_DIR.exists() else LOCAL_MODELS_DIR) / 'predictive-analytics'
    model_dir.mkdir(parents=True, exist_ok=True)
    model_path = model_dir / 'model.pkl'
    joblib.dump(predictive_model, model_path)
    
    save_results = {'saved_to': [str(model_path)], 'errors': [], 'kserve_compatible': True}
    print(f"✅ Model saved to: {model_path}")

print(f"\n📊 Save Results: {save_results}")
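
The training summary reports MAE, RMSE and R² for each metric. For reference, a minimal standard-library sketch of how these three scores are conventionally defined. The `regression_metrics` name is illustrative, and the `PredictiveAnalytics` module may compute its scores differently (for example via `sklearn.metrics`):

```python
import math
from typing import Dict, Sequence


def regression_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, RMSE and R² for two equal-length sequences."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R² is undefined for a constant target; report 0.0 in that case
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Toy CPU utilization values, made up for the example
scores = regression_metrics([0.40, 0.55, 0.70, 0.60], [0.45, 0.50, 0.65, 0.60])
print({name: round(value, 4) for name, value in scores.items()})
```

Scores computed on the training rows themselves tend to be optimistic; holding out the most recent hours gives a more honest estimate for a forecasting model.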

### 3. Forecast Resource Demand

In [None]:
def forecast_demand(model, current_data: pd.DataFrame, horizon: int = 24) -> Dict[str, Any]:
    """
    Forecast resource demand using trained model.
    
    Args:
        model: Trained model (sklearn Pipeline or PredictiveAnalytics)
        current_data: Current resource usage data
        horizon: Forecast horizon in hours
    
    Returns:
        Demand forecast
    """
    try:
        # Check if model is PredictiveAnalytics wrapper
        if hasattr(model, 'predict') and hasattr(model, 'is_trained'):
            # Use PredictiveAnalytics wrapper
            predictions = model.predict(current_data)
            
            # Extract forecasts
            cpu_forecast = np.array(predictions.get('cpu_usage', {}).get('forecast', []))
            memory_forecast = np.array(predictions.get('memory_usage', {}).get('forecast', []))
            
            forecast = {
                'timestamp': datetime.now().isoformat(),
                'horizon_hours': len(cpu_forecast),
                'cpu_forecast': cpu_forecast.tolist(),
                'memory_forecast': memory_forecast.tolist(),
                'peak_cpu': float(np.max(cpu_forecast)) if len(cpu_forecast) > 0 else 0.0,
                'peak_memory': float(np.max(memory_forecast)) if len(memory_forecast) > 0 else 0.0,
                'avg_cpu': float(np.mean(cpu_forecast)) if len(cpu_forecast) > 0 else 0.0,
                'avg_memory': float(np.mean(memory_forecast)) if len(memory_forecast) > 0 else 0.0
            }
        else:
            # Use sklearn Pipeline (original logic)
            future_timestamps = [datetime.now() + timedelta(hours=i) for i in range(1, horizon + 1)]
            
            # Create features for prediction
            future_features = []
            last_cpu_mean = current_data['cpu_usage'].tail(24).mean()
            last_memory_mean = current_data['memory_usage'].tail(24).mean()
            
            for ts in future_timestamps:
                future_features.append([
                    ts.hour,
                    ts.weekday(),
                    last_cpu_mean,
                    last_memory_mean
                ])
            
            X_future = np.array(future_features)
            predictions = model.predict(X_future)
            
            cpu_forecast = predictions[:, 0]
            memory_forecast = predictions[:, 1]
            
            forecast = {
                'timestamp': datetime.now().isoformat(),
                'horizon_hours': horizon,
                'cpu_forecast': cpu_forecast.tolist(),
                'memory_forecast': memory_forecast.tolist(),
                'peak_cpu': float(np.max(cpu_forecast)),
                'peak_memory': float(np.max(memory_forecast)),
                'avg_cpu': float(np.mean(cpu_forecast)),
                'avg_memory': float(np.mean(memory_forecast))
            }
        
        logger.info(f"Demand forecast: Peak CPU {forecast['peak_cpu']:.1%}, Peak Memory {forecast['peak_memory']:.1%}")
        return forecast
    except Exception as e:
        logger.error(f"Demand forecasting error: {e}")
        return {'error': str(e)}

# Generate forecast
forecast = forecast_demand(predictive_model, historical_data, FORECAST_HORIZON)
print(json.dumps({k: v for k, v in forecast.items() if k not in ['cpu_forecast', 'memory_forecast']}, indent=2, default=str))

### 4. Trigger Predictive Scaling

In [None]:
def trigger_predictive_scaling(forecast: Dict[str, Any], current_replicas: int) -> Dict[str, Any]:
    """
    Trigger predictive scaling based on forecast.
    
    Args:
        forecast: Demand forecast
        current_replicas: Current number of replicas
    
    Returns:
        Scaling decision
    """
    try:
        peak_cpu = forecast.get('peak_cpu', 0)
        
        # Calculate required replicas
        if peak_cpu > SCALING_THRESHOLD:
            # Scale up in proportion to the forecast overshoot; round up so that
            # a fractional requirement still adds a replica
            required_replicas = int(np.ceil(current_replicas * (peak_cpu / SCALING_THRESHOLD)))
            scaling_action = 'scale_up'
        elif peak_cpu < 0.5:
            # Scale down: reduce by 20%
            required_replicas = max(1, int(current_replicas * 0.8))
            scaling_action = 'scale_down'
        else:
            required_replicas = current_replicas
            scaling_action = 'no_change'
        
        scaling_decision = {
            'timestamp': datetime.now().isoformat(),
            'current_replicas': current_replicas,
            'required_replicas': required_replicas,
            'scaling_action': scaling_action,
            'peak_cpu_forecast': peak_cpu,
            'scaling_triggered': required_replicas != current_replicas,
            'estimated_cost_savings': (current_replicas - required_replicas) * 100 if scaling_action == 'scale_down' else 0
        }
        
        logger.info(f"Scaling decision: {scaling_action} ({current_replicas} -> {required_replicas})")
        return scaling_decision
    except Exception as e:
        logger.error(f"Predictive scaling error: {e}")
        return {'error': str(e)}

# Test scaling decision
scaling_decision = trigger_predictive_scaling(forecast, current_replicas=5)
print(json.dumps(scaling_decision, indent=2, default=str))
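
The scale-up branch is a proportional rule of the same shape as the Kubernetes HPA formula, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. The rounding direction matters in this rule: truncating the result can report a scale-up without adding a replica. A small sketch, with `min_replicas` and `max_replicas` as illustrative bounds that are not part of the notebook:

```python
import math


def desired_replicas(current: int, peak_utilization: float, target_utilization: float,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Proportional replica count, rounded up and clamped to [min_replicas, max_replicas]."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    raw = current * peak_utilization / target_utilization
    return max(min_replicas, min(max_replicas, math.ceil(raw)))


# 5 replicas, forecast peak of 85% against an 80% target (raw value 5.3125)
print(desired_replicas(5, 0.85, 0.80))  # rounded up
print(int(5 * 0.85 / 0.80))             # truncated
```

Clamping to an upper bound keeps a single outlier forecast from requesting an unbounded number of replicas.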

### 5. Capacity Planning

In [None]:
def plan_capacity(forecast: Dict[str, Any], current_capacity: Dict[str, float]) -> Dict[str, Any]:
    """
    Plan capacity requirements based on forecast.
    
    Args:
        forecast: Demand forecast
        current_capacity: Current capacity allocation
    
    Returns:
        Capacity plan
    """
    try:
        peak_cpu = forecast.get('peak_cpu', 0)
        peak_memory = forecast.get('peak_memory', 0)
        
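        # Forecast peaks are utilization fractions (0-1); convert them to absolute
        # units against current capacity, then add 20% headroom
        required_cpu = peak_cpu * current_capacity.get('cpu', 0) * 1.2
        required_memory = peak_memory * current_capacity.get('memory', 0) * 1.2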
        
        capacity_plan = {
            'timestamp': datetime.now().isoformat(),
            'current_cpu_capacity': current_capacity.get('cpu', 0),
            'required_cpu_capacity': required_cpu,
            'cpu_headroom': required_cpu - current_capacity.get('cpu', 0),
            'current_memory_capacity': current_capacity.get('memory', 0),
            'required_memory_capacity': required_memory,
            'memory_headroom': required_memory - current_capacity.get('memory', 0),
            'capacity_sufficient': (required_cpu <= current_capacity.get('cpu', 0) and 
                                   required_memory <= current_capacity.get('memory', 0)),
            'recommendations': []
        }
        
        # Generate recommendations
        if capacity_plan['cpu_headroom'] > 0:
            capacity_plan['recommendations'].append(
                f"Add {capacity_plan['cpu_headroom']:.1f} CPU cores"
            )
        if capacity_plan['memory_headroom'] > 0:
            capacity_plan['recommendations'].append(
                f"Add {capacity_plan['memory_headroom']:.1f}GB memory"
            )
        
        logger.info(f"Capacity plan: {len(capacity_plan['recommendations'])} recommendations")
        return capacity_plan
    except Exception as e:
        logger.error(f"Capacity planning error: {e}")
        return {'error': str(e)}

# Test capacity planning
current_capacity = {'cpu': 16, 'memory': 64}
capacity_plan = plan_capacity(forecast, current_capacity)
print(json.dumps(capacity_plan, indent=2, default=str))
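
The forecast peaks are utilization fractions between 0 and 1, while `current_capacity` is expressed in cores and GB, so the two need a common unit before they can be compared. One possible conversion, sketched under the assumption that utilization is measured against the current capacity (the helper name `required_capacity` and the sample peaks are hypothetical):

```python
def required_capacity(peak_utilization: float, current_capacity: float,
                      headroom: float = 0.20) -> float:
    """Absolute capacity needed to serve the forecast peak with spare headroom.

    Assumes `peak_utilization` is a fraction of `current_capacity`.
    """
    if peak_utilization < 0:
        raise ValueError("peak_utilization must be non-negative")
    return peak_utilization * current_capacity * (1.0 + headroom)


current = {"cpu": 16, "memory": 64}      # cores, GB (values from the test cell)
peaks = {"cpu": 0.75, "memory": 0.90}    # illustrative forecast peaks
for resource, peak in peaks.items():
    needed = required_capacity(peak, current[resource])
    shortfall = max(0.0, needed - current[resource])
    print(f"{resource}: need {needed:.1f}, shortfall {shortfall:.1f}")
```

Under this convention the current allocation is sufficient whenever the forecast peak stays at or below `1 / (1 + headroom)`, which is about 83% for 20% headroom.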

### 6. Track Scaling History

In [None]:
# Create a scaling tracking dataframe (randomly generated placeholder values for demonstration)
scaling_tracking = pd.DataFrame([
    {
        'timestamp': datetime.now() - timedelta(hours=i),
        'current_replicas': np.random.randint(3, 10),
        'target_replicas': np.random.randint(3, 10),
        'scaling_action': np.random.choice(['scale_up', 'scale_down', 'no_change']),
        'forecast_accuracy': np.random.uniform(0.75, 0.95),
        'cost_savings': np.random.uniform(0, 500),
        'performance_impact': np.random.choice(['positive', 'neutral', 'negative'])
    }
    for i in range(168)  # 7 days of data
])

# Save tracking data
tracking_file = PROCESSED_DIR / 'predictive_scaling_tracking.parquet'
scaling_tracking.to_parquet(tracking_file)

logger.info(f"Saved predictive scaling tracking data")
print(f"Tracking data saved to: {tracking_file}")
print(f"Records: {len(scaling_tracking)}")
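
The tracking frame above is filled with random placeholder values. Once real forecasts and observations are available, a row could instead be derived from them. A sketch, assuming accuracy is defined as one minus the absolute percentage error; this definition and the `tracking_record` helper are illustrative and not taken from the notebook:

```python
from datetime import datetime
from typing import Any, Dict


def tracking_record(predicted_peak: float, observed_peak: float,
                    current_replicas: int, target_replicas: int) -> Dict[str, Any]:
    """Build one tracking row from a forecast and the value later observed."""
    if observed_peak <= 0:
        raise ValueError("observed_peak must be positive")
    ape = abs(predicted_peak - observed_peak) / observed_peak
    if target_replicas > current_replicas:
        action = "scale_up"
    elif target_replicas < current_replicas:
        action = "scale_down"
    else:
        action = "no_change"
    return {
        "timestamp": datetime.now().isoformat(),
        "current_replicas": current_replicas,
        "target_replicas": target_replicas,
        "scaling_action": action,
        # Clamp at 0 so large misses do not produce negative accuracy
        "forecast_accuracy": max(0.0, 1.0 - ape),
    }


row = tracking_record(predicted_peak=0.72, observed_peak=0.80,
                      current_replicas=5, target_replicas=5)
print(row["scaling_action"], round(row["forecast_accuracy"], 2))
```

Rows built this way can be collected into a list and passed to `pd.DataFrame` exactly as the random rows are.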

## Validation Section

In [None]:
# Verify outputs
assert tracking_file.exists(), "Scaling tracking file not created"
assert 'peak_cpu' in forecast, "No CPU forecast"
assert 'scaling_action' in scaling_decision, "No scaling decision"
assert len(save_results['saved_to']) > 0, "Model not saved to any location"

# Check if model was saved to PVC (critical for KServe)
pvc_model_dir = PVC_MODELS_DIR / MODEL_NAME if PVC_MODELS_DIR.exists() else None
pvc_model_exists = (pvc_model_dir / 'model.pkl').exists() if pvc_model_dir else False

local_model_dir = LOCAL_MODELS_DIR / MODEL_NAME
local_model_exists = (local_model_dir / 'model.pkl').exists()

kserve_restarted = save_results.get('kserve_restarted', False)

avg_forecast_accuracy = scaling_tracking['forecast_accuracy'].mean()
total_cost_savings = scaling_tracking['cost_savings'].sum()
scale_up_count = (scaling_tracking['scaling_action'] == 'scale_up').sum()

logger.info(f"✅ All validations passed")
print(f"\n{'='*60}")
print(f"Predictive Scaling & Capacity Planning Summary")
print(f"{'='*60}")
print(f"\n📊 Training Data:")
print(f"   Historical Records: {len(historical_data)}")
print(f"   Features: {len(feature_cols)}")
print(f"\n🤖 Model Status:")
print(f"   Model Type: {'PredictiveAnalytics module' if USING_MODULE else 'sklearn Pipeline with MultiOutputRegressor'}")
print(f"   PVC Model: {'✅ Saved' if pvc_model_exists else '❌ Not saved (PVC not mounted)'}")
print(f"   Local Model: {'✅ Saved' if local_model_exists else '❌ Not saved'}")
print(f"   KServe Predictor Restarted: {'✅ Yes' if kserve_restarted else '⚠️ No (deployment may not exist yet)'}")
print(f"   KServe Compatible: {'✅ Yes' if save_results.get('kserve_compatible') else '❌ No'}")
print(f"\n📈 Forecast Results:")
print(f"   Forecast Horizon: {forecast.get('horizon_hours', FORECAST_HORIZON)} hours")
print(f"   Peak CPU: {forecast['peak_cpu']:.1%}")
print(f"   Peak Memory: {forecast['peak_memory']:.1%}")
print(f"\n⚡ Scaling Decision:")
print(f"   Action: {scaling_decision['scaling_action']}")
print(f"   Replicas: {scaling_decision['current_replicas']} -> {scaling_decision['required_replicas']}")
print(f"\n📉 Historical Stats (synthetic tracking data):")
print(f"   Tracking Records: {len(scaling_tracking)}")
print(f"   Average Forecast Accuracy: {avg_forecast_accuracy:.1%}")
print(f"   Total Cost Savings: ${total_cost_savings:.0f}")
print(f"   Scale-Up Events: {scale_up_count}")

## Integration Section

This notebook integrates with:
- **Input**: Historical resource usage and forecasts
- **Output**: 
  - Trained sklearn model saved to PVC (`/mnt/models/predictive-analytics/model.pkl`)
  - Model served by `predictive-analytics` InferenceService
  - Scaling decisions and capacity plans
- **Monitoring**: Forecast accuracy and cost savings
- **Next**: Security incident response automation

### Model Serving

The trained model is automatically available to the `predictive-analytics` InferenceService:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: predictive-analytics
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "pvc://model-storage-pvc/predictive-analytics"
```
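
For reference, the KServe V1 inference protocol expects a JSON body of the form `{"instances": [...]}` posted to the `:predict` endpoint, and returns results under a `predictions` key. The sketch below only builds such a body for the four-feature sklearn pipeline (`hour_of_day`, `day_of_week`, `cpu_rolling_mean`, `memory_rolling_mean`); it does not send it. The helper name, the start time and the sample means are assumptions, and a model saved by the `PredictiveAnalytics` module may expect a different feature layout.

```python
import json
from datetime import datetime, timedelta
from typing import Any, Dict, List


def build_predict_payload(start: datetime, horizon: int,
                          cpu_mean: float, memory_mean: float) -> Dict[str, List[List[Any]]]:
    """Build a V1-protocol request body with one feature row per future hour."""
    instances = []
    for offset in range(1, horizon + 1):
        ts = start + timedelta(hours=offset)
        # Feature order mirrors feature_cols in the training cell
        instances.append([ts.hour, ts.weekday(), cpu_mean, memory_mean])
    return {"instances": instances}


payload = build_predict_payload(datetime(2024, 1, 1, 0, 0), horizon=24,
                                cpu_mean=0.52, memory_mean=0.58)
body = json.dumps(payload)
# POST `body` to <inference-host>/v1/models/predictive-analytics:predict
print(len(payload["instances"]), payload["instances"][0])
```

Each prediction row returned for this pipeline would hold two values, CPU first and memory second, matching the column order used for `y` during training.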

## Next Steps

1. ✅ Model saved to PVC for KServe
2. Verify `predictive-analytics` InferenceService is ready
3. Proceed to `security-incident-response-automation.ipynb`
4. Implement security incident handling
5. Automate response procedures

## References

- ADR-003: Self-Healing Platform Architecture
- ADR-012: Notebook Architecture for End-to-End Workflows
- [Kubernetes HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
- [Capacity Planning](https://en.wikipedia.org/wiki/Capacity_planning)