# Predictive Analytics Model for KServe

## Overview
This notebook trains and saves a **Predictive Analytics model** in KServe-compatible format for the `predictive-analytics` InferenceService.

**Model Type**: Time series forecasting with Random Forest (the exported pipeline uses XGBoost when it is installed)  
**Purpose**: Predict future resource usage (CPU, memory, disk, network)  
**Deployment**: KServe InferenceService with sklearn runtime

## KServe Integration (Issue #13 Fix)

This notebook implements the fix for [Issue #13](https://github.com/tosin2013/openshift-aiops-platform/issues/13) where models were registering as `"model"` instead of `"predictive-analytics"`.

### Problem Solved
- **Before**: Models saved to `/mnt/models/cpu_usage_step_0_model.pkl` (flat structure)
- **After**: Models saved to `/mnt/models/predictive-analytics/model.pkl` (KServe structure)
- **Result**: Model registers correctly as `"predictive-analytics"` ✅

### Architecture
```
Notebook Training → /mnt/models/predictive-analytics/model.pkl
                    ↓
KServe InferenceService (storageUri: pvc://model-storage-pvc/predictive-analytics)
                    ↓
Model registered as: "predictive-analytics"
                    ↓
Endpoint: /v1/models/predictive-analytics:predict
```
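
The layout in the diagram can be checked before the InferenceService is deployed. The following is a minimal sketch, assuming the PVC root is mounted locally; `kserve_model_path` and `check_layout` are hypothetical helpers written for illustration and are not part of KServe or of this repository.

```python
from pathlib import Path
import tempfile

def kserve_model_path(base_dir, model_name, filename="model.pkl"):
    """Return the path this notebook saves to: <base>/<model_name>/<filename>."""
    return Path(base_dir) / model_name / filename

def check_layout(base_dir, model_name):
    """Report whether the model sits in its own subdirectory rather than at the root."""
    expected = kserve_model_path(base_dir, model_name)
    # Any .pkl directly under the base directory indicates the old flat structure
    flat_files = sorted(p.name for p in Path(base_dir).glob("*.pkl"))
    return {
        "expected_path": str(expected),
        "found": expected.is_file(),
        "flat_pkl_files": flat_files,
    }

# Demonstration against a temporary directory standing in for /mnt/models
with tempfile.TemporaryDirectory() as tmp:
    target = kserve_model_path(tmp, "predictive-analytics")
    target.parent.mkdir(parents=True)
    target.write_bytes(b"placeholder")
    report = check_layout(tmp, "predictive-analytics")
    print(report["found"], report["flat_pkl_files"])
```

In the cluster, the same check could be pointed at `/mnt/models` after the saving step has run.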

## Prerequisites
- Model storage PVC mounted at `/mnt/models`
- Python environment with sklearn, pandas, numpy
- Access to `src/models/predictive_analytics.py` module

## What This Notebook Does
1. ✅ Imports the `PredictiveAnalytics` module
2. ✅ Generates synthetic time series training data
3. ✅ Trains multi-metric forecasting models (CPU, memory, disk, network)
4. ✅ Saves in KServe-compatible format: `/mnt/models/predictive-analytics/model.pkl`
5. ✅ Validates the model works correctly
6. ✅ Tests prediction endpoint format

## Setup Section

### Import Libraries and Configure Environment

In [None]:
# Import required libraries
import sys
import os
from pathlib import Path

# Setup path for src/models module - works from any directory
def find_models_path():
    """Find src/models path regardless of current working directory"""
    possible_paths = [
        # dir() inside a function lists only local names, so check module globals instead
        Path(__file__).parent.parent.parent / 'src' / 'models' if '__file__' in globals() else None,
        Path.cwd().parent.parent / 'src' / 'models',
        Path('/workspace/repo/src/models'),
        Path('/opt/app-root/src/openshift-aiops-platform/src/models'),
    ]
    for p in possible_paths:
        if p and p.exists() and (p / 'predictive_analytics.py').exists():
            return str(p)
    # Try relative path search
    current = Path.cwd()
    for _ in range(5):
        models_path = current / 'src' / 'models'
        if models_path.exists() and (models_path / 'predictive_analytics.py').exists():
            return str(models_path)
        current = current.parent
    return None

models_path = find_models_path()
if models_path:
    sys.path.insert(0, models_path)
    print(f"✅ Models path found: {models_path}")
else:
    print("⚠️ Models path not found - the PredictiveAnalytics import below may fail")

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import joblib
import warnings
warnings.filterwarnings('ignore')

# Import PredictiveAnalytics module
try:
    from predictive_analytics import PredictiveAnalytics, generate_sample_timeseries_data
    print("✅ PredictiveAnalytics module imported successfully")
    USING_MODULE = True
except ImportError as e:
    print(f"❌ Failed to import PredictiveAnalytics module: {e}")
    print("   Please ensure src/models/predictive_analytics.py is available")
    USING_MODULE = False

# Set visualization style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("\n✅ All libraries imported successfully")

### Configure Model Storage Paths

In [None]:
# Configure storage paths
# Use /mnt/models for persistent storage (model-storage-pvc)
# Fallback to local for development outside cluster
MODELS_DIR = Path('/mnt/models') if Path('/mnt/models').exists() else Path('/opt/app-root/src/models')
MODELS_DIR.mkdir(parents=True, exist_ok=True)

# Model name must match InferenceService name
MODEL_NAME = 'predictive-analytics'
MODEL_DIR = MODELS_DIR / MODEL_NAME  # Created in the saving step below

print(f"📁 Model Storage Configuration:")
print(f"   Base directory: {MODELS_DIR}")
print(f"   Model name: {MODEL_NAME}")
print(f"   Expected KServe path: {MODEL_DIR}/model.pkl")
print(f"   PVC available: {'✅ Yes' if MODELS_DIR == Path('/mnt/models') else '⚠️ No (using local)'}")

if not USING_MODULE:
    print("\n❌ Cannot proceed without PredictiveAnalytics module")
    raise ImportError("PredictiveAnalytics module required for this notebook")

In [None]:
# ====================
# Data Source Configuration (ADR-050, ADR-052)
# ====================
import requests

DATA_SOURCE = os.getenv('DATA_SOURCE', 'synthetic')  # synthetic|prometheus|hybrid
PROMETHEUS_URL = os.getenv('PROMETHEUS_URL', 'http://prometheus-k8s.openshift-monitoring.svc:9090')
TRAINING_DAYS = int(os.getenv('TRAINING_DAYS', '30'))  # 30-day lookback for predictive analytics
TRAINING_HOURS = int(os.getenv('TRAINING_HOURS', str(TRAINING_DAYS * 24)))
PROMETHEUS_AVAILABLE = False

# Check Prometheus availability
if DATA_SOURCE in ['prometheus', 'hybrid']:
    try:
        response = requests.get(f"{PROMETHEUS_URL}/api/v1/status/config", timeout=5)
        PROMETHEUS_AVAILABLE = response.status_code == 200
        if PROMETHEUS_AVAILABLE:
            print(f"✅ Prometheus available at {PROMETHEUS_URL}")
        else:
            # A non-200 reply (for example 401/403 without a token) counts as unavailable
            print(f"⚠️ Prometheus returned HTTP {response.status_code}")
    except Exception as e:
        print(f"⚠️ Prometheus not available: {e}")

    if not PROMETHEUS_AVAILABLE:
        print("   Falling back to synthetic data")
        DATA_SOURCE = 'synthetic'

print(f"\n📊 Data Source Configuration:")
print(f"   Mode: {DATA_SOURCE}")
print(f"   Training hours: {TRAINING_HOURS}h ({TRAINING_HOURS / 24:.1f} days)")
print(f"   Prometheus: {'✅ Available' if PROMETHEUS_AVAILABLE else '❌ Unavailable'}")


# ====================
# Prometheus Data Fetching Functions
# ====================

def fetch_prometheus_timeseries(metric_query, lookback_hours=720):
    """
    Fetch time series data from Prometheus
    
    Args:
        metric_query: PromQL query string
        lookback_hours: Time window in hours (default: 720 = 30 days)
    
    Returns:
        pandas DataFrame with timestamp and value columns
    """
    end_time = datetime.now()
    start_time = end_time - timedelta(hours=lookback_hours)
    
    params = {
        'query': metric_query,
        'start': int(start_time.timestamp()),
        'end': int(end_time.timestamp()),
        'step': '5m'  # 5-minute intervals
    }
    
    try:
        response = requests.get(
            f'{PROMETHEUS_URL}/api/v1/query_range', 
            params=params,
            timeout=30
        )
        response.raise_for_status()
        
        result = response.json()
        if result['status'] != 'success':
            raise ValueError(f"Prometheus query failed: {result}")
        
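        # Parse results (a query can return several series, e.g. one per node)
        timestamps = []
        values = []
        for series in result['data']['result']:
            for timestamp, value in series['values']:
                timestamps.append(pd.to_datetime(timestamp, unit='s'))
                values.append(float(value))
        
        df = pd.DataFrame({
            'timestamp': timestamps,
            'value': values
        })
        
        if len(df) > 0:
            # Average across series so each timestamp appears once; duplicate
            # timestamps would multiply rows in the later merge on 'timestamp'
            df = df.groupby('timestamp', as_index=False)['value'].mean()
            df = df.sort_values('timestamp').reset_index(drop=True)
        
        return df
    
    except Exception as e:
        print(f"⚠️  Failed to fetch Prometheus data: {e}")
        return pd.DataFrame(columns=['timestamp', 'value'])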


def fetch_prometheus_metrics_for_prediction(lookback_hours=720):
    """
    Fetch all metrics needed for predictive analytics from Prometheus
    
    Args:
        lookback_hours: Time window in hours
    
    Returns:
        pandas DataFrame with timestamp, cpu_usage, memory_usage, disk_usage, network_in, network_out
    """
    print(f"🔍 Fetching metrics from Prometheus (lookback: {lookback_hours}h)...")
    
    # Prometheus query mappings
    metric_queries = {
        'cpu_usage': 'instance:node_cpu:ratio',
        'memory_usage': 'instance:node_memory_utilisation:ratio',
        'disk_usage': 'instance:node_filesystem_usage:ratio',
        'network_in': 'instance:node_network_receive_bytes:rate1m',
        'network_out': 'instance:node_network_transmit_bytes:rate1m'
    }
    
    # Fetch each metric
    metric_dfs = {}
    for metric_name, query in metric_queries.items():
        print(f"  📊 Fetching {metric_name}...")
        df = fetch_prometheus_timeseries(query, lookback_hours)
        
        if len(df) > 0:
            metric_dfs[metric_name] = df
            print(f"    ✅ {len(df)} data points")
        else:
            print(f"    ⚠️  No data available")
    
    # If no metrics fetched, return empty DataFrame
    if not metric_dfs:
        print(f"❌ No metrics fetched from Prometheus")
        return pd.DataFrame()
    
    # Merge all metrics on timestamp
    print(f"\n🔧 Merging metrics...")
    combined_df = None
    
    for metric_name, df in metric_dfs.items():
        df = df.rename(columns={'value': metric_name})
        if combined_df is None:
            combined_df = df
        else:
            combined_df = combined_df.merge(df, on='timestamp', how='outer')
    
    # Sort by timestamp, then forward-fill and back-fill missing values
    combined_df = combined_df.sort_values('timestamp').reset_index(drop=True)
    combined_df = combined_df.ffill().bfill()
    
    # Fill any remaining NaN with 0
    combined_df = combined_df.fillna(0)
    
    print(f"✅ Combined dataset: {len(combined_df)} samples with {len(metric_dfs)} metrics")
    
    return combined_df


print("✅ Data fetching functions configured")
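
The parsing step in `fetch_prometheus_timeseries` depends on the shape of a `query_range` reply: a list of series, each with `[timestamp, "value"]` pairs. The sketch below exercises that shape offline with a hand-written payload. `matrix_to_frame` is a hypothetical helper, and averaging across series is an assumption about the desired aggregation, not something Prometheus does for you.

```python
import pandas as pd

def matrix_to_frame(payload):
    """Flatten a query_range 'matrix' result into one averaged row per timestamp."""
    rows = [
        (float(ts), float(val))  # Prometheus encodes sample values as strings
        for series in payload["data"]["result"]
        for ts, val in series["values"]
    ]
    df = pd.DataFrame(rows, columns=["timestamp", "value"])
    if df.empty:
        return df
    df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")
    # Two nodes reporting at the same instant collapse into a single row
    return df.groupby("timestamp", as_index=False)["value"].mean()

# Hand-written payload with two series (two hypothetical nodes)
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"instance": "node-a"},
             "values": [[1700000000, "0.25"], [1700000300, "0.5"]]},
            {"metric": {"instance": "node-b"},
             "values": [[1700000000, "0.75"], [1700000300, "1.0"]]},
        ],
    },
}

frame = matrix_to_frame(sample_payload)
print(frame)
```

Without the aggregation step, each timestamp would appear once per node, and merging several such metrics on `timestamp` would multiply the row count.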

In [None]:
# Generate training data based on data source configuration
print("📊 Generating training data...")
print(f"   Mode: {DATA_SOURCE}")
print(f"   Time window: {TRAINING_HOURS}h ({TRAINING_HOURS / 24:.1f} days)")

if DATA_SOURCE == 'prometheus' and PROMETHEUS_AVAILABLE:
    # Fetch data from Prometheus
    print("\n🔍 Fetching data from Prometheus...")
    sample_data = fetch_prometheus_metrics_for_prediction(lookback_hours=TRAINING_HOURS)
    
    if len(sample_data) == 0:
        print("⚠️  No Prometheus data available, falling back to synthetic")
        n_samples = int(TRAINING_HOURS * 12)  # 5-min intervals
        sample_data = generate_sample_timeseries_data(n_samples=n_samples)
    else:
        print(f"✅ Using Prometheus data: {len(sample_data)} samples")

elif DATA_SOURCE == 'hybrid' and PROMETHEUS_AVAILABLE:
    # Mix Prometheus and synthetic data
    print("\n🔍 Creating hybrid dataset (50% Prometheus, 50% synthetic)...")
    prom_data = fetch_prometheus_metrics_for_prediction(lookback_hours=TRAINING_HOURS)
    
    if len(prom_data) > 0:
        # Generate synthetic data to match Prometheus size
        synthetic_data = generate_sample_timeseries_data(n_samples=len(prom_data))
        
        # Combine datasets
        sample_data = pd.concat([prom_data, synthetic_data], ignore_index=True)
        sample_data = sample_data.sort_values('timestamp').reset_index(drop=True)
        print(f"✅ Combined: {len(prom_data)} Prometheus + {len(synthetic_data)} synthetic = {len(sample_data)} total")
    else:
        print("⚠️  No Prometheus data available, using 100% synthetic")
        n_samples = int(TRAINING_HOURS * 12)
        sample_data = generate_sample_timeseries_data(n_samples=n_samples)

else:
    # Pure synthetic data
    print("\n📊 Generating synthetic time series data...")
    print("   This simulates realistic infrastructure metrics with patterns:")
    print("   - Daily cycles (higher during business hours)")
    print("   - Weekly patterns (weekday vs weekend)")
    print("   - Trends (gradual growth over time)")
    print("   - Noise (random variations)")
    
    # Calculate sample count based on training hours (5-minute intervals)
    n_samples = int(TRAINING_HOURS * 12)
    sample_data = generate_sample_timeseries_data(n_samples=n_samples)
    print(f"✅ Generated {len(sample_data)} synthetic samples")

print(f"\n✅ Training data prepared:")
print(f"   Samples: {len(sample_data)}")
print(f"   Columns: {', '.join(sample_data.columns)}")
print(f"   Shape: {sample_data.shape}")
print(f"   Date range: {sample_data['timestamp'].min()} to {sample_data['timestamp'].max()}")

# Display sample data
print("\n📋 Sample data (first 5 rows):")
display(sample_data.head())

# Show statistics
print("\n📈 Data Statistics:")
display(sample_data.describe())
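
The synthetic mode combines a daily cycle, a weekday/weekend offset, a slow trend, and noise. The sketch below shows one way those four components could be composed for a single metric. It is not the implementation of `generate_sample_timeseries_data`; the function name `make_synthetic_cpu` and every amplitude are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def make_synthetic_cpu(n_samples=288, freq="5min", seed=0):
    """Build a toy CPU-usage series from daily, weekly, trend, and noise terms."""
    rng = np.random.default_rng(seed)
    ts = pd.date_range("2024-01-01", periods=n_samples, freq=freq)
    hours = ts.hour.to_numpy() + ts.minute.to_numpy() / 60.0

    daily = 0.15 * np.sin(2 * np.pi * (hours - 8) / 24)          # peaks at 14:00
    weekly = np.where(ts.dayofweek.to_numpy() < 5, 0.05, -0.05)  # weekdays run hotter
    trend = np.linspace(0.0, 0.05, n_samples)                    # gradual growth
    noise = rng.normal(0.0, 0.02, n_samples)                     # random variation

    cpu = np.clip(0.4 + daily + weekly + trend + noise, 0.0, 1.0)
    return pd.DataFrame({"timestamp": ts, "cpu_usage": cpu})

demo = make_synthetic_cpu()
print(demo.head())
```

At 5-minute intervals, 288 samples cover one day, which matches the `TRAINING_HOURS * 12` sizing used in the cell above.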

In [None]:
# Legacy fallback: generate a fixed-size synthetic dataset only when the previous
# cell did not produce `sample_data`. Running this unconditionally would overwrite
# Prometheus or hybrid data with synthetic data.
if 'sample_data' not in globals() or len(sample_data) == 0:
    print("📊 Generating synthetic time series data...")

    # Generate 2000 samples (about 7 days at 5-minute intervals)
    sample_data = generate_sample_timeseries_data(n_samples=2000)

    print(f"\n✅ Generated {len(sample_data)} samples")
    print(f"   Columns: {', '.join(sample_data.columns)}")
    print(f"   Shape: {sample_data.shape}")
    print(f"   Date range: {sample_data['timestamp'].min()} to {sample_data['timestamp'].max()}")
else:
    print(f"✅ Reusing {len(sample_data)} samples from the data source cell above")

### Visualize Training Data

In [None]:
# Visualize the generated time series data
fig, axes = plt.subplots(3, 2, figsize=(15, 10))
fig.suptitle('Synthetic Time Series Training Data', fontsize=16, fontweight='bold')

metrics = ['cpu_usage', 'memory_usage', 'disk_usage', 'network_in', 'network_out']
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A', '#98D8C8']

for idx, (metric, color) in enumerate(zip(metrics, colors)):
    ax = axes[idx // 2, idx % 2]
    ax.plot(sample_data['timestamp'], sample_data[metric], color=color, alpha=0.7, linewidth=1)
    ax.set_title(f'{metric.replace("_", " ").title()}', fontweight='bold')
    ax.set_xlabel('Time')
    ax.set_ylabel('Value')
    ax.grid(True, alpha=0.3)
    ax.tick_params(axis='x', rotation=45)

# Hide the last subplot (we have 5 metrics in a 3x2 grid)
axes[2, 1].set_visible(False)

plt.tight_layout()
plt.show()

print("✅ Visualization complete")

### Split Data for Training and Validation

In [None]:
# Split data: 80% training, 20% validation
split_point = int(len(sample_data) * 0.8)

train_data = sample_data.iloc[:split_point].copy()
val_data = sample_data.iloc[split_point:].copy()

print(f"📊 Data Split:")
print(f"   Training samples: {len(train_data)} ({len(train_data)/len(sample_data)*100:.1f}%)")
print(f"   Validation samples: {len(val_data)} ({len(val_data)/len(sample_data)*100:.1f}%)")
print(f"   Training period: {train_data['timestamp'].min()} to {train_data['timestamp'].max()}")
print(f"   Validation period: {val_data['timestamp'].min()} to {val_data['timestamp'].max()}")

## Model Training Section

### Initialize and Train PredictiveAnalytics Model

In [None]:
# Initialize PredictiveAnalytics model
print("🔬 Initializing PredictiveAnalytics model...")

# Configure model parameters
FORECAST_HORIZON = 12  # Predict 12 time steps ahead
LOOKBACK_WINDOW = 24   # Use 24 historical time steps

predictor = PredictiveAnalytics(
    forecast_horizon=FORECAST_HORIZON,
    lookback_window=LOOKBACK_WINDOW
)

print(f"   Forecast horizon: {FORECAST_HORIZON} time steps")
print(f"   Lookback window: {LOOKBACK_WINDOW} time steps")
print(f"   Target metrics: {', '.join(predictor.target_metrics)}")

# Train the model
print(f"\n🎯 Training models on {len(train_data)} samples...")
print("   This will train separate models for each metric:")
print("   - CPU usage")
print("   - Memory usage")
print("   - Disk usage")
print("   - Network in")
print("   - Network out")

training_results = predictor.train(train_data)

print(f"\n✅ Training completed!")
print(f"   Models trained: {training_results['models_trained']}")
print(f"   Feature count: {training_results['feature_count']}")
print(f"   Forecast horizon: {training_results['forecast_horizon']}")
print(f"   Lookback window: {training_results['lookback_window']}")
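
A lookback window of 24 and a forecast horizon of 12 mean that each training example pairs 24 past values with the 12 values that follow them. The sketch below shows that windowing on a toy series. `make_windows` is a hypothetical helper; how `PredictiveAnalytics` builds its features internally is not shown in this notebook and may differ.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (past window, future window) training pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested window sizes")
    # Row i holds the values at positions i .. i+lookback-1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    # Row i holds the next `horizon` values after that window
    y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, y

X_demo, y_demo = make_windows(np.arange(40), lookback=24, horizon=12)
print(X_demo.shape, y_demo.shape)
```

A series of length 40 yields only 5 pairs here, which is why the number of usable training samples is always smaller than the raw row count.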

### Evaluate Model Performance

In [None]:
# Display detailed metrics for each model
print("📊 Model Performance Metrics:\n")
print("=" * 80)

for metric_name, results in training_results['metrics'].items():
    print(f"\n{metric_name.upper().replace('_', ' ')}:")
    print(f"  Mean Absolute Error (MAE):  {results['mae']:.4f}")
    print(f"  Root Mean Squared Error (RMSE): {results['rmse']:.4f}")
    print(f"  R² Score: {results['r2']:.4f}")
    print(f"  Training samples: {results['training_samples']}")
    print(f"  Test samples: {results['test_samples']}")
    print("  " + "-" * 60)

print("\n" + "=" * 80)

# Calculate average R² across all metrics
avg_r2 = np.mean([r['r2'] for r in training_results['metrics'].values()])
print(f"\n📈 Average R² Score: {avg_r2:.4f}")

if avg_r2 > 0.8:
    print("✅ Excellent model performance!")
elif avg_r2 > 0.6:
    print("✅ Good model performance")
else:
    print("⚠️ Model performance could be improved - consider more training data")
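
For reference, the three scores reported above follow the standard definitions of MAE, RMSE, and R². The sketch below computes them with NumPy on a tiny hand-made example; `regression_metrics` is an illustrative helper, not the function used inside `PredictiveAnalytics.train`.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R² from their textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))

    ss_res = float(np.sum(err ** 2))                        # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "rmse": rmse, "r2": r2}

# Three perfect predictions and one that is off by 2
demo_metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(demo_metrics)
```

A single large miss moves RMSE (1.0) more than MAE (0.5) and pulls R² down to about 0.2, so an average R² can hide one poorly predicted metric.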

## Model Validation Section

### Test Predictions on Validation Data

In [None]:
# Make predictions on validation data
print("🔮 Making predictions on validation data...")

predictions = predictor.predict(val_data.head(50))

print(f"\n✅ Predictions generated:")
print(f"   Timestamp: {predictions['timestamp']}")
print(f"   Metrics predicted: {len(predictions['predictions'])}")
print(f"   Lookback window used: {predictions['lookback_window']}")

# Display predictions for each metric
print("\n📊 Prediction Results:\n")
for metric_name, pred_data in predictions['predictions'].items():
    forecast = pred_data['forecast']
    confidence = pred_data.get('confidence', [0.5] * len(forecast))
    
    print(f"{metric_name.upper().replace('_', ' ')}:")
    print(f"  Forecast values (first 5): {[f'{v:.2f}' for v in forecast[:5]]}")
    print(f"  Confidence (avg): {np.mean(confidence):.2%}")
    print()

### Visualize Predictions vs Actual

In [None]:
# Visualize predictions
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
fig.suptitle('Predictions vs Actual Values (Validation Set)', fontsize=16, fontweight='bold')

# Select metrics to visualize
vis_metrics = ['cpu_usage', 'memory_usage', 'disk_usage', 'network_in']

for idx, metric in enumerate(vis_metrics):
    ax = axes[idx // 2, idx % 2]
    
    # Plot actual values
    actual_vals = val_data[metric].head(50).values
    ax.plot(range(len(actual_vals)), actual_vals, label='Actual', color='blue', alpha=0.6, linewidth=2)
    
    # Plot predictions (if available)
    if metric in predictions['predictions']:
        forecast = predictions['predictions'][metric]['forecast']
        # Start prediction from lookback_window position
        pred_start = predictions['lookback_window']
        ax.plot(range(pred_start, pred_start + len(forecast)), 
               forecast, label='Predicted', color='red', alpha=0.6, linewidth=2, linestyle='--')
    
    ax.set_title(f'{metric.replace("_", " ").title()}', fontweight='bold')
    ax.set_xlabel('Time Step')
    ax.set_ylabel('Value')
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("✅ Visualization complete")

## Model Saving Section

### Save Model in KServe-Compatible Format

This is the critical step that implements the **Issue #13 fix**!

In [None]:
# Save MULTI-OUTPUT model using sklearn's standard MultiOutputRegressor
# This ensures KServe sklearn server can load it without custom class dependencies
print("💾 Saving sklearn Pipeline + MultiOutputRegressor for KServe...\n")
print("=" * 80)
print("KSERVE SKLEARN PIPELINE (Standard sklearn classes only)")
print("=" * 80)
print(f"\n📂 Directory Structure:")
print(f"   Base: {MODELS_DIR}")
print(f"   Model subdirectory: {MODEL_NAME}/")
print(f"   Full path: {MODELS_DIR}/{MODEL_NAME}/model.pkl")

from pathlib import Path
import joblib
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Check if XGBoost is available
# Note: if XGBRegressor is used, the serving runtime also needs xgboost installed
# to unpickle the model; only the RandomForest path is pure sklearn
try:
    import xgboost as xgb
    XGBOOST_AVAILABLE = True
    print("\n✅ XGBoost available - using XGBRegressor")
except ImportError:
    from sklearn.ensemble import RandomForestRegressor
    XGBOOST_AVAILABLE = False
    print("\n⚠️ XGBoost not available - using RandomForestRegressor")

# ====================
# PREPARE TRAINING DATA FOR MULTI-OUTPUT MODEL
# ====================
print(f"\n🔧 Preparing training data for multi-output model...")

# Target columns in specific order (must match coordination engine expectations)
target_columns = ['cpu_usage', 'memory_usage', 'disk_usage', 'network_in', 'network_out']

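# Get feature columns (all numeric columns except targets and timestamp)
feature_columns = [col for col in train_data.columns 
                   if col not in target_columns + ['timestamp'] 
                   and train_data[col].dtype in ['float64', 'int64', 'float32', 'int32']]

# The pipeline cannot be fitted on zero features, so fail early with a clear message
if not feature_columns:
    raise ValueError("No numeric feature columns found besides the targets and 'timestamp'")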

print(f"   Features: {len(feature_columns)} columns")
print(f"   Targets: {target_columns}")

# Prepare X (features) and y (all 5 targets)
X_train_all = train_data[feature_columns].values
y_train_all = train_data[target_columns].values

print(f"   X_train shape: {X_train_all.shape}")
print(f"   y_train shape: {y_train_all.shape}")

# ====================
# CREATE SKLEARN PIPELINE WITH MULTIOUTPUTREGRESSOR
# ====================
print(f"\n🔧 Creating sklearn Pipeline with MultiOutputRegressor...")

# Create the base estimator
if XGBOOST_AVAILABLE:
    base_estimator = xgb.XGBRegressor(
        n_estimators=100,
        max_depth=10,
        learning_rate=0.1,
        tree_method='hist',  # Fast histogram method for CPU
        random_state=42,
        n_jobs=-1,
        verbosity=0  # Suppress XGBoost output
    )
    print("   Base estimator: XGBRegressor (tree_method='hist')")
else:
    base_estimator = RandomForestRegressor(
        n_estimators=100,
        max_depth=10,
        random_state=42,
        n_jobs=-1
    )
    print("   Base estimator: RandomForestRegressor")

# Create Pipeline: StandardScaler -> MultiOutputRegressor
# This is fully serializable and KServe sklearn server can load it!
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('regressor', MultiOutputRegressor(base_estimator))
])

print("   Pipeline steps:")
print("     1. StandardScaler (normalizes features)")
print("     2. MultiOutputRegressor (wraps base estimator for 5 outputs)")

# ====================
# TRAIN THE PIPELINE
# ====================
print(f"\n🎯 Training multi-output pipeline on {len(X_train_all)} samples...")
print("   This will train 5 separate regressors (one per target metric)")

pipeline.fit(X_train_all, y_train_all)

print("   ✅ Pipeline trained successfully!")

# Test predictions
test_input = X_train_all[:1]  # Use first sample for testing
test_output = pipeline.predict(test_input)
print(f"\n🧪 Test prediction:")
print(f"   Input shape: {test_input.shape}")
print(f"   Output shape: {test_output.shape}")
print(f"   Output values: {test_output[0]}")
print(f"   Output order: [cpu, memory, disk, network_in, network_out]")

# ====================
# SAVE THE PIPELINE
# ====================
kserve_dir = MODELS_DIR / MODEL_NAME
kserve_dir.mkdir(parents=True, exist_ok=True)

model_path = kserve_dir / 'model.pkl'
joblib.dump(pipeline, model_path)
print(f"\n✅ sklearn Pipeline saved to: {model_path}")

# Verify the saved model
expected_path = MODELS_DIR / MODEL_NAME / 'model.pkl'
print(f"\n🔍 Verifying saved model...")

if expected_path.exists():
    # Load and verify
    loaded_pipeline = joblib.load(expected_path)
    size_kb = expected_path.stat().st_size / 1024
    
    # Verify predict() works
    verify_output = loaded_pipeline.predict(test_input)
    
    print(f"\n✅ SUCCESS! sklearn Pipeline saved and verified:")
    print(f"   Location: {expected_path}")
    print(f"   Size: {size_kb:.2f} KB")
    print(f"   Type: {type(loaded_pipeline).__name__}")
    print(f"   Pipeline steps: {list(loaded_pipeline.named_steps.keys())}")
    print(f"   Has predict(): {hasattr(loaded_pipeline, 'predict')}")
    print(f"   Output shape: {verify_output.shape}")
    
    print(f"\n📡 KServe Response Format:")
    print(f"   Request: {{\"instances\": [[... {len(feature_columns)} features ...]]}}")
    print(f"   Response: {{\"predictions\": [[cpu, mem, disk, net_in, net_out]]}}")
    
    # predictions holds one row per instance, so index the row first
    print(f"\n🔗 Coordination Engine Integration (first instance):")
    print(f"   predictions[0][0] = cpu_usage")
    print(f"   predictions[0][1] = memory_usage")
    print(f"   predictions[0][2] = disk_usage")
    print(f"   predictions[0][3] = network_in")
    print(f"   predictions[0][4] = network_out")
    
    print(f"\n⚠️  IMPORTANT: Feature count = {len(feature_columns)}")
    print(f"   Coordination engine must send exactly {len(feature_columns)} features!")
    
else:
    print(f"\n❌ ERROR: Model not found at expected location: {expected_path}")

# Store feature count for later reference
FEATURE_COUNT = len(feature_columns)
print(f"\n📊 Model expects {FEATURE_COUNT} input features")
print("\n" + "=" * 80)


### Test Model Loading (Verification)

In [None]:
# Verify the sklearn Pipeline can be loaded back (as KServe sklearn server would)
print("🧪 Testing sklearn Pipeline loading (simulating KServe sklearn server)...")

import joblib

# Load the saved Pipeline (exactly as KServe sklearn server will do)
model_path = MODELS_DIR / MODEL_NAME / 'model.pkl'
loaded_pipeline = joblib.load(model_path)

print(f"\n✅ Pipeline loaded successfully!")
print(f"   Type: {type(loaded_pipeline).__name__}")
print(f"   Steps: {list(loaded_pipeline.named_steps.keys())}")

# Verify it's a standard sklearn Pipeline
assert type(loaded_pipeline).__name__ == 'Pipeline', "Model must be sklearn Pipeline!"
print(f"   ✅ Model is a standard sklearn Pipeline (KServe compatible)")

# Test prediction with validation data
X_val = val_data[feature_columns].values
y_val_pred = loaded_pipeline.predict(X_val[:10])

print(f"\n🧪 Test prediction on validation data:")
print(f"   Input shape: {X_val[:10].shape}")
print(f"   Output shape: {y_val_pred.shape}")
print(f"   First prediction: {y_val_pred[0]}")
print(f"   Output columns: [cpu, memory, disk, network_in, network_out]")

# Verify output shape is correct
assert y_val_pred.shape[1] == 5, f"Expected 5 outputs, got {y_val_pred.shape[1]}"
print(f"   ✅ Output has 5 columns (all metrics)")

print(f"\n🎉 sklearn Pipeline is ready for KServe deployment!")
print(f"   - No custom classes")
print(f"   - Standard sklearn serialization")
print(f"   - KServe sklearn server can load directly")

## Deployment Verification Section

### Generate KServe Test Commands

In [None]:
# Generate commands for testing the deployed model
print("📋 KServe Deployment Test Commands:\n")
print("=" * 80)
print("After deploying the InferenceService, run these commands to verify:\n")

print("# 1. Get the predictor pod IP")
print(f"PREDICTOR_IP=$(oc get pod -l serving.kserve.io/inferenceservice={MODEL_NAME} -o jsonpath='{{.items[0].status.podIP}}')")
print(f"echo \"Predictor IP: $PREDICTOR_IP\"\n")

print("# 2. List available models (should return 'predictive-analytics', not 'model')")
print("curl http://${PREDICTOR_IP}:8080/v1/models")
print(f"# Expected: {{\"models\":[\"{MODEL_NAME}\"]}}  ✅\n")

print("# 3. Check model status")
print(f"curl http://${{PREDICTOR_IP}}:8080/v1/models/{MODEL_NAME}")
print(f"# Expected: {{\"name\":\"{MODEL_NAME}\",\"ready\":true}}  ✅\n")

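print("# 4. Test prediction endpoint")
print(f"curl -X POST http://${{PREDICTOR_IP}}:8080/v1/models/{MODEL_NAME}:predict \\")
print("  -H 'Content-Type: application/json' \\")
# Build the payload from a real validation row so it carries FEATURE_COUNT values;
# a hard-coded 5-value instance only works if the model expects exactly 5 features
import json
sample_instance = [round(float(v), 4) for v in X_val[0]]
payload = json.dumps({"instances": [sample_instance]})
print(f"  -d '{payload}'")
print("# Expected: Prediction response with forecast values  ✅\n")

print("=" * 80)
print("\n✅ If all commands work, Issue #13 is fixed!")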

## Summary

### What Was Accomplished

✅ **Model trained** with multi-metric forecasting (CPU, memory, disk, network)  
✅ **Saved in KServe format**: `/mnt/models/predictive-analytics/model.pkl`  
✅ **Issue #13 fixed**: Model will register as `"predictive-analytics"` not `"model"`  
✅ **Validated**: Model can be loaded and makes predictions  
✅ **Ready for deployment**: Compatible with KServe InferenceService

### Next Steps

1. **Deploy InferenceService** (if not already deployed):
   ```yaml
   apiVersion: serving.kserve.io/v1beta1
   kind: InferenceService
   metadata:
     name: predictive-analytics
   spec:
     predictor:
       model:
         name: predictive-analytics
         runtime: sklearn-pvc-runtime
         storageUri: "pvc://model-storage-pvc/predictive-analytics"
   ```

2. **Verify deployment** using the commands above

3. **Test from coordination engine**:
   - Ensure coordination engine can call `/v1/models/predictive-analytics:predict`
   - Verify predictions work end-to-end

4. **Monitor predictions**:
   - Check prediction accuracy over time
   - Retrain periodically with new data
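
For the coordination engine test in step 3, the request and response bodies follow the `instances` / `predictions` format printed earlier in this notebook. The sketch below builds a request body and maps a response row back to metric names without touching the network. `build_request`, `parse_response`, and `METRIC_ORDER` are hypothetical names, and the canned response stands in for a real reply.

```python
import json

# Output order used when the pipeline was trained in this notebook
METRIC_ORDER = ["cpu_usage", "memory_usage", "disk_usage", "network_in", "network_out"]

def build_request(feature_rows):
    """Serialize feature rows into a JSON body with an 'instances' list."""
    return json.dumps({"instances": [[float(v) for v in row] for row in feature_rows]})

def parse_response(body):
    """Turn each row of 'predictions' into a {metric_name: value} dict."""
    payload = json.loads(body)
    named = []
    for row in payload["predictions"]:
        if len(row) != len(METRIC_ORDER):
            raise ValueError(f"expected {len(METRIC_ORDER)} outputs, got {len(row)}")
        named.append(dict(zip(METRIC_ORDER, row)))
    return named

# Offline round trip with a canned response in place of a real HTTP call
request_body = build_request([[0.1, 0.2, 0.3]])
canned_response = '{"predictions": [[0.52, 0.61, 0.40, 105.0, 82.5]]}'
results = parse_response(canned_response)
print(request_body)
print(results[0]["cpu_usage"], results[0]["network_out"])
```

The number of values per instance must equal the feature count reported in the saving step. The three-value row here is only a placeholder.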

### References

- **Issue**: [#13 - KServe model registration fix](https://github.com/tosin2013/openshift-aiops-platform/issues/13)
- **Module**: `src/models/predictive_analytics.py`
- **Training script**: `src/models/train_predictive_analytics.py`
- **Documentation**: `src/models/KSERVE_FIX_README.md`