# Models & Training - Drone Fault Detection

This notebook consolidates all model definitions and training pipelines:
- **Classical Models**: Random Forest for fault detection, type classification, and severity regression
- **Deep Learning MTL**: Multi-Task Learning 1D CNN that predicts fault presence, fault type, and severity from a shared backbone
- **Unsupervised Learning**: Isolation Forest for anomaly detection

---

## Table of Contents
1. **Setup & Imports**
2. **Model Definitions**
   - Classical Models (Random Forest), including their training function
   - Deep MTL (CNN 1D)
   - PyTorch Dataset for Time Windows
3. **Training Functions**
   - Unsupervised Training (Isolation Forest)
   - Deep MTL Training
4. **Evaluation Functions**
5. **Training Pipelines** (Main scripts as cells)
6. **Summary & Next Steps**

## 1. Setup & Imports

In [None]:
import os
import sys
from pathlib import Path
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from dataclasses import dataclass
from typing import Tuple, Dict

# Scikit-learn
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, IsolationForest
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score, f1_score, classification_report, confusion_matrix,
    mean_absolute_error, roc_auc_score
)
from joblib import dump, load

# PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from torch.optim import Adam

## 2. Model Definitions

### 2.1 Classical Models (Random Forest)

In [None]:
@dataclass
class ClassicalModels:
    """Container for classical ML models"""
    fault_clf: RandomForestClassifier
    type_clf: RandomForestClassifier
    severity_reg: RandomForestRegressor


def train_classical_models(
    X_train: pd.DataFrame,
    y_fault_train: np.ndarray,
    y_type_train: np.ndarray,
    y_sev_train: np.ndarray,
) -> ClassicalModels:
    """
    Train three models: fault detection, fault type, severity.
    
    Args:
        X_train: Feature matrix (samples × features)
        y_fault_train: Binary fault labels (0/1)
        y_type_train: Fault type labels
        y_sev_train: Severity levels (0-3)
    
    Returns:
        ClassicalModels containing trained models
    """
    print("Training fault detection classifier...")
    fault_clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=42)
    fault_clf.fit(X_train, y_fault_train)

    print("Training fault type classifier...")
    type_clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=42)
    type_clf.fit(X_train, y_type_train)

    print("Training severity regressor...")
    severity_reg = RandomForestRegressor(n_estimators=200, random_state=42)
    severity_reg.fit(X_train, y_sev_train)

    return ClassicalModels(
        fault_clf=fault_clf, type_clf=type_clf, severity_reg=severity_reg
    )


def predict_with_classical(
    models: ClassicalModels, X: pd.DataFrame
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Make predictions with classical models.
    
    Returns:
        prob_fault: Probability of fault (0-1)
        type_pred: Predicted fault type
        sev_pred: Predicted severity level
    """
    prob_fault = models.fault_clf.predict_proba(X)[:, 1]
    type_pred = models.type_clf.predict(X)
    sev_pred = models.severity_reg.predict(X)
    return prob_fault, type_pred, sev_pred

print("✓ Classical models defined")
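
Both classifiers above are built with `class_weight="balanced"`. As a rough illustration of what that setting does, scikit-learn documents the balanced weight for each class as `n_samples / (n_classes * np.bincount(y))`. The sketch below reproduces that formula with NumPy only. The helper name `balanced_class_weights` and the toy labels are made up for illustration and are not part of the notebook.

```python
import numpy as np

def balanced_class_weights(y: np.ndarray) -> np.ndarray:
    """Hypothetical helper: per-class weights as n_samples / (n_classes * count).

    Assumes integer labels 0..K-1 and that every label occurs at least once.
    """
    counts = np.bincount(y)            # samples per class, indexed by label
    n_classes = len(counts)
    return len(y) / (n_classes * counts)

# Toy labels (illustrative only): 8 healthy windows, 2 faulty windows
y_toy = np.array([0] * 8 + [1] * 2)
weights = balanced_class_weights(y_toy)

# class 0 -> 10 / (2 * 8) = 0.625, class 1 -> 10 / (2 * 2) = 2.5
print(dict(enumerate(weights.tolist())))
```

The rarer class gets the larger weight, which is why this setting can help when faulty windows are much less common than healthy ones.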

### 2.2 Deep Learning Multi-Task Learning (CNN 1D)

In [None]:
class CNNMTL(nn.Module):
    """
    1D CNN Multi-Task Learning model for drone fault detection.
    
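    Input: (batch, seq_len, n_features); seq_len must be at least 4 so the
        two MaxPool1d(2) layers do not reduce the sequence to length 0
    Outputs: logits for fault detection, fault type, severity level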
    """

    def __init__(self, in_channels: int, n_fault_types: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.MaxPool1d(2),

            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.MaxPool1d(2),

            nn.Conv1d(128, 256, kernel_size=3, padding=1),
            nn.BatchNorm1d(256),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head_fault = nn.Linear(256, 2)  # Binary fault detection
        self.head_type = nn.Linear(256, n_fault_types)  # Fault type classification
        self.head_sev = nn.Linear(256, 4)  # 4 severity levels (0-3)

    def forward(self, x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        # x: (batch, seq_len, n_feat) -> (batch, n_feat, seq_len)
        x = x.transpose(1, 2)
        h = self.backbone(x).squeeze(-1)
        out_fault = self.head_fault(h)
        out_type = self.head_type(h)
        out_sev = self.head_sev(h)
        return out_fault, out_type, out_sev


def mtl_loss(
    logits_fault,
    logits_type,
    logits_sev,
    y_fault,
    y_type,
    y_sev,
    lambda_fault=1.0,
    lambda_type=1.0,
    lambda_sev=1.0,
):
    """
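    Multi-task loss: weighted sum of three cross-entropy terms.

    Returns:
        total: Scalar loss tensor to backpropagate
        per-task losses: Dict of Python floats for logging ("fault", "type", "sev")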
    """
    loss_fault = F.cross_entropy(logits_fault, y_fault)
    loss_type = F.cross_entropy(logits_type, y_type)
    loss_sev = F.cross_entropy(logits_sev, y_sev)
    
    total = lambda_fault * loss_fault + lambda_type * loss_type + lambda_sev * loss_sev
    return total, {"fault": loss_fault.item(), "type": loss_type.item(), "sev": loss_sev.item()}

print("✓ Deep MTL model defined")
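
To make the arithmetic of `mtl_loss` concrete, here is a NumPy-only sketch of the same structure: one softmax cross-entropy per task, combined as a weighted sum. The names `cross_entropy` and `mtl_loss_np` are illustrative stand-ins, not the PyTorch functions used above, and the toy batch assumes three fault types.

```python
import numpy as np

def cross_entropy(logits: np.ndarray, targets: np.ndarray) -> float:
    """Mean softmax cross-entropy over a batch (NumPy stand-in, illustration only)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

def mtl_loss_np(logits, targets, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of per-task losses, mirroring the structure of mtl_loss."""
    per_task = [cross_entropy(l, t) for l, t in zip(logits, targets)]
    total = sum(w * loss for w, loss in zip(lambdas, per_task))
    return total, per_task

# All-zero logits give a uniform prediction, so each task's loss is log(n_classes).
batch = 5
logits = (np.zeros((batch, 2)), np.zeros((batch, 3)), np.zeros((batch, 4)))
targets = (
    np.zeros(batch, dtype=int),   # fault labels
    np.ones(batch, dtype=int),    # fault type labels (toy: 3 types)
    np.full(batch, 3),            # severity labels
)
total, per_task = mtl_loss_np(logits, targets)
print(f"total={total:.4f}, per task={[round(v, 4) for v in per_task]}")
```

With equal weights, tasks with more classes start from a larger loss (`log(4)` versus `log(2)` here), which is one reason the `lambda_*` weights may be worth tuning.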

### 2.3 PyTorch Dataset for Time Windows

In [None]:
class WindowDataset(Dataset):
    """
    PyTorch Dataset for time windows + labels.
    """

    def __init__(
        self,
        X: np.ndarray,
        y_fault: np.ndarray,
        y_type: np.ndarray,
        y_sev: np.ndarray,
    ):
        self.X = X.astype(np.float32)
        self.y_fault = y_fault.astype(np.int64)
        self.y_type = y_type.astype(np.int64)
        self.y_sev = y_sev.astype(np.int64)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return (
            self.X[idx],
            self.y_fault[idx],
            self.y_type[idx],
            self.y_sev[idx],
        )

print("✓ WindowDataset defined")
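
`WindowDataset` casts its inputs but does not check that they line up. A small validation step along the following lines could catch shape mismatches before training starts. This is a sketch: `check_window_arrays` is a hypothetical helper, and the checks reflect the shapes described in this notebook, with `X` as `(n_samples, seq_len, n_features)` and one label per window.

```python
import numpy as np

def check_window_arrays(X, y_fault, y_type, y_sev):
    """Hypothetical sanity check to run before building a WindowDataset."""
    if X.ndim != 3:
        raise ValueError(f"X must be (n_samples, seq_len, n_features), got shape {X.shape}")
    labels = {"y_fault": y_fault, "y_type": y_type, "y_sev": y_sev}
    for name, y in labels.items():
        if y.shape != (X.shape[0],):
            raise ValueError(f"{name} must have shape ({X.shape[0]},), got {y.shape}")
        if y.min() < 0:
            # Class indices are expected to be non-negative
            raise ValueError(f"{name} contains negative labels")
    # Same dtypes that WindowDataset casts to
    return X.astype(np.float32), tuple(y.astype(np.int64) for y in labels.values())

# Toy arrays (illustrative only): 4 windows, 10 time steps, 3 features
X_toy = np.zeros((4, 10, 3))
y_toy = np.array([0, 1, 0, 1])
X_ok, (yf_ok, yt_ok, ys_ok) = check_window_arrays(X_toy, y_toy, y_toy, y_toy)
print(X_ok.dtype, yf_ok.dtype)
```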

## 3. Training Functions

### 3.1 Unsupervised Learning (Isolation Forest)

In [None]:
def train_isolation_forest(X_healthy: np.ndarray) -> IsolationForest:
    """
    Train unsupervised anomaly detector on healthy windows only.
    
    Args:
        X_healthy: Feature matrix of healthy samples only
    
    Returns:
        Trained IsolationForest model
    """
    iso = IsolationForest(
        n_estimators=200,
        contamination=0.1,
        random_state=42,
    )
    iso.fit(X_healthy)
    return iso


def anomaly_score(model: IsolationForest, X: np.ndarray) -> np.ndarray:
    """
    Calculate anomaly scores (higher = more anomalous).
    """
    # score_samples() returns higher values for normal points (lower = more
    # abnormal), so negate it to get a score that grows with anomaly.
    raw = model.score_samples(X)
    return -raw

print("✓ Unsupervised training functions defined")
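
The sign flip in `anomaly_score` can be checked on synthetic data. The sketch below fits an `IsolationForest` on a Gaussian cluster standing in for healthy windows, then scores one point placed far from it. The data, the seed, and `n_estimators=100` are arbitrary choices for the illustration, not values from this project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_inliers = rng.normal(0.0, 1.0, size=(200, 2))   # synthetic "healthy" cluster
X_outlier = np.array([[8.0, 8.0]])                # synthetic point far from the cluster

iso = IsolationForest(n_estimators=100, random_state=0).fit(X_inliers)

# score_samples: higher = more normal, so negate to get "higher = more anomalous"
scores_in = -iso.score_samples(X_inliers)
score_out = -iso.score_samples(X_outlier)[0]
print(f"mean inlier score: {scores_in.mean():.3f}, outlier score: {score_out:.3f}")
```

After negation the distant point should score higher than a typical inlier, which is the convention the thresholding in section 5.5 relies on.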

### 3.2 Deep MTL Training Loop

In [None]:
def train_deep_mtl(
    X: np.ndarray,
    y_fault: np.ndarray,
    y_type: np.ndarray,
    y_sev: np.ndarray,
    batch_size: int = 32,
    num_epochs: int = 50,
    lr: float = 1e-3,
    val_split: float = 0.2,
    lambda_fault: float = 1.0,
    lambda_type: float = 1.0,
    lambda_sev: float = 1.0,
    device: str = "cuda" if torch.cuda.is_available() else "cpu"
):
    """
    Train Multi-Task Learning CNN model.
    
    Args:
        X: Input windows (n_samples, seq_len, n_features)
        y_fault, y_type, y_sev: Labels
        batch_size: Batch size for training
        num_epochs: Number of training epochs
        lr: Learning rate
        val_split: Validation split ratio
        lambda_*: Loss weights for each task
        device: 'cuda' or 'cpu'
    
    Returns:
        Trained model and validation metrics
    """
    # Split train/val
    idx_train, idx_val = train_test_split(
        np.arange(len(X)), test_size=val_split, random_state=42, stratify=y_fault
    )
    
    train_dataset = WindowDataset(
        X[idx_train], y_fault[idx_train], y_type[idx_train], y_sev[idx_train]
    )
    val_dataset = WindowDataset(
        X[idx_val], y_fault[idx_val], y_type[idx_val], y_sev[idx_val]
    )
    
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
    
    # Initialize model
    n_features = X.shape[2]
    n_fault_types = int(y_type.max() + 1)
    model = CNNMTL(in_channels=n_features, n_fault_types=n_fault_types)
    model = model.to(device)
    
    optimizer = Adam(model.parameters(), lr=lr)
    
    print(f"Training on device: {device}")
    print(f"Train size: {len(train_dataset)}, Val size: {len(val_dataset)}")
    
    # Training loop
    history = {"train_loss": [], "val_loss": []}
    
    for epoch in range(num_epochs):
        # Training
        model.train()
        train_loss_epoch = 0.0
        
        for batch_X, batch_fault, batch_type, batch_sev in train_loader:
            batch_X = batch_X.to(device)
            batch_fault = batch_fault.to(device)
            batch_type = batch_type.to(device)
            batch_sev = batch_sev.to(device)
            
            optimizer.zero_grad()
            logits_fault, logits_type, logits_sev = model(batch_X)
            loss, _ = mtl_loss(
                logits_fault, logits_type, logits_sev,
                batch_fault, batch_type, batch_sev,
                lambda_fault, lambda_type, lambda_sev
            )
            loss.backward()
            optimizer.step()
            
            train_loss_epoch += loss.item() * len(batch_X)
        
        train_loss_epoch /= len(train_dataset)
        
        # Validation
        model.eval()
        val_loss_epoch = 0.0
        
        with torch.no_grad():
            for batch_X, batch_fault, batch_type, batch_sev in val_loader:
                batch_X = batch_X.to(device)
                batch_fault = batch_fault.to(device)
                batch_type = batch_type.to(device)
                batch_sev = batch_sev.to(device)
                
                logits_fault, logits_type, logits_sev = model(batch_X)
                loss, _ = mtl_loss(
                    logits_fault, logits_type, logits_sev,
                    batch_fault, batch_type, batch_sev,
                    lambda_fault, lambda_type, lambda_sev
                )
                val_loss_epoch += loss.item() * len(batch_X)
        
        val_loss_epoch /= len(val_dataset)
        
        history["train_loss"].append(train_loss_epoch)
        history["val_loss"].append(val_loss_epoch)
        
        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1}/{num_epochs} - Train Loss: {train_loss_epoch:.4f}, Val Loss: {val_loss_epoch:.4f}")
    
    # Final validation metrics
    val_metrics = {"history": history, "final_val_loss": val_loss_epoch}
    
    return model, val_metrics

print("✓ Deep MTL training function defined")
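
The training loop accumulates `loss.item() * len(batch_X)` and divides by the dataset size instead of averaging the per-batch losses directly. The plain-Python sketch below, with made-up numbers, shows why: when the last batch is smaller, an unweighted mean over batches gives it too much influence. `epoch_mean_loss` is an illustrative helper, not a function from the notebook.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Sample-weighted mean of per-batch mean losses."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

# Toy numbers: 70 samples with batch_size=32 give batches of 32, 32 and 6
batch_sizes = [32, 32, 6]
batch_losses = [1.0, 1.0, 2.0]   # mean loss reported for each batch

weighted = epoch_mean_loss(batch_losses, batch_sizes)   # 76 / 70
naive = sum(batch_losses) / len(batch_losses)           # 4 / 3

print(f"weighted: {weighted:.4f}, naive: {naive:.4f}")
```

The weighted value is the true per-sample mean for the epoch, so train and validation curves stay comparable even when their last batches differ in size.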

## 4. Evaluation Functions

In [None]:
def evaluate_fault_detection(y_true: np.ndarray, y_scores: np.ndarray, thr: float = 0.5) -> Dict:
    """
    Evaluate binary fault detection.
    
    Args:
        y_true: True binary labels (0/1)
        y_scores: Predicted probabilities or scores
        thr: Threshold for binary classification
    
    Returns:
        Dictionary of metrics
    """
    y_pred = (y_scores >= thr).astype(int)
    acc = accuracy_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred)
    
    return {
        "accuracy": acc,
        "f1": f1,
        "threshold": thr,
    }


def evaluate_fault_type(y_true: np.ndarray, y_pred: np.ndarray) -> Dict:
    """
    Evaluate multi-class fault type classification.
    
    Returns:
        Dictionary of metrics including confusion matrix
    """
    acc = accuracy_score(y_true, y_pred)
    f1_macro = f1_score(y_true, y_pred, average="macro")
    conf_mat = confusion_matrix(y_true, y_pred)
    
    return {
        "accuracy": acc,
        "f1_macro": f1_macro,
        "conf_mat": conf_mat,
    }


def evaluate_severity(y_true: np.ndarray, y_pred: np.ndarray) -> Dict:
    """
    Evaluate severity prediction (regression or ordinal).
    
    Returns:
        Dictionary of MAE metrics
    """
    mae = mean_absolute_error(y_true, y_pred)
    
    # Also compute MAE with rounded predictions (for ordinal levels)
    y_pred_rounded = np.round(y_pred).clip(0, 3).astype(int)
    mae_levels = mean_absolute_error(y_true, y_pred_rounded)
    
    return {
        "mae": mae,
        "mae_levels": mae_levels,
    }

print("✓ Evaluation functions defined")
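
`evaluate_severity` reports two numbers because the Random Forest regressor outputs continuous values while the labels are levels 0-3. A small worked example with made-up predictions shows how rounding and clipping change the error. It uses NumPy only and computes MAE by hand instead of calling `mean_absolute_error`.

```python
import numpy as np

y_true = np.array([0, 2, 3, 0])
y_pred = np.array([0.4, 1.6, 3.7, -0.2])   # toy regressor outputs, may leave [0, 3]

mae = np.abs(y_true - y_pred).mean()                 # continuous MAE
y_levels = np.round(y_pred).clip(0, 3).astype(int)   # snap to levels 0-3
mae_levels = np.abs(y_true - y_levels).mean()        # MAE on discrete levels

# np.round rounds halves to the nearest even integer: 0.5 -> 0, 1.5 -> 2, 2.5 -> 2
halves = np.round(np.array([0.5, 1.5, 2.5]))

print(f"mae={mae:.3f}, levels={y_levels.tolist()}, mae_levels={mae_levels:.3f}")
```

Here every prediction lands on the correct level after rounding, so the level MAE is 0 even though the continuous MAE is 0.425. The half-to-even behaviour of `np.round` is worth keeping in mind for predictions that fall exactly between two levels.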

## 5. Complete Training Pipelines

These cells replicate the main training scripts as executable notebook cells.

### 5.1 Load Processed Data

In [None]:

import os, types
import pandas as pd
from botocore.client import Config
import ibm_boto3
import numpy as np
import io

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.

cos_client = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='pHKiyR******************U3dUIXs4',
    ibm_auth_endpoint="https://iam.cloud.ibm.com/identity/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.direct.us-south.cloud-object-storage.appdomain.cloud')

bucket = 'hackatonuav-donotdelete-pr-ctbzcaijxyh2yh'
object_key = 'X_windows.npy'

# load data of type "application/octet-stream" into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about the possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/

streaming_body_X = cos_client.get_object(Bucket=bucket, Key=object_key)['Body']
streaming_body_YF = cos_client.get_object(Bucket=bucket, Key="y_fault.npy")['Body']
streaming_body_YT = cos_client.get_object(Bucket=bucket, Key="y_type.npy")['Body']
streaming_body_YS = cos_client.get_object(Bucket=bucket, Key="y_sev.npy")['Body']

In [None]:
# Load windowed data
X_windows = np.load(io.BytesIO(streaming_body_X.read()), allow_pickle=True)
y_fault = np.load(io.BytesIO(streaming_body_YF.read()), allow_pickle=True)
y_type = np.load(io.BytesIO(streaming_body_YT.read()), allow_pickle=True)
y_sev = np.load(io.BytesIO(streaming_body_YS.read()), allow_pickle=True)

print(f"\n📊 Data loaded:")
print(f"  X_windows: {X_windows.shape}")
print(f"  y_fault: {y_fault.shape}")
print(f"  y_type: {y_type.shape}")
print(f"  y_sev: {y_sev.shape}")
print(f"\n  Unique fault values: {np.unique(y_fault)}")
print(f"  Unique type values: {np.unique(y_type)}")
print(f"  Unique severity values: {np.unique(y_sev)}")
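
The pipeline cells that follow use a feature table `X_feat` (a DataFrame with one row per window) and an output directory `MODELS_DIR`, neither of which is defined in the notebook as shown. The sketch below is one possible way to fill that gap and is an assumption, not the original feature-extraction step. It summarizes each window by per-channel mean, standard deviation, minimum, and maximum. The function name `summarize_windows`, the column naming, and the `models` path are all hypothetical.

```python
from pathlib import Path

import numpy as np
import pandas as pd

def summarize_windows(X: np.ndarray) -> pd.DataFrame:
    """Hypothetical feature extractor: per-channel mean/std/min/max of each window.

    X has shape (n_samples, seq_len, n_features); statistics are taken over seq_len.
    """
    stats = {
        "mean": X.mean(axis=1),
        "std": X.std(axis=1),
        "min": X.min(axis=1),
        "max": X.max(axis=1),
    }
    columns = {
        f"ch{c}_{name}": values[:, c]
        for name, values in stats.items()
        for c in range(X.shape[2])
    }
    return pd.DataFrame(columns)

# Assumed output location; the notebook's real MODELS_DIR is not shown
MODELS_DIR = Path("models")

# Toy stand-in for X_windows: 6 windows, 50 time steps, 3 channels
X_demo = np.random.default_rng(0).normal(size=(6, 50, 3))
X_feat_demo = summarize_windows(X_demo)
print(X_feat_demo.shape)
```

In the notebook this would be applied as `X_feat = summarize_windows(X_windows)`, with `MODELS_DIR.mkdir(parents=True, exist_ok=True)` called once before any model is saved.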

### 5.3 Train Classical Random Forest Models

In [None]:
print("="*80)
print("TRAINING CLASSICAL MODELS (Random Forest)")
print("="*80)

# Split data
X_train, X_test, y_fault_train, y_fault_test, y_type_train, y_type_test, y_sev_train, y_sev_test = train_test_split(
    X_feat,
    y_fault,
    y_type,
    y_sev,
    test_size=0.2,
    random_state=42,
    stratify=y_fault,
)

print(f"Train size: {len(X_train)}, Test size: {len(X_test)}")

# Train models
classical_models = train_classical_models(
    X_train, y_fault_train, y_type_train, y_sev_train
)

# Make predictions
prob_fault_test, type_pred_test, sev_pred_test = predict_with_classical(classical_models, X_test)

# Evaluate
fault_metrics = evaluate_fault_detection(y_fault_test, prob_fault_test, thr=0.5)
type_metrics = evaluate_fault_type(y_type_test, type_pred_test)
sev_metrics = evaluate_severity(y_sev_test, sev_pred_test)

print("\n" + "="*80)
print("EVALUATION RESULTS - CLASSICAL MODELS")
print("="*80)

print("\n=== Fault Detection (Binary) ===")
print(f"Accuracy: {fault_metrics['accuracy']:.3f}")
print(f"F1 Score: {fault_metrics['f1']:.3f}")

print("\n=== Fault Type Classification ===")
print(f"Accuracy: {type_metrics['accuracy']:.3f}")
print(f"F1 Macro: {type_metrics['f1_macro']:.3f}")
print("\nConfusion Matrix:")
print(type_metrics['conf_mat'])

print("\n=== Severity Prediction ===")
print(f"MAE (continuous): {sev_metrics['mae']:.3f}")
print(f"MAE (levels 0-3): {sev_metrics['mae_levels']:.3f}")

# Save model
model_path = MODELS_DIR / "classical_rf_models.joblib"
meta = {
    "models": classical_models,
    "feature_columns": list(X_feat.columns),
    "metrics": {
        "fault": fault_metrics,
        "type": type_metrics,
        "severity": sev_metrics,
    }
}
dump(meta, model_path)
print(f"\n✓ Model saved to {model_path}")

### 5.4 Train Deep Multi-Task Learning Model

In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

In [None]:
print("="*80)
print("TRAINING DEEP MULTI-TASK LEARNING MODEL")
print("="*80)

# Training configuration
config = {
    "batch_size": 32,
    "num_epochs": 50,
    "lr": 1e-3,
    "val_split": 0.2,
    "lambda_fault": 1.0,
    "lambda_type": 1.0,
    "lambda_sev": 1.0,
}

print(f"Configuration: {config}")

# Train model
model, val_metrics = train_deep_mtl(
    X_windows,
    y_fault,
    y_type,
    y_sev,
    **config
)

print("\n" + "="*80)
print("TRAINING COMPLETE")
print("="*80)
print(f"Final validation loss: {val_metrics['final_val_loss']:.4f}")

# Plot training history
plt.figure(figsize=(10, 5))
plt.plot(val_metrics['history']['train_loss'], label='Train Loss')
plt.plot(val_metrics['history']['val_loss'], label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training History - Deep MTL Model')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Save model
model_path = MODELS_DIR / "deep_mtl_cnn1d.pth"
checkpoint = {
    "state_dict": model.state_dict(),
    "n_features": X_windows.shape[2],
    "n_fault_types": int(y_type.max() + 1),
    "config": config,
    "val_metrics": val_metrics,
}
torch.save(checkpoint, model_path)
print(f"\n✓ Model saved to {model_path}")

### 5.5 Train Unsupervised Anomaly Detection (Isolation Forest)

In [None]:
print("="*80)
print("TRAINING UNSUPERVISED ANOMALY DETECTION")
print("="*80)

# Filter healthy samples only
healthy_mask = (y_fault == 0)
X_healthy = X_feat[healthy_mask]

print(f"Healthy samples: {len(X_healthy)} / {len(X_feat)} ({len(X_healthy)/len(X_feat)*100:.1f}%)")

if len(X_healthy) == 0:
    print("⚠️  No healthy samples found! Cannot train Isolation Forest.")
else:
    # Train model
    iso_model = train_isolation_forest(X_healthy.values)
    
    # Calculate anomaly scores for all samples
    scores = anomaly_score(iso_model, X_feat.values)
    
    # Determine threshold (95th percentile of healthy scores)
    healthy_scores = scores[healthy_mask]
    threshold = np.percentile(healthy_scores, 95.0)
    
    print(f"\nAnomaly threshold (95th percentile of healthy): {threshold:.4f}")
    
    # Evaluate
    unsup_metrics = evaluate_fault_detection(y_fault, scores, thr=threshold)
    
    print("\n" + "="*80)
    print("EVALUATION RESULTS - UNSUPERVISED")
    print("="*80)
    print(f"Accuracy: {unsup_metrics['accuracy']:.3f}")
    print(f"F1 Score: {unsup_metrics['f1']:.3f}")
    
    # Visualize score distribution
    plt.figure(figsize=(12, 5))
    
    plt.subplot(1, 2, 1)
    plt.hist(healthy_scores, bins=50, alpha=0.7, label='Healthy', edgecolor='black')
    plt.hist(scores[~healthy_mask], bins=50, alpha=0.7, label='Faulty', edgecolor='black')
    plt.axvline(threshold, color='red', linestyle='--', linewidth=2, label='Threshold')
    plt.xlabel('Anomaly Score')
    plt.ylabel('Frequency')
    plt.title('Anomaly Score Distribution')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.subplot(1, 2, 2)
    plt.scatter(range(len(scores)), scores, c=y_fault, cmap='RdYlGn_r', alpha=0.5, s=10)
    plt.axhline(threshold, color='red', linestyle='--', linewidth=2, label='Threshold')
    plt.xlabel('Sample Index')
    plt.ylabel('Anomaly Score')
    plt.title('Anomaly Scores (colored by true label)')
    plt.colorbar(label='Fault (0=healthy, 1=faulty)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Save model
    model_path = MODELS_DIR / "iso_forest_unsupervised.joblib"
    thr_path = MODELS_DIR / "iso_forest_threshold.npy"
    
    dump(iso_model, model_path)
    np.save(thr_path, np.array([threshold]))
    
    print(f"\n✓ Model saved to {model_path}")
    print(f"✓ Threshold saved to {thr_path}")
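
Setting the threshold at the 95th percentile of the healthy scores fixes the false-alarm rate on those same samples at roughly 5% by construction. The toy example below illustrates this with evenly spaced stand-in scores. The numbers are made up, and because the cell above computes the percentile on the data the model was trained on, the rate on unseen healthy windows may differ.

```python
import numpy as np

healthy_scores = np.arange(100, dtype=float)   # toy scores, one per healthy window
threshold = np.percentile(healthy_scores, 95.0)

# Same comparison as evaluate_fault_detection: score >= threshold means "fault"
flagged = healthy_scores >= threshold
false_alarm_rate = flagged.mean()

print(f"threshold={threshold:.2f}, flagged={int(flagged.sum())}, rate={false_alarm_rate:.2f}")
```

Raising the percentile reduces false alarms on healthy windows but may let more faulty windows fall below the threshold, so the value is a trade-off to revisit on held-out data.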

## 6. Summary & Next Steps

### ✅ Models Trained:
1. **Classical Random Forest**: 3 separate models for fault detection, type classification, and severity regression
2. **Deep MTL CNN**: Single end-to-end model for all three tasks simultaneously
3. **Isolation Forest**: Unsupervised anomaly detection

### 📁 Saved Models:
- `models/classical_rf_models.joblib`
- `models/deep_mtl_cnn1d.pth`
- `models/iso_forest_unsupervised.joblib`
- `models/iso_forest_threshold.npy`

### 🎯 Next Steps:
1. **Model Comparison**: Compare all three approaches
2. **Hyperparameter Tuning**: Optimize model parameters
3. **Feature Engineering**: Try additional features
4. **Ensemble Methods**: Combine multiple models
5. **Deployment**: Integrate best model into production system