# Tire Defect Prediction - Hybrid Model (Boosting + Deep Learning)
## Expert Manufacturing AI Solution for Smart Factory

**Objective:** Binary Classification of Tire Defects (NG vs Good)

**Approach:** Hybrid Stacking/Ensemble Model combining:
- Branch 1: XGBoost for Design/Process Features (CatBoost is imported as a possible alternative)
- Branch 2: Deep Neural Network for Simulation Features
- Fusion: MLP Head for Final Decision
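
As a rough illustration of the fusion step listed above, the sketch below assembles fusion inputs from the four branch outputs using random stand-in arrays. The sizes (300 trees, a 64-dimensional latent vector) mirror settings used later in this notebook; the array values are random placeholders, not model outputs.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_trees, latent_dim = 4, 300, 64  # sizes mirror later cells

# Stand-in branch outputs (random values, not real model outputs)
p1 = rng.random((n_samples, 1))                  # boosting probability
h1 = rng.integers(0, 128, (n_samples, n_trees))  # boosting leaf index per tree
h2 = rng.random((n_samples, latent_dim))         # DNN latent vector
p2 = rng.normal(size=(n_samples, 1))             # DNN logit

# The fusion MLP sees all four blocks side by side
fused = np.concatenate([p1, h1, h2, p2], axis=1)
print(fused.shape)  # (4, 366)
```

The fused width is 1 + 300 + 64 + 1 = 366, which is the input dimension the fusion head would need under these settings.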

## 1. Setup and Imports

In [None]:
# Core Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    roc_auc_score, confusion_matrix, classification_report
)
import warnings
warnings.filterwarnings('ignore')

# Boosting Libraries
import xgboost as xgb
from catboost import CatBoostClassifier

# Deep Learning Libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, TensorDataset

# Set random seeds for reproducibility
SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

## 2. Data Loading

In [None]:
# Load training data
print("[INFO] Loading training data...")
df_train = pd.read_csv('train.csv')

print(f"Dataset shape: {df_train.shape}")
print(f"\nFirst few rows:")
display(df_train.head())

# Check target distribution
print(f"\nTarget Distribution:")
print(df_train['Label Class'].value_counts())
print(f"\nClass Balance:")
print(df_train['Label Class'].value_counts(normalize=True))

## 3. Feature Engineering (Modular Structure)
### 3.1 Define Feature Groups

In [None]:
# Define feature groups
DESIGN_FEATURES = ['Mass_Pilot', 'Width', 'Aspect', 'Inch', 'Plant'] + \
                  [f'Proc_Param{i}' for i in range(1, 12)]

SIMULATION_FEATURES = [f'x{i}' for i in range(1, 6)] + \
                      [f'y{i}' for i in range(1, 6)] + \
                      [f'x{i}' for i in range(0, 256)] + \
                      [f'y{i}' for i in range(0, 256)] + \
                      [f'p{i}' for i in range(0, 256)] + \
                      [f'G{i}' for i in range(1, 5)]

# Remove duplicates: x1-x5 and y1-y5 are already covered by x0-x255 and y0-y255
SIMULATION_FEATURES = list(set(SIMULATION_FEATURES))
SIMULATION_FEATURES.sort()

TARGET = 'Label Class'

print(f"Design Features: {len(DESIGN_FEATURES)}")
print(f"Simulation Features: {len(SIMULATION_FEATURES)}")
print(f"Total Features: {len(DESIGN_FEATURES) + len(SIMULATION_FEATURES)}")
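
The feature lists above are built by name, so a typo or a column missing from `train.csv` would only surface later, and `scale_features` silently skips columns it cannot find. A small check along these lines can catch that early. `split_present_missing` is a hypothetical helper, and the example column list is made up for illustration.

```python
def split_present_missing(expected, available):
    """Split expected column names into (present, missing), keeping order."""
    available = set(available)
    present = [c for c in expected if c in available]
    missing = [c for c in expected if c not in available]
    return present, missing


# Illustrative column list; in the notebook this would be df_train.columns
example_columns = ['Mass_Pilot', 'Width', 'x0', 'x1', 'p0', 'Label Class']
expected = ['Width', 'Plant', 'x0', 'x1', 'x2', 'p0']

present, missing = split_present_missing(expected, example_columns)
print(present)  # ['Width', 'x0', 'x1', 'p0']
print(missing)  # ['Plant', 'x2']
```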

### 3.2 Modular Preprocessing Functions

In [None]:
class FeaturePreprocessor:
    """
    Modular preprocessing pipeline for tire defect prediction.
    """
    
    def __init__(self, verbose=True):
        self.verbose = verbose
        self.label_encoders = {}
        self.scalers = {}
        self.design_features = None
        self.simulation_features = None
        
    def _print(self, message):
        """Print if verbose mode is enabled."""
        if self.verbose:
            print(message)
    
    def handle_missing_values(self, df, strategy='median'):
        """
        Step 1: Handle missing values
        """
        self._print("\n[1] Currently processing: Missing Value Handling...")
        df_processed = df.copy()
        
        # Check missing values
        missing_counts = df_processed.isnull().sum()
        missing_features = missing_counts[missing_counts > 0]
        
        if len(missing_features) > 0:
            self._print(f"   - Found {len(missing_features)} features with missing values")
            for feature in missing_features.index:
                if pd.api.types.is_numeric_dtype(df_processed[feature]):
                    if strategy == 'median':
                        fill_value = df_processed[feature].median()
                    elif strategy == 'mean':
                        fill_value = df_processed[feature].mean()
                    else:
                        fill_value = 0
                    # Assign back instead of calling fillna(inplace=True) on a
                    # column selection, which newer pandas versions warn about
                    df_processed[feature] = df_processed[feature].fillna(fill_value)
                else:
                    # Categorical: fill with the placeholder 'Unknown'
                    df_processed[feature] = df_processed[feature].fillna('Unknown')
            self._print(f"   - Missing values handled using {strategy} strategy")
        else:
            self._print("   - No missing values found")
        
        self._print("   ✓ Missing value handling complete")
        return df_processed
    
    def encode_categorical(self, df, categorical_features, fit=True):
        """
        Step 2: Encode categorical features
        """
        self._print("\n[2] Currently processing: Categorical Encoding...")
        df_processed = df.copy()
        
        for feature in categorical_features:
            if feature not in df_processed.columns:
                continue
                
            if fit:
                le = LabelEncoder()
                df_processed[feature] = le.fit_transform(df_processed[feature].astype(str))
                self.label_encoders[feature] = le
                self._print(f"   - Encoded {feature}: {len(le.classes_)} unique values")
            else:
                le = self.label_encoders[feature]
                # Map known categories to their codes once; unseen categories become -1
                mapping = {cls: idx for idx, cls in enumerate(le.classes_)}
                df_processed[feature] = (
                    df_processed[feature].astype(str).map(mapping).fillna(-1).astype(int)
                )
        
        self._print("   ✓ Categorical encoding complete")
        return df_processed
    
    def scale_features(self, df, feature_groups, fit=True):
        """
        Step 3: Scale numerical features by group
        """
        self._print("\n[3] Currently processing: Feature Scaling...")
        df_processed = df.copy()
        
        for group_name, features in feature_groups.items():
            # Filter features that exist in dataframe
            existing_features = [f for f in features if f in df_processed.columns]
            
            if len(existing_features) == 0:
                continue
            
            if fit:
                scaler = StandardScaler()
                df_processed[existing_features] = scaler.fit_transform(
                    df_processed[existing_features]
                )
                self.scalers[group_name] = scaler
                self._print(f"   - Scaled {group_name}: {len(existing_features)} features")
            else:
                scaler = self.scalers[group_name]
                df_processed[existing_features] = scaler.transform(
                    df_processed[existing_features]
                )
        
        self._print("   ✓ Feature scaling complete")
        return df_processed
    
    def fit_transform(self, df, design_features, simulation_features, target_col):
        """
        Main preprocessing pipeline (fit and transform)
        """
        self._print("\n" + "="*60)
        self._print("STARTING FEATURE ENGINEERING PIPELINE (FIT)")
        self._print("="*60)
        
        self.design_features = design_features
        self.simulation_features = simulation_features
        
        df_processed = df.copy()
        
        # Step 1: Handle missing values
        df_processed = self.handle_missing_values(df_processed, strategy='median')
        
        # Step 2: Encode categorical features
        categorical_features = ['Mass_Pilot', 'Plant']  # Add others if needed
        df_processed = self.encode_categorical(df_processed, categorical_features, fit=True)
        
        # Step 3: Scale features by group
        feature_groups = {
            'design': design_features,
            'simulation': simulation_features
        }
        df_processed = self.scale_features(df_processed, feature_groups, fit=True)
        
        self._print("\n" + "="*60)
        self._print("FEATURE ENGINEERING PIPELINE COMPLETE")
        self._print("="*60 + "\n")
        
        return df_processed
    
    def transform(self, df):
        """
        Apply fitted preprocessing (transform only)
        """
        self._print("\n" + "="*60)
        self._print("APPLYING FEATURE ENGINEERING PIPELINE (TRANSFORM)")
        self._print("="*60)
        
        df_processed = df.copy()
        
        # Step 1: Handle missing values
        df_processed = self.handle_missing_values(df_processed, strategy='median')
        
        # Step 2: Encode categorical features
        categorical_features = list(self.label_encoders.keys())
        df_processed = self.encode_categorical(df_processed, categorical_features, fit=False)
        
        # Step 3: Scale features
        feature_groups = {
            'design': self.design_features,
            'simulation': self.simulation_features
        }
        df_processed = self.scale_features(df_processed, feature_groups, fit=False)
        
        self._print("\n" + "="*60)
        self._print("FEATURE ENGINEERING PIPELINE COMPLETE")
        self._print("="*60 + "\n")
        
        return df_processed
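
`transform` above recomputes the median on whatever frame it receives, so new data would be imputed with its own statistics rather than those of the training set. One possible alternative, sketched here with a hypothetical `MedianImputer` class that is not part of the notebook, stores the training medians at fit time and reuses them.

```python
import numpy as np
import pandas as pd


class MedianImputer:
    """Hypothetical helper: learn medians on training data, reuse them later."""

    def fit(self, df):
        # Medians of numeric columns only; NaNs are skipped by default
        self.medians_ = df.select_dtypes(include='number').median()
        return self

    def transform(self, df):
        # fillna with a Series fills each column with its matching median
        return df.fillna(self.medians_)


train = pd.DataFrame({'Width': [200.0, 220.0, np.nan, 240.0]})
new = pd.DataFrame({'Width': [np.nan, 300.0]})

imputer = MedianImputer().fit(train)
print(imputer.transform(new)['Width'].tolist())  # [220.0, 300.0]
```

The missing value in `new` is filled with 220.0, the median of the training column, rather than with a statistic computed from `new` itself.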

### 3.3 Apply Preprocessing

In [None]:
# Convert target to binary (NG=1, Good=0)
df_train['Label Class'] = (df_train['Label Class'] == 'NG').astype(int)

# Initialize preprocessor
preprocessor = FeaturePreprocessor(verbose=True)

# Apply preprocessing
df_processed = preprocessor.fit_transform(
    df_train,
    design_features=DESIGN_FEATURES,
    simulation_features=SIMULATION_FEATURES,
    target_col=TARGET
)

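# Split features and target. Keep only the declared feature columns so that
# ID or free-text columns in the CSV do not reach the models.
feature_columns = [
    c for c in DESIGN_FEATURES + SIMULATION_FEATURES if c in df_processed.columns
]
X = df_processed[feature_columns]
y = df_processed[TARGET]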

print(f"\nFinal processed dataset shape: {X.shape}")
print(f"Target shape: {y.shape}")

## 4. Feature Selection Using Boosting
### 4.1 Train Temporary Boosting Model for Feature Importance

In [None]:
print("\n" + "="*60)
print("FEATURE SELECTION USING BOOSTING")
print("="*60)

# Split data for feature selection
X_train_fs, X_val_fs, y_train_fs, y_val_fs = train_test_split(
    X, y, test_size=0.2, random_state=SEED, stratify=y
)

print(f"\n[INFO] Training temporary XGBoost model for feature importance...")

# Train XGBoost model
xgb_fs = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    random_state=SEED,
    tree_method='hist',
    enable_categorical=False
)

xgb_fs.fit(X_train_fs, y_train_fs)

# Get feature importance
feature_importance = pd.DataFrame({
    'feature': X.columns,
    'importance': xgb_fs.feature_importances_
}).sort_values('importance', ascending=False)

print(f"\n[INFO] Top 20 Most Important Features:")
print(feature_importance.head(20))

# Plot feature importance
plt.figure(figsize=(12, 6))
plt.barh(feature_importance.head(30)['feature'][::-1], 
         feature_importance.head(30)['importance'][::-1])
plt.xlabel('Importance')
plt.title('Top 30 Feature Importance (XGBoost)')
plt.tight_layout()
plt.show()

### 4.2 Select Top N Features

In [None]:
# Select top N features or features above threshold
TOP_N_FEATURES = 100  # Adjust based on performance
IMPORTANCE_THRESHOLD = 0.001  # Alternative: use threshold

# Method 1: Top N features
selected_features = feature_importance.head(TOP_N_FEATURES)['feature'].tolist()

# Method 2: Features above threshold (uncomment to use)
# selected_features = feature_importance[
#     feature_importance['importance'] > IMPORTANCE_THRESHOLD
# ]['feature'].tolist()

print(f"\n[INFO] Selected {len(selected_features)} features for final model")

# Separate selected features into design and simulation groups
selected_design_features = [f for f in selected_features if f in DESIGN_FEATURES]
selected_simulation_features = [f for f in selected_features if f in SIMULATION_FEATURES]

print(f"   - Design features selected: {len(selected_design_features)}")
print(f"   - Simulation features selected: {len(selected_simulation_features)}")

# Create filtered datasets
X_selected = X[selected_features]
X_design = X[selected_design_features]
X_simulation = X[selected_simulation_features]

print("\n" + "="*60)
print("FEATURE SELECTION COMPLETE")
print("="*60)

## 5. Hybrid Model Architecture
### 5.1 Define Deep Neural Network Branch

In [None]:
class SimulationDNN(nn.Module):
    """
    Deep Neural Network for Simulation Features
    Returns both latent representations and predictions
    """
    
    def __init__(self, input_dim, latent_dim=64, dropout_rate=0.3):
        super(SimulationDNN, self).__init__()
        
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            
            nn.Linear(256, 128),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            
            nn.Linear(128, latent_dim),
            nn.BatchNorm1d(latent_dim),
            nn.ReLU()
        )
        
        # Prediction head
        self.predictor = nn.Sequential(
            nn.Linear(latent_dim, 32),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(32, 1)
        )
    
    def forward(self, x):
        latent = self.encoder(x)  # h2
        prediction = self.predictor(latent)  # p2 (logits)
        return latent, prediction


class HybridFusionModel(nn.Module):
    """
    Hybrid Model combining Boosting and DNN outputs
    """
    
    def __init__(self, boosting_pred_dim, boosting_latent_dim, 
                 dnn_latent_dim, dropout_rate=0.3):
        super(HybridFusionModel, self).__init__()
        
        # Total input dimension: p1 + h1 + h2 + p2
        total_dim = boosting_pred_dim + boosting_latent_dim + dnn_latent_dim + 1
        
        self.fusion = nn.Sequential(
            nn.Linear(total_dim, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            
            nn.Linear(64, 32),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            
            nn.Linear(32, 1)
        )
    
    def forward(self, boosting_pred, boosting_latent, dnn_latent, dnn_pred):
        # Concatenate all inputs
        fused = torch.cat([boosting_pred, boosting_latent, dnn_latent, dnn_pred], dim=1)
        output = self.fusion(fused)
        return output


print("[INFO] Neural network architectures defined successfully")

### 5.2 Prepare Data for Hybrid Model

In [None]:
# Split data into train and validation sets
(
    X_design_train, X_design_val,
    X_sim_train, X_sim_val,
    y_train, y_val
) = train_test_split(
    X_design, X_simulation, y,
    test_size=0.2,
    random_state=SEED,
    stratify=y
)

print(f"Training set size: {len(y_train)}")
print(f"Validation set size: {len(y_val)}")
print(f"\nDesign features shape: {X_design_train.shape}")
print(f"Simulation features shape: {X_sim_train.shape}")

### 5.3 Train Branch 1: Boosting Model (Design Features)

In [None]:
print("\n" + "="*60)
print("TRAINING BRANCH 1: XGBoost (Design Features)")
print("="*60)

# Train XGBoost model on design features
xgb_model = xgb.XGBClassifier(
    n_estimators=300,
    max_depth=7,
    learning_rate=0.05,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=SEED,
    tree_method='hist',
    enable_categorical=False,
    eval_metric='logloss'
)

print("[INFO] Training XGBoost model...")
xgb_model.fit(
    X_design_train, y_train,
    eval_set=[(X_design_val, y_val)],
    verbose=50
)

# Get predictions and leaf indices (as latent representation)
print("\n[INFO] Extracting XGBoost outputs...")
xgb_pred_train = xgb_model.predict_proba(X_design_train)[:, 1].reshape(-1, 1)
xgb_pred_val = xgb_model.predict_proba(X_design_val)[:, 1].reshape(-1, 1)

# Get leaf indices as latent features (h1)
xgb_latent_train = xgb_model.apply(X_design_train)  # Shape: (n_samples, n_estimators)
xgb_latent_val = xgb_model.apply(X_design_val)

print(f"   - XGBoost predictions shape: {xgb_pred_train.shape}")
print(f"   - XGBoost latent features shape: {xgb_latent_train.shape}")

# Evaluate XGBoost performance
xgb_val_pred_binary = (xgb_pred_val > 0.5).astype(int)
print(f"\n[RESULTS] XGBoost Branch Performance:")
print(f"   - Validation Accuracy: {accuracy_score(y_val, xgb_val_pred_binary):.4f}")
print(f"   - Validation ROC-AUC: {roc_auc_score(y_val, xgb_pred_val):.4f}")

print("\n" + "="*60)
print("BRANCH 1 TRAINING COMPLETE")
print("="*60)

### 5.4 Train Branch 2: DNN Model (Simulation Features)

In [None]:
print("\n" + "="*60)
print("TRAINING BRANCH 2: DNN (Simulation Features)")
print("="*60)

# Convert to PyTorch tensors
X_sim_train_tensor = torch.FloatTensor(X_sim_train.values).to(device)
X_sim_val_tensor = torch.FloatTensor(X_sim_val.values).to(device)
y_train_tensor = torch.FloatTensor(y_train.values).reshape(-1, 1).to(device)
y_val_tensor = torch.FloatTensor(y_val.values).reshape(-1, 1).to(device)

# Create DataLoaders
train_dataset = TensorDataset(X_sim_train_tensor, y_train_tensor)
val_dataset = TensorDataset(X_sim_val_tensor, y_val_tensor)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=128, shuffle=False)

# Initialize DNN model
dnn_model = SimulationDNN(
    input_dim=X_sim_train.shape[1],
    latent_dim=64,
    dropout_rate=0.3
).to(device)

# Loss and optimizer
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(dnn_model.parameters(), lr=0.001, weight_decay=1e-5)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', patience=5, factor=0.5)

# Training loop
print("\n[INFO] Training DNN model...")
n_epochs = 100
best_val_loss = float('inf')
patience = 15
patience_counter = 0

train_losses = []
val_losses = []

for epoch in range(n_epochs):
    # Training phase
    dnn_model.train()
    train_loss = 0.0
    
    for batch_X, batch_y in train_loader:
        optimizer.zero_grad()
        latent, pred = dnn_model(batch_X)
        loss = criterion(pred, batch_y)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    
    train_loss /= len(train_loader)
    train_losses.append(train_loss)
    
    # Validation phase
    dnn_model.eval()
    val_loss = 0.0
    
    with torch.no_grad():
        for batch_X, batch_y in val_loader:
            latent, pred = dnn_model(batch_X)
            loss = criterion(pred, batch_y)
            val_loss += loss.item()
    
    val_loss /= len(val_loader)
    val_losses.append(val_loss)
    
    # Learning rate scheduling
    scheduler.step(val_loss)
    
    # Print progress
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{n_epochs}] - Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}")
    
    # Early stopping
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        patience_counter = 0
        # Save best model
        torch.save(dnn_model.state_dict(), 'best_dnn_model.pth')
    else:
        patience_counter += 1
        if patience_counter >= patience:
            print(f"\n[INFO] Early stopping triggered at epoch {epoch+1}")
            break

# Load best model
dnn_model.load_state_dict(torch.load('best_dnn_model.pth'))

# Plot training history
plt.figure(figsize=(10, 5))
plt.plot(train_losses, label='Train Loss')
plt.plot(val_losses, label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('DNN Training History')
plt.legend()
plt.grid(True)
plt.show()

# Extract DNN outputs
print("\n[INFO] Extracting DNN outputs...")
dnn_model.eval()
with torch.no_grad():
    dnn_latent_train, dnn_pred_train = dnn_model(X_sim_train_tensor)
    dnn_latent_val, dnn_pred_val = dnn_model(X_sim_val_tensor)

print(f"   - DNN latent features shape: {dnn_latent_train.shape}")
print(f"   - DNN predictions shape: {dnn_pred_train.shape}")

# Evaluate DNN performance
dnn_val_pred_prob = torch.sigmoid(dnn_pred_val).cpu().numpy()
dnn_val_pred_binary = (dnn_val_pred_prob > 0.5).astype(int)
print(f"\n[RESULTS] DNN Branch Performance:")
print(f"   - Validation Accuracy: {accuracy_score(y_val, dnn_val_pred_binary):.4f}")
print(f"   - Validation ROC-AUC: {roc_auc_score(y_val, dnn_val_pred_prob):.4f}")

print("\n" + "="*60)
print("BRANCH 2 TRAINING COMPLETE")
print("="*60)

### 5.5 Train Fusion Model

In [None]:
print("\n" + "="*60)
print("TRAINING FUSION MODEL (Final Layer)")
print("="*60)

# Prepare fusion inputs
# p1: XGBoost predictions
xgb_pred_train_tensor = torch.FloatTensor(xgb_pred_train).to(device)
xgb_pred_val_tensor = torch.FloatTensor(xgb_pred_val).to(device)

# h1: XGBoost latent (leaf indices), standardized per tree.
# Leaf indices are node IDs, so this scaling is only a rough numeric encoding.
# StandardScaler is already imported in the setup cell.
scaler_xgb_latent = StandardScaler()
xgb_latent_train_scaled = scaler_xgb_latent.fit_transform(xgb_latent_train)
xgb_latent_val_scaled = scaler_xgb_latent.transform(xgb_latent_val)

xgb_latent_train_tensor = torch.FloatTensor(xgb_latent_train_scaled).to(device)
xgb_latent_val_tensor = torch.FloatTensor(xgb_latent_val_scaled).to(device)

# h2 and p2 are already tensors from DNN

# Create fusion datasets
fusion_train_dataset = TensorDataset(
    xgb_pred_train_tensor,
    xgb_latent_train_tensor,
    dnn_latent_train,
    dnn_pred_train,
    y_train_tensor
)

fusion_val_dataset = TensorDataset(
    xgb_pred_val_tensor,
    xgb_latent_val_tensor,
    dnn_latent_val,
    dnn_pred_val,
    y_val_tensor
)

fusion_train_loader = DataLoader(fusion_train_dataset, batch_size=64, shuffle=True)
fusion_val_loader = DataLoader(fusion_val_dataset, batch_size=128, shuffle=False)

# Initialize fusion model
fusion_model = HybridFusionModel(
    boosting_pred_dim=1,
    boosting_latent_dim=xgb_latent_train.shape[1],
    dnn_latent_dim=64,
    dropout_rate=0.3
).to(device)

# Loss and optimizer
criterion_fusion = nn.BCEWithLogitsLoss()
optimizer_fusion = optim.Adam(fusion_model.parameters(), lr=0.001, weight_decay=1e-5)
scheduler_fusion = optim.lr_scheduler.ReduceLROnPlateau(optimizer_fusion, mode='min', patience=5, factor=0.5)

# Training loop
print("\n[INFO] Training fusion model...")
n_epochs_fusion = 80
best_val_loss_fusion = float('inf')
patience_fusion = 15
patience_counter_fusion = 0

fusion_train_losses = []
fusion_val_losses = []

for epoch in range(n_epochs_fusion):
    # Training phase
    fusion_model.train()
    train_loss = 0.0
    
    for xgb_p, xgb_h, dnn_h, dnn_p, batch_y in fusion_train_loader:
        optimizer_fusion.zero_grad()
        output = fusion_model(xgb_p, xgb_h, dnn_h, dnn_p)
        loss = criterion_fusion(output, batch_y)
        loss.backward()
        optimizer_fusion.step()
        train_loss += loss.item()
    
    train_loss /= len(fusion_train_loader)
    fusion_train_losses.append(train_loss)
    
    # Validation phase
    fusion_model.eval()
    val_loss = 0.0
    
    with torch.no_grad():
        for xgb_p, xgb_h, dnn_h, dnn_p, batch_y in fusion_val_loader:
            output = fusion_model(xgb_p, xgb_h, dnn_h, dnn_p)
            loss = criterion_fusion(output, batch_y)
            val_loss += loss.item()
    
    val_loss /= len(fusion_val_loader)
    fusion_val_losses.append(val_loss)
    
    # Learning rate scheduling
    scheduler_fusion.step(val_loss)
    
    # Print progress
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{n_epochs_fusion}] - Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}")
    
    # Early stopping
    if val_loss < best_val_loss_fusion:
        best_val_loss_fusion = val_loss
        patience_counter_fusion = 0
        # Save best model
        torch.save(fusion_model.state_dict(), 'best_fusion_model.pth')
    else:
        patience_counter_fusion += 1
        if patience_counter_fusion >= patience_fusion:
            print(f"\n[INFO] Early stopping triggered at epoch {epoch+1}")
            break

# Load best model
fusion_model.load_state_dict(torch.load('best_fusion_model.pth'))

# Plot training history
plt.figure(figsize=(10, 5))
plt.plot(fusion_train_losses, label='Train Loss')
plt.plot(fusion_val_losses, label='Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Fusion Model Training History')
plt.legend()
plt.grid(True)
plt.show()

print("\n" + "="*60)
print("FUSION MODEL TRAINING COMPLETE")
print("="*60)
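
The fusion head above is trained on branch outputs for the same rows the branches were fit on, which can make those inputs look more reliable than they would be on unseen data. A common remedy in stacking is to feed the meta-model out-of-fold predictions instead. The sketch below shows the idea on synthetic data, with `LogisticRegression` standing in for a branch model. It is not the notebook's pipeline, and the helper name `out_of_fold_proba` is made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold


def out_of_fold_proba(make_model, X, y, n_splits=5, seed=42):
    """Return class-1 probabilities where each row is scored by a model
    that never saw that row during fitting."""
    oof = np.full(len(y), np.nan)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, holdout_idx in skf.split(X, y):
        model = make_model()  # fresh model per fold
        model.fit(X[fit_idx], y[fit_idx])
        oof[holdout_idx] = model.predict_proba(X[holdout_idx])[:, 1]
    return oof


# Synthetic stand-in data (imbalanced, as a defect dataset might be)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, weights=[0.8, 0.2], random_state=42
)
oof_pred = out_of_fold_proba(
    lambda: LogisticRegression(max_iter=1000), X_demo, y_demo
)
print(oof_pred.shape, np.isnan(oof_pred).any())
```

Applied here, the same loop would wrap the XGBoost and DNN branches, and the fusion model would train on the resulting out-of-fold `p1`, `h1`, `h2`, and `p2` rather than on in-sample outputs.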

## 6. Model Evaluation
### 6.1 Generate Final Predictions

In [None]:
print("\n" + "="*60)
print("FINAL MODEL EVALUATION")
print("="*60)

# Get final predictions on validation set
fusion_model.eval()
with torch.no_grad():
    final_pred_logits = fusion_model(
        xgb_pred_val_tensor,
        xgb_latent_val_tensor,
        dnn_latent_val,
        dnn_pred_val
    )
    final_pred_probs = torch.sigmoid(final_pred_logits).cpu().numpy()

final_pred_binary = (final_pred_probs > 0.5).astype(int)

# Calculate metrics
accuracy = accuracy_score(y_val, final_pred_binary)
precision = precision_score(y_val, final_pred_binary)
recall = recall_score(y_val, final_pred_binary)
f1 = f1_score(y_val, final_pred_binary)
roc_auc = roc_auc_score(y_val, final_pred_probs)

print("\n[FINAL RESULTS] Hybrid Model Performance:")
print(f"   - Accuracy:  {accuracy:.4f}")
print(f"   - Precision: {precision:.4f}")
print(f"   - Recall:    {recall:.4f}")
print(f"   - F1-Score:  {f1:.4f}")
print(f"   - ROC-AUC:   {roc_auc:.4f}")

# Confusion Matrix
cm = confusion_matrix(y_val, final_pred_binary)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False,
            xticklabels=['Good', 'NG'], yticklabels=['Good', 'NG'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix - Hybrid Model')
plt.show()

# Classification Report
print("\n[CLASSIFICATION REPORT]")
print(classification_report(y_val, final_pred_binary, target_names=['Good', 'NG']))

### 6.2 Compare Individual Models vs Hybrid

In [None]:
# Create comparison table
comparison_results = pd.DataFrame({
    'Model': ['XGBoost (Design)', 'DNN (Simulation)', 'Hybrid Fusion'],
    'Accuracy': [
        accuracy_score(y_val, xgb_val_pred_binary),
        accuracy_score(y_val, dnn_val_pred_binary),
        accuracy
    ],
    'ROC-AUC': [
        roc_auc_score(y_val, xgb_pred_val),
        roc_auc_score(y_val, dnn_val_pred_prob),
        roc_auc
    ],
    'F1-Score': [
        f1_score(y_val, xgb_val_pred_binary),
        f1_score(y_val, dnn_val_pred_binary),
        f1
    ]
})

print("\n[MODEL COMPARISON]")
print(comparison_results.to_string(index=False))

# Visualize comparison
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

metrics = ['Accuracy', 'ROC-AUC', 'F1-Score']
for idx, metric in enumerate(metrics):
    axes[idx].bar(comparison_results['Model'], comparison_results[metric])
    axes[idx].set_ylabel(metric)
    axes[idx].set_title(f'{metric} Comparison')
    axes[idx].set_ylim([0, 1])
    axes[idx].tick_params(axis='x', rotation=15)
    for i, v in enumerate(comparison_results[metric]):
        axes[idx].text(i, v + 0.02, f'{v:.3f}', ha='center')

plt.tight_layout()
plt.show()

## 7. Prediction Pipeline for New Data
### 7.1 Create End-to-End Prediction Function

In [None]:
def predict_tire_defects(df_new, preprocessor, xgb_model, dnn_model, fusion_model, 
                         selected_design_features, selected_simulation_features,
                         scaler_xgb_latent, device):
    """
    End-to-end prediction pipeline for new tire data.
    
    Parameters:
    -----------
    df_new : pd.DataFrame
        New data to predict (without target column)
    preprocessor : FeaturePreprocessor
        Fitted preprocessor object
    xgb_model : XGBClassifier
        Trained XGBoost model
    dnn_model : SimulationDNN
        Trained DNN model
    fusion_model : HybridFusionModel
        Trained fusion model
    selected_design_features : list
        List of selected design feature names
    selected_simulation_features : list
        List of selected simulation feature names
    scaler_xgb_latent : StandardScaler
        Fitted scaler for XGBoost latent features
    device : torch.device
        Device to run inference on
    
    Returns:
    --------
    predictions : np.ndarray
        Predicted probabilities of defect (NG)
    """
    
    print("[INFO] Running prediction pipeline...")
    
    # Step 1: Preprocess
    df_processed = preprocessor.transform(df_new)
    
    # Step 2: Extract selected features
    X_design = df_processed[selected_design_features]
    X_simulation = df_processed[selected_simulation_features]
    
    # Step 3: Get XGBoost outputs
    xgb_pred = xgb_model.predict_proba(X_design)[:, 1].reshape(-1, 1)
    xgb_latent = xgb_model.apply(X_design)
    xgb_latent_scaled = scaler_xgb_latent.transform(xgb_latent)
    
    # Step 4: Get DNN outputs
    X_sim_tensor = torch.FloatTensor(X_simulation.values).to(device)
    dnn_model.eval()
    with torch.no_grad():
        dnn_latent, dnn_pred = dnn_model(X_sim_tensor)
    
    # Step 5: Get fusion predictions
    xgb_pred_tensor = torch.FloatTensor(xgb_pred).to(device)
    xgb_latent_tensor = torch.FloatTensor(xgb_latent_scaled).to(device)
    
    fusion_model.eval()
    with torch.no_grad():
        final_logits = fusion_model(xgb_pred_tensor, xgb_latent_tensor, dnn_latent, dnn_pred)
        final_probs = torch.sigmoid(final_logits).cpu().numpy()
    
    print("[INFO] Prediction complete")
    return final_probs


# Example usage (uncomment when you have test data)
# df_test = pd.read_csv('test.csv')
# test_predictions = predict_tire_defects(
#     df_test, preprocessor, xgb_model, dnn_model, fusion_model,
#     selected_design_features, selected_simulation_features,
#     scaler_xgb_latent, device
# )
# 
# # Create submission file
# submission = pd.DataFrame({
#     'id': df_test['id'],  # Adjust column name as needed
#     'prediction': test_predictions.flatten()
# })
# submission.to_csv('submission.csv', index=False)

print("\n[INFO] Prediction pipeline function created successfully")

## 8. Save Models and Artifacts

In [None]:
import pickle

print("\n" + "="*60)
print("SAVING MODELS AND ARTIFACTS")
print("="*60)

# Save preprocessor
with open('preprocessor.pkl', 'wb') as f:
    pickle.dump(preprocessor, f)
print("[SAVED] preprocessor.pkl")

# Save XGBoost model
xgb_model.save_model('xgb_model.json')
print("[SAVED] xgb_model.json")

# Save XGBoost latent scaler
with open('scaler_xgb_latent.pkl', 'wb') as f:
    pickle.dump(scaler_xgb_latent, f)
print("[SAVED] scaler_xgb_latent.pkl")

# Save feature lists
feature_config = {
    'selected_design_features': selected_design_features,
    'selected_simulation_features': selected_simulation_features,
    'all_design_features': DESIGN_FEATURES,
    'all_simulation_features': SIMULATION_FEATURES
}

with open('feature_config.pkl', 'wb') as f:
    pickle.dump(feature_config, f)
print("[SAVED] feature_config.pkl")

# Neural network models are already saved as:
# - best_dnn_model.pth
# - best_fusion_model.pth

print("\n[INFO] All models and artifacts saved successfully!")
print("\n" + "="*60)
print("PROJECT COMPLETE")
print("="*60)

## 9. Summary and Next Steps

### What We've Built:
1. **Feature Engineering Pipeline**: Modular preprocessing with verbose logging
2. **Feature Selection**: Using XGBoost importance to select top features
3. **Hybrid Model Architecture**:
   - Branch 1: XGBoost for Design/Process parameters
   - Branch 2: Deep Neural Network for Simulation features
   - Fusion Layer: MLP combining both branches for final prediction

### Key Advantages:
- **Ensemble Strength**: Combines tree-based and neural network approaches
- **Feature-Specific Learning**: Different model types for different feature groups
- **Interpretability**: XGBoost provides feature importances, and the comparison in Section 6.2 shows what each branch achieves on its own
- **Robustness**: Dropout, weight decay, and early stopping help limit overfitting in the neural networks

### Next Steps:
1. **Hyperparameter Tuning**: Use Optuna or GridSearch for optimal parameters
2. **Cross-Validation**: Implement K-fold CV for more robust evaluation
3. **Feature Engineering**: Create domain-specific features (e.g., curve statistics)
4. **Model Ensemble**: Try CatBoost, LightGBM in addition to XGBoost
5. **Threshold Optimization**: Find optimal decision threshold for F1 or business metric
6. **Production Deployment**: Package models for inference API
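
For the threshold optimization step in the list above, one simple approach is a grid scan over candidate thresholds on validation probabilities. The sketch below uses a small hand-made example. `best_f1_threshold` is a hypothetical helper, and in this notebook the inputs would be `y_val` and `final_pred_probs`.

```python
import numpy as np


def best_f1_threshold(y_true, probs, thresholds=None):
    """Scan thresholds and return (threshold, f1) with the highest F1."""
    y_true = np.asarray(y_true).ravel()
    probs = np.asarray(probs).ravel()
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)  # 0.05, 0.10, ..., 0.95
    best_t, best_f1 = 0.5, -1.0
    for t in thresholds:
        pred = probs >= t
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        fn = np.sum(~pred & (y_true == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        if f1 > best_f1:  # ties keep the lowest threshold
            best_t, best_f1 = float(t), float(f1)
    return best_t, best_f1


# Hand-made example: 3 defective (1) and 6 good (0) tires
y_demo = [0, 0, 0, 1, 1, 0, 1, 0, 0]
p_demo = [0.12, 0.22, 0.37, 0.42, 0.80, 0.32, 0.47, 0.07, 0.62]
t, f1 = best_f1_threshold(y_demo, p_demo)
print(f"threshold={t:.2f}, F1={f1:.3f}")  # threshold=0.40, F1=0.857
```

A threshold chosen this way should be picked on validation data and confirmed on a separate hold-out set, since tuning and reporting on the same rows tends to overstate F1.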

### Manufacturing AI Best Practices Implemented:
- Separate handling of design vs simulation data (domain knowledge)
- Feature selection to reduce noise from high-dimensional data
- Ensemble methods for critical quality decisions
- Saved model artifacts (preprocessor, models, scalers, feature lists) for reuse
- Reproducible pipeline with random seeds