# 7. 🤖 Model Training and Evaluation



This notebook demonstrates how to train and evaluate machine learning models using the 3WToolkit. We will cover a complete workflow from data preparation to model assessment, including:

- Loading and preprocessing the 3W dataset
- Configuring and training an MLP model
- Evaluating model performance using various metrics
- Visualizing training history and results

## Learning Objectives

By the end of this notebook, you will be able to:
- Configure and train MLP models using the 3WToolkit
- Evaluate model performance using comprehensive metrics
- Visualize training progress and model predictions
- Understand the complete model training workflow


## Required Imports

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch

from tqdm import tqdm
from pathlib import Path

from ThreeWToolkit.preprocessing import Windowing
from ThreeWToolkit.core.base_preprocessing import WindowingConfig
from ThreeWToolkit.trainer.trainer import ModelTrainer, TrainerConfig
from ThreeWToolkit.models.mlp import MLPConfig
from ThreeWToolkit.dataset import ParquetDataset
from ThreeWToolkit.core.base_dataset import ParquetDatasetConfig
from ThreeWToolkit.core.base_assessment import ModelAssessmentConfig
from ThreeWToolkit.assessment.model_assess import ModelAssessment
from ThreeWToolkit.core.enums import TaskType


## 1. Dataset Loading

Let's create a `ParquetDataset` that loads cleaned data for target classes 0, 1, and 2.

In [None]:
# Define dataset path
dataset_path = Path("../../dataset")

# Create and load dataset with target classes 0, 1, and 2
ds_config = ParquetDatasetConfig(
    path=dataset_path, 
    clean_data=True, 
    download=False, 
    target_class=[0, 1, 2]
)
ds = ParquetDataset(ds_config)

print("Dataset loaded successfully!")
print(f"Total events: {len(ds)}")
print(f"Sample event structure: {ds[0].keys()}")
print(f"Sample label: {ds[0]['label']}")


## 2. Model Configuration and Trainer Setup

**II. Instantiating configuration classes for the MLP model, training parameters, and evaluation parameters.**

With the `ParquetDataset` instance defined, we can set the MLP parameters through an `MLPConfig` object. This configuration is passed to `TrainerConfig`, which in turn is handed to `ModelTrainer`, the class that encapsulates the training workflow.

`TrainerConfig` controls most of the parameters relevant to training: optimizer, loss criterion, batch size, number of epochs, learning rate, device, and random seed. Note that `input_size` equals the window size, since each window is one input sample, and `output_size` equals the number of target classes.

Finally, `ModelTrainer` is instantiated with the training configuration, while the `ModelAssessment` object prepares the evaluation pipeline. The model architecture can be inspected by printing `trainer.model`.


In [None]:
# Define window size for the model
window_size = 100

# Configure the MLP model
mlp_config = MLPConfig(
    input_size=window_size,
    hidden_sizes=(32, 16),
    output_size=3,  # 3 classes: 0, 1, 2
    random_seed=11,
    activation_function="relu",
    regularization=None,
)

# Configure the trainer
trainer_config = TrainerConfig(
    optimizer="adam",
    criterion="cross_entropy",
    batch_size=32,
    epochs=20,
    seed=11,
    config_model=mlp_config,
    learning_rate=0.001,
    device="cuda" if torch.cuda.is_available() else "cpu",
    cross_validation=False,
    shuffle_train=True,
)

# Configure model assessment
assessment_config = ModelAssessmentConfig(
    metrics=["balanced_accuracy", "precision", "recall", "f1"],
    task_type=TaskType.CLASSIFICATION,
    class_names=["Class_0", "Class_1", "Class_2"],
    export_results=True,
    generate_report=False,
)

# Initialize trainer and assessor
trainer = ModelTrainer(trainer_config)
assessor = ModelAssessment(assessment_config)

# Display the model architecture
print("Model Architecture:")
print(trainer.model)


## 3. Data Preprocessing with Windowing

**III. Preprocessing the data**

The next step is to iterate over the dataset of time series events and apply a windowing function to a selected signal column, in this case "T-TPT". The windows use a Hann taper, a length of `window_size` samples, 50% overlap, and padding for the last window.

All windowed segments from all events are then concatenated into a single DataFrame (`dfs_final`). This prepares the data for supervised training: each row is one windowed segment together with the class label of the event it came from.


In [None]:
# Select target columns and prepare training data with windowing
selected_col = "T-TPT"
dfs = []

# Configure windowing
wind = Windowing(WindowingConfig(
    window="hann",
    window_size=window_size,
    overlap=0.5,
    pad_last_window=True
))

print("Processing events with windowing...")
for event in tqdm(ds):
    # Apply windowing to the selected column
    windowed_signal = wind(event["signal"][selected_col])
    
    # Remove the window column (not needed for training)
    windowed_signal.drop(columns=["win"], inplace=True)
    
    # Label every window with the event's class
    # (np.unique sorts, so [0] picks the smallest class value if the event has several)
    windowed_signal["label"] = np.unique(event["label"]["class"])[0]
    
    dfs.append(windowed_signal)

# Concatenate all windowed data
dfs_final = pd.concat(dfs, ignore_index=True, axis=0)

print("\nPreprocessing completed!")
print(f"Total windows: {len(dfs_final)}")
print(f"Window size: {window_size}")
print(f"Features per window: {dfs_final.shape[1] - 1}")  # -1 for label column
print("Label distribution:")
print(dfs_final["label"].value_counts().sort_index())
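
To make the windowing step more concrete, here is a minimal NumPy sketch of the same idea: overlapping, Hann-weighted windows with optional zero-padding of the last one. It is independent of 3WToolkit; the function name `sliding_windows` and the zero-padding rule are assumptions made for illustration, and the toolkit's `Windowing` class may pad, weight, or name its output columns differently.

```python
import numpy as np


def sliding_windows(signal, window_size, overlap=0.5, pad_last=True):
    """Split a 1-D signal into overlapping, Hann-weighted windows.

    Illustrative only: this is not the 3WToolkit implementation.
    """
    signal = np.asarray(signal, dtype=float)
    # Distance between the starts of two consecutive windows.
    step = max(1, int(round(window_size * (1.0 - overlap))))
    n = len(signal)

    if pad_last and n > 0:
        # Zero-pad (assumption) so the final partial window is kept.
        n_windows = max(1, int(np.ceil((n - window_size) / step)) + 1)
        padded_len = (n_windows - 1) * step + window_size
        signal = np.pad(signal, (0, padded_len - n))
    else:
        # Drop any trailing samples that do not fill a whole window.
        n_windows = max(0, (n - window_size) // step + 1)

    if n_windows == 0:
        return np.empty((0, window_size))

    taper = np.hanning(window_size)
    starts = np.arange(n_windows) * step
    return np.stack([signal[s:s + window_size] * taper for s in starts])


# 250 samples, windows of 100 with 50% overlap -> starts at 0, 50, 100, 150
windows = sliding_windows(np.arange(250), window_size=100, overlap=0.5)
print(windows.shape)  # (4, 100)
```

Each row of the result plays the role of one row of `dfs_final` (without the label column), which is why the MLP's `input_size` is set to `window_size`.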


## 4. Model Training

**IV. Training**

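With the windowed data prepared, we can call the `train` method of the trainer object. The feature matrix holds one row per window, and the label vector holds the corresponding class.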


In [None]:
# Prepare training data
x_train = dfs_final.drop(columns=["label"])  # All feature columns (everything except the label)
y_train = dfs_final["label"].astype(int)  # Convert labels to integers

print("Starting model training...")
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")

# Train the MLP model using the ModelTrainer interface
trainer.train(x_train=x_train, y_train=y_train)

print("Training completed!")


## 5. Model Evaluation and Assessment

**V. Assessment**

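The trainer class allows direct evaluation of the trained model through the `assess` method, which returns a dictionary containing performance metrics and evaluation parameters. Note that this example evaluates on the same data that was used for training, so the reported metrics are optimistic; use a held-out set for a realistic estimate.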


In [None]:
# Method 1: Direct assessment using trainer
print("=== Method 1: Direct Assessment using Trainer ===")
trainer_results = trainer.assess(x_train, y_train, assessment_config)


**Another option, and the recommended one, is to use the `ModelAssessment` class to evaluate the results.**


In [None]:
# Method 2: Using ModelAssessment class (recommended)
print("\n=== Method 2: Using ModelAssessment Class (Recommended) ===")
results = assessor.evaluate(trainer.model, x_train, y_train)
print("\nDetailed Results:")
print(results)
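
To clarify what the four configured metrics measure, the sketch below computes them from a confusion matrix using NumPy only. It is not the toolkit's implementation: the function name `classification_metrics` is hypothetical, and macro averaging (an unweighted mean over classes) is an assumption, since the notebook does not state how `ModelAssessment` averages per-class scores. It also assumes every class appears at least once in `y_true`.

```python
import numpy as np


def classification_metrics(y_true, y_pred, n_classes):
    """Macro-averaged precision, recall, F1 and balanced accuracy."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # confusion[i, j] = number of samples of true class i predicted as class j.
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(confusion, (y_true, y_pred), 1)

    true_pos = np.diag(confusion).astype(float)
    predicted = confusion.sum(axis=0)  # column sums: predictions per class
    actual = confusion.sum(axis=1)     # row sums: true samples per class

    # Classes with an empty denominator get a score of 0.
    zeros = np.zeros(n_classes)
    precision = np.divide(true_pos, predicted, out=zeros.copy(), where=predicted > 0)
    recall = np.divide(true_pos, actual, out=zeros.copy(), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)

    return {
        # Balanced accuracy is the mean of the per-class recalls.
        "balanced_accuracy": float(recall.mean()),
        "precision": float(precision.mean()),
        "recall": float(recall.mean()),
        "f1": float(f1.mean()),
    }


# Tiny hand-made example, not output from the model above.
print(classification_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3))
```

Balanced accuracy is useful here because the number of windows per class can differ considerably, and plain accuracy would be dominated by the largest class.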


### Retrieving Aggregated Metrics


In [None]:
# Display summary metrics
print("=== Model Performance Summary ===")
print(assessor.summary())


## 6. Training History Visualization

The trainer object also records the training and validation loss for each epoch, which can be plotted once training is complete.


In [None]:
# Visualize training history
plt.figure(figsize=(12, 5))

# Plot training and validation loss
plt.subplot(1, 2, 1)
plt.plot(trainer.history[0]["val_loss"], label="Validation Loss", marker='o')
plt.plot(trainer.history[0]["train_loss"], label="Training Loss", marker='s')
plt.title("Training and Validation Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.grid(True, alpha=0.3)

# Plot training and validation accuracy (if available)
plt.subplot(1, 2, 2)
if "val_accuracy" in trainer.history[0] and "train_accuracy" in trainer.history[0]:
    plt.plot(trainer.history[0]["val_accuracy"], label="Validation Accuracy", marker='o')
    plt.plot(trainer.history[0]["train_accuracy"], label="Training Accuracy", marker='s')
    plt.title("Training and Validation Accuracy")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.legend()
    plt.grid(True, alpha=0.3)
else:
    plt.text(0.5, 0.5, "Accuracy metrics not available\nin this training configuration", 
             ha='center', va='center', transform=plt.gca().transAxes)
    plt.title("Training Metrics")

plt.tight_layout()
plt.show()

# Print final training statistics
print("=== Final Training Statistics ===")
print(f"Final Training Loss: {trainer.history[0]['train_loss'][-1]:.4f}")
print(f"Final Validation Loss: {trainer.history[0]['val_loss'][-1]:.4f}")
if "val_accuracy" in trainer.history[0] and "train_accuracy" in trainer.history[0]:
    print(f"Final Training Accuracy: {trainer.history[0]['train_accuracy'][-1]:.4f}")
    print(f"Final Validation Accuracy: {trainer.history[0]['val_accuracy'][-1]:.4f}")
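
Beyond reading the curves by eye, it can help to summarize them numerically. The sketch below is one way to do that, assuming the history is available as two plain lists of per-epoch losses, as `trainer.history[0]["train_loss"]` and `trainer.history[0]["val_loss"]` are used above. The helper name `summarize_history` is hypothetical, and the loss values in the example are made up purely for illustration.

```python
def summarize_history(train_loss, val_loss):
    """Report the best validation epoch and the final train/validation gap."""
    if not val_loss or len(train_loss) != len(val_loss):
        raise ValueError("train_loss and val_loss must be non-empty and equally long")

    # Index of the epoch with the lowest validation loss.
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best_index + 1,  # 1-based, to match the plot's x-axis label
        "best_val_loss": val_loss[best_index],
        # A large positive gap at the end of training hints at overfitting.
        "final_gap": val_loss[-1] - train_loss[-1],
    }


# Made-up values: validation loss bottoms out at epoch 3, then rises again.
summary = summarize_history([0.9, 0.6, 0.4, 0.3], [1.0, 0.7, 0.65, 0.8])
print(summary["best_epoch"])  # 3
```

If the best epoch is well before the last one, training for fewer epochs or adding regularization may be worth trying.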


## 7. Additional Analysis and Insights

Let's perform some additional analysis to better understand our model's performance and the data characteristics.


In [None]:
# Analyze data characteristics
print("=== Data Analysis ===")
print(f"Dataset size: {len(ds)} events")
print(f"Total windows after preprocessing: {len(dfs_final)}")
print(f"Window size: {window_size}")
print("Overlap: 50%")

# Class distribution analysis
print("\nClass Distribution:")
class_counts = dfs_final["label"].value_counts().sort_index()
for class_id, count in class_counts.items():
    percentage = (count / len(dfs_final)) * 100
    print(f"  Class {class_id}: {count} windows ({percentage:.1f}%)")

# Model complexity analysis
print("\nModel Complexity:")
print(f"  Input features: {window_size}")
print(f"  Hidden layers: {len(mlp_config.hidden_sizes)}")
print(f"  Hidden units: {mlp_config.hidden_sizes}")
print(f"  Output classes: {mlp_config.output_size}")
print(f"  Total parameters: {sum(p.numel() for p in trainer.model.parameters())}")

# Training configuration summary
print("\nTraining Configuration:")
print(f"  Optimizer: {trainer_config.optimizer}")
print(f"  Learning rate: {trainer_config.learning_rate}")
print(f"  Batch size: {trainer_config.batch_size}")
print(f"  Epochs: {trainer_config.epochs}")
print(f"  Device: {trainer_config.device}")
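
The total parameter count printed above can be checked by hand. The sketch below assumes the MLP consists only of fully connected layers with bias terms, which is an assumption about the toolkit's model; if it adds other trainable layers (for example normalization), the number reported by `trainer.model.parameters()` will differ. The helper name `mlp_parameter_count` is hypothetical.

```python
def mlp_parameter_count(input_size, hidden_sizes, output_size):
    """Count weights and biases of a plain fully connected network."""
    layer_sizes = [input_size, *hidden_sizes, output_size]
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


# Values from the MLPConfig used in this notebook:
# (100*32 + 32) + (32*16 + 16) + (16*3 + 3) = 3232 + 528 + 51
print(mlp_parameter_count(100, (32, 16), 3))  # 3811
```

Because the first layer dominates the count, the window size has the largest influence on model size in this configuration.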


## 8. Summary and Next Steps

### What We've Accomplished

In this notebook, we've successfully demonstrated the complete model training and evaluation workflow using the 3WToolkit v2.0.0:

1. **Dataset Loading**: Loaded the 3W dataset with cleaned data for classes 0, 1, and 2
2. **Model Configuration**: Set up an MLP model with appropriate architecture for our classification task
3. **Data Preprocessing**: Applied windowing to convert time series into fixed-size windows suitable for training
4. **Model Training**: Trained the MLP model using the configured trainer
5. **Model Evaluation**: Assessed model performance using comprehensive metrics
6. **Visualization**: Created plots to understand training progress and model behavior

### Key Insights

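- Windowing turns each time series into fixed-size segments that can be used as individual training samples; with 50% overlap, neighboring windows share data and are therefore not strictly independent
- An MLP can be trained directly on these windows; how well it separates the classes should be judged from the metrics reported above
- The 3WToolkit provides evaluation metrics through `ModelAssessment` and exposes the training history for plotting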

### Next Steps

To further improve your model, consider:

1. **Experiment with different window sizes** to find the optimal balance between context and computational efficiency
2. **Try different model architectures** (deeper networks, different activation functions)
3. **Implement cross-validation** for more robust performance estimation
4. **Use feature extraction methods** (statistical, wavelet, or exponentially weighted features) as shown in the previous notebooks
5. **Experiment with different preprocessing strategies** (normalization, imputation methods)
6. **Try ensemble methods** or other advanced techniques
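
As a starting point for the cross-validation suggestion, here is a minimal NumPy sketch of a grouped k-fold split that keeps all windows of an event in the same fold. It is not part of 3WToolkit: the function name `grouped_kfold` is hypothetical, and it assumes you have an array with one event identifier per window (for example a hypothetical `event_id` column added during preprocessing, which this notebook does not create). The split is not stratified, so class proportions can vary between folds.

```python
import numpy as np


def grouped_kfold(groups, n_splits=5, seed=11):
    """Yield (train_idx, test_idx) pairs with each group confined to one fold."""
    groups = np.asarray(groups)
    unique_groups = np.unique(groups)
    if n_splits > len(unique_groups):
        raise ValueError("n_splits cannot exceed the number of groups")

    # Shuffle the groups (events), then deal them into n_splits folds.
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique_groups)
    for fold_groups in np.array_split(shuffled, n_splits):
        test_mask = np.isin(groups, fold_groups)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy example: 10 events with 4 windows each.
event_ids = np.repeat(np.arange(10), 4)
for train_idx, test_idx in grouped_kfold(event_ids, n_splits=5):
    # No event contributes windows to both sides of the split.
    assert set(event_ids[train_idx]).isdisjoint(event_ids[test_idx])
```

Splitting by event rather than by window matters because overlapping windows from the same event share samples; a random window-level split would let near-duplicates of test windows appear in the training set.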

### Additional Resources

- Check out the Pipeline Integration notebook for automated workflows
- Explore the Feature Extraction notebooks for advanced preprocessing techniques
- Review the Data Visualization notebooks for comprehensive analysis tools
