# Model Evaluation and Cross-Validation

This notebook defines helper functions for evaluating classification models and running cross-validation. Both functions are wrapped in the project's `error_handler` decorator, and evaluation raises a `ModelError` when its inputs are inconsistent.

## 1. Importing Libraries

We start by importing the project's error-handling helpers, the scikit-learn metric and cross-validation functions, and NumPy.

In [None]:
# Import necessary libraries
from utils.error_handler import error_handler, ModelError
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import cross_val_score
import numpy as np
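
The `utils.error_handler` module is project-local and is not shown in this notebook. To run the cells without it, a stand-in along the following lines could replace that import. This is only a minimal sketch: it assumes the decorator logs unexpected exceptions and re-raises them as `ModelError`, which may differ from the real module, and `safe_divide` is a toy function added purely to exercise it.

```python
import functools
import logging

logger = logging.getLogger(__name__)


class ModelError(Exception):
    """Hypothetical stand-in: raised when model evaluation fails."""


def error_handler(func):
    """Hypothetical stand-in: log unexpected errors and re-raise them as ModelError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ModelError:
            # Already a domain error: let it propagate unchanged.
            raise
        except Exception as exc:
            logger.error("%s failed: %s", func.__name__, exc)
            raise ModelError(f"{func.__name__} failed: {exc}") from exc
    return wrapper


@error_handler
def safe_divide(a, b):
    # Toy function used only to exercise the decorator.
    return a / b
```

With this stand-in, `safe_divide(1, 0)` raises `ModelError`, with the original `ZeroDivisionError` attached as its cause.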

## 2. Evaluating Classification Models

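The `evaluate_classification_model` function computes accuracy, precision, recall, and F1 score for a classifier's predictions. Precision, recall, and F1 use weighted averaging (`average='weighted'`), so each class contributes in proportion to its number of true samples and the function also works for multiclass problems. It raises a `ModelError` when the number of predictions differs from the number of true labels.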

In [None]:
@error_handler
def evaluate_classification_model(y_true, y_pred):
    """
    Evaluate the classification model using various metrics.

    Parameters:
    - y_true: array-like, true labels
    - y_pred: array-like, predicted labels

    Returns:
    - metrics: dict, containing accuracy, precision, recall, and F1 score
    """
    # Fail early if the two label sequences differ in length
    if len(y_true) != len(y_pred):
        raise ModelError("Prediction and ground truth dimensions do not match")

    metrics = {
        'accuracy': accuracy_score(y_true, y_pred),
        'precision': precision_score(y_true, y_pred, average='weighted'),
        'recall': recall_score(y_true, y_pred, average='weighted'),
        'f1': f1_score(y_true, y_pred, average='weighted')
    }
    
    return metrics
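
To make the weighted averaging concrete, the following sketch computes weighted precision by hand in plain Python. The helper `weighted_precision` and the toy labels are illustrative assumptions, not part of the notebook. Each class's precision is weighted by its number of true samples, and a class that is never predicted is given a precision of 0.0.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Illustrative helper (not part of the notebook): support-weighted precision."""
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n_true in support.items():
        # Per-class precision; treated as 0.0 when the class is never predicted.
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        total += n_true * precision
    return total / len(y_true)


y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
print(weighted_precision(y_true, y_pred))  # 0.75
```

Here the per-class precisions are 1.0, 2/3, and 0.5 with supports 2, 3, and 1, giving (2 + 2 + 0.5) / 6 = 0.75.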

## 3. Cross-Validation

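The `cross_validate_model` function runs k-fold cross-validation for a given model and dataset. It returns the mean and standard deviation of the fold scores together with the individual scores. Because no `scoring` argument is passed, `cross_val_score` uses the estimator's own `score` method (accuracy for scikit-learn classifiers), and with an integer `cv` it uses stratified folds when the estimator is a classifier with binary or multiclass targets.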

In [None]:
@error_handler
def cross_validate_model(model, X, y, cv=5):
    """
    Perform cross-validation on the given model.

    Parameters:
    - model: the model to evaluate
    - X: array-like, feature data
    - y: array-like, target labels
    - cv: int, number of cross-validation folds

    Returns:
    - dict: containing mean score, standard deviation, and individual scores
    """
    # cross_val_score fits a fresh clone of the model on each training fold
    scores = cross_val_score(model, X, y, cv=cv)
    return {
        'mean_score': np.mean(scores),
        'std_score': np.std(scores),
        'scores': scores
    }
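
One detail to keep in mind when reading `std_score`: `np.std` defaults to the population standard deviation (`ddof=0`), which divides by the number of folds rather than by one less. The sketch below uses the standard library on made-up fold scores (an assumption for illustration, not real results) to show the two conventions side by side.

```python
import statistics

# Hypothetical fold scores, standing in for the array returned by cross_val_score.
fold_scores = [0.9, 1.0, 0.8, 0.9, 0.9]

mean_score = statistics.fmean(fold_scores)
population_std = statistics.pstdev(fold_scores)  # divides by n, like np.std's default (ddof=0)
sample_std = statistics.stdev(fold_scores)       # divides by n - 1, like np.std(scores, ddof=1)

print(f"mean={mean_score:.3f}  population std={population_std:.3f}  sample std={sample_std:.3f}")
```

With only a handful of folds the difference between the two is noticeable, so it is worth stating which one is reported.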

## 4. Example Usage

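The following example trains a random forest on the Iris dataset, evaluates its predictions, and then runs 5-fold cross-validation. Note that the evaluation metrics are computed on the same data the model was trained on, so they are optimistic; the cross-validation scores give a more realistic estimate.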

In [None]:
# Example usage
if __name__ == "__main__":
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    # Load sample data
    data = load_iris()
    X, y = data.data, data.target

    # Train a model
    model = RandomForestClassifier(random_state=42)
    model.fit(X, y)

    # Make predictions
    y_pred = model.predict(X)

    # Evaluate the model
    metrics = evaluate_classification_model(y, y_pred)
    print("Evaluation Metrics:", metrics)

    # Perform cross-validation
    cv_results = cross_validate_model(model, X, y, cv=5)
    print("Cross-Validation Results:", cv_results)
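
Since the example above scores the model on its own training data, a hold-out split is a common complement to cross-validation. The sketch below is one possible variant, not part of the original workflow: the 30% test size and the stratified split are assumptions chosen for illustration, and it calls `accuracy_score` directly so that it runs without the project's `utils` module.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples; stratify so each class keeps its proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Score only on samples the model has not seen during training.
holdout_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Hold-out accuracy: {holdout_accuracy:.3f}")
```

In the notebook itself, `evaluate_classification_model(y_test, model.predict(X_test))` would report the full set of metrics on the held-out samples.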