## Model Evaluation Metrics

### Classification Evaluation Metrics

This section trains a logistic regression classifier on a synthetic binary dataset and evaluates its test-set predictions with accuracy, precision, recall, F1-score, and the confusion matrix.

In [1]:
import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report
)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# 1. Prepare Data
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# 2. Train Model and Predict
model = LogisticRegression(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("--- Classification Metrics ---")

# A. Confusion Matrix
# Shows the counts of correct and incorrect predictions, broken down by class.
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)
# Interpretation (binary classification; rows = actual class, columns = predicted class):
# cm[0, 0]: True Negatives (TN)
# cm[0, 1]: False Positives (FP)
# cm[1, 0]: False Negatives (FN)
# cm[1, 1]: True Positives (TP)


# B. Core Metrics
print(f"\nAccuracy: {accuracy_score(y_test, y_pred):.4f}")
# Accuracy: (TP + TN) / (TP + TN + FP + FN). Overall correctness.

print(f"Precision: {precision_score(y_test, y_pred):.4f}")
# Precision: TP / (TP + FP). Proportion of predicted positives that are actually positive.

print(f"Recall (Sensitivity): {recall_score(y_test, y_pred):.4f}")
# Recall: TP / (TP + FN). Proportion of actual positives that were identified correctly.

print(f"F1-Score: {f1_score(y_test, y_pred):.4f}")
# F1-Score: 2 * Precision * Recall / (Precision + Recall), the harmonic mean of the two.
# Often more informative than accuracy when the classes are imbalanced.

# C. Classification Report (combines all metrics)
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

--- Classification Metrics ---

Confusion Matrix:
[[127  18]
 [ 27 128]]

Accuracy: 0.8500
Precision: 0.8767
Recall (Sensitivity): 0.8258
F1-Score: 0.8505

Classification Report:
              precision    recall  f1-score   support

           0       0.82      0.88      0.85       145
           1       0.88      0.83      0.85       155

    accuracy                           0.85       300
   macro avg       0.85      0.85      0.85       300
weighted avg       0.85      0.85      0.85       300
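
The formulas in the comments above can be checked by hand against the printed confusion matrix. The following is a minimal sketch, assuming the matrix values shown in the output; the variable names are illustrative and not part of the original cell.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows = actual class, columns = predicted class).
cm = np.array([[127, 18],
               [27, 128]])

# For a 2x2 matrix, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy:  {accuracy:.4f}")   # 0.8500
print(f"Precision: {precision:.4f}")  # 0.8767
print(f"Recall:    {recall:.4f}")     # 0.8258
print(f"F1-Score:  {f1:.4f}")         # 0.8505
```

These values match what `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` reported for the positive class (label 1).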



### Regression Evaluation Metrics

This section fits a linear regression model to a synthetic single-feature dataset and evaluates its test-set predictions with Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, and R-squared.

In [2]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

# 1. Prepare Data
X_reg, y_reg = make_regression(n_samples=1000, n_features=1, noise=10, random_state=42)
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(
    X_reg, y_reg, test_size=0.3, random_state=42
)

# 2. Train Model and Predict
model_reg = LinearRegression()
model_reg.fit(X_train_reg, y_train_reg)
y_pred_reg = model_reg.predict(X_test_reg)

print("\n--- Regression Metrics ---")

print(f"\nMean Absolute Error (MAE): {mean_absolute_error(y_test_reg, y_pred_reg):.4f}")
# MAE: Average of the absolute differences between predictions and actual values.

print(f"Mean Squared Error (MSE): {mean_squared_error(y_test_reg, y_pred_reg):.4f}")
# MSE: Average of the squared differences. Penalizes large errors more heavily.

print(f"Root Mean Squared Error (RMSE): {np.sqrt(mean_squared_error(y_test_reg, y_pred_reg)):.4f}")
# RMSE: Square root of MSE. Interpretable in the same unit as the target variable.

print(f"R-squared (R2 Score): {r2_score(y_test_reg, y_pred_reg):.4f}")
# R2 Score: 1 - SS_res / SS_tot. Proportion of variance in the target that the model explains.
# 1 is a perfect fit, 0 matches always predicting the mean, and it can be negative on test data.


--- Regression Metrics ---

Mean Absolute Error (MAE): 8.1743
Mean Squared Error (MSE): 104.3826
Root Mean Squared Error (RMSE): 10.2168
R-squared (R2 Score): 0.7040
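
To make the definitions in the comments concrete, here is a minimal sketch that computes the same four metrics with plain NumPy. The two small arrays are made-up illustration values, not data from the model above, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical targets and predictions, chosen only to keep the arithmetic easy.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_true - y_hat

mae = np.mean(np.abs(residuals))   # average absolute error
mse = np.mean(residuals ** 2)      # average squared error
rmse = np.sqrt(mse)                # back in the units of the target

ss_res = np.sum(residuals ** 2)                 # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot

print(f"MAE:  {mae:.4f}")   # 0.5000
print(f"MSE:  {mse:.4f}")   # 0.3750
print(f"RMSE: {rmse:.4f}")  # 0.6124
print(f"R2:   {r2:.4f}")    # 0.9486
```

Because the errors are squared before averaging, the single error of 1.0 contributes 1.0 of the 1.5 total squared error, which is why MSE and RMSE react more strongly to outliers than MAE does.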
