Regression Metrics

Error-based Metrics

MAE (Mean Absolute Error): Average of absolute differences

MAE = (1/n) * Σ|y_true - y_pred|

More robust to outliers than MSE or RMSE, interpretable in the target's original units

MSE (Mean Squared Error): Average of squared differences

MSE = (1/n) * Σ(y_true - y_pred)²

Penalizes larger errors more heavily, expressed in squared units of the target

RMSE (Root Mean Squared Error): Square root of MSE

RMSE = √MSE

In same units as target, sensitive to outliers
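
As a rough illustration of how these three metrics react to a single large error, the sketch below implements the formulas above in plain Python. The function names and the sample values are made up for this example and are not taken from any particular library.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_true - y_pred|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average of (y_true - y_pred)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical data: the second set of predictions contains one large miss.
y_true = [10.0, 12.0, 11.0, 13.0]
clean = [11.0, 11.0, 12.0, 12.0]    # every prediction is off by 1
outlier = [11.0, 11.0, 12.0, 23.0]  # last prediction is off by 10

print(mae(y_true, clean), rmse(y_true, clean))      # 1.0 1.0
print(mae(y_true, outlier), rmse(y_true, outlier))  # 3.25 and roughly 5.07
```

With one bad prediction, MAE grows from 1.0 to 3.25 while RMSE grows from 1.0 to about 5.07, which is the outlier sensitivity described above.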

RMSLE (Root Mean Squared Logarithmic Error)

RMSLE = √[(1/n) * Σ(log(y_pred + 1) - log(y_true + 1))²]

Less sensitive to outliers, penalizes underestimates more than overestimates of the same absolute size, requires y_true and y_pred greater than -1
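
The asymmetry between under- and overestimates can be checked with a small sketch. It assumes the square root is taken over the whole mean of squared log differences; the function name and the numbers are illustrative only.

```python
import math

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; all inputs must be > -1."""
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2  # log1p(x) computes log(x + 1)
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Same absolute miss of 50 on a true value of 100, in both directions.
under = rmsle([100.0], [50.0])   # prediction too low
over = rmsle([100.0], [150.0])   # prediction too high

print(under, over)  # roughly 0.68 versus roughly 0.40
```

The underestimate scores about 0.68 and the overestimate about 0.40, so the same absolute miss costs more when the prediction is too low.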

Percentage Error Metrics

MAPE (Mean Absolute Percentage Error)

MAPE = (100%/n) * Σ|(y_true - y_pred)/y_true|

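Undefined when any y_true is zero; asymmetric penalty: for non-negative data an under-forecast is capped at 100%, while an over-forecast is unbounded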

sMAPE (Symmetric MAPE)

sMAPE = (200%/n) * Σ|y_pred - y_true|/(|y_true| + |y_pred|)

Bounded between 0% and 200%
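
A minimal sketch of both percentage metrics follows, using the formulas above. The function names are illustrative, and the handling of a 0/0 term in sMAPE is an assumed convention, since the formula itself leaves that case undefined.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when any true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

def smape(y_true, y_pred):
    """Symmetric MAPE, in percent (0 to 200)."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        # Assumption: when both values are zero, count the term as zero error.
        terms.append(0.0 if denom == 0 else abs(p - t) / denom)
    return 200.0 * sum(terms) / len(terms)

# Swapping the roles of truth and prediction changes MAPE but not sMAPE.
print(mape([100.0], [150.0]), mape([150.0], [100.0]))    # 50.0 and roughly 33.3
print(smape([100.0], [150.0]), smape([150.0], [100.0]))  # both roughly 40.0
print(smape([100.0], [0.0]))                             # 200.0, the upper bound
```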

Goodness-of-fit Metrics

R² (Coefficient of Determination)

R² = 1 - (SS_residual/SS_total)

Proportion of variance explained, range (-∞, 1]; negative when the model fits worse than always predicting the mean of y_true

Adjusted R²

Adjusts for number of predictors

Adj. R² = 1 - [(1-R²)(n-1)/(n-p-1)]

Where n = number of samples and p = number of features
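
The two formulas can be sketched as follows. The function names mirror common usage but are defined here from scratch, and the data and the choice of p = 2 are hypothetical.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_residual / SS_total."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p features (requires n > p + 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.9, 5.1]

r2 = r2_score(y_true, y_pred)
print(r2)                                             # roughly 0.992
print(adjusted_r2(r2, n=5, p=2))                      # roughly 0.984
print(r2_score(y_true, [3.0] * 5))                    # 0.0, same as predicting the mean
print(r2_score(y_true, [5.0, 4.0, 3.0, 2.0, 1.0]))    # -3.0, worse than the mean
```

The last line shows why the lower end of the range is unbounded: a model that is systematically wrong can have a residual sum of squares far larger than the total sum of squares.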

Quantile Loss

Measures how well predictions match a chosen quantile of the target, weighting under- and over-prediction differently

Useful for uncertainty estimation and interval prediction
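
The notes above give no formula for quantile loss. One standard definition is the pinball loss, sketched below for a quantile level q between 0 and 1; the function name and the sample numbers are assumptions made for this example.

```python
def quantile_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction (diff >= 0) is weighted by q,
        # over-prediction (diff < 0) is weighted by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

# For the 0.9 quantile, predicting too low costs far more than too high.
print(quantile_loss([10.0], [8.0], q=0.9))   # roughly 1.8
print(quantile_loss([10.0], [12.0], q=0.9))  # roughly 0.2
# At q = 0.5 the loss is symmetric and equals half the absolute error.
print(quantile_loss([10.0], [8.0], q=0.5))   # 1.0
```

Training one model per quantile level (for example 0.1 and 0.9) with this loss is one common way to obtain a prediction interval.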

Classification Metrics

Binary Classification Metrics

Confusion Matrix-based:

Precision: TP/(TP + FP) - Accuracy of positive predictions

Recall/Sensitivity: TP/(TP + FN) - Coverage of actual positives

F1-Score: Harmonic mean of precision and recall

F1 = 2 * (precision * recall)/(precision + recall)

Specificity: TN/(TN + FP) - Coverage of actual negatives
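
The four confusion-matrix metrics above can be sketched directly from the counts. The helper names and the toy labels are illustrative, and the sketch assumes every denominator is nonzero.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

def specificity(tn, fp):
    return tn / (tn + fp)

# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)        # 3 2 1 4
print(precision(tp, fp))     # 0.6
print(recall(tp, fn))        # 0.75
print(f1_score(tp, fp, fn))  # roughly 0.667
print(specificity(tn, fp))   # roughly 0.667
```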

Threshold-independent:

ROC-AUC: Area under Receiver Operating Characteristic curve

Plots TPR (Recall) vs FPR (1-Specificity)

Measures overall ranking performance

PR-AUC: Area under Precision-Recall curve

Often more informative than ROC-AUC on imbalanced datasets, because it ignores true negatives
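
One way to see why ROC-AUC measures ranking is its pairwise interpretation: it equals the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The sketch below uses that interpretation, and summarizes the precision-recall curve with average precision, one common estimate of PR-AUC. Function names and scores are illustrative, and the average-precision helper assumes no tied scores.

```python
def roc_auc(y_true, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision values at each positive, ranked by score."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc(y_true, scores))            # 0.75
print(average_precision(y_true, scores))  # roughly 0.833
```

The quadratic loop is fine for a demonstration; production code would normally sort once and work from ranks.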

Probability Calibration:

Log Loss (Binary Cross-Entropy)

Log Loss = -(1/n) * Σ[y_true*log(y_pred) + (1-y_true)*log(1-y_pred)], where y_pred is the predicted probability of the positive class

Penalizes confident wrong predictions

Brier Score

Mean squared error of probabilities

BS = (1/n) * Σ(y_true - y_pred)²
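
Both probability metrics can be sketched as below, with y_prob standing for the predicted probability of the positive class. The function names are illustrative, and the clipping constant eps is an assumption added here to avoid log(0); it is not part of the formula.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to [eps, 1 - eps]."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # assumed safeguard against log(0)
        total += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return -total / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between outcome (0 or 1) and probability."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

# A confident correct prediction versus a confident wrong one (true label 1).
print(log_loss([1], [0.9]), brier_score([1], [0.9]))    # roughly 0.105 and 0.01
print(log_loss([1], [0.01]), brier_score([1], [0.01]))  # roughly 4.605 and 0.98
```

Log loss grows without bound as a wrong prediction becomes more confident, whereas the Brier score can never exceed 1.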

Multi-class & Multi-label

Cohen's Kappa: Agreement corrected for chance

Useful for imbalanced classes

Matthews Correlation Coefficient (MCC)

MCC = (TP×TN - FP×FN)/√((TP+FP)(TP+FN)(TN+FP)(TN+FN))

Balanced measure even with class imbalance, range [-1, 1] (the formula above is the binary form)
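
To illustrate how both measures behave under class imbalance, the sketch below computes them from binary confusion counts. The kappa formula used (observed agreement versus agreement expected by chance) is the standard one but is not spelled out in these notes. The counts are hypothetical, and returning 0 for a zero MCC denominator is an assumed convention.

```python
import math

def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical imbalanced case: 5 positives, 95 negatives, 95% accuracy.
tp, fp, fn, tn = 1, 1, 4, 94
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(accuracy)                      # 0.95
print(cohen_kappa(tp, fp, fn, tn))   # roughly 0.265
print(mcc(tp, fp, fn, tn))           # roughly 0.295
```

Accuracy looks strong at 0.95, yet both kappa and MCC stay below 0.3 because the classifier finds only one of the five positives.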

Hamming Loss: Fraction of individual label assignments that are wrong

For multi-label classification

Jaccard Score (IoU): Intersection over Union

J = TP/(TP + FP + FN)
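
A small multi-label sketch of both metrics follows, with each sample represented as a row of 0/1 indicators, one per label. The function names and data are illustrative, and treating two empty label sets as a perfect match is an assumed convention.

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label slots predicted incorrectly."""
    wrong = sum(
        1
        for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p)
        if t != p
    )
    total = sum(len(row) for row in Y_true)
    return wrong / total

def jaccard(row_true, row_pred):
    """Jaccard score TP / (TP + FP + FN) for one sample's label set."""
    pairs = list(zip(row_true, row_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = tp + fp + fn
    # Assumption: two empty label sets count as a perfect match.
    return 1.0 if denom == 0 else tp / denom

# Two samples, three possible labels each.
Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 1, 1], [0, 1, 1]]

print(hamming_loss(Y_true, Y_pred))     # 2 wrong slots out of 6, roughly 0.333
print(jaccard(Y_true[0], Y_pred[0]))    # 2 / 3, roughly 0.667
print(jaccard(Y_true[1], Y_pred[1]))    # 1 / 2 = 0.5
```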

When to Use Which

Regression:

Default starting points: RMSE (same units as the target), R² (goodness of fit)

Business context: MAPE (when relative, percentage error matters most)

Outliers present: MAE or Huber loss

Exponentially growing or wide-ranging targets: RMSLE

Probabilistic forecasting: Quantile loss

Classification:

Balanced data: Accuracy, F1

Imbalanced data: Precision-Recall AUC, F1, MCC

Probability estimates: Log Loss, Brier Score

Medical/security: High recall (minimize false negatives)

Spam detection: High precision (minimize false positives)

Multi-label: Hamming Loss, Jaccard Score

