# Comparison between models

## 1. High-level observations

All models perform nearly perfectly on the training data (especially the tree-based ones → a sign of overfitting wherever test scores fall short of that).

On test data, performance differs mainly in how well they handle:

- Minority class (0.0, only 3 samples)

- Mid-size class (1.0, 86 samples)

- Majority class (2.0, 511 samples)

Balanced Accuracy is crucial here because raw accuracy is misleading (predicting class 2.0 alone gives ~85% already).
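To make the gap between the two metrics concrete, here is a minimal sketch using only the class counts listed above. The "always predict 2.0" baseline is hypothetical, and the helper names are illustrative rather than taken from any library.

```python
# Class counts on the test set, taken from the list above.
counts = {0.0: 3, 1.0: 86, 2.0: 511}
total = sum(counts.values())


def accuracy(recalls):
    """Overall accuracy: per-class recall weighted by class size."""
    return sum(recalls[c] * counts[c] for c in counts) / total


def balanced_accuracy(recalls):
    """Balanced accuracy: unweighted mean of per-class recall."""
    return sum(recalls.values()) / len(recalls)


# Hypothetical baseline that always predicts the majority class 2.0.
majority_only = {0.0: 0.0, 1.0: 0.0, 2.0: 1.0}

print(f"accuracy:          {accuracy(majority_only):.3f}")
print(f"balanced accuracy: {balanced_accuracy(majority_only):.3f}")

# With only 3 samples in class 0.0, one extra hit or miss moves that
# class's recall by 1/3 and balanced accuracy by 1/9.
swing_per_class0_sample = (1 / counts[0.0]) / len(counts)
print(f"swing per class-0 sample: {swing_per_class0_sample:.3f}")
```

The baseline reaches about 85% accuracy but only one third balanced accuracy. Each class 0.0 sample moves balanced accuracy by roughly 0.11, so differences of that size between models rest on a single prediction.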

## 2. Model-by-model analysis
### 🔹 Random Forest

- Test Accuracy: 92.5%

- Balanced Accuracy: 0.55 (very poor)

Confusion Matrix: misclassifies class 0 heavily (only 1/3 correct) and also struggles on class 1.

#### Why:

- Random Forests are ensembles of deep trees → they can overfit the training set (shown by the perfect training scores).

- With imbalanced data, they bias toward the majority class: each tree's splitting criterion (Gini impurity or entropy) counts every sample equally, so the impurity reduction is dominated by the majority class rather than by minority-class recall.

- Thus, while overall accuracy looks okay, the model fails on balanced accuracy.
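The majority-class bias described above can be illustrated with Gini impurity, a common splitting criterion. This is a sketch under assumptions: the source does not say which criterion was used, the test-set class sizes stand in for the training class ratio, and both splits are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its per-class sample counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def impurity_decrease(parent, left, right):
    """Weighted impurity decrease when parent is split into left and right."""
    n = sum(parent)
    children = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - children


# Counts are ordered as (class 0.0, class 1.0, class 2.0).
parent = (3, 86, 511)

# Hypothetical split A: perfectly isolates the 3 samples of class 0.0.
gain_class0 = impurity_decrease(parent, (3, 0, 0), (0, 86, 511))
# Hypothetical split B: perfectly isolates the 86 samples of class 1.0.
gain_class1 = impurity_decrease(parent, (0, 86, 0), (3, 0, 511))

print(f"gain from isolating class 0.0: {gain_class0:.4f}")
print(f"gain from isolating class 1.0: {gain_class1:.4f}")
```

Even a perfect split on class 0.0 barely lowers the impurity, so a tree has little incentive to find it unless the classes are reweighted or resampled.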

### 🔹 XGBoost

- Test Accuracy: 95.2%

- Balanced Accuracy: 0.83

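Recall on class 1 is 86% (the figure quoted for RF is 87%, so the two are close on this class) and class 2 stays stable.

The clearer gain is on the tiny class 0: 2/3 correct versus RF's 1/3, although one miss still costs a third of that class's recall.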

#### Why:

- XGBoost uses boosting → it iteratively focuses on mistakes, so minority and medium-sized classes get more attention than in RF.

- Regularization (shrinkage, tree depth limits, penalties on leaf weights) limits overfitting compared to RF's deep trees.

- Overall, this makes it more robust to imbalance.
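The "focuses on mistakes" idea can be sketched with the derivatives of the log loss, which gradient-boosting libraries use to fit each new tree. This is a simplified binary illustration of the three-class problem with hypothetical scores. It is not XGBoost's actual implementation.

```python
import math


def sigmoid(score):
    """Map a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))


def logloss_grad_hess(label, score):
    """First and second derivative of binary log loss w.r.t. the raw score."""
    p = sigmoid(score)
    return p - label, p * (1.0 - p)


# Two hypothetical positive samples (label = 1) after some boosting rounds.
grad_easy, hess_easy = logloss_grad_hess(1, 3.0)   # already confidently right
grad_hard, hess_hard = logloss_grad_hess(1, -2.0)  # currently misclassified

print(f"easy sample gradient: {grad_easy:+.3f}")
print(f"hard sample gradient: {grad_hard:+.3f}")
```

The misclassified sample carries a much larger gradient, so the next tree is pulled toward correcting it. That is how rare but badly predicted samples gain influence without explicit reweighting.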

### 🔹 Gradient Boosting

- Test Accuracy: 98%

- Balanced Accuracy: 0.87 (joint best, tied with Logistic Regression).

Predicts class 0 at 2/3 correct and does very well on class 1 (95% recall).

#### Why:

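- Similar to XGBoost, but often run with more conservative settings (for example shallow trees) → a smoother fit with somewhat higher bias and lower variance.

- This limits overfitting on the few minority samples while still handling imbalance effectively.

- The very low log-loss (0.046) shows that the model assigns high probability to the correct class on almost every test sample. Log loss rewards both accuracy and calibration, so this is consistent with well-calibrated probabilities, though a reliability curve would be needed to confirm calibration on its own.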
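To give the reported log loss a more tangible meaning, the sketch below converts it into the geometric-mean probability assigned to the true class. The `mostly_right` predictions are hypothetical and only show how strongly one confident miss is penalized.

```python
import math


def log_loss(true_class_probs):
    """Mean negative log of the probability assigned to the true class."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)


reported = 0.046  # log loss quoted for Gradient Boosting

# Geometric-mean probability on the true class implied by that log loss.
implied_prob = math.exp(-reported)
print(f"implied true-class probability: {implied_prob:.3f}")

# Hypothetical predictions: 99 confident hits and one confident miss.
mostly_right = [0.99] * 99 + [0.01]
print(f"log loss with one confident miss: {log_loss(mostly_right):.3f}")
```

A log loss of 0.046 corresponds to roughly 0.955 probability on the true class in the geometric mean. A single confident miss per hundred samples would already push the score above that, so the model is rarely both confident and wrong.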

### 🔹 Logistic Regression

- Test Accuracy: 98% (same as GBM)

- Balanced Accuracy: 0.87 (also excellent).

Confusion Matrix: also 2/3 correct on class 0, great recall on 1 and 2.

#### Why:

- Logistic regression is a linear model. If your features are already linearly separable (or nearly so), LR can do very well.

- Unlike trees, it doesn’t overfit easily → you see stable performance across train/test.

- Its performance here suggests your dataset has features that linearly separate classes quite well, especially between 1 and 2.
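As a rough sketch of what "linear model" means here, multinomial logistic regression computes one linear score per class and passes the scores through a softmax. The weights, biases, and the two features below are made up for illustration and are not fitted to the dataset.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x, weights, biases):
    """One linear score per class, followed by softmax."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical parameters for classes 0.0, 1.0, 2.0 over two features.
weights = [[-1.0, -1.0], [2.0, -0.5], [-0.5, 2.0]]
biases = [-2.0, 0.0, 1.0]

probs = predict_proba([0.2, 1.5], weights, biases)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

Because each score is linear in the features, the boundary between any two classes is a straight line (a hyperplane in higher dimensions). Test performance this strong therefore suggests the classes are close to linearly separable.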

## 3. Why one performs better than another

- Random Forest < XGBoost < Gradient Boosting ≈ Logistic Regression

- Random Forest overfits and struggles with imbalance → weak generalization.

- XGBoost corrects some of this by focusing on errors, but its calibration is still slightly worse.

- Gradient Boosting and Logistic Regression both achieve strong generalization and balanced accuracy, suggesting the data is not too complex: linear models or ensembles of weak learners suffice.

- Logistic Regression’s competitiveness shows non-linear power isn’t always necessary if feature engineering is strong.

- Gradient Boosting edges out XGBoost here, plausibly because its fit is more conservative, leading to smoother probability estimates.

## 4. Beyond metrics

- Interpretability: Logistic Regression is simplest to explain (feature weights → direct interpretation).

- Scalability: Random Forest and Gradient Boosting are heavier; Logistic Regression is very lightweight.

- Calibration: Gradient Boosting had the lowest log loss → better probability estimates.

- Imbalanced Data: Boosting methods handled imbalance better than bagging (RF) here. Logistic Regression has no built-in correction for imbalance, but copes well when the features separate the classes cleanly, as they appear to in this dataset.
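The interpretability point can be made concrete: in a binary or one-vs-rest logistic regression, a coefficient translates into an odds ratio. The feature names and coefficient values below are hypothetical, since the fitted weights are not given.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds when a feature rises by delta."""
    return math.exp(coefficient * delta)


# Hypothetical coefficients, for illustration only.
example_coefs = {"feature_a": 0.7, "feature_b": -1.2, "feature_c": 0.0}

for name, coef in example_coefs.items():
    print(f"{name}: coef={coef:+.2f}, odds multiplied by {odds_ratio(coef):.2f}")
```

A positive coefficient raises the odds of the class, a negative one lowers them, and zero leaves them unchanged. The comparison across features is only fair when the features are on comparable scales.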

## ✅ Conclusion:

- Random Forest → not suitable for your imbalanced setup (biased toward majority).

- XGBoost → strong, but slightly less calibrated.

- Gradient Boosting → best overall balance of recall, calibration, and accuracy.

- Logistic Regression → surprisingly effective; if interpretability matters, this is the best choice.

👉 If your goal is maximum balanced accuracy: Gradient Boosting (tied with Logistic Regression at 0.87, with the lowest log loss as tie-breaker).

👉 If your goal is simplicity + interpretability: Logistic Regression.

👉 If your goal is a scalable ensemble: XGBoost.