### Scope of the Final Modeling Notebook

This notebook focuses on **consolidation, validation, and final decision-making**, building on insights from prior exploratory and preprocessing analyses.

---

**Objectives**

The objectives of this notebook are to:

- Summarize key insights from **EDA** and **baseline preprocessing experiments**
- Narrow down the modeling scope to **final candidate models and strategies**
- Perform a **focused comparison of imbalance-handling approaches**
- Select a **final modeling configuration** for production-grade training

> This notebook does **not** introduce new feature engineering or hyperparameter tuning.

---

**Inputs from Previous Analysis**

Based on findings from:
- `eda.ipynb`
- `preprocessing_&_baseline_models.ipynb`

The following decisions are treated as **final**:

- Feature-augmented (`fe_aug`) representation is the preferred input format
- Tree-based ensemble and boosting models are most suitable
- Linear and single-tree models are excluded
- Model evaluation should prioritize **recall** and **PR-AUC**

---

**Modeling Scope**

This notebook evaluates **final candidate models** under a controlled setup:

**Models:**

- LightGBM (LGBM)
- XGBoost (XGB)
- Random Forest (RF)
- Decision Tree (DT)

**Fixed Components:**

- Feature-augmented feature set (`fe_aug`)
- Train–test split from mandatory preprocessing
- Evaluation metrics: precision, recall, F1-score, PR-AUC
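
The fixed metrics can be sketched without any dependencies, which makes explicit what each column in the result tables measures. This is a minimal illustration, not the notebook's actual evaluation code: the function names are hypothetical, and PR-AUC is assumed here to mean average precision (the notebook may compute it differently, for example with scikit-learn's `average_precision_score`). Tied scores are not given special treatment.

```python
def positive_class_metrics(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(y_true, y_score):
    """Average precision: mean of the precision values observed at each
    positive example when samples are ranked by descending score."""
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Toy data (illustrative only, not from the project dataset)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # default 0.5 threshold

print(positive_class_metrics(y_true, y_pred))
print(average_precision(y_true, y_score))
```

Unlike the other three metrics, average precision depends only on the ranking of scores, not on a decision threshold, which is why it is reported alongside the threshold-dependent ones.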

---

**Focused Imbalance-Handling Comparison**

We compare the following configurations using the **feature-augmented dataset**:

1. **Feature-augmented + No Imbalance Handling**
   - Serves as the reference baseline
   - Represents the simplest deployable solution

2. **Feature-augmented + Class Weighting**
   - Applied selectively to boosting models
   - Evaluates recall improvement via algorithm-level weighting

3. **Feature-augmented + SMOTE-Tomek**
   - Applies data-level resampling on the training set only
   - Evaluates recall gains versus false-positive risk

**Purpose**

- Quantify recall–precision trade-offs
- Identify the most balanced and reliable strategy per model
- Avoid unnecessary complexity where baseline performance is sufficient
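
For the class-weighting configuration, the weight is usually derived from the training-label counts. The sketch below shows two common heuristics; which one the earlier notebooks used is not stated here, so treat both as assumptions. The helper names are hypothetical. The first value is the kind of number typically passed to the `scale_pos_weight` parameter that both XGBoost and LightGBM classifiers accept; the second mirrors the "balanced" formula `n_samples / (n_classes * class_count)`.

```python
from collections import Counter


def scale_pos_weight(y_train):
    """Ratio of negative to positive training samples."""
    counts = Counter(y_train)
    if counts[1] == 0:
        raise ValueError("training labels contain no positive samples")
    return counts[0] / counts[1]


def balanced_class_weights(y_train):
    """Per-class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y_train)
    n_samples = len(y_train)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels with a 9:1 imbalance (illustrative only)
y_train = [0] * 90 + [1] * 10
print(scale_pos_weight(y_train))        # 9.0
print(balanced_class_weights(y_train))
```

Both quantities should be computed from the training split only, so that the test distribution does not leak into the model configuration.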

---

**Summary of Observations**

**1. Baseline Features With Class Weighting**

| Model | Weighted | Precision_1 | Recall_1 | F1_1 | PR_AUC |
|-------|----------|-------------|----------|------|--------|
| XGB_weighted | True | 0.270 | 0.611 | 0.374 | 0.375 |
| LGBM_weighted | True | 0.172 | 0.683 | 0.274 | 0.421 |
| XGB_normal | False | 0.589 | 0.279 | 0.378 | 0.358 |
| LGBM_normal | False | 0.597 | 0.305 | 0.404 | 0.374 |

**Key Insights:**

- Class weighting roughly doubles recall for boosting models (XGB: 0.279 → 0.611; LGBM: 0.305 → 0.683).
- Linear models and SVC (not shown in the table above) have a poor precision–recall balance.
- Boosting with class weighting is effective for minority-class capture but increases false positives.
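
To make the false-positive cost concrete, precision can be converted into false alarms per true detection. This is a short worked calculation using only values from the table above, where $P$ denotes `Precision_1` and $TP$, $FP$ are the true-positive and false-positive counts.

$$
P = \frac{TP}{TP + FP} \quad\Longrightarrow\quad \frac{FP}{TP} = \frac{1 - P}{P}
$$

$$
\text{weighted LGBM: } \frac{1 - 0.172}{0.172} \approx 4.8, \qquad \text{unweighted LGBM: } \frac{1 - 0.597}{0.597} \approx 0.68
$$

Weighting therefore lifts LGBM recall from 0.305 to 0.683 at the price of roughly five false alarms per correctly flagged positive, compared with fewer than one without weighting.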



**2. Domain-Driven Feature Engineering**

| Model | Feature Set | Precision_1 | Recall_1 | F1_1 | PR_AUC |
|-------|------------|-------------|----------|------|--------|
| RF | fe_aug | 0.674 | 0.332 | 0.445 | 0.429 |
| LGBM | fe_aug | 0.601 | 0.351 | 0.443 | 0.400 |
| RF | fe_only | 0.670 | 0.294 | 0.408 | 0.400 |
| LGBM | fe_only | 0.633 | 0.309 | 0.415 | 0.383 |

**Key Insights:**

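- `fe_aug` consistently outperforms baseline features for tree-based models; for example, unweighted LGBM rises from F1 0.404 / PR-AUC 0.374 on baseline features (first table) to 0.443 / 0.400.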
- `fe_only` provides competitive performance and reduces feature dimensionality.
- Single trees benefit less from engineered features.

> **Takeaway:** Feature engineering improves predictive performance, especially when retained **with original features** (`fe_aug`).



**3. Data-Level Handling / Resampling**

| Model | Resampling | Precision_1 | Recall_1 | F1_1 | PR_AUC |
|-------|------------|-------------|----------|------|--------|
| LGBM | SMOTE | 0.265 | 0.607 | 0.369 | 0.374 |
| LGBM | SMOTE_Tomek | 0.271 | 0.592 | 0.372 | 0.387 |
| XGB | SMOTE | 0.294 | 0.550 | 0.383 | 0.356 |
| XGB | SMOTE_Tomek | 0.283 | 0.531 | 0.369 | 0.364 |

**Key Insights:**

- Resampling improves recall for the minority class (0.53–0.61 in the table above).
- Precision drops substantially, to below 0.30, leading to lower F1 in some cases.
- Boosting models still outperform linear or single-tree models even with resampling.

> **Takeaway:** Resampling improves recall but introduces false positives; it is **less stable than class weighting or feature augmentation**.
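
Whatever resampler is used, it should touch the training split only, so the test set keeps the original class distribution. The sketch below illustrates that rule with plain random oversampling, which is a deliberately simplified stand-in and not SMOTE or SMOTE-Tomek; in practice that role would presumably be played by a resampler such as imbalanced-learn's `SMOTETomek` and its `fit_resample` method. The function name and toy data are hypothetical.

```python
import random


def oversample_minority(X_train, y_train, seed=0):
    """Duplicate random minority rows (label 1) until classes are balanced.

    Simplified stand-in for SMOTE-style resampling: it copies existing rows
    instead of synthesizing new ones and removes nothing.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(y_train) if y == 1]
    majority = [i for i, y in enumerate(y_train) if y == 0]
    if not minority:
        raise ValueError("training labels contain no minority samples")
    n_extra = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    idx = list(range(len(y_train))) + extra
    return [X_train[i] for i in idx], [y_train[i] for i in idx]


# Toy split (illustrative only): resample the training part, leave test as-is
X_train, y_train = [[i] for i in range(10)], [0] * 8 + [1] * 2
X_test, y_test = [[10], [11], [12]], [0, 0, 1]

X_res, y_res = oversample_minority(X_train, y_train)
print(len(y_res), sum(y_res))  # 16 rows, 8 of them positive
print(len(y_test))             # test set unchanged: 3 rows
```

Evaluating on the untouched test split is what makes the precision drop in the table above visible: the model sees a balanced training set but is scored on the real class ratio.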

---

**Consolidated Insights**

1. **Feature augmentation (`fe_aug`)** is the most consistent improvement across models.
2. **Class weighting** helps **boosting models** achieve higher recall.
3. **Data-level resampling** is optional; it increases false positives, so careful thresholding is required.
4. **Top candidates for production:**
   - `RF_fe_aug`
   - `LGBM_fe_aug` (with/without class weighting)
   - `XGB_fe_aug` (with class weighting)

---

**Threshold Optimization**

For shortlisted model–strategy combinations:
- Precision–Recall curves are analyzed
- Decision thresholds are adjusted to align with business risk tolerance
- Performance impact is evaluated beyond the default 0.5 threshold
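
One way to turn these steps into a rule is to sweep candidate thresholds and keep the most precise one that still meets a recall floor. The sketch below assumes the business risk tolerance can be expressed as a minimum recall; `min_recall`, the function names, and the toy scores are all hypothetical. Thresholds are best chosen on validation data rather than on the test split.

```python
def precision_recall_at(y_true, y_score, threshold):
    """Precision and recall for label 1 when predicting 1 if score >= threshold."""
    tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s >= threshold)
    n_pos = sum(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / n_pos if n_pos else 0.0
    return precision, recall


def pick_threshold(y_true, y_score, min_recall):
    """Return (threshold, precision, recall) with the highest precision among
    thresholds whose recall is at least min_recall, or None if none qualify."""
    best = None
    for threshold in sorted(set(y_score)):  # every distinct score is a candidate
        precision, recall = precision_recall_at(y_true, y_score, threshold)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (threshold, precision, recall)
    return best


# Toy validation scores (illustrative only)
y_val = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.65, 0.80]

print(pick_threshold(y_val, scores, min_recall=1.0))  # (0.35, 0.75, 1.0)
print(pick_threshold(y_val, scores, min_recall=0.6))
```

In this toy case the default 0.5 threshold would miss the positive scored at 0.35, while lowering the threshold to 0.35 recovers it at the cost of one false positive.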

---

**Next Steps**

By the end of this notebook, we aim to:

1. Finalize **1–2 model candidates**.
2. Finalize **preferred imbalance-handling strategy**.
3. Determine **operating threshold**.
4. Handoff to `src/` pipeline:
   - Hyperparameter tuning via Optuna
   - Experiment tracking and reproducibility with MLflow
   - Production-ready implementation

---

**Out of Scope**

- No new feature engineering
- No hyperparameter optimization
- No automated experiment tracking
- No production pipeline implementation (handled in the `src/` directory)