<h2 align="center" style="color:blue">Model Training</h2>

In [9]:
# Import necessary libraries
from imports import *

In [10]:
# Load features and target training variables for model training
X_train = pd.read_parquet("../data/processed/final_X_train.parquet")

y_train = pd.read_parquet("../data/processed/final_y_train.parquet").squeeze()  # converts back to Series

In [11]:
X_train.shape

(37488, 13)

In [12]:
print(type(y_train))
print(y_train.shape)

<class 'pandas.core.series.Series'>
(37488,)


In [13]:
# Load features and target testing variables for model evaluation
X_test = pd.read_parquet("../data/processed/final_X_test.parquet")

y_test = pd.read_parquet("../data/processed/final_y_test.parquet").squeeze()  # converts back to Series

In [14]:
X_test.shape

(12497, 13)

In [15]:
print(type(y_test))
print(y_test.shape)

<class 'pandas.core.series.Series'>
(12497,)


In [16]:
y_train.value_counts()

0    34265
1     3223
Name: default, dtype: int64

### **Business Success Criteria**

The business team defined three key model expectations:

1. **Recall (on default class = 1) > 90%**  
Missing a defaulter is extremely costly, so capturing as many default cases as possible is critical.

2. **Precision > 50% for default class**  
False positives are acceptable because flagged customers undergo human review.

3. **High explainability**  
The model's decisions must be interpretable and convertible into business rules for the internal BRE (Business Rule Engine).
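
The first two criteria can be checked mechanically from test-set predictions. The sketch below is a minimal, dependency-free illustration; the function names and the toy labels are hypothetical, and the thresholds are the ones stated above.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def meets_business_criteria(precision, recall, min_precision=0.50, min_recall=0.90):
    """True when both thresholds are strictly exceeded."""
    return precision > min_precision and recall > min_recall


# Toy labels (hypothetical): 10 defaulters, 10 non-defaulters.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 10 + [1] * 4 + [0] * 6   # all defaulters caught, 4 false alarms
p, r = precision_recall(y_true, y_pred)
print(round(p, 2), round(r, 2), meets_business_criteria(p, r))
```

The third criterion (explainability) is a property of the model family and cannot be tested this way.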

## Attempt 1 - No Handling of Class Imbalance

### Logistic Regression Model Training

In [17]:
# Initialize and train the Logistic Regression model
model_lr = LogisticRegression()
model_lr.fit(X_train, y_train)

y_pred_lr = model_lr.predict(X_test)
report_lr = classification_report(y_test, y_pred_lr)
print(report_lr)

              precision    recall  f1-score   support

           0       0.97      0.99      0.98     11423
           1       0.84      0.72      0.78      1074

    accuracy                           0.96     12497
   macro avg       0.91      0.85      0.88     12497
weighted avg       0.96      0.96      0.96     12497



### Random Forest Classifier Model Training

In [18]:
# Initialize and train the Random Forest Classifier model
model_rf = RandomForestClassifier()
model_rf.fit(X_train, y_train)

y_pred_rf = model_rf.predict(X_test)
report_rf = classification_report(y_test, y_pred_rf)
print(report_rf)

              precision    recall  f1-score   support

           0       0.97      0.99      0.98     11423
           1       0.85      0.71      0.78      1074

    accuracy                           0.96     12497
   macro avg       0.91      0.85      0.88     12497
weighted avg       0.96      0.96      0.96     12497



### XGBoost Classifier Model Training

In [19]:
# Initialize and train the XGBoost Classifier model
model_xgb = XGBClassifier()
model_xgb.fit(X_train, y_train)

y_pred_xgb = model_xgb.predict(X_test)
report_xgb = classification_report(y_test, y_pred_xgb)
print(report_xgb)

              precision    recall  f1-score   support

           0       0.98      0.98      0.98     11423
           1       0.82      0.76      0.79      1074

    accuracy                           0.97     12497
   macro avg       0.90      0.87      0.89     12497
weighted avg       0.96      0.97      0.96     12497



### **Baseline Model Performance**

We evaluated three models without any class-imbalance handling or hyperparameter tuning:

* **Logistic Regression**
* **Random Forest**
* **XGBoost**

All three models produced similar performance:

| Model               | Precision (Class 1) | Recall (Class 1) | F1 Score (Class 1) |
| ------------------- | ------------------- | ---------------- | ------------------ |
| Logistic Regression | 0.84                | 0.72             | 0.78               |
| Random Forest       | 0.85                | 0.71             | 0.78               |
| XGBoost             | 0.82                | 0.76             | 0.79               |

**Key Finding:**  
None of the baseline models achieved the required **recall > 90%** on the default class.

Given the importance of explainability, **Logistic Regression** was selected as the base model for hyperparameter tuning.

### RandomizedSearchCV for Attempt 1: Logistic Regression

In [20]:
# --- RandomizedSearchCV Attempt 1: Logistic Regression ---
param_dist = {
    'C': np.logspace(-4, 4, 20),  # Logarithmically spaced values from 10^-4 to 10^4
    'solver': ['lbfgs', 'saga', 'liblinear', 'newton-cg']   # Algorithm to use in the optimization problem
}

# Create the Logistic Regression model
log_reg = LogisticRegression(max_iter=10000)  # Increased max_iter for convergence

# Set up RandomizedSearchCV
random_search = RandomizedSearchCV(
    estimator=log_reg,
    param_distributions=param_dist,
    n_iter=50,  # Number of parameter settings that are sampled
    scoring='f1',
    cv=3,  # 3-fold cross-validation
    verbose=2,
    random_state=42,  # Set a random state for reproducibility
    n_jobs=-1  # Use all available cores
)

# Fit the RandomizedSearchCV to the training data
random_search.fit(X_train, y_train)

# Print the best parameters and best score
print(f"Best Parameters: {random_search.best_params_}")
print(f"Best Score: {random_search.best_score_}")

best_model = random_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Classification Report:")
print(classification_report(y_test, y_pred))

Fitting 3 folds for each of 50 candidates, totalling 150 fits
Best Parameters: {'solver': 'liblinear', 'C': 1438.44988828766}
Best Score: 0.7578820896729831
Classification Report:
              precision    recall  f1-score   support

           0       0.98      0.99      0.98     11423
           1       0.83      0.74      0.78      1074

    accuracy                           0.96     12497
   macro avg       0.90      0.86      0.88     12497
weighted avg       0.96      0.96      0.96     12497



### **Hyperparameter Tuning: Logistic Regression (RandomizedSearchCV)**

RandomizedSearchCV was applied to optimize:

* Regularization parameter (`C`)
* Solver selection (`lbfgs`, `liblinear`, `saga`, etc.)

**Best Results:**

* Best CV F1 Score: **~0.758**
* Test Performance:

  * Precision (Class 1): **0.83**
  * Recall (Class 1): **0.74**

**Conclusion:**  
Even after tuning, Logistic Regression could not meet the business recall threshold.
However, its explainability makes it a strong candidate for the final stage *if performance improves after class-imbalance handling*.
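
The size of this search is easy to verify by hand. The sketch below rebuilds the `C` grid without NumPy (an illustrative equivalent of `np.logspace(-4, 4, 20)`, not the notebook's code) and counts the candidate combinations and fits.

```python
# Pure-Python equivalent of np.logspace(-4, 4, 20): 20 exponents evenly spaced in [-4, 4].
c_grid = [10 ** (-4 + 8 * i / 19) for i in range(20)]
solvers = ["lbfgs", "saga", "liblinear", "newton-cg"]

n_combinations = len(c_grid) * len(solvers)   # 80 distinct (C, solver) pairs
n_iter, cv = 50, 3
n_fits = n_iter * cv                          # matches "totalling 150 fits" in the log

print(n_combinations, n_fits)
print(c_grid[17])   # the grid point reported as the best C (about 1438.45)
```

Because every entry in `param_dist` is a list, `RandomizedSearchCV` samples without replacement, so 50 of the 80 combinations were evaluated. The best `C` sits near the upper end of the grid, which suggests that regularization contributes little here.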

### RandomizedSearchCV for Attempt 1: XGBoost

In [21]:
# --- RandomizedSearchCV Attempt 1: XGBoost ---
from scipy.stats import uniform, randint

# Define parameter distribution for RandomizedSearchCV
param_dist = {
    'n_estimators': [100, 150, 200, 250, 300],
    'max_depth': [3, 4, 5, 6, 7, 8, 9, 10],
    'learning_rate': [0.01, 0.03, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3],
    'subsample': [0.6, 0.7, 0.8, 0.9, 1.0],
    'colsample_bytree': [0.6, 0.7, 0.8, 0.9, 1.0],
    'scale_pos_weight': [1, 2, 3, 5, 7, 10],
    'reg_alpha': [0.01, 0.1, 0.5, 1.0, 5.0, 10.0],  # L1 regularization term
    'reg_lambda': [0.01, 0.1, 0.5, 1.0, 5.0, 10.0]  # L2 regularization term
}

xgb = XGBClassifier()

random_search = RandomizedSearchCV(estimator=xgb, param_distributions=param_dist, n_iter=100,
                                   scoring='f1', cv=3, verbose=1, n_jobs=-1, random_state=42)

random_search.fit(X_train, y_train)

# Print the best parameters and best score
print(f"Best Parameters: {random_search.best_params_}")
print(f"Best Score: {random_search.best_score_}")

best_model = random_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Classification Report:")
print(classification_report(y_test, y_pred))

Fitting 3 folds for each of 100 candidates, totalling 300 fits
Best Parameters: {'subsample': 0.7, 'scale_pos_weight': 2, 'reg_lambda': 10.0, 'reg_alpha': 0.01, 'n_estimators': 100, 'max_depth': 4, 'learning_rate': 0.2, 'colsample_bytree': 0.8}
Best Score: 0.7885351531443989
Classification Report:
              precision    recall  f1-score   support

           0       0.98      0.98      0.98     11423
           1       0.76      0.84      0.80      1074

    accuracy                           0.96     12497
   macro avg       0.87      0.91      0.89     12497
weighted avg       0.97      0.96      0.96     12497




### **Hyperparameter Tuning: XGBoost (RandomizedSearchCV)**

A broader parameter search was performed, tuning:

* learning rate
* depth
* number of trees
* subsample ratios
* colsample ratios
* regularization terms
* class-imbalance weight (scale_pos_weight)
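
To give a sense of how sparse this search is, the sketch below counts the combinations in the grid from the cell above. The dictionary only records the number of values listed per parameter.

```python
import math

# Number of candidate values per parameter, counted from param_dist above.
param_dist_sizes = {
    "n_estimators": 5, "max_depth": 8, "learning_rate": 8, "subsample": 5,
    "colsample_bytree": 5, "scale_pos_weight": 6, "reg_alpha": 6, "reg_lambda": 6,
}
total = math.prod(param_dist_sizes.values())
sampled = 100   # n_iter in the RandomizedSearchCV call
print(total, f"{sampled / total:.4%}")
```

With 1,728,000 possible combinations, 100 sampled candidates cover well under 0.01% of the grid, so the reported best parameters are a reasonable starting point and not an exhaustive optimum.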

**Best Results:**

* Best CV F1 Score: **~0.789**
* Test Performance:

  * Precision (Class 1): **0.76**
  * Recall (Class 1): **0.84**

**Conclusion:**  
XGBoost achieved a meaningful improvement in **recall**, but still falls short of the 90% threshold required by the business.

All of the training runs above used the original class distribution, with **no resampling**, even though defaults make up only ~9% of the data. The only imbalance-aware setting so far was `scale_pos_weight` in the XGBoost search.

This imbalance is likely capping recall across all models.
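
The degree of imbalance can be quantified directly from the `y_train.value_counts()` output shown earlier. The sketch below is plain arithmetic on those two counts; the variable names are illustrative.

```python
n_negative, n_positive = 34265, 3223   # from y_train.value_counts()

default_rate = n_positive / (n_negative + n_positive)
imbalance_ratio = n_negative / n_positive   # common starting point for scale_pos_weight

print(f"default rate: {default_rate:.1%}")                # about 8.6%
print(f"negatives per positive: {imbalance_ratio:.2f}")   # about 10.63
```

XGBoost's parameter documentation suggests `sum(negative instances) / sum(positive instances)` as a typical value for `scale_pos_weight`. The search grid above stopped at 10 and selected 2, which may partly explain why recall stayed below the target.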

## Attempt 2 - Handling Class Imbalance using Under Sampling

### **Attempt 2: Random UnderSampling**

To boost recall on the minority class (defaults), we first applied **Random UnderSampling (RUS)**, which balances the dataset by reducing the majority class to match the minority.

| Class           | Count Before | Count After RUS |
| --------------- | ------------ | --------------- |
| 0 (Non-default) | 34,265       | 3,223           |
| 1 (Default)     | 3,223        | 3,223           |
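
To make the mechanism concrete, here is a minimal sketch of random undersampling in plain Python. It is an illustration of the idea only, not imbalanced-learn's implementation; the function name and toy data are hypothetical.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=42):
    """Keep every minority row and an equally sized random subset of each other class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))   # sampling without replacement
    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data (hypothetical): 20 non-defaults, 4 defaults.
X = [[i] for i in range(24)]
y = [0] * 20 + [1] * 4
X_rus, y_rus = random_undersample(X, y)
print(Counter(y_rus))
```

Only the training data is resampled; the test set keeps its original class distribution so that the reported metrics reflect real-world prevalence.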

In [22]:
rus = RandomUnderSampler(random_state=42)

X_train_rus, y_train_rus = rus.fit_resample(X_train, y_train)
y_train_rus.value_counts()

0    3223
1    3223
Name: default, dtype: int64

### Logistic Regression Model

In [23]:
# Train Logistic Regression on under-sampled data
model_lr = LogisticRegression()
model_lr.fit(X_train_rus, y_train_rus)

# Evaluate on original test set
y_pred_lr = model_lr.predict(X_test)
report_lr = classification_report(y_test, y_pred_lr)
print(report_lr)

              precision    recall  f1-score   support

           0       1.00      0.91      0.95     11423
           1       0.51      0.96      0.67      1074

    accuracy                           0.92     12497
   macro avg       0.75      0.93      0.81     12497
weighted avg       0.95      0.92      0.93     12497



### XGBoost Classifier Model

In [24]:
# Train XGBoost using best params from previous RandomizedSearchCV
model_xgb = XGBClassifier(**random_search.best_params_)
model_xgb.fit(X_train_rus, y_train_rus)

y_pred_xgb = model_xgb.predict(X_test)
report_xgb = classification_report(y_test, y_pred_xgb)
print(report_xgb)

              precision    recall  f1-score   support

           0       1.00      0.91      0.95     11423
           1       0.52      0.98      0.68      1074

    accuracy                           0.92     12497
   macro avg       0.76      0.95      0.82     12497
weighted avg       0.96      0.92      0.93     12497



#### 🔍 **Model Performance (on original test data)**

Both Logistic Regression and XGBoost achieved:

* **Recall ≈ 96–98%** for the default class
* **Precision ≈ 51–52%**

This technically meets the business requirements:

* Recall > 90% ✔️
* Precision > 50% ✔️
* Explainability possible (LogReg) ✔️

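However, RUS **throws away roughly 90% of the majority class** (31,042 of 34,265 rows). The model is therefore trained on far less real data, which risks losing important patterns from the majority class and makes the result more sensitive to which rows happen to be sampled.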
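
A precision just above 50% also has an operational cost, because every flagged customer goes to human review. The sketch below converts the rounded report metrics into approximate counts. These are back-of-the-envelope estimates derived from two-decimal values, not measured confusion matrices, and the helper name is hypothetical.

```python
def approx_counts(precision, recall, n_positive):
    """Approximate TP, FP, FN for one class from rounded report metrics."""
    tp = round(recall * n_positive)
    fn = n_positive - tp
    flagged = round(tp / precision)   # total predicted positive
    fp = flagged - tp
    return tp, fp, fn


n_defaults, n_test = 1074, 12497   # support values from the reports above
for name, precision, recall in [("LogReg + RUS", 0.51, 0.96), ("XGBoost + RUS", 0.52, 0.98)]:
    tp, fp, fn = approx_counts(precision, recall, n_defaults)
    share = (tp + fp) / n_test
    print(f"{name}: ~{tp} caught, ~{fn} missed, ~{fp} false alarms, ~{share:.0%} of test rows flagged")
```

Under these estimates, roughly one flagged case in two is a false alarm, which is acceptable only because the review step exists.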

## Attempt 3 - Handling Class Imbalance using Over Sampling (SMOTE Tomek)

To preserve majority class information, we next applied **SMOTE-Tomek**, which:

* Oversamples the minority using synthetic examples (SMOTE)
* Cleans ambiguous boundary samples (Tomek Links)

| Class | Count Before | Count After SMOTE-Tomek |
| ----- | ------------ | ----------------------- |
| 0     | 34,265       | 34,195                  |
| 1     | 3,223        | 34,195                  |
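
The two building blocks can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions (Euclidean distance, brute-force neighbor search, hypothetical toy data), not the imbalanced-learn implementation.

```python
import math
import random


def smote_point(x, neighbor, rng):
    """SMOTE-style synthetic sample: a random point on the segment from x to a minority neighbor."""
    lam = rng.random()
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]


def nearest(i, X):
    """Index of the nearest other point to X[i] (Euclidean distance)."""
    return min((j for j in range(len(X)) if j != i), key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j) of opposite-class points that are each other's nearest neighbor."""
    links = []
    for i in range(len(X)):
        j = nearest(i, X)
        if i < j and y[i] != y[j] and nearest(j, X) == i:
            links.append((i, j))
    return links


# Toy 2-D data (hypothetical).
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.05, 1.0], [3.0, 3.0]]
y = [0, 0, 0, 1, 1]
rng = random.Random(42)
print(smote_point(X[3], X[4], rng))   # lies between the two minority points
print(tomek_links(X, y))              # points 2 and 3 form a Tomek link
```

SMOTE adds interpolated minority points until the classes are balanced, and the Tomek step then removes pairs that sit on the class boundary, which is why both classes end slightly below the original majority count.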

In [25]:
smt = SMOTETomek(random_state=42)

X_train_smt, y_train_smt = smt.fit_resample(X_train, y_train)
y_train_smt.value_counts()

0    34195
1    34195
Name: default, dtype: int64

### Logistic Regression Model

In [26]:
# Train Logistic Regression on SMOTE-Tomek data
model_lr = LogisticRegression()
model_lr.fit(X_train_smt, y_train_smt)

# Evaluate on original test set
y_pred_lr = model_lr.predict(X_test)
report_lr = classification_report(y_test, y_pred_lr)
print(report_lr)

              precision    recall  f1-score   support

           0       0.99      0.93      0.96     11423
           1       0.55      0.94      0.70      1074

    accuracy                           0.93     12497
   macro avg       0.77      0.94      0.83     12497
weighted avg       0.96      0.93      0.94     12497



### Hyperparameter Optimization Using Optuna

Given the strong recall achieved after handling class imbalance, we will now use **Optuna**, a state-of-the-art automatic hyperparameter optimization framework.

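The business goal is **high recall on the default class**, while ensuring:

* Precision ≥ 50%
* Coefficients remain interpretable for the BRE (Logistic Regression)

The objective below uses cross-validated macro F1 on the SMOTE-Tomek data as the optimization target; recall and precision on the default class are then checked on the original test set.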
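
The stated goal (maximize recall subject to a precision floor) could also be encoded directly as a score. The sketch below shows one hypothetical way to do that; it is not what the notebook's objective returns, and candidate "B" is an invented example, while "A" and "C" reuse class-1 metrics reported in this notebook.

```python
def recall_with_precision_floor(precision, recall, min_precision=0.50):
    """Score to maximize: recall if the precision floor holds, otherwise a penalized value.

    The penalty keeps infeasible candidates ordered by how close they are to the floor.
    """
    if precision >= min_precision:
        return recall
    return precision - min_precision   # always negative, so any feasible candidate wins


candidates = {
    "A": (0.56, 0.94),   # (precision, recall)
    "B": (0.45, 0.99),   # hypothetical: higher recall but violates the precision floor
    "C": (0.68, 0.87),
}
best = max(candidates, key=lambda k: recall_with_precision_floor(*candidates[k]))
print(best)
```

A score of this shape makes the search optimize what the business actually asked for, at the cost of a less smooth objective than F1.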

In [27]:
# --- Attempt 3: Optuna Hyperparameter Tuning on Logistic Regression (with SMOTE-Tomek data) ---

# Define Optuna objective function for Logistic Regression
def objective(trial):
    param = {
        'C': trial.suggest_float('C', 1e-4, 1e4, log=True),  # Logarithmically spaced values
        'solver': trial.suggest_categorical('solver', ['lbfgs', 'liblinear', 'saga', 'newton-cg']),  # Solvers
        'tol': trial.suggest_float('tol', 1e-6, 1e-1, log=True),  # Logarithmically spaced values for tolerance
        'class_weight': trial.suggest_categorical('class_weight', [None, 'balanced'])  # Class weights
    }

    model = LogisticRegression(**param, max_iter=10000)
    
    # Calculate the cross-validated f1_score
    f1_scorer = make_scorer(f1_score, average='macro')
    scores = cross_val_score(model, X_train_smt, y_train_smt, cv=3, scoring=f1_scorer, n_jobs=-1)
    
    return np.mean(scores)

# Run Optuna study
study_logistic = optuna.create_study(direction="maximize")
study_logistic.optimize(objective, n_trials=50)

[I 2025-11-26 16:06:36,431] A new study created in memory with name: no-name-6acbb002-9eed-45bc-9ecd-b0f5b19ad188
[I 2025-11-26 16:06:36,713] Trial 0 finished with value: 0.9458100405749379 and parameters: {'C': 4.601624393735309, 'solver': 'lbfgs', 'tol': 0.05698211950822554, 'class_weight': 'balanced'}. Best is trial 0 with value: 0.9458100405749379.
[I 2025-11-26 16:06:36,956] Trial 1 finished with value: 0.9457081210202624 and parameters: {'C': 8.891498340983787, 'solver': 'lbfgs', 'tol': 0.00012263839814062895, 'class_weight': 'balanced'}. Best is trial 0 with value: 0.9458100405749379.
[I 2025-11-26 16:06:37,190] Trial 2 finished with value: 0.9456794085990863 and parameters: {'C': 2189.2283942354647, 'solver': 'lbfgs', 'tol': 0.0048309933188447505, 'class_weight': None}. Best is trial 0 with value: 0.9458100405749379.
[I 2025-11-26 16:06:37,379] Trial 3 finished with value: 0.9451843781952952 and parameters: {'C': 130.31041418317076, 'solver': 'saga', 'tol': 0.026230364883651068

[I 2025-11-26 16:06:45,134] Trial 31 finished with value: 0.9457792542852985 and parameters: {'C': 2.077673214531126, 'solver': 'lbfgs', 'tol': 2.3951323326842274e-06, 'class_weight': 'balanced'}. Best is trial 18 with value: 0.9458245878376202.
[I 2025-11-26 16:06:45,418] Trial 32 finished with value: 0.9457665622389632 and parameters: {'C': 7.339868124945498, 'solver': 'lbfgs', 'tol': 2.593729806413816e-06, 'class_weight': 'balanced'}. Best is trial 18 with value: 0.9458245878376202.
[I 2025-11-26 16:06:45,654] Trial 33 finished with value: 0.9456208359010468 and parameters: {'C': 39.20730454009977, 'solver': 'lbfgs', 'tol': 1.0014140963873619e-06, 'class_weight': 'balanced'}. Best is trial 18 with value: 0.9458245878376202.
[I 2025-11-26 16:06:45,885] Trial 34 finished with value: 0.9456614654546674 and parameters: {'C': 1.395847261497748, 'solver': 'lbfgs', 'tol': 5.617657419358513e-06, 'class_weight': 'balanced'}. Best is trial 18 with value: 0.9458245878376202.
[I 2025-11-26 16:0

In [28]:
print('Best trial:')
trial = study_logistic.best_trial
print('  F1-score: {}'.format(trial.value))
print('  Params: ')
for key, value in trial.params.items():
    print('    {}: {}'.format(key, value))

# Train best model on full SMOTE-Tomek data (same max_iter as used during the search)
best_model_logistic = LogisticRegression(**study_logistic.best_params, max_iter=10000)
best_model_logistic.fit(X_train_smt, y_train_smt)

# Evaluate on the test set
y_pred = best_model_logistic.predict(X_test)

report = classification_report(y_test, y_pred)
print(report)

Best trial:
  F1-score: 0.9458245878376202
  Params: 
    C: 4.083407430266515
    solver: newton-cg
    tol: 1.5701716176479944e-05
    class_weight: None
              precision    recall  f1-score   support

           0       0.99      0.93      0.96     11423
           1       0.56      0.94      0.70      1074

    accuracy                           0.93     12497
   macro avg       0.78      0.94      0.83     12497
weighted avg       0.96      0.93      0.94     12497



To improve performance, we tuned **Logistic Regression** using Optuna with the SMOTE-Tomek balanced dataset.

Optuna identified the best hyperparameters:

* **C ≈ 4.08**
* **solver = newton-cg**
* **tol ≈ 1.6e-05**
* **class_weight = None**

### **Performance (on original test set)**

* **Recall (Default = 1): 94%**
* **Precision (Default = 1): 56%**
* **F1 Score (Default = 1): 70%**

This satisfies **all business success criteria**:

* Recall > 90% 
* Precision > 50% 
* Logistic Regression is fully explainable 

## Attempt 4 - Handling Class Imbalance using Over Sampling (SMOTE Tomek) and Hyperparameter Tuning using Optuna on XGBoost

In [29]:
# --- Attempt 4: Optuna Hyperparameter Tuning on XGBoost (with SMOTE-Tomek data) ---

# Define Optuna objective for XGBoost
def objective(trial):
    param = {
        'objective': 'binary:logistic',
        'eval_metric': 'logloss',
        'verbosity': 0,
        'booster': 'gbtree',
        'lambda': trial.suggest_float('lambda', 1e-3, 10.0, log=True),
        'alpha': trial.suggest_float('alpha', 1e-3, 10.0, log=True),
        'subsample': trial.suggest_float('subsample', 0.4, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.4, 1.0),
        'max_depth': trial.suggest_int('max_depth', 3, 10),
        'eta': trial.suggest_float('eta', 0.01, 0.3),
        'gamma': trial.suggest_float('gamma', 0, 10),
        'scale_pos_weight': trial.suggest_float('scale_pos_weight', 1, 10),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'max_delta_step': trial.suggest_int('max_delta_step', 0, 10)
    }

    model = XGBClassifier(**param)
    
    # Calculate the cross-validated f1_score
    f1_scorer = make_scorer(f1_score, average='macro')
    scores = cross_val_score(model, X_train_smt, y_train_smt, cv=3, scoring=f1_scorer, n_jobs=-1)
    
    return np.mean(scores)

# Run Optuna study
study_xgb = optuna.create_study(direction='maximize')
study_xgb.optimize(objective, n_trials=50)

[I 2025-11-26 16:06:50,638] A new study created in memory with name: no-name-61fbf846-e2e9-4586-b2f9-39701eff046c
[I 2025-11-26 16:06:51,179] Trial 0 finished with value: 0.9572071510654218 and parameters: {'lambda': 4.155069497762944, 'alpha': 0.11839334321891563, 'subsample': 0.8908432387911362, 'colsample_bytree': 0.6616987444634201, 'max_depth': 3, 'eta': 0.2603914084680019, 'gamma': 5.767812566022825, 'scale_pos_weight': 3.598462405007485, 'min_child_weight': 9, 'max_delta_step': 5}. Best is trial 0 with value: 0.9572071510654218.
[I 2025-11-26 16:06:51,619] Trial 1 finished with value: 0.8960593965894447 and parameters: {'lambda': 0.001546872758954236, 'alpha': 0.003063418128079918, 'subsample': 0.6137324370214805, 'colsample_bytree': 0.4639167878874908, 'max_depth': 5, 'eta': 0.0294895258190542, 'gamma': 5.095274945927662, 'scale_pos_weight': 4.2597494212776255, 'min_child_weight': 1, 'max_delta_step': 8}. Best is trial 0 with value: 0.9572071510654218.
[I 2025-11-26 16:06:52,24

[I 2025-11-26 16:07:00,885] Trial 19 finished with value: 0.9546145896845294 and parameters: {'lambda': 1.5406907512842916, 'alpha': 0.08853371390228847, 'subsample': 0.8737348252959423, 'colsample_bytree': 0.7439328762724617, 'max_depth': 4, 'eta': 0.19249974450379387, 'gamma': 1.888812373457732, 'scale_pos_weight': 6.7676078051476285, 'min_child_weight': 3, 'max_delta_step': 6}. Best is trial 11 with value: 0.9740956338860646.
[I 2025-11-26 16:07:02,187] Trial 20 finished with value: 0.9730676849400551 and parameters: {'lambda': 0.048104128135911785, 'alpha': 1.0076195084381088, 'subsample': 0.7729262257974018, 'colsample_bytree': 0.6228150934207498, 'max_depth': 10, 'eta': 0.15521312139911886, 'gamma': 0.11650025827771593, 'scale_pos_weight': 4.785053591909616, 'min_child_weight': 5, 'max_delta_step': 3}. Best is trial 11 with value: 0.9740956338860646.
[I 2025-11-26 16:07:03,387] Trial 21 finished with value: 0.9703261384087579 and parameters: {'lambda': 0.049538440562402655, 'alph

[I 2025-11-26 16:07:14,458] Trial 38 finished with value: 0.9710017936145693 and parameters: {'lambda': 0.027201757414995906, 'alpha': 1.8551628514159455, 'subsample': 0.9247312551837084, 'colsample_bytree': 0.6543060127437591, 'max_depth': 8, 'eta': 0.25216765301954475, 'gamma': 0.561492759762825, 'scale_pos_weight': 7.476449240441482, 'min_child_weight': 9, 'max_delta_step': 5}. Best is trial 11 with value: 0.9740956338860646.
[I 2025-11-26 16:07:15,143] Trial 39 finished with value: 0.970765005968051 and parameters: {'lambda': 0.09331776845523787, 'alpha': 0.7445977030801677, 'subsample': 0.9511090318859826, 'colsample_bytree': 0.684568977534018, 'max_depth': 10, 'eta': 0.20027012833691818, 'gamma': 2.156350912664511, 'scale_pos_weight': 8.291519082248723, 'min_child_weight': 7, 'max_delta_step': 1}. Best is trial 11 with value: 0.9740956338860646.
[I 2025-11-26 16:07:15,801] Trial 40 finished with value: 0.9692278963733996 and parameters: {'lambda': 0.012180320306941383, 'alpha': 0

In [30]:
print('Best trial:')
trial = study_xgb.best_trial
print('  F1-score: {}'.format(trial.value))
print('  Params: ')
for key, value in trial.params.items():
    print('    {}: {}'.format(key, value))
    

# Train best model on full SMOTE-Tomek data
best_params = study_xgb.best_params
best_model_xgb = XGBClassifier(**best_params)
best_model_xgb.fit(X_train_smt, y_train_smt)

# Evaluate on the test set
y_pred = best_model_xgb.predict(X_test)

report = classification_report(y_test, y_pred)
print(report)

Best trial:
  F1-score: 0.9740956338860646
  Params: 
    lambda: 0.018193061045569944
    alpha: 0.8425383544459737
    subsample: 0.7999977964101324
    colsample_bytree: 0.5602307205266787
    max_depth: 10
    eta: 0.2938417118985465
    gamma: 0.06126886913643993
    scale_pos_weight: 6.964231282677778
    min_child_weight: 4
    max_delta_step: 7
              precision    recall  f1-score   support

           0       0.99      0.96      0.97     11423
           1       0.68      0.87      0.76      1074

    accuracy                           0.95     12497
   macro avg       0.83      0.92      0.87     12497
weighted avg       0.96      0.95      0.96     12497



We also performed a full Optuna search on XGBoost (50 trials, 10 tuned hyperparameters).

The final model achieved:

* **Recall (Default = 1): 87%**
* **Precision (Default = 1): 68%**
* **F1 Score (Default = 1): 76%**

While XGBoost delivered higher **precision**, it still **did not meet the required > 90% recall**.

Additionally:

* XGBoost is less explainable
* Converting it into BRE-friendly rules is harder
* It increases deployment complexity

###  **Final Model Selection**

Although XGBoost achieved higher precision (0.68 vs 0.56), **Logistic Regression outperformed it on recall** (0.94 vs 0.87), which is the **most critical metric for the business**.

Logistic Regression also:

* Provides direct interpretability
* Allows extracting coefficients for rule-based systems
* Is easy to monitor, debug, and explain
* Fits perfectly with the need for "AI explainability"
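
As an illustration of what "extracting coefficients" can look like, the sketch below turns logistic regression coefficients into odds ratios and a probability. The feature names, coefficient values, and intercept are hypothetical placeholders; real values would come from `best_model_logistic.coef_[0]`, `best_model_logistic.intercept_[0]`, and `X_train.columns`.

```python
import math

# Hypothetical feature names and coefficients, for illustration only.
coefficients = {"feature_a": 1.20, "feature_b": -0.45, "feature_c": 0.05}
intercept = -2.0


def odds_ratios(coefs):
    """exp(coef): multiplicative change in the odds of default per unit increase in the feature."""
    return {name: math.exp(value) for name, value in coefs.items()}


def default_probability(features, coefs, intercept):
    """Logistic regression score: sigmoid of the linear combination."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# List features from strongest to weakest effect on the odds.
for name, ratio in sorted(odds_ratios(coefficients).items(), key=lambda kv: -abs(math.log(kv[1]))):
    print(f"{name}: odds x{ratio:.2f}")
print(default_probability({"feature_a": 1.0, "feature_b": 0.0, "feature_c": 0.0}, coefficients, intercept))
```

An odds ratio above 1 raises the odds of default and one below 1 lowers them, which maps naturally onto signed rules in a rule engine.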

### **Final Selected Model: Logistic Regression (SMOTE-Tomek + Optuna-tuned)**

This model fully satisfies all business KPIs and operational constraints.

In [32]:
# Save the tuned logistic model for use in the next notebook
dump(best_model_logistic, "../outputs/models/best_model_logistic.pkl")
print("Final Logistic Regression model saved.")

Final Logistic Regression model saved.

In [33]:
# Save the train, test split for model evaluation
X_train_smt.to_parquet("../data/processed/final_X_smt_train.parquet", index=False)
y_train_smt.to_frame().to_parquet("../data/processed/final_y_smt_train.parquet", index=False)

X_test.to_parquet("../data/processed/final_model_X_test.parquet", index=False)
y_test.to_frame().to_parquet("../data/processed/final_model_y_test.parquet", index=False)

print("Saved model and test sets for model evaluation.")

Saved model and test sets for model evaluation.


In [34]:
# Save Final Optuna Best Parameters
dump(study_logistic.best_params, "../outputs/models/best_params_logistic.pkl")

['../outputs/models/best_params_logistic.pkl']
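
If the parameters also need to be read by non-Python tooling, a plain JSON file is one possible alternative to the pickle written above. This is a swapped-in technique for illustration, not what the notebook does; the file name is hypothetical and the values are the ones printed for the best Optuna trial.

```python
import json
import os
import tempfile

# Best parameters as printed by the Optuna study above.
best_params = {
    "C": 4.083407430266515,
    "solver": "newton-cg",
    "tol": 1.5701716176479944e-05,
    "class_weight": None,
}

# Hypothetical file name; a temporary directory keeps the sketch side-effect free.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "best_params_logistic.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(best_params, fh, indent=2)
    with open(path, encoding="utf-8") as fh:
        restored = json.load(fh)

print(restored == best_params)
```

JSON maps `None` to `null` and back, so the restored dictionary can be passed to `LogisticRegression(**restored)` unchanged.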

In [None]:
y_test.shape