# Bank Customer Churn - Hyperparameter Tuning

Now that we have everything needed to train the model, the last remaining training step is hyperparameter tuning, the process of finding the best possible values for a model's hyperparameters.

This can be done manually, by defining a hyperparameter space, training the model, recording its metric, and then adjusting the values to see how the metric changes, or automatically, with methods and libraries such as Grid Search or Hyperopt.

**Grid Search**, as the name suggests, is a grid-based method that tries every possible combination of hyperparameter values within a predefined grid. It's simple to implement, but can be very slow for large search spaces.
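As a minimal sketch of the idea, using only the standard library, the snippet below enumerates every combination of a small, made-up grid. The hyperparameter names and values are illustrative assumptions, not the ones tuned in this notebook. The number of candidates is the product of the number of values per hyperparameter, which is why the search slows down quickly as the grid grows.

```python
from itertools import product

# Hypothetical grid, for illustration only.
grid = {
    "max_depth": [5, 10, 15],
    "learning_rate": [0.01, 0.03, 0.05],
    "subsample": [0.6, 0.8],
}

# Cartesian product of all value lists: one dict per candidate configuration.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(combinations))  # 3 * 3 * 2 = 18 candidates to train and score
print(combinations[0])    # first candidate in the grid
```

Adding a fourth hyperparameter with five values would already raise the count to 90 model fits.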

**Hyperopt**, on the other hand, uses Bayesian optimization (the Tree-structured Parzen Estimator, exposed as `tpe.suggest`) to sample hyperparameter combinations based on previous results, prioritizing regions with higher potential. It is more complex to set up, but works well for more complex models.

Here, Hyperopt will be used along with **MLflow** to store the final model's scores and hyperparameters. If a new model must later be trained on new data, its performance and hyperparameters can be compared with those of the older model, and in the long run this may show whether a fixed hyperparameter space can be reused for every model.


In [36]:
import datetime
import pickle

import pandas as pd
import numpy as np
import mlflow

from hyperopt import STATUS_OK, Trials, fmin, hp, tpe, space_eval
from sklearn.metrics import f1_score, accuracy_score
from xgboost import XGBClassifier

In [4]:
with open('../data/processed/training_data.pkl','rb') as f:
    training_data = pickle.load(f)

In [32]:
space = {
            'objective': 'binary:logistic',
            'n_estimators': 1500,
            'colsample_bytree': hp.uniform('colsample_bytree', 0.2, 0.7),
            'gamma': hp.uniform('gamma', 0, 0.5),
            'learning_rate': hp.quniform('learning_rate', 0.01, 0.05, 0.01),
            'max_depth': hp.choice('max_depth', range(5, 15, 1)),
            'min_child_weight': hp.quniform('min_child_weight', 1, 5, 1),
            'reg_alpha': hp.uniform('reg_alpha', 0, 10),
            'reg_lambda': hp.uniform('reg_lambda', 0.1, 5),
            'eval_metric': 'error',
            'tree_method': 'hist',
            'subsample': 0.6,
            'seed': 42
        }

model = XGBClassifier()
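
A note on `hp.quniform`: the Hyperopt documentation describes its value as `round(uniform(low, high) / q) * q`, so the draw is snapped to a multiple of `q`. The sketch below re-implements that formula with the standard library (the helper `quniform_like` is hypothetical, not a Hyperopt function) to show that a lower bound smaller than `q / 2`, for example `low=0.001` with `q=0.01`, can round down to 0, which would be a useless learning rate.

```python
import random


def quniform_like(rng, low, high, q):
    """Hypothetical stand-in for hp.quniform: round(uniform(low, high) / q) * q."""
    return round(rng.uniform(low, high) / q) * q


rng = random.Random(0)

# Lower bound below q / 2: draws in [0.001, 0.005) are rounded down to 0.0.
with_zero = {round(quniform_like(rng, 0.001, 0.05, 0.01), 2) for _ in range(10_000)}

# Lower bound equal to q: the smallest possible value is q itself.
without_zero = {round(quniform_like(rng, 0.01, 0.05, 0.01), 2) for _ in range(10_000)}

print(sorted(with_zero))
print(sorted(without_zero))
```

Choosing `low` as a multiple of `q` keeps the sampled values on the intended grid.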


In [39]:
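def train_model(
    training_data: dict[str, pd.DataFrame | pd.Series],
    space: dict
) -> XGBClassifier:
    """Tune hyperparameters with Hyperopt on a 50% sample, then refit the best configuration on the full training set."""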
    x_train = training_data.get('x_train')
    y_train = training_data.get('y_train')
    x_valid = training_data.get('x_valid')
    y_valid = training_data.get('y_valid')
    
    x_train_sample = x_train.sample(frac=0.5, random_state=0)
    y_train_sample = y_train.loc[x_train_sample.index]
    x_valid_sample = x_valid.sample(frac=0.5, random_state=0)
    y_valid_sample = y_valid.loc[x_valid_sample.index]
    
    def objective(space):
        clf = XGBClassifier(
            objective=space['objective'],
            n_estimators=int(space['n_estimators']),
            max_depth=int(space['max_depth']),
            tree_method=space['tree_method'],
            gamma=space['gamma'],
            reg_alpha=space['reg_alpha'],
            min_child_weight=int(space['min_child_weight']),
            colsample_bytree=space['colsample_bytree'],
            learning_rate=space['learning_rate'],
            reg_lambda=space['reg_lambda'],
            eval_metric=space['eval_metric'],
            # Match the subsample ratio and seed defined in the search space,
            # so each trial reflects the configuration of the final model.
            subsample=space['subsample'],
            random_state=space['seed'],
            early_stopping_rounds=150)

        evaluation = [(x_valid_sample, y_valid_sample)]
        clf.fit(x_train_sample, y_train_sample,
                eval_set=evaluation,  verbose=50)

        pred = clf.predict(x_valid)

        score = f1_score(y_valid, pred)
        #print("F1 Score:", score)
        return {'loss': 1 - score, 'status': STATUS_OK}
    trials = Trials()

    hyperparams = fmin(fn=objective,
                       space=space,
                       algo=tpe.suggest,
                       max_evals=25,
                       trials=trials)

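    # fmin returns the index of each hp.choice option; space_eval maps the
    # result back to the actual hyperparameter values.
    best_hyperparams = space_eval(space, hyperparams)
    
    mlflow.log_params(best_hyperparams)
    # Refit on the full training set with the best configuration found.
    # No early stopping is used here, so all n_estimators trees are built.
    model = XGBClassifier(**best_hyperparams)
    model.fit(x_train, y_train)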
    pred = model.predict(x_valid)

    score = f1_score(y_valid, pred)
    print("F1 Score:", score)

    return model
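
Hyperopt's `fmin` minimizes whatever the objective returns, which is why the objective above reports `1 - score` rather than the F1 score itself: a higher F1 becomes a lower loss. The sketch below computes F1 by hand on a tiny made-up label set (the helper `f1_from_labels` and the labels are illustrative assumptions, not part of the notebook) to show the conversion.

```python
def f1_from_labels(y_true, y_pred):
    """Binary F1 score for 0/1 labels, computed from precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = f1_from_labels(y_true, y_pred)
loss = 1 - score  # the value a minimizer such as fmin would receive

print(score, loss)  # 0.75 0.25
```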
    

In [40]:
def evaluate_model(model: XGBClassifier, training_data: dict[str, pd.DataFrame | pd.Series]) -> float:
    """Log validation and test F1 score and accuracy to MLflow and return the test F1 score."""
    x_valid = training_data.get('x_valid')
    y_valid = training_data.get('y_valid')
    x_test = training_data.get('x_test')
    y_test = training_data.get('y_test')
    
    validation_preds = model.predict(x_valid)
    test_preds = model.predict(x_test)
    
    val_f1_score = f1_score(y_valid, validation_preds)
    test_f1_score = f1_score(y_test, test_preds)
   
    val_accuracy = accuracy_score(y_valid, validation_preds)
    test_accuracy = accuracy_score(y_test, test_preds)
   
    mlflow.log_metrics(
        metrics={
            'Validation F1 Score': val_f1_score,
            'Test F1 Score': test_f1_score,
            'Validation Accuracy': val_accuracy,
            'Test Accuracy': test_accuracy,
        }
    )
    
    return test_f1_score
    

In [42]:
# Uncomment to log to an MLflow tracking server running locally:
# mlflow.set_tracking_uri('http://localhost:5000')

mlflow.set_experiment('BANK CHURN PROJECT')
with mlflow.start_run(run_name='XGBoost Model Training'):
    tuned_model = train_model(training_data, space)
    model_f1_score = evaluate_model(tuned_model, training_data)
    
    with open(f"../models/xgb_{datetime.datetime.today().strftime(format='%Y-%m-%d')}_score_{model_f1_score:.2f}.pkl", 'wb') as f:
        pickle.dump(tuned_model, f)
    

2025/01/20 20:58:54 INFO mlflow.tracking.fluent: Experiment with name 'BANK CHURN PROJECT' does not exist. Creating a new experiment.


[0]	validation_0-error:0.21067                        
[50]	validation_0-error:0.21067                       
[100]	validation_0-error:0.20533                      
[150]	validation_0-error:0.19733                      
[200]	validation_0-error:0.18267                      
[250]	validation_0-error:0.17333                      
[300]	validation_0-error:0.17067                      
[350]	validation_0-error:0.16533                      
[400]	validation_0-error:0.15867                      
[450]	validation_0-error:0.15600                      
[500]	validation_0-error:0.15200                      
[550]	validation_0-error:0.15067                      
[600]	validation_0-error:0.15333                      
[615]	validation_0-error:0.15333                      
[0]	validation_0-error:0.21067                                                 
[50]	validation_0-error:0.19200                                                
[100]	validation_0-error:0.16667                                      

With the model finished, let's move towards the final step: **inference**, where the model is used to generate predictions from previously unseen data.