<p align="center">
  <img src="https://www.verifia.ca/assets/logo.png" width="160px" alt="VerifIA Logo"/><br>
  <strong>© 2025 VerifIA. All rights reserved.</strong>
</p>

### VerifIA - Model Verification: Loan Eligibility Prediction with XGBoost

This notebook tackles a loan eligibility prediction problem with an XGBoost classifier. We tune the model with Bayesian hyperparameter optimization (`BayesSearchCV`) and cross-validation, retrain it with the best parameters, and wrap the trained model in VerifIA’s `XGBModel` wrapper. We then obtain a domain configuration, either generated automatically with AI from external domain documents or loaded from a predefined YAML file, and use VerifIA to verify the model’s rule consistency.

In [None]:
!pip install requests
!pip install scikit-optimize
!pip install "../../dist/verifia-0.1.0-py3-none-any.whl[xgboost, genflow]"

#### 0. Download Resources

Before running any other cells, make sure you have all required resources:

In [2]:
!curl -sL https://tinyurl.com/r6m2zk87 -o downloader.py

In [3]:
from downloader import download_resource
url = 'https://www.verifia.ca/assets/use-cases/'
download_resource(url+'data/loan_eligibility.csv', 
                  dest_dir='../data')
download_resource(url+'domains/loan_eligibility.yaml', 
                  dest_dir='../domains')
download_resource(url+'documents/loan_eligibility.zip', 
                  dest_dir='../documents/loan_eligibility')

{'extracted': True,
 'files': ['articles/',
  'articles/Criteria You Need to Know Before You Apply.pdf',
  'articles/Personal loan elegibility.pdf',
  'articles/Personal loan requirements.pdf',
  'articles/What Are Personal Loan Eligibility Requirements.pdf',
  'data_report.pdf',
  'domain_definition_meeting_notes.pdf',
  'domain_definition_report.pdf',
  'feature_selection.pdf',
  'sensitivity_analysis_meeting_notes.pdf',
  'sensitivity_analysis_report.pdf']}
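
Before moving on, it can help to confirm that the downloads landed where later cells expect them. The sketch below is a minimal check. The paths come from the cells in this notebook, while `missing_resources` is a hypothetical helper that is not part of `downloader.py` or VerifIA.

```python
from pathlib import Path

# Locations used later in this notebook (data file, domain file, documents folder).
EXPECTED_RESOURCES = [
    "../data/loan_eligibility.csv",
    "../domains/loan_eligibility.yaml",
    "../documents/loan_eligibility",
]

def missing_resources(paths, base_dir="."):
    """Return the entries of `paths` that do not exist relative to `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).exists()]

missing = missing_resources(EXPECTED_RESOURCES)
if missing:
    print("Missing resources, re-run the download cell:", missing)
else:
    print("All resources found.")
```

If anything is reported missing, re-running the download cell is usually enough.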

#### 1. Importing Libraries and Setting Up

We begin by importing required libraries such as Pandas, NumPy, XGBoost, and modules from skopt and VerifIA.

In [None]:
%load_ext autoreload
%autoreload 2
import os
import getpass
import gc
import numpy as np
import pandas as pd
import xgboost as xgb
from skopt import BayesSearchCV
from skopt.callbacks import DeadlineStopper, DeltaYStopper
from sklearn.model_selection import StratifiedKFold
from verifia.models import XGBModel, build_from_model_card
from verifia.verification.results import RulesViolationResult
from verifia.context.data import Dataset
from verifia.verification.verifiers import RuleConsistencyVerifier

#### 2. Data Loading

Constants are defined (e.g., random seed, model directory paths, and data file path). The loan eligibility dataset is loaded from a CSV file. The target variable (*loan_paid*) is separated from the feature columns, and categorical features are identified from the dataset.

In [5]:
RAND_SEED = 0
MODELS_DIRPATH = "../models"
DATA_PATH = "../data/loan_eligibility.csv"
dataframe = pd.read_csv(DATA_PATH)
target_name = "loan_paid"
feature_names = set(dataframe.columns) - {target_name}
cat_feature_names = set(dataframe.select_dtypes(include=["object"]).columns) - {target_name}
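
To see what the column selection above does, here is a small sketch on a toy DataFrame. The column names `income` and `home_ownership` are hypothetical and not taken from the loan dataset. The sketch also sorts the names, which is one way to get a reproducible column order, since Python sets are unordered.

```python
import pandas as pd

# Toy frame with one numeric feature, one object-typed feature, and the target.
toy = pd.DataFrame({
    "income": [42_000.0, 58_500.0, 31_000.0],                            # hypothetical numeric feature
    "home_ownership": pd.Series(["RENT", "OWN", "RENT"], dtype=object),  # hypothetical categorical feature
    "loan_paid": [1, 1, 0],
})

target_name = "loan_paid"
# Every column except the target is a feature.
feature_names = sorted(set(toy.columns) - {target_name})
# Object-typed columns are treated as categorical features.
cat_feature_names = sorted(
    set(toy.select_dtypes(include=["object"]).columns) - {target_name}
)

print(feature_names)
print(cat_feature_names)
```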

#### 3. Building the XGBoost Model Wrapper

Using VerifIA’s `build_from_model_card`, we create an instance of `XGBModel`. This wrapper stores critical metadata (such as model name, version, type, feature names, categorical feature names, target name, and the local directory for model storage) and provides a standardized interface for verification tasks.

In [6]:
model_wrapper:XGBModel = build_from_model_card({
    "name": "loan_eligibility",
    "version": "2",
    "type": "classification",
    "description": "model predicts the loan eligibility of a customer.",
    "framework": "xgboost",
    "feature_names": feature_names,
    "cat_feature_names": cat_feature_names,
    "target_name": target_name,
    "local_dirpath": MODELS_DIRPATH
})

#### 4. Preparing the Dataset

The loaded DataFrame is wrapped in a VerifIA `Dataset` object, which organizes the feature and target data using the target, feature, and categorical feature names stored on the model wrapper. The dataset is then split into training and testing subsets (an 80/20 split) for model tuning and subsequent evaluation.

In [7]:
dataset = Dataset(dataframe, model_wrapper.target_name, 
                  model_wrapper.feature_names, 
                  model_wrapper.cat_feature_names)
train_dataset, test_dataset = dataset.split(0.8, RAND_SEED)
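
For intuition, a seeded 80/20 split can be sketched in plain Python as below. This is not VerifIA’s implementation: whether `Dataset.split` shuffles or stratifies is not shown in this notebook, and `split_indices` is a hypothetical helper.

```python
import random

def split_indices(n_rows, train_fraction, seed):
    """Shuffle row indices with a fixed seed and cut them into train/test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)   # same seed -> same shuffle
    cut = round(n_rows * train_fraction)
    return indices[:cut], indices[cut:]

train_idx, test_idx = split_indices(n_rows=10, train_fraction=0.8, seed=0)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split reproducible, so tuning and evaluation always see the same rows.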

#### 5. Defining and Tuning the XGBoost Classifier

An XGBoost classifier is initialized with categorical feature support and a fixed random seed. We then define a search space over key hyperparameters: number of estimators, learning rate, maximum tree depth, minimum child weight, gamma, subsample ratio, column subsampling per tree, and the L1 (`reg_alpha`) and L2 (`reg_lambda`) regularization terms. Using `BayesSearchCV` with `StratifiedKFold` cross-validation, we perform Bayesian hyperparameter tuning. Two callbacks monitor the optimization: a time-based stopper and a function that prints the result of each step. The best hyperparameters are extracted after the search.


In [8]:
xgb_model = xgb.XGBClassifier(enable_categorical=True, random_state=RAND_SEED)

In [9]:
cv_splits_count, max_trials, n_hparams_at_trial = 5, 10, 3
skf = StratifiedKFold(n_splits=cv_splits_count, shuffle=True, random_state=RAND_SEED)
search_spaces = {
                'n_estimators': [10, 50, 100, 200, 400, 600, 800, 1000],
                'learning_rate': [0.001, 0.01, 0.1, 0.2],
                'max_depth': [3, 6, 9, 12],
                'min_child_weight': [1, 3, 5, 7],
                'gamma': [0.0, 0.1, 0.2, 0.3, 0.4],
                'subsample': [0.5, 0.75, 1.0],
                'colsample_bytree': [0.5, 0.75, 1.0],
                'reg_alpha': [0.0, 0.1, 0.5, 1.0],
                'reg_lambda': [0.0, 0.1, 0.5, 1.0]
                }
hparams_tuner = BayesSearchCV(estimator=xgb_model,
                    search_spaces=search_spaces,
                    scoring='f1',
                    cv=skf,                                           # stratified K-fold splitter used for cross-validation
                    n_iter=max_trials,                                # max number of trials
                    n_points=n_hparams_at_trial,                      # number of hyperparameter sets evaluated at the same time
                    iid=False,                                        # if not iid it optimizes on the cv score
                    return_train_score=False,
                    refit=False,
                    optimizer_kwargs={'base_estimator': 'GP'},        # optimizer parameters: we use a Gaussian Process (GP)
                    n_jobs=-1,
                    random_state=RAND_SEED)
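
To put the trial budget in perspective, the sketch below counts the combinations in the search space and shows how 10 trials would be grouped when 3 points are proposed per step. The batching logic reflects our understanding of how `n_iter` and `n_points` interact in scikit-optimize and is an assumption here; `grid_size` and `batch_sizes` are hypothetical helpers.

```python
from math import prod

# Same candidate values as the search space defined above.
search_spaces = {
    'n_estimators': [10, 50, 100, 200, 400, 600, 800, 1000],
    'learning_rate': [0.001, 0.01, 0.1, 0.2],
    'max_depth': [3, 6, 9, 12],
    'min_child_weight': [1, 3, 5, 7],
    'gamma': [0.0, 0.1, 0.2, 0.3, 0.4],
    'subsample': [0.5, 0.75, 1.0],
    'colsample_bytree': [0.5, 0.75, 1.0],
    'reg_alpha': [0.0, 0.1, 0.5, 1.0],
    'reg_lambda': [0.0, 0.1, 0.5, 1.0],
}

def grid_size(spaces):
    """Number of distinct hyperparameter combinations in the search space."""
    return prod(len(values) for values in spaces.values())

def batch_sizes(n_iter, n_points):
    """Split a budget of n_iter trials into steps of at most n_points trials."""
    sizes, remaining = [], n_iter
    while remaining > 0:
        sizes.append(min(n_points, remaining))
        remaining -= n_points
    return sizes

print(grid_size(search_spaces))   # 368640
print(batch_sizes(10, 3))         # [3, 3, 3, 1]
```

With only 10 of 368,640 combinations evaluated, the search is a quick exploration. Raising `max_trials` trades compute time for coverage. The four steps also match the four `Last eval` lines printed by the fit cell.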

In [10]:
counter = 1
def onstep(res):
    global counter
    x0 = res.x_iters   # List of input points
    y0 = res.func_vals # Evaluation of input points
    print(f'Last eval #{counter}: {x0[-1]}', 
          f' - Score {y0[-1]:.3f}')
    print(f' - Best Score {res.fun:.3f}',
          f' - Best Args: {res.x}')
    counter += 1

#overdone_control = DeltaYStopper(delta=0.0001)               # We stop if the gain of the optimization becomes too small
time_limit_control = DeadlineStopper(total_time=60 * 45)     # We impose a time limit (45 minutes)

callbacks=[time_limit_control, onstep]
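
The callback above relies on a global counter and prints the raw objective values, which are the negated F1 scores because the optimizer minimizes. As an alternative sketch, a closure can keep the step count and flip the sign back. `make_step_logger` is a hypothetical helper, and `SimpleNamespace` only stands in for the result object passed to callbacks.

```python
from types import SimpleNamespace

def make_step_logger(log=print):
    """Build a callback that reports scores with their original (positive) sign."""
    step = 0

    def on_step(res):
        nonlocal step
        step += 1
        last_score = -res.func_vals[-1]   # undo the sign flip used for minimization
        best_score = -res.fun
        log(f"step {step}: last score {last_score:.3f}, best score {best_score:.3f}")

    return on_step

# Stand-in result with the first two objective values from this notebook's log.
fake_result = SimpleNamespace(func_vals=[-0.911, -0.908], fun=-0.911)
messages = []
logger = make_step_logger(messages.append)
logger(fake_result)
print(messages[0])   # step 1: last score 0.908, best score 0.911
```

Passing `make_step_logger()` in the callbacks list instead of `onstep` would print one such line per optimization step.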

In [11]:
X = train_dataset.feature_data(True)
y = train_dataset.target_data
hparams_tuner.fit(X, y, callback=callbacks)

hparams_evals_count = len(hparams_tuner.cv_results_['params'])
best_score = hparams_tuner.best_score_
best_score_std = hparams_tuner.cv_results_['std_test_score'][hparams_tuner.best_index_]
best_params = hparams_tuner.best_params_
print(f"candidates checked: {hparams_evals_count}, best CV score: {best_score:.3f}, best_score_std:{best_score_std:.3f}")
print(f"best_params: {best_params}")

Last eval #1: [1.0, 0.2, 0.01, 9, 5, 100, 0.1, 0.5, 0.5]  - Score -0.911
 - Best Score -0.911  - Best Args: [1.0, 0.2, 0.01, 9, 5, 100, 0.1, 0.5, 0.5]
Last eval #2: [1.0, 0.0, 0.1, 9, 1, 200, 1.0, 1.0, 0.5]  - Score -0.908
 - Best Score -0.911  - Best Args: [1.0, 0.2, 0.01, 9, 5, 100, 0.1, 0.5, 0.5]
Last eval #3: [0.5, 0.1, 0.1, 9, 1, 100, 0.0, 1.0, 1.0]  - Score -0.911
 - Best Score -0.911  - Best Args: [0.5, 0.1, 0.01, 6, 1, 600, 0.5, 1.0, 0.75]
Last eval #4: [0.5, 0.3, 0.01, 6, 7, 600, 1.0, 0.1, 0.75]  - Score -0.911
 - Best Score -0.911  - Best Args: [0.5, 0.1, 0.01, 6, 1, 600, 0.5, 1.0, 0.75]
candidates checked: 10, best CV score: 0.911, best_score_std:0.001
best_params: OrderedDict([('colsample_bytree', 0.5), ('gamma', 0.1), ('learning_rate', 0.01), ('max_depth', 6), ('min_child_weight', 1), ('n_estimators', 600), ('reg_alpha', 0.5), ('reg_lambda', 1.0), ('subsample', 0.75)])


#### 6. Training and Wrapping the Final XGBoost Model

The XGBoost model is re-instantiated with the best hyperparameters from the tuning phase and trained on the training data. The fitted model is then assigned to the `XGBModel` wrapper, saved, and reloaded. Finally, we evaluate it on the test dataset using the F1 score (or any other relevant metric) to confirm that it meets the desired classification performance.

In [12]:
xgb_model = xgb.XGBClassifier(**best_params, enable_categorical=True, random_state=RAND_SEED)
xgb_model.fit(X, y)
model_wrapper.wrap_model(xgb_model)
model_wrapper.save_model()
model_wrapper.load_model()
metric_name, metric_score = model_wrapper.calculate_predictive_performance(test_dataset)
print(f"Test Performance Metric : {metric_name}={metric_score}")

Test Performance Metric : F1-Score=0.9110096549659457
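
As a reminder of what the reported metric measures, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from scratch on made-up labels. It illustrates the formula only and is not the code used by `calculate_predictive_performance`.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0   # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
print(f1_score_binary(y_true, y_pred))   # 0.75
```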


#### 7. Loading or Generating the Domain Configuration

VerifIA lets you create a domain configuration in two ways: load a predefined YAML file (Option A) or generate one with AI from domain documents (Option B). Whichever option you choose, the resulting configuration is used to instantiate the `RuleConsistencyVerifier`, which uses the domain rules and constraints to check whether the model’s predictions on the test data are consistent with our domain knowledge.

##### **Option A: Predefined Domain File:**

You can load a predefined domain YAML file (e.g., "loan_eligibility.yaml") to provide the necessary constraints and rules.

In [13]:
DOMAIN_PATH = "../domains/loan_eligibility.yaml"
model_verifier = RuleConsistencyVerifier(DOMAIN_PATH)

##### **Option B: AI-Powered Domain Generation:**  

Alternatively, VerifIA’s DomainGenFlow module is used to generate a rich domain configuration from the training data. By supplying the training dataframe, a directory of PDF documents containing domain knowledge, and the model card information, a domain configuration dictionary is produced.

**Setup OpenAI and LangSmith Keys**

LangSmith is optional but helpful for tracing. If you want to use it, sign up for a LangSmith account and set the corresponding environment variables to start logging traces.

Accessing the OpenAI API requires an API key, which you can get by creating an account. Once you have a key you'll want to set it as an environment variable by running:

In [None]:
os.environ["LANGCHAIN_TRACING_V2"] = 'true'
os.environ["LANGCHAIN_ENDPOINT"] = 'https://api.smith.langchain.com'
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass(prompt='Your LANGCHAIN_API_KEY? ')
os.environ["OPENAI_API_KEY"] = getpass.getpass(prompt='Your OPENAI_API_KEY? ')
os.environ["USER_AGENT"] = 'my_agent'
os.environ["LANGCHAIN_PROJECT"] = 'VERIFIA_TEST'
os.environ["VERIFIA_GPT_NAME"] = 'gpt-4.1'
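
Since LangSmith is optional, one way to avoid unnecessary prompts is to ask only for keys that are not already set and to enable tracing on request. This is a minimal sketch: `ensure_env` and `configure_keys` are hypothetical helpers, and the variable names are the ones used in the cell above.

```python
import os
import getpass

def ensure_env(name, ask=getpass.getpass):
    """Prompt for `name` only when it is not already set; return its value."""
    if not os.environ.get(name):
        os.environ[name] = ask(f"Your {name}? ")
    return os.environ[name]

def configure_keys(use_langsmith=False, ask=getpass.getpass):
    """Set the OpenAI key, and the LangSmith tracing variables only if requested."""
    ensure_env("OPENAI_API_KEY", ask)
    if use_langsmith:
        os.environ["LANGCHAIN_TRACING_V2"] = "true"
        os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
        ensure_env("LANGCHAIN_API_KEY", ask)
```

Calling `configure_keys()` prompts for the OpenAI key at most once per session; `configure_keys(use_langsmith=True)` additionally turns on tracing.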

In [None]:
from verifia.generation import DomainGenFlow

DOMAIN_PDF_DIRPATH = "../documents/loan_eligibility"
domain_genflow = DomainGenFlow()
domain_genflow.load_ctx(dataframe=train_dataset.data,
                        pdfs_dirpath=DOMAIN_PDF_DIRPATH,
                        model_card=model_wrapper.model_card.to_dict())
domain_cfg_dict = domain_genflow.run(save=True, local_path="./domain.yaml")
model_verifier = RuleConsistencyVerifier(domain_cfg_dict=domain_cfg_dict)

#### 8. Running the Rule Consistency Verifier

With the `RuleConsistencyVerifier` built from the generated domain configuration (or the loaded YAML file), we connect it to the wrapped XGBoost model and the test dataset. We then run the verification with the Random Sampler (RS) search algorithm and the specified parameters (population size, maximum iterations, and original seed size). The verifier explores the input space to identify rule violations or inconsistencies in the model’s predictions.

In [14]:
result:RulesViolationResult = model_verifier.verify(model_wrapper)\
                                            .on(test_dataset.data)\
                                            .using("RS")\
                                            .run(pop_size=4, max_iters=5, orig_seed_size=10)

Processing Original Inputs: 10it [00:00, 17.08it/s]
Processing Original Inputs: 10it [00:00, 15.06it/s]
Processing Original Inputs: 10it [00:00, 13.96it/s]
Processing Original Inputs: 10it [00:00, 18.18it/s]
Processing Original Inputs: 10it [00:00, 16.26it/s]
Processing Original Inputs: 10it [00:00, 17.70it/s]
Processing Original Inputs: 10it [00:00, 16.86it/s]
Processing Original Inputs: 10it [00:00, 13.31it/s]
Processing Original Inputs: 10it [00:00, 15.19it/s]
Processing Original Inputs: 10it [00:00, 14.40it/s]
Processing Original Inputs: 10it [00:00, 16.48it/s]
Processing Rules: 100%|██████████| 11/11 [00:07<00:00,  1.56it/s]
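
To build intuition for a random-sampling rule check, the toy sketch below perturbs a few seed inputs and counts the cases where a rule is broken. Everything in it is an assumption made for illustration: the `income` feature, the rule "a higher income must not lower eligibility", the flawed `toy_model`, and the way `pop_size` and `max_iters` are used. It does not reproduce VerifIA’s RS algorithm or the rules in the domain file.

```python
import random

def toy_model(income):
    """Hypothetical classifier: 1 = eligible, 0 = not eligible."""
    if 50_000 <= income < 55_000:   # deliberate flaw in one income band
        return 0
    return 1 if income >= 40_000 else 0

def find_violations(model, seeds, pop_size, max_iters, rng):
    """Randomly raise each seed's income and record predictions that get worse."""
    violations = []
    for seed in seeds:
        base_pred = model(seed)
        for _ in range(max_iters):
            for _ in range(pop_size):
                candidate = seed * (1 + rng.uniform(0.0, 0.3))
                if model(candidate) < base_pred:   # rule: more income must not hurt
                    violations.append((seed, candidate))
    return violations

violations = find_violations(toy_model, seeds=[45_000, 60_000],
                             pop_size=4, max_iters=5, rng=random.Random(0))
print(f"{len(violations)} violations out of {2 * 4 * 5} sampled inputs")
```

Each recorded pair is a concrete counterexample: an original input and a perturbed input for which the prediction moved in the direction the rule forbids.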


#### 9. Saving the Verification Report and Model Artifacts

Finally, the verification results are saved as an HTML report that provides a detailed summary of rule compliance and any detected violations. Additionally, the trained XGBoost model and its model card are saved, ensuring that the model’s configuration and verification status are archived for future reference or reproducibility.

In [15]:
result.save_as_html("../reports/loan_eligibility.html")
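
Whether `save_as_html` creates missing folders is not shown in this notebook. If it does not, a small guard like the one below can create the target directory first. `ensure_parent_dir` is a hypothetical helper, not part of VerifIA.

```python
from pathlib import Path

def ensure_parent_dir(file_path):
    """Create the parent directory of `file_path` if needed and return the path."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)   # no error if it already exists
    return path

# Example usage before saving the report:
# report_path = ensure_parent_dir("../reports/loan_eligibility.html")
# result.save_as_html(str(report_path))
```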

In [16]:
model_wrapper.save_model()
model_wrapper.save_model_card("../models/loan_eligibility.yaml")