# Monotone constraints for all models

## Rationale:

Monotone constraints proved beneficial for the ``Boruta`` model, so we test whether they also benefit each of the other models.

## Methodology:
We define the monotone constraint for each feature, i.e. the direction of the relationship between the feature and the target, as follows:

1. Train a linear regression model to explain the target average (average risk) for every model decile.
2. Keep the trend coefficient of the model to define the constraint.
3. The constraint is the sign of the coefficient: ```(+ , -, 0)```
4. Finally, compare the performance of both models, the one with and the one without constraints.

The previous methodology will be applied to all the models.
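
The steps above can be sketched as follows. This is a minimal illustration only: the actual `find_constraint` helper imported from `utils.functions__utils` is not shown in this notebook, so the function name `sketch_find_constraint`, the `n_bins` and `tol` parameters, and the use of quantile buckets of the feature are all assumptions.

```python
import numpy as np
import pandas as pd


def sketch_find_constraint(df, feature, target, n_bins=10, tol=1e-9):
    """Hypothetical stand-in for find_constraint: returns 1, -1 or 0."""
    data = df[[feature, target]].dropna()
    # Assumption: deciles are quantile buckets of the feature itself.
    # duplicates="drop" merges buckets when many values are tied.
    buckets = pd.qcut(data[feature], q=n_bins, labels=False, duplicates="drop")
    # Average target (average risk) per bucket.
    avg_target = data[target].groupby(buckets).mean()
    if len(avg_target) < 2:
        return 0  # not enough buckets to estimate a trend
    # Slope of a least-squares line through (bucket index, average target).
    x = avg_target.index.to_numpy(dtype=float)
    slope = np.polyfit(x, avg_target.to_numpy(dtype=float), deg=1)[0]
    if abs(slope) <= tol:
        return 0  # flat trend: leave the feature unconstrained
    return int(np.sign(slope))


# Toy data: risk rises with "x" and falls with "z".
toy = pd.DataFrame({"x": np.arange(100), "z": np.arange(100)[::-1]})
toy["target"] = (toy["x"] > 50).astype(int)
toy["flat"] = 0
print(sketch_find_constraint(toy, "x", "target"))
print(sketch_find_constraint(toy, "z", "target"))
```

Returning integers in `{1, -1, 0}` rather than the symbols `(+, -, 0)` is a design choice made here because that is the encoding LightGBM expects for `monotone_constraints`.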

## Conclusions:

**Conclusions from Model Performance Table:**

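The results show that this methodology did not benefit all the models: with constraints, validation ROC AUC improved for the all, ensemble and fw feature sets but dropped for the base set, while out-of-fold ROC AUC improved only for the all feature set.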


| Model                      | out_of_fold | validation |
|----------------------------|------------|------------|
| all_features [173]         | 0.862250   | 0.866135   |
| all_mc_features [173]      | 0.862359   | 0.867179   |
| base_features [10]         | 0.864081   | 0.865510   |
| base_mc_features [10]      | 0.861828   | 0.864351   |
| ensemble_features [82]     | 0.864084   | 0.866569   |
| ensemble_mc_features [82]  | 0.863107   | 0.866836   |
| fw_features [53]           | 0.862487   | 0.866534   |
| fw_mc_features [53]        | 0.862111   | 0.866872   |




In [2]:
%load_ext autoreload
%autoreload 2

import numpy as np
import pandas as pd

from lightgbm import LGBMClassifier as lgbm
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from copy import deepcopy

import warnings
warnings.filterwarnings("ignore")

import sys
sys.path.append("../")

# local imports
from src.learner_params import target_column, boruta_learner_params, test_params
from utils.functions__utils import find_constraint

from src.learner_params import params_all, params_ensemble, params_fw, params_original, MODEL_PARAMS
from utils.feature_selection_lists import fw_features, boruta_features, ensemble_features
from utils.features_lists import all_features_list, base_features

from utils.functions__training import model_pipeline

In [3]:
train_df = pd.read_pickle("../data/train_df.pkl")
validation_df = pd.read_pickle("../data/validation_df.pkl")

## Find the monotone constraints for every model

We use the linear-model methodology described above to find the constraints.

In [4]:
# One constraint per feature, stored in the same order as each feature list.
boruta_monotone_const_dict = {
    feature: find_constraint(train_df, feature, target_column)
    for feature in boruta_features
}

fw_monotone_const_dict = {
    feature: find_constraint(train_df, feature, target_column)
    for feature in fw_features
}

ensemble_monotone_const_dict = {
    feature: find_constraint(train_df, feature, target_column)
    for feature in ensemble_features
}

all_monotone_const_dict = {
    feature: find_constraint(train_df, feature, target_column)
    for feature in all_features_list
}

base_monotone_const_dict = {
    feature: find_constraint(train_df, feature, target_column)
    for feature in base_features
}

In [5]:
boruta_params = deepcopy(MODEL_PARAMS)
boruta_params["learner_params"]["extra_params"]["monotone_constraints"] = list(boruta_monotone_const_dict.values())

fw_params = deepcopy(params_fw)
fw_params["learner_params"]["extra_params"]["monotone_constraints"] = list(fw_monotone_const_dict.values())

ensemble_params = deepcopy(params_ensemble)
ensemble_params["learner_params"]["extra_params"]["monotone_constraints"] = list(ensemble_monotone_const_dict.values())

all_params = deepcopy(params_all)
all_params["learner_params"]["extra_params"]["monotone_constraints"] = list(all_monotone_const_dict.values())

base_params = deepcopy(params_original)
base_params["learner_params"]["extra_params"]["monotone_constraints"] = list(base_monotone_const_dict.values())
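
The cell above relies on each dictionary keeping the same order as its feature list, because LightGBM's `monotone_constraints` parameter takes one entry per feature, in column order, with values `1`, `-1` or `0`. A small defensive helper could make that alignment explicit. The name `constraints_in_feature_order` is hypothetical, and the sketch assumes `find_constraint` returns values that convert to those three integers.

```python
def constraints_in_feature_order(constraint_by_feature, features):
    """Return the constraints as a list aligned with `features`.

    Raises KeyError if a feature has no constraint and ValueError if a
    constraint is not one of -1, 0, 1.
    """
    allowed = {-1, 0, 1}
    ordered = []
    for feature in features:
        value = int(constraint_by_feature[feature])
        if value not in allowed:
            raise ValueError(f"Unexpected constraint {value!r} for {feature!r}")
        ordered.append(value)
    return ordered


# Toy example: the dictionary order differs from the feature order.
example_constraints = {"income": -1, "age": 0, "debt_ratio": 1}
example_features = ["debt_ratio", "income", "age"]
print(constraints_in_feature_order(example_constraints, example_features))
```

Looking up each feature by name, instead of calling `list(d.values())`, keeps the list correct even if the dictionary was built in a different order than the features passed to the learner.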

## Train the learners

### Base with no constraints
We train each model with its optimized hyperparameters.

In [6]:
boruta_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = MODEL_PARAMS,
                            target_column = target_column,
                            features = boruta_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T09:56:57 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T09:56:58 | INFO | Training for fold 1
2023-10-11T09:58:26 | INFO | Training for fold 2
2023-10-11T09:59:58 | INFO | Training for fold 3
2023-10-11T10:01:33 | INFO | CV training finished!
2023-10-11T10:01:33 | INFO | Training the model in the full dataset...
2023-10-11T10:03:48 | INFO | Training process finished!
2023-10-11T10:03:48 | INFO | Calculating metrics...
2023-10-11T10:03:48 | INFO | Full process finished in 6.84 minutes.
2023-10-11T10:03:48 | INFO | Saving the predict function.
2023-10-11T10:03:48 | INFO | Predict function saved.


In [7]:
fw_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = params_fw,
                            target_column = target_column,
                            features = fw_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:03:48 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:03:48 | INFO | Training for fold 1
2023-10-11T10:04:27 | INFO | Training for fold 2
2023-10-11T10:05:06 | INFO | Training for fold 3
2023-10-11T10:05:45 | INFO | CV training finished!
2023-10-11T10:05:45 | INFO | Training the model in the full dataset...
2023-10-11T10:06:32 | INFO | Training process finished!
2023-10-11T10:06:32 | INFO | Calculating metrics...
2023-10-11T10:06:32 | INFO | Full process finished in 2.74 minutes.
2023-10-11T10:06:32 | INFO | Saving the predict function.
2023-10-11T10:06:32 | INFO | Predict function saved.


In [8]:
ensemble_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = params_ensemble,
                            target_column = target_column,
                            features = ensemble_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:06:33 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:06:33 | INFO | Training for fold 1
2023-10-11T10:06:55 | INFO | Training for fold 2
2023-10-11T10:07:17 | INFO | Training for fold 3
2023-10-11T10:07:39 | INFO | CV training finished!
2023-10-11T10:07:39 | INFO | Training the model in the full dataset...
2023-10-11T10:08:07 | INFO | Training process finished!
2023-10-11T10:08:07 | INFO | Calculating metrics...
2023-10-11T10:08:07 | INFO | Full process finished in 1.58 minutes.
2023-10-11T10:08:07 | INFO | Saving the predict function.
2023-10-11T10:08:07 | INFO | Predict function saved.


In [9]:
base_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = params_original,
                            target_column = target_column,
                            features = base_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:08:08 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:08:08 | INFO | Training for fold 1
2023-10-11T10:08:30 | INFO | Training for fold 2
2023-10-11T10:08:54 | INFO | Training for fold 3
2023-10-11T10:09:16 | INFO | CV training finished!
2023-10-11T10:09:16 | INFO | Training the model in the full dataset...
2023-10-11T10:09:43 | INFO | Training process finished!
2023-10-11T10:09:43 | INFO | Calculating metrics...
2023-10-11T10:09:43 | INFO | Full process finished in 1.60 minutes.
2023-10-11T10:09:43 | INFO | Saving the predict function.
2023-10-11T10:09:43 | INFO | Predict function saved.


In [10]:
all_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = params_all,
                            target_column = target_column,
                            features = all_features_list,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:09:44 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:09:44 | INFO | Training for fold 1
2023-10-11T10:10:27 | INFO | Training for fold 2
2023-10-11T10:11:10 | INFO | Training for fold 3
2023-10-11T10:11:53 | INFO | CV training finished!
2023-10-11T10:11:53 | INFO | Training the model in the full dataset...
2023-10-11T10:12:46 | INFO | Training process finished!
2023-10-11T10:12:46 | INFO | Calculating metrics...
2023-10-11T10:12:46 | INFO | Full process finished in 3.04 minutes.
2023-10-11T10:12:46 | INFO | Saving the predict function.
2023-10-11T10:12:46 | INFO | Predict function saved.


### Base with constraints
We train each learner again, adding the constraints to the already optimized hyperparameters.

In [11]:
boruta_mc_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = boruta_params,
                            target_column = target_column,
                            features = boruta_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:12:46 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:12:46 | INFO | Training for fold 1
2023-10-11T10:14:22 | INFO | Training for fold 2
2023-10-11T10:15:58 | INFO | Training for fold 3
2023-10-11T10:17:33 | INFO | CV training finished!
2023-10-11T10:17:33 | INFO | Training the model in the full dataset...
2023-10-11T10:19:46 | INFO | Training process finished!
2023-10-11T10:19:46 | INFO | Calculating metrics...
2023-10-11T10:19:46 | INFO | Full process finished in 7.01 minutes.
2023-10-11T10:19:46 | INFO | Saving the predict function.
2023-10-11T10:19:46 | INFO | Predict function saved.


In [12]:
fw_mc_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = fw_params,
                            target_column = target_column,
                            features = fw_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:19:47 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:19:47 | INFO | Training for fold 1
2023-10-11T10:20:36 | INFO | Training for fold 2
2023-10-11T10:21:25 | INFO | Training for fold 3
2023-10-11T10:22:15 | INFO | CV training finished!
2023-10-11T10:22:15 | INFO | Training the model in the full dataset...
2023-10-11T10:23:16 | INFO | Training process finished!
2023-10-11T10:23:16 | INFO | Calculating metrics...
2023-10-11T10:23:16 | INFO | Full process finished in 3.50 minutes.
2023-10-11T10:23:16 | INFO | Saving the predict function.
2023-10-11T10:23:16 | INFO | Predict function saved.


In [13]:
ensemble_mc_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = ensemble_params,
                            target_column = target_column,
                            features = ensemble_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:23:17 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:23:17 | INFO | Training for fold 1
2023-10-11T10:23:41 | INFO | Training for fold 2
2023-10-11T10:24:04 | INFO | Training for fold 3
2023-10-11T10:24:28 | INFO | CV training finished!
2023-10-11T10:24:28 | INFO | Training the model in the full dataset...
2023-10-11T10:24:58 | INFO | Training process finished!
2023-10-11T10:24:58 | INFO | Calculating metrics...
2023-10-11T10:24:59 | INFO | Full process finished in 1.70 minutes.
2023-10-11T10:24:59 | INFO | Saving the predict function.
2023-10-11T10:24:59 | INFO | Predict function saved.


In [14]:
all_mc_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = all_params,
                            target_column = target_column,
                            features = all_features_list,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:24:59 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:24:59 | INFO | Training for fold 1
2023-10-11T10:25:52 | INFO | Training for fold 2
2023-10-11T10:26:44 | INFO | Training for fold 3
2023-10-11T10:27:36 | INFO | CV training finished!
2023-10-11T10:27:36 | INFO | Training the model in the full dataset...
2023-10-11T10:28:42 | INFO | Training process finished!
2023-10-11T10:28:42 | INFO | Calculating metrics...
2023-10-11T10:28:42 | INFO | Full process finished in 3.72 minutes.
2023-10-11T10:28:42 | INFO | Saving the predict function.
2023-10-11T10:28:42 | INFO | Predict function saved.


In [15]:
base_mc_logs = model_pipeline(train_df = train_df,
                            validation_df = validation_df,
                            params = base_params,
                            target_column = target_column,
                            features = base_features,
                            cv = 3,
                            random_state = 42,
                            apply_shap = False
                          )

2023-10-11T10:28:42 | INFO | Starting pipeline: Generating 3 k-fold training...
2023-10-11T10:28:42 | INFO | Training for fold 1
2023-10-11T10:29:06 | INFO | Training for fold 2
2023-10-11T10:29:30 | INFO | Training for fold 3
2023-10-11T10:29:53 | INFO | CV training finished!
2023-10-11T10:29:53 | INFO | Training the model in the full dataset...
2023-10-11T10:30:22 | INFO | Training process finished!
2023-10-11T10:30:22 | INFO | Calculating metrics...
2023-10-11T10:30:22 | INFO | Full process finished in 1.68 minutes.
2023-10-11T10:30:22 | INFO | Saving the predict function.
2023-10-11T10:30:22 | INFO | Predict function saved.


## Model evaluation
We compare the performance of the models with and without monotone constraints.

In [22]:
model_metrics = {}
models = [base_logs, all_logs, fw_logs, ensemble_logs]
names = ["base_features", "all_features", "fw_features", "ensemble_features"]
sizes = [len(base_features),len(all_features_list), len(fw_features), len(ensemble_features)]

for model, name, size in zip(models, names, sizes):
    model_metrics[f"{name} [{size}]"] = model["metrics"]["roc_auc"]
base_models_df = pd.DataFrame(model_metrics).T.sort_values(by = "validation", ascending = False)

In [23]:
model_metrics = {}
models = [base_mc_logs, all_mc_logs, fw_mc_logs, ensemble_mc_logs]
names = ["base_mc_features", "all_mc_features", "fw_mc_features", "ensemble_mc_features"]
sizes = [len(base_features),len(all_features_list), len(fw_features), len(ensemble_features)]

for model, name, size in zip(models, names, sizes):
    model_metrics[f"{name} [{size}]"] = model["metrics"]["roc_auc"]
mc_models_df = pd.DataFrame(model_metrics).T.sort_values(by = "validation", ascending = False)

In [29]:
pd.concat([base_models_df, mc_models_df]).sort_index()

| Model                      | out_of_fold | validation |
|----------------------------|-------------|------------|
| all_features [173]         | 0.86225     | 0.866135   |
| all_mc_features [173]      | 0.862359    | 0.867179   |
| base_features [10]         | 0.864081    | 0.86551    |
| base_mc_features [10]      | 0.861828    | 0.864351   |
| ensemble_features [82]     | 0.864084    | 0.866569   |
| ensemble_mc_features [82]  | 0.863107    | 0.866836   |
| fw_features [53]           | 0.862487    | 0.866534   |
| fw_mc_features [53]        | 0.862111    | 0.866872   |
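
To read the comparison more directly, the ROC AUC change per feature set can be computed as constrained minus unconstrained. The sketch below re-enters the values from the table above by hand; the column names `feature_set` and `variant` are introduced here only for illustration.

```python
import pandas as pd

# (feature_set, variant, out_of_fold, validation), copied from the table above.
rows = [
    ("all", "plain", 0.862250, 0.866135),
    ("all", "mc", 0.862359, 0.867179),
    ("base", "plain", 0.864081, 0.865510),
    ("base", "mc", 0.861828, 0.864351),
    ("ensemble", "plain", 0.864084, 0.866569),
    ("ensemble", "mc", 0.863107, 0.866836),
    ("fw", "plain", 0.862487, 0.866534),
    ("fw", "mc", 0.862111, 0.866872),
]
results = pd.DataFrame(
    rows, columns=["feature_set", "variant", "out_of_fold", "validation"]
)

metric_cols = ["out_of_fold", "validation"]
plain = results[results["variant"] == "plain"].set_index("feature_set")[metric_cols]
mc = results[results["variant"] == "mc"].set_index("feature_set")[metric_cols]

# Positive values mean the monotone-constrained model scored higher.
delta = (mc - plain).round(6)
print(delta)
```

All differences are on the order of 0.001 or smaller, and no variance estimate is available from a single run, so they should be read as indicative rather than conclusive.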
