# Model ensemble

## Rationale:

Combining the predictions of different models is a well-established way to improve predictive performance in practice. We therefore aggregate the predictions of our different models, i.e. the models obtained during the feature selection process.

## Methodology:

We combine the predictions of the 5 models in 3 different ways:

1. Average the predictions of the individual models and use the average as the final prediction on the ```private dataset```.
2. Train a linear stacking model that combines the predictions of the individual models, then use it to predict the ```private dataset```. To do so, we first generate the **out-of-fold** predictions of every model and then train the stacker model on them.
3. Follow the same methodology as in step 2, but with a non-linear stacker model, a **multilayer perceptron**.
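
The first two options above can be sketched on synthetic data. This is only an illustration under stated assumptions: the data comes from `make_classification`, and the two base models and their feature subsets are hypothetical stand-ins for the project's learners. It uses scikit-learn's `cross_val_predict` to build the out-of-fold matrix, which may differ from how the project's own helpers do it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic stand-in data; the notebook's real datasets are not reproduced here.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5, random_state=0)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Hypothetical base models, each seeing a different feature subset.
base_models = {
    "all": (LogisticRegression(max_iter=1000), list(range(10))),
    "subset": (RandomForestClassifier(n_estimators=50, random_state=0), [0, 1, 2, 3, 4]),
}

oof_cols, holdout_cols = [], []
for name, (model, cols) in base_models.items():
    # Out-of-fold probabilities: every row is scored by a model that never saw it.
    oof = cross_val_predict(
        model, X_train[:, cols], y_train, cv=5, method="predict_proba"
    )[:, 1]
    oof_cols.append(oof)
    # Refit on the full training set to score the held-out rows.
    model.fit(X_train[:, cols], y_train)
    holdout_cols.append(model.predict_proba(X_holdout[:, cols])[:, 1])

oof_matrix = np.column_stack(oof_cols)
holdout_matrix = np.column_stack(holdout_cols)

# Option 1: plain average of the base-model predictions.
average_pred = holdout_matrix.mean(axis=1)

# Option 2: linear stacker trained on the out-of-fold predictions only.
stacker = LogisticRegression().fit(oof_matrix, y_train)
stacked_pred = stacker.predict_proba(holdout_matrix)[:, 1]

auc_average = roc_auc_score(y_holdout, average_pred)
auc_stacked = roc_auc_score(y_holdout, stacked_pred)
print(f"average: {auc_average:.3f}  stacked: {auc_stacked:.3f}")
```

Training the stacker on out-of-fold predictions rather than in-sample predictions keeps it from rewarding base models that simply memorised their training rows.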

## Conclusions:

**Conclusions from Model Performance Table:**

1. **Prediction Averaging and Stacking**: The two combined predictions, 'prediction_lr' (logistic regression stacker) and 'prediction_average', reach the highest validation ROC AUC (0.8016 and 0.8014) and are within 0.0003 of the best out-of-fold score.

2. **Feature Selection Models**: 'prediction_boruta' is the best single model, with the highest out-of-fold ROC AUC (0.7957) and a validation ROC AUC of 0.8009. 'prediction_ensemble' and 'prediction_MrMr', the single models loaded from `ensemble_learner.pkl` and `fw_learner.pkl`, follow closely at about 0.7996 and 0.7994 on validation.

3. **All Features Model**: 'prediction_all_features' scores 0.7991 on validation, about 0.002 below the Boruta model, so feature selection brings a small improvement over using every available feature.

4. **Optuna Optimization**: 'prediction_Optuna' has the lowest ROC AUC on both datasets (0.7883 out-of-fold, 0.7959 on validation), suggesting that this selection process may need further tuning.

5. **Validation Dataset**: Every model scores roughly 0.005 to 0.008 higher on the validation dataset than on the out-of-fold dataset, and the ranking is nearly the same on both. The only change is 'prediction_boruta', which ranks first out-of-fold and third on validation.

6. **Overall Strategy**: The gain of the stacker over the best single model is below 0.001 ROC AUC on validation, and no confidence intervals are reported, so the difference may not be significant. The choice between averaging, stacking, and a single feature-selected model can therefore also weigh simplicity and maintenance cost.

| model                    | oof_roc_auc | validation_roc_auc |
|--------------------------|------------|---------------------|
| prediction_lr            | 0.795479   | 0.801554            |
| prediction_average       | 0.795401   | 0.801386            |
| prediction_boruta        | 0.795680   | 0.800874            |
| prediction_ensemble      | 0.792880   | 0.799627            |
| prediction_MrMr          | 0.792839   | 0.799425            |
| prediction_all_features  | 0.791760   | 0.799144            |
| prediction_Optuna        | 0.788259   | 0.795938            |

In [63]:
%load_ext autoreload
%autoreload 2

from copy import deepcopy
from functools import reduce
import sys
import warnings

import cloudpickle as cp
import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, cross_val_score

# Silence all warnings for cleaner notebook output (this also hides convergence warnings).
warnings.filterwarnings("ignore")

# Make the project root importable so the local `src` and `utils` packages resolve.
sys.path.append("../")

# local imports
from src.learner_params import (
    MODEL_PARAMS,
    boruta_learner_params,
    params_all,
    params_ensemble,
    params_fw,
    params_optuna,
    space_column,
    target_column,
    test_params,
)
from utils.feature_selection_lists import fw_features, boruta_features, optuna_features, ensemble_features
from utils.features_lists import all_features_list
from utils.functions__training import model_pipeline
from utils.functions__utils import find_constraint, train_binary



### Load the datasets

In [64]:
# load the datasets
validation_df = pd.read_pickle("../data/validation_df.pkl")
all_oof_df = pd.read_pickle("../data/all_oof_df.pkl")
boruta_oof_df = pd.read_pickle("../data/boruta_oof_df.pkl")
fw_oof_df = pd.read_pickle("../data/fw_oof_df.pkl")
optuna_oof_df = pd.read_pickle("../data/optuna_oof_df.pkl")
ensemble_oof_df = pd.read_pickle("../data/ensemble_oof_df.pkl")
# load the learners
all_predict_fn = joblib.load("../model_files/all_learner.pkl")
boruta_predict_fn = joblib.load("../model_files/boruta_learner.pkl")
fw_predict_fn = joblib.load("../model_files/fw_learner.pkl")
optuna_predict_fn = joblib.load("../model_files/optuna_learner.pkl")
ensemble_predict_fn = joblib.load("../model_files/ensemble_learner.pkl")

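# Out-of-fold prediction frames, one per model.
# NOTE: `ldf`, `lpf` and `names` are zipped together below, so their order must match.
ldf = [
    all_oof_df,
    boruta_oof_df,
    fw_oof_df,
    optuna_oof_df,
    ensemble_oof_df,
]

# Fitted learners, in the same order as `ldf`.
lpf = [
    all_predict_fn,
    boruta_predict_fn,
    fw_predict_fn,
    optuna_predict_fn,
    ensemble_predict_fn,
]

# Labels used to build the `prediction_<name>` columns; the `fw` model is labelled "MrMr".
names = [
    "all_features",
    "boruta",
    "MrMr",
    "Optuna",
    "ensemble",
]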


### Join the datasets

In [65]:
l, v = [], []

# Out-of-fold predictions: keep the key and the prediction, renamed per model.
for name, _df in zip(names, ldf):
    aux = _df[[space_column, "prediction"]].rename(columns={"prediction": f"prediction_{name}"})
    l.append(aux)
# Join all per-model frames on the key column.
df_oof = reduce(lambda x, y: pd.merge(x, y, on=space_column), l)

# Validation predictions: each loaded learner exposes its scoring function under the "predict_fn" key.
for name, predict_fn in zip(names, lpf):
    aux = predict_fn["predict_fn"](validation_df)[[space_column, "prediction"]].rename(columns={"prediction": f"prediction_{name}"})
    v.append(aux)
df_validation = reduce(lambda x, y: pd.merge(x, y, on=space_column), v)

columns = ['prediction_MrMr',
 'prediction_Optuna',
 'prediction_all_features',
 'prediction_boruta',
 'prediction_ensemble']
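
The join above relies on every frame having exactly one row per key. A minimal sketch of the same reduce-and-merge pattern with an explicit uniqueness check is shown below. The `id` key and the toy frames are hypothetical; `validate="one_to_one"` is a standard `pd.merge` option that raises an error if a key is duplicated on either side.

```python
from functools import reduce

import pandas as pd

# Toy per-model prediction frames keyed by a hypothetical "id" column.
frames = {
    "a": pd.DataFrame({"id": [1, 2, 3], "prediction": [0.1, 0.7, 0.4]}),
    "b": pd.DataFrame({"id": [3, 1, 2], "prediction": [0.5, 0.2, 0.9]}),
}

# Rename each prediction column so the merged frame has one column per model.
renamed = [
    df[["id", "prediction"]].rename(columns={"prediction": f"prediction_{name}"})
    for name, df in frames.items()
]

# Inner-join all frames on the key; fail loudly if any key is duplicated.
joined = reduce(
    lambda left, right: pd.merge(left, right, on="id", how="inner", validate="one_to_one"),
    renamed,
)

# An inner join silently drops unmatched keys, so check the row count as well.
assert len(joined) == len(frames["a"])
print(joined)
```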

### Create the predictions

In [66]:
df_oof.loc[:,"prediction_average"] = df_oof.loc[:,columns].mean(axis = 1)
df_validation.loc[:,"prediction_average"] = df_validation.loc[:,columns].mean(axis = 1)

In [68]:
lr = LogisticRegressionCV(cv = 3, random_state=42)

In [69]:
aux = df_oof.merge(all_oof_df[[space_column, target_column]], on = space_column)
result = train_binary(aux, columns, target_column, lr)

Score on test set for fold 1 is :0.798
Score on test set for fold 2 is :0.795
Score on test set for fold 3 is :0.794
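
The implementation of `train_binary` is not shown in this notebook. The sketch below is a hypothetical stand-in, written only to illustrate what such a helper could do: it assumes the helper runs K-fold cross-validation, reports a ROC AUC per fold, and returns a dictionary with a fitted `"model"` and a `"data"` frame holding out-of-fold predictions. The function name, the choice of `StratifiedKFold`, and the metric are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def train_binary_sketch(df, features, target, model, n_splits=3, seed=42):
    """Hypothetical stand-in: K-fold out-of-fold predictions plus a final refit."""
    oof = pd.Series(np.nan, index=df.index, name="prediction")
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for i, (train_idx, test_idx) in enumerate(folds.split(df[features], df[target]), start=1):
        train, test = df.iloc[train_idx], df.iloc[test_idx]
        # Fit a fresh copy per fold so no fold leaks into another.
        fold_model = clone(model).fit(train[features], train[target])
        oof.iloc[test_idx] = fold_model.predict_proba(test[features])[:, 1]
        score = roc_auc_score(test[target], oof.iloc[test_idx])
        print(f"Score on test set for fold {i} is :{score:.3f}")
    # Refit on all rows; this is the model used to score new data.
    final_model = clone(model).fit(df[features], df[target])
    return {"model": final_model, "data": df.assign(prediction=oof)}


# Toy usage: two noisy "base predictions" and a target that depends on both.
rng = np.random.default_rng(0)
toy = pd.DataFrame({"p1": rng.random(300), "p2": rng.random(300)})
toy["target"] = (toy["p1"] + toy["p2"] + rng.normal(scale=0.3, size=300) > 1).astype(int)
toy_result = train_binary_sketch(toy, ["p1", "p2"], "target", LogisticRegression())
```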


In [70]:
df_validation.loc[:,"prediction_lr"] = result["model"].predict_proba(df_validation[columns])[:,1]
df_oof = df_oof.merge(result["data"][[space_column, "prediction"]], on = space_column)
df_oof.rename(columns = {"prediction":"prediction_lr"}, inplace = True)
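
The third option in the methodology, a multilayer perceptron as stacker, is not run in the cells of this section. A minimal sketch of how it could look is given below, assuming scikit-learn's `MLPClassifier` and synthetic base-model scores. The network size, the scaling step, and the train/validation split are illustrative choices, not the project's settings.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_rows, n_models = 1000, 5

# Synthetic stand-in for the base-model predictions: label signal plus noise,
# squashed into (0, 1) so each column looks like a probability.
y = rng.integers(0, 2, size=n_rows)
logits = (y[:, None] * 1.5 - 0.75) + rng.normal(scale=1.0, size=(n_rows, n_models))
base_preds = 1.0 / (1.0 + np.exp(-logits))

# First 700 rows play the role of out-of-fold data, the rest of validation data.
train, valid = slice(0, 700), slice(700, None)

# A small MLP is enough here: the input is only one column per base model.
mlp_stacker = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0),
)
mlp_stacker.fit(base_preds[train], y[train])
mlp_pred = mlp_stacker.predict_proba(base_preds[valid])[:, 1]

mlp_auc = roc_auc_score(y[valid], mlp_pred)
print(f"MLP stacker ROC AUC: {mlp_auc:.3f}")
```

With only a handful of highly correlated inputs, a non-linear stacker can overfit easily, so comparing it against the linear stacker on the same validation data is a sensible check.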

In [71]:
df_oof = df_oof.merge(all_oof_df[[space_column, target_column]], on = space_column)
df_validation = df_validation.merge(validation_df[[space_column, target_column]], on = space_column)

### Performance comparison

In [72]:
columns = ['prediction_MrMr',
 'prediction_Optuna',
 'prediction_all_features',
 'prediction_boruta',
 'prediction_ensemble',
 'prediction_average', 
 'prediction_lr']

In [73]:
# ROC AUC of every prediction column on the out-of-fold data
ls = {col: roc_auc_score(df_oof[target_column], df_oof[col]) for col in columns}

# ROC AUC of every prediction column on the validation data
lv = {col: roc_auc_score(df_validation[target_column], df_validation[col]) for col in columns}
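
As a small worked example of the per-column scoring above, the sketch below computes ROC AUC for two toy prediction columns. The frame and its column names are made up for illustration. ROC AUC is the fraction of (negative, positive) pairs that the score orders correctly, so it can be checked by hand on four rows.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Toy frame: one binary target and two hypothetical prediction columns.
toy_scores = pd.DataFrame(
    {
        "target": [0, 0, 1, 1],
        "prediction_a": [0.1, 0.4, 0.35, 0.8],
        "prediction_b": [0.2, 0.1, 0.7, 0.9],
    }
)

pred_cols = [c for c in toy_scores.columns if c.startswith("prediction_")]

# One ROC AUC per prediction column, best first.
auc_by_model = pd.Series(
    {c: roc_auc_score(toy_scores["target"], toy_scores[c]) for c in pred_cols},
    name="roc_auc",
).sort_values(ascending=False)

print(auc_by_model)
# prediction_b orders all 4 negative/positive pairs correctly -> 1.0
# prediction_a orders 3 of the 4 pairs correctly              -> 0.75
```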

In [84]:
pd.merge(pd.DataFrame(ls.items(), columns = ["model", "oof_roc_auc"]), 
         pd.DataFrame(lv.items(), columns = ["model", "validation_roc_auc"]),
         on = "model").sort_values(by = "validation_roc_auc", ascending = False).set_index("model")

| model                    | oof_roc_auc | validation_roc_auc |
|--------------------------|-------------|--------------------|
| prediction_lr            | 0.795479    | 0.801554           |
| prediction_average       | 0.795401    | 0.801386           |
| prediction_boruta        | 0.795680    | 0.800874           |
| prediction_ensemble      | 0.792880    | 0.799627           |
| prediction_MrMr          | 0.792839    | 0.799425           |
| prediction_all_features  | 0.791760    | 0.799144           |
| prediction_Optuna        | 0.788259    | 0.795938           |
