# Tutorial

In this tutorial we fairly compare a number of ensemble methods using EI's built-in nested cross-validation implementation, and show how predictions can be made with the selected final model. We then show how to interpret the model by calculating feature rankings.

### Performance analysis and selection of ensemble methods

First of all let's import some `sklearn` models, `EnsembleIntegration` and some additional ensemble methods:

In [1]:
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier
import pandas as pd
from eipy.ei import EnsembleIntegration
from eipy.additional_ensembles import MeanAggregation, CES

Next, make some dummy "multi-modal" data by splitting the features of the breast cancer dataset into two groups:

In [2]:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
feature_names = data.feature_names
X = data.data
X = pd.DataFrame(X, columns=feature_names)  # add feature names
y = np.abs(data.target - 1)  # make "malignancy" the positive class rather than "benign"

X_1 = X.iloc[:, 0:10]
X_2 = X.iloc[:, 10:]

X_1_train, X_1_test, y_train, y_test = train_test_split(X_1, y, test_size=0.2, random_state=3, stratify=y)
X_2_train, X_2_test, _, _ = train_test_split(X_2, y, test_size=0.2, random_state=3, stratify=y)
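
Both calls above use the same `random_state` and `stratify=y`, so the two modalities are split on identical rows. As a hedged alternative, `train_test_split` also accepts several arrays in one call and applies a single shared shuffle to all of them. The sketch below uses synthetic arrays (`X_a`, `X_b` and `labels` are illustrative names, not part of the tutorial data) and stores a row id in the first column so the alignment can be checked:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 50
ids = np.arange(n)

# Two synthetic "modalities"; the first column holds the row id.
X_a = np.column_stack([ids, rng.normal(size=n)])
X_b = np.column_stack([ids, rng.normal(size=n)])
labels = np.array([0, 1] * (n // 2))

# One call splits every array with the same permutation of rows.
X_a_tr, X_a_te, X_b_tr, X_b_te, y_tr, y_te = train_test_split(
    X_a, X_b, labels, test_size=0.2, random_state=3, stratify=labels
)

# The row ids of the two modalities match in both partitions.
rows_aligned = (
    np.array_equal(X_a_tr[:, 0], X_b_tr[:, 0])
    and np.array_equal(X_a_te[:, 0], X_b_te[:, 0])
)
```

Either approach works; the single call simply removes the need to keep the two sets of arguments in sync.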

Create dictionaries containing data modalities:

In [3]:
data_train = {
    "Modality_1": X_1_train,
    "Modality_2": X_2_train,
}

data_test = {
    "Modality_1": X_1_test,
    "Modality_2": X_2_test,
}

Define metrics of interest. `fmax_score` is a custom metric that outputs both a score and a corresponding threshold.

In [19]:
from eipy.metrics import fmax_score
from sklearn.metrics import roc_auc_score, matthews_corrcoef

metrics = {
    'f_max': fmax_score,
    'auc': roc_auc_score,
    'mcc': matthews_corrcoef,
}
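
To make the "score and threshold" pair concrete, here is a minimal sketch of how an F-max style metric can be computed: sweep every candidate threshold, compute the F1 score at each one, and keep the best. `fmax_sketch` is a hypothetical helper written for illustration; it is not `eipy.metrics.fmax_score` and may differ from it in details such as tie handling.

```python
import numpy as np

def fmax_sketch(y_true, y_score):
    """Return (best F1 score, threshold achieving it). Illustrative only."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    best_f, best_t = 0.0, None
    # Every distinct predicted score is a candidate threshold.
    for t in np.unique(y_score):
        y_hat = y_score >= t
        tp = np.sum(y_hat & (y_true == 1))
        fp = np.sum(y_hat & (y_true == 0))
        fn = np.sum(~y_hat & (y_true == 1))
        if tp == 0:
            continue  # F1 is zero (or undefined) without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f = 2 * precision * recall / (precision + recall)
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t

score, threshold = fmax_sketch([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because a metric of this kind returns a tuple, the tutorial later wraps `fmax_score` in a lambda that keeps only element `[0]` wherever a plain scalar score is expected.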

Define base predictors:

In [20]:
base_predictors = {
    'ADAB': AdaBoostClassifier(),
    'XGB': XGBClassifier(),
    'DT': DecisionTreeClassifier(),
    'RF': RandomForestClassifier(),
    'GB': GradientBoostingClassifier(),
    'KNN': KNeighborsClassifier(),
    'LR': LogisticRegression(),
    'NB': GaussianNB(),
    'MLP': MLPClassifier(),
    'SVM': SVC(probability=True),
}

Initialise Ensemble Integration:

In [6]:
EI = EnsembleIntegration(
    base_predictors=base_predictors,
    k_outer=5,
    k_inner=5,
    n_samples=1,
    sampling_strategy="undersampling",
    sampling_aggregation="mean",
    n_jobs=-1,
    metrics=metrics,
    random_state=38,
    project_name="breast_cancer",
    model_building=True,
)
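
The `k_outer` and `k_inner` arguments control the nested cross-validation. The sketch below shows only the index bookkeeping of a generic nested scheme, under the assumption that inner-fold predictions are used to build meta training data and outer test folds are held out for evaluation. `nested_folds` is a hypothetical helper, the split is unstratified for brevity, and none of this is taken from the `eipy` source.

```python
import numpy as np

def nested_folds(n_samples, k_outer, k_inner, seed=0):
    """Yield (outer_train, outer_test, inner_splits) index arrays."""
    rng = np.random.default_rng(seed)
    outer = np.array_split(rng.permutation(n_samples), k_outer)
    for i, outer_test in enumerate(outer):
        # Everything outside the current outer test fold is training data.
        outer_train = np.concatenate([f for j, f in enumerate(outer) if j != i])
        # The outer training rows are split again into inner folds.
        inner = np.array_split(rng.permutation(outer_train), k_inner)
        inner_splits = []
        for a, inner_val in enumerate(inner):
            inner_train = np.concatenate([f for b, f in enumerate(inner) if b != a])
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits

folds = list(nested_folds(n_samples=20, k_outer=5, k_inner=5))
```

In a scheme like this, each row of an outer training set appears in exactly one inner validation fold, so every meta training example comes from a base predictor that never saw that row during fitting.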

Fit base predictors on each modality. Remember to include the unique modality name.

In [7]:
for name, modality in data_train.items():
    EI.fit_base(modality, y_train, modality_name=name)

Training base predictors on Modality_1...

... for ensemble performance analysis...

Generating meta training data: |██████████|100%
Generating meta test data: |██████████|100%

... for final ensemble...

Generating meta training data: |██████████|100%
Training final base predictors: |██████████|100%

Training base predictors on Modality_2...

... for ensemble performance analysis...

Generating meta training data: |██████████|100%
Generating meta test data: |██████████|100%

... for final ensemble...

Generating meta training data: |██████████|100%
Training final base predictors: |██████████|100%

We can check the cross-validated performance of each base predictor on each modality with the `base_summary` dictionary. The metric scores are stored in a dataframe under the `metrics` key. The corresponding thresholds, which are used to convert predicted probabilities into class labels, can be accessed with the `thresholds` key.

In [8]:
EI.base_summary['metrics']

| modality | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_1 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 | Modality_2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| base predictor | ADAB | DT | GB | KNN | LR | MLP | NB | RF | SVM | XGB | ADAB | DT | GB | KNN | LR | MLP | NB | RF | SVM | XGB |
| f_max | 0.918129 | 0.890173 | 0.919075 | 0.836013 | 0.882022 | 0.825397 | 0.896359 | 0.912181 | 0.844156 | 0.927711 | 0.952663 | 0.902857 | 0.935385 | 0.896755 | 0.936047 | 0.895522 | 0.935294 | 0.941896 | 0.88254 | 0.943284 |
| auc | 0.977802 | 0.914345 | 0.977564 | 0.917214 | 0.97032 | 0.924272 | 0.972621 | 0.974592 | 0.935459 | 0.979608 | 0.984128 | 0.926109 | 0.984004 | 0.958524 | 0.988524 | 0.963715 | 0.987255 | 0.987513 | 0.96613 | 0.984727 |
| mcc | 0.856092 | 0.823152 | 0.856725 | 0.661425 | 0.79451 | 0.681488 | 0.801473 | 0.856725 | 0.709596 | 0.869794 | 0.911191 | 0.843132 | 0.887612 | 0.835469 | 0.882481 | 0.834907 | 0.877393 | 0.884048 | 0.785839 | 0.888982 |
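
A summary of this shape can be sliced with ordinary `pandas` indexing. The sketch below assumes the dataframe has two column levels named `modality` and `base predictor`, as the output suggests. It rebuilds a small excerpt by hand (three predictors, two metrics, values copied from the summary) rather than calling `eipy`, so the real object may be organised differently.

```python
import pandas as pd

# Hand-built excerpt of the summary with two-level columns.
columns = pd.MultiIndex.from_product(
    [["Modality_1", "Modality_2"], ["ADAB", "RF", "XGB"]],
    names=["modality", "base predictor"],
)
summary = pd.DataFrame(
    [
        [0.918129, 0.912181, 0.927711, 0.952663, 0.941896, 0.943284],  # f_max
        [0.977802, 0.974592, 0.979608, 0.984128, 0.987513, 0.984727],  # auc
    ],
    index=["f_max", "auc"],
    columns=columns,
)

# Selecting on the first column level returns all predictors for one modality.
modality_1 = summary["Modality_1"]

# Best base predictor per modality according to f_max.
best_by_fmax = summary.loc["f_max"].unstack("modality").idxmax()
```

In this excerpt `best_by_fmax` picks `XGB` for `Modality_1` and `ADAB` for `Modality_2`, which agrees with the full summary.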


Now let's define some meta models for stacked generalization. We add an "S." prefix to the keys of stacking algorithms.

In [13]:
meta_predictors = {
    'Mean': MeanAggregation(),
    'CES': CES(scoring=lambda y_test, y_pred: fmax_score(y_test, y_pred)[0]),
    'S.ADAB': AdaBoostClassifier(),
    'S.XGB': XGBClassifier(),
    'S.DT': DecisionTreeClassifier(),
    'S.RF': RandomForestClassifier(),
    'S.GB': GradientBoostingClassifier(),
    'S.KNN': KNeighborsClassifier(),
    'S.LR': LogisticRegression(),
    'S.NB': GaussianNB(),
    'S.MLP': MLPClassifier(),
    'S.SVM': SVC(probability=True),
}
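
The two non-stacking entries can be pictured with a short sketch. It assumes that `Mean` averages the base predictors' probabilities and that `CES` is a greedy, Caruana-style ensemble selection driven by the `scoring` function; the tutorial does not spell either out, so treat `mean_aggregation` and `greedy_selection` as hypothetical stand-ins rather than the `eipy` classes.

```python
import numpy as np

def mean_aggregation(base_probs):
    """Average the base predictors' probabilities (one column per predictor)."""
    return np.mean(base_probs, axis=1)

def greedy_selection(base_probs, y_true, score_fn, max_size=5):
    """Greedily add predictors (with replacement) while the score improves."""
    selected, best_score = [], -np.inf
    for _ in range(max_size):
        # Score the current ensemble extended by each candidate in turn.
        scores = [
            score_fn(y_true, base_probs[:, selected + [j]].mean(axis=1))
            for j in range(base_probs.shape[1])
        ]
        j_best = int(np.argmax(scores))
        if scores[j_best] <= best_score:
            break  # no candidate improves the ensemble any further
        selected.append(j_best)
        best_score = scores[j_best]
    return selected, best_score

# Toy data: 4 samples, 3 base predictors (the third is poor).
y = np.array([0, 0, 1, 1])
probs = np.array([
    [0.2, 0.4, 0.9],
    [0.6, 0.1, 0.8],
    [0.7, 0.6, 0.1],
    [0.9, 0.3, 0.2],
])
accuracy = lambda y_true, p: float(np.mean((p >= 0.5) == (y_true == 1)))

mean_probs = mean_aggregation(probs)
selected, best_score = greedy_selection(probs, y, accuracy)
```

On this toy data the greedy procedure keeps the first two predictors and ignores the third, whereas plain averaging lets the poor predictor pull every probability towards 0.5.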

Fit meta models:

In [14]:
EI.fit_meta(meta_predictors=meta_predictors)

Analyzing ensembles: |██████████|100%
Training final meta models: |██████████|100%


<eipy.ei.EnsembleIntegration at 0x7f0f81171e50>

Check the meta summary with `meta_summary`:

In [15]:
EI.meta_summary['metrics']

| | Mean | CES | S.ADAB | S.XGB | S.DT | S.RF | S.GB | S.KNN | S.LR | S.NB | S.MLP | S.SVM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f_max | 0.93913 | 0.946429 | 0.936047 | 0.94864 | 0.935294 | 0.953846 | 0.950725 | 0.94362 | 0.953488 | 0.94186 | 0.956012 | 0.949853 |
| auc | 0.98871 | 0.989267 | 0.981424 | 0.984727 | 0.948349 | 0.985614 | 0.979567 | 0.975697 | 0.988338 | 0.975728 | 0.98324 | 0.979794 |
| mcc | 0.902221 | 0.910688 | 0.88228 | 0.915499 | 0.896698 | 0.910506 | 0.910506 | 0.910557 | 0.910557 | 0.897779 | 0.924733 | 0.910506 |


The MLP stacking algorithm (`S.MLP`) has the highest $\text{F}_\text{max}$ score (0.956), a metric well suited to imbalanced datasets, so let's select it as our final model.
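
The same choice can be made programmatically. The sketch below assumes the meta summary is a dataframe with metrics as rows and ensemble keys as columns, and rebuilds a four-column excerpt by hand with values copied from the summary above:

```python
import pandas as pd

# Hand-copied excerpt of the meta summary (metrics as rows).
meta_metrics = pd.DataFrame(
    {
        "Mean": [0.93913, 0.98871],
        "CES": [0.946429, 0.989267],
        "S.LR": [0.953488, 0.988338],
        "S.MLP": [0.956012, 0.98324],
    },
    index=["f_max", "auc"],
)

# Key of the ensemble with the highest score for each metric.
best_by_fmax = meta_metrics.loc["f_max"].idxmax()
best_by_auc = meta_metrics.loc["auc"].idxmax()

# Full ordering by f_max, best first.
fmax_ranking = meta_metrics.loc["f_max"].sort_values(ascending=False)
```

Here `best_by_fmax` is `S.MLP` while `best_by_auc` is `CES`, a reminder that the selected model depends on which metric is treated as primary.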

### Predictions on unseen data

Since we ran EI with `model_building=True`, we can make predictions. Let's predict the test set and apply the $\text{F}_\text{max}$ threshold calculated during training:

In [16]:
y_pred = EI.predict(X_dict=data_test, meta_model_key='S.MLP')

threshold = EI.meta_summary['thresholds']['S.MLP']['f_max']

y_pred[y_pred>=threshold] = 1
y_pred[y_pred<threshold] = 0

print(y_pred)

[0. 1. 0. 1. 0. 1. 1. 0. 1. 0. 0. 0. 0. 1. 0. 0. 1. 1. 0. 1. 0. 1. 1. 0.
 1. 1. 1. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0. 0. 1. 1. 0.
 1. 0. 0. 0. 1. 1. 0. 1. 0. 1. 1. 1. 1. 1. 1. 0. 0. 0.]
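
The in-place assignment above overwrites the predicted probabilities with labels. If the probabilities are still needed, for example to compute AUC on the test set, a non-mutating variant is a one-liner. `apply_threshold` is a hypothetical helper and the numbers below are made up for illustration:

```python
import numpy as np

def apply_threshold(probabilities, threshold):
    """Return 0/1 labels without modifying the input probabilities."""
    return (np.asarray(probabilities) >= threshold).astype(int)

probs = np.array([0.05, 0.62, 0.40, 0.91])
labels = apply_threshold(probs, threshold=0.4)
```

As in the cell above, a probability exactly equal to the threshold is assigned to the positive class.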



### Interpreting the final model

We now use `PermutationInterpreter` to interpret the final MLP stacked generalization model. Let's first import `PermutationInterpreter` and our chosen metric, and initialise the interpreter:

In [17]:
from eipy.interpretation import PermutationInterpreter

interpreter = PermutationInterpreter(EI=EI,
                                     metric=lambda y_test, y_pred: fmax_score(y_test, y_pred)[0],
                                     meta_predictor_keys=['S.MLP'])

Calculate feature importance scores:

In [18]:
interpreter.rank_product_score(X_dict=data_test, y=y_test)

Interpreting ensembles...

Calculating local feature ranks: |██████████|100%
Calculating local model ranks: |██████████|100%

Calculating combined rank product score...
... complete!

<eipy.interpretation.PermutationInterpreter at 0x7f10e18ab210>

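We can now inspect the features that matter most for the model's predictions. The table is sorted by feature rank, and a lower rank product score (RPS) corresponds to a more important feature: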

In [15]:
ranking_dataframe = interpreter.ensemble_feature_ranking['S.MLP']

ranking_dataframe

| | modality | feature | RPS | feature rank | ensemble method |
|---|---|---|---|---|---|
| 3 | Modality_1 | mean area | 0.081429 | 1.0 | S.MLP |
| 23 | Modality_2 | worst area | 0.1425 | 2.0 | S.MLP |
| 21 | Modality_2 | worst texture | 0.144079 | 3.0 | S.MLP |
| 7 | Modality_1 | mean concave points | 0.153778 | 4.0 | S.MLP |
| 27 | Modality_2 | worst concave points | 0.16952 | 5.0 | S.MLP |
| 6 | Modality_1 | mean concavity | 0.178708 | 6.0 | S.MLP |
| 0 | Modality_1 | mean radius | 0.212458 | 7.0 | S.MLP |
| 26 | Modality_2 | worst concavity | 0.220967 | 8.0 | S.MLP |
| 22 | Modality_2 | worst perimeter | 0.225382 | 9.0 | S.MLP |
| 1 | Modality_1 | mean texture | 0.247306 | 10.0 | S.MLP |
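
For intuition about how rankings of this kind arise, the sketch below shows the two generic ingredients: permutation importance (shuffle one feature and measure how much the metric drops) and a rank product (the geometric mean of several rank lists). Both functions are hypothetical illustrations on synthetic data. The exact procedure and normalisation used by `PermutationInterpreter` are not described in this tutorial and may differ.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Mean drop in `metric` when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its relationship with the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - metric(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

def rank_product(rank_lists):
    """Geometric mean of ranks across models (lower = more important)."""
    ranks = np.asarray(rank_lists, dtype=float)
    return np.exp(np.mean(np.log(ranks), axis=0))

# Synthetic data in which only feature 0 determines the label.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
predict = lambda data: (data[:, 0] > 0).astype(int)
accuracy = lambda y_true, y_hat: float(np.mean(y_true == y_hat))

importances = permutation_importance(predict, X, y, accuracy)
combined = rank_product([[1, 2, 3], [2, 1, 3]])
```

Only feature 0 receives a non-zero importance here, because the toy `predict` function ignores the other two columns.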
