---
* Title:  Feature Importance Ensemble
* Author: Divish Rengasamy
* Date:   14 May 2020
---

# Dependencies

* eli5 version: 0.10.1
* matplotlib version: 3.1.3
* numpy version: 1.18.2
* pandas version: 1.0.3
* plotly version: 4.7.1
* shap version: 0.35.0
* torch version: 1.4.0
* scipy version: 1.4.1
* sklearn version: 0.22
* tensorflow_addons version: 0.9.1
* captum version: 0.2.0

In [8]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly as py
import shap
from scipy import sparse
from sklearn.datasets import make_regression
from ensembleFI import ensemble_feature_importance
plt.style.use("seaborn-pastel")
import timeit
%matplotlib inline

# Generate Dataset 

In [9]:
_noise = np.arange(0,6,2)
_noise

array([0, 2, 4])

In [10]:
_informative = np.arange(20,120,20)
_informative

array([ 20,  40,  60,  80, 100])

In [11]:
_num_feats = np.arange(20,120, 40)
_num_feats

array([ 20,  60, 100])

In [12]:
_n_samples = np.arange(1000,6000,2000)
_n_samples

array([1000, 3000, 5000])

In [13]:
noise_level = list()
informative_level = list()
num_features_level = list()
for i in range(len(_noise)):
    for j in range(len(_informative)):
        for k in range(len(_num_feats)):
            noise_level.append(_noise[i])
            informative_level.append(_informative[j])
            num_features_level.append(_num_feats[k])
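
The three nested loops above enumerate every combination of noise level, informative percentage and feature count. As a minimal sketch (not part of the original notebook), the same three lists can be built with `itertools.product`. The name `grid` is introduced here purely for illustration.

```python
from itertools import product

import numpy as np

# Same experimental levels as defined in the cells above.
_noise = np.arange(0, 6, 2)            # 0, 2, 4
_informative = np.arange(20, 120, 20)  # 20, 40, 60, 80, 100 (percent)
_num_feats = np.arange(20, 120, 40)    # 20, 60, 100

# product() iterates with the last argument varying fastest,
# which matches the order of the nested for-loops.
grid = list(product(_noise, _informative, _num_feats))

# Unzip the (noise, informative, n_feats) triples into three parallel lists.
noise_level, informative_level, num_features_level = (
    list(column) for column in zip(*grid)
)

print(len(grid))
```

Each index into the three lists then identifies one experimental configuration, in the same order in which the main loop visits them.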

In [15]:
# TODO: 
#COMPARE model error vs importance error
# experimental variable: Noise, informative, number of samples, effective rank, number of features

#IMPORTANT
##################################################
# NORMALIZE RF+GB+DNN SV THEN RECALCULATE ERROR
# rank + mean (done)
##################################################
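
The note above calls for normalising the RF, GB and DNN Shapley values before recomputing the error. The sketch below shows one way this could be done, assuming min-max scaling of absolute importances and a mean absolute error against the equally scaled true coefficients. The helpers `minmax_normalise` and `importance_mae` and all numeric values are hypothetical, and the actual `ensembleFI` module may normalise differently.

```python
import numpy as np


def minmax_normalise(values):
    """Scale absolute importances to [0, 1]; a constant vector maps to zeros."""
    values = np.abs(np.asarray(values, dtype=float))
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def importance_mae(estimated, true_coef):
    """Mean absolute error between normalised importances and normalised coefficients."""
    return float(np.mean(np.abs(minmax_normalise(estimated) - minmax_normalise(true_coef))))


# Made-up values for three features, for illustration only.
true_coef = np.array([0.0, 50.0, 100.0])
rf_sv = np.array([0.1, 0.4, 0.9])
gb_sv = np.array([0.0, 2.0, 4.0])
dnn_sv = np.array([5.0, 5.0, 15.0])

# Normalise each model's importances first, then average across models.
stacked = np.vstack([minmax_normalise(v) for v in (rf_sv, gb_sv, dnn_sv)])
ensemble = stacked.mean(axis=0)

error = importance_mae(ensemble, true_coef)
print(error)
```

Normalising before averaging keeps a model whose raw importances are on a larger scale from dominating the ensemble.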

In [None]:
#num_samples = 1000
rf_mae = list()
gb_mae = list()
dnn_mae = list()

err_rf_pi = list()
err_gb_pi = list()
err_dnn_pi = list()
err_rf_sv = list() 
err_gb_sv = list()
err_dnn_sv = list()
err_dnn_ig = list()

err_mean = list()
err_median = list()
err_mode = list()
err_box = list()
err_tau = list()
err_major = list()
err_kendall = list()
err_spearman = list()
non_sig_ratio = list()

num_sam = 1000
for i in range(len(_noise)):
    for j in range(len(_informative)):
        for k in range(len(_num_feats)):
            print(f"Noise:{_noise[i]}, n_inform:{_informative[j]}%, n_feats:{_num_feats[k]}, n_samples:{num_sam}")
            start = timeit.default_timer()

            # synthetic regression problem for the current noise / informative / feature-count setting
            _features, _output, _coef = make_regression(
                n_samples=num_sam,
                # total number of features
                n_features=_num_feats[k],
                # number of informative features, derived from the percentage in _informative
                n_informative=int(_num_feats[k]*(_informative[j]/100)),
                # a single target value per observation
                n_targets=1,
                # standard deviation of the Gaussian noise applied to the output
                noise=_noise[i],
                # also return the true coefficients used to generate the data
                coef=True,
            )

            (temp_non_sig_ratio, temp_rf_mae, temp_gb_mae, temp_dnn_mae,
             temp_err_rf_pi, temp_err_gb_pi, temp_err_dnn_pi,
             temp_err_rf_sv, temp_err_gb_sv, temp_err_dnn_sv, temp_err_dnn_ig,
             temp_err_mean, temp_err_median, temp_err_mode, temp_err_box,
             temp_err_tau, temp_err_major, temp_err_kendall,
             temp_err_spearman) = ensemble_feature_importance(
                _num_feats[k], _features, _output, _coef,
                informative=int(_num_feats[k]*(_informative[j]/100)),
                noise=_noise[i],
            )

            stop = timeit.default_timer()

            print('Time: ', stop - start)
            non_sig_ratio.append(temp_non_sig_ratio)
            rf_mae.append(temp_rf_mae)
            gb_mae.append(temp_gb_mae)
            dnn_mae.append(temp_dnn_mae)
            err_rf_pi.append(temp_err_rf_pi)
            err_gb_pi.append(temp_err_gb_pi)
            err_dnn_pi.append(temp_err_dnn_pi)
            err_rf_sv.append(temp_err_rf_sv)
            err_gb_sv.append(temp_err_gb_sv)
            err_dnn_sv.append(temp_err_dnn_sv)
            err_dnn_ig.append(temp_err_dnn_ig)
            err_mean.append(temp_err_mean)
            err_median.append(temp_err_median)
            err_mode.append(temp_err_mode)
            err_box.append(temp_err_box)
            err_tau.append(temp_err_tau)
            err_major.append(temp_err_major)
            err_kendall.append(temp_err_kendall)
            err_spearman.append(temp_err_spearman)
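
The loop above relies on `make_regression(..., coef=True)` returning the ground-truth coefficients alongside the data. The following minimal sketch, for a single configuration, shows what that call returns. The `random_state=0` argument is an addition made here for reproducibility and is not used in the notebook's loop.

```python
import numpy as np
from sklearn.datasets import make_regression

n_features = 20
informative_pct = 20  # percentage of features that carry signal
n_informative = int(n_features * informative_pct / 100)

# coef=True makes the function return the coefficients of the underlying linear model.
X, y, coef = make_regression(
    n_samples=1000,
    n_features=n_features,
    n_informative=n_informative,
    n_targets=1,
    noise=0.0,
    coef=True,
    random_state=0,  # assumption: fixed seed, only for this illustration
)

# Non-informative features have a coefficient of exactly zero.
n_nonzero = int(np.count_nonzero(coef))
print(X.shape, y.shape, coef.shape, n_nonzero)
```

With `noise=0.0` the target is an exact linear combination of the features, so `coef` can serve directly as the reference against which estimated importances are scored.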
            

Noise:0, n_inform:20%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:40.61946632827824
predicted_shape:(200,) y_test shape:(200,)
gb_mae:32.62023238805669


Setting feature_perturbation = "tree_path_dependent" because no background data was given.
The sklearn.ensemble.gradient_boosting module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.ensemble. Anything that cannot be imported from sklearn.ensemble is now part of the private API.


Epoch 1/1000 => Loss: 19623.107422
Epoch 101/1000 => Loss: 12228.665039
Epoch 201/1000 => Loss: 6477.524902
Epoch 301/1000 => Loss: 3282.226562
Epoch 401/1000 => Loss: 1573.626221
Epoch 501/1000 => Loss: 743.093140
Epoch 601/1000 => Loss: 319.109619
Epoch 701/1000 => Loss: 141.048431
Epoch 801/1000 => Loss: 75.376892
Epoch 901/1000 => Loss: 22.293425
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 39.82363135901057
Instructions for updating:
If using Keras pass *_constraint arguments to layers.


RF PI error: 0.03005494686353234
GB PI error: 0.030634981907259197
DNN PI error: 0.2945860694439427
RF+GB+DNN PI error: 0.11842533273824477

RF SV error: 0.02094414894025356
GB SV error: 0.012606987552258805
DNN SV error: 0.22789883713942496
RF+GB+DNN SV error: 0.013210644492654014

GB PI error: 0.030634981907259197
GB SV error: 0.012606987552258805
GB PI + GB SV error: 0.021620984729758992

RF PI error: 0.03005494686353234
RF SV error: 0.02094414894025356
RF PI + RF SV error: 0.02549954790189295

DNN PI error: 0.2945860694439427
DNN SV error: 0.0060807969854496675
DNN IG error: 0.22789883713942496
DNN PI + DNN SV + DNN IG error: 0.1676239981982311

Dummy Average of 0.0 error: 0.14526220763881112
Dummy Average of 0.5 error: 0.44763636974754323
Dummy Average of 0.1 error: 0.8547377923611889

 All MAE (mean): 0.07222526738495233
 All MAE (median): 0.02240805866386396
 All MAE (mode): 0.07222526458226339
 All MAE (box-whiskers): 0.042788056024543365
 All MAE (tau-test): 0.036706003724693

Time:  474.21831193205435
Noise:0, n_inform:20%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:92.53724171202168
predicted_shape:(200,) y_test shape:(200,)
gb_mae:86.71708327084536


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 38701.343750
Epoch 101/1000 => Loss: 26177.570312
Epoch 201/1000 => Loss: 15747.554688
Epoch 301/1000 => Loss: 9041.829102
Epoch 401/1000 => Loss: 5017.746094
Epoch 501/1000 => Loss: 2676.749756
Epoch 601/1000 => Loss: 1371.380737
Epoch 701/1000 => Loss: 658.197510
Epoch 801/1000 => Loss: 361.709106
Epoch 901/1000 => Loss: 134.959885
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 65.17645367241573


RF PI error: 0.05992042415654597
GB PI error: 0.05932275753788156
DNN PI error: 0.14368150255928586
RF+GB+DNN PI error: 0.081289727281767

RF SV error: 0.04961042306059355
GB SV error: 0.043428473134256976
DNN SV error: 0.2465762623047339
RF+GB+DNN SV error: 0.044419105662746455

GB PI error: 0.05932275753788156
GB SV error: 0.043428473134256976
GB PI + GB SV error: 0.05137561533606927

RF PI error: 0.05992042415654597
RF SV error: 0.04961042306059355
RF PI + RF SV error: 0.054765423608569765

DNN PI error: 0.14368150255928586
DNN SV error: 0.041374391186404784
DNN IG error: 0.2465762623047339
DNN PI + DNN SV + DNN IG error: 0.13504272586293511

Dummy Average of 0.0 error: 0.10387900443249455
Dummy Average of 0.5 error: 0.44953894983476617
Dummy Average of 0.1 error: 0.8961209955675055

 All MAE (mean): 0.072917960753732
 All MAE (median): 0.053866127535833835
 All MAE (mode): 0.07291796658504321
 All MAE (box-whiskers): 0.051696676964928856
 All MAE (tau-test): 0.05352280139243038

Time:  532.3357184189954
Noise:0, n_inform:20%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:128.33210990705612
predicted_shape:(200,) y_test shape:(200,)
gb_mae:117.46122741623218


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 45528.488281
Epoch 101/1000 => Loss: 31708.136719
Epoch 201/1000 => Loss: 19830.494141
Epoch 301/1000 => Loss: 11901.843750
Epoch 401/1000 => Loss: 6876.151367
Epoch 501/1000 => Loss: 3771.346191
Epoch 601/1000 => Loss: 2088.061279
Epoch 701/1000 => Loss: 1059.648438
Epoch 801/1000 => Loss: 540.159546
Epoch 901/1000 => Loss: 368.099884
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 64.1750485109233


RF PI error: 0.06786971485780258
GB PI error: 0.076391316507782
DNN PI error: 0.15841060435520304
RF+GB+DNN PI error: 0.09737041161373915

RF SV error: 0.06329466890395663
GB SV error: 0.05782215683610455
DNN SV error: 0.1833346047779836
RF+GB+DNN SV error: 0.06506007344852385

GB PI error: 0.076391316507782
GB SV error: 0.05782215683610455
GB PI + GB SV error: 0.06710673667194328

RF PI error: 0.06786971485780258
RF SV error: 0.06329466890395663
RF PI + RF SV error: 0.0655821918808796

DNN PI error: 0.15841060435520304
DNN SV error: 0.07610604621821165
DNN IG error: 0.1833346047779836
DNN PI + DNN SV + DNN IG error: 0.13514110248123548

Dummy Average of 0.0 error: 0.10061784387966886
Dummy Average of 0.5 error: 0.4504875544734357
Dummy Average of 0.1 error: 0.8993821561203312

 All MAE (mean): 0.07906338758629738
 All MAE (median): 0.07260008973466575
 All MAE (mode): 0.0790633658128579
 All MAE (box-whiskers): 0.06505448546980085
 All MAE (tau-test): 0.06561361219730119

Time:  608.7375983649981
Noise:0, n_inform:40%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:46.20903178117315
predicted_shape:(200,) y_test shape:(200,)
gb_mae:40.74969579109707


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 22833.865234
Epoch 101/1000 => Loss: 14316.892578
Epoch 201/1000 => Loss: 7554.087891
Epoch 301/1000 => Loss: 3780.575928
Epoch 401/1000 => Loss: 1830.951538
Epoch 501/1000 => Loss: 886.846741
Epoch 601/1000 => Loss: 800.643433
Epoch 701/1000 => Loss: 135.949768
Epoch 801/1000 => Loss: 47.652706
Epoch 901/1000 => Loss: 37.370987
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 32.43215043385238


RF PI error: 0.08007956645003597
GB PI error: 0.07099262033960861
DNN PI error: 0.324940329749411
RF+GB+DNN PI error: 0.152033548135501

RF SV error: 0.06661735247585246
GB SV error: 0.04720914520273598
DNN SV error: 0.3035326591718968
RF+GB+DNN SV error: 0.03854037763956864

GB PI error: 0.07099262033960861
GB SV error: 0.04720914520273598
GB PI + GB SV error: 0.05896354078091488

RF PI error: 0.08007956645003597
RF SV error: 0.06661735247585246
RF PI + RF SV error: 0.07334845946294422

DNN PI error: 0.324940329749411
DNN SV error: 0.00881748880759024
DNN IG error: 0.3035326591718968
DNN PI + DNN SV + DNN IG error: 0.2069384275051625

Dummy Average of 0.0 error: 0.1859346979533052
Dummy Average of 0.5 error: 0.4055965259209879
Dummy Average of 0.1 error: 0.8140653020466948

 All MAE (mean): 0.10186264322959601
 All MAE (median): 0.06775922117957156
 All MAE (mode): 0.10186263844493548
 All MAE (box-whiskers): 0.06568733735705728
 All MAE (tau-test): 0.06001856058732362

Time:  456.81782620900776
Noise:0, n_inform:40%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:158.9903414323364
predicted_shape:(200,) y_test shape:(200,)
gb_mae:148.64790995663475


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 70476.976562
Epoch 101/1000 => Loss: 49003.089844
Epoch 201/1000 => Loss: 33586.324219
Epoch 301/1000 => Loss: 22312.849609
Epoch 401/1000 => Loss: 14513.568359
Epoch 501/1000 => Loss: 9312.574219
Epoch 601/1000 => Loss: 5873.027344
Epoch 701/1000 => Loss: 3529.406494
Epoch 801/1000 => Loss: 2129.095215
Epoch 901/1000 => Loss: 1197.253296
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 84.7181915490907


RF PI error: 0.13908340158344493
GB PI error: 0.11911957679875657
DNN PI error: 0.2965569736872074
RF+GB+DNN PI error: 0.17765041757818947

RF SV error: 0.1298126464860447
GB SV error: 0.10822302908281713
DNN SV error: 0.246328829106726
RF+GB+DNN SV error: 0.09274615626247991

GB PI error: 0.11911957679875657
GB SV error: 0.10822302908281713
GB PI + GB SV error: 0.11367130294078684

RF PI error: 0.13908340158344493
RF SV error: 0.1298126464860447
RF PI + RF SV error: 0.1344480240347448

DNN PI error: 0.2965569736872074
DNN SV error: 0.058285165063709087
DNN IG error: 0.246328829106726
DNN PI + DNN SV + DNN IG error: 0.18686073599922298

Dummy Average of 0.0 error: 0.20574406381001892
Dummy Average of 0.5 error: 0.39222467484600154
Dummy Average of 0.1 error: 0.794255936189981

 All MAE (mean): 0.12901224784082216
 All MAE (median): 0.12358294278088344
 All MAE (mode): 0.12901225902770716
 All MAE (box-whiskers): 0.11834361466140311
 All MAE (tau-test): 0.12533805352334204

Time:  550.9991540810443
Noise:0, n_inform:40%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:208.68235995452554
predicted_shape:(200,) y_test shape:(200,)
gb_mae:197.58962515489915


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 108675.710938
Epoch 101/1000 => Loss: 87453.796875
Epoch 201/1000 => Loss: 66918.601562
Epoch 301/1000 => Loss: 47671.058594
Epoch 401/1000 => Loss: 33213.183594
Epoch 501/1000 => Loss: 23217.687500
Epoch 601/1000 => Loss: 16141.665039
Epoch 701/1000 => Loss: 11052.888672
Epoch 801/1000 => Loss: 7414.513672
Epoch 901/1000 => Loss: 5124.335938
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 107.17991307364322


RF PI error: 0.1291787371319362
GB PI error: 0.13091666740253388
DNN PI error: 0.27087718831153446
RF+GB+DNN PI error: 0.1683718309079936

RF SV error: 0.12714448019907984
GB SV error: 0.11343837188427049
DNN SV error: 0.1654402605231856
RF+GB+DNN SV error: 0.1147967883729079

GB PI error: 0.13091666740253388
GB SV error: 0.11343837188427049
GB PI + GB SV error: 0.12107955784181917

RF PI error: 0.1291787371319362
RF SV error: 0.12714448019907984
RF PI + RF SV error: 0.12631811017222358

DNN PI error: 0.27087718831153446
DNN SV error: 0.12383624505824133
DNN IG error: 0.1654402605231856
DNN PI + DNN SV + DNN IG error: 0.17248184907175512

Dummy Average of 0.0 error: 0.1774337533737018
Dummy Average of 0.5 error: 0.4056752682552778
Dummy Average of 0.1 error: 0.8225662466262982

 All MAE (mean): 0.11817763634230094
 All MAE (median): 0.12436970108945598
 All MAE (mode): 0.11817761743317769
 All MAE (box-whiskers): 0.12593876547902927
 All MAE (tau-test): 0.13062273324015647

Time:  648.3572620769846
Noise:0, n_inform:60%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:104.40941262461133
predicted_shape:(200,) y_test shape:(200,)
gb_mae:87.89268234986706


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 44364.199219
Epoch 101/1000 => Loss: 31534.953125
Epoch 201/1000 => Loss: 20213.859375
Epoch 301/1000 => Loss: 12626.438477
Epoch 401/1000 => Loss: 7785.760742
Epoch 501/1000 => Loss: 4683.794922
Epoch 601/1000 => Loss: 2705.598389
Epoch 701/1000 => Loss: 1444.914551
Epoch 801/1000 => Loss: 716.147034
Epoch 901/1000 => Loss: 387.042969
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 59.31400976533996


RF PI error: 0.15359609284062284
GB PI error: 0.11053643367876234
DNN PI error: 0.5689853042902777
RF+GB+DNN PI error: 0.27154125411509045

RF SV error: 0.11705978746071644
GB SV error: 0.07548373846325614
DNN SV error: 0.2382152680454715
RF+GB+DNN SV error: 0.06272431281382344

GB PI error: 0.11053643367876234
GB SV error: 0.07548373846325614
GB PI + GB SV error: 0.09301008607100923

RF PI error: 0.15359609284062284
RF SV error: 0.11705978746071644
RF PI + RF SV error: 0.13394250569910365

DNN PI error: 0.5689853042902777
DNN SV error: 0.03332312985786896
DNN IG error: 0.2382152680454715
DNN PI + DNN SV + DNN IG error: 0.26407075253505374

Dummy Average of 0.0 error: 0.35794060780099957
Dummy Average of 0.5 error: 0.3693465161662152
Dummy Average of 0.1 error: 0.6420593921990003

 All MAE (mean): 0.16914989618157789
 All MAE (median): 0.1270801489484113
 All MAE (mode): 0.16896402678504044
 All MAE (box-whiskers): 0.1350068853199732
 All MAE (tau-test): 0.12283824029185313

Time:  495.12662701902445
Noise:0, n_inform:60%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:210.29942561447118
predicted_shape:(200,) y_test shape:(200,)
gb_mae:197.8669101086064


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 111582.781250
Epoch 101/1000 => Loss: 86751.796875
Epoch 201/1000 => Loss: 64806.574219
Epoch 301/1000 => Loss: 47476.824219
Epoch 401/1000 => Loss: 34413.984375
Epoch 501/1000 => Loss: 24547.335938
Epoch 601/1000 => Loss: 17333.070312
Epoch 701/1000 => Loss: 12204.553711
Epoch 801/1000 => Loss: 8303.990234
Epoch 901/1000 => Loss: 5713.091797
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 97.19585065553615


RF PI error: 0.1949522988283978
GB PI error: 0.17275622859250722
DNN PI error: 0.3409302221767447
RF+GB+DNN PI error: 0.2194260220112333

RF SV error: 0.18025004437118256
GB SV error: 0.14127921532564905
DNN SV error: 0.2889731004693389
RF+GB+DNN SV error: 0.13229679079700993

GB PI error: 0.17275622859250722
GB SV error: 0.14127921532564905
GB PI + GB SV error: 0.15588537778163236

RF PI error: 0.1949522988283978
RF SV error: 0.18025004437118256
RF PI + RF SV error: 0.18707670637733387

DNN PI error: 0.3409302221767447
DNN SV error: 0.10365203110908629
DNN IG error: 0.2889731004693389
DNN PI + DNN SV + DNN IG error: 0.2248801625104016

Dummy Average of 0.0 error: 0.29029250602958806
Dummy Average of 0.5 error: 0.3424223540861859
Dummy Average of 0.1 error: 0.7097074939704119

 All MAE (mean): 0.15866004487269797
 All MAE (median): 0.1687403887345835
 All MAE (mode): 0.1586600492802746
 All MAE (box-whiskers): 0.15889009725716327
 All MAE (tau-test): 0.16262593398979822

Time:  587.8957072789781
Noise:0, n_inform:60%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:298.25503621887435
predicted_shape:(200,) y_test shape:(200,)
gb_mae:291.8023990782129


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 203205.703125
Epoch 101/1000 => Loss: 171370.265625
Epoch 201/1000 => Loss: 140682.640625
Epoch 301/1000 => Loss: 110410.593750
Epoch 401/1000 => Loss: 86941.539062
Epoch 501/1000 => Loss: 68683.250000
Epoch 601/1000 => Loss: 52386.851562
Epoch 701/1000 => Loss: 39999.660156
Epoch 801/1000 => Loss: 29658.154297
Epoch 901/1000 => Loss: 22366.632812
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 171.71766107658632


RF PI error: 0.25252841136740023
GB PI error: 0.22729315901692032
DNN PI error: 0.29962090275457653
RF+GB+DNN PI error: 0.25242930644064426

RF SV error: 0.24268875150589106
GB SV error: 0.22607333520741965
DNN SV error: 0.22959953424875412
RF+GB+DNN SV error: 0.20465575941414982

GB PI error: 0.22729315901692032
GB SV error: 0.22607333520741965
GB PI + GB SV error: 0.2223003307146894

RF PI error: 0.25252841136740023
RF SV error: 0.24268875150589106
RF PI + RF SV error: 0.2475394745631486

DNN PI error: 0.29962090275457653
DNN SV error: 0.1651520779185842
DNN IG error: 0.22959953424875412
DNN PI + DNN SV + DNN IG error: 0.22298959368827184

Dummy Average of 0.0 error: 0.3103413905171269
Dummy Average of 0.5 error: 0.34777112429904555
Dummy Average of 0.1 error: 0.689658609482873

 All MAE (mean): 0.19913984935241152
 All MAE (median): 0.22289872756375428
 All MAE (mode): 0.19913983871647611
 All MAE (box-whiskers): 0.21105994610701143
 All MAE (tau-test): 0.21612315980693647

Time:  610.2261691149906
Noise:0, n_inform:80%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:76.97901547999611
predicted_shape:(200,) y_test shape:(200,)
gb_mae:70.09181805855282


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 39629.738281
Epoch 101/1000 => Loss: 27975.070312
Epoch 201/1000 => Loss: 17358.162109
Epoch 301/1000 => Loss: 10326.499023
Epoch 401/1000 => Loss: 5922.744141
Epoch 501/1000 => Loss: 3314.102295
Epoch 601/1000 => Loss: 1732.247070
Epoch 701/1000 => Loss: 867.460876
Epoch 801/1000 => Loss: 406.988647
Epoch 901/1000 => Loss: 255.400742
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 44.42616916943475


RF PI error: 0.1821690801488985
GB PI error: 0.1520040081208029
DNN PI error: 0.32726153082252607
RF+GB+DNN PI error: 0.18091909091631714

RF SV error: 0.14193981007190676
GB SV error: 0.10610587098670632
DNN SV error: 0.32873509866368306
RF+GB+DNN SV error: 0.09393517180454349

GB PI error: 0.1520040081208029
GB SV error: 0.10610587098670632
GB PI + GB SV error: 0.12905493955375458

RF PI error: 0.1821690801488985
RF SV error: 0.14193981007190676
RF PI + RF SV error: 0.16205444511040262

DNN PI error: 0.32726153082252607
DNN SV error: 0.03538078941404603
DNN IG error: 0.32873509866368306
DNN PI + DNN SV + DNN IG error: 0.20109695493559734

Dummy Average of 0.0 error: 0.35288664957461857
Dummy Average of 0.5 error: 0.29884842949779616
Dummy Average of 0.1 error: 0.6471133504253814

 All MAE (mean): 0.12857207070989762
 All MAE (median): 0.1391502057447513
 All MAE (mode): 0.1285720765711427
 All MAE (box-whiskers): 0.14046275084791676
 All MAE (tau-test): 0.14714693676527107

Time:  502.52820156898815
Noise:0, n_inform:80%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:227.74083129017825
predicted_shape:(200,) y_test shape:(200,)
gb_mae:223.65628501758306


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 141929.687500
Epoch 101/1000 => Loss: 106255.140625
Epoch 201/1000 => Loss: 79787.031250
Epoch 301/1000 => Loss: 59374.718750
Epoch 401/1000 => Loss: 44007.574219
Epoch 501/1000 => Loss: 32125.607422
Epoch 601/1000 => Loss: 23348.210938
Epoch 701/1000 => Loss: 16708.447266
Epoch 801/1000 => Loss: 11708.950195
Epoch 901/1000 => Loss: 8298.475586
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 107.16553797856206


RF PI error: 0.24316510119210638
GB PI error: 0.20596151395352075
DNN PI error: 0.34323778560458323
RF+GB+DNN PI error: 0.2417952249123043

RF SV error: 0.2317867493012969
GB SV error: 0.18476661213822612
DNN SV error: 0.33042020885750295
RF+GB+DNN SV error: 0.1688244996039971

GB PI error: 0.20596151395352075
GB SV error: 0.18476661213822612
GB PI + GB SV error: 0.19131605144920755

RF PI error: 0.24316510119210638
RF SV error: 0.2317867493012969
RF PI + RF SV error: 0.23665529404405142

DNN PI error: 0.34323778560458323
DNN SV error: 0.12506246719201225
DNN IG error: 0.33042020885750295
DNN PI + DNN SV + DNN IG error: 0.24441472989390833

Dummy Average of 0.0 error: 0.36152015620604167
Dummy Average of 0.5 error: 0.3294327573215156
Dummy Average of 0.1 error: 0.6384798437939586

 All MAE (mean): 0.1842055311270672
 All MAE (median): 0.19243546431338396
 All MAE (mode): 0.1842055639384296
 All MAE (box-whiskers): 0.20199530861989945
 All MAE (tau-test): 0.20249491668219002

Time:  574.0431990980287
Noise:0, n_inform:80%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:356.62675702120964
predicted_shape:(200,) y_test shape:(200,)
gb_mae:350.80496600377023


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 311350.531250
Epoch 101/1000 => Loss: 257223.453125
Epoch 201/1000 => Loss: 218058.734375
Epoch 301/1000 => Loss: 181541.796875
Epoch 401/1000 => Loss: 150453.031250
Epoch 501/1000 => Loss: 121951.492188
Epoch 601/1000 => Loss: 100336.812500
Epoch 701/1000 => Loss: 81099.921875
Epoch 801/1000 => Loss: 65576.429688
Epoch 901/1000 => Loss: 53307.101562
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 175.52065586818892


RF PI error: 0.3148438918200093
GB PI error: 0.2519485303486559
DNN PI error: 0.34706450935979377
RF+GB+DNN PI error: 0.2883804898600379

RF SV error: 0.3388271462290851
GB SV error: 0.3121159317939386
DNN SV error: 0.31178351687214023
RF+GB+DNN SV error: 0.27450208875328314

GB PI error: 0.2519485303486559
GB SV error: 0.3121159317939386
GB PI + GB SV error: 0.26675536006082234

RF PI error: 0.3148438918200093
RF SV error: 0.3388271462290851
RF PI + RF SV error: 0.325466371943231

DNN PI error: 0.34706450935979377
DNN SV error: 0.20474507384622645
DNN IG error: 0.31178351687214023
DNN PI + DNN SV + DNN IG error: 0.27058183452204093

Dummy Average of 0.0 error: 0.4260579837358188
Dummy Average of 0.5 error: 0.31074832047341344
Dummy Average of 0.1 error: 0.5739420162641813

 All MAE (mean): 0.25029032941354956
 All MAE (median): 0.2976472623084032
 All MAE (mode): 0.2502903091311145
 All MAE (box-whiskers): 0.2527427404418428
 All MAE (tau-test): 0.25710916415182883

Time:  705.6957046450116
Noise:0, n_inform:100%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:157.92570041991002
predicted_shape:(200,) y_test shape:(200,)
gb_mae:137.6149316553751


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 69224.460938
Epoch 101/1000 => Loss: 52347.394531
Epoch 201/1000 => Loss: 36063.078125
Epoch 301/1000 => Loss: 24243.095703
Epoch 401/1000 => Loss: 15918.988281
Epoch 501/1000 => Loss: 10154.441406
Epoch 601/1000 => Loss: 6421.067383
Epoch 701/1000 => Loss: 3966.428955
Epoch 801/1000 => Loss: 2387.546875
Epoch 901/1000 => Loss: 1442.892212
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 65.27304209126505


RF PI error: 0.24412181287169182
GB PI error: 0.20699003172995903
DNN PI error: 0.26360873749293917
RF+GB+DNN PI error: 0.18696918035150661

RF SV error: 0.21289752598174827
GB SV error: 0.16005679118604949
DNN SV error: 0.3156257508224286
RF+GB+DNN SV error: 0.12638714365970496

GB PI error: 0.20699003172995903
GB SV error: 0.16005679118604949
GB PI + GB SV error: 0.16748421290059343

RF PI error: 0.24412181287169182
RF SV error: 0.21289752598174827
RF PI + RF SV error: 0.21356800902825163

DNN PI error: 0.26360873749293917
DNN SV error: 0.035753438767539114
DNN IG error: 0.3156257508224286
DNN PI + DNN SV + DNN IG error: 0.13647937790242248

Dummy Average of 0.0 error: 0.5125517764807115
Dummy Average of 0.5 error: 0.23351349840789443
Dummy Average of 0.1 error: 0.4874482235192885

 All MAE (mean): 0.10629971589048581
 All MAE (median): 0.1697381146849253
 All MAE (mode): 0.10629965496473419
 All MAE (box-whiskers): 0.1140620656630175
 All MAE (tau-test): 0.12704150095169972

Time:  529.5343992490089
Noise:0, n_inform:100%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:292.24570113274547
predicted_shape:(200,) y_test shape:(200,)
gb_mae:275.98088563215714


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 198999.546875
Epoch 101/1000 => Loss: 165182.187500
Epoch 201/1000 => Loss: 134351.937500
Epoch 301/1000 => Loss: 105224.562500
Epoch 401/1000 => Loss: 82958.812500
Epoch 501/1000 => Loss: 65060.011719
Epoch 601/1000 => Loss: 50538.769531
Epoch 701/1000 => Loss: 39017.457031
Epoch 801/1000 => Loss: 30223.101562
Epoch 901/1000 => Loss: 22732.929688
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 135.8820425316256


RF PI error: 0.3929479726761445
GB PI error: 0.2088346431472343
DNN PI error: 0.3994769760484199
RF+GB+DNN PI error: 0.29994585003327434

RF SV error: 0.39310920127570276
GB SV error: 0.31029264367254067
DNN SV error: 0.3195189454370412
RF+GB+DNN SV error: 0.2962145603085919

GB PI error: 0.2088346431472343
GB SV error: 0.31029264367254067
GB PI + GB SV error: 0.24786272345666424

RF PI error: 0.3929479726761445
RF SV error: 0.39310920127570276
RF PI + RF SV error: 0.3926678946463624

DNN PI error: 0.3994769760484199
DNN SV error: 0.199216653447578
DNN IG error: 0.3195189454370412
DNN PI + DNN SV + DNN IG error: 0.27077309167236696

Dummy Average of 0.0 error: 0.5132976349136164
Dummy Average of 0.5 error: 0.2738957122710223
Dummy Average of 0.1 error: 0.4867023650863836

 All MAE (mean): 0.24808059574213864
 All MAE (median): 0.33690543878658635
 All MAE (mode): 0.24808059710046837
 All MAE (box-whiskers): 0.25145756031977706
 All MAE (tau-test): 0.2611111867257666

Time:  544.9480174760101
Noise:0, n_inform:100%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:394.86299327791
predicted_shape:(200,) y_test shape:(200,)
gb_mae:382.754221731405


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 298503.406250
Epoch 101/1000 => Loss: 245654.937500
Epoch 201/1000 => Loss: 205289.265625
Epoch 301/1000 => Loss: 169928.093750
Epoch 401/1000 => Loss: 140012.734375
Epoch 501/1000 => Loss: 115091.992188
Epoch 601/1000 => Loss: 93424.343750
Epoch 701/1000 => Loss: 75274.289062
Epoch 801/1000 => Loss: 60582.359375
Epoch 901/1000 => Loss: 48678.355469
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 203.56323963727243


RF PI error: 0.33489147931576085
GB PI error: 0.30383146975546294
DNN PI error: 0.3647839575960544
RF+GB+DNN PI error: 0.31294941789470193

RF SV error: 0.3772478825411041
GB SV error: 0.33720262586765387
DNN SV error: 0.31463683853468555
RF+GB+DNN SV error: 0.30324295016630737

GB PI error: 0.30383146975546294
GB SV error: 0.33720262586765387
GB PI + GB SV error: 0.31424718216400527

RF PI error: 0.33489147931576085
RF SV error: 0.3772478825411041
RF PI + RF SV error: 0.3519361809730265

DNN PI error: 0.3647839575960544
DNN SV error: 0.22209511047147942
DNN IG error: 0.31463683853468555
DNN PI + DNN SV + DNN IG error: 0.27525262869428907

Dummy Average of 0.0 error: 0.49531696192652563
Dummy Average of 0.5 error: 0.25352118015555
Dummy Average of 0.1 error: 0.5046830380734744

 All MAE (mean): 0.2652478103652687
 All MAE (median): 0.3024868650532311
 All MAE (mode): 0.26524780573941914
 All MAE (box-whiskers): 0.2847458265521686
 All MAE (tau-test): 0.2842660716892095
 All MAE (major

Time:  614.8933127599885
Noise:2, n_inform:20%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:11.743886241986793
predicted_shape:(200,) y_test shape:(200,)
gb_mae:10.633312993351728


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 8903.997070
Epoch 101/1000 => Loss: 4367.563477
Epoch 201/1000 => Loss: 1679.435303
Epoch 301/1000 => Loss: 617.751953
Epoch 401/1000 => Loss: 233.710587
Epoch 501/1000 => Loss: 78.323929
Epoch 601/1000 => Loss: 17.227158
Epoch 701/1000 => Loss: 9.255601
Epoch 801/1000 => Loss: 3.670101
Epoch 901/1000 => Loss: 1.889864
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 29.074471430154418


RF PI error: 0.008587369295122067
GB PI error: 0.0066713935700954425
DNN PI error: 0.15587595780990265
RF+GB+DNN PI error: 0.05699201324396032

RF SV error: 0.006946980739605799
GB SV error: 0.003965049630946844
DNN SV error: 0.39742607872262625
RF+GB+DNN SV error: 0.004595197076551015

GB PI error: 0.0066713935700954425
GB SV error: 0.003965049630946844
GB PI + GB SV error: 0.005289527390908959

RF PI error: 0.008587369295122067
RF SV error: 0.006946980739605799
RF PI + RF SV error: 0.0077671750173639335

DNN PI error: 0.15587595780990265
DNN SV error: 0.00602822467470584
DNN IG error: 0.39742607872262625
DNN PI + DNN SV + DNN IG error: 0.17982605598142903

Dummy Average of 0.0 error: 0.07940814783623866
Dummy Average of 0.5 error: 0.47059185216376137
Dummy Average of 0.1 error: 0.9205918521637614

 All MAE (mean): 0.07866671787554502
 All MAE (median): 0.006235456748889869
 All MAE (mode): 0.07866668084894317
 All MAE (box-whiskers): 0.007739954848291854
 All MAE (tau-test): 0.00688

Time:  510.7849401520216
Noise:2, n_inform:20%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:108.70447087861493
predicted_shape:(200,) y_test shape:(200,)
gb_mae:97.44642628331694


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 42545.859375
Epoch 101/1000 => Loss: 29019.824219
Epoch 201/1000 => Loss: 17669.140625
Epoch 301/1000 => Loss: 10337.627930
Epoch 401/1000 => Loss: 5634.400391
Epoch 501/1000 => Loss: 2971.287109
Epoch 601/1000 => Loss: 1500.399536
Epoch 701/1000 => Loss: 719.450317
Epoch 801/1000 => Loss: 346.361511
Epoch 901/1000 => Loss: 144.593277
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 63.800471242573366


RF PI error: 0.04015499637963692
GB PI error: 0.03607000855663516
DNN PI error: 0.16986402333572406
RF+GB+DNN PI error: 0.07550632987762552

RF SV error: 0.044613897858664646
GB SV error: 0.03760566756423075
DNN SV error: 0.1559775335371797
RF+GB+DNN SV error: 0.035882851434051705

GB PI error: 0.03607000855663516
GB SV error: 0.03760566756423075
GB PI + GB SV error: 0.03585653389528427

RF PI error: 0.04015499637963692
RF SV error: 0.044613897858664646
RF PI + RF SV error: 0.04195664130953704

DNN PI error: 0.16986402333572406
DNN SV error: 0.04550284788778496
DNN IG error: 0.1559775335371797
DNN PI + DNN SV + DNN IG error: 0.11335969922276799

Dummy Average of 0.0 error: 0.11721561214736556
Dummy Average of 0.5 error: 0.4598157699769393
Dummy Average of 0.1 error: 0.8827843878526345

 All MAE (mean): 0.0520134129260241
 All MAE (median): 0.0371858747823365
 All MAE (mode): 0.05201341099035935
 All MAE (box-whiskers): 0.03731680857023201
 All MAE (tau-test): 0.03738676114476029
 All 

Time:  658.254435342038
Noise:2, n_inform:20%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:184.65888642622164
predicted_shape:(200,) y_test shape:(200,)
gb_mae:166.57512714047954


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 90793.203125
Epoch 101/1000 => Loss: 64844.484375
Epoch 201/1000 => Loss: 46706.253906
Epoch 301/1000 => Loss: 32951.140625
Epoch 401/1000 => Loss: 23025.658203
Epoch 501/1000 => Loss: 15467.665039
Epoch 601/1000 => Loss: 10279.927734
Epoch 701/1000 => Loss: 6559.573242
Epoch 801/1000 => Loss: 4223.403809
Epoch 901/1000 => Loss: 2725.250732
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 100.8527821385486


RF PI error: 0.08370513901241042
GB PI error: 0.09155229477858849
DNN PI error: 0.18108884241486237
RF+GB+DNN PI error: 0.11385594705796194

RF SV error: 0.08634495638756873
GB SV error: 0.07860287397981092
DNN SV error: 0.21121195873915244
RF+GB+DNN SV error: 0.08697565644788284

GB PI error: 0.09155229477858849
GB SV error: 0.07860287397981092
GB PI + GB SV error: 0.08492783329797747

RF PI error: 0.08370513901241042
RF SV error: 0.08634495638756873
RF PI + RF SV error: 0.08488791535009113

DNN PI error: 0.18108884241486237
DNN SV error: 0.10825035087025016
DNN IG error: 0.21121195873915244
DNN PI + DNN SV + DNN IG error: 0.16044120023455968

Dummy Average of 0.0 error: 0.1253527341015838
Dummy Average of 0.5 error: 0.4534577289590128
Dummy Average of 0.1 error: 0.8746472658984161

 All MAE (mean): 0.09499318947889314
 All MAE (median): 0.0894359642524431
 All MAE (mode): 0.0949931910543237
 All MAE (box-whiskers): 0.08451726913270086
 All MAE (tau-test): 0.08711790378419611
 All MA

Time:  656.0833696059999
Noise:2, n_inform:40%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:47.1195497434212
predicted_shape:(200,) y_test shape:(200,)
gb_mae:38.448707074547684


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 28386.171875
Epoch 101/1000 => Loss: 18682.978516
Epoch 201/1000 => Loss: 10517.616211
Epoch 301/1000 => Loss: 5602.180176
Epoch 401/1000 => Loss: 2813.175537
Epoch 501/1000 => Loss: 1358.599731
Epoch 601/1000 => Loss: 636.911438
Epoch 701/1000 => Loss: 258.352478
Epoch 801/1000 => Loss: 102.475235
Epoch 901/1000 => Loss: 43.343315
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 50.947786638961404


RF PI error: 0.05861353058845027
GB PI error: 0.05089442096437561
DNN PI error: 0.3877506189998584
RF+GB+DNN PI error: 0.16313941890400252

RF SV error: 0.05716957251059999
GB SV error: 0.04247007831260309
DNN SV error: 0.4357544187722304
RF+GB+DNN SV error: 0.04076834812768659

GB PI error: 0.05089442096437561
GB SV error: 0.04247007831260309
GB PI + GB SV error: 0.04668224963848934

RF PI error: 0.05861353058845027
RF SV error: 0.05716957251059999
RF PI + RF SV error: 0.05789155154952513

DNN PI error: 0.3877506189998584
DNN SV error: 0.02375887993895038
DNN IG error: 0.4357544187722304
DNN PI + DNN SV + DNN IG error: 0.27252589174525543

Dummy Average of 0.0 error: 0.20296663004532425
Dummy Average of 0.5 error: 0.44522150828139473
Dummy Average of 0.1 error: 0.7970333699546759

 All MAE (mean): 0.12329695305195817
 All MAE (median): 0.054515785337665476
 All MAE (mode): 0.12329693830912569
 All MAE (box-whiskers): 0.07726438384297699
 All MAE (tau-test): 0.07002711374376386
 All M

Time:  506.67176012397977
Noise:2, n_inform:40%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:155.51574500680988
predicted_shape:(200,) y_test shape:(200,)
gb_mae:137.640610017041


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 56256.191406
Epoch 101/1000 => Loss: 39421.800781
Epoch 201/1000 => Loss: 25861.732422
Epoch 301/1000 => Loss: 16333.496094
Epoch 401/1000 => Loss: 9951.230469
Epoch 501/1000 => Loss: 5983.395508
Epoch 601/1000 => Loss: 3579.401611
Epoch 701/1000 => Loss: 2159.992432
Epoch 801/1000 => Loss: 1200.841309
Epoch 901/1000 => Loss: 652.702454
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 72.740477950633


RF PI error: 0.11123978007026328
GB PI error: 0.10900692035831742
DNN PI error: 0.2356514545019647
RF+GB+DNN PI error: 0.14545129836863166

RF SV error: 0.10634665347019386
GB SV error: 0.09301561362142334
DNN SV error: 0.1803879149899198
RF+GB+DNN SV error: 0.08020938646431784

GB PI error: 0.10900692035831742
GB SV error: 0.09301561362142334
GB PI + GB SV error: 0.09998458171624658

RF PI error: 0.11123978007026328
RF SV error: 0.10634665347019386
RF PI + RF SV error: 0.10778131424568932

DNN PI error: 0.2356514545019647
DNN SV error: 0.04653681862816724
DNN IG error: 0.1803879149899198
DNN PI + DNN SV + DNN IG error: 0.13925919786201796

Dummy Average of 0.0 error: 0.1758844086562892
Dummy Average of 0.5 error: 0.38995752892156144
Dummy Average of 0.1 error: 0.8241155913437107

 All MAE (mean): 0.09354288698819348
 All MAE (median): 0.10263583356354315
 All MAE (mode): 0.09354287933181658
 All MAE (box-whiskers): 0.10075943347123356
 All MAE (tau-test): 0.1071382418440009
 All MAE 

Time:  570.9177588979946
Noise:2, n_inform:40%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:204.91441486789685
predicted_shape:(200,) y_test shape:(200,)
gb_mae:189.46499090104487


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 115938.257812
Epoch 101/1000 => Loss: 86198.078125
Epoch 201/1000 => Loss: 64105.898438
Epoch 301/1000 => Loss: 46513.109375
Epoch 401/1000 => Loss: 33530.605469
Epoch 501/1000 => Loss: 24040.050781
Epoch 601/1000 => Loss: 17181.507812
Epoch 701/1000 => Loss: 11803.747070
Epoch 801/1000 => Loss: 8049.910156
Epoch 901/1000 => Loss: 5554.770508
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 102.840100280458


RF PI error: 0.13676204723116359
GB PI error: 0.13871107827087556
DNN PI error: 0.24969149852239916
RF+GB+DNN PI error: 0.17131865938445034

RF SV error: 0.13938794411308478
GB SV error: 0.11915241942614324
DNN SV error: 0.22058752900815162
RF+GB+DNN SV error: 0.12024879966683281

GB PI error: 0.13871107827087556
GB SV error: 0.11915241942614324
GB PI + GB SV error: 0.12875734862642843

RF PI error: 0.13676204723116359
RF SV error: 0.13938794411308478
RF PI + RF SV error: 0.13807499567212422

DNN PI error: 0.24969149852239916
DNN SV error: 0.12009080530689137
DNN IG error: 0.22058752900815162
DNN PI + DNN SV + DNN IG error: 0.18574552678004555

Dummy Average of 0.0 error: 0.18505140019728686
Dummy Average of 0.5 error: 0.40555553932433974
Dummy Average of 0.1 error: 0.8149485998027132

 All MAE (mean): 0.131244562139843
 All MAE (median): 0.13288352570244436
 All MAE (mode): 0.13124455639984167
 All MAE (box-whiskers): 0.12555068276278525
 All MAE (tau-test): 0.12450747171003167
 All 

Time:  643.7620335710235
Noise:2, n_inform:60%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:91.83952821921088
predicted_shape:(200,) y_test shape:(200,)
gb_mae:84.77075097233417


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 47350.699219
Epoch 101/1000 => Loss: 34391.183594
Epoch 201/1000 => Loss: 22297.187500
Epoch 301/1000 => Loss: 14045.849609
Epoch 401/1000 => Loss: 8601.643555
Epoch 501/1000 => Loss: 5130.543945
Epoch 601/1000 => Loss: 3089.497803
Epoch 701/1000 => Loss: 1676.826538
Epoch 801/1000 => Loss: 943.169556
Epoch 901/1000 => Loss: 563.141418
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 54.31116029813137


RF PI error: 0.15744034499545595
GB PI error: 0.13905244228137661
DNN PI error: 0.4133581441584358
RF+GB+DNN PI error: 0.19714291996001684

RF SV error: 0.16885268445917445
GB SV error: 0.12786896076571516
DNN SV error: 0.27503716710467196
RF+GB+DNN SV error: 0.10018438264655823

GB PI error: 0.13905244228137661
GB SV error: 0.12786896076571516
GB PI + GB SV error: 0.1334607015235459

RF PI error: 0.15744034499545595
RF SV error: 0.16885268445917445
RF PI + RF SV error: 0.16256559178440538

DNN PI error: 0.4133581441584358
DNN SV error: 0.013493630192724083
DNN IG error: 0.27503716710467196
DNN PI + DNN SV + DNN IG error: 0.21421100283115427

Dummy Average of 0.0 error: 0.3580155139485234
Dummy Average of 0.5 error: 0.34066771809150953
Dummy Average of 0.1 error: 0.6419844860514765

 All MAE (mean): 0.10568108139609818
 All MAE (median): 0.13835132055794697
 All MAE (mode): 0.10568111839502776
 All MAE (box-whiskers): 0.12801624501614361
 All MAE (tau-test): 0.1331754206660063
 All MA

Time:  512.8433032609755
Noise:2, n_inform:60%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:227.616823277122
predicted_shape:(200,) y_test shape:(200,)
gb_mae:206.00135523939613


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 154813.921875
Epoch 101/1000 => Loss: 126856.632812
Epoch 201/1000 => Loss: 98385.242188
Epoch 301/1000 => Loss: 75192.687500
Epoch 401/1000 => Loss: 56385.019531
Epoch 501/1000 => Loss: 41850.828125
Epoch 601/1000 => Loss: 30574.804688
Epoch 701/1000 => Loss: 22260.345703
Epoch 801/1000 => Loss: 16164.719727
Epoch 901/1000 => Loss: 11355.410156
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 94.77840350847401


RF PI error: 0.2717189436576135
GB PI error: 0.23136992226275951
DNN PI error: 0.3330283327155705
RF+GB+DNN PI error: 0.2624689987093711

RF SV error: 0.23365157504607978
GB SV error: 0.21536539679849326
DNN SV error: 0.24030598961693056
RF+GB+DNN SV error: 0.182986496119994

GB PI error: 0.23136992226275951
GB SV error: 0.21536539679849326
GB PI + GB SV error: 0.22325781654286864

RF PI error: 0.2717189436576135
RF SV error: 0.23365157504607978
RF PI + RF SV error: 0.25257323793296066

DNN PI error: 0.3330283327155705
DNN SV error: 0.1031540196595223
DNN IG error: 0.24030598961693056
DNN PI + DNN SV + DNN IG error: 0.21075775006515637

Dummy Average of 0.0 error: 0.3474331992079678
Dummy Average of 0.5 error: 0.3671638807287195
Dummy Average of 0.1 error: 0.6525668007920323

 All MAE (mean): 0.18009149455128562
 All MAE (median): 0.22285960101252314
 All MAE (mode): 0.18009134199966634
 All MAE (box-whiskers): 0.20647849968963658
 All MAE (tau-test): 0.19338328443033934
 All MAE (maj

Time:  578.2799486040021
Noise:2, n_inform:60%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:303.08648140758896
predicted_shape:(200,) y_test shape:(200,)
gb_mae:282.68285731703344


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 244251.921875
Epoch 101/1000 => Loss: 182652.140625
Epoch 201/1000 => Loss: 149498.687500
Epoch 301/1000 => Loss: 119830.898438
Epoch 401/1000 => Loss: 94958.460938
Epoch 501/1000 => Loss: 73687.617188
Epoch 601/1000 => Loss: 58394.871094
Epoch 701/1000 => Loss: 45791.519531
Epoch 801/1000 => Loss: 34981.343750
Epoch 901/1000 => Loss: 27095.375000
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 141.7183546361281


RF PI error: 0.22978899964033345
GB PI error: 0.22715899052833702
DNN PI error: 0.3248300695018693
RF+GB+DNN PI error: 0.2510431849443029

RF SV error: 0.25011613021972623
GB SV error: 0.2379057532844153
DNN SV error: 0.2876495455733621
RF+GB+DNN SV error: 0.2221110032536809

GB PI error: 0.22715899052833702
GB SV error: 0.2379057532844153
GB PI + GB SV error: 0.23176551946946267

RF PI error: 0.22978899964033345
RF SV error: 0.25011613021972623
RF PI + RF SV error: 0.23992825608121102

DNN PI error: 0.3248300695018693
DNN SV error: 0.1846003638743645
DNN IG error: 0.2876495455733621
DNN PI + DNN SV + DNN IG error: 0.25826769716824827

Dummy Average of 0.0 error: 0.32504593592079006
Dummy Average of 0.5 error: 0.3381152481131988
Dummy Average of 0.1 error: 0.6749540640792098

 All MAE (mean): 0.2260239180833786
 All MAE (median): 0.24419833974764973
 All MAE (mode): 0.22602392355900466
 All MAE (box-whiskers): 0.22894448803875772
 All MAE (tau-test): 0.2278915760078182
 All MAE (major

Time:  653.1087473930093
Noise:2, n_inform:80%, n_feats:20, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:126.46098579100727
predicted_shape:(200,) y_test shape:(200,)
gb_mae:104.12640359144916


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 50560.335938
Epoch 101/1000 => Loss: 36925.734375
Epoch 201/1000 => Loss: 24137.197266
Epoch 301/1000 => Loss: 15245.917969
Epoch 401/1000 => Loss: 9298.278320
Epoch 501/1000 => Loss: 5571.287109
Epoch 601/1000 => Loss: 3232.620605
Epoch 701/1000 => Loss: 1854.263184
Epoch 801/1000 => Loss: 967.409058
Epoch 901/1000 => Loss: 516.769531
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 66.83144051029402


RF PI error: 0.19324153695290447
GB PI error: 0.14451873606956692
DNN PI error: 0.3483340827738473
RF+GB+DNN PI error: 0.1463932506117004

RF SV error: 0.17951224663663248
GB SV error: 0.132573619799672
DNN SV error: 0.36003087116528437
RF+GB+DNN SV error: 0.10882631136534862

GB PI error: 0.14451873606956692
GB SV error: 0.132573619799672
GB PI + GB SV error: 0.13733881661614178

RF PI error: 0.19324153695290447
RF SV error: 0.17951224663663248
RF PI + RF SV error: 0.17816616497109064

DNN PI error: 0.3483340827738473
DNN SV error: 0.03375200441590577
DNN IG error: 0.36003087116528437
DNN PI + DNN SV + DNN IG error: 0.20904874119053557

Dummy Average of 0.0 error: 0.46464044839146396
Dummy Average of 0.5 error: 0.28753222310341225
Dummy Average of 0.1 error: 0.535359551608536

 All MAE (mean): 0.073468461798148
 All MAE (median): 0.12140674875839848
 All MAE (mode): 0.07346844293416901
 All MAE (box-whiskers): 0.07601832014708151
 All MAE (tau-test): 0.08538402141873991
 All MAE (maj

Time:  527.6677720969892
Noise:2, n_inform:80%, n_feats:60, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:242.29191362070506
predicted_shape:(200,) y_test shape:(200,)
gb_mae:229.27314958075738


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 158898.046875
Epoch 101/1000 => Loss: 120683.171875
Epoch 201/1000 => Loss: 93635.820312
Epoch 301/1000 => Loss: 71752.500000
Epoch 401/1000 => Loss: 54443.554688
Epoch 501/1000 => Loss: 41012.003906
Epoch 601/1000 => Loss: 30854.287109
Epoch 701/1000 => Loss: 22822.957031
Epoch 801/1000 => Loss: 16512.814453
Epoch 901/1000 => Loss: 12085.656250
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 100.25545145004348


RF PI error: 0.29947640732294645
GB PI error: 0.25513152783326076
DNN PI error: 0.3437525886432266
RF+GB+DNN PI error: 0.27755358364856675

RF SV error: 0.2909446786093145
GB SV error: 0.2516615576104236
DNN SV error: 0.34302981204114774
RF+GB+DNN SV error: 0.21761161511724642

GB PI error: 0.25513152783326076
GB SV error: 0.2516615576104236
GB PI + GB SV error: 0.25210583202702425

RF PI error: 0.29947640732294645
RF SV error: 0.2909446786093145
RF PI + RF SV error: 0.2951012530153211

DNN PI error: 0.3437525886432266
DNN SV error: 0.14904217012392287
DNN IG error: 0.34302981204114774
DNN PI + DNN SV + DNN IG error: 0.24249798037160178

Dummy Average of 0.0 error: 0.388195078494882
Dummy Average of 0.5 error: 0.3041113592893298
Dummy Average of 0.1 error: 0.611804921505118

 All MAE (mean): 0.2103817555486717
 All MAE (median): 0.26879271513101927
 All MAE (mode): 0.21038174952633826
 All MAE (box-whiskers): 0.22919466973048616
 All MAE (tau-test): 0.2395488892452909
 All MAE (majori

Time:  566.048215304967
Noise:2, n_inform:80%, n_feats:100, n_samples:1000
predicted_shape:(200,) y_test shape:(200,)
rf_mae:362.62996181074186
predicted_shape:(200,) y_test shape:(200,)
gb_mae:334.66300315492396


Setting feature_perturbation = "tree_path_dependent" because no background data was given.


Epoch 1/1000 => Loss: 251234.515625
Epoch 101/1000 => Loss: 219817.859375
Epoch 201/1000 => Loss: 186241.281250
Epoch 301/1000 => Loss: 159385.750000
Epoch 401/1000 => Loss: 124893.156250
Epoch 501/1000 => Loss: 100920.507812
Epoch 601/1000 => Loss: 78797.656250
Epoch 701/1000 => Loss: 62146.320312
Epoch 801/1000 => Loss: 49503.808594
Epoch 901/1000 => Loss: 39174.996094
predicted_shape:(200, 1) y_test shape:(200, 1)
dnn_mae: 196.4222237261838


In [None]:
d_1 = {'non_sig_ratio':non_sig_ratio,'noise_level':noise_level, 'informative_level':informative_level,'num_features_level':num_features_level,
     'rf_mae':rf_mae,'gb_mae':gb_mae,'dnn_mae':dnn_mae,'err_rf_pi':err_rf_pi,
     'err_gb_pi':err_gb_pi,'err_dnn_pi':err_dnn_pi,'err_rf_sv':err_rf_sv,'err_gb_sv':err_gb_sv,
     'err_dnn_sv':err_dnn_sv,'err_dnn_ig':err_dnn_ig,'err_mean':err_mean,'err_median':err_median,
     'err_mode':err_mode,'err_box':err_box,'err_tau':err_tau,'err_major':err_major,'err_kendall':err_kendall,
     'err_spearman':err_spearman}

df_1000_samples_rank_avg_1 = pd.DataFrame(data=d_1)
df_1000_samples_rank_avg_1.to_csv('df_1000_samples_rank_avg_1.csv', index=False)

In [None]:
#num_samples = 1000
rf_mae = list()
gb_mae = list()
dnn_mae = list()

err_rf_pi = list()
err_gb_pi = list()
err_dnn_pi = list()
err_rf_sv = list() 
err_gb_sv = list()
err_dnn_sv = list()
err_dnn_ig = list()

err_mean = list()
err_median = list()
err_mode = list()
err_box = list()
err_tau = list()
err_major = list()
err_kendall = list()
err_spearman = list()
non_sig_ratio = list()

num_sam = 1000
for i in range(len(_noise)):
    for j in range(len(_informative)):
        for k in range(len(_num_feats)):
            print(f"Noise:{_noise[i]}, n_inform:{_informative[j]}%, n_feats:{_num_feats[k]}, n_samples:{num_sam}")
            start = timeit.default_timer()

            # number of informative features for this configuration
            n_informative = int(_num_feats[k] * (_informative[j] / 100))

            # synthetic regression problem with known ground-truth coefficients
            _features, _output, _coef = make_regression(
                n_samples=num_sam,
                # total number of features
                n_features=_num_feats[k],
                # number of features that actually contribute to the target
                n_informative=n_informative,
                # a single target value per observation
                n_targets=1,
                # standard deviation of the Gaussian noise added to the output
                noise=_noise[i],
                # also return the true coefficients used to generate the data
                coef=True,
            )

            (temp_non_sig_ratio, temp_rf_mae, temp_gb_mae, temp_dnn_mae,
             temp_err_rf_pi, temp_err_gb_pi, temp_err_dnn_pi,
             temp_err_rf_sv, temp_err_gb_sv, temp_err_dnn_sv, temp_err_dnn_ig,
             temp_err_mean, temp_err_median, temp_err_mode, temp_err_box,
             temp_err_tau, temp_err_major, temp_err_kendall,
             temp_err_spearman) = ensemble_feature_importance(
                _num_feats[k], _features, _output, _coef,
                informative=n_informative, noise=_noise[i])

            stop = timeit.default_timer()

            print('Time: ', stop - start)
            non_sig_ratio.append(temp_non_sig_ratio)
            rf_mae.append(temp_rf_mae)
            gb_mae.append(temp_gb_mae)
            dnn_mae.append(temp_dnn_mae)
            err_rf_pi.append(temp_err_rf_pi)
            err_gb_pi.append(temp_err_gb_pi)
            err_dnn_pi.append(temp_err_dnn_pi)
            err_rf_sv.append(temp_err_rf_sv)
            err_gb_sv.append(temp_err_gb_sv)
            err_dnn_sv.append(temp_err_dnn_sv)
            err_dnn_ig.append(temp_err_dnn_ig)
            err_mean.append(temp_err_mean)
            err_median.append(temp_err_median)
            err_mode.append(temp_err_mode)
            err_box.append(temp_err_box)
            err_tau.append(temp_err_tau)
            err_major.append(temp_err_major)
            err_kendall.append(temp_err_kendall)
            err_spearman.append(temp_err_spearman)
            

In [None]:
d_2 = {'non_sig_ratio':non_sig_ratio,'noise_level':noise_level, 'informative_level':informative_level,'num_features_level':num_features_level,
     'rf_mae':rf_mae,'gb_mae':gb_mae,'dnn_mae':dnn_mae,'err_rf_pi':err_rf_pi,
     'err_gb_pi':err_gb_pi,'err_dnn_pi':err_dnn_pi,'err_rf_sv':err_rf_sv,'err_gb_sv':err_gb_sv,
     'err_dnn_sv':err_dnn_sv,'err_dnn_ig':err_dnn_ig,'err_mean':err_mean,'err_median':err_median,
     'err_mode':err_mode,'err_box':err_box,'err_tau':err_tau,'err_major':err_major,'err_kendall':err_kendall,
     'err_spearman':err_spearman}

df_1000_samples_rank_avg_2 = pd.DataFrame(data=d_2)
df_1000_samples_rank_avg_2.to_csv('df_1000_samples_rank_avg_2.csv', index=False)

## Design choices

### Feature Importance Selection

### Mean Decrease Impurity
#### Gini importance, also called mean decrease in impurity (MDI), is defined as the total decrease in node impurity, weighted by the probability of reaching that node (approximated by the proportion of samples reaching it) and averaged over all trees of the ensemble. In other words, MDI computes each feature's importance by summing over all splits, across all trees, that use the feature, with each split weighted by the proportion of samples it splits.

#### At each split in each tree, the improvement in the split criterion is attributed to the splitting variable as its importance, and these improvements are accumulated over all trees in the forest, separately for each variable.
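As a minimal sketch of MDI in practice, the snippet below fits a random forest on a small synthetic problem and reads the impurity-based importances from scikit-learn's `feature_importances_` attribute. The dataset sizes, the seed, and the names `forest`, `mdi` and `informative` are illustrative assumptions and are not taken from the experiments in this notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy problem: 10 features, of which 3 are informative.
X, y, coef = make_regression(
    n_samples=500, n_features=10, n_informative=3,
    noise=2.0, coef=True, random_state=0,
)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)

# Impurity-based (MDI) importances; scikit-learn normalises them to sum to 1.
mdi = forest.feature_importances_

# Features with a non-zero true coefficient are the informative ones.
informative = np.flatnonzero(coef)
for idx in np.argsort(mdi)[::-1][:5]:
    tag = "informative" if idx in informative else "uninformative"
    print(f"feature {idx}: MDI = {mdi[idx]:.3f} ({tag})")
```

Because the true coefficients are known, the ranking can be checked directly against the ground truth, which is the same idea the experiments above rely on.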

---
### Permutation Importance / Mean Decrease Accuracy
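#### Permutation importance, also called mean decrease in accuracy (MDA), is measured on out-of-bag (OOB) samples. The values of one feature are randomly permuted, which destroys the feature's predictive power without changing its marginal distribution, and the resulting drop in model performance is taken as the importance of that feature.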
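The permutation procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses a held-out split instead of out-of-bag samples, a least-squares linear model as a stand-in for RF/GB/DNN, and MAE as the performance measure. The helper names (`add_intercept`, `predict`, `mae`, `permutation_importance_mae`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: only the first two of four features drive the target.
X = rng.normal(size=(600, 4))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=600)
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]


def add_intercept(A):
    """Append a column of ones so the linear model has an intercept."""
    return np.column_stack([A, np.ones(len(A))])


# Stand-in model: ordinary least squares fitted on the training split.
weights, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)


def predict(A):
    return add_intercept(A) @ weights


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def permutation_importance_mae(X_eval, y_eval, n_repeats=10):
    """Mean increase in MAE after shuffling each column in turn."""
    baseline = mae(y_eval, predict(X_eval))
    importances = np.zeros(X_eval.shape[1])
    for j in range(X_eval.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X_eval.copy()
            # Shuffling keeps the column's marginal distribution but
            # breaks its relationship with the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(mae(y_eval, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances


pi = permutation_importance_mae(X_test, y_test)
print(np.round(pi, 3))
```

The two features that drive the target should receive clearly positive importances, while the two unrelated features should stay close to zero.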

---
### MDA vs MDI
#### Impurity-based importance is biased towards high-cardinality features, so it tends to rank numerical features as the most important and may rank a non-predictive numerical feature highly relative to categorical features. In addition, impurity-based importances are computed from training-set statistics, so they do not reflect how useful a feature is for predictions that generalize to the test set.
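The bias described above can be illustrated with a small, hedged experiment. The data is a hypothetical toy set with one predictive binary feature and one continuous feature that is unrelated to the target. MDI is read from `feature_importances_`, and permutation importance is computed on held-out data with `sklearn.inspection.permutation_importance` (available from scikit-learn 0.22). With fully grown trees, the random continuous feature is expected to receive a noticeable MDI share because the trees split on it to fit noise, while its held-out permutation importance stays near zero.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 600

# Column 0: binary (low cardinality) and predictive.
# Column 1: continuous (high cardinality) and unrelated to the target.
x_binary = rng.integers(0, 2, size=n).astype(float)
x_random = rng.normal(size=n)
X = np.column_stack([x_binary, x_random])
y = 3.0 * x_binary + rng.normal(scale=1.0, size=n)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# MDI is derived from impurity decreases on the training set.
mdi = forest.feature_importances_

# Permutation importance is the drop in R^2 on held-out data.
perm = permutation_importance(
    forest, X_test, y_test, n_repeats=10, random_state=0
)

print("MDI:        ", np.round(mdi, 3))
print("Permutation:", np.round(perm.importances_mean, 3))
```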
---
### SHAP vs Permutation Importance
#### SHAP feature importance is an alternative to permutation feature importance, and the two measures differ substantially: permutation feature importance is based on the decrease in model performance, whereas SHAP is based on the magnitude of feature attributions. SHAP estimation that replaces feature values with values from random instances ignores feature dependence, because sampling from the marginal distribution is usually easier than sampling from the conditional one. If features are dependent, e.g. correlated, this puts too much weight on unlikely data points. On the other hand, SHAP has a solid theoretical foundation in game theory: the prediction is fairly distributed among the feature values, and the resulting explanations are contrastive, comparing the prediction with the average prediction.
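For reference, the quantities contrasted above can be written out explicitly. The notation is introduced here for illustration and is not taken from the notebook's code: $f$ is the model, $\phi_j(x)$ is the SHAP value of feature $j$ for instance $x$, $p$ is the number of features, $n$ is the number of instances, $L$ is a loss function, and $X^{\pi_j}$ is the data with column $j$ permuted.

```latex
% Fair distribution of the prediction (efficiency property):
% the attributions sum to the difference from the average prediction.
f(x) = \phi_0 + \sum_{j=1}^{p} \phi_j(x), \qquad \phi_0 = \mathbb{E}[f(X)]

% Global SHAP importance: mean magnitude of the attributions.
I_j^{\mathrm{SHAP}} = \frac{1}{n} \sum_{i=1}^{n} \left| \phi_j\!\left(x^{(i)}\right) \right|

% Permutation importance: increase in loss after permuting feature j.
I_j^{\mathrm{PI}} = L\!\left(y, f\!\left(X^{\pi_j}\right)\right) - L\!\left(y, f(X)\right)
```

The first line expresses the contrastive reading: each $\phi_j(x)$ is a share of the gap between the prediction and the average prediction. The second and third lines make the difference between the two importance measures concrete: one aggregates attribution magnitudes, the other measures a change in model performance.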

### Feature Importance Ensemble Methods