### Import packages

In [1]:
import imlreliability
import pandas as pd
import numpy as np


In [2]:
dir(imlreliability)

['__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '_version',
 'clustering',
 'dimension_reduction',
 'feature_importance']

## Feature Importance

Reliability tests of feature importance techniques can be performed with the module ``imlreliability.feature_importance``. Non-MLP techniques can be evaluated with ``feature_impoReg`` for regression tasks and ``feature_impoClass`` for classification tasks. MLP-based techniques can be evaluated with ``feature_impoReg_MLP`` for regression tasks and ``feature_impoClass_MLP`` for classification tasks. 

Model-agnostic techniques can be evaluated by passing the importance function through the ``importance_func`` parameter. 


In [3]:
dir(imlreliability.feature_importance)

['__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '_feature_impo',
 'feature_impoClass',
 'feature_impoClass_MLP',
 'feature_impoReg',
 'feature_impoReg_MLP',
 'rbo',
 'util_feature_impo']

## 2. Classification

#### Load data
We use the Madelon classification data as an example for the following sections. The data has 2000 observations and 500 features. We pre-process the data set by scaling and normalizing the predictors. 

In [4]:
from sklearn.preprocessing import scale, normalize
import tensorflow as tf
x = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/madelon/MADELON/madelon_train.data',header=None,sep=' ')
y=pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/madelon/MADELON/madelon_train.labels',header=None,sep=' ')
x=x.iloc[:,:500]
x=np.array(x)
y=y[0].tolist()

### scale and normalize data 
x = normalize(scale(x))
data_class=(x,y)
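
As a quick sanity check of the preprocessing step, the sketch below applies the same ``normalize(scale(x))`` call to a small random matrix (a toy stand-in, not the Madelon data). It shows that ``scale`` standardizes each column and that ``normalize`` then rescales each row to unit L2 norm, so the final columns are no longer exactly zero-mean.

```python
import numpy as np
from sklearn.preprocessing import scale, normalize

rng = np.random.default_rng(0)
x_toy = rng.normal(loc=5.0, scale=3.0, size=(20, 4))  # toy stand-in for the predictors

x_scaled = scale(x_toy)        # column-wise: zero mean, unit variance
x_final = normalize(x_scaled)  # row-wise: unit L2 norm (the default norm is 'l2')

col_means = x_scaled.mean(axis=0)
row_norms = np.linalg.norm(x_final, axis=1)
print(np.round(col_means, 6))
print(np.round(row_norms, 6))
```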


### 2.1. Model specific IML method

The estimator is assumed to implement the scikit-learn estimator interface. To measure feature importance, either the estimator needs to provide a ``score`` function or ``scoring`` must be passed. For example, in logistic regression, the magnitude of the coefficients is used to evaluate feature importance if no user-defined scoring function is provided. 
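
To make the coefficient-based default concrete, here is a minimal sketch of ranking features by the absolute value of logistic regression coefficients. It uses synthetic data and plain scikit-learn. How ``imlreliability`` aggregates coefficients internally is not shown in this notebook, so the ``np.abs(...).sum(axis=0)`` step is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
# Only features 0 and 2 drive the label in this toy example.
y = (2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.1 * rng.normal(size=300) > 0).astype(int)

clf = LogisticRegression(penalty='l2', max_iter=1000).fit(X, y)

# coef_ has shape (1, n_features) for a binary problem; sum |coef| over rows.
importance = np.abs(clf.coef_).sum(axis=0)
ranking = np.argsort(-importance)  # feature indices, most important first
print(ranking)
```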


#### 2.1.1. Linear model

Here we evaluate the interpretation reliability of logistic ridge regression with cross-validation, using the ``feature_impoClass`` function. We use ``LogisticRegressionCV()`` from ``sklearn`` as our estimator. By setting ``importance_func=None``, the magnitude of the coefficients is used to evaluate feature importance. 


In [5]:
from sklearn.linear_model import LogisticRegressionCV

estimator=LogisticRegressionCV(cv=5,penalty='l2',solver='saga',max_iter=100)
importance_func=None

We initialize the model with the ``imlreliability.feature_importance.feature_impoClass`` function. For illustration purposes, we run 3 repeats with a 70%/30% train/test split.
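
The repeated-split design can be pictured with scikit-learn's ``ShuffleSplit``. This is only a sketch of the idea: whether ``imlreliability`` draws its splits this way, and how ``rand_index`` seeds them, is an assumption here.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

n_obs = 2000  # number of Madelon training observations
# Assumption: 3 independent random 70%/30% splits, seeded like rand_index=1.
splitter = ShuffleSplit(n_splits=3, train_size=0.7, random_state=1)

split_sizes = []
for i, (train_idx, test_idx) in enumerate(splitter.split(np.zeros((n_obs, 1)))):
    # In each repeat, a model would be fit on train_idx and scored on test_idx.
    split_sizes.append((len(train_idx), len(test_idx)))
    print(i, len(train_idx), len(test_idx))
```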

In [6]:
model_class = imlreliability.feature_importance.feature_impoClass(data_class,estimator=estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class.fit()

0
Iter:  0
use coefs as feature importance 
1
Iter:  1
use coefs as feature importance 
2
Iter:  2
use coefs as feature importance 

The ``.get_consistency`` function produces three pandas DataFrames: ``accuracy``, the prediction accuracy on the test set; ``consistency``, the interpretation consistency measured by RBO, Jaccard score, or user-defined metrics; and ``prediction_consistency``, measured by prediction entropy and purity if ``get_prediction_consistency == True``. 

The ``consistency`` DataFrame can be saved (for example as a CSV file) and uploaded to the dashboard. 
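
To give a feel for what a top-K consistency score measures, the sketch below computes two simple agreement measures between feature rankings: a truncated average overlap (a common finite-depth relative of RBO) and the Jaccard score of the top-K sets, each averaged over all pairs of repeats. The function names are hypothetical, and this is not necessarily how the ``rbo`` module in ``imlreliability`` computes its values.

```python
from itertools import combinations

def average_overlap(rank_a, rank_b, k):
    """Mean over depths d = 1..k of |top-d(a) & top-d(b)| / d."""
    total = 0.0
    for d in range(1, k + 1):
        total += len(set(rank_a[:d]) & set(rank_b[:d])) / d
    return total / k

def jaccard_at_k(rank_a, rank_b, k):
    """Jaccard similarity of the two top-k feature sets."""
    a, b = set(rank_a[:k]), set(rank_b[:k])
    return len(a & b) / len(a | b)

def mean_pairwise(rankings, metric, k):
    """Average a pairwise metric over all pairs of repeats."""
    pairs = list(combinations(rankings, 2))
    return sum(metric(a, b, k) for a, b in pairs) / len(pairs)

# Three toy rankings (feature indices, most important first).
rankings = [[3, 1, 2, 0], [3, 2, 1, 0], [3, 1, 2, 0]]
for k in (1, 2, 3):
    print(k, mean_pairwise(rankings, average_overlap, k),
          mean_pairwise(rankings, jaccard_at_k, k))
```

A score of 1 means the repeats agree on the top-K features (and, for the overlap measure, on their order), while a score near 0 means the top-K sets barely intersect.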

In [7]:
model_class.get_consistency(data_name='Madelon', estimator_name='LogisticRidge',impotance_func_name='Coef')
print(model_class.accuracy)
print(model_class.consistency)
print(model_class.prediction_consistency)

## model_class.consistency.to_csv('consis_test_fi_class.csv')

Importance Function is  Coef_LogisticRidge
      data          model  Accuracy
0  Madelon  LogisticRidge  0.571667
1  Madelon  LogisticRidge  0.573333
2  Madelon  LogisticRidge  0.583333
       data              method criteria   K  Consistency  Accuracy
0   Madelon  Coef_LogisticRidge      RBO   1        1.000     0.576
1   Madelon  Coef_LogisticRidge      RBO   2        1.000     0.576
2   Madelon  Coef_LogisticRidge      RBO   3        0.944     0.576
3   Madelon  Coef_LogisticRidge      RBO   4        0.958     0.576
4   Madelon  Coef_LogisticRidge      RBO   5        0.947     0.576
5   Madelon  Coef_LogisticRidge      RBO   6        0.928     0.576
6   Madelon  Coef_LogisticRidge      RBO   7        0.918     0.576
7   Madelon  Coef_LogisticRidge      RBO   8        0.912     0.576
8   Madelon  Coef_LogisticRidge      RBO   9        0.910     0.576
9   Madelon  Coef_LogisticRidge      RBO  10        0.914     0.576
10  Madelon  Coef_LogisticRidge      RBO  11        0.917     0.5

#### 2.1.2. Tree-based model
Here we evaluate the interpretation reliability of a random forest, using the ``feature_impoClass`` function. We use ``RandomForestClassifier()`` from ``sklearn`` as our estimator. By setting ``importance_func=None``, the default ``feature_importances_`` attribute of ``RandomForestClassifier()`` is used to evaluate feature importance. 
All other settings are the same as for logistic regression in 2.1.1. 

In [8]:
from sklearn.ensemble import RandomForestClassifier
estimator=RandomForestClassifier()
importance_func=None

model_class_tree=imlreliability.feature_importance.feature_impoClass(data_class,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_tree.fit()
model_class_tree.get_consistency(data_name='Madelon', estimator_name='RF',impotance_func_name='FI')
print(model_class_tree.accuracy)
print(model_class_tree.consistency)
print(model_class_tree.prediction_consistency)

0
Iter:  0
use feature_importances_ as feature importance 
1
Iter:  1
use feature_importances_ as feature importance 
2
Iter:  2
use feature_importances_ as feature importance 
Importance Function is  FI_RF
      data model  Accuracy
0  Madelon    RF  0.636667
1  Madelon    RF  0.626667
2  Madelon    RF  0.688333
       data method criteria   K  Consistency  Accuracy
0   Madelon  FI_RF      RBO   1        0.500     0.651
1   Madelon  FI_RF      RBO   2        0.500     0.651
2   Madelon  FI_RF      RBO   3        0.611     0.651
3   Madelon  FI_RF      RBO   4        0.646     0.651
4   Madelon  FI_RF      RBO   5        0.637     0.651
5   Madelon  FI_RF      RBO   6        0.656     0.651
6   Madelon  FI_RF      RBO   7        0.674     0.651
7   Madelon  FI_RF      RBO   8        0.691     0.651
8   Madelon  FI_RF      RBO   9        0.707     0.651
9   Madelon  FI_RF      RBO  10        0.721     0.651
10  Madelon  FI_RF      RBO  11        0.739     0.651
11  Madelon  FI_RF      R

### 2.2. Model agnostic 
For model-agnostic feature importance, we provide built-in support for importance functions from the ``shap`` package and the permutation function from ``sklearn.inspection``. ``imlreliability`` also supports self-defined importance functions that take three arguments, ``(fitted model, training x, training y)``, and return one output, the importance scores as a list or array:

``importance_func(self.fitted,x_train, y_train)``
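
A self-defined importance function with this three-argument signature could look like the sketch below. The function ``occlusion_importance`` is a hypothetical example written for illustration, not part of ``imlreliability``: it scores each feature by the drop in training accuracy when that feature is replaced by its mean.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def occlusion_importance(fitted_model, x_train, y_train):
    """Hypothetical importance_func: accuracy drop when a feature is set to its mean."""
    x_train = np.asarray(x_train, dtype=float)
    baseline = fitted_model.score(x_train, y_train)
    scores = np.zeros(x_train.shape[1])
    for j in range(x_train.shape[1]):
        x_occluded = x_train.copy()
        x_occluded[:, j] = x_train[:, j].mean()  # remove the information in feature j
        scores[j] = baseline - fitted_model.score(x_occluded, y_train)
    return scores  # one score per feature, larger means more important

# Toy check on synthetic data where only feature 0 is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)
model = LogisticRegression(max_iter=1000).fit(X, y)
scores = occlusion_importance(model, X, y)
print(scores)
```

Based on the signature described above, such a function would presumably be passed as ``importance_func=occlusion_importance``.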

#### 2.2.1. Permutation


##### 2.2.1.1. Random Forest + Permutation
Here we use a random forest to construct the prediction model with the ``feature_impoClass`` function, and permutation as the post-hoc method to measure feature importance, by setting ``importance_func=permutation_importance``. All other settings are the same as in 2.1. 
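
For reference, a standalone call to ``permutation_importance`` outside ``imlreliability`` looks roughly like the sketch below. It uses synthetic data in which only feature 0 is informative. The choice to permute on the held-out set and the value ``n_repeats=5`` are assumptions made for this illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] > 0).astype(int)  # only feature 0 is informative
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=1)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
result = permutation_importance(rf, X_te, y_te, n_repeats=5, random_state=0)

# importances_mean holds the mean score drop per feature over the n_repeats shuffles.
perm_ranking = np.argsort(-result.importances_mean)
print(perm_ranking)
```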

In [9]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
estimator=RandomForestClassifier()
importance_func = permutation_importance ## change the importance function to be permutation 


model_class_tree_per=imlreliability.feature_importance.feature_impoClass(data_class,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_tree_per.fit()
model_class_tree_per.get_consistency(data_name='Madelon', estimator_name='RF',impotance_func_name='Permutation')
print(model_class_tree_per.accuracy)
print(model_class_tree_per.consistency)
print(model_class_tree_per.prediction_consistency)


0
Iter:  0
1
Iter:  1
2
Iter:  2
Importance Function is  Permutation_RF
      data model  Accuracy
0  Madelon    RF  0.653333
1  Madelon    RF  0.646667
2  Madelon    RF  0.681667
       data          method criteria   K  Consistency  Accuracy
0   Madelon  Permutation_RF      RBO   1          0.0     0.661
1   Madelon  Permutation_RF      RBO   2          0.0     0.661
2   Madelon  Permutation_RF      RBO   3          0.0     0.661
3   Madelon  Permutation_RF      RBO   4          0.0     0.661
4   Madelon  Permutation_RF      RBO   5          0.0     0.661
5   Madelon  Permutation_RF      RBO   6          0.0     0.661
6   Madelon  Permutation_RF      RBO   7          0.0     0.661
7   Madelon  Permutation_RF      RBO   8          0.0     0.661
8   Madelon  Permutation_RF      RBO   9          0.0     0.661
9   Madelon  Permutation_RF      RBO  10          0.0     0.661
10  Madelon  Permutation_RF      RBO  11          0.0     0.661
11  Madelon  Permutation_RF      RBO  12          0.

##### 2.2.1.2. MLP + Permutation
Here we construct an MLP model with two hidden layers as the prediction model and use permutation as the post-hoc method to measure feature importance, by setting ``importance_func=PermutationImportance`` (imported from ``eli5.sklearn``). Note that here we use the ``feature_impoClass_MLP`` function for MLP-based techniques. All other settings are the same as for logistic regression in 2.1.1. 

A default two-layer MLP is used if ``estimator = None``. We also support user-defined MLP models. 


In [10]:
from eli5.sklearn import PermutationImportance
importance_func =PermutationImportance

model_class_mlp_dl=imlreliability.feature_importance.feature_impoClass_MLP(data_class,
                                                                           
                 importance_func=PermutationImportance,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_mlp_dl.fit()
model_class_mlp_dl.get_consistency(data_name='Madelon', estimator_name='MLP',impotance_func_name='Permutation')
print(model_class_mlp_dl.accuracy)
print(model_class_mlp_dl.consistency)
print(model_class_mlp_dl.prediction_consistency)


None
Iter:  0
['eli5', 'sklearn', 'permutation_importance']
Iter:  1
['eli5', 'sklearn', 'permutation_importance']
Iter:  2
['eli5', 'sklearn', 'permutation_importance']
Importance Function is  Permutation_MLP
      data model                          Accuracy
0  Madelon   MLP   [0.6803433517615001, 0.5566667]
1  Madelon   MLP        [0.5914543062448502, 0.67]
2  Madelon   MLP  [0.4396990624566873, 0.80333334]
       data           method criteria   K  Consistency  Accuracy
0   Madelon  Permutation_MLP      RBO   1        0.000     0.624
1   Madelon  Permutation_MLP      RBO   2        0.000     0.624
2   Madelon  Permutation_MLP      RBO   3        0.000     0.624
3   Madelon  Permutation_MLP      RBO   4        0.000     0.624
4   Madelon  Permutation_MLP      RBO   5        0.000     0.624
5   Madelon  Permutation_MLP      RBO   6        0.000     0.624
6   Madelon  Permutation_MLP      RBO   7        0.010     0.624
7   Madelon  Permutation_MLP      RBO   8        0.025     0.624
8   Madelon  Permutation_MLP      RBO   9        0.034     0.624
9   Madelon  Permutation_MLP      RBO  10        0.041     0.624
10  Madelon  Permutation_MLP      RBO  11

#### 2.2.2. Shapley Value 

Here we use a random forest to construct the prediction model, and SHAP as the post-hoc method to measure feature importance, by setting ``importance_func=shap.TreeExplainer``. All other settings are the same as for logistic regression in 2.1.1. 

In [11]:
import shap

estimator=RandomForestClassifier()
importance_func = shap.TreeExplainer ## change the importance function to be SHAP 


model_class_tree_shap=imlreliability.feature_importance.feature_impoClass(data_class,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_tree_shap.fit()
model_class_tree_shap.get_consistency(data_name='Madelon', estimator_name='RF',impotance_func_name='SHAP')
print(model_class_tree_shap.accuracy)
print(model_class_tree_shap.consistency)
print(model_class_tree_shap.prediction_consistency)


0
Iter:  0
1
Iter:  1
2
Iter:  2
Importance Function is  SHAP_RF
      data model  Accuracy
0  Madelon    RF  0.648333
1  Madelon    RF  0.655000
2  Madelon    RF  0.658333
       data   method criteria   K  Consistency  Accuracy
0   Madelon  SHAP_RF      RBO   1        0.000     0.654
1   Madelon  SHAP_RF      RBO   2        0.000     0.654
2   Madelon  SHAP_RF      RBO   3        0.000     0.654
3   Madelon  SHAP_RF      RBO   4        0.000     0.654
4   Madelon  SHAP_RF      RBO   5        0.000     0.654
5   Madelon  SHAP_RF      RBO   6        0.000     0.654
6   Madelon  SHAP_RF      RBO   7        0.000     0.654
7   Madelon  SHAP_RF      RBO   8        0.000     0.654
8   Madelon  SHAP_RF      RBO   9        0.000     0.654
9   Madelon  SHAP_RF      RBO  10        0.000     0.654
10  Madelon  SHAP_RF      RBO  11        0.008     0.654
11  Madelon  SHAP_RF      RBO  12        0.015     0.654
12  Madelon  SHAP_RF      RBO  13        0.019     0.654
13  Madelon  SHAP_RF      RBO

### 2.3. MLP specific models 
We have built-in support for methods from the ``deepexplain`` and ``deeplift`` packages, as well as permutation and Shapley values. The user can pass either a function or a string as ``importance_func``. 

To run DeepExplain, pass one of the strings ``'zero'``, ``'saliency'``, ``'grad*input'``, ``'intgrad'``, ``'elrp'``, ``'deeplift'``, ``'occlusion'``, or ``'shapley_sampling'``. 
To run the corresponding functions in deeplift, pass a string from ['NonlinearMxtsMode.RevealCancel','NonlinearMxtsMode.GuidedBackprop'...]. 


The ``imlreliability`` package also supports self-defined importance functions that take three arguments, ``(fitted model, training x, training y)``, and return one output, the importance scores as a list or array: ``importance_func(model,x_train, y_train)``. 

A user-defined estimator needs to have the following form:

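```Python
from keras.models import Sequential
from keras.layers import Dense

# M (number of input features) and num_class (number of classes)
# are assumed to be defined in the enclosing scope.
def _base_model_classification():
    model = Sequential()
    # Two hidden layers of width M with ReLU activations
    model.add(Dense(M, input_dim=M, activation='relu'))
    model.add(Dense(M, input_dim=M, activation='relu'))
    # Softmax output layer, one unit per class
    model.add(Dense(num_class, activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])

    return model
```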

The trained MLP model is saved as an .h5 file. 

#### 2.3.1. Deeplift

Here we construct an MLP model with two hidden layers as the prediction model, and use deeplift as the post-hoc method to measure feature importance, by setting ``importance_func='NonlinearMxtsMode.RevealCancel'``. All other settings are the same as for logistic regression in 2.1.1. 

A default two-layer MLP is used if ``estimator = None``. We also support user-defined MLP models. Any ``deeplift.layers`` function can be used to measure feature importance by passing its string form as ``importance_func``. 


In [12]:
from deeplift.layers import NonlinearMxtsMode
import deeplift
importance_func = 'NonlinearMxtsMode.RevealCancel'

## Two-layer default MLP will be computed if estimator =None. Can input user-defined MLP model

model_class_mlp_dl=imlreliability.feature_importance.feature_impoClass_MLP(data_class,
                                                                           
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_mlp_dl.fit()
model_class_mlp_dl.get_consistency(data_name='Madelon', estimator_name='MLP',impotance_func_name='DeepLift')
print(model_class_mlp_dl.accuracy)
print(model_class_mlp_dl.consistency)
print(model_class_mlp_dl.prediction_consistency)


None
Iter:  0
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Heads-up: I assume softmax is the output layer, not an intermediate one; if it's an intermediate layer, please let me know and I will prioritise that use-case
MAKING A SESSION
Computing scores for: NonlinearMxtsMode.RevealCancel
Iter:  1
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Heads-up: I assume softmax is the output layer, not an intermediate one; if it's an intermediate layer, please let me know and I will prioritise that use-case
Computing scores for: NonlinearMxtsMode.RevealCancel
Iter:  2
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Heads-up: I assume softmax is the output layer, not an intermediate one; if it's an intermediate layer, please let me know and I will prioritise that use-case
Computing scores for: NonlinearMxtsMode.RevealCancel
Importance Function is  DeepLift_MLP
      data model                         Accuracy
0  Madelon   MLP  [0.6886192222436269, 0.5466667]
1  Madelon   MLP   

#### 2.3.2. DeepExplain
Here we construct an MLP model with two hidden layers as the prediction model, and use epsilon-LRP as the post-hoc method to measure feature importance, by setting ``importance_func='elrp'``. All other settings are the same as for logistic regression in 2.1.1. 

A default two-layer MLP is used if ``estimator = None``. We also support user-defined MLP models. Any DeepExplain function can be used to measure feature importance by passing its string form as ``importance_func``. 


In [13]:
from deepexplain.tensorflow import DeepExplain
importance_func ='elrp'

model_class_mlp_dl=imlreliability.feature_importance.feature_impoClass_MLP(data_class,
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_class_mlp_dl.fit()
model_class_mlp_dl.get_consistency(data_name='Madelon', estimator_name='MLP',impotance_func_name='elrp')
print(model_class_mlp_dl.accuracy)
print(model_class_mlp_dl.consistency)
print(model_class_mlp_dl.prediction_consistency)

None
Iter:  0
DeepExplain
Iter:  1
DeepExplain
Iter:  2
DeepExplain
Importance Function is  elrp_MLP
      data model                           Accuracy
0  Madelon   MLP         [0.6843797236680984, 0.54]
1  Madelon   MLP         [0.6057788138588269, 0.67]
2  Madelon   MLP  [0.45367496063311896, 0.78333336]
       data    method criteria   K  Consistency  Accuracy
0   Madelon  elrp_MLP      RBO   1        0.500     0.623
1   Madelon  elrp_MLP      RBO   2        0.750     0.623
2   Madelon  elrp_MLP      RBO   3        0.833     0.623
3   Madelon  elrp_MLP      RBO   4        0.812     0.623
4   Madelon  elrp_MLP      RBO   5        0.810     0.623
5   Madelon  elrp_MLP      RBO   6        0.814     0.623
6   Madelon  elrp_MLP      RBO   7        0.820     0.623
7   Madelon  elrp_MLP      RBO   8        0.835     0.623
8   Madelon  elrp_MLP      RBO   9        0.841     0.623
9   Madelon  elrp_MLP      RBO  10        0.847     0.623
10  Madelon  elrp_MLP      RBO  11        0.852     0