### Import packages

In [1]:
import imlreliability
import pandas as pd
import numpy as np#### Load Packages dir(imlreliability)

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])


## Feature Importance

Reliability test of feature importance techniques can be performed with the module imlreliability.feature_importance. Non-MLP techniques can be evaluated ``feature_impoReg`` for regression tasks and ``feature_impoClass`` for classification tasks. MLP-based techniques can be evaluated by ``feature_impoReg_MLP`` for regression tasks and ``feature_impoClass_MLP`` for classification tasks. 

Model agnostic techniques can be evaluated by specifying the parameter of importance function ``importance_func``. 


## 1. Regression

#### Load data
 We use the communities regression data as an example for the following sections. The data has 1993 observations and 99 feature. We pre-process the data set by scaling and normalizing the predictors and scaling the response. 

In [2]:
from sklearn.preprocessing import scale, normalize
communities_data = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.data').to_numpy()
communities_data = np.delete(communities_data, np.arange(5), 1)
    # remove predictors with missing values
communities_data = np.delete(communities_data,
                             np.argwhere((communities_data == '?').sum(0) > 0).reshape(-1), 1)
communities_data = communities_data.astype(float)
x = communities_data[:, :-1]
y = communities_data[:, -1]


### scale and normalize data 
x = normalize(scale(x))
y = (scale(y))
data_reg=(x,y)


### 1.1. Model specific IML method

The estimator is assumed to implement the scikit-learn estimator interface. To measure the feature importance, either estimator needs to provide a ``score`` function or ``scoring`` must be passed. For example, in linear regression, the magnitude of coefficients is used to evaluate feature importance if there is no user-defined scoring function provided. 


#### 1.1.1. Linear model

Here we aim to evaluate the interpretation reliability of Ridge regression with cross validation, using the ``feature_impoReg``function. We use ``RidgeCV()`` from ``sklearn`` as our estimator. By setting ``importance_func=None``, the magnitude of coefficients will be used to evaluate feature importance. 


In [3]:
from sklearn.linear_model import RidgeCV
estimator=RidgeCV()
importance_func=None

We initialize the model with the ``mlreliability.feature_importance.feature_impoReg`` function. For illustration purpose, we run 3 repeats with 70%/30% train/test split.

In [4]:
model_reg = imlreliability.feature_importance.feature_impoReg(data_reg,estimator=estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg.fit()

0
Iter:  0
use coefs as feature importance 
1
Iter:  1
use coefs as feature importance 
2
Iter:  2
use coefs as feature importance 


The ``.get_consistency`` function results in three pandas dataframe: ``accuracy``: prediction accuracy on test set; ``consistency``: interpretation consistency measured by RBO, Jaccard score, or user-defined metrics; and prediction_consistency measured by prediction entropy and purity if ``get_prediction_consistency ==True``. 

The ``consistency`` pandas dataframe can be downloaded and upload to the dashboard. 

In [5]:
model_reg.get_consistency(data_name='communities', estimator_name='Ridge',impotance_func_name='Coef')
print(model_reg.accuracy)
print(model_reg.consistency)
print(model_reg.prediction_consistency)

## model_reg.consistency.to_csv('consis_test_fi_reg.csv')

Importance Function is  Coef_Ridge
          data  model  Accuracy
0  communities  Ridge  0.364518
1  communities  Ridge  0.435435
2  communities  Ridge  0.356309
           data      method criteria   K  Consistency  Accuracy
0   communities  Coef_Ridge      RBO   1        1.000     0.385
1   communities  Coef_Ridge      RBO   2        0.875     0.385
2   communities  Coef_Ridge      RBO   3        0.806     0.385
3   communities  Coef_Ridge      RBO   4        0.792     0.385
4   communities  Coef_Ridge      RBO   5        0.753     0.385
5   communities  Coef_Ridge      RBO   6        0.725     0.385
6   communities  Coef_Ridge      RBO   7        0.713     0.385
7   communities  Coef_Ridge      RBO   8        0.702     0.385
8   communities  Coef_Ridge      RBO   9        0.698     0.385
9   communities  Coef_Ridge      RBO  10        0.698     0.385
10  communities  Coef_Ridge      RBO  11        0.697     0.385
11  communities  Coef_Ridge      RBO  12        0.698     0.385
12  c

#### 1.1.2. Tree-base model
Here we aim to evaluate the interpretation reliability of random forest, using the ``feature_impoReg``function.. We use ``RandomForestRegressor()`` from ``sklearn`` as our estimator. By setting ``importance_func=None``, the default feature importance ``feature_importances_`` of the ``RandomForestRegressor()`` function will be used to evaluate feature importance. 
All other settings are the same as linear regression in 1.1. 

In [7]:
from sklearn.ensemble import RandomForestRegressor
estimator=RandomForestRegressor()
importance_func=None


model_reg_tree=imlreliability.feature_importance.feature_impoReg(data_reg,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_tree.fit()
model_reg_tree.consistency(data_name='communities', estimator_name='RF',impotance_func_name='FI')
print(model_reg_tree.accuracy)
print(model_reg_tree.consistency)
print(model_reg_tree.prediction_consistency)

0
Iter:  0
use feature_importances_ as feature importance 
1
Iter:  1
use feature_importances_ as feature importance 
2
Iter:  2
use feature_importances_ as feature importance 
Importance Function is  FI_RF
          data model  Accuracy
0  communities    RF  0.346505
1  communities    RF  0.400400
2  communities    RF  0.354550
           data method criteria   K  Consistency  Accuracy
0   communities  FI_RF      RBO   1        1.000     0.367
1   communities  FI_RF      RBO   2        1.000     0.367
2   communities  FI_RF      RBO   3        1.000     0.367
3   communities  FI_RF      RBO   4        0.969     0.367
4   communities  FI_RF      RBO   5        0.935     0.367
5   communities  FI_RF      RBO   6        0.932     0.367
6   communities  FI_RF      RBO   7        0.911     0.367
7   communities  FI_RF      RBO   8        0.891     0.367
8   communities  FI_RF      RBO   9        0.872     0.367
9   communities  FI_RF      RBO  10        0.865     0.367
10  communities  FI_

### 1.2. Model agnostic 
For model agnostic methods to measure feature importance, we provide built-in importance functions from package shap and perumutation function from sklearn.inspection. The imlreliability also support elf-defined importance function, with three argument: ``(fitted model, training x, training y)``, and 1 output importance score in forms of list or array:

``importance_func(self.fitted,x_train, y_train)``

#### 1.2.1. Permutation


##### 1.2.1.1. Random Forest + Permutation
Here we use random forest to consturct the prediction model using the ``feature_impoReg``function, and permutation as the post-hoc method to measure the feature importance, by setting ``importance_func=permutation_importance``. All other settings are the same as linear regression in 1.1. 

In [5]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
estimator=RandomForestRegressor()
importance_func = permutation_importance ## change the importance function to be permutation 


model_reg_tree_per=imlreliability.feature_importance.feature_impoReg(data_reg,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_tree_per.fit()
model_reg_tree_per.consistency(data_name='communities', estimator_name='RF',impotance_func_name='Permutation')
print(model_reg_tree_per.accuracy)
print(model_reg_tree_per.consistency)
print(model_reg_tree_per.prediction_consistency)


0
Iter:  0
1
Iter:  1
2
Iter:  2
Importance Function is  Permutation_RF
          data model  Accuracy
0  communities    RF  0.347237
1  communities    RF  0.398645
2  communities    RF  0.361277
           data          method criteria   K  Consistency  Accuracy
0   communities  Permutation_RF      RBO   1        1.000     0.369
1   communities  Permutation_RF      RBO   2        1.000     0.369
2   communities  Permutation_RF      RBO   3        1.000     0.369
3   communities  Permutation_RF      RBO   4        0.938     0.369
4   communities  Permutation_RF      RBO   5        0.950     0.369
5   communities  Permutation_RF      RBO   6        0.944     0.369
6   communities  Permutation_RF      RBO   7        0.932     0.369
7   communities  Permutation_RF      RBO   8        0.909     0.369
8   communities  Permutation_RF      RBO   9        0.895     0.369
9   communities  Permutation_RF      RBO  10        0.875     0.369
10  communities  Permutation_RF      RBO  11        0.85

##### 1.2.1.2. MLP + Permutation
Here we construct a MLP model with two hidden layers as the prediction model, and permutation as the post-hoc method to measure the feature importance, by setting ``importance_func=permutation_importance``. Note that here we use the ``feature_impoReg_MLP`` function for MLP-based techniques. All other settings are the same as linear regression in 1.1.1. 

A two-layer default MLP will be computed if ``estimator = None``. We also support user-defined MLP models. 


In [6]:
from eli5.sklearn import PermutationImportance
importance_func =PermutationImportance

model_reg_mlp_dl=imlreliability.feature_importance.feature_impoReg_MLP(data_reg,
                 importance_func=PermutationImportance,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_mlp_dl.fit()
model_reg_mlp_dl.consistency(data_name='communities', estimator_name='MLP',impotance_func_name='Permutation')
print(model_reg_mlp_dl.accuracy)
print(model_reg_mlp_dl.consistency)
print(model_reg_mlp_dl.prediction_consistency)


Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.


Using TensorFlow backend.


Iter:  0
Instructions for updating:
Use tf.cast instead.
['eli5', 'sklearn', 'permutation_importance']




Iter:  1
['eli5', 'sklearn', 'permutation_importance']




Iter:  2
['eli5', 'sklearn', 'permutation_importance']






AttributeError: 'feature_impoReg_MLP' object has no attribute 'consistency'

#### 1.2.2. Shapley Value 

Here we use random forest to consturct the prediction model, and SHAP as the post-hoc method to measure the feature importance, by setting ``importance_func=shap.TreeExplainer``. All other settings are the same as linear regression in 1.1. 

In [8]:
import shap

estimator=RandomForestRegressor()
importance_func = shap.TreeExplainer ## change the importance function to be SHAP 

model_reg_tree_shap=imlreliability.feature_importance.feature_impoReg(data_reg,estimator, 
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_tree_shap.fit()
model_reg_tree_shap.consistency(data_name='communities', estimator_name='RF',impotance_func_name='SHAP')
print(model_reg_tree_shap.accuracy)
print(model_reg_tree_shap.consistency)
print(model_reg_tree_shap.prediction_consistency)


0
Iter:  0
1
Iter:  1
2
Iter:  2
Importance Function is  SHAP_RF
          data model  Accuracy
0  communities    RF  0.341917
1  communities    RF  0.409945
2  communities    RF  0.357607
           data   method criteria   K  Consistency  Accuracy
0   communities  SHAP_RF      RBO   1        1.000      0.37
1   communities  SHAP_RF      RBO   2        0.750      0.37
2   communities  SHAP_RF      RBO   3        0.722      0.37
3   communities  SHAP_RF      RBO   4        0.729      0.37
4   communities  SHAP_RF      RBO   5        0.703      0.37
5   communities  SHAP_RF      RBO   6        0.697      0.37
6   communities  SHAP_RF      RBO   7        0.700      0.37
7   communities  SHAP_RF      RBO   8        0.698      0.37
8   communities  SHAP_RF      RBO   9        0.688      0.37
9   communities  SHAP_RF      RBO  10        0.690      0.37
10  communities  SHAP_RF      RBO  11        0.685      0.37
11  communities  SHAP_RF      RBO  12        0.676      0.37
12  communities  S

### 1.3. MLP specific models 
We have built-in functions to run functions from ``deepexplain`` and  ``deeplift`` packages. As well permutation and shapley values. User can either input a function or a string from [
                        'zero',
                        'saliency',
                        'grad*input',
                        'intgrad',
                        'elrp',
                        'deeplift',
                        'occlusion',
                        'shapley_sampling'] to run deepExplain. 
Or input strings from ['NonlinearMxtsMode.RevealCancel','NonlinearMxtsMode.GuidedBackprop'...] to run corresponding functions in deeplift. 


imlreliability package also support self-defined importance function, with three argument: ``(fitted model, training x, training y)``, and 1 output importance score in forms of list or array:``importance_func(model,x_train, y_train)``. 

And the defined estimator needs to be form of :
      
```Python
def _base_model_regression():
    model = Sequential()
    model.add(Dense(M, input_dim=M, activation='relu'))
    model.add(Dense(M, input_dim=M, activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model  
```

And the trained MLP model is saved as .h5 file. 
    
    

#### 1.3.1. Deeplift

Here we construct a MLP model with two hidden layers as the prediction model, and deeplift as the post-hoc method to measure the feature importance, by setting ``importance_func='NonlinearMxtsMode.RevealCancel'``. All other settings are the same as linear regression in 1.1. 

A two-layer default MLP will be computed if ``estimator = None``. We also support user-defined MLP models. Any deeplift.layers functions can be used to measure the feature importance by setting parameter ``importance_func`` in its string form. 


In [4]:
from deeplift.layers import NonlinearMxtsMode
import deeplift
importance_func = 'NonlinearMxtsMode.RevealCancel'

model_reg_mlp_dl=imlreliability.feature_importance.feature_impoReg_MLP(data_reg,
                                                                           
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_mlp_dl.fit()
model_reg_mlp_dl.consistency(data_name='communities', estimator_name='MLP',impotance_func_name='DeepLift')
print(model_reg_mlp_dl.accuracy)
print(model_reg_mlp_dl.consistency)
print(model_reg_mlp_dl.prediction_consistency)


Iter:  0
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Computing scores for: NonlinearMxtsMode.RevealCancel
Iter:  1
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Computing scores for: NonlinearMxtsMode.RevealCancel
Iter:  2
DeepLift
nonlinear_mxts_mode is set to: RevealCancel
Computing scores for: NonlinearMxtsMode.RevealCancel
Importance Function is  DeepLift_MLP
          data model  Accuracy
0  communities   MLP  0.472432
1  communities   MLP  0.435864
2  communities   MLP  0.331513
           data        method criteria   K  Consistency  Accuracy
0   communities  DeepLift_MLP      RBO   1        0.000     0.413
1   communities  DeepLift_MLP      RBO   2        0.000     0.413
2   communities  DeepLift_MLP      RBO   3        0.000     0.413
3   communities  DeepLift_MLP      RBO   4        0.000     0.413
4   communities  DeepLift_MLP      RBO   5        0.000     0.413
5   communities  DeepLift_MLP      RBO   6        0.014     0.413
6   communities  DeepLift_MLP   

#### 1.3.2. DeepExplain
Here we construct a MLP model with two hidden layers as the prediction model, and epsilon-LRP as the post-hoc method to measure the feature importance, by setting ``importance_func='elrp'``. All other settings are the same as linear regression in 1.1. 

A two-layer default MLP will be computed if ``estimator = None``. We also support user-defined MLP models. Any DeepExplain function can be used to measure the feature importance by setting parameter ``importance_func`` in its string form. 


In [3]:
from deepexplain.tensorflow import DeepExplain
importance_func ='elrp'

## Two-layer default MLP will be computed if estimator =None. Can input user-defined MLP model

model_reg_mlp_dl=imlreliability.feature_importance.feature_impoReg_MLP(data_reg,
                                                                           
                 importance_func=importance_func,
                 n_repeat=3,split_proportion=0.7,
                rand_index=1)
model_reg_mlp_dl.fit()
model_reg_mlp_dl.consistency(data_name='communities', estimator_name='MLP',impotance_func_name='elrp')
print(model_reg_mlp_dl.accuracy)
print(model_reg_mlp_dl.consistency)
print(model_reg_mlp_dl.prediction_consistency)



Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Iter:  0
Instructions for updating:
Use tf.cast instead.
DeepExplain
Iter:  1
DeepExplain
Iter:  2
DeepExplain
Importance Function is  elrp_MLP
          data model  Accuracy
0  communities   MLP  0.448769
1  communities   MLP  0.451854
2  communities   MLP  0.333314
           data    method criteria   K  Consistency  Accuracy
0   communities  elrp_MLP      RBO   1        1.000     0.411
1   communities  elrp_MLP      RBO   2        0.750     0.411
2   communities  elrp_MLP      RBO   3        0.722     0.411
3   communities  elrp_MLP      RBO   4        0.729     0.411
4   communities  elrp_MLP      RBO   5        0.743     0.411
5   communities  elrp_MLP      RBO   6        0.744     0.411
6   communities  elrp_MLP      RBO   7        0.740     0.411
7   communities  elrp_MLP      RBO   8        0.757     0.411
8   communities  elrp_MLP      RBO   9        0.759   

## 2. Classification