Interpreting random forests:
* https://blog.datadive.net/interpreting-random-forests/
* https://blog.datadive.net/random-forest-interpretation-with-scikit-learn/
* Original paper: https://arxiv.org/pdf/1312.1121.pdf

[A Unified Approach to Interpreting Model Predictions](http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions)
* [Non-technical summary](https://towardsdatascience.com/interpretable-machine-learning-with-xgboost-9ec80d148d27)
* [Python package](https://github.com/slundberg/shap)

Other resources on general ML interpretability:
* [Beware Default Random Forest Importances](https://explained.ai/rf-importance/index.html)
* [Understanding Boosted Trees using TensorFlow](https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding)
* https://christophm.github.io/interpretable-ml-book/
* https://arxiv.org/pdf/1707.07149.pdf
    

In [3]:
import numpy as np
from treeinterpreter import treeinterpreter as ti
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
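# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this notebook needs an older scikit-learn release to run as written.
from sklearn.datasets import load_boston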


In [25]:
boston = load_boston()
print(boston['DESCR'])

.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pu

In [27]:
rf = RandomForestRegressor()
rf.fit(boston.data[:300], boston.target[:300])



RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
                      max_features='auto', max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, n_estimators=10,
                      n_jobs=None, oob_score=False, random_state=None,
                      verbose=0, warm_start=False)

In [28]:
# Two rows outside the training slice [:300], used as held-out examples.
instances = boston.data[[300, 309]]

In [30]:
instances.shape

(2, 13)

In [31]:
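# Decompose each prediction into a bias term (the mean target of the training
# data seen by the trees) plus one additive contribution per feature.
prediction, bias, contributions = ti.predict(rf, instances)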

In [32]:
prediction, bias, contributions

(array([[29.44],
        [22.79]]),
 array([25.78013333, 25.78013333]),
 array([[-7.31220696e-01,  0.00000000e+00,  1.48230191e-01,
          7.33696411e-03,  3.66332172e-01,  4.01529413e+00,
          1.28101343e-01, -4.25214290e-01,  1.99791667e-01,
         -9.74908397e-01,  3.07508321e-01,  6.43333333e-02,
          5.54281933e-01],
        [ 3.34850098e-01,  0.00000000e+00, -1.05100356e-02,
         -1.67032967e-02,  1.52500000e-02, -5.83741950e+00,
         -1.60390743e-01,  4.67545871e-01,  4.95833333e-02,
          3.08591486e-03,  1.01330809e-01, -2.49394015e-01,
          2.31263823e+00]]))
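
The point of this decomposition is that it is additive: each prediction should equal the bias plus the sum of that row's feature contributions. A minimal check of that property, using only the numbers printed above (copied by hand, so nothing here re-runs the forest):

```python
import numpy as np

# Values copied from the ti.predict output shown above.
prediction = np.array([[29.44], [22.79]])
bias = np.array([25.78013333, 25.78013333])
contributions = np.array([
    [-7.31220696e-01, 0.0, 1.48230191e-01, 7.33696411e-03,
     3.66332172e-01, 4.01529413e+00, 1.28101343e-01, -4.25214290e-01,
     1.99791667e-01, -9.74908397e-01, 3.07508321e-01, 6.43333333e-02,
     5.54281933e-01],
    [3.34850098e-01, 0.0, -1.05100356e-02, -1.67032967e-02,
     1.52500000e-02, -5.83741950e+00, -1.60390743e-01, 4.67545871e-01,
     4.95833333e-02, 3.08591486e-03, 1.01330809e-01, -2.49394015e-01,
     2.31263823e+00],
])

# prediction = bias + sum of per-feature contributions, row by row.
reconstructed = bias + contributions.sum(axis=1)
print(np.round(reconstructed, 2))
print(np.allclose(reconstructed, prediction.ravel(), atol=1e-6))
```

The tolerance only absorbs the rounding in the printed values; on the live arrays returned by `ti.predict` the same check can be run directly.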

In [40]:
for i in range(len(instances)):
    print("instance:", i)
    print("Bias", bias[i])
    print("Feature contribution:")
    for c, feature in sorted(
        zip(contributions[i], boston.feature_names),
        key=lambda x: -abs(x[0])  # sort by absolute contribution, largest first
    ):
        print(feature, round(c,2))
    print("-------")

instance: 0
Bias 25.78013333333334
Feature contribution:
RM 4.02
TAX -0.97
CRIM -0.73
LSTAT 0.55
DIS -0.43
NOX 0.37
PTRATIO 0.31
RAD 0.2
INDUS 0.15
AGE 0.13
B 0.06
CHAS 0.01
ZN 0.0
-------
instance: 1
Bias 25.78013333333334
Feature contribution:
RM -5.84
LSTAT 2.31
DIS 0.47
CRIM 0.33
B -0.25
AGE -0.16
PTRATIO 0.1
RAD 0.05
CHAS -0.02
NOX 0.02
INDUS -0.01
TAX 0.0
ZN 0.0
-------
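
To see where such contributions come from, the idea can be sketched for a single decision tree: follow the instance from the root to its leaf and credit each change in node value to the feature that was split on. This is a hand-written illustration on synthetic data, not treeinterpreter's own code; the `decompose` helper and the toy dataset are assumptions made for the example. For a forest, the per-tree biases and contributions are then averaged over the trees.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration): feature 0 matters most.
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3)).astype(np.float32)
y = 3.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=200)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
t = tree.tree_  # low-level arrays describing the fitted tree


def decompose(x):
    """Hypothetical helper: split one prediction into bias + contributions."""
    node = 0
    bias = t.value[0, 0, 0]  # root value = mean target of the training data
    contributions = np.zeros(X.shape[1])
    while t.children_left[node] != -1:  # -1 marks a leaf
        feature = t.feature[node]
        if x[feature] <= t.threshold[node]:
            child = t.children_left[node]
        else:
            child = t.children_right[node]
        # Credit the change in node value to the feature used for this split.
        contributions[feature] += t.value[child, 0, 0] - t.value[node, 0, 0]
        node = child
    return bias, contributions


x = X[0]
tree_bias, tree_contributions = decompose(x)
tree_reconstructed = tree_bias + tree_contributions.sum()
print(tree_bias, tree_contributions, tree_reconstructed)
```

Because the value differences along the path telescope, the bias plus the summed contributions lands exactly on the leaf value, which is the tree's prediction for `x`.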
