Low-level Tree Interface

The tree interface mimics that of sklearn.tree._tree.Tree. To get this level of detail from ensemble estimators, set the argument enable_tree_details to True before fitting. The single Tree estimators always perform these calculations.
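For orientation, the attributes being mimicked can be inspected on a fitted scikit-learn tree. The sketch below uses only scikit-learn, not skranger, and assumes a regressor on the diabetes dataset purely for illustration; it shows the sklearn side of the interface and does not claim anything about how skranger exposes its own Tree objects.

```python
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

# Fit a small sklearn tree so the low-level structure is easy to read.
estimator = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
tree = estimator.tree_  # an instance of sklearn.tree._tree.Tree

# Per-node arrays, indexed by node id (node 0 is the root).
n_samples_per_node = tree.n_node_samples            # sample count per node
weighted_per_node = tree.weighted_n_node_samples    # sum of sample weights per node
value_per_node = tree.value                         # prediction value per node

# Leaves are marked by a child index of -1 in sklearn's Tree.
leaf_ids = [i for i in range(tree.node_count) if tree.children_left[i] == -1]
print(len(leaf_ids), "leaves out of", tree.node_count, "nodes")
```

With no sample weights supplied, the weighted sample count at each node equals the plain sample count.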

Since ranger does not track this level of detail on its trees, we perform extra operations in Python to provide it. These operations are not yet well optimized and are currently quite slow.

These operations include:

  • determining which leaf node corresponds to each training sample for every tree
  • determining the sum of the weights of the training samples at each leaf node
  • determining the weighted average prediction value of each leaf node

These extra calculations deliver the n_node_samples, weighted_n_node_samples, and value attributes of the Tree class.
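The three operations listed above amount to a group-by over leaf assignments. The following is a minimal plain-Python sketch of that idea, not skranger's actual implementation: the function `summarize_leaves` and its arguments are hypothetical names, the leaf ids are assumed to be already known for each training sample, and the weights are assumed to be positive.

```python
from collections import defaultdict


def summarize_leaves(leaf_ids, y, sample_weight=None):
    """Return per-leaf sample counts, weight sums, and weighted mean targets.

    Hypothetical helper for illustration only. Assumes every leaf that
    appears in leaf_ids has a positive total weight.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y)

    counts = defaultdict(int)
    weight_sums = defaultdict(float)
    weighted_targets = defaultdict(float)

    # Accumulate statistics for the leaf each training sample lands in.
    for leaf, target, weight in zip(leaf_ids, y, sample_weight):
        counts[leaf] += 1
        weight_sums[leaf] += weight
        weighted_targets[leaf] += weight * target

    # Weighted average prediction value of each leaf.
    values = {
        leaf: weighted_targets[leaf] / weight_sums[leaf] for leaf in weight_sums
    }
    return dict(counts), dict(weight_sums), values


# Five training samples falling into two leaves (node ids 3 and 4).
leaf_ids = [3, 3, 4, 4, 4]
y = [1.0, 3.0, 10.0, 20.0, 30.0]
sample_weight = [1.0, 3.0, 1.0, 1.0, 2.0]

counts, weight_sums, values = summarize_leaves(leaf_ids, y, sample_weight)
print(counts, weight_sums, values)
```

Repeating this for every tree in a forest is what makes the extra bookkeeping costly when done in Python.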

Note

Since we don't have direct access to the sub-sampled training sets used in building each of the trees, we determine the above values using the full training set.

skranger.tree._tree.Tree

SHAP

RangerForestRegressor and RangerForestClassifier can be used with shap. The shap_patch context manager patches skranger objects so that they work with shap.

from shap import TreeExplainer
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from skranger.ensemble import RangerForestClassifier
from skranger.utils.shap import shap_patch

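# Load the iris data and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

# enable_tree_details=True makes the forest compute the low-level tree attributes.
forest = RangerForestClassifier(enable_tree_details=True).fit(X_train, y_train)

# Build the explainer while the skranger objects are patched for shap.
with shap_patch():
    explainer = TreeExplainer(model=forest)

# Compute SHAP values for the held-out samples.
values = explainer.shap_values(X_test)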