This library provides utilities for generating visual explanations of Gradient Boosting models. I recommend you jump in through the Binder link above, which renders the notebook.ipynb file. This interactive Jupyter notebook is an Explainable that showcases the value of the library and provides sample code.
Alternatively, you can run the notebook locally by cloning the repository and then performing the following:
- Navigate into the package directory.
- Install the conda environment.
conda env create -f binder/environment.yml
- Activate the conda environment.
conda activate ForestForTheTrees
- Run the postBuild script (this installs the appropriate jupyterlab extension required to display interactive widgets)
bash binder/postBuild
or just run this command directly:
jupyter labextension install @jupyter-widgets/jupyterlab-manager
- Fire up Jupyter Lab, run all cells and begin interacting with the notebook.
jupyter lab notebook.ipynb
Note that a recent version of Jupyter Lab (included in the environment) is required to run this notebook. The classic Jupyter Notebook interface will not work, at least out of the box, because of some peculiarities in how Altair, ipywidgets, and Jupyter interact.
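If the widgets fail to render, it can help to confirm which versions of the relevant packages are installed in the active environment. The snippet below is a minimal sketch that uses only the Python standard library. The package names it checks (`jupyterlab`, `altair`, `ipywidgets`) are assumptions based on the tools named above, and `installed_versions` is a hypothetical helper, not part of this library.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names for the tools mentioned in this README.
PACKAGES = ("jupyterlab", "altair", "ipywidgets")


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it after activating the conda environment. A package reported as not installed suggests the environment was not created or activated correctly.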
I recommend running all cells as soon as the notebook is opened. Due to the nature of the interactive widgets, it is not possible to save the state, so the notebook is saved without output. If you are perusing the full document, each cell will have run by the time you get to it. This applies whether viewing locally or via Binder.
As mentioned above, the best way to get a sense of how Gradient Boosting models can be explained with ForestForTheTrees is to open the Binder link above. To get started quickly, adapt the minimal example below:
    import pandas as pd
    from sklearn.ensemble import GradientBoostingRegressor
    # ft refers to the ForestForTheTrees module (see notebook.ipynb for the import)

    # load dataset
    dataset_df = pd.read_csv("Some_file.csv")
    target_column = "Target"  # the value to predict

    # build model
    model = GradientBoostingRegressor(n_estimators=100)

    # fit model (you should build a good model here using a train/test split)
    model.fit(
        dataset_df.drop(target_column, axis=1),
        dataset_df.loc[:, target_column],
    )

    # initialize ForestForTheTrees with dataset, model, and target
    f2t = ft.ForestForTheTrees(
        dataset=dataset_df,  # pass bike instead to use the sample dataset
        model=model,
        target_col=target_column,  # e.g. "Ridership" if using the bike sample dataset
    )

    # extract the underlying structure of the model
    # this must be called before displaying the visual explanation
    f2t.extract_components()

    # output the visual explanation at the selected fidelity
    f2t.explain(fidelity_threshold=.95)
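The example above fits the model on the full dataset for brevity and notes that a train/test split should be used in practice. One way to do that with scikit-learn is sketched below. The synthetic data, the column names `x1` and `x2`, and the 80/20 split are illustrative assumptions, not part of this library; swap in your own DataFrame and target column.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical); replace with your own DataFrame.
rng = np.random.default_rng(0)
dataset_df = pd.DataFrame({
    "x1": rng.uniform(0, 10, 500),
    "x2": rng.uniform(0, 10, 500),
})
dataset_df["Target"] = (
    2.0 * dataset_df["x1"]
    + np.sin(dataset_df["x2"])
    + rng.normal(0, 0.1, 500)
)
target_column = "Target"

# Separate features from the target, then hold out 20% of rows for evaluation.
X = dataset_df.drop(columns=[target_column])
y = dataset_df[target_column]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training rows only.
model = GradientBoostingRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score on the held-out rows to check that the model generalizes.
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"held-out R^2: {test_r2:.3f}")
```

Once the held-out score looks reasonable, the fitted `model` can be passed to `ft.ForestForTheTrees` together with `dataset_df`, as in the example above.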
This library is under active development; please review the Issues tab for current priorities. Feature requests and bug reports are welcome! If you find this library useful, please feel free to message me and let me know how it went.