Merge pull request #99 from oegedijk/dev

v0.3.3

oegedijk committed Mar 11, 2021
2 parents 4a4aa57 + d0f7a91 commit f5bd4a5
Showing 13 changed files with 797 additions and 356 deletions.
42 changes: 41 additions & 1 deletion RELEASE_NOTES.md
@@ -1,4 +1,44 @@
# Release Notes
## Version 0.3.3:

Highlights:
* Added support for cross-validated metrics
* Better support for pipelines by falling back to the kernel explainer
* Made explainers threadsafe by adding locks
* Option to remove outliers from shap dependence plots

### Breaking Changes
- parameter `permutation_cv` has been deprecated and replaced by parameter `cv`, which
  now also calculates cross-validated metrics in addition to cross-validated
  permutation importances (see the sketch below)
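
A minimal sketch of the new parameter (illustrative only; `model`, `X_train`
and `y_train` are placeholders):

```python
from explainerdashboard import ClassifierExplainer

# before (deprecated): ClassifierExplainer(model, X_train, y_train, permutation_cv=5)
# now: cv=5 cross-validates both metrics and permutation importances
explainer = ClassifierExplainer(model, X_train, y_train, cv=5)
```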

### New Features
- metrics now get calculated with cross-validation over `X` when you pass the
  `cv` parameter to the explainer; this is useful when for some reason you
  want to pass the training set to the explainer
- added winsorization to shap dependence and shap interaction plots
- if `shap='guess'` fails (unable to guess the right type of shap explainer),
  default to the model-agnostic `shap='kernel'`
- better support for sklearn `Pipelines`: if unable to extract the transformer+model,
  default to `shap.KernelExplainer` to explain the entire pipeline
- you can now remove outliers from shap dependence/interaction plots with
  `remove_outliers=True`: filters out all outliers beyond 1.5*IQR (see the
  usage sketch below)
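
A quick usage sketch of these features (illustrative; `pipeline`, `X_test`,
`y_test` and the `"Age"` column are placeholders, and the dependence plot
method name is an assumption, so check your installed version's API):

```python
from explainerdashboard import ClassifierExplainer

# a sklearn Pipeline that cannot be decomposed into transformer + model
# falls back to the model-agnostic shap.KernelExplainer:
explainer = ClassifierExplainer(pipeline, X_test, y_test, shap='guess')

# drop points beyond 1.5*IQR from the dependence plot:
fig = explainer.plot_shap_dependence("Age", remove_outliers=True)
```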

### Bug Fixes
- Sets proper `threading.Lock`s before making calls to the shap explainer to prevent race
  conditions when dashboards request shap values from multiple threads
  (shap is unfortunately not threadsafe); the pattern is sketched below
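
A minimal sketch of the locking pattern (not the actual library code; names
are illustrative):

```python
import threading

shap_lock = threading.Lock()  # one lock shared by all dashboard threads

def get_shap_values(shap_explainer, X):
    # shap explainers are not threadsafe, so serialize access to them:
    with shap_lock:
        return shap_explainer.shap_values(X)
```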

### Improvements
- single-row `KernelExplainer` shap calculations now run without a tqdm progress bar
- added cutoff TPR and FPR to the ROC AUC plot
- added cutoff precision and recall to the PR AUC plot
- added a loading spinner to the shap contributions table



## Version 0.3.2.2:
@@ -12,7 +12,7 @@
### Bug Fixes
- bug fix to make `shap.KernelExplainer` (used with explainer parameter `shap='kernel'`)
work with `RegressionExplainer`
- bug fix when no explicit `labels` are passed with index selector
- components only update if `explainer.index_exists()`: no more `IndexNotFoundErrors`
- fixed title bug for the regression index selector labeled 'Custom'
- `get_y()` now returns `.item()` when necessary
6 changes: 4 additions & 2 deletions TODO.md
@@ -7,7 +7,6 @@
## Plots:
- add SHAP decision plots:
https://towardsdatascience.com/introducing-shap-decision-plots-52ed3b4a1cba
- make plot background transparent?
- Only use ScatterGl above a certain cutoff
- separate standard shap plots for shap_interaction plots
@@ -24,6 +23,7 @@
### Regression plots:

## Explainers:
- Turn print statements into logging
- pass n_jobs to pdp_isolate
- add ExtraTrees and GradientBoostingClassifier to tree visualizers
- add plain language explanations
@@ -37,6 +37,7 @@


## Dashboard:
- Turn print statements into logging
- make poweredby right align
- more flexible instantiate_component:
- no explainer needed (if explainer component detected, pass otherwise ignore)
@@ -59,7 +60,7 @@


### Components
- add feature descriptions component
- add predictions list to whatif composite:
- https://github.com/oegedijk/explainerdashboard/issues/85
- add circular callbacks to cutoff - cutoff percentile
@@ -86,6 +87,7 @@
- Add this method? : https://arxiv.org/abs/2006.04750?

## Tests:
- add cv metrics tests
- add tests for InterpretML EBM (shap 0.37)
- write tests for explainerhub CLI add user
- test model_output='probability' and 'raw' or 'logodds' separately
20 changes: 11 additions & 9 deletions docs/source/explainers.rst
@@ -233,13 +233,15 @@ An example of using setting ``X_background`` and ``model_output`` with a
ExplainerDashboard(explainer).run()


cv
--

Normally metrics and permutation importances get calculated over a single fold
(assuming the data ``X`` is the test set). However if you pass the training set
to the explainer, you may wish to calculate the permutation importances and
metrics with cross-validation. In that case pass the number of folds to ``cv``.
Note that custom metrics do not work with cross validation for now.
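
For example (an illustrative sketch, assuming you passed the training set
to the explainer)::

    explainer = ClassifierExplainer(model, X_train, y_train, cv=5)
    explainer.metrics()  # metrics now cross-validated over X_train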


na_fill
-------
@@ -505,7 +507,7 @@ get_importances_df
.. automethod:: explainerdashboard.explainers.BaseExplainer.get_importances_df

get_contrib_df
^^^^^^^^^^^^^^

.. automethod:: explainerdashboard.explainers.BaseExplainer.get_contrib_df

@@ -614,12 +616,12 @@ with the following additional methods::


get_decisionpath_df
^^^^^^^^^^^^^^^^^^^

.. automethod:: explainerdashboard.explainers.RandomForestExplainer.get_decisionpath_df

get_decisionpath_summary_df
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automethod:: explainerdashboard.explainers.RandomForestExplainer.get_decisionpath_summary_df

11 changes: 7 additions & 4 deletions explainerdashboard/dashboard_components/classifier_components.py
@@ -357,9 +357,10 @@ def update_output_div(index, pos_label):
preds_df = self.explainer.prediction_result_df(index, round=self.round, logodds=True)
preds_df.probability = np.round(100*preds_df.probability.values, self.round).astype(str)
preds_df.probability = preds_df.probability + ' %'
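# the logodds column may be absent for some model outputs, so guard first: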
if 'logodds' in preds_df.columns:
    preds_df.logodds = np.round(preds_df.logodds.values, self.round).astype(str)

if self.explainer.model_output != 'logodds':
preds_df = preds_df[['label', 'probability']]

preds_table = dbc.Table.from_dataframe(preds_df,
@@ -379,7 +380,8 @@ def update_output_div(pos_label, *inputs):
preds_df = self.explainer.prediction_result_df(X_row=X_row, round=self.round, logodds=True)
preds_df.probability = np.round(100*preds_df.probability.values, self.round).astype(str)
preds_df.probability = preds_df.probability + ' %'
if 'logodds' in preds_df.columns:
    preds_df.logodds = np.round(preds_df.logodds.values, self.round).astype(str)

if self.explainer.model_output!='logodds':
preds_df = preds_df[['label', 'probability']]
@@ -527,7 +529,8 @@ def layout(self):
marks={0.01: '0.01', 0.25: '0.25', 0.50: '0.50',
0.75: '0.75', 0.99: '0.99'},
included=False,
tooltip = {'always_visible' : False},
updatemode='drag'),
], id='precision-cutoff-div-'+self.name),
dbc.Tooltip(f"Scores above this cutoff will be labeled positive",
target='precision-cutoff-div-'+self.name,
