Fix permutation importance failing when target is categorical #3017

angela97lin · 2021-11-07T19:37:13Z

codecov · 2021-11-07T19:40:38Z

Codecov Report

Merging #3017 (2c1f99c) into main (79b9d22) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3017     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        312     312             
  Lines      30137   30143      +6     
=======================================
+ Hits       30041   30047      +6     
  Misses        96      96

Impacted Files	Coverage Δ
...alml/model_understanding/permutation_importance.py	`100.0% <100.0%> (ø)`
...understanding_tests/test_permutation_importance.py	`100.0% <100.0%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jeremyliweishih

nice!

eccabay

LGTM, thanks for picking this up! Super useful in enabling human-readable pipeline explanations :)

chukarsten

Do we need to address the classification aspect that freddy mentioned in the issue?

freddyaboulton

Looks good to me @angela97lin !

freddyaboulton · 2021-11-08T16:21:18Z

evalml/model_understanding/permutation_importance.py

@@ -324,6 +325,7 @@ def _fast_scorer(pipeline, features, X, y, objective):
        preds = pipeline.estimator.predict_proba(features)
    else:
        preds = pipeline.estimator.predict(features)
-        preds = pipeline.inverse_transform(preds)
+        if is_regression(pipeline.problem_type):


I think we're ok for classification here. In _fast_permutation_importance we encode the target with the pipeline._encode_targets method. So the estimator predictions will also be encoded.

Of course, if the pipeline does not have an encoder but has string-valued targets, this would fail, but I would say that's a pipeline definition bug as opposed to a permutation importance bug.

The fact that our classification objectives only support integer-valued targets, and that a pipeline needs an encoder in order to be scored, may not be super clear though.

Agreed! I wish there were a way to make this clearer. Pipelines require a label encoder not only for scoring but also for fit, so it still feels consistent, but I wonder if, moving forward, we could make these types of error messages clearer.
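
For readers following the thread, here is a minimal, self-contained sketch of the reasoning above. It is not evalml's code: `ToyLabelEncoder`, `toy_score`, `accuracy`, and `mean_abs_error` are hypothetical stand-ins written only for illustration, and the log/exp target transform is an assumed example of a regression target transform. The idea it tries to show is that, for classification, the target has already been encoded, so the estimator's integer predictions can be scored directly, while for regression the predictions may need to be mapped back to the original target scale first.

```python
import math


class ToyLabelEncoder:
    """Hypothetical stand-in for a label encoder component (not evalml's)."""

    def fit(self, y):
        self.classes_ = sorted(set(y))
        self._to_int = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, y):
        return [self._to_int[label] for label in y]

    def inverse_transform(self, y_encoded):
        return [self.classes_[i] for i in y_encoded]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_abs_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_score(raw_preds, y_for_scoring, metric, regression, inverse=None):
    """Undo target transforms only for regression, mirroring the branch in the diff."""
    if regression and inverse is not None:
        preds = inverse(raw_preds)
    else:
        preds = raw_preds
    return metric(y_for_scoring, preds)


# Classification: the target is encoded up front, so integer predictions
# from the estimator are compared against the encoded target as-is.
y = ["cat", "dog", "cat", "dog"]
encoder = ToyLabelEncoder().fit(y)
y_encoded = encoder.transform(y)
estimator_preds = [0, 1, 1, 1]  # made-up estimator output
clf_score = toy_score(
    estimator_preds, y_encoded, accuracy,
    regression=False, inverse=encoder.inverse_transform,
)

# Decoding the predictions back to strings would no longer line up with
# the encoded target, so every comparison would miss.
mismatched_score = accuracy(y_encoded, encoder.inverse_transform(estimator_preds))

# Regression: assume the estimator predicts in log space, so predictions
# are mapped back to the original scale before scoring.
y_reg = [1.0, 10.0, 100.0]
log_preds = [math.log(v) for v in y_reg]  # a made-up "perfect" estimator
reg_score = toy_score(
    log_preds, y_reg, mean_abs_error,
    regression=True, inverse=lambda p: [math.exp(v) for v in p],
)
```

In this toy setup the classification score is computed entirely in encoded space, which is why skipping the inverse transform is safe there. A pipeline with string-valued targets and no encoder would have nothing to put the target into that space, which matches the failure case described in the comment above.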

init

2429d73

angela97lin self-assigned this Nov 7, 2021

release notes

205f9fe

angela97lin marked this pull request as ready for review November 8, 2021 15:46

angela97lin requested review from freddyaboulton, bchen1116, chukarsten, christopherbunn, eccabay, jeremyliweishih and ParthivNaresh November 8, 2021 15:46

jeremyliweishih approved these changes Nov 8, 2021

eccabay approved these changes Nov 8, 2021

chukarsten reviewed Nov 8, 2021

freddyaboulton approved these changes Nov 8, 2021

angela97lin added 3 commits November 8, 2021 12:55

Merge branch 'main' into 3012_perm_importance_inv_transform

813ebba

Merge branch 'main' into 3012_perm_importance_inv_transform

7548250

Merge branch 'main' into 3012_perm_importance_inv_transform

2c1f99c

angela97lin merged commit 75b07c7 into main Nov 8, 2021

angela97lin deleted the 3012_perm_importance_inv_transform branch November 8, 2021 20:05

chukarsten mentioned this pull request Nov 9, 2021

Release v0.37.0 #3029

Merged
