
Add a “cost-benefit curve” util method to graph objective score vs binary classification threshold #1081

Merged
merged 40 commits into main from 1026_cost_curve Aug 25, 2020

Conversation

@angela97lin (Contributor) commented Aug 20, 2020

@angela97lin added this to the August 2020 milestone Aug 20, 2020
@angela97lin self-assigned this Aug 20, 2020
codecov bot commented Aug 20, 2020

Codecov Report

Merging #1081 into main will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main    #1081      +/-   ##
==========================================
+ Coverage   99.89%   99.91%   +0.01%     
==========================================
  Files         192      191       -1     
  Lines       10981    10701     -280     
==========================================
- Hits        10970    10692     -278     
+ Misses         11        9       -2     
Impacted Files                                            Coverage Δ
evalml/demos/breast_cancer.py                             100.00% <ø> (ø)
evalml/model_understanding/__init__.py                    100.00% <ø> (ø)
evalml/objectives/cost_benefit_matrix.py                  100.00% <ø> (ø)
evalml/utils/__init__.py                                  100.00% <ø> (ø)
evalml/model_understanding/graphs.py                      100.00% <100.00%> (ø)
evalml/tests/component_tests/test_components.py           100.00% <100.00%> (+0.39%) ⬆️
...lml/tests/model_understanding_tests/test_graphs.py    100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@angela97lin changed the title from “Add a ‘cost-benefit curve’ util method to graph cost-benefit matrix score vs binary classification threshold” to “Add a ‘cost-benefit curve’ util method to graph objective score vs binary classification threshold” Aug 20, 2020
@angela97lin requested review from dsherry, freddyaboulton and jeremyliweishih and removed the request for dsherry and freddyaboulton August 20, 2020 15:53
@angela97lin marked this pull request as ready for review August 20, 2020 15:53
@freddyaboulton (Contributor) left a comment


@angela97lin Looks good! I left some comments that I think would be good to resolve before merge. I also left a not-blocking comment to hear your thoughts on an issue we may run into in the near future lol. No need to resolve that comment in this PR.

Resolved review threads (outdated): evalml/model_understanding/graphs.py, evalml/tests/model_understanding_tests/test_graphs.py
@@ -319,3 +319,65 @@ def graph_permutation_importance(pipeline, X, y, objective, show_all_features=Fa

fig = go.Figure(data=data, layout=layout)
return fig


def binary_objective_vs_threshold(pipeline, X, y, objective, steps=100):
@freddyaboulton (Contributor) commented:

Not-blocking but is there a way to unify this function with the threshold optimization happening in AutoMLSearch? I think we're going to start computing model understanding methods within the CV folds soon-ish and it'd be nice to not optimize the same function twice.

This is outside the scope of this PR but I want to start the conversation and hear what you guys think.

@angela97lin (Contributor, Author) commented:

That's a really good question, and something I was thinking about too. Right now, automl runs at most 100 iterations when minimizing our cost function (optimize_threshold in binary_classification_objective.py). Although this new method currently just steps through and calculates the objective at each threshold, I can see us then finding the threshold step that gives us the best score and setting that as the threshold for our pipeline.

I think it'd be a really neat idea to support a threshold_optimization_method parameter to specify how we should optimize for the threshold, which can then allow us to either use our current implementation or this implementation (or anything else we come up with later down the road). That's my two cents!
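
For illustration only, a minimal sketch of the sweep-and-pick-best idea described above. This is not the PR's actual implementation; the objective_function(y_true, y_predicted) signature and greater_is_better attribute are assumptions about the evalml objective API:

import numpy as np

def sweep_threshold(ypred_proba, y_true, objective, steps=100):
    """Score a binary objective at evenly spaced thresholds and return the best.

    Assumes ypred_proba holds positive-class probabilities and that the
    objective exposes objective_function(y_true, y_predicted) and a
    greater_is_better flag (assumed API, not confirmed by this PR).
    """
    thresholds = np.linspace(0, 1, steps + 1)
    scores = [objective.objective_function(y_true, ypred_proba >= t)
              for t in thresholds]
    # Pick the threshold with the best score, respecting objective direction.
    best = int(np.argmax(scores)) if objective.greater_is_better else int(np.argmin(scores))
    return thresholds[best], scores[best]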

@dsherry (Contributor) commented:

@freddyaboulton @angela97lin we do support threshold optimization. Are you talking about doing that inside this graph data method? I do think it could be a nice addition to compute and plot the optimal threshold, and display the numeric threshold value in the plot. And include that in the plot data of course.

We don't currently expose the optimization method in the API but we could.

@angela97lin (Contributor, Author) commented:

@dsherry Yup, we currently support threshold optimization, but we use minimize_scalar to do so; my comment was just suggesting that we could expose a threshold_optimization_method and let users choose whether we optimize via minimize_scalar or via what we've done here (sweeping through candidate threshold values and choosing the one that gives the best score). All future work, of course!
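
For comparison, the minimize_scalar approach mentioned above can be sketched roughly like this. It is a simplified stand-in for evalml's optimize_threshold, not its actual code, and makes the same objective API assumptions as the sweep sketch earlier in this thread:

from scipy.optimize import minimize_scalar

def optimize_threshold_scalar(ypred_proba, y_true, objective):
    """Pick a threshold with scipy's bounded scalar minimizer.

    Negates the score for greater-is-better objectives so we always minimize.
    """
    def cost(threshold):
        score = objective.objective_function(y_true, ypred_proba >= threshold)
        return -score if objective.greater_is_better else score

    result = minimize_scalar(cost, bounds=(0, 1), method="bounded")
    return result.x

The tradeoff between the two: minimize_scalar needs fewer objective evaluations, but score-vs-threshold is a step function, so a local minimizer can stall on a flat region, while the sweep is exhaustive at its chosen resolution.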

@dsherry (Contributor) left a comment

Great stuff! I liked the writeups. The plot looks nice :)

Stuff which I think should be addressed before merge:

  • Resolve the discussion on the plot title. I left a suggested change.
  • Left a question about test_estimator_needs_fitting_false.
  • I left a recommendation for adding a section to the Objectives guide and linking to it in the Model Understanding guide. I guess this isn't required before merge, but it feels important to get to!

"cell_type": "markdown",
"metadata": {},
"source": [
"## Graphing Utilities"
@dsherry (Contributor) commented:

Let's delete this header? There isn't another "##"-level header in this file, just "###".

@angela97lin (Contributor, Author) commented:

@dsherry The other "##" header is for ## Explaining Predictions; I wanted to separate out the Graphing Utilities from the "Explaining Predictions" section 😁

"## Explaining Individual Predictions\n",
"### Binary Objective Score vs. Threshold Graph\n",
"\n",
"For binary classification objectives that we can tune the decision threshold for (objectives that have `score_needs_proba` set to False), we can obtain and graph the scores for thresholds from zero to one, calculated at evenly-spaced intervals determined by `steps`."
@dsherry (Contributor) commented:

I like it.

I see we don't mention threshold optimization in the guide yet. I think we should add something about that.

Let's add something like this to the Objectives guide, Core Objectives section, under a sub-heading

Some binary classification objectives, like log loss and AUC, are unaffected by the choice of binary classification threshold, because they score based on the predicted probabilities or examine a range of threshold values. These metrics are defined with score_needs_proba set to True. For all other binary classification objectives, we can compute the optimal binary classification threshold from the predicted probabilities and the target.

import evalml

class RFBinaryClassificationPipeline(evalml.pipelines.BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Random Forest Classifier']

X, y = evalml.demos.load_breast_cancer()

pipeline = RFBinaryClassificationPipeline({})
pipeline.fit(X, y)
print(pipeline.threshold)
print(pipeline.score(X, y, objectives=['F1']))

# optimize_threshold is an instance method and expects positive-class probabilities
pipeline.threshold = evalml.objectives.F1().optimize_threshold(pipeline.predict_proba(X)[:, 1], y)
print(pipeline.threshold)
print(pipeline.score(X, y, objectives=['F1']))

Then, on this page we can link to that part of the guide.

Also, one suggestion here: "For binary classification objectives that we can tune the decision threshold for" --> "For binary classification objectives which are sensitive to the decision threshold"
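
Tying this together, a hedged usage sketch of the utility this PR adds: binary_objective_vs_threshold, its module path, and its steps parameter come from the diff above, while the CostBenefitMatrix parameter names, the dollar values, and the assumption that a dataframe of threshold/score pairs is returned are illustrative guesses, not confirmed by this PR.

import evalml
from evalml.model_understanding.graphs import binary_objective_vs_threshold
from evalml.objectives import CostBenefitMatrix

class RFBinaryClassificationPipeline(evalml.pipelines.BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Random Forest Classifier']

X, y = evalml.demos.load_breast_cancer()

pipeline = RFBinaryClassificationPipeline({})
pipeline.fit(X, y)

# Hypothetical dollar values assigned to each confusion-matrix cell
cbm = CostBenefitMatrix(true_positive=100, true_negative=10,
                        false_positive=-25, false_negative=-50)

# Score the objective at evenly spaced thresholds between 0 and 1
scores = binary_objective_vs_threshold(pipeline, X, y, cbm, steps=100)
print(scores.head())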


Additional resolved review threads: evalml/model_understanding/graphs.py (×2, outdated), evalml/objectives/cost_benefit_matrix.py, evalml/tests/utils_tests/test_graph_utils.py, evalml/tests/model_understanding_tests/test_graphs.py (outdated)
@angela97lin merged commit 4870c83 into main Aug 25, 2020
@angela97lin deleted the 1026_cost_curve branch August 25, 2020 00:59
@dsherry mentioned this pull request Aug 25, 2020

Successfully merging this pull request may close these issues.

Add “cost-benefit curve” to graph metric score vs binary classification threshold