Find rows near the decision boundary #2908

bchen1116 · 2021-10-14T15:22:37Z

Design doc here

Walkthrough of usage in the docs here. The link in the walkthrough should work once we publish the docs to main.

API doc here

codecov · 2021-10-14T15:28:09Z

Codecov Report

Merging #2908 (db16985) into main (2767e33) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2908     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        302     302             
  Lines      28433   28587    +154     
=======================================
+ Hits       28340   28494    +154     
  Misses        93      93

Impacted Files	Coverage Δ
evalml/pipelines/utils.py	`99.5% <100.0%> (+0.1%)`	⬆️
evalml/tests/pipeline_tests/test_pipeline_utils.py	`99.7% <100.0%> (+0.3%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2767e33...db16985.

freddyaboulton

@bchen1116 Thanks for this! I left some minor comments for improving the implementation and tests. I am only "blocking" because I would like to discuss whether "epsilon" is a more useful parameter than "num_rows". I think it's confusing we return all rows by default.
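To make the trade-off raised above concrete, here is a minimal sketch of the two selection styles on a toy series of predicted probabilities. The function names `rows_within_epsilon` and `closest_rows`, their defaults, and the data are hypothetical illustrations, not the code in this PR; the only assumption taken from the comment is that a `num_rows`-style parameter left unset returns every row.

```python
import pandas as pd


def rows_within_epsilon(proba, threshold=0.5, epsilon=0.1):
    """Index labels whose probability lies within epsilon of threshold, closest first."""
    distance = (proba - threshold).abs()
    near = distance[distance <= epsilon]
    return near.sort_values(kind="stable").index


def closest_rows(proba, threshold=0.5, num_rows=None):
    """Index labels ordered by distance to threshold, truncated to num_rows if given."""
    distance = (proba - threshold).abs().sort_values(kind="stable")
    # Assumption: num_rows=None keeps every row, which is the default questioned above.
    return distance.index if num_rows is None else distance.index[:num_rows]


# Toy predicted probabilities for the positive class (hypothetical data).
proba = pd.Series([0.05, 0.48, 0.55, 0.91, 0.42])

print(list(rows_within_epsilon(proba)))       # rows 1, 2, 4 fall inside [0.4, 0.6]
print(list(closest_rows(proba, num_rows=2)))  # the two closest rows: 1, 2
print(len(closest_rows(proba)))               # no limit given, so all 5 rows come back
```

With an epsilon, the default call already filters to rows that are actually near the boundary, whereas a row-count parameter needs an explicit value before it narrows anything.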

docs/source/user_guide/pipelines.ipynb

evalml/pipelines/utils.py

evalml/tests/pipeline_tests/test_pipeline_utils.py

freddyaboulton · 2021-10-15T15:10:24Z

evalml/tests/pipeline_tests/test_pipeline_utils.py

+    )
+    assert all(vals.values == expected_vals)
+
+    if types == "all":


Isn't this check redundant?

bchen1116 · 2021-10-18

This just double-checks that we can omit y and still get the same results.

freddyaboulton · 2021-10-18

Thanks for explaining!

chukarsten

Super thorough testing. For some of the tests, it took me a minute to realize what exactly was being tested. Perhaps this could be remedied with more specific test names? I feel like test_rows_of_interest_threshold was the one I spent the most time on. Anyway, nothing blocking. Just food for thought.

evalml/pipelines/utils.py

chukarsten · 2021-10-18T15:44:00Z

evalml/pipelines/utils.py

+
+    if threshold is not None and (threshold < 0 or threshold > 1):
+        raise ValueError(
+            "Provided threshold {} must be between [0, 1]".format(threshold)


nit: hard brackets are for inclusive bounds; you might want to switch to (0, 1).

bchen1116 · 2021-10-18

@chukarsten I was thinking we should allow the user to set the threshold to 0 or 1 if they want the rows closest to those values. What do you think?
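A small sketch of why the inclusive bounds can be useful. The range check mirrors the diff hunk quoted above; the wrapper names `validate_threshold` and `rows_closest_to`, the `n` parameter, and the data are hypothetical and only serve the illustration.

```python
import pandas as pd


def validate_threshold(threshold):
    """Accept None or any value in the closed interval [0, 1]; reject everything else."""
    if threshold is not None and (threshold < 0 or threshold > 1):
        raise ValueError(
            "Provided threshold {} must be between [0, 1]".format(threshold)
        )
    return threshold


def rows_closest_to(proba, threshold, n=2):
    """Hypothetical helper: the n index labels whose probability is nearest threshold."""
    validate_threshold(threshold)
    return (proba - threshold).abs().sort_values(kind="stable").index[:n]


# Toy predicted probabilities for the positive class (hypothetical data).
proba = pd.Series([0.05, 0.48, 0.55, 0.91, 0.42])

print(list(rows_closest_to(proba, 1.0)))  # most confident positives: rows 3, 2
print(list(rows_closest_to(proba, 0.0)))  # most confident negatives: rows 0, 4

try:
    rows_closest_to(proba, 1.5)
except ValueError as err:
    print(err)  # values outside [0, 1] are still rejected
```

Under this reading, the endpoints 0 and 1 are valid inputs, so the square brackets in the error message describe the accepted range accurately.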

freddyaboulton

Thank you for making the changes @bchen1116 ! This looks good to me!

evalml/tests/pipeline_tests/test_pipeline_utils.py

evalml/pipelines/utils.py

evalml/tests/pipeline_tests/test_pipeline_utils.py

initial commit with code

9950792

bchen1116 self-assigned this Oct 14, 2021

bchen1116 added 2 commits October 14, 2021 11:23

update release note

6d69145

Merge branch 'main' into bc_decision_boundary

a8ce329

bchen1116 added 4 commits October 14, 2021 11:36

add raises docs

b616bbb

remove space

59daca3

add docs

4012be9

update docs

395fb66

bchen1116 requested review from chukarsten, dsherry, freddyaboulton, angela97lin, christopherbunn, jeremyliweishih and ParthivNaresh October 14, 2021 17:44

freddyaboulton suggested changes Oct 15, 2021

bchen1116 added 4 commits October 18, 2021 11:37

address comments

34e2792

fix release notes

d856f50

fix test

958fbee

fix docstring

b2f1227

chukarsten approved these changes Oct 18, 2021

bchen1116 added 3 commits October 18, 2021 13:18

update doc

f57bb12

remove link

c98cc46

Merge branch 'main' into bc_decision_boundary

d605988

bchen1116 requested a review from freddyaboulton October 18, 2021 21:16

freddyaboulton approved these changes Oct 18, 2021

bchen1116 added 3 commits October 19, 2021 01:01

fix abs

515e8b5

Merge branch 'bc_decision_boundary' of github.com:alteryx/evalml into…

3400d05

… bc_decision_boundary

Merge branch 'main' into bc_decision_boundary

db16985

bchen1116 merged commit 6f8d37a into main Oct 19, 2021

chukarsten mentioned this pull request Oct 27, 2021

Release v0.36.0 #2974

Merged

freddyaboulton deleted the bc_decision_boundary branch May 13, 2022 15:35
