Add pipeline threshold to confusion matrix returns #3080
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3080 +/- ##
=======================================
+ Coverage 99.8% 99.8% +0.1%
=======================================
Files 313 313
Lines 30470 30483 +13
=======================================
+ Hits 30380 30393 +13
Misses 90 90
Continue to review full report at Codecov.
This looks good to me!
Interesting, thanks for outlining these two concerns! Is the first issue because of the way pandas internally handles things? aka it just rounds up rather than printing the actual value? If so, I'd say it's not a big deal given the actual index is still what we expect it to be.
@angela97lin yep, that was the first concern! Just an aesthetic issue. Filed an issue to address the second here.
Looks good man, thanks for filing an issue for the second point. I think we're fine with the aesthetic issue for now.
Fixes #3079
There are two potential issues I want to raise, although neither is severe enough to block this PR from being merged.

The first is the aesthetics of the index when the threshold is very granular, as can be seen here:

Note that the threshold is `0.99999...`, but looking at the dataframe, it just shows as `1.0`, which could be confusing for OS users. Grabbing the index itself shows the actual value. We could stringify the index to have the full values appear, but this might be annoying for our internal use.
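For illustration, here's a minimal, hypothetical pandas sketch (toy data, not this PR's code) of how a float index close to 1.0 can print as `1.0` while the underlying value is preserved:

```python
import pandas as pd

# Hypothetical thresholds; the last one is very close to, but not exactly, 1.0
thresholds = [0.25, 0.5, 0.9999999999]
df = pd.DataFrame({"accuracy": [0.7, 0.8, 0.6]}, index=thresholds)

print(df)            # the last index value may print as 1.0 under pandas'
                     # default display precision, even though it isn't 1.0
print(df.index[-1])  # grabbing the index shows the actual value: 0.9999999999

df.index = df.index.map(str)  # stringifying the index keeps the full value
print(df)                     # visible in the printed output
```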
The other issue is that our optimized pipeline threshold doesn't match the best threshold chosen through this method. This can be seen in the problem above, but can also be seen here:

Above, the pipeline is optimized using `accuracy binary`, and we see the pipeline threshold `0.360321` actually has a worse performance value than `0.5` with accuracy, which is the ideal value `find_confusion_matrix_per_thresholds` finds. The difference is likely in how we find the optimal threshold: `optimize_thresholds` uses gradient descent, while the current method uses a simple linear scan. Is this disparity an issue for our users? Discussing with @freddyaboulton, it could be confusing when our optimal thresholds don't match up. However, what would be the best approach to fix this, if necessary?
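To make the disparity concrete, here's a rough, self-contained sketch (toy labels/probabilities and a hypothetical 0.01-step grid, not evalml's actual implementation) showing how a linear scan's best threshold can score better on the scanned objective than a threshold chosen by a separate optimizer:

```python
import numpy as np

# Toy labels and predicted probabilities (illustrative only)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_proba = np.array([0.1, 0.38, 0.42, 0.45, 0.55, 0.7, 0.8, 0.9])

def accuracy_at(threshold):
    """Binary accuracy when predicting 1 for probabilities above the threshold."""
    y_pred = (y_proba > threshold).astype(int)
    return (y_pred == y_true).mean()

# Simple linear scan over a grid of candidate thresholds
candidates = np.linspace(0, 1, 101)
scan_best = candidates[np.argmax([accuracy_at(t) for t in candidates])]

# A threshold chosen separately (e.g. by a gradient-based optimizer) can land
# elsewhere and score worse on this metric; 0.360321 echoes the value above
pipeline_threshold = 0.360321

print(scan_best, accuracy_at(scan_best))                    # scan's best (~0.45 here), accuracy 1.0
print(pipeline_threshold, accuracy_at(pipeline_threshold))  # 0.360321, accuracy 0.625
```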
Again, these two issues shouldn't block the merge of this PR, but both are things I wanted to bring up for discussion.