Add precision/recall curve #794
Conversation
Codecov Report
@@ Coverage Diff @@
## master #794 +/- ##
=======================================
Coverage 99.52% 99.52%
=======================================
Files 159 159
Lines 6257 6306 +49
=======================================
+ Hits 6227 6276 +49
Misses 30 30
Continue to review full report at Codecov.
This is really neat! :D
I took a quick glance and this is looking pretty good. Would you be able to add an example to the documentation? We currently have the other graph utils listed under search_results.ipynb, so that'd probably be a good place (https://evalml.featurelabs.com/en/latest/automl/search_results.html).
Just left a comment on the docs formatting but otherwise, LGTM! Great work :D
"source": [ | ||
"## Precision-Recall Curve\n", | ||
"\n", | ||
"For binary classification, you can view the precision-recall curve of a classifier" |
👍
evalml/pipelines/graph_utils.py
    dict: Dictionary containing metrics used to generate a precision-recall plot, with the following keys:
          * `precision`: Precision values.
          * `recall`: Recall values.
          * `thresholds`: Threshold values used to produce the precision and recall.
          * `auc_score`: The area under the precision-recall curve.
"""
precision, recall, thresholds = sklearn_precision_recall_curve(y_true, y_pred_proba)
auc_score = sklearn_auc(recall, precision)
return {'precision': precision,
        'recall': recall,
        'thresholds': thresholds,
        'auc_score': auc_score}
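As a self-contained sketch of how a helper like this could be exercised (assuming scikit-learn and NumPy are installed; the standalone function below mirrors the diff rather than importing evalml itself):

```python
import numpy as np
from sklearn.metrics import auc as sklearn_auc
from sklearn.metrics import precision_recall_curve as sklearn_precision_recall_curve


def precision_recall_curve(y_true, y_pred_proba):
    """Return precision/recall/threshold arrays and the area under the PR curve."""
    precision, recall, thresholds = sklearn_precision_recall_curve(y_true, y_pred_proba)
    # Note: sklearn_auc(recall, precision) integrates the PR curve, not the ROC curve.
    auc_score = sklearn_auc(recall, precision)
    return {'precision': precision,
            'recall': recall,
            'thresholds': thresholds,
            'auc_score': auc_score}


# Toy binary-classification example: labels and predicted positive-class probabilities.
y_true = np.array([0, 0, 1, 1])
y_pred_proba = np.array([0.1, 0.4, 0.35, 0.8])
result = precision_recall_curve(y_true, y_pred_proba)
print(sorted(result))
```

scikit-learn returns one more precision/recall point than thresholds (the final (recall=0, precision=1) endpoint has no threshold), which is why the three arrays differ in length by one.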
This looks a little bit funky in the docs: the line "Dictionary containing metrics used to generate a precision-recall plot, with the following keys" gets bolded the same way "Returns" and the other headers are. I think maybe using `-` for bullets might fix this, but not sure off the top of my head?
@angela97lin is there a spot in the docs I can use as a model? I can't seem to find anywhere that looks the way you describe; the other functions in graph_utils.py look the same to me.
Hmmm, not sure if there's an example in the docs currently. Let me try to create one, but otherwise this isn't a big deal at all 😆
Here, I just recreated it in a branch that I'm working on: https://evalml.featurelabs.com/en/710_target_check/generated/methods/evalml.data_checks.DetectInvalidTargetsDataCheck.validate.html#evalml.data_checks.DetectInvalidTargetsDataCheck.validate
(Adding a screenshot in case this disappears when I update my branch)
Here's the docstring I used to create this:
Returns:
    list (DataCheckError): list with DataCheckErrors if any invalid data is found in target labels.

    - abc
    - def
I think the newline between the paragraph and the bullet list is important to it rendering correctly!
That works perfectly, thanks!
Closes #792