FEA Top k accuracy metric #16625
Conversation
Handle multiclass case not multilabel
Thanks for the PR @gbolmier , a few comments ;)
Raise errors for `k`=1, `k`=`n_classes`, cover binary `y_true` as it can be a subset of the possible classes, fix doc and update tests and doctest to match changes
Thank you so much for taking the time to review @NicolasHug!
Add test for case when `y_true` = [0]*4
Thanks @gbolmier a few more.
Let's also add a small section in the User Guide!
sklearn/metrics/_ranking.py (Outdated)
@@ -1419,3 +1419,94 @@ def ndcg_score(y_true, y_score, k=None, sample_weight=None, ignore_ties=False):
    _check_dcg_target_type(y_true)
    gain = _ndcg_sample_scores(y_true, y_score, k=k, ignore_ties=ignore_ties)
    return np.average(gain, weights=sample_weight)

def top_k_accuracy_score(y_true, y_score, k=5, normalize=True):
Thinking about the default, should it be 2, i.e. the minimum value for which the function can be called? It would work in all cases; 5 would fail if `n_classes < 5`.
Agree with that!
I am not sure that we want `ValueError` here. I believe that for `k >= n_classes` we should probably raise a warning with the same message, but still return a score. No?
I think we should raise an error if `k >= n_classes`. This will always output an accuracy of 1, which will lure inexperienced users into thinking their estimator is doing great, when in reality they're just misusing the metric.
In general, we try to make misuses hard / impossible.
In general, we try to make misuses hard / impossible
Agree.
will lure inexperienced users into thinking their estimator is doing great
If they have a binary problem and they ask for top_5 accuracy then their estimator will be doing great obviously!
I dunno. IMHO a warning is enough.
If they have a binary problem and they ask for top_5 accuracy then their estimator will be doing great obviously!
Indeed. Also +1 to increase the default to at least 3. I think top_5 is good when you have a very large number of classes (e.g. ImageNet), but with 30-60 classes, which is more common with tabular data, top_3 is already quite useful.
If they have a binary problem and they ask for top_5 accuracy then their estimator will be doing great obviously!
Obvious to you, maybe not to them. Also note that we error in the binary case too.
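To illustrate the concern being debated: whenever `k >= n_classes`, every sample's true class is necessarily among the top k, so the metric degenerates to 1.0 no matter how bad the scores are (a small hedged sketch, not the PR's code):

```python
import numpy as np

# With 3 classes and k=3, the top-k set always contains every class,
# so the "accuracy" is 1.0 regardless of the predicted scores.
y_true = np.array([0, 1, 2, 1])
y_score = np.random.default_rng(0).random((4, 3))  # random scores

k = 3
top_k = np.argsort(y_score, axis=1)[:, -k:]          # k best classes per row
score = np.any(top_k == y_true[:, None], axis=1).mean()
print(score)  # 1.0, trivially
```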
Co-Authored-By: Nicolas Hug <contact@nicolas-hug.com>
Change `k` default to 2. Update docstring and error msg
…k_accuracy_metric
Don't we want a default scorer for that metric? I understand that
Add `y_true = [0]*5` case
Add `normalize == False` case
Change `k` default to 3 as it better represents the usual case. Make the `y_true` value error message more explicit. Check `y_score` number of columns. Fix the `k` check against n_classes
Should be better like that @jnothman
Thanks for your work and patience @gbolmier, a few more minor comments below otherwise looks good.
Also please add it to https://scikit-learn.org/stable/modules/model_evaluation.html#common-cases-predefined-values
Hi @jnothman, do you mind checking if your comments have been addressed? Two approvals here already. Thanks for your time.
Two approvals here! Time to merge?
Thanks @gbolmier. Sorry I've not found time to follow up!
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com> Co-authored-by: Jeremiah Johnson <jwjohnson314@gmail.com> Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
Reference Issues/PRs
Closes #10488
Fixes #10144
Fixes #8234
What does this implement/fix? Explain your changes.
This implements a top-k accuracy classification metric, for use with predicted class scores in multiclass classification settings. A prediction is considered top-k accurate if the correct class is one of the k classes with the highest predicted scores.
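The idea can be sketched in a few lines of NumPy (a hypothetical illustration of the metric's definition, not the merged scikit-learn implementation; the function name and exact validation are assumptions):

```python
import numpy as np

def top_k_accuracy(y_true, y_score, k=2, normalize=True):
    """Sketch of top-k accuracy: a sample counts as correct if its true
    class is among the k classes with the highest predicted scores."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Column indices of the k largest scores in each row.
    top_k = np.argsort(y_score, axis=1)[:, -k:]
    hits = np.any(top_k == y_true[:, None], axis=1)
    return hits.mean() if normalize else hits.sum()

y_true = np.array([0, 1, 2, 2])
y_score = np.array([[0.5, 0.2, 0.3],
                    [0.3, 0.4, 0.3],
                    [0.2, 0.4, 0.4],
                    [0.7, 0.2, 0.1]])
print(top_k_accuracy(y_true, y_score, k=2))                   # 0.75
print(top_k_accuracy(y_true, y_score, k=2, normalize=False))  # 3
```

With `normalize=True` the fraction of top-k-correct samples is returned; with `normalize=False` the raw count, mirroring the `normalize` convention of `accuracy_score`.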