DOC Separate predefined scorer names from the ones requiring make_scorer #28750
One last thing before merging: I think that we should move the example below above the table that you added, because it does not rely on `make_scorer`. However, we could add a similar example to show that we can pass a scorer object obtained via `make_scorer`. That way, we have a usage example for each configuration.
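For readers following the thread, the two configurations mentioned above can be sketched as follows. This is only an illustration under assumptions: the dataset, the estimator, and the choice of `"accuracy"` and `fbeta_score` with `beta=2` are picked for the example and are not taken from the PR diff.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import cross_val_score

# Toy data and estimator, chosen only for illustration.
X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Configuration 1: a predefined scorer name passed as a string.
scores_named = cross_val_score(clf, X, y, cv=5, scoring="accuracy")

# Configuration 2: a metric that needs a parameter (here beta),
# wrapped into a scorer object with make_scorer.
ftwo_scorer = make_scorer(fbeta_score, beta=2)
scores_custom = cross_val_score(clf, X, y, cv=5, scoring=ftwo_scorer)

print(scores_named.mean(), scores_custom.mean())
```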
We had some back and forth already about the exact position of the table. I feel if it is below the example, and just above "Defining your scoring strategy from metric functions" section, then it might as well be within that section. Perhaps @ogrisel can chime in too?
And if I'm understanding correctly what you mean by "similar example to show that we can pass a scorer object obtained via make_scorer", then that example is in the 3.3.1.2. section already too?
So the example right now is:
It feels strange that it comes right after the table mentioning that you should use `make_scorer`.
Would it not be more friendly to have:
It might be slightly redundant with the section below but this is just a short example usage.
But indeed @ogrisel may have another opinion, so let's see what he thinks.
Agree that this would be a more logical way to structure it, and yes, also agree on the redundancy since the next section talks about make_scorer in more detail and gives that exact same example.
Alright, let's move the second table back to the beginning of the "3.3.1.2. Defining your scoring strategy from metric functions" section. We can start the section with a short sentence such as the one you wrote:
(without link to the section since we are already in it).
Then you can simplify the paragraph that starts with "Many metrics are not given names to ..." to remove redundancies, but keep introducing the usage example with the snippet that shows how to combine make_scorer with user-settable parameters for a grid search.
Then, after the example snippet, finally insert the paragraph with the two bullet points:
This is not as important as the rest, so I would move it to the end of the section, but it is still useful to cover the naming conventions of the metric functions and how to set the `greater_is_better` keyword parameter in a consistent way.
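As a rough sketch of what such a section could illustrate (not the exact snippet from the documentation), the code below combines `make_scorer` with a user-settable metric parameter in a grid search, then wraps an error metric using the `greater_is_better` keyword of `make_scorer`. The datasets, estimators, and parameter grid are assumptions made for the example.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import fbeta_score, make_scorer, mean_absolute_error
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# 1) A metric with a user-settable parameter (beta) used in a grid search.
X, y = make_classification(n_samples=200, random_state=0)
ftwo_scorer = make_scorer(fbeta_score, beta=2)
grid = GridSearchCV(
    LinearSVC(dual=False),
    param_grid={"C": [1, 10]},
    scoring=ftwo_scorer,
    cv=5,
)
grid.fit(X, y)

# 2) Naming convention: functions ending in _error or _loss return a value
# to minimize, so they are wrapped with greater_is_better=False. The
# resulting scorer negates the metric so that higher is still better.
X_reg, y_reg = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)
reg = LinearRegression().fit(X_reg, y_reg)
neg_mae_scorer = make_scorer(mean_absolute_error, greater_is_better=False)
neg_mae = neg_mae_scorer(reg, X_reg, y_reg)
mae = mean_absolute_error(y_reg, reg.predict(X_reg))

print(grid.best_params_, neg_mae, mae)
```

The sign flip in the second part is what lets model selection tools treat every scorer uniformly as "higher is better".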
I like this proposal.
I think we are missing `d2_absolute_error_score` here. But this PR made me wonder if we should actually accept it as a named scorer. Thoughts on this, @glemaitre, @ogrisel?
Of course, that would be done in another PR, as this one already looks in good shape.
Yep, this would be for another PR. I'm not sure that there is a meaningful default for all use cases, and it might be better to make sure that the user chooses the parameter each time.
@ogrisel will have better insights on these metrics indeed.
There was a discussion in: #28750 (comment)
I think it was removed because you do not have to provide a parameter to `make_scorer`. I also first thought that `d2_absolute_error_score` got forgotten. Maybe that is a sign we should add a sentence (in a future PR?) to the table to say "there are more metrics, this is just a subset"?
The thing is that `d2_absolute_error_score` is already equivalent to `make_scorer(d2_pinball_score, alpha=0.5)`. Accepting it as a named scorer (a string to be passed to `scoring`) is redundant but could be a common enough practice to be worth the shortcut. Indeed, its presence in model_evaluation.rst L105 makes me think it was meant to be the case, but the OP from #22118 forgot to add it to the `_SCORERS` dict.
Let's do this in a dedicated follow-up PR (and document it as part of the first table for named scorers that do not require setting parameters).