[MRG] Support for list of lists or list of arrays multilabel indicator (continuation) #14865
Yes, let's do this!
@jnothman, fixed it
Thanks, it's looking pretty good.
Please add an Enhancement entry to the change log at doc/whats_new/v0.22.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:
Force-pushed from 0d3797a to e18e836.
Please append commits rather than force pushing. Much easier to see changes between reviews that way.
doc/whats_new/v0.22.rst
  :pr:`14336` by :user:`Gregory Dexter <gdex1>`.

- |Enhancement| label_binarize now supports list of lists for multilabel data
Might be more helpful to advertise the changes in terms of how they affect metrics and other users of type_of_target
How would you put it? One entry per impacted function, or a more generic explanation of what is considered multiclass-multioutput/multilabel-indicator?
I've not looked at the code or codecov's complaint this time.
I think this mostly affects sklearn.metrics, so commenting there that "multilabel metrics now accept ..." would do; or add a "multiple modules" section to the change log.
Thanks for your help and patience @jnothman, you're the victim of my first contribution to scikit-learn.
I'm very happy you're finishing this off. It's something we should have done years ago, having removed sequence of sequence support around 2015.
It looks like you've introduced a syntax error. And it's worth looking at the codecov error.
Maybe merging in the latest master will fix the codecov issue.
Thanks @leonardbinet , only super nitpicks from me
The PR has merge conflicts but apart from that LGTM
@@ -227,7 +228,7 @@ def type_of_target(y):
     >>> type_of_target(np.array([[1, 2], [3, 1]]))
     'multiclass-multioutput'
     >>> type_of_target([[1, 2]])
-    'multiclass-multioutput'
+    'multilabel-indicator'
question: is this a bugfix?
It is more-or-less a bug fix. Either way, type_of_target is expected to return "the most specific type that can be inferred". Multilabel is a specific subtype of multiclass-multioutput. So this is correct
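To make the "most specific type" reasoning concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: `toy_type_of_target` is a hypothetical helper, and its rule (a 2D target with at most two distinct values counts as an indicator matrix) is an assumption chosen only to reproduce the two doctest cases in the diff above.

```python
def toy_type_of_target(y):
    """Toy classifier for 2D targets; NOT scikit-learn's implementation.

    Assumption: a 2D target with two or more columns and at most two
    distinct values is reported as the more specific 'multilabel-indicator'.
    Any other 2D target with two or more columns falls back to the more
    general 'multiclass-multioutput'.
    """
    rows = [list(row) for row in y]
    if not rows or len(rows[0]) < 2:
        raise ValueError("this sketch only handles 2D targets with 2+ columns")
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("all rows must have the same length")

    # Collect every distinct value that appears anywhere in the target.
    distinct = {value for row in rows for value in row}

    # Prefer the specific subtype when the data allows it.
    if len(distinct) <= 2:
        return "multilabel-indicator"
    return "multiclass-multioutput"


print(toy_type_of_target([[1, 2], [3, 1]]))  # three distinct values
print(toy_type_of_target([[1, 2]]))          # two distinct values
```

Under this assumption, `[[1, 2]]` is compatible with both types, so the sketch reports the more specific one, which is the behavior the updated doctest describes.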
@leonardbinet please resolve conflicts with master and address @NicolasHug's teeny weeny comments so that we can merge this for release. If that's looking unlikely in the next week or so, please let us know.
@jnothman @NicolasHug sure, thanks for your feedback. I'll fix this this weekend.
Force-pushed from f0d8b94 to a71cac1.
@leonardbinet, force pushing makes it much harder for reviewers to track what changes have occurred since last review.
Thank you @leonardbinet, @herilalaina and @srivatsan-ramesh!!
Reference Issues/PRs
Rebase of #9158 to solve #7931