Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX Allow sparse input data for OutputCodeClassifier #17233

Merged
merged 3 commits into from May 26, 2020
Merged

FIX Allow sparse input data for OutputCodeClassifier #17233

merged 3 commits into from May 26, 2020

Conversation

zoj613
Copy link
Contributor

@zoj613 zoj613 commented May 15, 2020

Reference Issues/PRs

Fixes #17218

What does this implement/fix? Explain your changes.

This PR fixes a TypeError bug that occurs when fitting OutputCodeClassifier using sparse input data.
As suggested by @glemaitre I added the allow_sparse=True keyword argument to _validate_data and check_array functions inside the methods fit and predict. This allows the base estimator to handle the data check and throw the appropriate exception if sparse input is not allowed. I added a test case as well.

@glemaitre glemaitre added this to TO REVIEW in Guillaume's pet May 18, 2020
@zoj613 zoj613 changed the base branch from master to 0.23.X May 18, 2020 21:32
@zoj613 zoj613 changed the base branch from 0.23.X to master May 18, 2020 21:32
@zoj613 zoj613 requested a review from glemaitre May 18, 2020 22:07
Copy link
Contributor

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 small changes (more for the style I should admit).
It remains a thing. Could you add an entry in doc/whats_new/0.24.rst in a section sklearn.multiclass and state this change as a |Fix|. You should credit yourself and this PR as well.

sklearn/tests/test_multiclass.py Outdated Show resolved Hide resolved
sklearn/tests/test_multiclass.py Outdated Show resolved Hide resolved
Copy link
Contributor

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We will need a second review on this.
@rth @jeremiedbb @jnothman @thomasjpfan @NicolasHug @adrinjalali can you give a look at it.

@glemaitre glemaitre moved this from TO REVIEW to TO BE MERGED in Guillaume's pet May 20, 2020
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise, thanks @zoj613

sklearn/tests/test_multiclass.py Outdated Show resolved Hide resolved
sklearn/tests/test_multiclass.py Show resolved Hide resolved
@glemaitre glemaitre self-requested a review May 26, 2020 10:08
@glemaitre
Copy link
Contributor

I merge master where we add some changes that simplified the tests.
I slightly modify the test using the new features from CheckingClassifier.

If the tests pass I will merge it.

@glemaitre glemaitre changed the title [MRG] Allow sparse input data for OutputCodeClassifier FIX Allow sparse input data for OutputCodeClassifier May 26, 2020
@glemaitre glemaitre merged commit 6b68144 into scikit-learn:master May 26, 2020
@glemaitre
Copy link
Contributor

Thanks @zoj613

@glemaitre glemaitre moved this from TO BE MERGED to MERGED in Guillaume's pet May 27, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

OutputCodeClassifier does not work with sparse input data
3 participants