Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Raise error in OneHotEncoder.inverse_transform #14982

Merged

Conversation

kwinata
Copy link
Contributor

@kwinata kwinata commented Sep 14, 2019

Reference Issues/PRs

Fixes #14934

What does this implement/fix? Explain your changes.

Add case for throwing exception as expected.

Any other comments?

NA

@kwinata kwinata force-pushed the fix_inverse_transform_OneHotEncoder branch 2 times, most recently from c51565d to d9bce70 Compare September 14, 2019 11:11
@kwinata kwinata changed the title Raise error in OneHotEncoder.inverse_transform [MRG] Raise error in OneHotEncoder.inverse_transform Sep 14, 2019
@kwinata
Copy link
Contributor Author

kwinata commented Sep 25, 2019

Hi @jnothman, please review my PR :)

@cmarmo
Copy link
Member

cmarmo commented May 20, 2020

Hi @kwinata, please, forgive us! Your PR clearly needed more attention!
@NicolasHug, @thomasjpfan maybe you are available for a review despite the conflicts? Thanks a lot!
@kwinata if you are still interested in working on it, could you please resolve the conflicts? Sorry again!

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kwinata !

sklearn/preprocessing/tests/test_encoders.py Outdated Show resolved Hide resolved
sklearn/preprocessing/tests/test_encoders.py Outdated Show resolved Hide resolved
dropped = np.asarray(sub.sum(axis=1) == 0).flatten()
if dropped.any():
X_tr[dropped, i] = self.categories_[i][self.drop_idx_[i]]
if self.drop is None:
raise ValueError("Unknown value in row {}".format(i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i is the feature index not the row. The rows are in np.flatnonzero(dropped). The message can be more explicit:

The follow samples do not have an inverse when drop=None and handle_unknown='error' because they are all zeros: [0, 1, 2].

X_tr[dropped, i] = self.categories_[i][self.drop_idx_[i]]
if self.drop is None:
raise ValueError("Unknown value in row {}".format(i))
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I prefer to remove the else to un-indent the X_tr[dropped, ...] line

@kwinata kwinata force-pushed the fix_inverse_transform_OneHotEncoder branch 3 times, most recently from 12eb07d to b64a96d Compare September 4, 2020 09:22
sklearn/preprocessing/_encoders.py Outdated Show resolved Hide resolved
sklearn/preprocessing/tests/test_encoders.py Outdated Show resolved Hide resolved
sklearn/preprocessing/tests/test_encoders.py Outdated Show resolved Hide resolved
sklearn/preprocessing/tests/test_encoders.py Outdated Show resolved Hide resolved
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an entry to the change log at doc/whats_new/v0.24.rst with tag |Fix|. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.

@kwinata kwinata force-pushed the fix_inverse_transform_OneHotEncoder branch from dee5876 to 0a03d2e Compare September 6, 2020 14:56
@kwinata
Copy link
Contributor Author

kwinata commented Sep 6, 2020

Please add an entry to the change log at doc/whats_new/v0.24.rst with tag |Fix|. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.

Hi @thomasjpfan , I've added the changelog. Sorry for the messy force-pushes 🙏

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
Copy link
Member

@cmarmo cmarmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kwinata!

@cmarmo
Copy link
Member

cmarmo commented Oct 8, 2020

Thanks for your patience @kwinata! Do you mind fixing conflicts? Then maybe @glemaitre could have a look? Thanks!

@kwinata
Copy link
Contributor Author

kwinata commented Oct 13, 2020

@cmarmo I've fixed the merge conflict. (@glemaitre )

@glemaitre glemaitre self-requested a review October 15, 2020 07:57
Copy link
Contributor

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the conflict and just isolated the test in a specific test to have a link to the original issue.
I will wait that the CIs are passing (apart of the MacOS that is currently failing for some other reasons).

@glemaitre glemaitre merged commit 5d5329c into scikit-learn:master Oct 19, 2020
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

inverse_transform of the OneHotEncoder: specified behaviour is not default
4 participants