Skip to content

ENH Add 'warn' option to 'handle_unknown' parameter in OneHotEncoder#28637

Merged
OmarManzoor merged 11 commits into
scikit-learn:mainfrom
glevv:ohe-warn
Oct 11, 2024
Merged

ENH Add 'warn' option to 'handle_unknown' parameter in OneHotEncoder#28637
OmarManzoor merged 11 commits into
scikit-learn:mainfrom
glevv:ohe-warn

Conversation

@glevv

@glevv glevv commented Mar 15, 2024

Copy link
Copy Markdown
Contributor

Reference Issues/PRs

Fixes #28619. See also #16881

What does this implement/fix? Explain your changes.

Added 'warn' option to OneHotEncoder that behaves same as option 'ignore', but raises a warning for every unknown category.

Any other comments?

@github-actions

github-actions Bot commented Mar 15, 2024

Copy link
Copy Markdown

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9c739be. Link to the linter CI: here

@glevv glevv marked this pull request as ready for review March 15, 2024 19:54
@glevv glevv changed the title [ENH] Add 'warn' option to OneHotEncoder ENH Add 'warn' option to 'handle_unknown' parameter in OneHotEncoder Mar 16, 2024

@ogrisel ogrisel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I would rather make handle_unknown="warn" behave like handle_unknown="infrequent_if_exist" instead of handle_unknown="ignore". I think this would give more control to the user.

Note that when #28619 was originally open, handle_unknown="infrequent_if_exist" did not exist yet.

Comment thread sklearn/preprocessing/_encoders.py Outdated
@glevv glevv requested review from glemaitre and ogrisel April 15, 2024 06:37
@glevv

glevv commented May 7, 2024

Copy link
Copy Markdown
Contributor Author

@glemaitre @ogrisel
if you would have some spare time to review the PR

@glemaitre glemaitre added this to the 1.6 milestone May 18, 2024
@glemaitre

Copy link
Copy Markdown
Member

Let me add the milestone for the next release and prioritize this feature.

@glemaitre glemaitre left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the PR. I just quickly pushed some style changes and move the entry in 1.6 without requesting any changes because it was straightforward.

@ogrisel you might want to have a look since you already request a first change. This one is pretty straightforward.

Thanks @glevv

@glemaitre

Copy link
Copy Markdown
Member

ping @OmarManzoor or @Charlie-XIAO maybe?

@OmarManzoor OmarManzoor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @glevv

@OmarManzoor OmarManzoor merged commit e760a3e into scikit-learn:main Oct 11, 2024
@glemaitre

Copy link
Copy Markdown
Member

Thanks @OmarManzoor

@glevv glevv deleted the ohe-warn branch October 13, 2024 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

Add an option handle_unknown="warn" in OneHotEncoder

4 participants