Include boolean as a categorical type when selecting Oversampler #2980

Merged
merged 5 commits into main from smotenc_cat_error on Oct 28, 2021

@eccabay eccabay commented Oct 27, 2021

Closes #2947

Fixes an error where data containing exclusively categorical and boolean columns caused our oversampler selection code to pick SMOTENC when it should have picked SMOTEN. This raised the error "SMOTE-NC is not designed to work only with categorical features".
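For readers without the diff open, here is a minimal sketch of the selection rule described above. It is not the code in oversampler.py: the function name `select_oversampler`, the type labels, and the string return values are all hypothetical, chosen only to illustrate why boolean columns need to count as categorical when choosing between SMOTE, SMOTENC, and SMOTEN.

```python
# Hypothetical sketch of the selection rule; not evalml's actual implementation.
# Type labels treated as "categorical-like". Counting "boolean" here is the
# point of the fix: without it, an all-categorical-plus-boolean dataset looks
# "mixed" and is routed to SMOTENC, which rejects purely categorical input.
CATEGORICAL_LIKE = {"categorical", "boolean"}


def select_oversampler(column_types):
    """Return the name of the SMOTE variant suited to the given columns.

    column_types: dict mapping column name -> type label (assumed labels).
    """
    if not column_types:
        raise ValueError("at least one column is required")

    is_categorical = [
        label.lower() in CATEGORICAL_LIKE for label in column_types.values()
    ]
    if all(is_categorical):
        return "SMOTEN"   # every column is categorical or boolean
    if any(is_categorical):
        return "SMOTENC"  # mix of categorical-like and numeric columns
    return "SMOTE"        # numeric columns only
```

With this rule, `select_oversampler({"color": "categorical", "is_new": "boolean"})` selects SMOTEN rather than SMOTENC, which is the case reported in #2947.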

codecov bot commented Oct 27, 2021

Codecov Report

Merging #2980 (0c41c99) into main (ceeb0d9) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2980     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        307     307             
  Lines      29244   29257     +13     
=======================================
+ Hits       29153   29166     +13     
  Misses        91      91             
| Impacted Files | Coverage Δ |
|---|---|
| ...es/components/transformers/samplers/oversampler.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_oversampler.py | 100.0% <100.0%> (ø) |

@eccabay eccabay marked this pull request as ready for review October 27, 2021 21:00

@angela97lin angela97lin left a comment

This looks good to me, good catch! Thank you 🙏

@freddyaboulton freddyaboulton left a comment

@eccabay Thank you for the quick fix! I think we just need "Fixes #2947" in the PR description so that the original issue is closed out.

@chukarsten chukarsten left a comment

Great, quick fix. Thanks!!

@eccabay eccabay merged commit 6b204dc into main Oct 28, 2021
@eccabay eccabay deleted the smotenc_cat_error branch October 28, 2021 18:04
@chukarsten chukarsten mentioned this pull request Nov 9, 2021

Successfully merging this pull request may close these issues.

SMOTENC Oversampler fix
4 participants