Fix to_encode option in encode_features #1123

Merged 8 commits on Sep 4, 2020

@rwedge rwedge commented Aug 26, 2020

Fixes #1115

encode_features has an option, to_encode, that lets users specify a list of features they want encoded. encode_features was only partially honoring to_encode: it created encoded features only from the features listed in to_encode, but it converted all features to a numeric pandas dtype instead of just the encoded ones.
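The intended behaviour can be illustrated with a small pandas-only sketch. It is not the featuretools implementation: `encode_selected`, the sample columns, and the `"<column> = <value>"` naming are hypothetical and chosen only for illustration. Only the columns listed in `to_encode` are one-hot encoded and given a numeric dtype; every other column keeps its original dtype.

```python
import pandas as pd


def encode_selected(df, to_encode):
    """Hypothetical helper: one-hot encode only the columns in `to_encode`."""
    out = df.copy()
    for col in to_encode:
        for value in df[col].dropna().unique():
            # The numeric dtype is applied to the newly created columns only.
            out[f"{col} = {value}"] = (df[col] == value).astype("int64")
        out = out.drop(columns=[col])
    return out


df = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "signup": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
})
result = encode_selected(df, to_encode=["color"])
```

In this sketch `signup` is not listed in `to_encode`, so it is left as a datetime column rather than being pushed through a numeric conversion.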

@codecov

codecov bot commented Aug 26, 2020

Codecov Report

Merging #1123 into main will decrease coverage by 0.01%.
The diff coverage is 93.93%.

@@            Coverage Diff             @@
##             main    #1123      +/-   ##
==========================================
- Coverage   98.37%   98.36%   -0.02%     
==========================================
  Files         126      126              
  Lines       13466    13477      +11     
==========================================
+ Hits        13247    13256       +9     
- Misses        219      221       +2     
| Impacted Files | Coverage Δ |
| --- | --- |
| featuretools/synthesis/encode_features.py | 96.05% <92.59%> (-2.52%) ⬇️ |
| ...aturetools/tests/synthesis/test_encode_features.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 804df26...283880a. Read the comment docs.

try:
    new_X[c] = pd.to_numeric(new_X[c], errors='raise')
except (TypeError, ValueError):
    pass
Contributor

If we can't figure out why this is here or whether it's still necessary, maybe we should add a warning, so that we can at least tell when it does end up raising this error instead of quietly passing.

Contributor

If we do add a warning, I think it'd be good to also test the warning to increase code coverage.
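A minimal sketch of the two suggestions in this thread, assuming a hypothetical helper `to_numeric_or_warn` that wraps the conversion (it is not part of featuretools): the except branch emits a `UserWarning` instead of passing silently, and a `unittest` test case checks that the warning fires.

```python
import unittest
import warnings

import pandas as pd


def to_numeric_or_warn(series):
    """Hypothetical helper: convert to numeric, warning when conversion fails."""
    try:
        return pd.to_numeric(series, errors="raise")
    except (TypeError, ValueError) as exc:
        warnings.warn(
            f"Could not convert {series.name!r} to numeric: {exc}", UserWarning
        )
        return series  # leave the column unchanged, as the original `pass` did


class TestToNumericOrWarn(unittest.TestCase):
    def test_warns_when_conversion_fails(self):
        letters = pd.Series(["a", "b"], name="letters")
        with self.assertWarns(UserWarning):
            result = to_numeric_or_warn(letters)
        self.assertEqual(result.tolist(), ["a", "b"])

    def test_converts_numeric_strings(self):
        digits = pd.Series(["1", "2"], name="digits")
        self.assertEqual(to_numeric_or_warn(digits).tolist(), [1, 2])
```

Testing the warning this way would also exercise the except branch, which is the part that lowers coverage.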

Contributor Author

Our current tests don't cover this except block - we would need to find an example where this could occur
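One way to look for such an example is to probe `pd.to_numeric` directly with a few candidate columns and record which exception, if any, each one raises. The sketch below uses a hypothetical helper `conversion_error` and made-up sample data. Non-numeric strings are a case that raises `ValueError`; the list-valued column is only a guess at an input that might reach the `TypeError` path.

```python
import pandas as pd


def conversion_error(series):
    """Hypothetical helper: name of the exception pd.to_numeric raises, or None."""
    try:
        pd.to_numeric(series, errors="raise")
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


candidates = {
    "numeric_strings": pd.Series(["1", "2"]),
    "text": pd.Series(["a", "b"]),
    "lists": pd.Series([[1], [2]]),  # assumption: may exercise the TypeError path
}
outcomes = {name: conversion_error(col) for name, col in candidates.items()}
```

Any candidate that maps to an exception name is an input that would reach the except block.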

Contributor Author

In the interest of getting this fix into today's release, I'm open to doing a separate issue / PR cycle on the usefulness of this block / how we want to handle the exceptions.

Contributor

@frances-h frances-h left a comment

lgtm 👍

@rwedge rwedge merged commit 5e929e2 into main Sep 4, 2020
@rwedge rwedge deleted the issue-1115-rw branch September 4, 2020 23:20
@tamargrey tamargrey mentioned this pull request Sep 8, 2020
Development

Successfully merging this pull request may close these issues.

featuretools, encode_features, date type column, wrong output issue
4 participants