Fix bug with encode_features and features that create multiple columns #622

Merged
rwedge merged 5 commits into master from fix-encode-multioutput on Jul 2, 2019


@rwedge rwedge commented Jun 21, 2019

Pull Request Description

Fixes issue #621

Features that create multiple columns in the feature matrix aren't handled by encode_features. This PR fixes a bug that caused some columns of a multi-output feature to be added to the column list twice, producing duplicate columns in the resulting matrix.
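To make the symptom concrete, here is a minimal sketch of how a column list with repeated entries could be detected. It uses only the standard library; the helper name `duplicated_columns` and the sample column names are hypothetical and are not part of featuretools.

```python
from collections import Counter


def duplicated_columns(columns):
    """Return the names that occur more than once, in first-seen order.

    Hypothetical helper for illustration only; not a featuretools function.
    """
    counts = Counter(columns)  # Counter keeps first-insertion order of keys
    return [name for name in counts if counts[name] > 1]


# Made-up column list in which one output of a multi-output feature repeats.
columns = ["f[0]", "f[1]", "g", "f[0]"]
print(duplicated_columns(columns))
```

A check along these lines is one way a regression test could assert that no column name appears twice in the encoded matrix.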


After creating the pull request: in order to pass the changelog_updated check you will need to update the "Future Release" section of docs/source/changelog.rst to include this pull request.


codecov bot commented Jun 21, 2019

Codecov Report

Merging #622 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #622      +/-   ##
==========================================
+ Coverage   97.43%   97.43%   +<.01%     
==========================================
  Files         118      118              
  Lines        9534     9535       +1     
==========================================
+ Hits         9289     9290       +1     
  Misses        245      245

| Impacted Files | Coverage Δ |
| --- | --- |
| featuretools/synthesis/encode_features.py | 98.33% <100%> (ø) ⬆️ |
| ...aturetools/tests/synthesis/test_encode_features.py | 100% <100%> (ø) ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0257f52...190c772.

@rwedge rwedge requested a review from CJStadler July 2, 2019 14:07

@CJStadler CJStadler left a comment

Looks good to me!

@@ -74,7 +74,7 @@ def encode_features(feature_matrix, features, top_n=10, include_unknown=True,
             assert fname in X.columns, (
                 "Feature %s not found in feature matrix" % (fname)
             )
-        feature_names.append(fname)
+            feature_names.append(fname)
🐍 + ⬜️ 👾 = 🐛
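
The hunk above changes only the indentation of one `append` call. As a rough illustration of why that matters, the sketch below compares the two placements on a toy feature. It assumes the old line sat one level outside the loop over column names, which is inferred from the "fix indent" commit rather than shown in the diff. `ToyFeature` and both `collect_names_*` functions are hypothetical stand-ins, not featuretools code.

```python
class ToyFeature:
    """Hypothetical stand-in for a feature that yields one or more columns."""

    def __init__(self, names):
        self._names = list(names)

    def get_feature_names(self):
        return list(self._names)


def collect_names_dedented(features):
    """Append outside the inner loop: runs once per feature."""
    feature_names = []
    for feature in features:
        for fname in feature.get_feature_names():
            pass  # per-column validation would happen here
        feature_names.append(fname)  # records only the last column name
    return feature_names


def collect_names_indented(features):
    """Append inside the inner loop: runs once per output column."""
    feature_names = []
    for feature in features:
        for fname in feature.get_feature_names():
            feature_names.append(fname)
    return feature_names


features = [ToyFeature(["a"]), ToyFeature(["b[0]", "b[1]", "b[2]"])]
print(collect_names_dedented(features))  # ['a', 'b[2]']
print(collect_names_indented(features))  # ['a', 'b[0]', 'b[1]', 'b[2]']
```

With the dedented placement, the other columns of a multi-output feature never reach the name list. Any later step that treats unlisted columns as extras could then pick them up a second time, which would be consistent with the duplicates described in the PR, though the exact downstream path is not shown here.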

@rwedge rwedge merged commit 7203f2d into master Jul 2, 2019
@rwedge rwedge deleted the fix-encode-multioutput branch July 2, 2019 15:16
@rwedge rwedge mentioned this pull request Jul 3, 2019
johnnyheineken pushed a commit to johnnyheineken/featuretools that referenced this pull request Jul 7, 2019
alteryx#622)

* fix indent

* tests for duplicate columns

* update changelog

* Update changelog.rst