Fix bug with encode_features and features that create multiple columns #622

Merged
merged 5 commits into from Jul 2, 2019

2 participants
@rwedge
Contributor

commented Jun 21, 2019

Pull Request Description

Fixes issue #621

Features that create multiple columns in the feature matrix aren't handled by encode_features. This PR fixes a bug that caused some columns of a feature with multiple outputs to be added to the column list twice, resulting in duplicate columns in the resulting matrix.
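As a rough illustration of the symptom (not the featuretools source), the sketch below collects the column names of a multi-output feature exactly once and then checks the list for duplicates. `MultiOutputFeature`, `collect_column_names`, `find_duplicates`, and the `name__i` naming pattern are all hypothetical names made up for this example; only the assertion message mirrors the one visible in the diff.

```python
from collections import Counter


class MultiOutputFeature:
    """Hypothetical stand-in for a feature that yields one or more columns."""

    def __init__(self, base_name, n_outputs):
        self.base_name = base_name
        self.n_outputs = n_outputs

    def get_feature_names(self):
        # Assumed naming scheme, for illustration only.
        if self.n_outputs == 1:
            return [self.base_name]
        return ["%s__%d" % (self.base_name, i) for i in range(self.n_outputs)]


def collect_column_names(features, matrix_columns):
    """Collect every output column name once, in feature order."""
    names = []
    for feature in features:
        for fname in feature.get_feature_names():
            assert fname in matrix_columns, (
                "Feature %s not found in feature matrix" % (fname)
            )
            # Appending inside the inner loop records each column once.
            names.append(fname)
    return names


def find_duplicates(names):
    """Return the sorted names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


features = [MultiOutputFeature("a", 1), MultiOutputFeature("b", 3)]
columns = ["a", "b__0", "b__1", "b__2"]
names = collect_column_names(features, columns)
print(find_duplicates(names))
```

A regression test in this spirit would assert that `find_duplicates` returns an empty list for the column names of the encoded matrix.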


After creating the pull request: in order to pass the changelog_updated check you will need to update the "Future Release" section of docs/source/changelog.rst to include this pull request.

rwedge added some commits Jun 21, 2019

@codecov

commented Jun 21, 2019

Codecov Report

Merging #622 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #622      +/-   ##
==========================================
+ Coverage   97.43%   97.43%   +<.01%     
==========================================
  Files         118      118              
  Lines        9534     9535       +1     
==========================================
+ Hits         9289     9290       +1     
  Misses        245      245

Impacted Files | Coverage Δ
featuretools/synthesis/encode_features.py | 98.33% <100%> (ø) ⬆️
...aturetools/tests/synthesis/test_encode_features.py | 100% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0257f52...190c772.

@rwedge rwedge requested a review from CJStadler Jul 2, 2019

@CJStadler
Contributor

left a comment

Looks good to me!

@@ -74,7 +74,7 @@ def encode_features(feature_matrix, features, top_n=10, include_unknown=True,
 assert fname in X.columns, (
 "Feature %s not found in feature matrix" % (fname)
 )
-feature_names.append(fname)
+feature_names.append(fname)

CJStadler Jul 2, 2019

Contributor

🐍 + ⬜️ 👾 = 🐛

@rwedge rwedge merged commit 7203f2d into master Jul 2, 2019

4 checks passed

codecov/patch 100% of diff hit (target 97.43%)
codecov/project 97.43% (+<.01%) compared to 0257f52
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@rwedge rwedge deleted the fix-encode-multioutput branch Jul 2, 2019

@rwedge rwedge referenced this pull request Jul 3, 2019

Merged

v0.9.1 #640

johnnyheineken pushed a commit to johnnyheineken/featuretools that referenced this pull request Jul 7, 2019

Fix bug with encode_features and features that create multiple columns (Featuretools#622)

* fix indent

* tests for duplicate columns

* update changelog

* Update changelog.rst