Allow where clauses on direct features in Deep Feature Synthesis #279

kmax12 merged 4 commits into master from interesting-values-direct-features on Oct 15, 2018


kmax12 commented Oct 7, 2018

DFS will automatically add where clauses to aggregation features based on the values in the interesting_values property of another variable within that entity.

This PR allows DFS to add where clauses using the interesting values of a direct feature. To accomplish this, I added a variable property to direct features; previously it was defined only for identity features.
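
As a rough illustration of the idea (not Featuretools' actual implementation), the sketch below uses plain Python to count log rows per session, conditioned on a value looked up from the parent products table. The table contents and the helper names `direct_feature` and `count_where` are hypothetical and only loosely modeled on the e-commerce test fixture.

```python
from collections import defaultdict

# Hypothetical toy data: "products" is the parent entity, "log" the child.
products = {
    "car": {"department": "electronics"},
    "toothpaste": {"department": "food"},
    "coke zero": {"department": "food"},
}
log = [
    {"id": 0, "session_id": 1, "product_id": "car"},
    {"id": 1, "session_id": 1, "product_id": "toothpaste"},
    {"id": 2, "session_id": 2, "product_id": "coke zero"},
    {"id": 3, "session_id": 2, "product_id": "toothpaste"},
]


def direct_feature(row, parent_table, key, column):
    """Look up a parent-entity column for a child row (a 'direct feature')."""
    return parent_table[row[key]][column]


def count_where(rows, group_key, condition):
    """Count rows per group, keeping only rows that satisfy the condition."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[group_key]] += 0  # make sure every group appears
        if condition(row):
            counts[row[group_key]] += 1
    return dict(counts)


# Values a user might mark as interesting on products.department (assumed).
interesting_values = ["food", "electronics"]

# One conditional aggregation per interesting value of the direct feature.
features = {}
for value in interesting_values:
    name = f"COUNT(log WHERE products.department = {value})"
    features[name] = count_where(
        log,
        "session_id",
        lambda row, v=value: direct_feature(
            row, products, "product_id", "department"
        ) == v,
    )

for name, per_session in features.items():
    print(name, per_session)
```

Each interesting value of the parent column yields one conditional count per session, which is the kind of feature this change lets DFS generate when the condition is on a direct feature rather than on a variable stored in the child entity itself.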

First reported by @favstats on Stack Overflow.

codecov-io commented Oct 7, 2018

Codecov Report

Merging #279 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #279      +/-   ##
+ Coverage   94.45%   94.45%   +<.01%     
  Files          71       71              
  Lines        7698     7705       +7     
+ Hits         7271     7278       +7     
  Misses        427      427

Impacted Files                      Coverage Δ
featuretools/tests/testing_utils/   87.4% <ø> (ø) ⬆️
featuretools/synthesis/             93.29% <100%> (+0.01%) ⬆️
featuretools/primitives/            95.83% <100%> (+0.37%) ⬆️
...ols/tests/dfs_tests/             98.45% <100%> (ø) ⬆️
...sts/feature_function_tests/      100% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fcb9cda...b1e7f9a.

favstats commented Oct 7, 2018

This fixed my issue! Thank you so much for the very quick help, this is really amazing!

kmax12 requested a review from rwedge October 10, 2018 21:49
@@ -34,6 +34,8 @@ def make_ecommerce_files(with_integer_time_index=False, base_path=None, file_loc
product_df = pd.DataFrame({'id': ['Haribo sugar-free gummy bears', 'car',
'toothpaste', 'brown bag', 'coke zero',
'taco clock'],
'department': ["food", "electronics", "food",
Maybe "health" for the toothpaste's department type?

@@ -544,6 +549,7 @@ def test_where_different_base_feats(es):
assert hashed not in where_feats

# TODO: not clear what this tests
Should we add this as a backlog issue?

kmax12 commented Oct 15, 2018

@rwedge I addressed your comments. Does this look good to merge?

rwedge commented Oct 15, 2018

Looks good

kmax12 merged commit fcc93e7 into master Oct 15, 2018
gsheni deleted the interesting-values-direct-features branch October 24, 2018 15:37
rwedge mentioned this pull request Oct 31, 2018
4 participants