Skip to content

Allow user to pass in desired feature return types#372

Merged
kmax12 merged 19 commits intoalteryx:masterfrom
RogerTangos:feature-return-types
Feb 6, 2019
Merged

Allow user to pass in desired feature return types#372
kmax12 merged 19 commits intoalteryx:masterfrom
RogerTangos:feature-return-types

Conversation

@RogerTangos
Copy link
Copy Markdown
Contributor

@RogerTangos RogerTangos commented Jan 15, 2019

  • dfs() can be passed a list of allowed_variable_types, None or 'all'. This filters returned features based on their output types.
  • None defaults to [Numeric, Discrete, Boolean]
  • Docstrings updated accordingly
  • Completes work begun (but unexposed in the api) in ff70cbe, but
  • Tests changes

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 15, 2019

Codecov Report

Merging #372 into master will increase coverage by 0.09%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #372      +/-   ##
==========================================
+ Coverage   95.99%   96.09%   +0.09%     
==========================================
  Files          92       92              
  Lines        7996     8021      +25     
==========================================
+ Hits         7676     7708      +32     
+ Misses        320      313       -7
Impacted Files Coverage Δ
featuretools/synthesis/deep_feature_synthesis.py 96.57% <100%> (+2%) ⬆️
...ols/tests/dfs_tests/test_deep_feature_synthesis.py 98.49% <100%> (+0.12%) ⬆️
featuretools/synthesis/dfs.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de2ff0e...c10c4b3. Read the comment docs.

@RogerTangos
Copy link
Copy Markdown
Contributor Author

There were some initial strange linting errors here, which are now fixed. (strange, because the lines in question were from the master branch.)

Now this seems to be failing on pandas 0.24.0. Once #383 fixes that, I'll merge master back in.

@RogerTangos
Copy link
Copy Markdown
Contributor Author

I fixed a bunch of indentation errors that were being picked up by the linter update, and now codecov/patch is knocking the coverage score for it. Many of the fixes it's penalizing were on commented lines.

Would you like me to fix these, or can we go ahead and merge? On the original PR, the coverage was unchanged.

@kmax12
Copy link
Copy Markdown
Contributor

kmax12 commented Jan 30, 2019

@RogerTangos thanks for investigating the linting issues and resolving. we still need to review this before it's ready to merge. we're currently preparing the next release of featuretools (likely to go out this week), but after that we will return to review this.

@RogerTangos RogerTangos changed the title Feature return types Allow user to pass in desired feature return types Feb 4, 2019
Copy link
Copy Markdown
Contributor

@kmax12 kmax12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank for working on this. It is looking good overall. The main thing missing is a test case so we can verify behavior works as intended.

Comment thread featuretools/synthesis/deep_feature_synthesis.py Outdated
variable_type=[Numeric,
Categorical,
Ordinal],
variable_type=allowed_variable_types,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually shouldn't be filtering any features based on type here since happens on line 216.

That means the correct change here would be to actually to update _features_by_type to not filter based on variable_type if variable_type == None.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll update this.


def _run_dfs(self, entity, entity_path, all_features, max_depth):
def _run_dfs(self, entity, entity_path, all_features, max_depth,
allowed_variable_types=None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we remove allowed_variable_types from _build_forward_features, we no longer need this parameter to _run_dfs

Comment thread featuretools/synthesis/dfs.py Outdated
dask_kwargs=None,
verbose=False):
verbose=False,
allowed_variable_types=None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to return_variable_types per comment in deep_feature_synthesis.py

Comment thread featuretools/synthesis/dfs.py Outdated
dask_kwargs=None,
verbose=False):
verbose=False,
allowed_variable_types=None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add test cases for this change before we can merge

@RogerTangos
Copy link
Copy Markdown
Contributor Author

Thanks @kmax12. It's the end of the day here, so I'll get to these updates and the tests tomorrow. Thank you for the review.

@RogerTangos
Copy link
Copy Markdown
Contributor Author

@kmax12 - requested changes made.

Comment thread featuretools/synthesis/deep_feature_synthesis.py
Comment thread featuretools/tests/dfs_tests/test_deep_feature_synthesis.py
self._handle_new_feature(new_f, all_features)

def _features_by_type(self, all_features, entity, variable_type, max_depth):
def _features_by_type(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to keep the style of our other code - let's keep this as one line or break up the arguments over two lines

@RogerTangos
Copy link
Copy Markdown
Contributor Author

Updates to everything but line 212. I did actually write some code for this, but IMO, it made the file harder to understand, not easier. If you have an opinion on how that should be done, feel free to copy past and I'll add it.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Feb 5, 2019

CLA assistant check
All committers have signed the CLA.

@kmax12
Copy link
Copy Markdown
Contributor

kmax12 commented Feb 6, 2019

Looks good. Merging!

@kmax12 kmax12 merged commit d0bad62 into alteryx:master Feb 6, 2019
@RogerTangos RogerTangos deleted the feature-return-types branch February 6, 2019 08:43
kmax12 pushed a commit that referenced this pull request Feb 6, 2019
* Make S3 dependencies optional

* Allow user to pass in desired feature return types (#372)

* change method var variable_types to prevent overloading

* add and pass allowed_variable_types var to _run_dfs and _build_forward_features

* allowed_variable_types can be passed to dfs

* correct docstrings for allowed_variable_types in dfs and deep_feature_synthesis.build_features

* fix linting issues.

* fix linting errors

* fix another linting error

* fix F632 linter error

* add test. rename build_features return_variable_types arg.

* remove redundant return_variable_types attr from deep_feature_synthesis._run_dfs()

* add datetime tests. minor formatting change.

* remove bad line break

* Update deep_feature_synthesis.py

* Make S3 dependencies optional

* Move boto3 back to regular dependencies

* Move s3fs usage into NoCredentialsError

* Make S3 dependencies optional

* Move boto3 back to regular dependencies

* Move s3fs usage into NoCredentialsError
@rwedge rwedge mentioned this pull request Feb 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants