Add Featuretools Component #1454

Merged
merged 27 commits into from
Dec 8, 2020

Conversation

Contributor

@bchen1116 bchen1116 commented Nov 23, 2020

fix #470

Adds a Featuretools component to EvalML.

Quip doc here
Perf tests here
Docs here

The perf tests are run on AutoMLSearch CV folds, not holdout data. The perf test doc includes more info.

@bchen1116 bchen1116 self-assigned this Nov 23, 2020

CLAassistant commented Nov 23, 2020

CLA assistant check
All committers have signed the CLA.


codecov bot commented Nov 23, 2020

Codecov Report

Merging #1454 (f95b71d) into main (d6c3f8e) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1454     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         228      230      +2     
  Lines       15658    15776    +118     
=========================================
+ Hits        15650    15768    +118     
  Misses          8        8             
| Impacted Files | Coverage Δ |
|---|---|
| evalml/pipelines/__init__.py | 100.0% <ø> (ø) |
| evalml/pipelines/components/__init__.py | 100.0% <ø> (ø) |
| ...alml/pipelines/components/transformers/__init__.py | 100.0% <100.0%> (ø) |
| .../components/transformers/preprocessing/__init__.py | 100.0% <100.0%> (ø) |
| ...ponents/transformers/preprocessing/featuretools.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_components.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_featuretools.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_utils.py | 100.0% <100.0%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

if 'index' not in X.columns:
    es = self._ft_es.entity_from_dataframe(entity_id="X", dataframe=X, index='index', make_index=True)
else:
    es = self._ft_es.entity_from_dataframe(entity_id="X", dataframe=X, index='index')
Contributor Author

Featuretools automatically uses the first column as the index if an index column isn't provided.
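
The branch in the snippet above can be sketched without Featuretools itself. The helper below is hypothetical (`entity_kwargs` is not part of EvalML or Featuretools); it only assembles the keyword arguments that the diff passes to `entity_from_dataframe`, adding `make_index=True` when no `index` column is present.

```python
def entity_kwargs(columns, index_name="index"):
    """Build kwargs mirroring the entity_from_dataframe call in the diff.

    Hypothetical helper for illustration only; `columns` is any iterable
    of column names, for example list(X.columns).
    """
    kwargs = {"entity_id": "X", "index": index_name}
    # Request a generated index only when the column is absent
    if index_name not in set(columns):
        kwargs["make_index"] = True
    return kwargs
```

Passing an explicit `index` in both branches is what keeps the first data column from being treated as the index by default.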

@bchen1116 bchen1116 marked this pull request as ready for review November 24, 2020 18:08
Contributor

@freddyaboulton freddyaboulton left a comment

@bchen1116 This is sweet! I agree with your analysis that we should hold off on adding this to _make_preprocessing_components.

I think the implementation looks good. My main comment is about letting users specify the index column in the entity set to avoid name collisions with columns already named index that aren't intended to be used as indices.
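
One way to picture this suggestion: accept an optional user-supplied index column and, when none is given, generate a name that cannot clash with an existing column. This is a minimal sketch under stated assumptions, not the change made in this PR; `resolve_index`, `index_col`, and the `index_1`-style fallback names are all hypothetical.

```python
def resolve_index(columns, index_col=None, default="index"):
    """Return (index_name, make_index) for a collection of column names.

    Hypothetical helper: `index_col` is an index column chosen by the user.
    """
    columns = list(columns)
    if index_col is not None:
        if index_col not in columns:
            raise ValueError(f"index column {index_col!r} not found")
        # The user vouches for this column, so use it as the index as-is
        return index_col, False
    # No index given: pick a fresh name so an unrelated column that
    # happens to be called "index" is never reinterpreted as the index
    name, suffix = default, 0
    while name in columns:
        suffix += 1
        name = f"{default}_{suffix}"
    return name, True
```

With this shape, a dataset that already has a non-index column named `index` would get a generated index under a different name, while callers who do have a real index column can say so explicitly.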

@bchen1116
Contributor Author

The perf test doc has been updated with new results following Looking Glass's recent bug fix.

Collaborator

@jeremyliweishih jeremyliweishih left a comment

Looks good to me, Bryan. I agree with you on not adding this to AutoML in the interim and waiting for a deeper implementation. I'm excited to see the experimentation results then!

@bchen1116 bchen1116 merged commit c3eb47b into main Dec 8, 2020
Contributor

@dsherry dsherry left a comment

@bchen1116 wonderful! Left one comment about the component name; impl and tests look great.

@dsherry dsherry mentioned this pull request Dec 29, 2020
@freddyaboulton freddyaboulton deleted the bc_470_featuretools branch May 13, 2022 15:17
Development

Successfully merging this pull request may close these issues.

Add featuretools component
5 participants