Add Featuretools Component #1454
Codecov Report
@@           Coverage Diff            @@
##             main    #1454    +/-   ##
========================================
+ Coverage   100.0%   100.0%   +0.1%
========================================
  Files         228      230      +2
  Lines       15658    15776    +118
========================================
+ Hits        15650    15768    +118
  Misses          8        8
Continue to review full report at Codecov.
if 'index' not in X.columns:
    es = self._ft_es.entity_from_dataframe(entity_id="X", dataframe=X, index='index', make_index=True)
else:
    es = self._ft_es.entity_from_dataframe(entity_id="X", dataframe=X, index='index')
Featuretools automatically uses the first column as the index if an index column isn't provided.
@bchen1116 This is sweet! I agree with your analysis that we should hold off on adding this to _make_preprocessing_components.
I think the implementation looks good. My main comment is about letting users specify the index column in the entity set to avoid name collisions with columns already named index that aren't intended to be used as indices.
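One possible way to handle the collision, sketched below as a suggestion rather than as what this PR implements: resolve the index name before building the entity. `resolve_index` and its parameters are hypothetical names, not part of evalml or Featuretools. It only decides which column name to use and whether that column needs to be created.

```python
from typing import Iterable, Optional, Tuple


def resolve_index(columns: Iterable[str], index: Optional[str] = None,
                  default: str = "index") -> Tuple[str, bool]:
    """Pick the entity index column and decide whether it must be created.

    Hypothetical helper for illustration only. Returns (index_name, make_index).
    """
    columns = list(columns)
    if index is not None:
        # The caller named an index column explicitly: reuse it if it exists,
        # otherwise ask for it to be created.
        return index, index not in columns
    # No index given: generate a fresh name so that a data column that merely
    # happens to be called "index" is never silently treated as the index.
    candidate = default
    suffix = 0
    while candidate in columns:
        suffix += 1
        candidate = f"{default}_{suffix}"
    return candidate, True
```

The resulting pair could then be passed to the existing call as `index=...` and `make_index=...`, replacing the `if 'index' not in X.columns` branch. How the user-supplied index would be exposed on the component (for example, as a constructor argument) is left open here.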
evalml/pipelines/components/transformers/preprocessing/featuretools.py
Looks good to me, Bryan. I agree with you on not adding this to AutoML in the interim and waiting for a deeper implementation. I'm excited to see the experimentation results then!
@bchen1116 wonderful! Left one comment about the component name; impl and tests look great.
fix #470
Adds a Featuretools component to evalml.
Quip doc here
Perf tests here
Docs here
The perf tests are run on AutoMLSearch CV folds, not holdout data. The perf tests include more info.