New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primitives as strings in DFS parameters #129

Merged
merged 15 commits into from Apr 11, 2018

Conversation

Projects
None yet
4 participants
@bschreck
Contributor

bschreck commented Apr 4, 2018

dfs() and DeepFeatureSynthesis() can now accept strings (instead of Python classes) for all primitives built into Featuretools.

API looks like this:

ft.dfs(entityset=es, target_entity='test',
          agg_primitives=['sum'],
          trans_primitives=['percentile'],
          where_primitives=['sum', 'percentile'])

@bschreck bschreck requested review from Seth-Rothschild and kmax12 and removed request for Seth-Rothschild Apr 4, 2018

@Seth-Rothschild

This comment has been minimized.

Contributor

Seth-Rothschild commented Apr 4, 2018

Very excited for this. I can take a last pass through the docs once you're all set.

@bschreck

This comment has been minimized.

Contributor

bschreck commented Apr 5, 2018

@Seth-Rothschild ready for you to do that

transform_primitives = get_transform_primitives()
agg_primitives = get_aggregation_primitives()
transform_df = pd.DataFrame({'name': list(transform_primitives.keys()),
'description': [prim.__doc__.split("\n")[0] for prim in transform_primitives.values()]})

This comment has been minimized.

@Seth-Rothschild

Seth-Rothschild Apr 5, 2018

Contributor

This gives different results between docstrings that are

"""
Words go here
"""
"""Words go here
"""

and

""" Words go here
"""

We have all three but should probably settle on one.

This comment has been minimized.

@kmax12

kmax12 Apr 5, 2018

Member

@Seth-Rothschild we use the Google style docstrings for all public functions. for some private functions, we may be inconsistent and should fix those as we see it.

here's the Google style docstrings guide: http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

This comment has been minimized.

@Seth-Rothschild

Seth-Rothschild Apr 5, 2018

Contributor

I've pushed a version which has nonempty first line for all of the primitives. It looks like there's some extra space being added by something in the description (as shown in this image).

EDIT: It's actually right aligned. Pandas style doesn't work with a non-unique index, but if we give it a unique index we can use .style.set_properties(**{'text-align': 'left'})

image

Seth-Rothschild and others added some commits Apr 5, 2018

Seth-Rothschild
@codecov-io

This comment has been minimized.

codecov-io commented Apr 5, 2018

Codecov Report

Merging #129 into master will decrease coverage by 0.11%.
The diff coverage is 81.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #129      +/-   ##
==========================================
- Coverage   88.47%   88.36%   -0.12%     
==========================================
  Files          73       73              
  Lines        7558     7598      +40     
==========================================
+ Hits         6687     6714      +27     
- Misses        871      884      +13
Impacted Files Coverage Δ
featuretools/synthesis/dfs.py 100% <ø> (ø) ⬆️
featuretools/primitives/aggregation_primitives.py 92.57% <ø> (ø) ⬆️
featuretools/primitives/transform_primitive.py 97.77% <ø> (ø) ⬆️
featuretools/primitives/binary_transform.py 85.71% <ø> (ø) ⬆️
featuretools/primitives/api.py 100% <ø> (ø) ⬆️
.../feature_function_tests/test_transform_features.py 87.84% <100%> (ø) ⬆️
...ols/tests/dfs_tests/test_deep_feature_synthesis.py 98.32% <100%> (+0.05%) ⬆️
featuretools/primitives/cum_transform_feature.py 97.63% <100%> (ø) ⬆️
...ols/tests/feature_function_tests/test_agg_feats.py 98.51% <100%> (ø) ⬆️
featuretools/primitives/utils.py 85.36% <30%> (-7.88%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59fc551...5528b45. Read the comment docs.

Seth-Rothschild added some commits Apr 5, 2018

where_primitives (list[:class:`.primitives.AggregationPrimitive`], optional):
only add where clauses to these types of Aggregation
Features.
Default: ["sum", \

This comment has been minimized.

@kmax12

kmax12 Apr 5, 2018

Member

Nitpick - I dont think these all need their own line. can we reform to be 80 chars?

This comment has been minimized.

@Seth-Rothschild

Seth-Rothschild Apr 6, 2018

Contributor

It's 114 chars (agg defaults) and 104 chars (trans defaults). Keep it on one line anyways? Even removing " Default:" isn't enough to get it under 80.

if isinstance(p, basestring):
prim_obj = agg_prim_dict.get(p.lower(), None)
if prim_obj is None:
prim_obj = trans_prim_dict.get(p.lower(), None)

This comment has been minimized.

@kmax12

kmax12 Apr 5, 2018

Member

where's only apply to aggregations, no need to check transform too

:class:`Mode <.primitives.Mode>`]
trans_primitives (list[TransformPrimitive], optional):
Default: ["sum", \

This comment has been minimized.

@kmax12

kmax12 Apr 5, 2018

Member

same comment as above about single lines

Seth-Rothschild and others added some commits Apr 6, 2018

@kmax12

This comment has been minimized.

Member

kmax12 commented Apr 11, 2018

Great work. Merging in

@kmax12 kmax12 merged commit ff70cbe into master Apr 11, 2018

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
license/cla Contributor License Agreement is signed.
Details

@rwedge rwedge referenced this pull request Apr 13, 2018

Merged

Release v0.1.20 #131

rwedge added a commit that referenced this pull request Apr 13, 2018

Release v0.1.20 (#131)
**v0.1.20** Apr 13, 2018
* Improved chunking when calculating feature matrices  (#121)
* Primitives as strings in DFS parameters (#129)
* Integer time index bugfixes (#128)
* Add make_temporal_cutoffs utility function (#126)
* Show all entities, switch shape display to row/col (#124)
* fixed num characters nan fix (#118)
* modify ignore_variables docstring (#117)

@kmax12 kmax12 deleted the dfs-prim-strings branch Jun 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment