Update DeepFeatureSynthesis to support multiple paths #600

Merged
merged 18 commits into from Jun 21, 2019

@CJStadler
Contributor

commented Jun 17, 2019

For entitysets with multiple paths between two entities DFS will now
create features through all paths. Entitysets with cycles in the
directed graph are not supported.

  • Use feature.unique_name instead of hash in dicts
  • Change EntitySet.get_backward_entities to yield paths.
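
As an illustration of what "features through all paths" implies, here is a minimal sketch in plain Python, not the featuretools implementation. The entity names, the `RELATIONSHIPS` list, and the `backward_paths` helper are all hypothetical. It yields every directed path from a parent entity to its descendants, and it only terminates when the relationship graph has no cycles, which matches the restriction stated above.

```python
from collections import defaultdict

# Hypothetical (parent, child) relationships; "transactions" is reachable
# from "customers" both directly and through "sessions".
RELATIONSHIPS = [
    ("customers", "sessions"),
    ("customers", "transactions"),
    ("sessions", "transactions"),
]


def backward_paths(relationships, start):
    """Yield (entity_id, path) for every directed path from start to a descendant.

    Assumes the relationship graph is acyclic; a cycle would recurse forever.
    """
    children = defaultdict(list)
    for parent, child in relationships:
        children[parent].append(child)

    def walk(entity, path):
        for child in children[entity]:
            new_path = path + [(entity, child)]
            yield child, new_path
            yield from walk(child, new_path)

    yield from walk(start, [])


# Collect every distinct path that ends at "transactions".
paths_to_transactions = [
    path
    for entity, path in backward_paths(RELATIONSHIPS, "customers")
    if entity == "transactions"
]
```

Because the generator yields one path per result rather than one entity, a caller can build a separate feature for each route to the same entity.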

Once #572 is merged the base branch should be changed to master.

@CJStadler CJStadler changed the base branch from feature-trie to master Jun 18, 2019

Update DeepFeatureSynthesis to support multiple paths
For entitysets with multiple paths between two entities DFS will now
create features through all paths. Entitysets with cycles in the
directed graph are not supported.

- Use feature.unique_name instead of hash in dicts
- Change EntitySet.get_backward_entities to yield paths.

@CJStadler CJStadler force-pushed the multipath-dfs-no-cycles branch from 4e97867 to 42bc4bf Jun 18, 2019

@codecov

This comment was marked as outdated.

commented Jun 18, 2019

Codecov Report

Merging #600 into master will increase coverage by 0.01%.
The diff coverage is 99.5%.

@@            Coverage Diff             @@
##           master     #600      +/-   ##
==========================================
+ Coverage    96.4%   96.41%   +0.01%     
==========================================
  Files         119      119              
  Lines        9643     9654      +11     
==========================================
+ Hits         9296     9308      +12     
+ Misses        347      346       -1

| Impacted Files | Coverage Δ |
|---|---|
| featuretools/wrappers/sklearn.py | 95.55% <ø> (ø) ⬆️ |
| featuretools/computational_backends/api.py | 100% <ø> (ø) ⬆️ |
| featuretools/tests/entityset_tests/test_es.py | 100% <ø> (ø) ⬆️ |
| featuretools/entityset/timedelta.py | 76.59% <ø> (ø) ⬆️ |
| featuretools/synthesis/deep_feature_synthesis.py | 96.73% <100%> (-0.02%) ⬇️ |
| featuretools/computational_backends/utils.py | 95.75% <100%> (ø) ⬆️ |
| featuretools/entityset/entityset.py | 95.12% <100%> (-0.03%) ⬇️ |
| featuretools/utils/api.py | 100% <100%> (ø) ⬆️ |
| featuretools/feature_base/feature_base.py | 97.61% <100%> (ø) ⬆️ |
| ...s/tests/primitive_tests/test_transform_features.py | 98.37% <100%> (ø) ⬆️ |

... and 46 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3b84b66...42bc4bf.

@codecov

This comment has been minimized.

commented Jun 18, 2019

Codecov Report

Merging #600 into master will decrease coverage by 0.15%.
The diff coverage is 98.33%.

@@            Coverage Diff             @@
##           master     #600      +/-   ##
==========================================
- Coverage   97.27%   97.11%   -0.16%     
==========================================
  Files         118      118              
  Lines        9589     9545      -44     
==========================================
- Hits         9328     9270      -58     
- Misses        261      275      +14

| Impacted Files | Coverage Δ |
|---|---|
| featuretools/tests/entityset_tests/test_es.py | 100% <ø> (ø) ⬆️ |
| ...uretools/tests/entityset_tests/test_es_metadata.py | 98.13% <100%> (-0.11%) ⬇️ |
| featuretools/entityset/entityset.py | 95.5% <100%> (+0.35%) ⬆️ |
| featuretools/tests/testing_utils/__init__.py | 100% <100%> (ø) ⬆️ |
| featuretools/tests/testing_utils/features.py | 100% <100%> (ø) ⬆️ |
| ...retools/tests/entityset_tests/test_relationship.py | 100% <100%> (ø) ⬆️ |
| ...ols/tests/synthesis/test_deep_feature_synthesis.py | 100% <100%> (ø) ⬆️ |
| featuretools/entityset/relationship.py | 98.68% <100%> (+0.19%) ⬆️ |
| featuretools/synthesis/deep_feature_synthesis.py | 96.06% <95.55%> (-0.69%) ⬇️ |
| featuretools/__main__.py | 0% <0%> (-50%) ⬇️ |

... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 177089f...4b49d5e.

Resolved review thread: featuretools/entityset/entityset.py (outdated)
Resolved review thread: featuretools/synthesis/deep_feature_synthesis.py (outdated)
Resolved review thread: featuretools/synthesis/deep_feature_synthesis.py (outdated)

CJStadler added some commits Jun 18, 2019

Fix get_backward_entities when deep
Previously it always did a deep search.

Replace entity_path with a RelationshipPath
- Add RelationshipPath.entities
- Change EntitySet.get_forward_entities to yield entity_ids and paths.
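
To make the idea of a path object with an `entities` accessor concrete, here is a rough sketch. It is a hypothetical stand-in written for this note, not the class added in this PR: the `(is_forward, parent, child)` step tuples and the constructor signature are assumptions. A forward step is taken to run from child to parent, and a backward step from parent to child.

```python
class RelationshipPath:
    """Hypothetical stand-in: an ordered list of (is_forward, parent, child) steps."""

    def __init__(self, steps):
        self._steps = list(steps)

    def entities(self):
        """Yield each entity id visited along the path, in order."""
        if not self._steps:
            return

        # The path starts at the child for a forward step, else at the parent.
        is_forward, parent, child = self._steps[0]
        yield child if is_forward else parent

        # Each step then ends at the opposite side of its relationship.
        for is_forward, parent, child in self._steps:
            yield parent if is_forward else child


# Usage: a backward path from customers through sessions to transactions.
path = RelationshipPath([
    (False, "customers", "sessions"),
    (False, "sessions", "transactions"),
])
visited = list(path.entities())
```

Keeping the direction on each step, rather than a bare list of entity ids, lets two paths that visit the same entities through different relationships remain distinct.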

@CJStadler CJStadler requested a review from kmax12 Jun 18, 2019

@kmax12
Member

left a comment

This is looking good. Just one comment for now.

Resolved review thread: featuretools/entityset/entityset.py (outdated)

CJStadler added some commits Jun 19, 2019

Remove max_hlevel from DFS (#608)
This was not exposed in the dfs method, so it always used the default
value of 2; it was undocumented, and it was unclear whether it provided
any value.

Also remove EntitySet.find_path as it is no longer used.

Remove redundant max_depth checks
This condition can never be True, because if it were True then the
condition in _run_dfs would also have been True and would have returned.

@CJStadler

This comment has been minimized.

Contributor Author

commented Jun 19, 2019

Back to you @kmax12. I made one change in 3634849 that you might want to look at.

@kmax12
Member

left a comment

Some minor changes, otherwise I think this is good to go

Resolved review thread: featuretools/synthesis/deep_feature_synthesis.py
Resolved review thread: docs/source/changelog.rst (outdated)

CJStadler added some commits Jun 20, 2019

@CJStadler CJStadler requested a review from kmax12 Jun 20, 2019

@kmax12

kmax12 approved these changes Jun 21, 2019

Member

left a comment

LGTM

@CJStadler CJStadler merged commit 321a0f4 into master Jun 21, 2019

3 checks passed

changelog_updated Workflow: changelog_updated
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@CJStadler CJStadler deleted the multipath-dfs-no-cycles branch Jun 21, 2019

@rwedge rwedge referenced this pull request Jul 3, 2019

Merged

v0.9.1 #640

johnnyheineken pushed a commit to johnnyheineken/featuretools that referenced this pull request Jul 7, 2019

Update DeepFeatureSynthesis to support multiple paths (Featuretools#600)
For entitysets with multiple paths between two entities DFS will now
create features through all paths. Entitysets with cycles in the
directed graph are not supported.

- Use feature.unique_name instead of hash in dicts
- Change EntitySet.get_backward_entities and get_forward_entities
  to yield paths.
- Add RelationshipPath which represents a directed path through
  relationships.
- Add RelationshipPath.entities
- Warn if dfs produces a feature multiple times
- Remove max_hlevel from DFS (Featuretools#608).
- Remove EntitySet.find_path
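
The "unique_name instead of hash" and "warn if dfs produces a feature multiple times" items can be pictured together with a small sketch. This is an illustration under assumptions, not code from the PR: the `Feature` class, the `register` helper, and the warning text are hypothetical, and `unique_name` is modelled here as a method.

```python
import warnings


class Feature:
    """Hypothetical minimal feature; only its name matters for this sketch."""

    def __init__(self, name):
        self.name = name

    def unique_name(self):
        return self.name


def register(features_by_name, feature):
    """Store feature under its unique name; warn and skip if already present."""
    key = feature.unique_name()
    if key in features_by_name:
        warnings.warn("Feature %s was produced more than once" % key)
        return False
    features_by_name[key] = feature
    return True


# Usage: the second registration of the same name is skipped with a warning.
all_features = {}
first_added = register(all_features, Feature("MEAN(transactions.amount)"))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    second_added = register(all_features, Feature("MEAN(transactions.amount)"))
```

Keying on a readable string also means a duplicate shows up as a recognisable name in the warning rather than as an opaque hash value.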