Improve performance of all feature calculations #224

Merged
kmax12 merged 36 commits into master from isin-to-merge on Aug 24, 2018

2 participants
@kmax12
Member

kmax12 commented Aug 21, 2018

Feature calculation makes several calls to Entity.query_by_values. This pull request optimizes these calls using a pandas merge instead of isin.

This is still a work in progress as we benchmark it across different datasets.
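
To illustrate the idea described above, here is a minimal sketch of the two filtering strategies in plain pandas. It is not the code from this pull request and does not reproduce the internals of `Entity.query_by_values`; the helper names `query_with_isin` and `query_with_merge`, the column names, and the sample data are hypothetical. Which approach is faster likely depends on the data, which is presumably why benchmarking across datasets matters.

```python
import pandas as pd


def query_with_isin(df, column, values):
    """Hypothetical helper: keep rows whose `column` value is in `values`, using a boolean mask."""
    return df[df[column].isin(values)]


def query_with_merge(df, column, values):
    """Hypothetical helper: keep the same rows via an inner join against a frame of lookup keys."""
    # Deduplicate the keys so the join cannot multiply matching rows.
    keys = pd.DataFrame({column: list(dict.fromkeys(values))})
    # The inner join keeps only rows whose key is present in `keys`.
    # Note: merge returns a fresh RangeIndex, so the original index of `df` is not kept.
    return df.merge(keys, on=column, how="inner")


# Illustrative data only (assumed, not taken from the pull request).
transactions = pd.DataFrame(
    {"customer_id": [1, 2, 2, 3, 4], "amount": [10, 20, 30, 40, 50]}
)
wanted = [2, 4, 4, 7]

by_isin = query_with_isin(transactions, "customer_id", wanted)
by_merge = query_with_merge(transactions, "customer_id", wanted)
```

Both helpers should select the same rows; the merge variant differs in that it resets the index, so callers that rely on the original index would need to carry it through the join as a column.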

kmax12 added some commits Aug 14, 2018

@kmax12 kmax12 changed the base branch from master to agg-functions Aug 21, 2018

kmax12 added some commits Aug 21, 2018

@kmax12 kmax12 changed the title from WIP Improve performance of all feature calculations to (WIP) Improve performance of all feature calculations Aug 22, 2018

@codecov-io

codecov-io commented Aug 22, 2018

Codecov Report

Merging #224 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #224      +/-   ##
==========================================
- Coverage   93.64%   93.62%   -0.03%     
==========================================
  Files          71       71              
  Lines        7679     7654      -25     
==========================================
- Hits         7191     7166      -25     
  Misses        488      488

| Impacted Files | Coverage Δ |
|---|---|
| featuretools/tests/entityset_tests/test_es.py | 99.34% <ø> (-0.01%) ⬇️ |
| ...utational_backend/test_calculate_feature_matrix.py | 99.27% <ø> (-0.02%) ⬇️ |
| featuretools/entityset/entityset.py | 93.59% <ø> (-0.16%) ⬇️ |
| featuretools/entityset/entity.py | 89.71% <100%> (+0.17%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8462046...383ef17.

@kmax12 kmax12 changed the title from (WIP) Improve performance of all feature calculations to Improve performance of all feature calculations Aug 23, 2018

kmax12 added some commits Aug 24, 2018

Merge branch 'master' into agg-functions
# Conflicts:
#	featuretools/entityset/entityset.py

@kmax12 kmax12 changed the base branch from agg-functions to master Aug 24, 2018

@kmax12 kmax12 merged commit 676b7cc into master Aug 24, 2018

2 checks passed

ci/circleci: Your tests passed on CircleCI!
license/cla: Contributor License Agreement is signed.

@rwedge rwedge referenced this pull request Aug 28, 2018

Merged

v0.3.0 #235

@kmax12 kmax12 deleted the isin-to-merge branch Oct 2, 2018
