Only calculate features for instances before cutoff #523

Merged
CJStadler merged 3 commits into master from remove-instances-after-cutoff on Apr 30, 2019

2 participants
@CJStadler
Contributor

commented Apr 30, 2019

When a single cutoff_time is provided.

Resolves #437

@codecov

commented Apr 30, 2019

Codecov Report

Merging #523 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master    #523      +/-   ##
=========================================
+ Coverage    96.1%   96.1%   +<.01%     
=========================================
  Files         108     108              
  Lines        8898    8913      +15     
=========================================
+ Hits         8551    8566      +15     
  Misses        347     347

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...imitive_tests/test_groupby_transform_primitives.py | 100% <100%> (ø) | ⬆️ |
| ...computational_backends/calculate_feature_matrix.py | 97.09% <100%> (+0.01%) | ⬆️ |
| ...utational_backend/test_calculate_feature_matrix.py | 99.33% <100%> (+0.01%) | ⬆️ |
| featuretools/tests/dfs_tests/test_dfs_method.py | 98.41% <100%> (ø) | ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4cbca08...fd9c909.

@rwedge

Contributor

commented Apr 30, 2019

From the coverage report, it looks like some of the code that handles missing data was only being exercised by the tests because the old behavior calculated features for all instances.

if missing_ids:
    default_df = self.generate_default_df(instance_ids=missing_ids,
                                          extra_columns=df.columns)
    df = df.append(default_df, sort=True)

if extra_columns is not None:
    for c in extra_columns:
        if c not in default_df.columns:
            default_df[c] = [np.nan] * len(instance_ids)

We might be able to test this code with a test using a single cutoff time and a list of instance ids that includes instances whose time index is after the cutoff time.
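
As a rough illustration of that test scenario, here is a minimal pandas-only sketch. It does not use the featuretools API: `features_before_cutoff`, the `time_index` and `value` columns, and the inclusive `<=` comparison are all assumptions made for the example. It shows a single cutoff time with an instance id list that includes an instance after the cutoff, which ends up as a "missing" id with a NaN default row.

```python
import numpy as np
import pandas as pd


def features_before_cutoff(instances, instance_ids, cutoff_time, feature_columns):
    """Hypothetical helper, not part of featuretools.

    Calculates rows only for instances whose time index is at or before the
    cutoff (the inclusive comparison is an assumption), then pads the
    remaining requested ids with NaN default rows.
    """
    requested = instances.loc[instance_ids]
    # Keep only the instances that already exist at the cutoff time.
    valid = requested[requested["time_index"] <= cutoff_time]
    # Stand-in for the real feature calculation: copy the raw columns.
    calculated = valid[feature_columns]
    # Requested ids that were filtered out are treated as missing.
    missing_ids = [i for i in instance_ids if i not in calculated.index]
    # Reindexing adds a NaN row for every missing id, in the requested order.
    result = calculated.reindex(instance_ids)
    return result, missing_ids


instances = pd.DataFrame(
    {
        "time_index": pd.to_datetime(["2019-01-01", "2019-03-01", "2019-06-01"]),
        "value": [1.0, 2.0, 3.0],
    },
    index=[0, 1, 2],
)

# Instance 2 has a time index after the single cutoff time.
result, missing_ids = features_before_cutoff(
    instances, [0, 1, 2], pd.Timestamp("2019-04-01"), ["value"]
)
```

A test built along these lines would assert that the late instance still appears in the output, but only with default values.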

CJStadler added some commits Apr 30, 2019

@CJStadler

Contributor Author

commented Apr 30, 2019

@rwedge I added a test and I think the coverage is good now. Back to you.

@rwedge

rwedge approved these changes Apr 30, 2019

Contributor

left a comment

Looks good

@CJStadler CJStadler merged commit f2feb3f into master Apr 30, 2019

4 checks passed

codecov/patch 100% of diff hit (target 96.1%)
codecov/project 96.1% (+<.01%) compared to 4cbca08
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@CJStadler CJStadler deleted the remove-instances-after-cutoff branch Apr 30, 2019

@rwedge rwedge referenced this pull request May 17, 2019

Merged

v0.8.0 #548
