Improve handling time documentation page #512

Merged
merged 14 commits into master from cutoff_time_docs on May 3, 2019

3 participants
@CharlesBradshaw
Contributor

commented Apr 25, 2019

Improved the handling time documentation page

CharlesBradshaw added some commits Apr 23, 2019

@codecov

commented Apr 25, 2019

Codecov Report

Merging #512 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master    #512      +/-   ##
=========================================
+ Coverage    96.1%   96.1%   +<.01%     
=========================================
  Files         108     108              
  Lines        8898    8900       +2     
=========================================
+ Hits         8551    8553       +2     
  Misses        347     347
Impacted Files Coverage Δ
...turetools/computational_backends/pandas_backend.py 98.07% <ø> (ø) ⬆️
featuretools/entityset/entity.py 96.1% <ø> (ø) ⬆️
...computational_backends/calculate_feature_matrix.py 97.08% <ø> (ø) ⬆️
featuretools/synthesis/dfs.py 100% <ø> (ø) ⬆️
featuretools/tests/demo_tests/test_demo_data.py 100% <100%> (ø) ⬆️
featuretools/demo/flight.py 95.06% <100%> (+0.12%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 269307e...e1b6e7e.

@kmax12 changed the title from "Cutoff time docs" to "Improved handling time documentation page" on Apr 30, 2019

kmax12 added some commits Apr 30, 2019

@kmax12 changed the title from "Improved handling time documentation page" to "Improve handling time documentation page" on Apr 30, 2019

@codecov

commented May 2, 2019

Codecov Report

Merging #512 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master    #512      +/-   ##
=========================================
+ Coverage    96.1%   96.1%   +<.01%     
=========================================
  Files         108     108              
  Lines        8913    8915       +2     
=========================================
+ Hits         8566    8568       +2     
  Misses        347     347
Impacted Files Coverage Δ
...turetools/computational_backends/pandas_backend.py 98.07% <ø> (ø) ⬆️
featuretools/entityset/entity.py 96.1% <ø> (ø) ⬆️
...computational_backends/calculate_feature_matrix.py 97.09% <ø> (ø) ⬆️
featuretools/synthesis/dfs.py 100% <ø> (ø) ⬆️
featuretools/tests/demo_tests/test_demo_data.py 100% <100%> (ø) ⬆️
featuretools/demo/flight.py 95.06% <100%> (+0.12%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 26dd292...9151dcd.

@@ -163,13 +164,34 @@ def _clean_data(data):
clean_data.loc[:, 'flight_id'] = clean_data['carrier'] + '-' + \
clean_data['flight_num'].apply(lambda x: str(x)) + ':' + clean_data['origin'] + '->' + clean_data['dest']

column_order = [

@kmax12

kmax12 May 3, 2019

Member

updated column order to improve print out in the docs
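
For readers without the full file open, the `flight_id` construction in the hunk above can be tried on its own. This is a minimal sketch: the two sample rows (including the `AA` carrier code) are invented for illustration, and `astype(str)` stands in for the `apply(lambda x: str(x))` used in the hunk.

```python
import pandas as pd

# Hypothetical two-row sample; the real demo data has many more rows and columns.
clean_data = pd.DataFrame({
    "carrier": ["AA", "AA"],
    "flight_num": [1, 2],
    "origin": ["CLT", "PIT"],
    "dest": ["PHX", "DFW"],
})

# Same string concatenation as in the hunk: carrier-number:origin->dest.
# astype(str) replaces apply(lambda x: str(x)); both convert the number to text.
clean_data["flight_id"] = (
    clean_data["carrier"] + "-"
    + clean_data["flight_num"].astype(str) + ":"
    + clean_data["origin"] + "->" + clean_data["dest"]
)

flight_ids = clean_data["flight_id"].tolist()
print(flight_ids)
```

The resulting identifier encodes carrier, flight number and route in one string, so it stays readable when printed in the docs.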

@@ -148,7 +149,7 @@ def _clean_data(data):
clean_data = _reconstruct_times(clean_data)

# Create a time index 6 months before scheduled_dep
- clean_data.loc[:, 'time_index'] = clean_data['scheduled_dep_time'] - \
+ clean_data.loc[:, 'date_scheduled'] = clean_data['scheduled_dep_time'].dt.date - \

@kmax12

kmax12 May 3, 2019

Member

renamed to something more meaningful
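
The intent of the renamed column can be sketched in isolation. The hunk is cut off before the subtracted offset, so the `pd.DateOffset(months=6)` below is an assumption taken from the code comment ("6 months before scheduled_dep"), not the value used in `flight.py`. The sample rows are made up, and `dt.normalize()` is used here instead of `.dt.date` so the column keeps a datetime dtype.

```python
import pandas as pd

# Hypothetical sample departures; not taken from the demo dataset.
clean_data = pd.DataFrame({
    "scheduled_dep_time": pd.to_datetime(["2017-01-31 10:15", "2017-01-01 07:40"]),
})

# Drop the time of day, then step back by an ASSUMED six-month offset.
# The real offset is truncated in the diff and may differ.
clean_data["date_scheduled"] = (
    clean_data["scheduled_dep_time"].dt.normalize() - pd.DateOffset(months=6)
)

date_scheduled = clean_data["date_scheduled"].dt.strftime("%Y-%m-%d").tolist()
print(date_scheduled)
```

A column built this way records when a flight became known, rather than when it departed, which is what a time index needs for cutoff-time filtering.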

docs/source/automated_feature_engineering/handling_time.rst (resolved, outdated)

Let's make features at some varying times in the flight example. Trip ``14`` is a flight from CLT to PHX on January 31, 2017, and trip ``92`` is a flight from PIT to DFW on January 1. We can set any cutoff time before the flight is scheduled to depart, emulating how we would make the prediction at that point in time.
In this computation, features that can be approximated will be calculated at 1-day intervals, while features that cannot be approximated (e.g., "what is the destination of this flight?") will be calculated at the exact cutoff time.

@rwedge

rwedge May 3, 2019

Contributor

The flight destination example was a little disorienting since the rest of this section is talking about a fraud detection problem
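
The idea of calculating approximable features at 1-day intervals, as described in the quoted docs passage, can be illustrated with plain pandas. This is a sketch of the concept only, not a call into featuretools: the cutoff times are invented, and flooring each cutoff to midnight is an assumption about how a 1-day window might be applied.

```python
import pandas as pd

# Hypothetical cutoff times for trips 14 and 92, all before the flights depart.
cutoff_times = pd.DataFrame({
    "trip_id": [14, 14, 92],
    "time": pd.to_datetime([
        "2017-01-20 09:30",
        "2017-01-20 17:45",
        "2016-12-15 08:00",
    ]),
})

# ASSUMPTION: a 1-day window is modelled by flooring each cutoff to the start of its day.
cutoff_times["approx_time"] = cutoff_times["time"].dt.floor("D")

# Approximable features would be computed once per distinct approx_time,
# while features that cannot be approximated still use the exact cutoff time.
n_exact = cutoff_times["time"].nunique()
n_approx = cutoff_times["approx_time"].nunique()
print(n_exact, n_approx)
```

Fewer distinct approximate times than exact times means less repeated work for the expensive features, which is the trade-off the passage describes.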

featuretools/tests/demo_tests/test_demo_data.py (resolved)
@rwedge

rwedge approved these changes May 3, 2019

@kmax12 kmax12 merged commit 7e1e47a into master May 3, 2019

4 checks passed

codecov/patch 100% of diff hit (target 96.1%)
codecov/project 96.1% (+<.01%) compared to 26dd292
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@gsheni gsheni deleted the cutoff_time_docs branch May 3, 2019

@rwedge rwedge referenced this pull request May 17, 2019

Merged

v0.8.0 #548
