Improve handling time documentation page #512
Codecov Report
@@ Coverage Diff @@
## master #512 +/- ##
=========================================
+ Coverage 96.1% 96.1% +<.01%
=========================================
Files 108 108
Lines 8898 8900 +2
=========================================
+ Hits 8551 8553 +2
Misses 347 347
Codecov Report
@@ Coverage Diff @@
## master #512 +/- ##
=========================================
+ Coverage 96.1% 96.1% +<.01%
=========================================
Files 108 108
Lines 8913 8915 +2
=========================================
+ Hits 8566 8568 +2
Misses 347 347
@@ -163,13 +164,34 @@ def _clean_data(data):
clean_data.loc[:, 'flight_id'] = clean_data['carrier'] + '-' + \
clean_data['flight_num'].apply(lambda x: str(x)) + ':' + clean_data['origin'] + '->' + clean_data['dest']

column_order = [
updated the column order to improve the printout in the docs
@@ -148,7 +149,7 @@ def _clean_data(data):
clean_data = _reconstruct_times(clean_data)

# Create a time index 6 months before scheduled_dep
- clean_data.loc[:, 'time_index'] = clean_data['scheduled_dep_time'] - \
+ clean_data.loc[:, 'date_scheduled'] = clean_data['scheduled_dep_time'].dt.date - \
renamed to something more meaningful
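For readers following the rename, here is a minimal sketch of how a `date_scheduled` value can be derived from a scheduled departure time. The offset itself is cut off in the hunk above, so the 180-day value below is only an assumed stand-in for "6 months", and the sketch uses the standard library instead of pandas. `ASSUMED_OFFSET` and `date_scheduled_for` are illustrative names, not part of the project.

```python
from datetime import date, datetime, timedelta

# Assumption: roughly six months, expressed as a fixed number of days.
# The real offset is truncated in the diff and may differ.
ASSUMED_OFFSET = timedelta(days=180)


def date_scheduled_for(scheduled_dep_time: datetime) -> date:
    """Return the date on which the flight is assumed to become known."""
    # Drop the time of day first, mirroring `.dt.date` in the diff,
    # then step back by the assumed offset.
    return scheduled_dep_time.date() - ASSUMED_OFFSET


# Hypothetical departure time for a flight on January 31, 2017.
departure = datetime(2017, 1, 31, 11, 33)
print(date_scheduled_for(departure))
```

Dropping the time of day before subtracting gives every flight on the same day the same `date_scheduled`, which is presumably why the new name describes a date, not a generic time index.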
Let's make features at some varying times in the flight example. Trip ``14`` is a flight from CLT to PHX on January 31, 2017, and trip ``92`` is a flight from PIT to DFW on January 1. We can set any cutoff time before the flight is scheduled to depart, emulating how we would make the prediction at that point in time.
In this computation, features that can be approximated will be calculated at 1-day intervals, while features that cannot be approximated (e.g. "what is the destination of this flight?") will be calculated at the exact cutoff time.
The flight destination example was a little disorienting since the rest of this section is talking about a fraud detection problem
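To make the "1 day intervals" wording in the documentation text concrete, the sketch below shows one way such an approximation could work. It assumes each cutoff time is rounded down to the start of its 1-day window and that instances in the same window share the approximable feature values computed at that boundary. This is an illustration in plain Python, not the library's implementation. The cutoff times, trip ``93``, and the helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(days=1)
EPOCH = datetime(1970, 1, 1)  # reference point for window boundaries


def floor_to_window(cutoff, window=WINDOW):
    """Round a cutoff time down to the start of its window."""
    return cutoff - ((cutoff - EPOCH) % window)


def group_by_window(cutoffs):
    """Map each window start to the instance ids whose cutoff falls in it."""
    groups = defaultdict(list)
    for instance_id, cutoff in cutoffs.items():
        groups[floor_to_window(cutoff)].append(instance_id)
    return dict(groups)


# Hypothetical cutoff times, each set before the flight departs.
cutoffs = {
    14: datetime(2017, 1, 30, 8, 15),
    92: datetime(2016, 12, 31, 17, 40),
    93: datetime(2016, 12, 31, 6, 5),  # invented extra trip
}

for window_start, trip_ids in group_by_window(cutoffs).items():
    print(window_start, trip_ids)
```

Under this assumption, trips ``92`` and ``93`` share one window, so an approximable feature would be computed once for both, while a feature that cannot be approximated would still be evaluated at each trip's own cutoff time.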
Improved the handling time documentation page