Time series documentation #1896

Merged
merged 12 commits into from
Feb 18, 2022

tamargrey
Contributor

Adds a separate time series guide for how to do feature engineering for time series problems.

closes #1758

@codecov

codecov bot commented Feb 10, 2022

Codecov Report

Merging #1896 (cb60a24) into main (70ff652) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1896   +/-   ##
=======================================
  Coverage   98.78%   98.78%           
=======================================
  Files         149      149           
  Lines       16424    16424           
=======================================
  Hits        16224    16224           
  Misses        200      200           


Contributor

@thehomebrewnerd thehomebrewnerd left a comment

Just a few small things, but overall this is a nice guide!

docs/source/release_notes.rst Outdated
"In multi-table datasets, a feature engineering window for a single row in the target DataFrame extends forward in time over observations in child DataFrames starting at the time index and ending when either the cutoff time or last time index is reached. \n",
"\n",
"<p style=\"margin:30px\">\n",
" <img style=\"display:inline; margin-right:50px\" width=100% src=\"../_static/images/multi_table_FE_timeline.png\" alt=\"Featuretools\" />\n",
Contributor

This image is not displaying properly in the readthedocs build.

docs/source/guides/time_series.ipynb Outdated
"We also need to determine how far back in time before `t - 7` we can go. Too far back, and we may lose the potency of our recent observations, but too recent, and we may not capture the full spectrum of behaviors displayed by the data. In this example, let's say that we only want to look at 5 days worth of data at a time. We'll call this our `window_length`. \n",
"\n",
"<p style=\"margin:30px\">\n",
" <img style=\"display:inline; margin-right:50px\" width=100% src=\"../_static/images/time_series_FE_timeline.png\" alt=\"Featuretools\" />\n",
Contributor

Image not displaying correctly.
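
For readers following the quoted passage without the image, here is a minimal sketch of the window it describes. It assumes daily observations and a window that ends at `t - gap` and spans `window_length` days inclusive. The helper name `feature_window` is hypothetical and is not part of Featuretools.

```python
from datetime import date, timedelta


def feature_window(t, gap, window_length):
    """Return (first_day, last_day) of the feature engineering window.

    Assumption: daily data, both bounds inclusive, window ends at t - gap.
    """
    last_day = t - timedelta(days=gap)
    first_day = last_day - timedelta(days=window_length - 1)
    return first_day, last_day


# Values from the quoted text: look no closer than t - 7, over 5 days of data.
first_day, last_day = feature_window(date(2022, 2, 18), gap=7, window_length=5)
print(first_day, last_day)
```

Under these assumptions the target day `t` itself never falls inside the window as long as `gap` is at least 1.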

"Let's take a look at an actual feature engineering window as we defined with `gap` and `window_length` above. Below is an example of how we can extract many features using the same feature engineering window without exposing our target value.\n",
"\n",
"<p style=\"margin:30px\">\n",
" <img style=\"display:inline; margin-right:50px\" width=100% src=\"../_static/images/window_calculations.png\" alt=\"Featuretools\" />\n",
Contributor

Image not displaying correctly.

"id": "a8104f18",
"metadata": {},
"source": [
"# Time Series Problems"
Contributor

How do you feel about calling this Feature Engineering for Time Series Problems since that is really the focus of this guide?

Contributor Author

that makes a lot of sense to me!

@rwedge
Contributor

rwedge commented Feb 17, 2022

Where did all the github checks go?

@tamargrey
Contributor Author

> Where did all the github checks go?

@rwedge not sure! Let me try getting the latest from main and pushing and seeing if that kicks off the CI run

docs/source/guides/time_series.ipynb Outdated
"source": [
"### Rolling Transform Primitives\n",
"\n",
"Since we have access to the entire feature engineering window, we can aggregate over that window. Featuretools has several rolling primitives with which we can achieve this. Here, we'll use the `RollingMean` primitives `RollingMin`, setting the `gap` and `window_length` accordingly. Here, the gap is incredibly important, because when the gap is zero, it means the current observation's target value is present in the window, which exposes our target.\n",
Contributor

Here, we'll use the RollingMean primitives RollingMin,

awkward phrasing

Contributor Author

changing to "Here, we'll use the RollingMean and RollingMin primitives"
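
To illustrate the point the quoted passage makes about `gap`, here is a minimal pure-Python sketch of a rolling aggregation with a gap. It is not the Featuretools implementation of `RollingMean` or `RollingMin`; the function `rolling_with_gap` is hypothetical, and it assumes evenly spaced rows and emits `None` until a full window of history is available.

```python
def rolling_with_gap(values, window_length, gap, agg):
    """Aggregate the window_length rows that end gap rows before each row.

    Assumption: gap=0 means the window ends on the current row itself.
    """
    out = []
    for i in range(len(values)):
        end = i - gap + 1            # exclusive end index of the window
        start = end - window_length  # inclusive start index of the window
        if start < 0:
            out.append(None)         # not enough history for a full window
        else:
            out.append(agg(values[start:end]))
    return out


def mean(window):
    return sum(window) / len(window)


target = [1, 2, 3, 4, 5, 6]

# gap=1: each window stops one row before the current row.
safe_mean = rolling_with_gap(target, window_length=3, gap=1, agg=mean)
safe_min = rolling_with_gap(target, window_length=3, gap=1, agg=min)

# gap=0: each window includes the current row, so the target value leaks in.
leaky_mean = rolling_with_gap(target, window_length=3, gap=0, agg=mean)
```

With `gap=1`, the value at index 3 is computed from rows 0 through 2 only. With `gap=0`, the value at index 2 is computed from rows 0 through 2, which includes the row's own target.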

rwedge
rwedge previously approved these changes Feb 17, 2022
Contributor

@rwedge rwedge left a comment

Looks good, assuming tests pass after merge conflict resolved

@tamargrey tamargrey force-pushed the time-series-documentation branch 2 times, most recently from f058121 to 4fd5704 Compare February 17, 2022 20:21
@tamargrey tamargrey enabled auto-merge (squash) February 18, 2022 16:19
@@ -27,7 +29,7 @@ v1.6.0 Feb 17, 2022
* Fix URL deserialization file (:pr:`1909`)

Thanks to the following people for contributing to this release:
:user:`jeff-hernandez`, :user:`rwedge`, :user:`thehomebrewnerd`
:user:`jeff-hernandez`, :user:`rwedge`, :user:`thehomebrewnerd`
Contributor

extra whitespace between rwedge and thehomebrewnerd

Contributor Author

removed!

@tamargrey tamargrey enabled auto-merge (squash) February 18, 2022 17:41
Contributor

@rwedge rwedge left a comment

LGTM

@tamargrey tamargrey merged commit 6da4a5f into main Feb 18, 2022
@thehomebrewnerd thehomebrewnerd mentioned this pull request Mar 15, 2022
@rwedge rwedge deleted the time-series-documentation branch June 16, 2022 15:52
Development

Successfully merging this pull request may close these issues.

Add rolling gap primitives to the Handling Time guide in documentation
4 participants