Aggregations: Add serial differencing aggregation #10190
polyfractal added the >feature, v2.0.0-beta1, WIP, and :Search/Aggregations labels Mar 20, 2015
colings86 referenced this pull request Mar 20, 2015: Add ability to perform computations on aggregations #9876 (Closed)
s1monw assigned colings86 Mar 20, 2015
polyfractal (Member) commented Mar 24, 2015
Just a note: I think I want to rename periods parameter to lags. More standardized naming, and I think a bit more descriptive.
colings86 (Member) commented Mar 25, 2015
+1 to renaming periods to lags, but what is the effect of having multiple lags? Or is that not the reason why it's plural? We can still only get one output per bucket, right?
polyfractal (Member) commented Mar 25, 2015
Oh, good point. It should be lag. I think supporting multiple lags would be very confusing and unnecessary.
Agreed, we should not try to support multiple lags in one reducer
bleskes and others added some commits Mar 27, 2015
rmuir and others added some commits May 14, 2015
polyfractal removed the v2.0.0-beta1 label May 15, 2015
Closing since this is against an outdated branch (
polyfractal closed this May 15, 2015
polyfractal referenced this pull request May 15, 2015: Aggregations: add serial differencing pipeline aggregation #11196 (Merged)
polyfractal removed the :Search/Aggregations, >feature, and WIP labels May 19, 2015
kingaj commented Feb 25, 2016
Is there a Java example? If so, please share a link.
polyfractal commented Mar 20, 2015
This is still a WIP, just putting it up for discussion. We may want to roll this functionality into a different Agg.
Serial Differencing
Serial differencing (or just differencing) is a technique where a time series is subtracted from itself at different time lags or periods. For example, the datapoint f(x) = f(x_t) - f(x_{t-n}), where n is the period being used.
A period of 1 is equivalent to a derivative: it is simply the change from one point to the next. Single periods are useful for removing constant, linear trends.
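As a rough illustration of the formula above, here is a minimal Python sketch of lagged differencing over a plain list of values. The function name `serial_diff` and the choice to emit `None` for the first n points (where no earlier value exists) are assumptions made for this example; the PR does not state how the aggregation treats those leading buckets.

```python
from typing import List, Optional, Sequence


def serial_diff(values: Sequence[float], lag: int = 1) -> List[Optional[float]]:
    """Return values[t] - values[t - lag] for each position t.

    Positions with no value `lag` steps back are filled with None.
    That padding is an assumption of this sketch, not documented behavior.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    result: List[Optional[float]] = []
    for t, current in enumerate(values):
        if t < lag:
            result.append(None)  # nothing to subtract yet
        else:
            result.append(current - values[t - lag])
    return result


# A purely linear series: the first difference is the constant slope.
linear = [2, 4, 6, 8]
print(serial_diff(linear, lag=1))
```

With a lag of 1 on a linear series, every defined output equals the slope, which is the sense in which a single period removes a constant linear trend.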
Single periods are also useful for transforming data into a stationary series. In this example, the Dow Jones is plotted over ~250 days. The raw data is not stationary, which would make it difficult to use with some techniques.
But once we plot the first difference, it becomes a stationary series (we know this because the first difference is randomly distributed around zero and doesn't seem to exhibit any pattern or behavior). The transformation reveals that the dataset follows a random-walk model, which allows us to apply further analysis.
Larger periods can be used to remove seasonal / cyclic behavior. In this example, a population of lemmings was synthetically generated with a sine wave + constant linear trend + random noise. The sine wave has a period of 30 days.
The first-difference removes the constant trend, leaving just a sine wave. The 30th-difference is then applied to the first-difference to remove the cyclic behavior, leaving a stationary series which is amenable to other analysis.
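The lemming example can be approximated offline with the sketch below. Only the 30-day sine period comes from the description; the slope, amplitude, noise level, series length, seed, and the helper names `make_lemmings` and `serial_diff` are hypothetical values chosen for illustration. It mirrors the two-step chain described above (first difference, then 30th difference of that result) and is not the aggregation's actual implementation.

```python
import math
import random


def serial_diff(values, lag):
    """Lagged difference; missing (None) inputs propagate as None."""
    out = []
    for t, current in enumerate(values):
        previous = values[t - lag] if t >= lag else None
        if current is None or previous is None:
            out.append(None)
        else:
            out.append(current - previous)
    return out


def make_lemmings(days=365, period=30, slope=0.5, amplitude=10.0,
                  noise=1.0, seed=42):
    """Synthetic series: linear trend + sine wave + Gaussian noise."""
    rng = random.Random(seed)
    return [
        slope * day
        + amplitude * math.sin(2 * math.pi * day / period)
        + rng.gauss(0, noise)
        for day in range(days)
    ]


lemmings = make_lemmings()
# Step 1: the first difference turns the linear trend into a constant offset.
first_difference = serial_diff(lemmings, lag=1)
# Step 2: the 30th difference of that result cancels the 30-day cycle.
thirtieth_difference = serial_diff(first_difference, lag=30)
```

With `noise=0.0` the final series is zero up to floating-point error wherever it is defined; with noise, what remains is a combination of the noise terms scattered around zero.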
API
{
  "aggs": {
    "my_date_histo": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day"
      },
      "aggs": {
        "the_sum": {
          "sum": { "field": "lemmings" }
        },
        "first_difference": {
          "diff": {
            "bucketsPath": "the_sum",
            "periods": 1
          }
        },
        "thirtieth_difference": {
          "diff": {
            "bucketsPath": "first_difference",
            "periods": 30
          }
        }
      }
    }
  }
}
TODO