Exponentially smoothed moving average as aggregate function. #27511
Comments
I doubt it's possible to combine multiple states (each holding only the last value and a timestamp) and get a reasonable answer. An aggregate function that returns only the last value would also have limited use. The main problem right now is that without recursive functions (like arrayScan) it's impossible to calculate multiple values efficiently, that is, without recalculating the whole sequence for each value.
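To make the efficiency point concrete, here is a minimal Python sketch (not ClickHouse code). It assumes a fixed smoothing factor `alpha` and evenly spaced samples, neither of which comes from this thread; `ema_scan` and `ema_recompute` are hypothetical names. It contrasts a one-pass recursive scan with recomputing the whole prefix for every output value.

```python
from itertools import accumulate


def ema_scan(values, alpha):
    """Running EMA in one pass: s_i = alpha * x_i + (1 - alpha) * s_{i-1}."""
    return list(accumulate(values, lambda s, x: alpha * x + (1 - alpha) * s))


def ema_recompute(values, alpha):
    """Same outputs, but restarts from the first element for every position."""
    out = []
    for i in range(len(values)):
        s = values[0]
        for x in values[1:i + 1]:
            s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out


series = [1.0, 2.0, 3.0, 4.0]  # made-up sample data
scanned = ema_scan(series, 0.5)
recomputed = ema_recompute(series, 0.5)
```

Both functions return the same list, but the scan does linear work while the recomputation does quadratic work, which is the cost described above when no recursive function is available.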
No doubt. This is how exponential smoothing typically works.
It will be used for aggregated materialized views (antifraud, user summaries) and with window functions (smoothing over a window).
I've received this reply in email but cannot see it on GitHub (probably a bug on GitHub).
We should probably have
@akuzm @alexey-milovidov @alz @filimonov, what about: So, we have N columns, which are calculated by:
In case of exponential smoothing,
where C is
BTW, aggregate functions currently cannot use lambdas in their arguments 😞
Yes, it cannot be calculated deterministically if data arrives in arbitrary order.
An exponentially smoothed moving sum can be calculated independently of the order of values.
See lagInFrame: it is not really an aggregate function.
EMA is like avgWeighted, so the SQL expression 'arraySum(arrayMap(pair -> pair.1 * pair.2, arrayZip(arrayResize(grouped_numbers, n, 0), weights))) / n as ema' should be divided not by n but by arraySum(weights), like this: arraySum(arrayMap(pair -> pair.1 * pair.2, arrayZip(arrayResize(grouped_numbers, n, 0), weights))) / arraySum(weights) as ema
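A small Python sketch of why the divisor matters (the values and weights are made up, and `weighted_average` is a hypothetical helper, not a ClickHouse function). For a constant series, a correctly normalised weighted average should return that constant, and dividing by the element count does not.

```python
def weighted_average(values, weights):
    """Weighted mean: sum(v * w) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [4.0, 4.0, 4.0]        # constant series, made up
weights = [0.5, 0.25, 0.125]    # decaying weights, made up

by_weight_sum = weighted_average(values, weights)
by_count = sum(v * w for v, w in zip(values, weights)) / len(values)
```

Here `by_weight_sum` recovers the constant 4.0, while `by_count` comes out well below it because the weights sum to less than the number of elements.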
The function will take two arguments, value and time, and one parameter: the half-decay period.
Example:
exponentialMovingAverage(300)(temperature, timestamp)
- exponentially smoothed moving average of the temperature for the past five minutes at the latest point of time.
The state of the aggregate function is the current averaged value and the latest time: (v, t).
Whenever a new value or a new state appears, the state is updated as:
(something like that; did I write the formula correctly?)
(does this way of calculation depend on the order of updates?)
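The order question can be checked numerically. Below is a minimal Python sketch; it is not the formula from this issue. It assumes the state (v, t) holds an exponentially decayed sum as of time t, and that two states merge by decaying both sums to the later timestamp with a factor of 0.5 per half-decay period. `State`, `merge` and the sample points are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
import itertools
import math


@dataclass(frozen=True)
class State:
    v: float  # exponentially decayed sum, expressed as of time t
    t: float  # latest timestamp seen so far


def merge(a: State, b: State, half_decay: float) -> State:
    """Combine two states by decaying both sums to the later timestamp."""
    t = max(a.t, b.t)
    v = (a.v * 0.5 ** ((t - a.t) / half_decay)
         + b.v * 0.5 ** ((t - b.t) / half_decay))
    return State(v, t)


# (value, time) pairs, made up; each single point is its own state.
points = [(10.0, 0.0), (20.0, 150.0), (30.0, 300.0)]
states = [State(v, t) for v, t in points]

# Merge the same states in every possible order.
results = [
    reduce(lambda a, b: merge(a, b, 300.0), perm)
    for perm in itertools.permutations(states)
]
```

Under these assumptions every ordering yields the same state up to floating-point rounding, because each value ends up scaled by 0.5 ** ((t_max - t_i) / half_decay) no matter which merges happen first. This covers the decayed sum only; turning it into an average needs a normalising weight, which the sketch leaves out.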