Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upDoes holt_winters function name match the implementation? #2458
Comments
This comment has been minimized.
This comment has been minimized.
|
@eliothedeman As the implementer of the function, could you have a look and comment? |
This comment has been minimized.
This comment has been minimized.
|
Holt winters can be double or triple per wikipedia. |
This comment has been minimized.
This comment has been minimized.
|
@brian-brazil Could you share the link to the article, please? I know there is Exponential smoothing article where, in Double exponential smoothing section, it is said:
I checked the article linked in note 12 and I didn't find such reference in it. Even more, I think it reaffirms my proposal. |
This comment has been minimized.
This comment has been minimized.
|
That's the reference. |
This comment has been minimized.
This comment has been minimized.
|
@brian-brazil While the Wikipedia article mentions double exponential smoothing as "Holt-Winters", the linked reference only calls the triple exponential smoothing under section 3.3 "Holt Winters", not the double smoothing in section 3.2. Other sources that I find also always refer to "triple smoothing" or "three terms". Are we missing something here? |
This comment has been minimized.
This comment has been minimized.
|
I talked to @eliothedeman this weekend and I believe he said that his implementation in Prometheus is the "double exponential smoothing" variety. Right, Eliot? |
This comment has been minimized.
This comment has been minimized.
morganhankins
commented
May 15, 2017
|
+1 to @mikefonted's suggestion that the existing code be renamed 'holt_linear' |
This comment has been minimized.
This comment has been minimized.
|
Hey, sorry about not seeing this. Have had notifications turned off for a while. The name I hear most often in reference to these functions is holt-winters double and holt-winters triple. Maybe rename to holt_winters_double? I wouldn't know what holt_linear was by reading it. |
brian-brazil
added
priority/Pmaybe
component/promql
help wanted
labels
Jul 14, 2017
gouthamve
added
the
hacktoberfest
label
Sep 28, 2017
This comment has been minimized.
This comment has been minimized.
|
I'd be happy to take this on, but I assume we'd want to leave it alone until 2.0? |
This comment has been minimized.
This comment has been minimized.
|
@mdlayher I don't know whether this should be changed (in place, without introducing a new function), but if it should be changed, it needs to happen before 2.0. Otherwise it'd have to wait until Prometheus 3.0, since it's a breaking change :) |
This comment has been minimized.
This comment has been minimized.
|
Makes sense! I just wanted to ping since I saw @gouthamve apply the hacktoberfest tag. |
This comment has been minimized.
This comment has been minimized.
|
We could exclude this function from the 2.0 stability guarantees. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 18, 2017
|
I have played with this a bit. There are some decent reasons to not do triple exponential. You have to capture at least twice the expected seasonality to pretrain the model, and if you have to specify the seasonality in advance, getting it wrong gives essentially an expensive approximate double exponential. Beyond that, its a really fundamentally different object than p8s calculations broadly need to store. That said, I'd be happy to help out with this if there was interest. |
This comment has been minimized.
This comment has been minimized.
|
For our use case (consumer platform), we have a pretty strong use case for "real" Holt Winters: our usage cycles are pretty regular on a 24-hour and 7-day basis. For almost all our Prometheus servers, we have the required 2 periods of the latter, as is the default. I stumbled over this naming mismatch as well, when trying to make use of this seasonality for alerting on irregularities. The double-exponential was only mildly useful for my case. +1 for renaming the current function. We could just sidestep the Holt/Holt-Winters issue by calling them |
This comment has been minimized.
This comment has been minimized.
|
also +1 for implementing triple exponential, yes it's expensive, but I would be surprised if it's not still IO bound for most reasonable intervals. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 18, 2017
|
@matthiasr So That actually touches on something that I've been curious about. Seasonality detection usually requires multiple units of the timespan of the seasonality to handle appropriately, but at the same time prometheus is typically more short-term, so I'd never expect to detect or visualize anything more than 2 weeks or so within the platform. Have you had any thoughts about how to handle this? I think the renaming issue is slightly more complicated, though. There are 2 triple exponential versions and they're different in ways that cause them to behave very differently from a mathematical perspective. Which one of the two did you envision including? |
This comment has been minimized.
This comment has been minimized.
|
If we're going to rename the existing function, we should do it soon. |
This comment has been minimized.
This comment has been minimized.
This seems reasonable to me. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 26, 2017
|
Ok. It seems that my group's planning and prioritization process will make it possible for me to contribute both holt winters 3 algorithms, but perhaps not before the final 2.0 release is cut. What should I plan on the names to be? Should the holt winters 2 method name be changed before final in anticipation of these extras being there soon? |
This comment has been minimized.
This comment has been minimized.
|
If we want to rename the function, it must happen before the 2.0 release (planned for November 1st). Adding more functions can happen in a 2.1 release. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 26, 2017
|
That sounds great. I think calling the current holt winters double_exponential would be fine, and the two new ones could be triple_exponential_additive and triple_exponential_multiplicative. |
This comment has been minimized.
This comment has been minimized.
Or we could exclude it from the stability guarantees. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 27, 2017
|
I am fine with that solution, too. Seems easier for now and I could do the renaming/aliasing as part of the triple holt winters work. |
This comment has been minimized.
This comment has been minimized.
|
+1 for deprecating, but not touching it now, and also keeping it around for
a few minor versions via the alias
…On Fri, Oct 27, 2017 at 12:56 PM Chris ***@***.***> wrote:
I am fine with that solution, too. Seems easier for now and I could do the
renaming/aliasing as part of the triple holt winters work.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2458 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAICBqxYMEJBux99px_sC9iT7Z43al3Dks5swdL4gaJpZM4MPh8N>
.
|
This comment has been minimized.
This comment has been minimized.
That's already 2 seasonalities for us – year-over-year is less pronounced than week-over-week in our case. We also have some Prometheus servers that collect pre-aggregated data for much longer, and where we would be able to do predictions based on a years' worth of data or more. It would be impossible to process the raw data over this timeframe anyway because of the IO needed. In the end, Prometheus doesn't really have an upper bound on retention (that I've encountered so far). It's up to the user to decide how to balance storage, retention, dimensions and seasonalities. |
This comment has been minimized.
This comment has been minimized.
|
From a general reliability standpoint, I'd be wary of anything that needed to pull in data from remote storage (say more than a week/month of history) in order to be useful. As many users seem to want to use this for alerting (for better or worse) we should be targeting time ranges that we can reasonably expect to be handled entirely via local storage. Once we get beyond that we're pretty much in reporting territory, which is best done outside Prometheus. |
This comment has been minimized.
This comment has been minimized.
|
Yes, we can document the dangers of doing queries over long time ranges. I don't plan on using this for alerting, but "how much traffic should I expect right now" is something I would like to use in alerting. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 27, 2017
|
I'm happy to indicate those dangers in whatever documentation. I can think of nothing less useful, in this case, than trying to query a year worth of data to build a model that crashes/slows something about the monitoring platform. |
This comment has been minimized.
This comment has been minimized.
|
Prometheus can probably handle it (and if not, it should be obvious fairly quickly), the issue is more if the remote storage goes down and suddenly a critical alert stops working. Keep in mind as well that by default recording rules and alerts will not use remote storage for this reason (though that's not implemented yet). |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Oct 27, 2017
|
Ah, I was thinking more of having tons of alerts that all rely through the new HW on remote storage to work and have to pay the additional penalties for this (or force loading an enormous amount of data into local storage) and the way that this would put unexpected pressures on the platform, compared to the normal alerting concerns. |
This comment has been minimized.
This comment has been minimized.
Oh it was implemented in my initial remote read PR https://github.com/prometheus/prometheus/pull/2499/files, but in 2.0 that functionality seems still missing? |
This comment has been minimized.
This comment has been minimized.
|
That's a concern in general, but we'll have advice to be very careful with remote storage in rules and it'll be an opt-in thing. If however the outcome of this issue was a function that effectively required remote storage, then that'd be a problem as we'd be sending users mixed messages. |
This comment has been minimized.
This comment has been minimized.
ghost
commented
Nov 13, 2017
|
I think that might be the central point of a huge problem. You can definitely construct a 3hw model without using too much historical data, but it won't be good for a long time, and the state will be lost and need to be re-accumulated if you restart p8s or lose the model for whatever reason. Beyond that, getting the parameters just right (the triple exponential models will do nonsense on a scale well beyond what the second order ones will if they have bad parameters) will likely require people to start by making a huge set of models and they'll want optimum performance up front so they can differentiate between them. To me, this sounds like a great way to cause problems with the storage backend in a way that contradicts your messaging. |
This comment has been minimized.
This comment has been minimized.
xkilian
commented
Mar 6, 2019
|
I do not think this function should make use of remote storage, but it should persist its internal series in the prometheus TSDB so that the model data is available and not have to restart from nothing. Documentation should state it should only be used on busy series that actually display seasonality (ex. corporate internet access, corporate backbone network links, back-office processing systems related series, etc. Anyone still willing to get this implemented in some form? |
This comment has been minimized.
This comment has been minimized.
|
That would be done via recording rules, which you should already be able to do today. |
mikefonted commentedMar 1, 2017
Hi,
First of all, I've just started looking at metrics and time series databases.
Some metrics libraries have a kind of metric named Meter that includes EWMA (Exponentially Weighted Moving Average) on it. So, I started searching for something similar on Prometheus querying functions webpage section and I've found holt_winters().
Then, I needed to read more on these stats topics and I found the Elasticsearch documentation very useful. But there is something I'm not sure about:
It seems that Holt-Winters method (triple exponential smoothing) is not the same thing that it's implemented in this repo, which looks more like the Holt’s Linear method (double exponential smoothing).
So, if my suspicions are true, my proposal is to change the name of the function from holt_winters to holt_linear.
Some related links:
+ Holt-Winters seasonal method
+ Holt's linear trend method