Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does holt_winters function name match the implementation? #2458

Open
mikefonted opened this Issue Mar 1, 2017 · 35 comments

Comments

Projects
None yet
@mikefonted
Copy link

mikefonted commented Mar 1, 2017

Hi,

First of all, I've just started looking at metrics and time series databases.

Some metrics libraries have a kind of metric named Meter that includes EWMA (Exponentially Weighted Moving Average) on it. So, I started searching for something similar on Prometheus querying functions webpage section and I've found holt_winters().

Then, I needed to read more on these stats topics and I found the Elasticsearch documentation very useful. But there is something I'm not sure about:

It seems that Holt-Winters method (triple exponential smoothing) is not the same thing that it's implemented in this repo, which looks more like the Holt’s Linear method (double exponential smoothing).

So, if my suspicions are true, my proposal is to change the name of the function from holt_winters to holt_linear.

Some related links:

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Mar 1, 2017

@eliothedeman As the implementer of the function, could you have a look and comment?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 1, 2017

Holt winters can be double or triple per wikipedia.

@mikefonted

This comment has been minimized.

Copy link
Author

mikefonted commented Mar 1, 2017

@brian-brazil Could you share the link to the article, please?

I know there is Exponential smoothing article where, in Double exponential smoothing section, it is said:

One method, sometimes referred to as "Holt-Winters double exponential smoothing"[12] works as follows

I checked the article linked in note 12 and I didn't find such reference in it. Even more, I think it reaffirms my proposal.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 1, 2017

That's the reference.

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Mar 1, 2017

@brian-brazil While the Wikipedia article mentions double exponential smoothing as "Holt-Winters", the linked reference only calls the triple exponential smoothing under section 3.3 "Holt Winters", not the double smoothing in section 3.2. Other sources that I find also always refer to "triple smoothing" or "three terms". Are we missing something here?

@mdlayher

This comment has been minimized.

Copy link
Member

mdlayher commented Apr 10, 2017

I talked to @eliothedeman this weekend and I believe he said that his implementation in Prometheus is the "double exponential smoothing" variety. Right, Eliot?

@morganhankins

This comment has been minimized.

Copy link

morganhankins commented May 15, 2017

+1 to @mikefonted's suggestion that the existing code be renamed 'holt_linear'

@eliothedeman

This comment has been minimized.

Copy link
Contributor

eliothedeman commented May 15, 2017

Hey, sorry about not seeing this. Have had notifications turned off for a while. The name I hear most often in reference to these functions is holt-winters double and holt-winters triple. Maybe rename to holt_winters_double? I wouldn't know what holt_linear was by reading it.

@mdlayher

This comment has been minimized.

Copy link
Member

mdlayher commented Sep 28, 2017

I'd be happy to take this on, but I assume we'd want to leave it alone until 2.0?

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Sep 28, 2017

@mdlayher I don't know whether this should be changed (in place, without introducing a new function), but if it should be changed, it needs to happen before 2.0. Otherwise it'd have to wait until Prometheus 3.0, since it's a breaking change :)

@mdlayher

This comment has been minimized.

Copy link
Member

mdlayher commented Sep 28, 2017

Makes sense! I just wanted to ping since I saw @gouthamve apply the hacktoberfest tag.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Sep 29, 2017

We could exclude this function from the 2.0 stability guarantees.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 18, 2017

I have played with this a bit. There are some decent reasons to not do triple exponential. You have to capture at least twice the expected seasonality to pretrain the model, and if you have to specify the seasonality in advance, getting it wrong gives essentially an expensive approximate double exponential. Beyond that, its a really fundamentally different object than p8s calculations broadly need to store. That said, I'd be happy to help out with this if there was interest.

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Oct 18, 2017

For our use case (consumer platform), we have a pretty strong use case for "real" Holt Winters: our usage cycles are pretty regular on a 24-hour and 7-day basis. For almost all our Prometheus servers, we have the required 2 periods of the latter, as is the default.

I stumbled over this naming mismatch as well, when trying to make use of this seasonality for alerting on irregularities. The double-exponential was only mildly useful for my case.

+1 for renaming the current function. We could just sidestep the Holt/Holt-Winters issue by calling them double_exponential and triple_exponential, which would allow keeping the current name as a hidden/deprecated compatibility alias to double_exponential.

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Oct 18, 2017

also +1 for implementing triple exponential, yes it's expensive, but I would be surprised if it's not still IO bound for most reasonable intervals.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 18, 2017

@matthiasr So That actually touches on something that I've been curious about. Seasonality detection usually requires multiple units of the timespan of the seasonality to handle appropriately, but at the same time prometheus is typically more short-term, so I'd never expect to detect or visualize anything more than 2 weeks or so within the platform. Have you had any thoughts about how to handle this?

I think the renaming issue is slightly more complicated, though. There are 2 triple exponential versions and they're different in ways that cause them to behave very differently from a mathematical perspective. Which one of the two did you envision including?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Oct 18, 2017

If we're going to rename the existing function, we should do it soon.

@mdlayher

This comment has been minimized.

Copy link
Member

mdlayher commented Oct 18, 2017

We could just sidestep the Holt/Holt-Winters issue by calling them double_exponential and triple_exponential, which would allow keeping the current name as a hidden/deprecated compatibility alias to double_exponential.

This seems reasonable to me.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 26, 2017

Ok. It seems that my group's planning and prioritization process will make it possible for me to contribute both holt winters 3 algorithms, but perhaps not before the final 2.0 release is cut.

What should I plan on the names to be? Should the holt winters 2 method name be changed before final in anticipation of these extras being there soon?

@grobie

This comment has been minimized.

Copy link
Member

grobie commented Oct 26, 2017

If we want to rename the function, it must happen before the 2.0 release (planned for November 1st). Adding more functions can happen in a 2.1 release.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 26, 2017

That sounds great. I think calling the current holt winters double_exponential would be fine, and the two new ones could be triple_exponential_additive and triple_exponential_multiplicative.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Oct 27, 2017

it must happen before the 2.0 release

Or we could exclude it from the stability guarantees.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 27, 2017

I am fine with that solution, too. Seems easier for now and I could do the renaming/aliasing as part of the triple holt winters work.

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Oct 27, 2017

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Oct 27, 2017

so I'd never expect to detect or visualize anything more than 2 weeks or so within the platform. Have you had any thoughts about how to handle this?

That's already 2 seasonalities for us – year-over-year is less pronounced than week-over-week in our case. We also have some Prometheus servers that collect pre-aggregated data for much longer, and where we would be able to do predictions based on a years' worth of data or more. It would be impossible to process the raw data over this timeframe anyway because of the IO needed.

In the end, Prometheus doesn't really have an upper bound on retention (that I've encountered so far). It's up to the user to decide how to balance storage, retention, dimensions and seasonalities.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Oct 27, 2017

From a general reliability standpoint, I'd be wary of anything that needed to pull in data from remote storage (say more than a week/month of history) in order to be useful. As many users seem to want to use this for alerting (for better or worse) we should be targeting time ranges that we can reasonably expect to be handled entirely via local storage.

Once we get beyond that we're pretty much in reporting territory, which is best done outside Prometheus.

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Oct 27, 2017

Yes, we can document the dangers of doing queries over long time ranges. I don't plan on using this for alerting, but "how much traffic should I expect right now" is something I would like to use in alerting.

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 27, 2017

I'm happy to indicate those dangers in whatever documentation. I can think of nothing less useful, in this case, than trying to query a year worth of data to build a model that crashes/slows something about the monitoring platform.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Oct 27, 2017

Prometheus can probably handle it (and if not, it should be obvious fairly quickly), the issue is more if the remote storage goes down and suddenly a critical alert stops working. Keep in mind as well that by default recording rules and alerts will not use remote storage for this reason (though that's not implemented yet).

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 27, 2017

Ah, I was thinking more of having tons of alerts that all rely through the new HW on remote storage to work and have to pay the additional penalties for this (or force loading an enormous amount of data into local storage) and the way that this would put unexpected pressures on the platform, compared to the normal alerting concerns.

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Oct 27, 2017

Keep in mind as well that by default recording rules and alerts will not use remote storage for this reason (though that's not implemented yet).

Oh it was implemented in my initial remote read PR https://github.com/prometheus/prometheus/pull/2499/files, but in 2.0 that functionality seems still missing?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Oct 27, 2017

That's a concern in general, but we'll have advice to be very careful with remote storage in rules and it'll be an opt-in thing. If however the outcome of this issue was a function that effectively required remote storage, then that'd be a problem as we'd be sending users mixed messages.

@ghost

This comment has been minimized.

Copy link

ghost commented Nov 13, 2017

I think that might be the central point of a huge problem. You can definitely construct a 3hw model without using too much historical data, but it won't be good for a long time, and the state will be lost and need to be re-accumulated if you restart p8s or lose the model for whatever reason.

Beyond that, getting the parameters just right (the triple exponential models will do nonsense on a scale well beyond what the second order ones will if they have bad parameters) will likely require people to start by making a huge set of models and they'll want optimum performance up front so they can differentiate between them.

To me, this sounds like a great way to cause problems with the storage backend in a way that contradicts your messaging.

@xkilian

This comment has been minimized.

Copy link

xkilian commented Mar 6, 2019

I do not think this function should make use of remote storage, but it should persist its internal series in the prometheus TSDB so that the model data is available and not have to restart from nothing.
Is it still in the plan to get real Hol-Winters function. I was successfully using this, and tuning values to meet actual seasonality with Cricket and RRDtool.

Documentation should state it should only be used on busy series that actually display seasonality (ex. corporate internet access, corporate backbone network links, back-office processing systems related series, etc.

Anyone still willing to get this implemented in some form?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 7, 2019

That would be done via recording rules, which you should already be able to do today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.