Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

You should be able to specify scalars such as 13h #419

Open
brian-brazil opened this Issue Dec 9, 2014 · 29 comments

Comments

Projects
None yet
4 participants
@brian-brazil
Copy link
Member

brian-brazil commented Dec 9, 2014

Duration thresholds are common in alerts, you should be able to easily and readably provide them as scalars.

For example 13h, 10m, 500s, 5d, 200ms. I don't think we need full ParseDuration support, just one unit of ns/us/ms/s/h/d.

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Dec 9, 2014

Not sure what you mean. We have FOR 5m, FOR 1h, and so on for ALERT statements already. What are you suggesting?

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Dec 9, 2014

I want to be able to durations inside expressions, for example: time() - last_success_time > 13h

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Dec 9, 2014

A-ha! Got it. One option could be to extend the grammar. Another could be to add a function for that. What would you think about a function like seconds("13h") or to_seconds("13h")? Better name suggestions welcome.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Feb 4, 2015

Seems straightforward. Seconds only seems limiting, though.
Many metrics are in milliseconds. A function duration(d, unit='s') with unit defaulting to seconds could be useful as well.

duration("3h") == 10800
duration("1m30s", "ms") == 60000

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

That's a bit verbose, I was thinking also a ms suffix which would multiply things by .001, similarly for us and ns.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Feb 4, 2015

You mean like duration_(s|ms|us|ns)?
Or back to the syntax of your second comment?

Edit: I see what you mean. So there are two basic options - which one do wo pick?

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Feb 4, 2015

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

I'd propose that job:http_latency_seconds:mean5m{job="myjob"} > 150ms would do what you expect.

Having all your metrics have the same units rather than a arbitrary mix has it's uses 😃

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Feb 4, 2015

That's definitely an option, though I'd always think twice before adding something to the syntax itself. Just want to make sure we thought carefully about this first. I'm always less worried about adding functions because of how orthogonal they are to everything else. Then again, the lexer/parser could simply handle this as another way of specifying a number (which would lose the information about its original formatting though - e.g. if Prometheus prints a rule expression, it'd show "3600" instead of "1h").

However, as you rightly point out, adding it as a syntax would be most helpful if all metrics were in seconds. Given that that's not the case (for better or worse - but it would be impossible to enforce globally anyways), should we still do it vs. adding a function? Of course, the function could also be shorter, like d(). Hmmmm... I have to admit, putting it into the syntax is tempting though.

Btw., the same concept (grammar or not) could be useful for things other than time. Like 8M = 8 million, 8Mi = 8*1024^2.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

Yeah, I was thinking the big metric prefixes anyway. Not sure about the smaller ones, we'd have a clash on 'm' anyway for milli vs. minute.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Feb 4, 2015

So basically, all units are resolved to one fixed base unit, i.e., 150ms => 0.15, 3m => 180?
I can see it working with all metrics having the specified base unit, but as you said, that's not enforceable and prometheus exposes some *_milliseconds metrics itself, which kind of suggests what anyone else being new to the system will do (and drag along as legacy.)
If everyone ends up cluttering queries with stuff like 150us / 1000 this feature failed its purpose.

Then again, I cannot think of any example where a similar issue would realistically come up as for time units.

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Feb 4, 2015

@brian-brazil Oh, good point! That could actually be super confusing for people if they see "8M", "8T", etc. and assume they can do "8m" (as in 8 milli) as well, and end up getting 8 minutes worth of seconds instead. I'm more inclined to not support other units then and have them use scientific notation instead (which we still need to add).

@fabxc As you might notice, this still needs some sleeping over it, though I think it'll be a worthwhile feature in the end once it's thought through. The question of always keeping everything in seconds was/is an interesting debate where @brian-brazil prefers always-seconds, but an early community decision leaned towards more flexible units, but of course both have ups and downs. My main gripe with requiring every time metric to be in seconds is that it's so implicit, and not explicitly enforced by any part of the system. Documentation can help, but only up to a certain point... given that at the moment at least we haven't decided to convert everything into seconds, I tend to agree that this feature becomes less useful.

Hm.

@fabxc Btw., since this is somewhat contentious for now, if you're interested in less contentious starter tasks, feel free to join us on any of our communications channels: http://prometheus.io/community/ - we're quite active on IRC especially.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

There's other places, kB, MB, Mb (network engineers mostly work in bits) and hours could all be exported depending on the particular range the person doing the instrumentation felt was appropriate. I'm a big fan of encouraging the same base units everywhere, as far as practical.

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Feb 4, 2015

It'd be great if it was explicit. But requiring it implicitly, we're saying "you can use these functions, but they'll only do the right thing if you follow our conventions.". Feels off to me somehow. Maybe time for another community query :D

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

Yeah, a key part would be making the easiest way to use the client libraries to be the "correct" way. The go client for example requires the user to explicitly think about their time units, whereas the java simpleclient does not.

A counter-argument is that many exports of the jmx/collectd/node class will be in a variety of units as they're mostly straight proxies for the values (which leads to things like https://github.com/prometheus/prometheus/blob/master/consoles/node-overview.html#L41 where you have to explicitly scale). We could change the node_exporter, and offer scaling in jmx. Not so sure about collected, as the exporter doesn't have transforms.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Feb 4, 2015

@jrv Scientific notation for things other than m, h, d and w sounds good. We'd still need to ensure that the relevant durations are exported as seconds rather than minutes or hours (which is the case now AFAIK, as anything that big will be a unixtime).

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Feb 4, 2015

Thanks for the clarification and background info, @juliusv. I might join you on IRC when I've more time on my hands.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 6, 2015

As we don't have units this can never be part of the language as literals. We don't want to get into guessing games based on the suffix of the metric name.

A function as suggested back then is the only realistic way I see – duration('3m', 'ns') is still better than writing down 180000000000 or 18e10.

So can we either implement that function or close this?

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 6, 2015

I think it's perfectly fine to have this as language literals, by having convention for units (i.e. seconds in instrumentation)

I'm planning on implementing this.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 6, 2015

Convention's are not something to base types and units on in my opinion.
Especially not based on the simple assumption that everything is in seconds rather than at least looking at the suffix.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 6, 2015

I think it's fine to base things on convention, we already do so in many other places in Prometheus.

Users shouldn't have to worry about units, everything should magically work and we should do so in a simple way so that users can focus on more important things. I say that we do things simply by keeping the existing convention we already have within the clients and prometheus server that we work in terms of seconds.

The alternative is to build a system that requires everything to be explicitly plumbed through, adding complexity throughout the system for no real benefit.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 6, 2015

Yes, everyting should magically work – but assuming everything is in seconds even if the metric ends in _microseconds has nothing to do with 'magically works' but with 'wrong results for no apparent reason' (it should've magically gotten this right, right?).

We got the _total convention going and that we give units in full words, i.e. _milliseconds instead of _ms. Getting people to only use _seconds did not succeed - case in point client_golang.

Here's what would work: duration literals that resolve correctly against full-length unit suffixes (which we consistently have) and error strictly if the unit can not be inferred from one of _minutes, _seconds, _milliseconds, _microseconds, or _nanoseconds.
If someone gets this working on full query ASTs in a clean and maintainable way, thumbs up from my side.

The thing about 'magically works' is that it must not have unexpected side effects by the person applying the magic - understanding implementation details of magic to use magic makes it obsolete.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 6, 2015

assuming everything is in seconds even if the metric ends in _microseconds has nothing to do with 'magically works' but with 'wrong results for no apparent reason' (it should've magically gotten this right, right?).

This is the core problem really, units shouldn't be the concern of the person doing the instrumentation - instead they have to explicitly choose them in some clients which leads to variance.

I think this is what we should be tackling, as is makes the rest of the problem much much simpler then. The reason client_golang doesn't do seconds is that it predates all discussions on this, which is something we should resolve when it's reworked.

Adding logic as you propose is epicycles, for example how do I tell if a metric is meant to be a duration as against say a byte? Moving a display-level problem into the core of the language is madness in my opinion, let's leave display problems at the display level (e.g. promdash, console templates).

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 6, 2015

I should mention that I'm going to adding functions that need durations passed into them, which is why I'm looking at implementing this now. This is separate from the question of how we handle units in clients.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 6, 2015

Yes, it is madness - the "clean and maintainable" comment was meant to imply that.
Don't get me wrong, I'd prefer to have everything in seconds but we cannot develop against a convention we don't have, or equally that's only in our heads but not actively being worked on.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 6, 2015

What I'd propose is that when the go client is reworked, we add the timing convenience functions similar to what we have in Python/Java and have those in seconds. Thus the easiest way to do things would also follow the convention we'd like to have, and things will be better for new metrics at least. We could even add such functions now.

@grobie

This comment has been minimized.

Copy link
Member

grobie commented Aug 23, 2016

So, we're converging towards seconds everywhere it seems. Being able to use such literals in predict_linear() would increase readability a lot. If someone uses a metric returning time in a unit other than seconds, they simply shouldn't use these scalars or need to take care of the necessary multiplication.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Aug 23, 2016

I agree that we have progressed quite a bit in standardizing units – it's still just by contract. Especially client_golang metrics are still around using microseconds due to vendoring. So my concerns that this will cause issue after issue where people were not reading the docs remain.

In 1.0 we cannot really remove this again if it causes too much trouble.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jul 14, 2017

We don't have consensus on this, so Pmaybe.

simonpasquier pushed a commit to simonpasquier/prometheus that referenced this issue Oct 12, 2017

Merge pull request prometheus#419 from dahc/addmirth
Add Mirth exporter to third-party exporters list
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.