Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Day/month functions #1545

Closed
spinus opened this Issue Apr 11, 2016 · 29 comments

Comments

Projects
None yet
10 participants
@spinus
Copy link

spinus commented Apr 11, 2016

I have few workflows when I need to check metrics over specific days.

It would be nice to have few functions which return a day of the week, day of the year, month etc.

examples:

day_of_the_week(time()) == 2
month(time()) == 12

@brian-brazil brian-brazil changed the title time functions Day/month functions Apr 18, 2016

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Apr 18, 2016

The question is what timezone would these answers be in?

@grobie

This comment has been minimized.

Copy link
Member

grobie commented Apr 18, 2016

As with all dates in Prometheus: UTC?

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 19, 2016

Obviously, once you talk about days of the week, you are probably fishing for things like "if this happens during the weekend" and such, which often refers to the timezone the user is in... with hour_of_the_day, you could at least take static offsets to UTC into account (and everybody should ignore DST changes anyway ;).

Definitely more implications here…

@juliusv

This comment has been minimized.

Copy link
Member

juliusv commented Apr 19, 2016

Ugh... would we want that functionality in the query language though? I thought that would conceptually better fit into AM.

@spinus

This comment has been minimized.

Copy link
Author

spinus commented Apr 21, 2016

@grobie +1 for UTC
@juliusv but AM use query language to "trigger" alerts, right?

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 21, 2016

The alerting expressions are all on the Prometheus side. Alertmanager receives the alerts, de-dups them, and decides where to send them. If the "day of the week" use-case is to send alert here or there depending on the time, it would be more an Alertmanager thing.

@spinus

This comment has been minimized.

Copy link
Author

spinus commented Apr 21, 2016

IMHO this could be just a function you can use in the query, that way you can use those function to make other queries/calculations as well and I think it's simple enough to implement as well (just add a fuction) where adding this to alert manager can me more work and time as there is no such concepts there - but this is only my view on that case.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 21, 2016

@spinus Perhaps explain your workflows in more detail? The Prometheus community is very use-case driven, i.e. we want to keep Prometheus components as lean as possible, and the features we do have should encourage best practices and not be prone to be used in an anti-pattern. A number of good use-cases is usually the best start to add a feature.

@barkerd427

This comment has been minimized.

Copy link

barkerd427 commented Apr 21, 2016

I'll share mine. We're a financial company, so we only have most batch jobs
run when the stock market is open. Every weekend I have alerts that I need
to silence until the next run. This includes holidays. At the least, it
would be nice to have something closer to cron functionality where I can
specify that alerts can be suppressed on weekends. Further, I'd like to
have a more configurable setup so I can stop alerts on holidays. I want
some alerts to come through on these days, but not any of our batch jobs.
I'm currently using a function that determines if the job has run in the
last day so we don't miss a failed run.
On Apr 21, 2016 7:23 AM, "Björn Rabenstein" notifications@github.com
wrote:

@spinus https://github.com/spinus Perhaps explain your workflows in
more detail? The Prometheus community is very use-case driven, i.e. we want
to keep Prometheus components as lean as possible, and the features we do
have should encourage best practices and not be prone to be used in an
anti-pattern. A number of good use-cases is usually the best start to add a
feature.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#1545 (comment)

@spinus

This comment has been minimized.

Copy link
Author

spinus commented Apr 21, 2016

@beorn7 sure thing.

I want to monitor some batch jobs.

  • task 1 runs every Tuesday
  • task 2 runs every day except weekends

I need to be notified on all failures and as well when the task won't start at all.
My current setup for 1 task is: if the job didn't run in last week -> alert, if there was any failure -> alert.

This has some downsides as if the job fails on Tuesday and somebody will rerun it on Wednesday, first alert condition is not true anymore. Any other ideas how this can be configured?

(for now I'm asking people to implement service level checks to return which means alert logic is shifted into a check/script but would be nice to be able to do that in prometheus)

@SypleMatt

This comment has been minimized.

Copy link

SypleMatt commented Jun 27, 2016

In my opinion - at least in the scenarios I have encountered directly - these time based situations are almost always specific to the TZ that the server targets. At least in the case of batch/background jobs. In the example from @barkerd427 the timezone is defined by the business (the timezone used by the stock exchange).

So I think it usually makes sense to calculate these TZ sensitive values based on the local timezone of the server. Maybe two versions of each method, one based on local time of the server being monitored and another based on UTC, would provide a lot of flexibility. Not sure what implications that'd have internally though if the time series is aggregated across multiple servers in multiple timezones.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jun 28, 2016

I would expect that the local timezone of the server would tend to be UTC in many cases, which is not the timezone in which the server is located. If we were to offer timezone functions we'd need to open it up to all timezones which I think is a bit much complexity.

@barkerd427

This comment has been minimized.

Copy link

barkerd427 commented Jun 28, 2016

I would think UTC would be acceptable. Timezones can get very complicated.
On Jun 27, 2016 21:20, "Brian Brazil" notifications@github.com wrote:

I would expect that the local timezone of the server would tend to be UTC
in many cases, which is not the timezone in which the server is located. If
we were to offer timezone functions we'd need to open it up to all
timezones which I think is a bit much complexity.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1545 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AF0uxWdrWjxEH6MacOjKd5cNCJ970K76ks5qQITogaJpZM4IEjG8
.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jun 28, 2016

Implementing the function(s) just for UTC is fairly straight forward, and it will help many people already. For their local timezone, they could work with a fixed offset, which is just a minor annoyance in the syntax.
DST will interfere, of course, but we could sell our limited functionality as a political statement. ;->

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jun 28, 2016

I'm for these functions with just UTC.

@jkemp101

This comment has been minimized.

Copy link

jkemp101 commented Jul 24, 2016

In my use case I need to have different alert thresholds for the size of queues on the weekend versus during the work week. UTC would be fine, I can convert as necessary and don't really care about DST because I don't need that fine of designation for my current use case.

I would also like to add a request for hour_of_day() so I can determine if its during the day or at night.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 24, 2016

For queues it's generally best to monitor how long things are taking to get through, rather than the size of the queue which changes hour to hour and month to month.

@tcolgate

This comment has been minimized.

Copy link
Contributor

tcolgate commented Aug 18, 2016

I gave a talk on faking UTC calculations with recording rules. It's never going to be super accurate, but there's room for improvement, and it's workable for 'month end calculations.
Slides are here: http://go-talks.appspot.com/github.com/tcolgate/prometheus-rules/presentation/meetup.slide
The rules are here https://github.com/tcolgate/prometheus-rules

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Aug 19, 2016

Thanks @tcolgate . About the leap-second issue we shortly discussed in the meetup: http://www.madore.org/~david/computers/unix-leap-seconds.html https://en.wikipedia.org/wiki/Unix_time#Leap_seconds

I guess whatever time() returns depends on the system configuration. But I'd expect it to be ignorant of leap seconds in most cases.

@tcolgate

This comment has been minimized.

Copy link
Contributor

tcolgate commented Aug 19, 2016

Thanks, I'll try and get my head around that lot. In truth, the biggest
issue is correctly working out the current year. I need to actually test
the calculation against UTC and see if it actually hold true (I suspect it
does not). I've vague ideas that there are alternative options based on
mods since 1972.
26 seconds one way or another isn't a big deal if you can't work out what
year you are in :)

On Fri, 19 Aug 2016 at 10:29 Björn Rabenstein notifications@github.com
wrote:

Thanks @tcolgate https://github.com/tcolgate . About the leap-second
issue we shortly discussed in the meetup:
http://www.madore.org/~david/computers/unix-leap-seconds.html
https://en.wikipedia.org/wiki/Unix_time#Leap_seconds

I guess whatever time() returns depends on the system configuration. But
I'd expect it to be ignorant of leap seconds in most cases.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#1545 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAEo86Tp_UYM7CXwLUD7mKL-CSvrWnNXks5qhXeKgaJpZM4IEjG8
.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Aug 19, 2016

golang/go#8728 - upstream doesn't support leap seconds.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Aug 19, 2016

That's about parsing.

It doesn't answer the question if the numbers of seconds elapsed since 01-01-1970 returned by time.New().Unix() include leap seconds or not.https://golang.org/pkg/time/ is completely silence about the issue. It might very well depend on the implementation in the underlying OS.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Aug 19, 2016

It you look at the linked bugs you'll see that it doesn't include leap seconds.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Aug 19, 2016

Could you point out where that is written?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Aug 19, 2016

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Aug 19, 2016

That one is more stating the opposite: "Barring that, it seems to me that we should document that when we say UTC we actually mean TAI, and document that time.Now automatically adds the appropriate number of leap seconds."
Looking at the code, I think that time.Now adds leap seconds. This all is really confusing. Taking the time package documentation literally, leap seconds should be added, but the fact that they don't should be considered a bug. Or they should change the documentation, which they kind of don't want to do according to the linked bug. @rsc says "The calendrical calculations always assume a Gregorian calendar, with no leap seconds." but that's again not affecting unix time.

@tcolgate

This comment has been minimized.

Copy link
Contributor

tcolgate commented Aug 22, 2016

I've managed to make the start of year calculation accurate up to 2099, which will do for me. I think, I can probably get it accurate beyond that too, using a similar approach.
The updated rule has been pushed to the repository

@spinus

This comment has been minimized.

Copy link
Author

spinus commented Aug 30, 2016

Thank you!

@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.