Raise an alert if a metric has not been captured in the last N seconds or minutes #136

Open
avtar opened this Issue Sep 12, 2013 · 13 comments

@avtar

avtar commented Sep 12, 2013

It would be great if there was a way of generating alerts when a desired metric stops being available via Graphite. The use case would be to detect when a service or host goes offline. Thoughts?

@obazoud

Contributor

obazoud commented Sep 12, 2013

+1

@randeepbhatia

randeepbhatia commented Sep 12, 2013

+1

Randeep Singh

@paulcgt

paulcgt commented Oct 2, 2013

+1

@scobal

Owner

scobal commented Oct 3, 2013

I'd like to see this too. It's not trivial to implement because the Graphite API sometimes returns null values even for healthy metrics; see this workaround:

https://github.com/scobal/seyren/blob/master/seyren-core/src/main/java/com/seyren/core/service/checker/GraphiteTargetChecker.java#L72

However, maybe we could do something if we spot multiple null values being returned...
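One way to picture the "multiple null values" idea is a minimal Python sketch (not Seyren code) that works on the `datapoints` array Graphite returns from its JSON render output, where each entry is a `[value, timestamp]` pair. The function names `last_seen` and `is_stale` and the `max_age_seconds` parameter are hypothetical, chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple

# (value, unix timestamp), mirroring Graphite's JSON "datapoints" entries.
Datapoint = Tuple[Optional[float], int]


def last_seen(datapoints: Sequence[Datapoint]) -> Optional[int]:
    """Return the timestamp of the newest non-null datapoint, or None."""
    for value, timestamp in reversed(datapoints):
        if value is not None:
            return timestamp
    return None


def is_stale(datapoints: Sequence[Datapoint], now: int, max_age_seconds: int) -> bool:
    """True if no non-null value was recorded within the last max_age_seconds."""
    seen = last_seen(datapoints)
    return seen is None or now - seen > max_age_seconds


# A single trailing null is tolerated; a longer run of nulls is flagged.
series = [(1.0, 100), (2.0, 110), (None, 120), (None, 130)]
print(is_stale(series, now=130, max_age_seconds=10))  # True
print(is_stale(series, now=130, max_age_seconds=30))  # False
```

Choosing `max_age_seconds` larger than the metric's reporting interval is what keeps an occasional null from raising a false alarm.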

@paulcgt

paulcgt commented Jan 23, 2014

Perhaps getLastValue() could return null as the target's value. When creating a Check one would have to allow "null" (or empty string) as a valid config value... but that does sound like it could be a bit of a mission. :-/

What I'll probably do to get around this is have something watch seyren.log for instances of InvalidGraphiteValueException thrown in getLastValue() and raise an alert (or record an additional "missing metric" metric) when I find those.
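A rough sketch of that log-watching workaround, in Python, might look like the following. It only assumes that the exception's class name appears somewhere in the log line; the actual layout of seyren.log is not shown in this thread, so the sample lines below are made up.

```python
from typing import Iterable

# Class name mentioned in the comment above; matched as a plain substring.
EXCEPTION_NAME = "InvalidGraphiteValueException"


def count_missing_metric_errors(lines: Iterable[str]) -> int:
    """Count log lines that mention the invalid-value exception."""
    return sum(1 for line in lines if EXCEPTION_NAME in line)


# Hypothetical log lines, for illustration only.
sample = [
    "ERROR checker failed: InvalidGraphiteValueException",
    "INFO check completed",
    "ERROR checker failed: InvalidGraphiteValueException",
]
print(count_missing_metric_errors(sample))  # 2
```

The count could then be sent to Graphite as the additional "missing metric" metric, or compared against a threshold to raise an alert directly.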

@anthroprose

anthroprose commented Mar 7, 2014

Please make this an option. I currently have metrics that are normally blank, and I only want alerts when they are NOT null....

Seyren currently lists these as an 'Unknown' status as the nulls are not 0s...

If anything, it would be nice if it handled both cases...

  1. Alert on null
  2. Treat nulls as 0
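
The two cases could be modelled as a per-check option. The Python sketch below is purely illustrative: `NullPolicy` and `evaluate` are hypothetical names, not part of Seyren, and it assumes thresholds are upper bounds (higher values are worse).

```python
from enum import Enum
from typing import Optional


class NullPolicy(Enum):
    ALERT = "alert"  # case 1: a null value is itself an alert condition
    ZERO = "zero"    # case 2: a null value is treated as 0


def evaluate(value: Optional[float], warn: float, error: float, policy: NullPolicy) -> str:
    """Map a (possibly null) value to a state name under the chosen policy."""
    if value is None:
        if policy is NullPolicy.ALERT:
            return "ERROR"
        value = 0.0
    if value >= error:
        return "ERROR"
    if value >= warn:
        return "WARN"
    return "OK"


print(evaluate(None, warn=5, error=10, policy=NullPolicy.ALERT))  # ERROR
print(evaluate(None, warn=5, error=10, policy=NullPolicy.ZERO))   # OK
```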

@scobal

Owner

scobal commented Mar 7, 2014

Hi Alex,

We're always happy to accept pull requests for features like this. If you want to fork and submit a pull request we can review and hopefully merge to origin/master.

Mark

@anthroprose

anthroprose commented Mar 8, 2014

So, I can write code in a lot of different languages, but Java is sadly not among them.... you probably don't want me mucking around with complex functionality....

I do however think that I can pull off my use case of treating nulls as 0 without breaking anything as it only involves the config, and the GraphiteTargetChecker::getLatestValue function (along with tests...)

I'll see what I can come up with. (btw... I'm coming over from tattle nonworking hell so I'm loving a decently baked tool)

@anthroprose

anthroprose commented Mar 13, 2014

Not sure if this will cover everyone's use case, but this is what I came up with:

transformNull(stats_counts.production.counter, 0)

This allows me to alert on values that are normally null but suddenly become 1 or greater (exceptions, etc.).

Depending on what counts as a 'normal' range of values, you could do something like:

transformNull(stats_counts.production.counter, 999999999)

So if your normal readings are usually <=100, a sudden appearance of 999999999 would be noticed. Since transformNull is applied only in the target that Seyren queries, the metrics stored in Graphite are not actually affected.
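
To make the trick concrete, here is a small Python sketch that wraps a target in `transformNull` and builds the matching Graphite render URL. The helper names and the host `graphite.example.com` are hypothetical; `target`, `from` and `format` are standard render API parameters.

```python
from urllib.parse import urlencode


def wrap_transform_null(target: str, default: int = 0) -> str:
    """Wrap a Graphite target so null datapoints are replaced by `default`."""
    return f"transformNull({target}, {default})"


def render_url(base_url: str, target: str, window: str = "-5min") -> str:
    """Build a render API URL that returns the target as JSON."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url}/render?{query}"


target = wrap_transform_null("stats_counts.production.counter", 0)
print(target)  # transformNull(stats_counts.production.counter, 0)
print(render_url("http://graphite.example.com", target))
```

Because the wrapping happens only in the query, swapping `0` for a sentinel such as `999999999` changes what the check sees without touching the stored data.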

@dukebody

dukebody commented Nov 7, 2014

Thanks @anthroprose for the trick!

+1 for this feature. I'm not a Java programmer so I don't think I can contribute a pull request here either. :-/

I'd prefer it if alerts could also be configured for UNKNOWN the same way as for ERROR or WARNING, instead of treating UNKNOWNs as 0.

@CsabaSzabo

CsabaSzabo commented Dec 18, 2015

+1

@yukinami

yukinami commented Dec 30, 2016

+1

@kjenney

kjenney commented May 7, 2017

+1
