Added updating of existing Incidents so new incidents are constantly … #15

rjr162 · 2016-12-27T05:22:23Z

…created with each new warning/critical

Added a metric update option by using the event_handler in the following fashion:

    event_handler   cachet_notify!<host> -m=true

The cachet_notify script checks the string for '-m=true' and then takes the first word (everything before the first space) as the component name.
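
As a rough illustration of the parsing described above, here is a minimal Python sketch. The cachet_notify source is not shown in this thread, so this is not its actual code. The function name `split_component_arg` is hypothetical, and the sketch assumes the argument is left untouched when the flag is absent.

```python
def split_component_arg(raw):
    """Split a handler argument such as 'myhost -m=true'.

    Returns (component_name, metric_enabled).
    Hypothetical helper: it mirrors the behaviour described in the
    PR text, not the script's real implementation.
    """
    if "-m=true" in raw:
        # The component name is the first word, i.e. everything
        # before the first space.
        return raw.split(" ", 1)[0], True
    # Assumption: without the flag the whole string is the component name.
    return raw, False


print(split_component_arg("myhost -m=true"))
print(split_component_arg("myhost"))
```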

Edits by Ron Rossman Jr <ronrossman@gmail.com>

rjr162 · 2016-12-27T05:26:25Z

It's been a while since the last merge and feedback. I think I have all the original issues resolved, and I added a metric component for anyone using metrics for page load times. The downside is that I don't think there's a way to tell Cachet not to auto-update at the set interval (the lowest value is 1), which can make the chart a bit wonky. That's really an issue on the Cachet auto-update side, combined with the fact that Nagios only fires the event handler when there's a status change. It may be better to use the Python Cachet utility for URL checking if you want something that just runs and updates the page load time metric on a consistent basis.

To use it, just add -m=true after the component name in the event handler. Give it a test if you wish and send some feedback. You may need to pre-create the metric in Cachet for it to work right. If it's too poor, I have no issue yanking that part out.

2Belette · 2017-04-06T08:27:12Z

It has been a while since I gave you feedback: I had no time to test and the server needed to be re-installed. It is done now :)
I have tested, but I have an issue: multiple events keep being created instead of one event being updated when using -m=true.

I also tried to test using:

    ./cachet_notify 'host.fr' 'dispo' CRITICAL HARD 'test service down' -m=true

I got

    KO HARD: creating incident
    Array
    (
        [name] => nagios dispo
        [message] => test service down
        [status] => 1
        [visible] => 1
        [component_id] => 5
        [component_status] => 4
        [notify] => 1
    )

But if I do a

    ./cachet_notify 'host.fr' 'dispo' OK HARD 'test service down' -m=true

I got:

    OK Hard: creating incident
    Array
    (
        [name] => nagios dispo
        [message] => test service down
        [status] => 4
        [visible] => 1
        [component_id] => 5
        [component_status] => 1
        [notify] => 1
    )
    OK HARD: updating incident
    Can't find incident "nagios dispo"

And on Cachet I still get two incidents created: one for the CRITICAL and one for the OK when it goes back to normal.
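
The behaviour being aimed at (update the open incident rather than create a second one) could be sketched roughly as follows. This is only a sketch under assumptions. Incidents are modelled as plain dicts with `id`, `name` and `status` keys, status 4 is treated as "resolved" because that is the value in the OK output above, and `plan_incident_action` is a hypothetical name rather than a function from cachet_notify. Skipping a recovery when no open incident exists is also an assumption about the desired behaviour.

```python
FIXED = 4  # status value seen in the OK output above; treated here as "resolved"


def plan_incident_action(incidents, name, new_status):
    """Decide whether to update an open incident, create one, or do nothing.

    `incidents` is a list of dicts with at least 'id', 'name' and 'status'.
    Returns a tuple (action, incident_id, new_status).
    """
    for incident in incidents:
        # An incident counts as "open" while it has not reached FIXED.
        if incident["name"] == name and incident["status"] != FIXED:
            return ("update", incident["id"], new_status)
    if new_status == FIXED:
        # Nothing open to resolve, so do not create a fresh "fixed" incident.
        return ("skip", None, new_status)
    return ("create", None, new_status)


open_incidents = [{"id": 7, "name": "nagios dispo", "status": 1}]
print(plan_incident_action(open_incidents, "nagios dispo", FIXED))
```

With this shape, a CRITICAL followed by an OK for the same name yields one "create" and one "update" instead of two separate incidents.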

For Nagios alerts I get the same issue, or sometimes it doesn't update Cachet at all...

Any idea?
many thanks

EDIT:
I am still trying to understand the issue. In the meantime, I can confirm that your earlier pull request, which fixed going back to Normal status after CRITICAL or WARNING, seems to work well :)

Another thing: at the beginning of cachet_notify you test the number of parameters. I think this has to be extended to 7, as -m=true adds one more; I needed to change it to make it work.
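
A hedged sketch of the kind of check meant here, written in Python rather than the script's own language. It assumes the count includes the script name, so the five positional arguments from the command lines above give six entries and `-m=true` is an optional seventh. `valid_arg_count` is a hypothetical name.

```python
def valid_arg_count(argv):
    """Accept 6 entries (script name + 5 arguments) or 7 when '-m=true' is appended."""
    if len(argv) == 6:
        return True
    # Assumption: the only supported seventh entry is the metric flag.
    return len(argv) == 7 and argv[6] == "-m=true"


base = ["./cachet_notify", "host.fr", "dispo", "CRITICAL", "HARD", "test service down"]
print(valid_arg_count(base))
print(valid_arg_count(base + ["-m=true"]))
```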

rjr162 · 2017-04-08T14:31:31Z

Hey! I'll have to dig in and take a look. It's been so long since I had a chance to touch the code that I can't remember what I did lol. I did think about cutting out the metrics code and resubmitting to keep things cleaner, and then maybe submitting the metrics code separately, although the way Cachet defaults to a 1 or 0 for metrics at a set interval sort of screws up the flow/view... So the metrics part may not be worth it in the end. I'll let you know when I get a chance to play with it again, as I also have to finish up our info VM (I haven't really touched that since the last update either). Thanks!

-- Ron


2Belette · 2017-04-19T11:38:15Z

Thanks for your reply ;)

Another thing I am thinking about is that it would be useful to select which alerts we want to receive from Nagios. For example, I have "hacked" your script to exit(0) on Warning Soft; otherwise Cachet receives too many false positives from Nagios on my installation.

It would be great to add a parameter such as -warning or -critical, where -warning includes both warning and critical alerts and -critical includes only critical alerts.
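
The proposed filter could look roughly like this. It is only a sketch of the idea: `should_notify` is a hypothetical name, the flag spellings are taken from the suggestion above, and letting OK states always through (so recoveries can still resolve an incident) is an assumption.

```python
def should_notify(state, level):
    """Return True if a Nagios state passes the chosen filter.

    level '-warning'  forwards WARNING and CRITICAL.
    level '-critical' forwards only CRITICAL.
    """
    state = state.upper()
    if state == "OK":
        # Assumption: recoveries are always forwarded.
        return True
    if level == "-critical":
        return state == "CRITICAL"
    if level == "-warning":
        return state in ("WARNING", "CRITICAL")
    raise ValueError("unknown filter level: " + level)


print(should_notify("WARNING", "-critical"))
print(should_notify("CRITICAL", "-warning"))
```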

Just an idea

PS: I confirm the metrics are messed up, and Cachet keeps creating multiple events instead of updating the same one.

Many thanks :)
