I want to have a report that tells us which Jenkins nodes are offline and why they're offline. This is offline in terms of Jenkins. We often have failures in a few nodes and it takes us a few weeks to get around to fixing them.
This bug is for a solution as well as implementing it.
Option 1: A Jenkins job that makes API calls and sends us an email when machines are offline.
Option 2: A Nagios check that alerts us. This is slightly more explosive :)
Time: 20190114T10:23:14
mscherer at redhat commented:
I suspect option 2 is not what we want.
But yeah, Nagios does handle this quite well, doing notifications, etc. We would still need to write the basic script that does the API call anyway; the difference would be between "send an email" and "do an API call to Nagios to trigger an alert", and I think we could switch between them quite easily if needed.
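To make the "one basic script, two outputs" idea concrete, here is a rough sketch, not a finished implementation. It assumes the Jenkins `/computer/api/json` endpoint is readable without authentication and that each entry carries `displayName`, `offline` and `offlineCauseReason`. The base URL and all function names are hypothetical and only there for illustration.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the real Jenkins instance.
JENKINS_URL = "https://jenkins.example.org"


def fetch_computers(base_url):
    """Fetch the node list from Jenkins (assumes anonymous read access)."""
    url = base_url.rstrip("/") + "/computer/api/json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def offline_nodes(payload):
    """Return (name, reason) pairs for every node Jenkins marks offline."""
    nodes = []
    for computer in payload.get("computer", []):
        if computer.get("offline"):
            reason = computer.get("offlineCauseReason") or "no reason recorded"
            nodes.append((computer.get("displayName", "unknown"), reason))
    return nodes


def email_body(nodes):
    """Option 1: plain-text report suitable for an email."""
    lines = ["Offline Jenkins nodes:"]
    lines += [f"  {name}: {reason}" for name, reason in nodes]
    return "\n".join(lines)


def nagios_result(nodes):
    """Option 2: (exit code, status line) in the usual Nagios plugin style."""
    if not nodes:
        return 0, "OK - all Jenkins nodes online"
    names = ", ".join(name for name, _ in nodes)
    return 2, f"CRITICAL - {len(nodes)} node(s) offline: {names}"
```

Both outputs consume the same `offline_nodes(fetch_computers(JENKINS_URL))` result, so switching between email and Nagios only changes the last step: either mail the text from `email_body`, or print the status line from `nagios_result` and exit with its code.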
Time: 20190527T02:09:34
sankarshan at redhat commented:
Is there any decision on whether Option 1 can be implemented? Deepshikha, can we have Naresh look into this?
Time: 20190527T04:00:05
dkhandel at redhat commented:
In my opinion we should have this in Nagios rather than in a Jenkins job that sends alerts. Nagios is already in place for the builders and alerts on memory failures and similar problems. I don't receive its notifications (that's a different story), but it would be good to have a single source of alerting.
Naresh can look at the script if we agree on this.
URL: https://bugzilla.redhat.com/1665361
Creator: nigelb at redhat
Time: 20190111T06:57:04