Monitoring tool for queue traffic #55

jonnor · 2016-12-14T09:15:34Z

If any participant in a Msgflo graph deadlocks, or for other reasons* stops processing or sending messages, the system will generally fail to perform its intended function.
By monitoring the queues that represent the output of the system, we could detect this.

msgflo-monitor --queue sensor.NEWVALUE --expect-periodic 1s --fail-missed 3
The process should probably exit (with a non-zero code) on failure. One can then use, for instance, OnFailure=myservice-restart.service in systemd to try to recover.
Alternatively, it may be useful if it writes a single line to stdout, or can execute some command.
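
A minimal sketch of the detection logic, not an existing implementation. The class `PeriodicExpectation` and its method names are hypothetical, and the broker subscription is left out entirely. It assumes `--expect-periodic` gives the expected interval in seconds and `--fail-missed` the number of missed intervals after which the monitor fails.

```python
import time


class PeriodicExpectation:
    """Tracks whether messages arrive on a queue at the expected interval.

    Hypothetical helper: `period` mirrors --expect-periodic (seconds),
    `fail_missed` mirrors --fail-missed (missed intervals before failure).
    """

    def __init__(self, period, fail_missed, now=None):
        self.period = period
        self.fail_missed = fail_missed
        # Start the clock at construction so a queue that never
        # produces anything is also detected.
        self.last_seen = time.monotonic() if now is None else now

    def message_received(self, now=None):
        # Call this from the queue subscription callback.
        self.last_seen = time.monotonic() if now is None else now

    def missed(self, now=None):
        # Number of whole periods elapsed since the last message.
        now = time.monotonic() if now is None else now
        return int((now - self.last_seen) // self.period)

    def failed(self, now=None):
        return self.missed(now) >= self.fail_missed
```

A main loop could poll `failed()` about once per period and call `sys.exit(1)` when it returns true, which is what would trigger the systemd `OnFailure=` unit.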

For systems that don't produce data periodically by nature, it is recommended to have a process that generates a test message periodically.

To have a chain of trust, the msgflo-monitor program should ideally also be monitored, for instance by using the watchdog feature of systemd.
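
As a rough sketch of how the monitor could feed the systemd watchdog: systemd passes a Unix datagram socket address in the `NOTIFY_SOCKET` environment variable, and the service sends `WATCHDOG=1` to it at intervals shorter than the unit's `WatchdogSec=`. The helper name `sd_notify` and its `env` parameter are illustrative choices here, not part of msgflo.

```python
import os
import socket


def sd_notify(state, env=None):
    """Send a state string to systemd's notification socket.

    Returns False when NOTIFY_SOCKET is not set (for example when not
    started by systemd), True once the datagram has been sent.
    """
    env = os.environ if env is None else env
    addr = env.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract namespace socket.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), addr)
    return True
```

The monitor would call `sd_notify("WATCHDOG=1")` from its main loop. If the loop hangs, the pings stop and systemd treats the service as failed.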

* A crashing process, and restarting it, is usually handled well by the service management already, be it Heroku or systemd (#20).

bergie · 2017-05-03T10:47:09Z

Related to queue length information in #29

jonnor added the enhancement label Dec 14, 2016