Monitoring tool for queue traffic #55

Open
jonnor opened this issue Dec 14, 2016 · 1 comment


jonnor commented Dec 14, 2016

If any of the participants in a Msgflo graph deadlocks, or for other reasons* stops processing/sending messages, the system will generally fail to perform its intended function.
By monitoring the queues that represent the output of the system, we could detect this.

msgflo-monitor --queue sensor.NEWVALUE --expect-periodic 1s --fail-missed 3
The process should probably exit with a non-zero code on failure. One can then use, for instance, systemd's OnFailure=myservice-restart.service to try to recover.
Alternatively it may be useful if it writes a single line on stdout, or can execute some command.
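
A rough sketch of the detection logic, to make the proposal concrete. Nothing here exists in Msgflo yet: `PeriodicMonitor`, `run` and `poll_message` are hypothetical names, the broker subscription is left out, and I'm assuming `--fail-missed 3` means "fail once 3 full periods have passed without a message".

```python
import time


class PeriodicMonitor:
    """Tracks whether messages keep arriving at the expected interval."""

    def __init__(self, period_s, fail_missed, now=None):
        self.period_s = period_s        # e.g. 1.0 for --expect-periodic 1s
        self.fail_missed = fail_missed  # e.g. 3 for --fail-missed 3
        self.last_seen = time.monotonic() if now is None else now

    def on_message(self, now=None):
        # Call this whenever a message arrives on the monitored queue.
        self.last_seen = time.monotonic() if now is None else now

    def missed(self, now=None):
        # Number of whole periods elapsed since the last message.
        now = time.monotonic() if now is None else now
        return int((now - self.last_seen) // self.period_s)

    def failed(self, now=None):
        return self.missed(now) >= self.fail_missed


def run(monitor, poll_message, clock=time.monotonic, sleep=time.sleep):
    """Poll until too many periods are missed, then return exit code 1.

    poll_message is a hypothetical callable that returns True when a
    message was received since the last call (stand-in for the broker).
    """
    while not monitor.failed(clock()):
        if poll_message():
            monitor.on_message(clock())
        sleep(monitor.period_s / 10)
    # Single line on stdout, as suggested above.
    print("no message for %d expected periods" % monitor.missed(clock()))
    return 1
```

A real entry point would call `sys.exit(run(...))` so that systemd sees the non-zero exit code.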

For systems that don't produce data periodically by nature, it is recommended to have a process that generates a test message periodically.
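
Such a test-message generator could be as small as the sketch below. `heartbeat` and `publish` are hypothetical: `publish` stands in for whatever sends a message to the broker, and the payload shape is just an example, not an existing Msgflo convention.

```python
import json
import time


def heartbeat(publish, period_s, count=None, sleep=time.sleep):
    """Send a small test message every period_s seconds.

    publish: hypothetical callable taking the serialized message.
    count:   number of messages to send, or None to run forever.
    Returns the number of messages sent.
    """
    sent = 0
    while count is None or sent < count:
        # The sequence number lets a consumer spot dropped messages.
        publish(json.dumps({"type": "test", "seq": sent}))
        sent += 1
        sleep(period_s)
    return sent
```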

To have a chain of trust, the msgflo-monitor program should ideally also be monitored, for instance by using the watchdog feature of systemd.
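
For the systemd watchdog, the monitor would have to ping systemd regularly. A minimal sketch using only the standard library, assuming the unit sets `WatchdogSec=` so that systemd exports `NOTIFY_SOCKET` and `WATCHDOG_USEC`. The function names are mine, not an existing API.

```python
import os
import socket


def sd_notify(state, env=os.environ):
    """Send a notification such as "WATCHDOG=1" to systemd.

    Returns False when not running under systemd (no NOTIFY_SOCKET).
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), path)
    return True


def watchdog_interval_s(env=os.environ):
    """Seconds between pings: half the configured watchdog timeout.

    Returns None when the watchdog is not enabled for this service.
    """
    usec = env.get("WATCHDOG_USEC")
    return int(usec) / 1_000_000 / 2 if usec else None
```

The monitor loop would call `sd_notify("WATCHDOG=1")` at least once per `watchdog_interval_s()`. If the pings stop, systemd treats the monitor itself as failed.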

* A process crashing and being restarted is usually handled well by the service management already, be it Heroku or systemd (#20).


bergie commented May 3, 2017

Related to queue length information in #29
