Log fatal errors #52
Comments
There are probably two aspects to this. One is propagating errors properly within sonar: add_gpu_info in ps.rs drops the error on the floor, for example. The other is logging to a standard location in a standard way.
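To make the first aspect concrete, here is a minimal sketch of propagating an error to the caller instead of discarding it. It is not sonar's actual code: `GpuError`, `probe_gpus`, and `add_gpu_info_sketch` are hypothetical names invented for illustration, and the real `add_gpu_info` in ps.rs may have a different signature.

```rust
use std::fmt;

// Hypothetical error type for illustration; sonar's real types may differ.
#[derive(Debug, PartialEq)]
enum GpuError {
    ProbeFailed(String),
}

impl fmt::Display for GpuError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GpuError::ProbeFailed(msg) => write!(f, "GPU probe failed: {}", msg),
        }
    }
}

impl std::error::Error for GpuError {}

// Hypothetical stand-in for whatever actually queries the GPUs.
fn probe_gpus(simulate_failure: bool) -> Result<Vec<String>, GpuError> {
    if simulate_failure {
        Err(GpuError::ProbeFailed("driver not responding".to_string()))
    } else {
        Ok(vec!["gpu0".to_string()])
    }
}

// Rather than swallowing a failed probe, return it to the caller with `?`
// so the caller can decide whether it is fatal and how to report it.
fn add_gpu_info_sketch(
    records: &mut Vec<String>,
    simulate_failure: bool,
) -> Result<(), GpuError> {
    let gpus = probe_gpus(simulate_failure)?;
    records.extend(gpus);
    Ok(())
}
```

With this shape, a failed probe leaves `records` untouched and the error reaches whoever called the function, which is the precondition for logging it anywhere.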
It looks like it might be sufficient to use the syslog crate, https://crates.io/crates/syslog, with whatever plumbing is necessary to adapt to the local system conventions hidden behind the syslog service.
An alternative view is that sonar will always be run by cron, and that the logging and output handling performed by cron (mailing the output to the owner of the job) is sufficient. I think there's no rush to implement anything here; we need to examine the entire pipeline first.
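If we go the cron route, the minimum would be to make sure fatal errors reach stderr with enough context to diagnose them. A rough sketch, using only the standard library: `report_fatal` is a hypothetical helper, not something that exists in sonar, and the message format is an assumption.

```rust
use std::error::Error;
use std::io::Write;

// Hypothetical helper: write one line describing a fatal error to `sink`
// and return the exit code the process should terminate with.
fn report_fatal<W: Write>(sink: &mut W, context: &str, err: &dyn Error) -> i32 {
    // If the sink itself cannot be written to, there is nowhere left to
    // report the problem, so the write result is deliberately ignored.
    let _ = writeln!(sink, "sonar: fatal: {}: {}", context, err);
    1
}
```

In `main` this could be used roughly as `std::process::exit(report_fatal(&mut std::io::stderr(), "collecting process info", &e))`, so that whatever cron mails to the job owner says what failed and where.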
We can then follow this: https://rust-lang-nursery.github.io/rust-cookbook/development_tools/debugging/log.html
Related to logging, I see intermittent clusters of sonar errors reported by cron of this form:
(This morning there was a cluster of six of these on ML7. More information here would have been useful for diagnosing the problem. Over the summer I had a similar cluster on another of the nodes.)
PR #71 addresses all of this.
See #51. We should detect unrecoverable error situations that prevent monitoring from working, and we should arguably log them to some standard syslog or other medium where they will be seen.