Change offset(nan) and sync(0) to WARNING #7
Due to network instability in some environments,
the NTP algorithm may temporarily discard all servers/peers,
triggering the CRITICAL Nagios alert
"CRITICAL: offset is out of range (nan)".
There are two problems with that alert:
1. The message is misleading, because offset=nan does not convey anything meaningful.
2. The worst-metric selection hides the real alert, "CRITICAL: No sync peer selected".
Therefore, the order of the worst-metric selection list is reversed:
whenever "No sync peer selected" occurs,
"offset is out of range (nan)" always occurs as well, so the former should be the one reported.
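
The effect of the ordering can be illustrated with a minimal sketch. The function `worst_metric`, the `SEVERITY` table, and the `(level, message)` tuples are hypothetical and are not taken from the NTPmon source. The sketch only assumes that, when several checks report the same severity, the one listed first is reported, so the list order decides which message is shown.

```python
# Hypothetical sketch: names and data shapes are assumptions, not NTPmon's code.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def worst_metric(results):
    """Return the first result with the highest severity.

    `results` is a non-empty, ordered list of (level, message) tuples.
    On ties the earliest entry wins, so list order decides the message.
    """
    worst = results[0]
    for result in results[1:]:
        # Strict comparison: a later entry only wins if it is more severe.
        if SEVERITY[result[0]] > SEVERITY[worst[0]]:
            worst = result
    return worst


# Both checks fire together when no sync peer is selected.
old_order = [
    ("CRITICAL", "offset is out of range (nan)"),
    ("CRITICAL", "No sync peer selected"),
]
new_order = list(reversed(old_order))

print(worst_metric(old_order))  # the misleading offset message is reported
print(worst_metric(new_order))  # the sync peer message is reported
```

With the reversed list, the sync peer message is the one that reaches Nagios whenever both checks fail at the same level.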
The problem is usually transient and does not
require immediate attention. It may go away when the
network workload changes, or it may persist until the
root cause is investigated. Because it is raised at
the "CRITICAL" level, it obscures other, much more
critical alerts from NTPmon and Nagios. Therefore,
the WARNING level is more appropriate.
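
As a rough illustration of the level change, the two conditions could be mapped to WARNING as sketched below. The function `classify` and its parameters are hypothetical, the "OK" message is a placeholder, and the regular offset threshold checks are deliberately omitted, so this is not the actual NTPmon implementation.

```python
import math


def classify(sync_peers, offset):
    """Hypothetical sketch: return a (level, message) tuple.

    `sync_peers` is the number of selected sync peers and `offset` is the
    reported offset as a float. Neither name comes from the NTPmon source.
    """
    if sync_peers == 0:
        # Previously CRITICAL; usually transient, so WARNING is used instead.
        return ("WARNING", "No sync peer selected")
    if math.isnan(offset):
        # offset=nan carries no information of its own, so it is also WARNING.
        return ("WARNING", "offset is out of range (nan)")
    # Placeholder result; real offset threshold checks are omitted here.
    return ("OK", "sync peer selected")


print(classify(0, float("nan")))
print(classify(1, 0.001))
```

Checking the sync peer count first mirrors the reordering described above, so the more informative message is the one returned when both conditions hold.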
Closes #5