ROX-16887: always extend SLO calculation over entire 28 days #87
Extended average over time means that the time series is effectively extended over the entire time interval. This is in contrast to avg_over_time, which only averages over the parts of the interval where the time series is not nil. The difference matters during the initial 28 days of a Central instance. For example, consider a Central instance that has lived for 5 minutes and was down for 2 of them. Using avg_over_time, the availability would be 3 min / 5 min = 60%. The extended average over 28 days would yield 1 - 2 min / 28 days = 99.995%. After the initial 28 days, both averages are equivalent, because data points exist for the entire 28d range.

Also make sure that alerts only fire as long as the Central instance still exists (central:sli:availability >= 0 condition).
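The arithmetic in the example above can be checked with a small sketch. This is only an illustration of the two averaging schemes, not the recording rule changed in this PR; the function names and the minute-level granularity are assumptions made for the example.

```python
# Illustrative only: compares the two averaging schemes described above.
# Function names and per-minute granularity are assumptions for this sketch.

MINUTES_PER_DAY = 24 * 60
WINDOW_MINUTES = 28 * MINUTES_PER_DAY  # 28-day SLO window


def avg_over_existing_samples(up_minutes, down_minutes):
    """Average only over the time the series has data (avg_over_time-style)."""
    return up_minutes / (up_minutes + down_minutes)


def extended_avg(down_minutes, window_minutes=WINDOW_MINUTES):
    """Average over the whole window, treating time without data as available."""
    return 1 - down_minutes / window_minutes


# A Central instance that lived for 5 minutes and was down for 2 of them.
plain = avg_over_existing_samples(up_minutes=3, down_minutes=2)
extended = extended_avg(down_minutes=2)

print(f"avg over existing samples: {plain:.3%}")
print(f"extended over 28 days:     {extended:.3%}")
```

Once the instance has data for the full window, the up and down minutes add up to the window length, so both functions return the same value.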