ROX-16887: always extend SLO calculation over entire 28 days #87

stehessel · 2023-05-04T17:57:17Z

Extended average over time means that the time series is effectively extended over the entire time interval. This contrasts with avg_over_time, which averages only over the intervals where the time series has data. The difference matters during the first 28 days of a Central instance. For example, consider a Central instance that has existed for 5 minutes and was down for 2 of them. With avg_over_time, the availability is 3 min / 5 min = 60%. The extended average over 28 days yields 1 - 2 min / 28 days ≈ 99.995%. After the initial 28 days, both averages are equivalent, because data points exist for the entire 28d range.
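To make the arithmetic in the example concrete, here is a minimal Python sketch of the two averages. It is not the recording rule changed in this PR. The function names are hypothetical, and it assumes the extended average treats the part of the 28 day window without data as available.

```python
from datetime import timedelta

# SLO window from the PR description.
WINDOW = timedelta(days=28)


def avg_over_existing_samples(uptime: timedelta, lifetime: timedelta) -> float:
    """Availability averaged only over the span where data exists
    (the behaviour the description attributes to avg_over_time)."""
    return uptime / lifetime


def extended_avg(downtime: timedelta, window: timedelta = WINDOW) -> float:
    """Availability averaged over the whole window.
    Assumption: time without data counts as available."""
    return 1 - downtime / window


lifetime = timedelta(minutes=5)
downtime = timedelta(minutes=2)

print(f"{avg_over_existing_samples(lifetime - downtime, lifetime):.1%}")  # 60.0%
print(f"{extended_avg(downtime):.3%}")  # 99.995%
```

Once the instance is older than 28 days, the lifetime inside the window equals the window itself, so both functions return the same value.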

Also make sure that alerts fire only while the Central instance still exists (condition: central:sli:availability >= 0).
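The existence guard can be pictured with a small Python sketch. It is an illustration only, not the alert rule from this PR. The function and parameter names are hypothetical, and it assumes a deleted instance simply stops reporting the SLI, modelled here as `None`.

```python
from typing import Optional


def should_alert(slo_violated: bool, current_sli: Optional[float]) -> bool:
    """Fire only if the SLO condition holds AND the instance still reports an SLI.

    current_sli is None when no sample exists (assumption: instance deleted).
    Any existing sample satisfies `>= 0`, so the comparison acts as an
    existence check rather than a threshold.
    """
    return slo_violated and current_sli is not None and current_sli >= 0


print(should_alert(True, 0.0))    # instance exists but is down: alert
print(should_alert(True, None))   # instance is gone: no alert
print(should_alert(False, 1.0))   # instance healthy: no alert
```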

stehessel added 3 commits May 4, 2023 19:56

only alert if Central still exists

20d2218

fix dashboard

c8f237d

stehessel marked this pull request as ready for review May 4, 2023 18:20

stehessel requested a review from a team as a code owner May 4, 2023 18:20

stehessel requested a review from 0x656b694d May 4, 2023 18:33

0x656b694d approved these changes May 5, 2023

stehessel merged commit ae9c16b into master May 5, 2023
1 check passed

stehessel deleted the ROX-16887/extend-slo-interval branch May 5, 2023 13:22

0x656b694d mentioned this pull request May 10, 2023

ROX-16887: ignore first 5m of pod life #85

Closed
