For a few days now I've been wondering what the implementation of a Saturation SLO based on Prometheus metrics would look like. I've come up with a design idea, so I'm opening this issue to discuss it further with the community.
The main idea here is to reuse the BoolGauge SLO as much as possible.
API:
type SaturationIndicator struct {
	// Utilization is the metric that represents the current utilization of the monitored resource.
	Utilization Query `json:"utilization"`
	// Capacity is the metric that represents the capacity of the monitored resource.
	Capacity Query `json:"capacity"`
	// Threshold is the maximum utilization allowed of the monitored resource.
	// It should represent a percentage between Utilization and Capacity.
	// It should be a number between 0 and 1.
	Threshold float64 `json:"threshold"`
	// +optional
	// Grouping allows an SLO to be defined for many SLI at once, like HTTP handlers for example.
	Grouping []string `json:"grouping"`
}
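To make the constraints in the field comments concrete, here is a minimal sketch of how the indicator could be validated. The `Query` type is not shown in this issue, so the one below (a struct with a single `Metric` field) is an assumption, and `Validate` is a hypothetical helper, not an existing method.

```go
package main

import (
	"errors"
	"fmt"
)

// Query is a hypothetical stand-in for the Query type the struct refers to;
// the issue does not show its definition.
type Query struct {
	Metric string `json:"metric"`
}

// SaturationIndicator mirrors the struct proposed above.
type SaturationIndicator struct {
	Utilization Query    `json:"utilization"`
	Capacity    Query    `json:"capacity"`
	Threshold   float64  `json:"threshold"`
	Grouping    []string `json:"grouping"`
}

// Validate is a hypothetical helper that checks the constraints described
// in the field comments: both metrics set, threshold within [0, 1].
func (s SaturationIndicator) Validate() error {
	if s.Utilization.Metric == "" {
		return errors.New("utilization metric must be set")
	}
	if s.Capacity.Metric == "" {
		return errors.New("capacity metric must be set")
	}
	// Written as a negated range check so that NaN is rejected as well.
	if !(s.Threshold >= 0 && s.Threshold <= 1) {
		return fmt.Errorf("threshold must be between 0 and 1, got %v", s.Threshold)
	}
	return nil
}

func main() {
	// The metric names here are placeholders for illustration only.
	ind := SaturationIndicator{
		Utilization: Query{Metric: "used_bytes"},
		Capacity:    Query{Metric: "capacity_bytes"},
		Threshold:   0.8,
	}
	if err := ind.Validate(); err != nil {
		fmt.Println("invalid indicator:", err)
		return
	}
	fmt.Println("indicator is valid")
}
```

Whether the bounds should be inclusive is a judgment call; the sketch accepts both 0 and 1.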
For the additional Prometheus rules, all we need to do is generate vector(1) when (Utilization / Capacity) > Threshold and vector(0) when (Utilization / Capacity) <= Threshold. From this, we can reuse the same Prometheus rules used for BoolGauge:
- record: example-saturation-bool
  expr: |
    (vector(1) AND (Utilization / Capacity) > Threshold) OR vector(0)
# Same as BoolGauge below
- record: example-saturation-bool:count1w
  expr: sum (count_over_time(example-saturation-bool[1w]))
- record: example-saturation-bool:sum1w
  expr: sum (sum_over_time(example-saturation-bool[1w]))
- record: example-saturation-bool:burnrate1m
  expr: (sum (count_over_time(example-saturation-bool[1m])) - sum (sum_over_time(example-saturation-bool[1m]))) / sum (count_over_time(example-saturation-bool[1m]))
...
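As a rough illustration of how these rules might be generated from the indicator, here is a sketch that renders the boolean expression and the count/sum rules for one window. The `Rule` type, both function names, and the metric names in `main` are hypothetical and only for illustration; the expression strings follow the rules listed above.

```go
package main

import (
	"fmt"
	"strconv"
)

// Rule is a hypothetical, minimal representation of a recording rule.
type Rule struct {
	Record string
	Expr   string
}

// saturationBoolExpr renders the boolean expression from the issue:
// 1 when utilization/capacity exceeds the threshold, 0 otherwise.
func saturationBoolExpr(utilization, capacity string, threshold float64) string {
	t := strconv.FormatFloat(threshold, 'f', -1, 64)
	return fmt.Sprintf("(vector(1) AND (%s / %s) > %s) OR vector(0)", utilization, capacity, t)
}

// saturationRules builds the boolean rule plus the count and sum rules
// for a single window such as "1w", mirroring the rules listed above.
func saturationRules(name, utilization, capacity string, threshold float64, window string) []Rule {
	return []Rule{
		{Record: name, Expr: saturationBoolExpr(utilization, capacity, threshold)},
		{
			Record: name + ":count" + window,
			Expr:   fmt.Sprintf("sum (count_over_time(%s[%s]))", name, window),
		},
		{
			Record: name + ":sum" + window,
			Expr:   fmt.Sprintf("sum (sum_over_time(%s[%s]))", name, window),
		},
	}
}

func main() {
	// Placeholder names; underscores are used here because classic
	// Prometheus metric names do not allow hyphens.
	rules := saturationRules("example_saturation_bool", "used_bytes", "capacity_bytes", 0.8, "1w")
	for _, r := range rules {
		fmt.Printf("- record: %s\n  expr: %s\n", r.Record, r.Expr)
	}
}
```

One point that may be worth confirming in the discussion: (count - sum) / count is the fraction of samples equal to 0, and in the rules above 0 means the resource is at or below the threshold.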