The first part (the `classes` table) should also be exported as Prometheus metrics, in the same (compatible) format as the SLO metadata. This could simplify the configuration of slo-exporter.
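As a rough illustration of the export idea, the sketch below renders class thresholds in the Prometheus text exposition format. The metric name `slo_threshold` and the label names are assumptions made for this example; they are not names defined by slo-exporter.

```python
# Hypothetical sketch: render class thresholds as Prometheus text-format samples.
# The metric name `slo_threshold` and all label names are illustrative assumptions.

def render_thresholds(slo_domain, version, classes):
    """Turn {slo_class: {slo_type: threshold}} into a list of exposition lines."""
    lines = ["# TYPE slo_threshold gauge"]
    for slo_class, slo_types in sorted(classes.items()):
        for slo_type, threshold in sorted(slo_types.items()):
            labels = (
                f'slo_domain="{slo_domain}",slo_version="{version}",'
                f'slo_class="{slo_class}",slo_type="{slo_type}"'
            )
            # Doubled braces emit literal `{` and `}` around the label set.
            lines.append(f"slo_threshold{{{labels}}} {threshold}")
    return lines


example = {"critical": {"availability": 99.9, "latency90": 90, "latency99": 99}}
for line in render_thresholds("autoadmins", "1", example):
    print(line)
```

Exporting one sample per `slo_class`/`slo_type` pair would let alerting rules join event counters with their thresholds on shared labels, which is one way to read "the same (compatible) format".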
Note that the term `category` is used for `availability`, `latency`, etc. `slo_type`, on the other hand, must explicitly identify a particular metric/SLI/SLO. Therefore, when more than one SLI is used for a category, `slo_type` identifies the exact one.
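To make the distinction concrete, here is a minimal sketch of how one SLO class entry could be normalized so that every SLI carries both a `slo_category` and a unique `slo_type`. It follows the shorthand rules described in the comments of the example (number, array of dicts, dict). The function name `normalize_class` and the use of Python dicts are assumptions for illustration only.

```python
# Sketch of the shorthand expansion described in the example's comments.
# `normalize_class` is a hypothetical helper, not part of slo-exporter.

def normalize_class(entries):
    """Expand one SLO class into {slo_type: dict} with explicit category and type."""
    result = {}
    for key, value in entries.items():
        if isinstance(value, (int, float)):
            # A bare number is shorthand for {"threshold": <number>}.
            items = [(key, {"threshold": value})]
        elif isinstance(value, list):
            # Several SLIs in one category: derive a unique slo_type per threshold,
            # e.g. "latency" with threshold 99 becomes "latency99".
            items = [(f"{key}{item['threshold']:g}", dict(item)) for item in value]
        else:
            # A dict is used as given; missing keys are filled in below.
            items = [(key, dict(value))]
        for slo_type, item in items:
            item.setdefault("slo_category", key)
            item.setdefault("slo_type", slo_type)
            result[slo_type] = item
    return result


critical = normalize_class({
    "availability": 99.9,
    "latency": [
        {"threshold": 99, "maxDuration": 0.5},
        {"threshold": 90, "maxDuration": 0.2},
    ],
})
print(sorted(critical))
```

Here `latency` is the category shared by two SLIs, while `latency99` and `latency90` are the `slo_type` values that tell them apart.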
The following example does not describe a useful SLO definition; it is intended as a showcase of possible usage.
classes:
  - version: "1"
    # Keys are SLO classes; under each key is a dictionary whose keys define
    # slo_types (availability, latency90, latency99, etc.).
    #
    # If a value of the dict contains:
    #  * a number, then it is interpreted as `threshold => <number>`, e.g.
    #    `{ "availability" => 99.9 }` is only an abbreviated notation for
    #    ```
    #    {
    #      "availability" => { "threshold" => 99.9, "slo_category" => "availability", "slo_type" => "availability" }
    #    }
    #    ```
    #  * an array of dicts, then it is expanded as the example shows:
    #    from `"latency" => [{ "threshold" => 99, "maxDuration" => 0.5 }, { "threshold" => 90, "maxDuration" => 0.2 }]` to
    #    ```
    #    {
    #      "latency99" => { "threshold" => 99, "maxDuration" => 0.5, slo_category => "latency", slo_type => "latency99" },
    #      "latency90" => { "threshold" => 90, "maxDuration" => 0.2, slo_category => "latency", slo_type => "latency90" }
    #    }
    #    ```
    #  * a dict:
    #    If the keys `slo_category` or `slo_type` are not present, they are set
    #    to the same value as the key pointing to the dict. The dict is then
    #    accessible from rule expressions, and the `threshold` value is passed
    #    on to Prometheus to be used as the SLO threshold.
    #
    # The first version might implement only the dict form.
    #
    # The following lines are intentionally long, without line breaks.
    # This makes it visually straightforward to compare individual
    # SLO classes and categories (slo_class & slo_types) with each other.
    #
    critical:  { "availability" => 99.9, "latency" => [{ "threshold" => 99, "maxDuration" => 0.5 }, { "threshold" => 90, "maxDuration" => 0.2 }] }
    high_fast: { "availability" => 99.5, "latency" => [{ "threshold" => 99, "maxDuration" => 1.5 }, { "threshold" => 90, "maxDuration" => 0.5 }] }
    high_slow: { "availability" => 99.5, "latency" => [{ "threshold" => 99, "maxDuration" => 3.0 }, { "threshold" => 90, "maxDuration" => 2.0 }] }
    low:       { "availability" => 99.0, "latency" => [{ "threshold" => 99, "maxDuration" => 6.0 }, { "threshold" => 90, "maxDuration" => 3.0 }] }
  - version: "2"
    critical:  { "availability" => { "threshold" => 99.9, "maxDuration" => 0.2 } }
    high_fast: { "availability" => { "threshold" => 99.0, "maxDuration" => 1.5 } }
    high_slow: { "availability" => { "threshold" => 99.0, "maxDuration" => 3.0 } }
    low:       { "availability" => { "threshold" => 95.0, "maxDuration" => 6.0 } }
# Evaluation workflow:
#  1. The input event class is determined first (e.g. `slo_class=critical`).
#  2. For each version in the class table:
#     1. Only rule groups whose `group_expr` evaluates to true are evaluated.
#     2. When rules are evaluated, all variables from the `classes` definition table are accessible.
#     3. When `additional_metadata` is defined, then:
#        * all values that are strings are added to the slo_event
#        * all values that are dicts with only an `expr` key are evaluated and the result is added to the slo_event
#        * otherwise an error metric is increased.
slo_domain: 'autoadmins'
rule_groups:
  - group_expr: 'version == "1"'
    rules:
      - slo_type: 'availability'
        slo_result_expr: "statusCode < 500"
      - slo_type: 'latency90'
        slo_result_expr: "requestDuration < class.latency90.maxDuration"
        additional_metadata:
          percentile: 90
          le: 0.2  # hardcoded; same number as `class.latency90.maxDuration`
      - slo_type: 'latency99'
        slo_result_expr: "requestDuration < class.latency99.maxDuration"
        additional_metadata:
          percentile: 99
          le:
            - expr: 'class.latency99.maxDuration'
  - group_expr: 'version == "2"'
    rules:
      - slo_type: 'availability&latency'
        slo_result_expr: "statusCode < 500 && requestDuration < availability.maxDuration"
        additional_metadata:
          percentile: 100
          le:
            - expr: 'class.availability.maxDuration'
      # Example of one category (slo_type) instead of three.
      - slo_type: 'availability&latency'
        default_expr: "statusCode < 500 && requestDuration < availability.maxDuration"
      # Example of additional metadata defined by an expression:
      # the result of the expression is used as the slo_event.
      # The `slo_type` key is added to the expression result, and the result is checked to contain `slo_result` as a boolean.
      - slo_type: 'availability&latency'
        slo_event_expr: "{ le: availability.maxDuration, percentile: availability, slo_result: (statusCode < 500 && requestDuration < availability.maxDuration) }"
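The evaluation workflow described in the comments of the example can be sketched roughly as follows. This is only an illustration under stated assumptions: expressions are modelled as Python callables because the expression language is not specified here, the class table is assumed to be already expanded to the dict form, and the names `evaluate`, `env`, and `cls` are hypothetical.

```python
# Sketch of the evaluation workflow. Expressions are plain Python callables
# here (an assumption); a real implementation would parse expression strings.

def evaluate(event, class_table, rule_groups):
    """Return (slo_events, error_count) produced for one input event."""
    slo_events = []
    errors = 0  # stands in for the error metric mentioned in the workflow
    slo_class = event["slo_class"]                    # step 1: determine the class
    for version, classes in class_table.items():     # step 2: each version
        # Variables visible to expressions: event fields, version, class data.
        env = {**event, "version": version, "cls": classes[slo_class]}
        for group in rule_groups:
            if not group["group_expr"](env):          # step 2.1: filter groups
                continue
            for rule in group["rules"]:               # step 2.2: evaluate rules
                slo_event = {
                    "slo_type": rule["slo_type"],
                    "slo_result": bool(rule["slo_result_expr"](env)),
                }
                metadata = rule.get("additional_metadata", {})
                for name, value in metadata.items():  # step 2.3: metadata
                    if isinstance(value, str):
                        slo_event[name] = value
                    elif isinstance(value, dict) and set(value) == {"expr"}:
                        slo_event[name] = value["expr"](env)
                    else:
                        errors += 1
                slo_events.append(slo_event)
    return slo_events, errors


class_table = {
    "1": {"critical": {"latency90": {"threshold": 90, "maxDuration": 0.2}}},
}
rule_groups = [{
    "group_expr": lambda e: e["version"] == "1",
    "rules": [{
        "slo_type": "latency90",
        "slo_result_expr": lambda e: e["requestDuration"] < e["cls"]["latency90"]["maxDuration"],
        "additional_metadata": {
            "percentile": "90",
            "le": {"expr": lambda e: e["cls"]["latency90"]["maxDuration"]},
        },
    }],
}]
event = {"slo_class": "critical", "statusCode": 200, "requestDuration": 0.15}
slo_events, errors = evaluate(event, class_table, rule_groups)
print(slo_events, errors)
```

Note that the sketch follows the workflow text literally, so a non-string, non-`expr` metadata value (such as a bare number) is counted as an error; whether numeric literals should be accepted is left open here.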
The example above is `slo_rules.yaml` with the new semantics. It consists of two parts.