docs: add first iteration of analysis documentation (#2167)
Signed-off-by: realanna <anna.reale@dynatrace.com>
Signed-off-by: RealAnna <89971034+RealAnna@users.noreply.github.com>
Co-authored-by: Florian Bacher <florian.bacher@dynatrace.com>
Co-authored-by: Meg McRoberts <meg.mcroberts@dynatrace.com>
Co-authored-by: Giovanni Liva <giovanni.liva@dynatrace.com>
Co-authored-by: odubajDT <93584209+odubajDT@users.noreply.github.com>
Co-authored-by: Moritz Wiesinger <moritz.wiesinger@dynatrace.com>
6 people committed Sep 27, 2023
1 parent 34e5384 commit 366ee1f
Showing 4 changed files with 122 additions and 5 deletions.
119 changes: 119 additions & 0 deletions docs/content/en/docs/implementing/slo/_index.md
@@ -0,0 +1,119 @@
---
title: Define SLOs/SLIs with Analyses
description: Understand Analyses in Keptn and how to use them
weight: 91
---

The Keptn Metrics Operator implements an SLO/SLI feature set, inspired by Keptn v1, called Analysis.
With an AnalysisDefinition you can specify multiple Service Level Objectives (SLOs) that are evaluated in your Analysis.
When the Analysis completes, its status reports whether the objectives passed or failed.

The Analysis result is exposed as an OpenTelemetry metric and can be displayed on dashboard tools, such as Grafana.

Keptn v1 users may use converters for
[SLOs](https://github.com/keptn/lifecycle-toolkit/blob/main/metrics-operator/converter/slo_converter.md#slo-converter)
and [SLIs](https://github.com/keptn/lifecycle-toolkit/blob/main/metrics-operator/converter/sli_converter.md#sli-converter)
to migrate towards Keptn Analysis.

## Keptn Analysis basics

A Keptn Analysis is implemented with three resources:

* [Analysis](../../crd-ref/metrics/v1alpha3/#analysis) --
define the specific configuration and the AnalysisDefinition to evaluate
* [AnalysisDefinition](../../crd-ref/metrics/v1alpha3/#analysisdefinition) --
define the list of SLOs for an Analysis
* [AnalysisValueTemplate](../../crd-ref/metrics/v1alpha3/#analysisvaluetemplate) --
define the SLI: the KeptnMetricsProvider and the query to perform for each SLI

### Define Analysis, AnalysisDefinition, and AnalysisValueTemplate

An Analysis customizes the templates defined inside an AnalysisDefinition by adding configuration such as:

* a timeframe that specifies the range for the corresponding query in the AnalysisValueTemplate
* a map of key/value pairs that can be used to substitute placeholders in the AnalysisValueTemplate

An AnalysisDefinition contains a list of objectives to satisfy.
Each of these objectives:

* specifies failure or warning target criteria
* specifies whether the objective is a key objective (its failure would fail the Analysis)
* indicates the weight of the objective on the overall Analysis
* refers to an AnalysisValueTemplate that contains the SLIs, defining the data provider from which to gather the data
and how to compute the Analysis

Each AnalysisValueTemplate stores the query used to retrieve the SLI.
You must define a
[KeptnMetricsProvider](../../yaml-crd-ref/metricsprovider.md) resource
for each instance of each data provider you are using.
The template refers to that provider and queries it.

Let's consider the following Analysis:

{{< embed path="/metrics-operator/config/samples/metrics_v1alpha3_analysis.yaml" >}}

This CR sets up the timeframe we are interested in
as between 5 am and 10 am on the 5th of May 2023,
and adds a few specific key-value pairs that will be substituted in the query.
For instance, if the query contains the placeholder `{{.nodename}}`, it is substituted with `test`.

The definition of this Analysis is referenced by its name and namespace and can be seen here:

{{< embed path="/metrics-operator/config/samples/metrics_v1alpha3_analysisdefinition.yaml" >}}

This simple definition contains a single objective, `response-time-p95`.
For this objective, there are both
failure and warning criteria:

* the objective will fail if the percentile 95 is less than 600
* there will be a warning in case the value is between 300 and 500
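
The two criteria above can be sketched in a few lines of Python.
This is only an illustration, not the operator's implementation:
the `Target` class and `evaluate` function are hypothetical names,
and both the order of the checks (failure before warning)
and the inclusive range bounds are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Target:
    """Hypothetical stand-in for the target criteria of one objective."""
    fail_less_than: Optional[float] = None                # failure: value < threshold
    warn_in_range: Optional[Tuple[float, float]] = None   # warning: low <= value <= high


def evaluate(value: float, target: Target) -> str:
    """Return "fail", "warning" or "pass" for a measured SLI value."""
    # Assumption: the failure criterion is checked before the warning criterion.
    if target.fail_less_than is not None and value < target.fail_less_than:
        return "fail"
    if target.warn_in_range is not None:
        low, high = target.warn_in_range
        # Assumption: both bounds are inclusive.
        if low <= value <= high:
            return "warning"
    return "pass"


# Criteria described above: fail below 600, warn between 300 and 500.
sample = Target(fail_less_than=600, warn_in_range=(300, 500))
print(evaluate(650, sample))  # pass
print(evaluate(550, sample))  # fail
```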

The total score specifies that this Analysis needs an overall score of at least 90% to pass, or at least 75% to get a warning.
Since there is only one objective, the Analysis either passes with 100% (the failure criterion is not met) or fails
with 0% (the failure criterion is met).
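
To make the interplay of weights, key objectives, and the total score more concrete,
here is a minimal Python sketch.
It is not the operator's code: the function name, the dictionary layout,
and especially the `warning_credit` parameter (how much of its weight an objective in warning state contributes)
are assumptions made for this illustration.

```python
def analysis_result(objectives, pass_pct, warning_pct, warning_credit=0.5):
    """Return (status, achieved percentage) for a list of objective outcomes.

    Each objective is a dict with "outcome" ("pass", "warning" or "fail"),
    "weight", and "key" (True for a key objective).
    warning_credit is an assumption of this sketch, not a documented value.
    """
    credit = {"pass": 1.0, "warning": warning_credit, "fail": 0.0}
    max_score = sum(o["weight"] for o in objectives)
    score = sum(o["weight"] * credit[o["outcome"]] for o in objectives)
    achieved = 100.0 * score / max_score if max_score else 0.0

    # A failed key objective fails the whole Analysis regardless of the score.
    if any(o["key"] and o["outcome"] == "fail" for o in objectives):
        return "fail", achieved
    if achieved >= pass_pct:
        return "pass", achieved
    if achieved >= warning_pct:
        return "warning", achieved
    return "fail", achieved


# One objective with weight 1, as in the sample definition.
single = [{"outcome": "pass", "weight": 1, "key": False}]
print(analysis_result(single, pass_pct=90, warning_pct=75))  # ('pass', 100.0)
single[0]["outcome"] = "fail"
print(analysis_result(single, pass_pct=90, warning_pct=75))  # ('fail', 0.0)
```

With several objectives, the same function shows how a heavily weighted objective dominates the result
and how a failed key objective overrides an otherwise passing score.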

The objective points to the corresponding AnalysisValueTemplate:

{{< embed path="/metrics-operator/config/samples/metrics_v1alpha3_analysisvaluetemplate.yaml" >}}

This template tells us that we will query a provider called `prometheus` using this query:

```shell
sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})
```

At runtime, the metrics operator tries to substitute every placeholder in the `{{.variableName}}` format
with the matching key-value pair from the Analysis resource,
so in this case the query becomes:

```shell
sum(kube_pod_container_resource_limits{node='test'}) - sum(kube_node_status_capacity{node='test'})
```
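
The substitution step can be imitated with a short Python sketch.
It uses a regular expression as a stand-in for the operator's templating,
so it only covers plain `{{.name}}` placeholders;
the `render_query` helper and the choice to leave unknown placeholders untouched are assumptions of this example.

```python
import re

# Matches placeholders such as {{.nodename}} and captures the key name.
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def render_query(query: str, args: dict) -> str:
    """Replace each {{.name}} placeholder with args[name]."""
    # Assumption: unknown placeholders are left as-is; the operator may behave differently.
    return PLACEHOLDER.sub(lambda m: str(args.get(m.group(1), m.group(0))), query)


template = (
    "sum(kube_pod_container_resource_limits{node='{{.nodename}}'})"
    " - sum(kube_node_status_capacity{node='{{.nodename}}'})"
)
# Key-value pairs as they appear in the sample Analysis.
args = {"project": "my-project", "stage": "dev", "service": "svc1", "nodename": "test"}
print(render_query(template, args))
```

Keys that do not appear in the query, such as `project` or `stage` here, are simply ignored by this sketch.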

The other key-value pairs, such as `project` and `stage`, are only examples of how to pass
information similar to Keptn v1 objectives to the provider.
For a working example, see the
[analysis-controller-multiple-providers test](https://github.com/keptn/lifecycle-toolkit/tree/main/test/testanalysis/analysis-controller-multiple-providers).

## Accessing Analysis

### Retrieve Analysis results with kubectl

Use the `kubectl get` command to retrieve all the `Analyses` in your cluster:

```shell
kubectl get analyses.metrics.keptn.sh -A
```

This returns output similar to:

```shell
NAMESPACE   NAME              ANALYSISDEFINITION    STATE   WARNING   PASS
default     analysis-sample   ed-my-proj-dev-svc1
```

You can then describe the Analysis with:

```shell
kubectl describe analyses.metrics.keptn.sh analysis-sample -n=default
```
4 changes: 1 addition & 3 deletions docs/content/en/docs/migrate/metrics-observe/_index.md
@@ -20,9 +20,7 @@ and Keptn evaluations.
> **Note**
The full SLO capabilities
provided by Keptn v1 such as weighting and scoring
-are currently under development for Keptn.
-You can follow and participate in the design and implementation process at
-[GitHub Epic 1646](https://github.com/keptn/lifecycle-toolkit/issues/1646).
+have a first implementation in the [Analysis](../../implementing/slo/).

Notice the paradigm differences when implementing Keptn evaluations:

@@ -16,7 +16,7 @@ spec:
     project: my-project
     stage: dev
     service: svc1
-    foo: bar # can be any key/value pair; NOT only project/stage/service
+    nodename: test # can be any key/value pair; NOT only project/stage/service
   analysisDefinition:
     name: ed-my-proj-dev-svc1
     namespace: keptn-lifecycle-toolkit-system
@@ -12,4 +12,4 @@ metadata:
 spec:
   provider:
     name: prometheus
-  query: "sum(kube_pod_container_resource_limits{resource='{{.Resource}}'}) - sum(kube_node_status_capacity{resource='{{.Resource}}'})"
+  query: "sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})"
