docs: edits to metrics-operator architecture page (#1679)
Signed-off-by: Meg McRoberts <meg.mcroberts@dynatrace.com>
Co-authored-by: Giovanni Liva <giovanni.liva@dynatrace.com>
Co-authored-by: Rakshit Gondwal <98955085+rakshitgondwal@users.noreply.github.com>
Co-authored-by: Moritz Wiesinger <moritz.wiesinger@dynatrace.com>
Co-authored-by: RealAnna <89971034+RealAnna@users.noreply.github.com>
5 people committed Aug 10, 2023
1 parent 50dac48 commit 7eb8afe
Showing 2 changed files with 80 additions and 26 deletions.
@@ -6,18 +6,30 @@ weight: 80
cascade:
---

The Keptn Metrics Operator collects, processes,
and analyzes metrics data from a variety of sources.
Once collected, this data can be used
to generate a variety of reports and dashboards
that provide insights into the health and performance
of the application and infrastructure.

The Keptn Metrics Operator collects, processes, and analyzes metrics data from a variety of sources.
Once collected, this data, can be used to generate a variety of reports and dashboards
that provide insights into the health and performance of the application and infrastructure.
While Kubernetes has ways to extend its metrics APIs, they have limitations,
especially that they only allow you to use a single observability platform
such as Prometheus, Dynatrace or Datadog.
The Keptn Metrics Operator solves this problem
by providing a single entry point for
all your metrics data, regardless of its source,
so you can use multiple instances of multiple observability platforms.

While Kubernetes does have two metrics servers, they have limitations.
The custom and external APIs only allow you to use a single observability platform.
The Keptn Metrics Operator solves this problem by providing a single entry point for
all your metrics data, regardless of its source.
Furthermore, due to the integration with the Kubernetes custom metrics API, these metrics are also
compatible with the Kubernetes HorizontalPodAutoscaler (HPA) which enables the horizontal scaling of workloads
based on metrics collected from multiple observability platforms such as Prometheus, Dynatrace or Datadog.
Keptn metrics are integrated with the Kubernetes
[Custom Metrics API](https://github.com/kubernetes/metrics#custom-metrics-api)
so they are compatible with the Kubernetes
[HorizontalPodAutoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
(HPA), which enables the horizontal scaling of workloads
based on metrics collected from multiple observability platforms.
See
[Using the HorizontalPodAutoscaler](../../../../implementing/evaluatemetrics.md/#using-the-horizontalpodautoscaler)
for instructions.
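
As a minimal sketch of that integration, an HPA can point at a `KeptnMetric` through an `Object` metric source.
The resource names, namespace, replica bounds, and target value below are illustrative assumptions,
not values taken from this page,
and a `KeptnMetric` named `cpu-throttling` is assumed to exist already in the same namespace.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: podtato-head-entry        # hypothetical HPA name
  namespace: podtato-kubectl      # hypothetical namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podtato-head-entry      # hypothetical workload to scale
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Object
      object:
        metric:
          name: cpu-throttling    # name of the metric to read
        describedObject:
          apiVersion: metrics.keptn.sh/v1alpha3
          kind: KeptnMetric
          name: cpu-throttling    # assumed existing KeptnMetric resource
        target:
          type: Value
          value: "0.05"           # illustrative threshold
```

With a manifest along these lines, the HPA reads the metric value through the custom metrics API
rather than from one particular observability platform.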

The Metrics Operator consists of the following components:

@@ -33,22 +45,49 @@ style Metrics-Adapter fill:#d8e6f4,stroke:#fff,stroke-width:px,color:#006bb8
style Metrics-Controller fill:#d8e6f4,stroke:#fff,stroke-width:px,color:#006bb8
```

**Metrics adapter** is used to expose custom metrics from an application to external monitoring and alerting tools.
The adapter exposes custom metrics on a specific endpoint where external monitoring and alerting tools can scrape them.
It is an important component of the metrics operator as it allows for the collection and exposure of custom metrics,
which can be used to gain insight into the behavior and performance of applications running on a Kubernetes cluster.
The **Metrics adapter** exposes custom metrics from an application
to external monitoring and alerting tools.
The adapter exposes custom metrics on a specific endpoint
where external monitoring and alerting tools can scrape them.
It is an important component of the metrics operator
as it allows for the collection and exposure of custom metrics,
which can be used to gain insight into the behavior and performance
of applications running on a Kubernetes cluster.

The **Metrics controller** fetches metrics from an SLI provider.
The controller reconciles a [`KeptnMetric`](../../../../yaml-crd-ref/metric.md)
resource and updates its status with the metric value
provided by the selected metric provider.
Each `KeptnMetric` is identified by `name`
and is associated with an instance of an observability platform
that is defined in a
[KeptnMetricsProvider](../../../../yaml-crd-ref/metricsprovider.md)
resource.
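
For illustration, the pairing of a metric with its provider might look like the following.
All names, the namespace, the server URL, and the query are hypothetical.
The `metrics.keptn.sh/v1alpha3` API version and the provider's `type` and `targetServer` fields
are assumed from the CRD version referenced elsewhere on this page,
so check them against the CRD reference for your installed version.

```yaml
apiVersion: metrics.keptn.sh/v1alpha3
kind: KeptnMetricsProvider
metadata:
  name: my-prometheus             # hypothetical provider instance name
  namespace: podtato-kubectl      # hypothetical namespace
spec:
  type: prometheus                # assumed field: which observability platform this is
  targetServer: "http://prometheus-k8s.monitoring.svc.cluster.local:9090"  # hypothetical URL
---
apiVersion: metrics.keptn.sh/v1alpha3
kind: KeptnMetric
metadata:
  name: cpu-throttling            # the name that identifies this metric
  namespace: podtato-kubectl
spec:
  provider:
    name: my-prometheus           # must match the KeptnMetricsProvider above
  query: 'avg(rate(container_cpu_cfs_throttled_seconds_total{container="server", namespace="podtato-kubectl"}[1m]))'
  fetchIntervalSeconds: 10        # how often the controller refreshes the value
```

Because each `KeptnMetric` names its own provider,
two metrics in the same namespace can draw from different platforms
or from different instances of the same platform.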

**Metrics controller** is used to fetch metrics from a SLI provider.
The controller reconciles a [`KeptnMetric`](../../../../yaml-crd-ref/metric.md) CR and
updates its status with the metric value provided by the selected SLI provider.
The steps in which the controller fetches metrics are given below:

* It first fetches the `KeptnMetric` object to reconcile.
* If the object is not found, it returns and lets Kubernetes handle deleting all associated resources.
* If the object is found, the code checks that if the metric has been updated within the configured
interval which is defined in the `Spec.FetchIntervalSeconds`.
If not, then it skips reconciling and requeues the request for later.
* If the metric should be reconciled, it fetches the provider defined in the `Spec.Provider.Name` field.
* If the provider is not found, it returns and requeues the request for later.
* If the provider is found, it loads the provider and evaluates the query defined in the `Spec.Query` field.
* If the evaluation is succesful, it stores the fetched value in the status of the `KeptnMetric` object.
1. When a [`KeptnMetric`](../../../../yaml-crd-ref/metric.md)
   resource is found or modified,
   the controller checks whether the metric has been updated
   within the interval that is defined in the `spec.fetchIntervalSeconds` field.
   * If it has, the stored value is still current,
     so the controller skips the reconciliation process
     and queues the request for later.

1. The controller attempts to fetch the provider defined in the
   `spec.provider.name` field.
   * If the provider cannot be found, the controller stops reconciling
     and requeues the request for later.

1. If the provider is found,
   the controller loads the provider and evaluates the query
   defined in the `spec.query` field.
   * If the evaluation is successful,
     it stores the fetched value
     in the `status` field of the `KeptnMetric` object.
   * If the evaluation fails,
     the error and reason are written to the
     [KeptnMetricStatus](../../../../crd-ref/metrics/v1alpha3/#keptnmetricstatus)
     resource.
     The error is described in both human-readable language
     and as raw data to help identify the source of the problem
     (such as a forbidden code).
15 changes: 15 additions & 0 deletions docs/content/en/docs/yaml-crd-ref/metric.md
@@ -28,6 +28,12 @@ spec:
  fetchIntervalSeconds: <#-seconds>
  range:
    interval: "<timeframe>"
status:
  properties:
    value: <resulting value in human-readable language>
    rawValue: <resulting value, in raw format>
    errMsg: <error details if the query could not be evaluated>
    lastUpdated: <time when the status data was last updated>
```
## Fields
Expand Down Expand Up @@ -65,6 +71,15 @@ spec:
* **interval** -- Timeframe for which the metric would be queried.
Defaults to 5m.

* **status**
* KLT fills in this information when the metric is evaluated.
It always records the time the metric was last evaluated.
If the evaluation is successful,
this stores the result in both human-readable and raw format.
If the evaluation is not successful,
this stores error details that you can use to understand the problem
such as a forbidden code.

## Usage

A `KeptnMetric` resource must be located