From 55d4ed973a0fd6701adf65fe010d82b74f35304f Mon Sep 17 00:00:00 2001 From: Adriana Villela <50256412+avillela@users.noreply.github.com> Date: Fri, 21 Jul 2023 13:36:17 -0400 Subject: [PATCH] Update OTel Operator readme and Target Allocator readme (#1951) --- .chloggen/readme-updates.yaml | 16 +++++++++++++ README.md | 22 +++++++++++------- cmd/otel-allocator/README.md | 44 ++++++++++++++++++++++++----------- 3 files changed, 60 insertions(+), 22 deletions(-) create mode 100644 .chloggen/readme-updates.yaml diff --git a/.chloggen/readme-updates.yaml b/.chloggen/readme-updates.yaml new file mode 100644 index 0000000000..af44f3b3e4 --- /dev/null +++ b/.chloggen/readme-updates.yaml @@ -0,0 +1,16 @@ +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: enhancement + +# The name of the component, or a single word describing the area of concern, (e.g. operator, target allocator, github action) +component: Documentation + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Update OTel Operator and Target Allocator readmes. + +# One or more tracking issues related to the change +issues: [1952] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. +subtext: diff --git a/README.md b/README.md index da46c8ef36..2cc6d63533 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ The OpenTelemetry Operator is an implementation of a [Kubernetes Operator](https The operator manages: * [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) -* auto-instrumentation of the workloads using OpenTelemetry instrumentation libraries +* [auto-instrumentation](https://opentelemetry.io/docs/concepts/instrumentation/automatic/) of the workloads using OpenTelemetry instrumentation libraries ## Documentation @@ -66,7 +66,7 @@ This will create an OpenTelemetry Collector instance named `simplest`, exposing The `config` node holds the `YAML` that should be passed down as-is to the underlying OpenTelemetry Collector instances. Refer to the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) documentation for a reference of the possible entries. -At this point, the Operator does *not* validate the contents of the configuration file: if the configuration is invalid, the instance will still be created but the underlying OpenTelemetry Collector might crash. +> 🚨 **NOTE:** At this point, the Operator does *not* validate the contents of the configuration file: if the configuration is invalid, the instance will still be created but the underlying OpenTelemetry Collector might crash. The Operator does examine the configuration file to discover configured receivers and their ports. If it finds receivers with ports, it creates a pair of kubernetes services, one headless, exposing those ports within the cluster. The headless service contains a `service.beta.openshift.io/serving-cert-secret-name` annotation that will cause OpenShift to create a secret containing a certificate and key. This secret can be mounted as a volume and the certificate and key used in those receivers' TLS configurations. 
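For a concrete picture of the `config` node described above, a minimal `OpenTelemetryCollector` CR along the lines of the `simplest` example might look like the sketch below. It is illustrative only; the receiver, exporter, and pipeline choices are assumptions rather than text taken from this readme, but it shows the YAML that is passed through as-is and the `otlp` receiver ports the Operator would expose through the generated services.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:   # the Operator reads these receiver ports to build the Services
          http:
    processors:
      batch:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]
```

Applying a manifest like this with `kubectl apply -f` is enough for the Operator to create the corresponding Collector workload and Services.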
@@ -83,7 +83,13 @@ The default and only other acceptable value for `.Spec.UpgradeStrategy` is `auto
 
 ### Deployment modes
 
-The `CustomResource` for the `OpenTelemetryCollector` exposes a property named `.Spec.Mode`, which can be used to specify whether the collector should run as a `DaemonSet`, `Sidecar`, `StatefulSet` or `Deployment` (default). Look at [this sample](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/daemonset-features/01-install.yaml) for a reference of `DaemonSet`.
+The `CustomResource` for the `OpenTelemetryCollector` exposes a property named `.Spec.Mode`, which can be used to specify whether the Collector should run as a [`DaemonSet`](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/), [`Sidecar`](https://kubernetes.io/docs/concepts/workloads/pods/#workload-resources-for-managing-pods), [`StatefulSet`](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) or [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) (default).
+
+See below for examples of each deployment mode:
+- [`Deployment`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/ingress/00-install.yaml)
+- [`DaemonSet`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/daemonset-features/01-install.yaml)
+- [`StatefulSet`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/smoke-statefulset/00-install.yaml)
+- [`Sidecar`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/instrumentation-python/00-install-collector.yaml)
 
 #### Sidecar injection
 
@@ -329,12 +335,12 @@ spec:
 
 In the above case, `myapp` and `myapp2` containers will be instrumented, `myapp3` will not.
 
-**NOTE**: Go auto-instrumentation **does not** support multicontainer pods. When injecting Go auto-instrumentation the first pod should be the only pod you want instrumented.
+> 🚨 **NOTE**: Go auto-instrumentation **does not** support multicontainer pods. When injecting Go auto-instrumentation, the first container should be the only container you want instrumented.
 
 #### Use customized or vendor instrumentation
 
 By default, the operator uses upstream auto-instrumentation libraries. Custom auto-instrumentation can be configured by
-overriding the image fields in a CR.
+overriding the `image` fields in a CR.
 
 ```yaml
 apiVersion: opentelemetry.io/v1alpha1
@@ -381,7 +387,7 @@ List of all available attributes can be found at [otel-webserver-module](https:/
 
 #### Inject OpenTelemetry SDK environment variables only
 
-You can configure the OpenTelemetry SDK for applications which can't currently be autoinstrumented by using `inject-sdk` in place of (e.g.) `inject-python` or `inject-java`. This will inject environment variables like `OTEL_RESOURCE_ATTRIBUTES`, `OTEL_TRACES_SAMPLER`, and `OTEL_EXPORTER_OTLP_ENDPOINT`, that you can configure in the `Instrumentation`, but will not actually provide the SDK.
+You can configure the OpenTelemetry SDK for applications which can't currently be autoinstrumented by using `inject-sdk` in place of `inject-python` or `inject-java`, for example. This will inject environment variables like `OTEL_RESOURCE_ATTRIBUTES`, `OTEL_TRACES_SAMPLER`, and `OTEL_EXPORTER_OTLP_ENDPOINT`, which you can configure in the `Instrumentation`, but will not actually provide the SDK.
 
 ```bash
 instrumentation.opentelemetry.io/inject-sdk: "true"
@@ -409,7 +415,7 @@ Language not specified in the table are always supported and cannot be disabled.
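As a sketch of where the `inject-sdk` annotation shown above is typically applied, the fragment below adds it to a workload's pod template. The Deployment name, labels, and image are hypothetical placeholders; only the annotation key comes from the text above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app               # hypothetical workload
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Injects OTEL_* environment variables only; no auto-instrumentation agent is added.
        instrumentation.opentelemetry.io/inject-sdk: "true"
    spec:
      containers:
        - name: my-app
          image: my-app:latest   # hypothetical image
```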
### Target Allocator -The OpenTelemetry Operator comes with an optional component, the Target Allocator (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also change the Prometheus receiver configuration in the CR, so that it uses the [http_sd_config](https://prometheus.io/docs/prometheus/latest/http_sd/) from the TA. The following example shows how to get started with the Target Allocator: +The OpenTelemetry Operator comes with an optional component, the [Target Allocator](/cmd/otel-allocator/README.md) (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also change the Prometheus receiver configuration in the CR, so that it uses the [http_sd_config](https://prometheus.io/docs/prometheus/latest/http_sd/) from the TA. The following example shows how to get started with the Target Allocator: ```yaml apiVersion: opentelemetry.io/v1alpha1 @@ -482,7 +488,7 @@ Behind the scenes, the OpenTelemetry Operator will convert the Collector’s con Note how the Operator removes any existing service discovery configurations (e.g., `static_configs`, `file_sd_configs`, etc.) from the `scrape_configs` section and adds an `http_sd_configs` configuration pointing to a Target Allocator instance it provisioned. -The OpenTelemetry Operator will also convert the Target Allocator's promethueus configuration after the reconciliation into the following: +The OpenTelemetry Operator will also convert the Target Allocator's Prometheus configuration after the reconciliation into the following: ```yaml config: diff --git a/cmd/otel-allocator/README.md b/cmd/otel-allocator/README.md index e46fcd684d..a5014811ba 100644 --- a/cmd/otel-allocator/README.md +++ b/cmd/otel-allocator/README.md @@ -1,14 +1,25 @@ # Target Allocator -The TargetAllocator is an optional separately deployed component of an OpenTelemetry Collector setup, which is used to -distribute targets of the PrometheusReceiver on all deployed Collector instances. The release version matches the +Target Allocator is an optional component of the OpenTelemetry Collector [Custom Resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) (CR). The release version matches the operator's most recent release as well. -In essence, Prometheus Receiver configs are overridden with a http_sd_configs directive that points to the -Allocator, these are then loadbalanced/sharded to the collectors. The Prometheus Receiver configs that are overridden -are what will be distributed with the same name. In addition to picking up receiver configs, the TargetAllocator -can discover targets via Prometheus CRs (currently ServiceMonitor, PodMonitor) which it presents as scrape configs -and jobs on the `/scrape_configs` and `/jobs` endpoints respectively. +In a nutshell, the TA is a mechanism for decoupling the service discovery and metric collection functions of Prometheus such that they can be scaled independently. The Collector manages Prometheus metrics without needing to install Prometheus. 
The TA manages the configuration of the Collector's [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md).
+
+The TA serves two functions:
+* Even distribution of Prometheus targets among a pool of Collectors
+* Discovery of Prometheus Custom Resources
+
+## Even Distribution of Prometheus Targets
+
+The Target Allocator's first job is to discover the targets to scrape and the Collectors to allocate those targets to; it then distributes the discovered targets among the Collectors. This means that the OTel Collectors collect the metrics instead of a Prometheus [scraper](https://uzxmx.github.io/prometheus-scrape-internals.html). Metrics are ingested by the OTel Collectors by way of the [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md).
+
+## Discovery of Prometheus Custom Resources
+
+The Target Allocator also provides for the discovery of [Prometheus Operator CRs](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md), namely the [ServiceMonitor and PodMonitor](https://github.com/open-telemetry/opentelemetry-operator/tree/main/cmd/otel-allocator#target-allocator). The ServiceMonitor and the PodMonitor don’t do any scraping themselves; their purpose is to inform the Target Allocator (or Prometheus) to add a new job to its scrape configuration. These metrics are then ingested by way of the Prometheus Receiver on the OpenTelemetry Collector.
+
+Even though Prometheus is not required to be installed in your Kubernetes cluster to use the Target Allocator for Prometheus CR discovery, the TA does require that the ServiceMonitor and PodMonitor CRDs be installed. These CRDs are bundled with Prometheus Operator; however, they can be installed standalone as well.
+
+The easiest way to do this is by going to the [Prometheus Operator’s Releases page](https://github.com/prometheus-operator/prometheus-operator/releases), grabbing a copy of the latest `bundle.yaml` file (for example, [this one](https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.66.0/bundle.yaml)), and stripping out all of the YAML except the ServiceMonitor and PodMonitor YAML definitions.
 
 # Usage
 The `spec.targetAllocator:` controls the TargetAllocator general properties. Full API spec can be found here: [api.md#opentelemetrycollectorspectargetallocator](../../docs/api.md#opentelemetrycollectorspectargetallocator)
@@ -44,14 +55,21 @@ spec:
           exporters: [logging]
 ```
 
+In essence, Prometheus Receiver configs are overridden with an `http_sd_config` directive that points to the
+Allocator; these are then load-balanced/sharded across the Collectors. The [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md) configs that are overridden
+are what get distributed, keeping the same names.
+
 ## PrometheusCR specifics
+
 TargetAllocator discovery of PrometheusCRs can be turned on by setting
-`.spec.targetAllocator.prometheusCR.enabled` to `true`
+`.spec.targetAllocator.prometheusCR.enabled` to `true`. The TargetAllocator then presents the discovered CRs as scrape configs
+and jobs on the `/scrape_configs` and `/jobs` endpoints respectively.
 
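A minimal sketch of the fragment that turns this on is shown below. It assumes the `statefulset` mode commonly used with the Target Allocator; the CR name is a placeholder and the rest of the spec, notably `config`, is omitted.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: collector-with-ta    # hypothetical name
spec:
  mode: statefulset
  targetAllocator:
    enabled: true
    prometheusCR:
      enabled: true          # turns on ServiceMonitor/PodMonitor discovery
```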
 The CRs can be filtered by labels as documented here: [api.md#opentelemetrycollectorspectargetallocatorprometheuscr](../../docs/api.md#opentelemetrycollectorspectargetallocatorprometheuscr)
 
-The prometheus receiver in the deployed collector also has to know where the Allocator service exists. This is done by a
-OpenTelemetry Collector operator specific config.
+The Prometheus Receiver in the deployed Collector also has to know where the Allocator service exists. This is done by an
+OpenTelemetry Collector Operator-specific config.
+
 ```yaml
   config: |
     receivers:
@@ -64,15 +82,13 @@ OpenTelemetry Collector operator specific config.
               interval: 30s
               collector_id: "${POD_NAME}"
 ```
-Upstream documentation here: [Prometheusreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/prometheusreceiver#opentelemetry-operator)
+
+Upstream documentation here: [PrometheusReceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/prometheusreceiver#opentelemetry-operator)
 
 The TargetAllocator service is named based on the OpenTelemetryCollector CR name. `collector_id` should be unique per
 collector instance, such as the pod name. The `POD_NAME` environment variable is convenient since this is supplied
 to collector instance pods by default.
 
-The Prometheus CRDs also have to exist for the Allocator to pick them up. The best place to get them is from
-prometheus-operator: [Releases](https://github.com/prometheus-operator/prometheus-operator/releases). Only the CRDs for
-CRs that the Allocator watches for need to be deployed. They can be picked out from the bundle.yaml file.
 
 ### RBAC
 The ServiceAccount that the TargetAllocator runs as, has to have access to the CRs. A role like this will provide that
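The role that the sentence above introduces lies outside this diff's context. As a rough, illustrative sketch of the kind of access involved, assuming only ServiceMonitor and PodMonitor discovery and not reproducing the repository's actual manifest, it might look like the following.

```yaml
# Illustrative sketch only; the repository's actual example role may differ.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: opentelemetry-targetallocator   # hypothetical name
rules:
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors"]
    verbs: ["get", "list", "watch"]
```

A corresponding ClusterRoleBinding would then tie such a role to the ServiceAccount that the TargetAllocator runs as.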