
Update OTel Operator readme and Target Allocator readme (#1951)
avillela committed Jul 21, 2023
1 parent cdaa2c4 commit 55d4ed9
Showing 3 changed files with 60 additions and 22 deletions.
16 changes: 16 additions & 0 deletions .chloggen/readme-updates.yaml
@@ -0,0 +1,16 @@
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the component, or a single word describing the area of concern, (e.g. operator, target allocator, github action)
+component: Documentation
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Update OTel Operator and Target Allocator readmes.
+
+# One or more tracking issues related to the change
+issues: [1952]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
22 changes: 14 additions & 8 deletions README.md
@@ -6,7 +6,7 @@ The OpenTelemetry Operator is an implementation of a [Kubernetes Operator](https

The operator manages:
* [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector)
-* auto-instrumentation of the workloads using OpenTelemetry instrumentation libraries
+* [auto-instrumentation](https://opentelemetry.io/docs/concepts/instrumentation/automatic/) of the workloads using OpenTelemetry instrumentation libraries

## Documentation

@@ -66,7 +66,7 @@ This will create an OpenTelemetry Collector instance named `simplest`, exposing

The `config` node holds the `YAML` that should be passed down as-is to the underlying OpenTelemetry Collector instances. Refer to the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) documentation for a reference of the possible entries.
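For orientation, a minimal CR with a `config` node might look like the following sketch (the resource name and pipeline contents are illustrative, echoing the `simplest` example referenced above):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```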

-At this point, the Operator does *not* validate the contents of the configuration file: if the configuration is invalid, the instance will still be created but the underlying OpenTelemetry Collector might crash.
+> 🚨 **NOTE:** At this point, the Operator does *not* validate the contents of the configuration file: if the configuration is invalid, the instance will still be created but the underlying OpenTelemetry Collector might crash.
The Operator does examine the configuration file to discover configured receivers and their ports. If it finds receivers with ports, it creates a pair of kubernetes services, one headless, exposing those ports within the cluster. The headless service contains a `service.beta.openshift.io/serving-cert-secret-name` annotation that will cause OpenShift to create a secret containing a certificate and key. This secret can be mounted as a volume and the certificate and key used in those receivers' TLS configurations.
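A rough sketch of how that secret could be wired into a receiver's TLS settings (the volume and secret names below are assumptions, not operator defaults):

```yaml
spec:
  volumes:
    - name: serving-certs
      secret:
        secretName: my-collector-headless-tls   # assumed name of the OpenShift-created secret
  volumeMounts:
    - name: serving-certs
      mountPath: /certs
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            tls:
              cert_file: /certs/tls.crt
              key_file: /certs/tls.key
```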

@@ -83,7 +83,13 @@ The default and only other acceptable value for `.Spec.UpgradeStrategy` is `auto`

### Deployment modes

-The `CustomResource` for the `OpenTelemetryCollector` exposes a property named `.Spec.Mode`, which can be used to specify whether the collector should run as a `DaemonSet`, `Sidecar`, `StatefulSet` or `Deployment` (default). Look at [this sample](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/daemonset-features/01-install.yaml) for a reference of `DaemonSet`.
+The `CustomResource` for the `OpenTelemetryCollector` exposes a property named `.Spec.Mode`, which can be used to specify whether the Collector should run as a [`DaemonSet`](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/), [`Sidecar`](https://kubernetes.io/docs/concepts/workloads/pods/#workload-resources-for-managing-pods), [`StatefulSet`](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) or [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) (default).
+
+See below for examples of each deployment mode:
+- [`Deployment`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/ingress/00-install.yaml)
+- [`DaemonSet`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/daemonset-features/01-install.yaml)
+- [`StatefulSet`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/smoke-statefulset/00-install.yaml)
+- [`Sidecar`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/instrumentation-python/00-install-collector.yaml)
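For instance, running the Collector on every node is a one-field change in the CR (a sketch; the name and pipeline are illustrative):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: collector-per-node   # illustrative name
spec:
  mode: daemonset            # instead of the default "deployment"
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```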

#### Sidecar injection

@@ -329,12 +335,12 @@ spec:

In the above case, `myapp` and `myapp2` containers will be instrumented, `myapp3` will not.

-**NOTE**: Go auto-instrumentation **does not** support multicontainer pods. When injecting Go auto-instrumentation the first pod should be the only pod you want instrumented.
+> 🚨 **NOTE**: Go auto-instrumentation **does not** support multicontainer pods. When injecting Go auto-instrumentation the first pod should be the only pod you want instrumented.
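For reference, container selection in the example above is driven by pod annotations along these lines (a sketch; the language annotation shown is illustrative):

```yaml
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"             # language is illustrative
    instrumentation.opentelemetry.io/container-names: "myapp,myapp2"
```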

#### Use customized or vendor instrumentation

By default, the operator uses upstream auto-instrumentation libraries. Custom auto-instrumentation can be configured by
-overriding the image fields in a CR.
+overriding the `image` fields in a CR.

```yaml
apiVersion: opentelemetry.io/v1alpha1
  # ...
```
@@ -381,7 +387,7 @@ List of all available attributes can be found at [otel-webserver-module](https:/

#### Inject OpenTelemetry SDK environment variables only

-You can configure the OpenTelemetry SDK for applications which can't currently be autoinstrumented by using `inject-sdk` in place of (e.g.) `inject-python` or `inject-java`. This will inject environment variables like `OTEL_RESOURCE_ATTRIBUTES`, `OTEL_TRACES_SAMPLER`, and `OTEL_EXPORTER_OTLP_ENDPOINT`, that you can configure in the `Instrumentation`, but will not actually provide the SDK.
+You can configure the OpenTelemetry SDK for applications which can't currently be autoinstrumented by using `inject-sdk` in place of `inject-python` or `inject-java`, for example. This will inject environment variables like `OTEL_RESOURCE_ATTRIBUTES`, `OTEL_TRACES_SAMPLER`, and `OTEL_EXPORTER_OTLP_ENDPOINT`, that you can configure in the `Instrumentation`, but will not actually provide the SDK.

```bash
instrumentation.opentelemetry.io/inject-sdk: "true"
```
@@ -409,7 +415,7 @@ Language not specified in the table are always supported and cannot be disabled.

### Target Allocator

-The OpenTelemetry Operator comes with an optional component, the Target Allocator (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also change the Prometheus receiver configuration in the CR, so that it uses the [http_sd_config](https://prometheus.io/docs/prometheus/latest/http_sd/) from the TA. The following example shows how to get started with the Target Allocator:
+The OpenTelemetry Operator comes with an optional component, the [Target Allocator](/cmd/otel-allocator/README.md) (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also change the Prometheus receiver configuration in the CR, so that it uses the [http_sd_config](https://prometheus.io/docs/prometheus/latest/http_sd/) from the TA. The following example shows how to get started with the Target Allocator:

```yaml
apiVersion: opentelemetry.io/v1alpha1
  # ...
```
@@ -482,7 +488,7 @@ Behind the scenes, the OpenTelemetry Operator will convert the Collector’s con

Note how the Operator removes any existing service discovery configurations (e.g., `static_configs`, `file_sd_configs`, etc.) from the `scrape_configs` section and adds an `http_sd_configs` configuration pointing to a Target Allocator instance it provisioned.

-The OpenTelemetry Operator will also convert the Target Allocator's promethueus configuration after the reconciliation into the following:
+The OpenTelemetry Operator will also convert the Target Allocator's Prometheus configuration after the reconciliation into the following:

```yaml
config:
  # ...
```
44 changes: 30 additions & 14 deletions cmd/otel-allocator/README.md
@@ -1,14 +1,25 @@
# Target Allocator

-The TargetAllocator is an optional separately deployed component of an OpenTelemetry Collector setup, which is used to
-distribute targets of the PrometheusReceiver on all deployed Collector instances. The release version matches the
+Target Allocator is an optional component of the OpenTelemetry Collector [Custom Resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) (CR). The release version matches the
operator's most recent release as well.

-In essence, Prometheus Receiver configs are overridden with a http_sd_configs directive that points to the
-Allocator, these are then loadbalanced/sharded to the collectors. The Prometheus Receiver configs that are overridden
-are what will be distributed with the same name. In addition to picking up receiver configs, the TargetAllocator
-can discover targets via Prometheus CRs (currently ServiceMonitor, PodMonitor) which it presents as scrape configs
-and jobs on the `/scrape_configs` and `/jobs` endpoints respectively.
+In a nutshell, the TA is a mechanism for decoupling the service discovery and metric collection functions of Prometheus such that they can be scaled independently. The Collector manages Prometheus metrics without needing to install Prometheus. The TA manages the configuration of the Collector's [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md).
+
+The TA serves two functions:
+* Even distribution of Prometheus targets among a pool of Collectors
+* Discovery of Prometheus Custom Resources
+
+## Even Distribution of Prometheus Targets
+
+The Target Allocator's first job is to discover targets to scrape and collectors to allocate targets to. Then it can distribute the targets it discovers among the collectors. This means that the OTel Collectors collect the metrics instead of a Prometheus [scraper](https://uzxmx.github.io/prometheus-scrape-internals.html). Metrics are ingested by the OTel Collectors by way of the [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md).
+
+## Discovery of Prometheus Custom Resources
+
+The Target Allocator also provides for the discovery of [Prometheus Operator CRs](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md), namely the [ServiceMonitor and PodMonitor](https://github.com/open-telemetry/opentelemetry-operator/tree/main/cmd/otel-allocator#target-allocator). The ServiceMonitor and the PodMonitor don’t do any scraping themselves; their purpose is to inform the Target Allocator (or Prometheus) to add a new job to their scrape configuration. These metrics are then ingested by way of the Prometheus Receiver on the OpenTelemetry Collector.
+
+Even though Prometheus is not required to be installed in your Kubernetes cluster to use the Target Allocator for Prometheus CR discovery, the TA does require that the ServiceMonitor and PodMonitor be installed. These CRs are bundled with Prometheus Operator; however, they can be installed standalone as well.
+
+The easiest way to do this is by going to the [Prometheus Operator’s Releases page](https://github.com/prometheus-operator/prometheus-operator/releases), grabbing a copy of the latest `bundle.yaml` file (for example, [this one](https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.66.0/bundle.yaml)), and stripping out all of the YAML except the ServiceMonitor and PodMonitor YAML definitions.
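A sketch of that workflow (the release version is an example; server-side apply is used because the bundled CRDs are large):

```bash
# Fetch the Prometheus Operator bundle (version shown is an example)
curl -sL -o bundle.yaml \
  https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.66.0/bundle.yaml

# Hand-edit bundle.yaml so only the ServiceMonitor and PodMonitor
# CustomResourceDefinition documents remain, then apply them:
kubectl apply --server-side -f bundle.yaml
```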

# Usage
The `spec.targetAllocator:` controls the TargetAllocator general properties. Full API spec can be found here: [api.md#opentelemetrycollectorspectargetallocator](../../docs/api.md#opentelemetrycollectorspectargetallocator)
@@ -44,14 +55,21 @@ spec:
exporters: [logging]
```
+In essence, Prometheus Receiver configs are overridden with a `http_sd_config` directive that points to the
+Allocator, these are then loadbalanced/sharded to the Collectors. The [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md) configs that are overridden
+are what will be distributed with the same name.
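The rewritten scrape configuration then looks roughly like the following (a sketch; the service name, job name, and port are illustrative, and the Target Allocator's `/jobs/<job>/targets` endpoint is assumed):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: otel-collector   # keeps the original job name
          http_sd_configs:
            - url: http://my-collector-targetallocator:80/jobs/otel-collector/targets?collector_id=$POD_NAME
```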

## PrometheusCR specifics

TargetAllocator discovery of PrometheusCRs can be turned on by setting
-`.spec.targetAllocator.prometheusCR.enabled` to `true`
+`.spec.targetAllocator.prometheusCR.enabled` to `true`, which it presents as scrape configs
+and jobs on the `/scrape_configs` and `/jobs` endpoints respectively.
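In the CR, that toggle looks like this (sketch):

```yaml
spec:
  targetAllocator:
    enabled: true
    prometheusCR:
      enabled: true
```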

The CRs can be filtered by labels as documented here: [api.md#opentelemetrycollectorspectargetallocatorprometheuscr](../../docs/api.md#opentelemetrycollectorspectargetallocatorprometheuscr)

-The prometheus receiver in the deployed collector also has to know where the Allocator service exists. This is done by a
-OpenTelemetry Collector operator specific config.
+The Prometheus Receiver in the deployed Collector also has to know where the Allocator service exists. This is done by a
+OpenTelemetry Collector Operator-specific config.

```yaml
config: |
receivers:
@@ -64,15 +82,13 @@
interval: 30s
collector_id: "${POD_NAME}"
```
-Upstream documentation here: [Prometheusreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/prometheusreceiver#opentelemetry-operator)
+
+Upstream documentation here: [PrometheusReceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/prometheusreceiver#opentelemetry-operator)

The TargetAllocator service is named based on the OpenTelemetryCollector CR name. `collector_id` should be unique per
collector instance, such as the pod name. The `POD_NAME` environment variable is convenient since this is supplied
to collector instance pods by default.
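For reference, `POD_NAME` reaches Collector pods through the Kubernetes downward API; the operator injects the equivalent of the following by default (sketch):

```yaml
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
```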

-The Prometheus CRDs also have to exist for the Allocator to pick them up. The best place to get them is from
-prometheus-operator: [Releases](https://github.com/prometheus-operator/prometheus-operator/releases). Only the CRDs for
-CRs that the Allocator watches for need to be deployed. They can be picked out from the bundle.yaml file.

### RBAC
The ServiceAccount that the TargetAllocator runs as, has to have access to the CRs. A role like this will provide that
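A sketch of such a role, assuming only read access to the ServiceMonitor and PodMonitor CRs is needed (the name is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: targetallocator-cr-role   # illustrative name
rules:
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors"]
    verbs: ["get", "list", "watch"]
```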
