Add documentation on how to use OpenTelemetry +collector #3005

evankanderson · 2020-11-06T08:07:47Z

I'll dig out the relevant issue numbers in the morning. This provides instructions for setting up the OpenTelemetry collector and Prometheus in a fairly small footprint: probably around 250MB of RAM, based on 200MB for Prometheus and 50MB for the otc-collector.

With this documentation, I'd like to start the clock ticking on migrating existing Prometheus and Stackdriver users to the OpenTelemetry collector so that we can drastically simplify the code in pkg/metrics.

Proposed Changes

Add documentation on exporting statistics using the OpenTelemetry aggregator.

/assign @MontyCarter @mpetason

…us for serving. Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>

csantanapr · 2020-11-06T17:55:11Z

/assign

csantanapr · 2020-11-06T17:55:35Z

I will review and try the instructions @evankanderson

evankanderson · 2020-11-06T19:43:59Z

If you get stuck, feel free to ping me on Slack.

(Oh, I just realized I forgot to count the two operators in the overall RAM overhead... it looks like they add around 120MB, with the otel one being a bit bigger because it has two containers.)

csantanapr · 2020-11-06T19:59:20Z

docs/install/collecting-metrics/README.md

+   for the collector:
+
+   ```shell
+   kubectl apply --filename collector.yaml


We usually instruct users in the prerequisites to git clone the repo and change directory; otherwise this command fails with a file-not-found error.

I'm wondering about linking to raw.githubusercontent.com instead, so users don't need to do a clone. What do you think?

Yeah, I think it is better to provide an HTTP URL to raw.githubusercontent.com.

csantanapr · 2020-11-07T00:15:34Z

docs/install/collecting-metrics/README.md

+1. Finally, update the `config-observability` ConfigMap in Knative Serving and
+   Eventing
+   ```shell
+   kubectl patch --namespace knative-serving configmap/config-observability \


From the diagram it looks like only queue-proxy and activator would be pushing metrics, what about the control-plane controllers? Would they also be pushing metrics or not?

Yes, they would too, I can add more elements, thanks for the callout!

We should also add Eventing objects that emit metrics, such as sources.
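For readers following this thread, the truncated `kubectl patch` command in the excerpt above could look roughly like the sketch below. It is not the text from the PR. The collector address `otel-collector.metrics:55678` is a placeholder, and the keys `metrics.backend-destination` and `metrics.opencensus-address` are assumed to be the ones `config-observability` accepts, so check them against the `_example` block of the installed ConfigMap before relying on them.

```shell
# Placeholder address: Service name, namespace, and port are assumptions,
# not values taken from this PR.
COLLECTOR_ADDRESS="otel-collector.metrics:55678"

# Build a JSON merge patch that switches the metrics backend to OpenCensus
# and points it at the collector. Key names are assumed; verify them.
PATCH=$(printf '{"data":{"metrics.backend-destination":"opencensus","metrics.opencensus-address":"%s"}}' "$COLLECTOR_ADDRESS")
echo "$PATCH"

# Patch the ConfigMap in both Serving and Eventing, but only when kubectl
# is installed; a failure is reported without aborting the loop.
for ns in knative-serving knative-eventing; do
  if command -v kubectl >/dev/null 2>&1; then
    kubectl patch --namespace "$ns" configmap/config-observability \
      --type merge --patch "$PATCH" \
      || echo "patch failed for namespace $ns" >&2
  fi
done
```

Building the patch in a variable keeps the Serving and Eventing updates identical, which matters because both namespaces carry their own copy of the ConfigMap.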

csantanapr · 2020-11-07T13:09:56Z

Using the opentelemetry operator didn't work for me.

I also think that using the Prometheus Operator hides how simple it is to configure Prometheus to scrape.

I think we should provide a simpler setup, maybe a single YAML file that defines one Deployment and one ConfigMap each for the agent/collector and Prometheus.

If this example is only about metrics, I think we should remove any references to traces to avoid confusion.

We can do a separate docs example for traces.

evankanderson · 2020-11-09T17:50:52Z

I'll switch this to direct ConfigMap/Deployment/Service setup, rather than using the operators. Thanks for the feedback.

I'll spin up a new version this morning, hopefully.
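As a rough illustration of the operator-free setup discussed here, the sketch below writes a ConfigMap that holds a minimal collector configuration: an `opencensus` receiver feeding a `prometheus` exporter through a single metrics pipeline. It is not the `collector.yaml` from this PR. The ConfigMap name, the `metrics` namespace, and ports `55678` and `8889` are assumptions, and the Deployment and Service that would mount and expose this configuration are left out.

```shell
# Work in a scratch directory so nothing in a checked-out repo is overwritten.
WORKDIR=$(mktemp -d)
MANIFEST="$WORKDIR/collector-config.yaml"

# Write the ConfigMap; the quoted 'EOF' stops the shell from expanding
# anything inside the manifest.
cat > "$MANIFEST" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config   # hypothetical name
  namespace: metrics            # hypothetical namespace
data:
  collector.yaml: |
    receivers:
      opencensus:
        endpoint: "0.0.0.0:55678"   # assumed port for pushed metrics
    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"    # assumed port for Prometheus to scrape
    service:
      pipelines:
        metrics:
          receivers: [opencensus]
          exporters: [prometheus]
EOF

echo "wrote $MANIFEST"
# To use it against a cluster:
#   kubectl apply --filename "$MANIFEST"
```

Keeping the collector configuration in a plain ConfigMap makes the pipeline visible in one place, which is the simplicity argument raised in the review.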

evankanderson · 2020-11-10T00:59:13Z

@csantanapr Please take another look/try these again.

It turns out that queue-proxy doesn't seem to put the most useful labels on the exported metrics, so I added some label_replace examples.
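To make the `label_replace` idea concrete, here is a hedged sketch of a Prometheus recording rule that derives a new label from an existing one. It is not one of the rules added in this PR. The metric `example_request_count`, the labels `pod_name` and `service_name`, the regex, and the rule and group names are all hypothetical. Only the `label_replace(vector, dst_label, replacement, src_label, regex)` call shape is standard PromQL.

```shell
# Write the rule to a temporary file.
RULES=$(mktemp)

# The quoted 'EOF' keeps the shell from expanding "$1", which PromQL needs
# literally as the reference to the first regex capture group.
cat > "$RULES" <<'EOF'
groups:
  - name: knative-label-examples              # hypothetical group name
    rules:
      # Copy the part of pod_name before "-deployment-" into service_name.
      - record: example:request_count:with_service   # hypothetical rule name
        expr: |
          label_replace(
            example_request_count,
            "service_name", "$1",
            "pod_name", "(.*)-deployment-.+"
          )
EOF

echo "wrote $RULES"

# Optional syntax check when promtool is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$RULES" || echo "promtool reported a problem" >&2
fi
```

If the regex does not match a series, `label_replace` returns that series unchanged, so a rule like this is safe to apply to metrics that only sometimes carry the source label.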

skonto · 2020-11-26T10:33:29Z

docs/install/collecting-metrics/README.md

@@ -0,0 +1,97 @@
+This document describes how to set up the
+[OpenTelemetry Collector](https://opentelemetry.io/docs/collector/about/) to


The link seems dead; an alternative is https://opentelemetry.io/docs/collector

Thanks, fixed.

skonto · 2020-11-26T10:38:50Z

docs/install/collecting-metrics/README.md

+[opentelemetry-operator](https://github.com/open-telemetry/opentelemetry-operator),
+but it's also easy to manage this service directly.
+
+![Diagram of components reporting to collector, which is scraped by Prometheus](./system-diagram.svg)


I think we need to make clear that this is an example architecture and provide a link to more options here. Some users might want to use a Collector as an Agent, etc. Do we support that?

I think those should work; I wanted to include a minimal example here.

skonto · 2020-11-26T10:43:24Z

docs/install/collecting-metrics/README.md

+This document describes how to set up the
+[OpenTelemetry Collector](https://opentelemetry.io/docs/collector/about/) to
+receive metrics from the Knative infrastructure components and distribute them
+to Prometheus. [OpenTelemetry](https://opentelemetry.io/) is a CNCF project to


We could also use the definition given on the official site:

OpenTelemetry (CNCF project) is an observability framework for cloud native software. It provides a collection of tools, APIs, and SDKs which allow the instrumentation, generation, collection, and export of telemetry data (metrics, logs, and traces) for analysis in order to understand software's performance and behavior.

Thanks, copied it over.

abrennan89 · 2020-12-16T18:46:10Z

@evankanderson any updates on this one? Is it blocked by anything?

evankanderson · 2021-03-05T22:43:19Z

I've addressed the most recent @skonto comments. I think this is still correct -- I've used it intermittently and didn't realize I hadn't gotten the PR submitted. 😁

abrennan89 · 2021-03-17T19:45:59Z

/retest

abrennan89 · 2021-03-17T20:10:19Z

/approve
/lgtm

Thanks Evan! 🙂

knative-prow-robot · 2021-03-17T20:10:30Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abrennan89, MontyCarter

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [abrennan89]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Add documentation on how to use opentelemetry collector with promethe…

5151ce4

…us for serving. Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>

knative-prow-robot assigned MontyCarter and mpetason Nov 6, 2020

google-cla bot added the `cla: yes` label (indicates the PR's author has signed the CLA) Nov 6, 2020

knative-prow-robot added the approved label Nov 6, 2020

knative-prow-robot requested review from bbrowning and nachocano November 6, 2020 08:07

knative-prow-robot added the `size/L` label (denotes a PR that changes 100-499 lines, ignoring generated files) Nov 6, 2020

knative-prow-robot assigned csantanapr Nov 6, 2020

csantanapr reviewed Nov 6, 2020

csantanapr reviewed Nov 6, 2020

Add cert-manager to otel install instructions

c6f2408

csantanapr reviewed Nov 7, 2020

csantanapr reviewed Nov 7, 2020

evankanderson added 2 commits November 9, 2020 12:16

Update to remove operators

08a9c51

Merge remote-tracking branch 'origin/reading-the-meter' into reading-…

bc7efc8

…the-meter

MontyCarter approved these changes Nov 9, 2020

Fix prometheus/otel typos, add rule examples.

d20a256

csantanapr reviewed Nov 13, 2020

abrennan89 added this to To do in New observability stack documentation Nov 25, 2020

skonto reviewed Nov 26, 2020

abrennan89 added the `triage/needs-eng-input` (engineering input is requested) and `status/blocked` labels Mar 5, 2021

Merge remote-tracking branch 'upstream/master' into reading-the-meter

75d69d1

Convert to _index.md, address @skonto comments

8511db0

knative-prow-robot removed the approved label Mar 5, 2021

Base automatically changed from master to main March 8, 2021 17:41

abrennan89 assigned abrennan89 and unassigned mpetason Mar 17, 2021

abrennan89 added this to the v0.22.0 milestone Mar 17, 2021

knative-prow-robot added the `lgtm` label (indicates that a PR is ready to be merged) Mar 17, 2021

knative-prow-robot added the approved label Mar 17, 2021

knative-prow-robot merged commit f2e741e into knative:main Mar 17, 2021

evankanderson deleted the reading-the-meter branch March 24, 2021 05:50
