Replies: 1 comment
-
|
Draft PR is up: #2128. Only chart-surface change, ~60 line diff, backward compatible, 11 new unit tests. Grateful for feedback on the four |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
The itch
otel.tracinginhelm/kagent/values.yamllets you enable OTLP tracing and point at any endpoint, but the moment that endpoint is authenticated (Langfuse Cloud, Honeycomb, Datadog, Grafana Cloud, New Relic, Chronosphere, ...), the chart has no way to plumb the auth header. It also can't setOTEL_SERVICE_NAMEorOTEL_RESOURCE_ATTRIBUTES, which is how the OTEL spec expectsservice.name,deployment.environment.name, tenant/team/env tags to flow into resource attributes.The Go controller in
go/core/internal/telemetry/tracing.goalready understands all of these env vars viaautoexport.NewSpanExporter(ctx)andresource.WithFromEnv(). This is a pure chart-surface gap.Why post this as a Discussion
CONTRIBUTING.mdasks for large-ish changes to go through issue + Slack + agreed plan before a draft PR. The issue is the "what and why"; this discussion is the "shape of the API" — the exactvalues.yamlkeys are the design surface I'd like feedback on before anything gets merged.Concrete Langfuse example (the thing I hit in the wild)
Sending kagent controller traces to Langfuse Cloud requires:
https://cloud.langfuse.com/api/public/otel/v1/traces— supported today.http/protobuf— supported today viaotel.tracing.exporter.otlp.protocol.Authorization: Basic <base64(publicKey:secretKey)>— not supported.langfuse.environment=<env>so traces are partitioned per env — not supported.Today the only way to get all four is to bypass the chart's
otel.*config entirely and wire env vars into the controller Pod viacontroller.envFromfrom a hand-writtenConfigMap+Secret. Two things get worse with that approach:otel.tracing.enabledinvalues.yamlis effectively a lie — the truthful value lives in a sidecar ConfigMap.elementor/elementor-cloud-external-charts@1e8b9436+elementor/elementor-agents@13545eb).Proposed shape
Four new declarative knobs, all optional, all backward compatible:
key1=value1,key2=value2with sorted keys (same pattern already used forDEFAULT_AGENT_POD_LABELS).helm templateoutput is byte-identical for existing installs.controller.envFromhook is exactly the right extension point for a Basic-auth header stored in aSecret; the chart'sdata.OTEL_EXPORTER_OTLP_TRACES_HEADERSis transparently overridden by the Pod-level env var. Thevalues.yamlcomments document this.End-to-end Langfuse install after the change
That's the entire configuration — no sidecar ConfigMap/Secret, no wrapper chart required, and the same shape works for Honeycomb / Datadog / Grafana Cloud / New Relic just by swapping the header Secret and endpoint.
Open design questions
headersSecretRef? Should we also addotel.tracing.exporter.otlp.headersSecretRef: {name, key}that renders as an envvalueFrom.secretKeyRefon the controller Deployment? Pro: single-schema declarative install. Con: duplicatescontroller.envFrom, adds a template branch. I default to no for the first cut — happy to add if you want.logging.exporter.otlp.headers? Included in my draft. Cheap and consistent with the existing schema (both signals exposeinsecure/timeout). Speak up if you'd rather keep logs out of scope for round one.serviceName/resourceAttributes. I put them atotel.*root because per OTEL spec they are global resource attributes shared across signals. Alternative would beotel.tracing.*only, but that ignores logs. Preferences?helm/tools/querydoc/templates/configmap.yamlin the same PR? It has an identical gap. My instinct is no, separate PR, but I'll defer.What's ready
nirzOps/kagent:feat/otel-headers-service-name-resource-attrs(~60 line diff).helm/kagent/tests/controller-configmap_test.yamlcovering backward compat + each new field + full Langfuse render (11 tests, all pass).helm lintclean, all existing 225 unit tests still pass.Draft PR link will be added here as soon as it's opened. Grateful for any pushback on the API shape before I un-draft it.
Beta Was this translation helpful? Give feedback.
All reactions