Switch to OpenTelemetry #341

Closed
charithe opened this issue Oct 4, 2021 · 2 comments · Fixed by #1887

@charithe
Contributor

charithe commented Oct 4, 2021

Now that the Go OpenTelemetry SDK has reached GA, investigate whether we can migrate away from OpenCensus.

The last time we looked into this, the Otel SDK was not production-ready and had very little support for metrics and gRPC tracing.

@charithe
Contributor Author

Current hurdles:

  • Otel metrics are still in the pre-alpha stage.
  • The gRPC plugin does not provide any of the metrics we currently get from the OpenCensus plugin. Those metrics are quite useful and losing them would be a pity.
  • We can keep using OpenCensus and use the bridge to export the traces to Otel. This works for local traces but I couldn't get it to work for distributed traces. The spans don't link up for some reason. IIUC the propagation format has changed since OpenCensus merged with OpenTelemetry. So presumably we need to have both Otel and OC interceptors active at the same time -- which is not great and just complicates the code too much.

https://xkcd.com/927/ 🤦🏽

Basically, the various components required to make this work well are still in a state of flux and require a lot more effort than anticipated to patch over the holes.

@charithe
Contributor Author

charithe commented Nov 16, 2021

Distributed traces work if the propagation format is set to W3C trace context or B3.

// W3C Trace Context
otel.SetTextMapPropagator(propagation.TraceContext{})

// B3
b3propagator := b3.New(b3.WithInjectEncoding(b3.B3MultipleHeader | b3.B3SingleHeader))
otel.SetTextMapPropagator(b3propagator)

Annoyingly, the bridge library outputs a message on startup: starting span "ExportMetrics": unsupported sampler: 0xaed6e0. This error is triggered by the OC Prometheus exporter and we can't silence it.

1: running [Created by prometheus.(*Registry).Register @ registry.go:276]
    oc2otel      tracer_start_options.go:44 StartOptions([]StartOption(#1 len=1 cap=4254356))
    internal     tracer.go:40               (*Tracer).StartSpan(*Tracer(0xc0006aa440), Context{0x1f9b500, 0xc000132000}, string(0x1bb6810, len=13), StartOption{#1, 0x207064752f313731, 0xc000858758})
    trace        trace_api.go:56            StartSpan(...)
    metricexport reader.go:191              (*Reader).ReadAndExport(*Reader(0xc00063c360), Exporter{0x1f5d6c0, 0xc000370f30})
    prometheus   prometheus.go:132          (*collector).Describe(*collector(0xc0004ac620), chan *Desc(0xc0004abc80))
    prometheus   registry.go:277            (*Registry).Register.func1()

Switching to the metrics bridge might fix this but, of course, it doesn't support the Prometheus exporter: open-telemetry/opentelemetry-go#2204

gRPC distributed tracing requires clients to use the binary propagation format from contrib because it has been temporarily removed from the main library: open-telemetry/opentelemetry-go#628

#443 adds partial support for traces with the caveats described above. As the Otel library and the ecosystem mature, we need to revisit this task and improve our integration.

  • Switch to Otel metrics when the spec matures
  • Create gRPC Otel metrics equivalent to those provided by the OC library (if not already provided by the official plugin)
  • Replace OC interceptors and handlers with equivalent Otel components
  • Add support for more Otel exporters (Cloud Trace, AWS X-Ray, Datadog etc.)

charithe removed their assignment Jun 5, 2023
charithe added a commit that referenced this issue Nov 22, 2023
Add ability to configure OTLP trace exporters with more options such as
the choice of protocols, sampler configuration, TLS settings etc.

As part of this change, the `tracing` configuration block in the Cerbos
configuration file has been completely deprecated with the aim of
removing it in the release after next. This is because of the following
reasons:
- Jaeger native protocol is no longer supported by the Otel SDK.
- The Otel specification defines standard environment variables that can
be used to configure OTLP exporters. Trying to replicate all possible
configuration options in our configuration would be brittle and just
complicate our code and documentation for not much benefit.

Fixes #1784 
Part of #341

---------

Signed-off-by: Charith Ellawala <charith@cerbos.dev>
charithe added a commit that referenced this issue Nov 23, 2023
…cs (#1887)

OpenCensus is now EOL and OpenTelemetry is stable ('ish) enough to
migrate to.

The metrics names have not changed but, due to quirks in exporters,
there could be minor breaking changes to existing dashboards with this
change.

Also adds support for pushing metrics through OTLP. 

Fixes #341

---------

Signed-off-by: Charith Ellawala <charith@cerbos.dev>