Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support B3 headers in OTEL propagation #725

Closed
treyhyde opened this issue Mar 12, 2021 · 16 comments · Fixed by #1546
Closed

Support B3 headers in OTEL propagation #725

treyhyde opened this issue Mar 12, 2021 · 16 comments · Fixed by #1546
Labels
area/data-plane good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature-request lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@treyhyde
Copy link

Problem

Tracing is instrumental in supported distributed systems such as Knative Eventing, Kafka, etc. Many systems (including Istio), by default use B3 (multi) headers. Kafka Broker appears to be configured for only W3C headers. Traces involving Knative Eventing stop when it hits the broker or only start after messages hit other systems, you don't get an end-to-end trace when using B3

Persona:
Which persona is this feature for?

  • Producer
  • Consumer

Exit Criteria
Kafka Broker propagates B3 headers (in addition to W3C)

Time Estimate (optional):
How many developer-days do you think this may take to resolve?

Additional context (optional)
Add any other context about the feature request here.

@treyhyde
Copy link
Author

treyhyde commented Mar 12, 2021

FYI, I added

- name: OTEL_PROPAGATORS
   value: b3

to the receiver and dispatcher deployments to no effect.

@knative-prow-robot
Copy link
Contributor

@pierDipi:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

Thanks for reporting!

We should be able to easily support it by injecting a propagator in https://github.com/knative-sandbox/eventing-kafka-broker/blob/67b507aeee9f3f88864a98f2be2dc56194051e71/data-plane/core/src/main/java/dev/knative/eventing/kafka/broker/core/tracing/OpenTelemetryTracer.java#L48
and use it in place of:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Mar 14, 2021
@arghya88
Copy link
Contributor

/assign

@slinkydeveloper
Copy link
Contributor

@arghya88 No need to fix it here, we're planning to use the new module directly from vertx eclipse-vertx/vertx-tracing#22
We'll just need to configure the tracer when using the new module

@slinkydeveloper
Copy link
Contributor

Now that this is merged eclipse-vertx/vertx-tracing#22, we're going to implement this as soon as we update to vert.x 4.1 #784

@slinkydeveloper
Copy link
Contributor

slinkydeveloper commented May 18, 2021

Just an heads up here. Once we merge #784 (hopefully soon), we need to import the sdk extensions and add this trace propagator: https://github.com/open-telemetry/opentelemetry-java/blob/50408d499f85d5761d0a5ed9bf9d77d5ff01fff5/extensions/trace-propagators/src/main/java/io/opentelemetry/extension/trace/propagation/B3Propagator.java

@skonto
Copy link

skonto commented Jul 1, 2021

What is the support plan for the OTEL stuff at the vertx side long term? We heavily depend on vertx for most stuff, should not we use the otel java lib (directly)? Also should not we add this to the effort for migrating Eventing to OTEL?

@slinkydeveloper
Copy link
Contributor

What is the support plan for the OTEL stuff at the vertx side long term?

Vert.x is going to support the vertx-opentelemetry integration, that is the implementation of the Tracing SPI, the rest is on us 😄

We heavily depend on vertx for most stuff, should not we use the otel java lib (directly)?

Why should we? The vertx-opentelemetry project is a contribution back to them started from this project, where we can help the greater vert.x community to use opentelemetry, and we can also profit back. There's no reason, nor any need, for us to do a custom implementation of the Tracing SPI. We use today vertx-opentelemetry, it works fine and fits our use case, so I don't see any need to change it.

Also should not we add this to the effort for migrating Eventing to OTEL?

For this specific eventing-kafka-broker project, there isn't anything left to do about this. Except perhaps this issue, which is not really related to otel but in general to support more tracing functionalities.

@skonto
Copy link

skonto commented Jul 1, 2021

Vert.x is going to support the vertx-opentelemetry integration
There's no reason, nor any need, for us to do a custom implementation of the Tracing SPI.

I am referring to the first part. We should minimize deps (if possible) and from what I see across all Knative components there are different client libs, which makes integration life and debugging (I care about this) worse. From what I see we implemented the OTEL support in Vert.x (which uses otel java lib if not mistaken anyway) that is why I am asking about it :)

@slinkydeveloper
Copy link
Contributor

Honestly, I don't get what are you referring to 😄 The opentelemetry library we use is the official one from the opentelemetry team.

What we do in this project, data plane wise, has nothing to do with the rest of knative, since we use Java, so we obviously have a different stack with different libs etc. In other words, if we would use opencensus here, nothing would change anyway because the java opencensus library is different from the golang opencensus library.

We should minimize deps (if possible)

I disagree, when talking about official vertx dependencies of the vertx stack. Those are well tested and fits together with vertx-core, so everything in there is good for us to use.

@slinkydeveloper
Copy link
Contributor

@slinkydeveloper
Copy link
Contributor

@treyhyde your goal is to emit b3 headers, or you just want the broker to recognize them and then still emit tracecontext?

@slinkydeveloper
Copy link
Contributor

/assign

@treyhyde
Copy link
Author

treyhyde commented Jul 7, 2021

@treyhyde your goal is to emit b3 headers, or you just want the broker to recognize them and then still emit tracecontext?

The basic goal to propagate all tracing headers so we have basic insights into how the app is working. B3 is the defacto standard ... still (ignoring ACTUAL standards) as it is the default in sooo many places due to its long tenure. If I pass B3 I expect B3 on the other side. If I pass W3C, I expect W3C on the other side. In the same deployment I could easily see different apps using different header types and I would want all of them to work. (though, in this case, we use b3 universally since that is the Istio default). We switched to the kafka-channel with the standard broker because this worked out of the box.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 6, 2021

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2021
@pierDipi pierDipi added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/data-plane good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature-request lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
None yet
6 participants