Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributed tracing: Producer #1278

Closed

Conversation

CodeBlanch
Copy link

For consideration...

This PR adds a System.Diagnostics.DiagnosticSource and Start/Stop/Exception events for the Producer side of things.

Half of #1269. The other half (Consumer) will be done on a follow-up PR if the team decides to move forward with this.

/cc @SergeyKanzhelev @MikeGoldsmith @cijothomas

@ghost
Copy link

ghost commented May 9, 2020

It looks like @CodeBlanch hasn't signed our Contributor License Agreement, yet.

The purpose of a CLA is to ensure that the guardian of a project's outputs has the necessary ownership or grants of rights over all contributions to allow them to distribute under the chosen licence.
Wikipedia

You can read and sign our full Contributor License Agreement here.

Once you've signed reply with [clabot:check] to prove it.

Appreciation of efforts,

clabot

@CodeBlanch
Copy link
Author

[clabot:check]

@ghost
Copy link

ghost commented May 9, 2020

@confluentinc It looks like @CodeBlanch just signed our Contributor License Agreement. 👍

Always at your service,

clabot

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Activity Start<TKey, TValue>(string topic, Message<TKey, TValue> message)
{
if (DiagnosticSource.IsEnabled(ActivityName, topic))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new recommendation from .NET would be to use ActivitySource and StartActivity, from System.Diagnostics.DiagnosticSource package 5.0.0.
https://github.com/dotnet/designs/pull/98/files#diff-25bba2a21bc64ddeccd374c1c28a0fefR24

DiagnosticSource package supporting the new pattern is releasing stable in Nov 2020, but previews will be available this week onwards.

Should we wait for new diagnosticsource to be available, and then instrument using the new approach?
If instrumented with new approach, then there wouldn't be any need of adapters in OpenTelemetry.
@reyang also to share thoughts on this.


if (DiagnosticSource.IsEnabled(StartName))
{
DiagnosticSource.Write(StartName, new { Topic = topic, Message = message });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of writing the new anonymous type, (thus requiring adapters to have knowledge about this type etc), the new recommendation would be to use activity.AddTag("messaging.system", "kafka")

OpenTelemetry has conventions for the tag names, which can be used:
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md

Headers headers = message.Headers ?? new Headers();
Headers headers = message.Headers == null
? message.Headers = new Headers()
: message.Headers;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep changes focused to only those required (particularly where they're opinionated like this one, but also white space). this is important for navigating the code later in understanding what was changed when (git blame).

re: white space, it's good to remove it, yes, but i think the above consideration wins out, so i think the policy is to fix this only when making changes for some other reason.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually a logical difference, but you have to look really closely to see it 🔍 The previous version sets local Headers headers variable. The new version sets local Headers headers variable AND sets message.Headers prop to the same. That was needed for something else on the PR, but I don't remember exactly what at the moment.

@mhowlett
Copy link
Contributor

this seems pretty substantial and i'm too stretched to action at the moment - but thanks for the contribution, we will get to it.

@fhubi
Copy link

fhubi commented Nov 16, 2020

Hi,

What is the plan with this PR?
@CodeBlanch do you plan to change the implementation using ActivitySource?

@CodeBlanch
Copy link
Author

I'm happy to update it. @mhowlett Would that be helpful?

@fhubi
Copy link

fhubi commented Nov 30, 2020

@mhowlett @CodeBlanch Any news on this?

@mhowlett
Copy link
Contributor

there are design discussions related to tracing going on in confluent more broadly atm. what we do with this client will depend, in part, on that. also, it will take some time to review (and develop opinions on - this area is not something I know a lot about). it's something we want, but timing isn't right yet.

@cvetomir-todorov
Copy link

@mhowlett It's been 5 months, is there any update on whether and how Kafka is going to support distributed tracing?

@ElectricVampire
Copy link

Is this something still under consideration?

@mhowlett
Copy link
Contributor

mhowlett commented May 5, 2021

yes, just prioritization of effort

@CodeBlanch CodeBlanch closed this May 5, 2021
@CodeBlanch
Copy link
Author

Closing PR because a lot has changed with OpenTelemetry since this was done. @mhowlett I can redo using ActivitySource if you would like, just let me know. For effort just FYI, producing is relatively easy. Conceptually you add the propagation headers, trace state, maybe baggage to the message. Consuming is a bit harder. Ideally, you want to read a message, start/continue trace using the propagation headers, hand message to user, let user do their thing, and THEN end the trace when user is done with the message. That last part is the tricky bit but without it the trace won't really reflect the time to process the message, only time to read off the topic and hand it back to the user.

@mhowlett
Copy link
Contributor

mhowlett commented May 5, 2021

thanks for the commentary - we will remember this issue is here when it comes time to implement.

apologies we are too time constrained to veer off what's prioritized for bigger community feature PRs ... tracing is a topic of internal consideration though, so it is likely we will get back to this at some point.

@Juandavi1
Copy link

@mhowlett It's been 2 years, is there any update on whether and how Kafka is going to support distributed tracing?

@cvetomir-todorov
Copy link

@Juandavi1 I ended up with a custom, limited implementation: https://github.com/cvetomir-todorov/CecoChat/tree/dev/source/CecoChat/Kafka. Sure, it's far from ideal (to say the least...), but it works for the things I want to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants