Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add consumer_id to identify the consumer #1810

Merged

Conversation

kenfinnigan
Copy link
Member

Resolves #1796

Changes

Adds consumer_id as a label for messaging spans to, as best as possible, uniquely identify the consumer of a message.

This is especially beneficial when the same message is "replayed" to different consumers, ensuring the spans are differentiated between the usages.

@@ -57,6 +57,14 @@ groups:
The [conversation ID](#conversations) identifying the conversation to which the message belongs,
represented as a string. Sometimes called "Correlation ID".
examples: 'MyConversationId'
- id: consumer_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be expanded to client_id to include producerId as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should consider that we currently have the used queue/topic/... as destination/destination_kind. Which also is not named perfectly, but the other way round.
I think client_id would make sense here, even if it only applicable for consumers in Kafka. But no strong opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, the description should be updated accordingly to also allow a "sender ID".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a "sender ID" provide any differentiation that isn't already present?

My understanding is a single message can only be sent from a single sender or producer, which enables the process detail attributes to be used for uniquely identifying where the message came from.

With consumption, the same message can be consumed by many different consumers, either in near real-time or significantly delayed. It's for this reason I'd only focused on the consumption side for a unique identifier as it's possible to use the same process/service to consume the message and they are identified differently in terms of Kafka, or other brokers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which enables the process detail attributes to be used for uniquely identifying where the message came from.

That makes sense, but doesn't the same apply, conceptually, to the consumer side? Or is it more of a "consumption" ID than a "consumer" ID?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently in the general messaging semantic conventions that apply to both consumers and producers. But I assume that consumer_id only applies to consumer spans, not producer spans. So I'd put in the messaging.consumer semantic convention group, a bit down below in the YAML file (https://github.com/open-telemetry/opentelemetry-specification/pull/1810/files#diff-6ea24809c25647f7e2f067dfc8fa080983ab936471e9eda83b26f190181bdeb4R104)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I see what you mean now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the attribute change to id, or it stays what it currently is and becomes messaging.consumer.consumer_id?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've figured it out, so will update and get feedback.

Thanks for all the pointers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The messaging.consumer semantic convention still only has "messaging." as prefix, so the final string to use as key in the semantic conventions will still be just "messaging.consumer_id".

@Oberon00
Copy link
Member

@kenfinnigan Please also add a Changelog entry for the new attribute.

@kenfinnigan
Copy link
Member Author

@Oberon00 I will update the changelog, but I'm not sure what to do regarding the suggested change to client_id as there's already approval for the change "as is"?

@Oberon00
Copy link
Member

as there's already approval for the change "as is"?

There is only one approval yet, it's no big deal if you still change it (if you agree with @iNikem's and my comments).

Just re-request the review from @carlosalberto afterwards.

@kenfinnigan
Copy link
Member Author

I think this is good, but would appreciate a re-review to confirm

type: string
brief: >
The identifier for the consumer receiving a message. For Kafka, set it to
`{messaging.kafka.consumer_group} - {messaging.kafka.client_id}`, if both are present, or only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing I'm not sure about in this PR is this format. Isn't there some standard "fully qualified client ID" format for Kafka?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so, as https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-properties-client-id.html describes how it's an optional identifier passed to the broker, so it could be anything

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the combined format, where we now put " - " in-between. I wondered if there is some existing notation to combine consumer group and client_id into a single identifier string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I've found unfortunately

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are client_ids unique across consumer groups? Maybe it would be better then to just use the client_id here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They must be unique within a Kafka Broker, but they're not required. It's possible for a consumer to only have the consumer group id set and not the client id.

@kenfinnigan
Copy link
Member Author

Any other comments on this, or is it good?

@kenfinnigan
Copy link
Member Author

Any feedback/comments? I'd like to have this merged soon if possible so I can implement it in our frameworks

@kenfinnigan
Copy link
Member Author

Not sure why these checks failed, it almost appears they didn't even run.

Is there a problem with CI checks at present?

@Oberon00
Copy link
Member

Oberon00 commented Aug 5, 2021

I re-ran the CI.

@kenfinnigan
Copy link
Member Author

Thanks @Oberon00, seems ok now

@kenfinnigan
Copy link
Member Author

Is there additional feedback on this?

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 14, 2021
@kenfinnigan
Copy link
Member Author

Any additional concerns with this, or can it be merged?

@github-actions github-actions bot removed the Stale label Aug 17, 2021
@kenfinnigan
Copy link
Member Author

@Oberon00 Is this ok to be merged or is there more needed?

@kenfinnigan
Copy link
Member Author

I'd like to see this completed soon, what are the next steps for this to happen?

@Oberon00
Copy link
Member

I can't merge it. There needs to be one more approval and then a maintainer (= TC member) needs to merge it.

@kenfinnigan
Copy link
Member Author

@Oberon00 Ok. What can I do to help that process along?

@carlosalberto carlosalberto merged commit 028ac12 into open-telemetry:main Aug 23, 2021
@kenfinnigan
Copy link
Member Author

Thanks @carlosalberto and @Oberon00!

arminru added a commit that referenced this pull request Aug 24, 2021
arminru added a commit that referenced this pull request Aug 24, 2021
@kenfinnigan kenfinnigan deleted the trace-messaging-consumer-id branch September 13, 2021 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unique consumer identifier span attribute for messaging
5 participants