Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify uniqueness of tracestate keys as the sender's responsibility #477

Merged
merged 4 commits into from
Jan 4, 2022

Conversation

basti1302
Copy link
Contributor

@basti1302 basti1302 commented Nov 11, 2021

This partially resolves #446.

This adds a new rule to "Mutating the tracestate Field" that makes it
clear that a sender must not create duplicated tracestate keys. It is a
non-breaking change because the section "Combined Header Value" already
had called that out with "Only one entry per key is allowed". Basically
this adds that rule to the section it actually belongs into.

Having that as the responsibility of the sender seems to be consensus in #446.

There seems to be no consensus yet whether an implementation should
clean up a tracestate value it receives by discarding duplicate keys
that it does not own. Thus, that aspect is not considered in this
change.

Update: Consensus has been reached that an implementation MAY discard duplicate key but is not obliged to. This aspect has been added to this change as well now.

Finally, this change fixes the test suite that was too strict and demanded that the whole tracestate should be dropped when duplicated keys are present. Thus, this PR fixes #369, too.


Preview | Diff

@@ -297,7 +297,7 @@ The `tracestate` value is the concatenation of trace graph key/value pairs

Example: `vendorname1=opaqueValue1,vendorname2=opaqueValue2`

Only one entry per key is allowed because the entry represents that last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"because the entry represents that last position in the trace." had been called out here by @yurishkuro as not making a lot of sense, and I agree. Hence, I removed that part.

Also, I think update could be a better phrase instead of overwrite because we also use that in the section "Mutating the traceparent Field".

Finally "upon reentry to their tracing system" is really unwieldy, IMO.

@@ -353,6 +355,6 @@ Vendors receiving a `tracestate` request header MUST send it to outgoing request

Following are allowed mutations:

- **Add a new key/value pair**. The new key/value pair SHOULD be added to the beginning of the list.
- **Add a new key/value pair**. The new key/value pair SHOULD be added to the beginning of the list. Adding a key/value pair MUST NOT result in the same key being present multiple times.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This had been called out in "Combined Header Value" already, but it actually belong here, IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this formulated. It eliminates concern that if platform doesn't deal with tracestate keys, just propagates it as is - it is not responsible for the duplicates. We may want to spell it out explicitly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this part of my PR comment:

There seems to be no consensus yet whether an implementation should
clean up a tracestate value it receives by discarding duplicate keys
that it does not own. Thus, that aspect is not considered in this
change.

And my message here #446 (comment)

TL;DR: I agree with you that this is how it should work (vendors should not need to care about foreign keys at all) but I am not sure if this is consensus, so for now I was reluctant to add this aspect to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a tracing system should be responsible for deduping their own keys and shouldn't need to analyze or dedupe other keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added that aspect as well now: ccbabd6

@@ -297,7 +297,7 @@ The `tracestate` value is the concatenation of trace graph key/value pairs

Example: `vendorname1=opaqueValue1,vendorname2=opaqueValue2`

Only one entry per key is allowed because the entry represents that last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system.
Only one entry per key is allowed. Hence vendors MUST update their entry instead of adding it twice when propagating a modfied `tracestate` header downstream.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • the word "hence" not needed
  • instead of speaking of updating/adding, I would rather say vendors MUST ensure that the entry with their vendor key is only present once in the header sent downstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to just say that vendor is responsible for not duplicating keys it owns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that is exactly what I have added down in the "Mutating the traceparent Field" section -> https://github.com/w3c/trace-context/pull/477/files#diff-3832602716774745d030f22592f4c5b7d8f316e4d44be0165282244f9d6f14d9R358

Do we just want to duplicate that phrasing here? Seems redundant. How about doing away with the second sentence entirely? So just "Only one entry per key is allowed." and that's it. We also have the cross-reference

See Mutating the tracestate Field for details.

so I think spelling out specifics here too is not required, the actual normative rules belong in the "Mutating" section and are already phrased as you requested.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, perhaps just delete the entire phrase including "Only one entry per key is allowed."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the first sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the first sentence, it leads nicely into the following example, but I removed the duplication about not adding your own header twice.

@kalyanaj kalyanaj linked an issue Dec 7, 2021 that may be closed by this pull request
- **Update an existing value**. The value for any given key can be updated. Modified keys SHOULD be moved to the beginning (left) of the list.
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables two scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s.
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables three scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s. Finally, vendors MAY also discard duplicate keys that were not generated by them, however, they are not obliged to do that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables three scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s. Finally, vendors MAY also discard duplicate keys that were not generated by them, however, they are not obliged to do that.
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables three scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s. The third scenario is vendors MAY discard duplicate keys that were not generated by them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The third scenario is vendors MAY discard" sounds a bit awkward IMO. I would prefer the current phrasing, to be honest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current phrasing of "Finally, vendors..." is fine with me. However, the phrase ", however, they are not obliged to do that" feels redundant since it is conveyed by the MAY semantics in the previous sentence, so it feels we can remove that part to make it tighter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I have removed that part.

@basti1302
Copy link
Contributor Author

I have added another commit that fixes the broken test case that originally sparked this discussion in #369.

This partially resolves w3c#446.

This adds a new rule to "Mutating the tracestate Field" that makes it
clear that a sender must not create duplicated tracestate keys. It is a
non-breaking change because the section "Combined Header Value" already
had called that out with "Only one entry per key is allowed". Basically
this adds that rule to the section it actually belongs into.

Having that as the responsibility of the sender seems to be consensus in

There seems to be no consensus yet whether an implementation should
clean up a tracestate value it receives by discarding duplicate keys
that it does not own. Thus, that aspect is not considered in this
change.
This is already called out in the section
"Mutating the traceparent Field".
The test_tracestate_duplicated_keys was broken, it demanded to drop
the whole tracestate when a duplicate key was found. This does not match
what the specification mandates in that situation. Instead,
implementations MAY remove the duplicates from tracestate or even leave
them in as-is.

fixes w3c#369
@basti1302
Copy link
Contributor Author

@dyladan @yurishkuro @SergeyKanzhelev I think all comments have been addressed by now and the PR also includes a fix for the test case related to duplicate keys in the tracestate header. Could you please re-review?

Only one entry per key is allowed because the entry represents that last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system.

For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the `tracestate` value would not be:
Only one entry per key is allowed. For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the `tracestate` value would not be:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is allowed" is not a normative language. I would change it to "There SHOULD be only one entry per key".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normative language around this is in the section "Mutating the tracestate Field":

Adding a key/value pair MUST NOT result in the same key being present multiple times.

I would restrict the normative description of that aspect to the "Mutating" section because it describes what an implementation is allowed to do with the header and what not. I do not see any added benefit by duplicating this aspect in normative language here. "Only one entry per key is allowed." is given in the context of the example (see the line above) and in my understanding, examples are never normative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify tracestate keys MUST be unique Clarify handling of duplicate keys in tracestate
6 participants