-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why should keys be restricted to lowercase ASCII? #508
Comments
Thanks for the review and for filing this issue. We discussed this in the Distributed Tracing working group meeting today and below captures the summary from that discussion. Since this is Level 2 of the spec, the primary reason is back compatibility with existing Level 1 standard. Some of the original reasons (from Level 1) for restricting the keys and values:
We will update the specification to include the above rationale. At this point, the main reason to not revisit this decision is backwards compatibility since Level 1 is already a standard. |
I think these reasons are questionable. The ASCII comma is still a comma. Disallowing a small list of characters (comma, space, etc.) because they are syntax-meaningful would allow implementations to split on commas. There is no need to introduce escapes unless you want to allow restricted characters (comma, space) into the namespace.
I can't find ANY record of I18N having reviewed trace-context 1. In fact, I see your transreq (w3c/transitions#133) does not mention us--only security and privacy. As you might imagine, I am unhappy to see you relying on back compat if we were not afforded a look at your first edition. @plehegar was there a process failure here or did I overlook a review somehow? @r12a for visibility. |
#515 is an attempt to improve that partial aspect by providing the reason to restrict the allowed characters. You have a point that the original design choice for this restriction could have been different. I also understand your unhappiness of level 1 not having gone through an i18n review (if that is what happened). I cannot speak on @plehegar behalf so I will let him comment on that. But even if there was a process failure for level 1, the fact remains that there are implementations out there which we would break if we changed this now. We are adamant that we cannot break existing implementations by allowing more characters in |
I wonder if w3c/i18n-request#108 was meant to be the i18n review request for trace context? It fails to mention which spec it is for. But I'm guessing wildly here, nevermind. |
@basti1302 I don't know how the process failed us here. I think I see a gap in our tooling and I'll take it up with @plehegar and company to help prevent repeats. Regarding this spec/issue, I understand your commitment to backwards compatibility: it's important. Regarding #515, that's the right intention, but the logic of that statement is flawed. It says:
A string consisting of Unicode code points or a bytestring consisting of UTF-8 can still be parsed on commas and equal signs, since those are still characters--with the same byte values even (in the case of UTF-8). Consider the string:
Notice that 0x3D ( I would tend to say instead:
( I added a review just now to #515 that has a different wording suggestion. Use that one) I suppose we could introduce a way to smuggle non-ASCII characters into the values (Punycode [RFC3492] looks like about the only commonly available option compatible with the character set you have), but given the nature and probable uses of tracestate, I think you might as well skip it for now. For future reference: restricted-to-ASCII mechanisms have a long history of coming back to bite standardizers/implementers. The existence of Punycode to support IDNs is just one example of inventing convoluted schemes for getting Unicode code points into protocols or formats after the fact. But v1's day is done. If you fix your note above I think you can close this issue. |
Closing this as satisfied as discussed in #515 (comment) |
See discussion in PR. I'm satisfied by this outcome. |
3.3.2.2.1 Key
https://www.w3.org/TR/trace-context-2/#key
The definitions here should be more rigorously laid out. The
key
is part of thetracestate
header's value and is described as "...an identifier that describes the vendor". The identifier is restricted to lowercase ASCII letters, digits, and the punctuation shown above in the quote.Allowing keys/values that are non-ASCII is one possibility here, since vendors presumably can be named anything. Restricting the value to an ASCII token is understandable because of the need for encoding efficiency, but the text should be clearer about the reasoning behind the ASCII restriction for the key.
Note that key names like this are sometimes machine generated and a restriction to ASCII can be irksome when non-ASCII Unicode characters might be present in the data. The pre-computed portions probably should be well-known (ASCII !) tokens, but it's unclear why the names are otherwise restricted.
The text was updated successfully, but these errors were encountered: