-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify the HTTP header name choice for trace #119
Comments
the recasing is interesting though odd. which product do you know of takes
lowercase and makes it uppercase?
I have seen the other way around (http2 for ex downcases) but not the other
way (upcasing). If something did, the only possibly sensible thing it could
do is uppercase the first character which is still easier to handle than
hyphens.
As this spec is only concerned with trace data if other things are recased,
it doesn't impact us.
Which tracing system are you working on by the way?
|
Personally, the no-hyphen rule bothers me too. The current draft is specifically talking about HTTP headers, where hyphens are perfectly fine. Another protocol may have a different specification. The theoretic case of some system that receives HTTP requests and translates them, say, into RabbitMQ messages by blindly copying all HTTP headers to message headers is not going to work anyway, since most common HTTP headers have hyphens, as well as headers used by existing tracing systems. Even in HTTP world it's not uncommon to have infra that doesn't allow dashes (e.g. CGI, WSGI - they will also uppercase, @adriancole), but people often assume that dash and underscore are interchangable. |
when we say something doesn't work we should qualify that.
for example tools like camel map hyphens and underscores to avoid problems
like these but they do so in ways not exactly the same as other tools. For
example this mapping is incompatible with the approach used in opentracing
jms.
to say removing hyphens doesn't help with fields required for tracing to
work is false. other tools and the heuristics they use for inconsistent
mapping can continue even if we fix things for tracing.
Removing the hyphens absolutely obviates this problem of hyphen or
underscore mapping provably. nothing has changed fact wise since we decided
this either.
|
@adriancole thank you for taking the time to respond. My interest in this is one of implementation, as I work on distributed systems and am a web framework author; and as a member of the wider standards community. The recasing to uppercase is indeed an odd idea, although there is prior art in cgi gates of old times. There is however a whole set of http frameworks that would parse http headers, turn them into an object model, and re-serialize them on the other end. It is also common for operators configuring reverse-proxy to mistype the casing. From a standards perspective, defining a header as case sensitive is not possible without redefining the semantics of HTTP itself. I woudl imagine that it would prevent a registration of the header as per BCP90 From an implementation perspective, the header itself may be comma-separated, so it could be folded into one or split in multiple headers, as per the specification. As with most things in HTTP, if it can be done it will be done, in intermediaries that the operator of a tracing system, or the author of the headers, will have no control over. RFC7231 Section 8.3.1 says on the subject
Within those constraints, I find it difficult to imagine using the http headers as one serialisation stable enough to transfer with no modification across protocols without some level of parsing, and that will mean some level of processing for things that you cannot disallow http to do, by design, like case sensitivity in header field names and the presence of multiple headers for lists. At this point, I'm not sure that the lack of hyphen looks as attractive anymore, but it's entirely likely that i do not understand the mappings between tracing systems that you refer to, and I expect other readers and implementers of specifications may stumble upon the same difficulty, hence my suggestion to extend the description of the incompatibilities between toolsets leading to the choice of the hyphen. Either way, if the goal is to have one simple mapping from a single header to whatever other system of choice exists, then lists with commas seem to prevent this from happening, as for example most json-based messaging systems will not allow the presence of duplicate keys in objects. It would probably be wiser to have a space-separated list instead, and a processing rule (as per above) when multiple are present. If the goal, on the other hand, is to define semantics that integrate well with HTTP, it may be necessary to provide the steps to generate an idiomatic single key/value pair representation, separate from the definition of the tracing elements themselves, along the lines of what was used rather successfully in RFC8288. On a side-note, my original research lead me from the pull request to the minutes that alluded to a decision made at a workshop, for which I couldn't find any minutes. This would explain why, as part of the wider spec community, the facts leading to that decision are unknown enough that I opened this issue. |
you seem to be raising many sorts of issues in the same issue comment. let
me try to figure out what you want:
1. you want us to intentionally add hyphens even though folks with practice
know this to cause issues in distributed tracing which by definition is
more than http
2. you want us to intentionally not suggest lowercase for header name
encoding
3. you want us to accept concatenated headers on things that shouldn't be.
ex right now trace headers in the wild would fail of people concatenated
them.
all of these things seem thoroughly documented by your research but also
destroy practical value.
I wouldn't want such a spec as it is worse than status quo. others can feel
free to chime in.
|
While I understand why you would want some definition power of the header's value, I don't understand the need to control its name. As @serialseb writes, proxies, caches, middleware and whatnot is going to rewrite these names regardless of what you write in this specification. Most proxy providers probably won't even know about its existence for many years to come. You can't redefine HTTP; it is well defined in RFC 7230. Also, just the idea of taking one protocol's header name and plunging it into another without any modification sounds very weird to me. The value, I can understand, but not the name. The name will have to obey the syntax of the protocol it is being used in. In binary protocol such as MQTT, the header won't even have a name – it will just be represented by a number.
If you want to make suggestions, they should at least conform with the protocol which they apply to. If you want to ensure compatibility across protocols, provide guidance for each protocol you envision that this standard is going to be used in. For most open and standard protocols, I assume you're going to have to formally register the header, in which case you will need to adhere to the syntax of each individual protocol anyway.
This is how HTTP, SMTP, POP3, IMAP and all other protocols using MIME headers work. Headers can be split, concatenated and juggled around for every hop a message takes from the sender to the receiver. The standard allows for a great deal of flexibility and you can't break this 45 years of legacy with your specification. |
@adriancole I don't want you to do anything. You're writing an open specification that, as currently written, would suffer from significant compatibility issues with HTTP. I suggested changes to either confirm better to HTTP, or alternatively, to continue in the current direction but without the compatibility problems. Either way, as is this specification wouldn't be usable on HTTP as deployed on the web, hence my feedback. If you believe it to be unhelpful, or incorrect, I'd suggest outreaching to some of the w3c resources, as they're much more authoritative than me on the question. If you wish me not to provide feedback at this stage in the standardisation, please feel free to say so, I'm in no way trying to impose. |
while you are good at citing things, several facts remain in quite
practical terms. you elude to something being unusable which is quite false
|
I am also not interested in arguing with you. Just I hope that someone here
can apply things in practical terms like for example what this
specification is actually about.
for example, we have trace identifiers in headers today. while it is spec
possible for someone to comma delimit an extra trace ID, that is malformed
in the sense of the trace ID even if not in the sense of http.
moreover while it is possible to add hyphens in field names this only
matters if you do. we can choose not to for any reason especially in
context of tracing. a different approach would need to make sense in the
sense of what we are actually doing here. notably cross systems.
finally while yes a stupid middleware could take a lowercase header name
and make it every other character capital, this is just dumb. In practical
terms the least efforts are required if we are consistent across all of our
field name instructions.. http being one of them (w3c being an arbitrary
choice vs other bodies). In practical terms implementations will want to
check for uppercase first character as that is maybe even likely. however
this doesn't mean we *must* recommend fixed name for the header.
I don't want to argue ad there is limited time and spending it in circles
isn't best for either of us. for example, I have choice to spend time
arguing about nuance without any context to what we are doing or maybe work
on something. I plead with you to concentrate on what we are doing before
offering change requests, but of course your opinions are fine even without
relevance to tracing.
|
just chiming in here: @serialseb and @asbjornu to my understanding have a very valid point here: you may be defining something that attempts to be cross-protocol, and that's fine. but it is hard to imagine that you can do so without carefully examining and respecting the rules of each individual protocol you want to cover. i haven't look at any of your other "bindings", but for HTTP, you definitely have a bit of adjustment to do. and as @asbjornu said, it is hard to imagine that it will scale to rely on blindly copying "headers" and "values" across protocols, as all of them will have different definitions of how to handle both (he alluded to MQTT, and there certainly are others). |
so in summary, my 2p is to suggest this:
* don't intentionally break consistency with non http by introducing
hyphens in the name
* don't intentionally make non http difficult by suggesting to write the
header names in mixed case
* do caveat non http reminding that mixed case is possible when not using
http2 transport even if the header names are listed as lowercase. recommend
to not rely on all lowercase and perhaps expect 1st character being
uppercase. use a real life example of something that uppercases 1st
character.
* do discuss traceparent and tracestate headers independently. traceparent
is malformed if multiple headers exist.. period. tracestate is the only one
that could possibly overrun a header though very very unlikely with 512char
limit. if it is likely cute what does this.
|
by the way we don't need to use imagination. there are definitely products
that mostly blindly copy such as apache camel. When they do this, they use
their own reformatting rules for hyphens. hyphen is the biggest problem
much bigger than case format.
openzipkin/brave#584
folks who actively disbelieve that this exists can, and they can also
subvert the design aspects that accommodate this for something that seems
more theoretically coherent with http. the side effect besides undoing the
discussions and work around this is that it makes the format less
attractive as it continues the spaghetti problem. personally I will stop
working on the spec if this happens as it would be more fruitful to work on
just b3 as we can reuse it more effectively due to reasons discussed.
your call
|
Summarising this issue, we have two items to discuss:
|
absolutely nothing keeps you from not using hyphens in your header field name. just name them as you wish.
… On Apr 30, 2018, at 10:20, Adrian Cole ***@***.***> wrote:
by the way we don't need to use imagination. there are definitely products
that mostly blindly copy such as apache camel. When they do this, they use
their own reformatting rules for hyphens. hyphen is the biggest problem
much bigger than case format.
openzipkin/brave#584
folks who actively disbelieve that this exists can, and they can also
subvert the design aspects that accommodate this for something that seems
more theoretically coherent with http. the side effect besides undoing the
discussions and work around this is that it makes the format less
attractive as it continues the spaghetti problem. personally I will stop
working on the spec if this happens as it would be more fruitful to work on
just b3 as we can reuse it more effectively due to reasons discussed.
your call
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Due to the tone of this conversation, I’m withdrawing from making further comments. |
@serialseb I'm sorry this thread came into this. I appreciate your feedback and opinion. We want to make this specification work for real life scenarios. In order to achieve interoperability it should be as strict as possible while accounting for the history and prior art. @adriancole and others working on it has a great experience building distributed tracing systems. And saw a lot of misunderstanding, lax and corner-cutting implementations of standards. Which made code a big mess of One additional feedback to what @AloisReitbauer listed I'm taking is that rationale and headers overview section should be re-worked to not make reasoning of chosen non-traditional names clearer. |
On 2018-04-30 18:25, Sergey Kanzhelev wrote:
One additional feedback to what @AloisReitbauer
<https://github.com/AloisReitbauer> listed I'm taking is that rationale
and headers overview section should be re-worked to not make reasoning
of chosen non-traditional names clearer.
it really is very simple:
- pick any name you like within the naming rules of the HTTP spec.
hyphens are a convention that you can simply and silently ignore.
- do not violate https://tools.ietf.org/html/rfc7230#section-3.2
|
@dret sure! The rationale docs and introductions is a way to be transparent of the thinking process AND more importantly -- the way to highlight non-traditional for http choice. This text makes it visible and explicit for implementors. |
We should probably re-word some stuff to make clear what is a suggestion (i.e. lowercasing - which we can't enforce since HTTP treats header names as case-insensitive, but we can encourage) vs. required (i.e. no hyphens in the name - which we are certainly allowed to do). These are oversights or artifacts of early drafts - not an intentional attempt to subvert HTTP. C'mon - nobody is attempting to redefine or break HTTP with this spec... With that out of the way: it seems to me that since the main purpose of this tracing spec is to maximize interoperability, we should focus on making decisions that promote success by default, even in the face of middleware that does strange things or tooling that is implemented without carefully read the spec. That's what the no-hyphen-header-names and lowercase-suggestion stuff is about - successful integration by default, in the wild, based on hard-earned prior experience:
However much we feel that middleware shouldn't do certain things we have empirical proof that it does happen, and will continue to happen if we make the same mistakes from the past. We might gain some moral high ground by saying "sorry your stuff is broken but we decided to go with hyphens and not encourage lowercase header names because it was more HTTP conventional - you need to fix your libraries". And that middleware (e.g. apache camel) would likely respond "cool story, but we're established and popular and not going to break our existing customers for your tracing spec". The end result is that actual users have broken tracing and nobody is happy. Hence the no-hyphen names and lowercase suggestions: success by default.
As @serialseb mentioned, in RFC 7231 Section 8.3.1, we are certainly allowed to define |
Added some clarification for the name choice in spec. Closing issue |
I know I'm very, very late to this party but I just... wat. |
I've implemented HTTP/1 & HTTP/2 clients and servers and spent a lot of time reading through the relevant RFCs/specs. Since I wrote these clients and servers, I wanted to add tracing, so I was really excited to see this spec. Well done on standardising it. But then I saw the header names and the related comments regarding hyphens and wanted to understand the rational behind such a decision. I'm curious what actually would have broken using more standard names like "Trace-Parent" and "Trace-State" which just semantically seem much more consistent with the entire HTTP model. When I read the spec and noticed the comment regarding hyphens, I wanted to know what was the justification. Reading this thread, I just don't understand why anyone could possibly map between HTTP headers and some non-HTTP key-value metadata without some kind of transformation. So the whole idea of "interoperability" just feels like a stretch without understanding the exact "hard won experience" that would lead to such a trade-off. I guess I understand the logic, but I'm also a bit disappointed as the headers feel out of place and a bit ugly/inconsistent with the rest of HTTP (which it seems this spec is mostly in relation to). I know backwards compatibility is important but I believe specs need to be forward looking. Maybe I'm too idealistic but I always believe there will be a lot more code written in the future than has been written in the past, so I personally care more about the future and less about the past. Anyway, I'm guessing that ship has sailed? |
Most of these decisions were made before my time on the working group, but I suspect the ship has indeed sailed. At least for |
I would have gone for:
Case is irrelevant (even more so with HTTP/2). |
Hi,
The following text made me rise an eyebrow, so I thought I'd give some feedback, maybe the text could clarify those points for future readers.
Connection
response header.Many thanks.
The text was updated successfully, but these errors were encountered: