tracestate response header proposal #483

dyladan · 2022-03-02T19:39:44Z

One issue that is frequently raised with response headers is the absence of a mechanism similar to tracestate which can be used to return tracing-system-specific information to the caller. As an example, in a multi-tenant tracing system it may be desirable to return the tenant ID with the trace ID and span ID so that the caller knows which tenant to use to look up that trace.

Problem with tracestate response

The primary issue preventing the implementation of a tracestate response header in the past has been the inability to reliably merge tracestate response headers. For example, if node A calls node B and node C which both return a tracestate in the response, how does node A handle conflicts between the tracestate response from B and C if the requests to B and C both return key foo with different values?
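To make the conflict concrete, here is a minimal sketch that parses two response values as key=value lists and reports keys that disagree. The helper names `parse_tracestate` and `find_conflicts` are hypothetical, and the parsing is deliberately simplified (it splits on commas and on the first `=` only); it is an illustration of the problem, not a proposed algorithm.

```python
def parse_tracestate(header):
    """Parse a comma-separated key=value list into a dict (simplified)."""
    entries = {}
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:  # skip members that have no '='
            entries[key] = value
    return entries


def find_conflicts(state_b, state_c):
    """Return keys present in both responses with different values."""
    b = parse_tracestate(state_b)
    c = parse_tracestate(state_c)
    return {k: (b[k], c[k]) for k in b if k in c and b[k] != c[k]}


# Node A receives these from B and C; 'foo' cannot be merged without a rule.
conflicts = find_conflicts("foo=1,bar=2", "foo=3,bar=2")
```

Here `conflicts` contains only `foo`, with the value from B and the value from C; node A has no basis for picking one over the other.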

Proposal 1 - Do not backpropagate tracestate

Given the above example, node A would not return any tracestate values from the responses from node B or node C. In this case, the support use case described above still works because a customer of A can include A's tracestate response in a support request. A can provide any data in the tracestate it needs in order to look up the trace (cluster id, tenant id, etc).

This is the simplest implementation because the tracestate header can be a completely opaque value. A requester would not receive any tracestate from any of its grandchildren in the trace, but it still opens the possibility for many supportability use cases. A "proxy mode" or similar would also be desired in this case in order for a load balancer, firewall, proxy, or similar system to return the tracestate returned by the target unaltered.

Proposal 2 - Treat tracestate as a set of tokens

Given the above example, treat each tracestate entry as an opaque token (e.g. a=1 is a distinct token from a=2). In this case, node A would concatenate the tracestate responses from B and C, optionally prepend its own token, and then make a pass over the full tracestate to remove duplicate tokens before including it in the response.

This is slightly more complex to implement as it requires backpropagating the tracestate from children to parent. It would also possibly grow the tracestate quite large, although reasonable limits could be applied. It also raises the question of how to order the tokens from B and C.
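The concatenate-then-deduplicate pass could look roughly like the sketch below. The function name `merge_tracestate_responses` and the `max_tokens` parameter are hypothetical. The ordering (own token first, then children in the order their responses are passed in) is only one possible answer to the ordering question above, and the default cap of 32 is an assumption standing in for the "reasonable limits" mentioned, not something this proposal specifies.

```python
def merge_tracestate_responses(child_states, own_token=None, max_tokens=32):
    """Merge child tracestate responses, treating each entry as an opaque token.

    Assumptions: entries are comma-separated; the first occurrence of a token
    wins its position; max_tokens is an illustrative cap, not a specified one.
    """
    tokens = []
    if own_token:
        tokens.append(own_token.strip())
    for state in child_states:
        for token in state.split(","):
            token = token.strip()
            if token:
                tokens.append(token)

    # Remove exact duplicates while keeping first-seen order.
    seen = set()
    merged = []
    for token in tokens:
        if token not in seen:
            seen.add(token)
            merged.append(token)

    return ",".join(merged[:max_tokens])


# B and C both return foo with different values; both tokens survive.
merged = merge_tracestate_responses(["foo=1,bar=2", "foo=3,bar=2"], own_token="a=9")
```

Because tokens are compared as whole strings, `foo=1` and `foo=3` are both kept, while the repeated `bar=2` is dropped.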


yurishkuro · 2022-03-03T14:09:33Z

It is worth pointing out that Jonathan Mace's "baggage" paper advocated for implementing the merge semantics (or option 2). However, it does feel like a higher burden on implementers (many implementations propagate the context forward as immutable). I like option 1. Maybe in the future we could introduce different levels where both options are possible but option 2 is not required.

dyladan · 2022-03-03T14:21:14Z

It wouldn't be too hard to write the spec such that we do proposal 1 for now but leave proposal 2 open for the future. For example, we could specify it as a comma-separated list but ignore all entries after the first in version 1.
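As a rough sketch of that idea, a version 1 reader could accept a comma-separated value and use only the first entry. The function name `first_tracestate_entry` is hypothetical, and the simple comma split is an assumption about the eventual grammar.

```python
def first_tracestate_entry(header):
    """Return the first comma-separated entry and ignore the rest."""
    first, _, _rest = header.partition(",")
    return first.strip()


# A later version could start reading the remaining entries without
# changing the header format.
entry = first_tracestate_entry("tenant=42, foo=1, bar=2")
```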

dyladan · 2023-05-11T13:47:25Z

I would like to revive this proposal for level 3. Specifically, I believe proposal 2 to be the more useful. I agree with Yuri that it does complicate implementation, but I believe the usefulness outweighs the added complexity. Option 2 is required if there is any middleman such as a proxy or routing service; otherwise the header will be lost.

SergeyKanzhelev · 2023-12-06T00:05:03Z

The MAY semantic on back propagation sounds like a good plan. A few thoughts:

I wonder if backpropagation opens the door to any new information exposure problems. With forward propagation, one can inject a cleanup filter on the incoming request that ensures the tracestate is cleaned up before any outgoing calls. With back propagation, it is unclear at what stage this cleanup filter must be set. Do all frameworks support this type of callback to clean up the whole tracestate on the response?
What to do with async calls? If a request initiates some background actions, does the framework need to collect all tracestates, even though none of them will ever be used? With forward propagation, it is a constant set of headers that will be stored and sent to background tasks forever. With back propagation, the tracestate will potentially grow with no use.
