tracestate response header proposal #483

Open
dyladan opened this issue Mar 2, 2022 · 4 comments

Comments

@dyladan
Member

dyladan commented Mar 2, 2022

One issue that is frequently raised with response headers is the absence of a mechanism similar to tracestate which can be used to return tracing-system-specific information to the caller. As an example, in a multi-tenant tracing system it may be desirable to return the tenant ID with the trace ID and span ID so that the caller knows which tenant to use to look up that trace.

Problem with tracestate response

The primary issue preventing the implementation of a tracestate response header in the past has been the inability to reliably merge tracestate response headers. For example, if node A calls node B and node C which both return a tracestate in the response, how does node A handle conflicts between the tracestate response from B and C if the requests to B and C both return key foo with different values?
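To make the conflict concrete, here is a minimal sketch that groups values by key across two responses. It assumes the response header would reuse the comma-separated `key=value` syntax of the request `tracestate`; the helper names `values_by_key` and `conflicts` are hypothetical and not part of any spec.

```python
def values_by_key(*headers: str) -> dict:
    """Collect the distinct values seen for each key across several headers."""
    seen = {}
    for header in headers:
        for member in header.split(","):
            member = member.strip()
            if not member:
                continue
            key, _, value = member.partition("=")
            values = seen.setdefault(key, [])
            if value not in values:
                values.append(value)
    return seen


def conflicts(*headers: str) -> dict:
    """Return only the keys that arrived with more than one distinct value."""
    return {k: v for k, v in values_by_key(*headers).items() if len(v) > 1}


from_b = "foo=1,bar=x"  # hypothetical response from node B
from_c = "foo=2,bar=x"  # hypothetical response from node C
print(conflicts(from_b, from_c))  # {'foo': ['1', '2']}
```

A key-based merge has no obvious way to pick between `foo=1` and `foo=2`, which is the problem both proposals below try to sidestep.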

Proposal 1 - Do not backpropagate tracestate

Given the above example, node A would not return any tracestate values from the responses from node B or node C. In this case, the support use case described above still works because a customer of A can include A's tracestate response in a support request. A can provide any data in the tracestate it needs in order to look up the trace (cluster id, tenant id, etc).

This is the simplest implementation because the tracestate header can be a completely opaque value. A requester would not receive any tracestate from any of its grandchildren in the trace, but it still opens the possibility for many supportability use cases. A "proxy mode" or similar would also be desired in this case in order for a load balancer, firewall, proxy, or similar system to return the tracestate returned by the target unaltered.
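A rough sketch of what proposal 1 could look like in a node's response handling. The function name, the `proxy_mode` flag, and the single-target assumption for proxies are all illustrative; none of this is specified anywhere.

```python
from typing import Optional


def response_tracestate(
    own: Optional[str],
    target_response: Optional[str] = None,
    proxy_mode: bool = False,
) -> Optional[str]:
    """Choose the tracestate value to put on this node's response.

    Assumption: a node in proxy mode has exactly one target and forwards
    that target's value untouched. Any other node returns only its own
    opaque value and drops whatever its children sent back.
    """
    if proxy_mode:
        return target_response
    return own


# Node A called B; B's value is not backpropagated.
print(response_tracestate("tenant=42", target_response="foo=1"))  # tenant=42

# A load balancer in front of A passes A's value through unaltered.
print(response_tracestate(None, target_response="tenant=42", proxy_mode=True))  # tenant=42
```

Because the value is never parsed, it can stay fully opaque to every node except the one that produced it.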

Proposal 2 - Treat tracestate as a set of tokens

Given the above example, treat each tracestate entry as an opaque token (e.g. a=1 is a distinct token from a=2). In this case, node A would concatenate the tracestate responses from B and C, optionally prepend its own token, and then make a pass through the full tracestate to remove all duplicate tokens before including it in the response.

This is slightly more complex to implement as it requires backpropagating the tracestate from children to parent. It would also possibly grow the tracestate quite large, although reasonable limits could be applied. It also raises the question of how to order the tokens from B and C.
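A minimal sketch of the concatenate-then-deduplicate pass. Two things here are assumptions rather than part of the proposal: children are merged in the order their responses are passed in (the ordering question is still open), and the cap of 32 tokens is borrowed from the list-member limit on the request `tracestate` header as one example of a "reasonable limit".

```python
from typing import Iterable, Optional


def merge_tokens(
    child_responses: Iterable[str],
    own_token: Optional[str] = None,
    max_tokens: int = 32,  # assumed limit, not defined by the proposal
) -> str:
    """Merge response tracestates as a set of opaque tokens."""
    tokens = []
    if own_token:
        tokens.append(own_token)  # own token goes first
    for header in child_responses:
        for token in header.split(","):
            token = token.strip()
            if token:
                tokens.append(token)
    # dict.fromkeys keeps the first occurrence of each token, in order.
    deduped = list(dict.fromkeys(tokens))
    return ",".join(deduped[:max_tokens])


merged = merge_tokens(["foo=1,bar=x", "foo=2,bar=x"], own_token="nodeA=t1")
print(merged)  # nodeA=t1,foo=1,bar=x,foo=2
```

Note that `foo=1` and `foo=2` both survive because they are different tokens, while the duplicate `bar=x` is dropped.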

@yurishkuro
Member

It is worth pointing out that Jonathan Mace's "baggage" paper advocated for implementing the merge semantics (or option 2). However, it does feel like a higher burden on implementers (many implementations propagate the context forward as immutable). I like option 1. Maybe in the future we could introduce different levels where both options are possible but option 2 is not required.

@dyladan
Member Author

dyladan commented Mar 3, 2022

It wouldn't be too hard to write the spec such that we do proposal 1 for now but leave proposal 2 open for the future. For example, we could specify it as a comma-separated list but ignore all entries after the first in version 1.
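As a sketch of that forward-compatible parsing rule, a version 1 receiver could read a list but keep only the first entry. The function name is hypothetical, and skipping empty members is an assumption about how stray commas would be handled.

```python
from typing import Optional


def first_entry(header: str) -> Optional[str]:
    """Return the first non-empty list entry and ignore everything after it."""
    for member in header.split(","):
        member = member.strip()
        if member:
            return member
    return None


print(first_entry("tenant=42"))               # tenant=42
print(first_entry("tenant=42,foo=1,foo=2"))   # tenant=42
```

A later version could then start honoring the remaining entries without changing the header syntax.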

@dyladan
Member Author

dyladan commented May 11, 2023

I would like to revive this proposal for level 3. Specifically, I believe proposal 2 to be the more useful. I agree with Yuri that it does complicate implementation, but I believe the usefulness outweighs the added complexity. Option 2 is required if there is any middleman such as a proxy or routing service; otherwise the header will be lost.

@SergeyKanzhelev
Member

The MAY semantic on back propagation sounds like a good plan. A few thoughts:

  1. I wonder if backpropagation opens the door to any new information exposure problems. With forward propagation, one can inject a cleanup filter on the incoming request that ensures the tracestate is cleaned up before any outgoing calls. With back propagation, it is unclear at what stage this cleanup filter must be set. Do all frameworks support this type of callback to clean up the whole tracestate on the response?
  2. What to do with async calls? If a request initiates some background actions, does the framework need to collect all tracestates, even though none of them will ever be used? With forward propagation, it is a constant set of headers that will be stored and sent to background tasks forever. With back propagation, the tracestate will potentially grow with no use.
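For the first point, a response-side cleanup filter might look roughly like the sketch below. It assumes the response value is a comma-separated list of `key=value` entries and that the service keeps an allowlist of keys it is willing to expose; the function name and the allowlist are hypothetical, and where a framework would invoke such a hook is exactly the open question.

```python
def scrub_response_tracestate(header: str, allowed_keys: set) -> str:
    """Drop every entry whose key is not explicitly allowed to leave the service."""
    kept = []
    for member in header.split(","):
        member = member.strip()
        key = member.partition("=")[0]
        if member and key in allowed_keys:
            kept.append(member)
    return ",".join(kept)


print(scrub_response_tracestate("tenant=42, internal=secret", {"tenant"}))  # tenant=42
```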
