[bug] Client x-request-id Mangled by RequestIdExtension::setTraceStatus #11532
Comments
Yes agreed this seems like a bug. cc @euroelessar who has worked on this code recently and has more context. |
What would be the expected behavior? Envoy uses the 14th byte to store a tracing decision, so the alternatives are:
Thoughts? |
Thanks for looking at this so quickly! If the goal of x-request-id is correlation across two or more calls (for either unified logging or tracing) then I would argue stability of that ID is paramount. To that end, 1) would be my preference. As a compromise, you could allow users to opt into that with a config-flag:
This would at least keep the default consistent with current behavior and require a user to make an explicit decision on 'id stability' vs 'tracing status propagation'. If the underlying issue is packing the trace status in an unrelated field, then the cleanest/most consistent solution would be 3): pack the tracing decision in a dedicated x-header. Something like: (one of)
Despite it being cleaner, I don't love the idea of adding a new x-tracing-decision header on all requests just to satisfy the 0.01% case. I imagine the getTracingStatus() logic during would end up being pretty complicated too. I don't understand the implications of non/propagation of that status and would value your recommendation on that. |
+1 to allow the behavior to be configurable, -1 to a new header. Context propagation is a disaster and I would rather not make it more complicated for the common case as @cbrisket mentions. |
@mattklein123 @euroelessar +1 on the opt-out config option. Would it be possible to include this in the upcoming Envoy release? We are moving away from our existing edge router to Envoy, and some business tools are breaking because the value is being altered. Even when set to true, preserve_external_request_id does not seem to prevent modification of the header value. The 14th byte of the UUID changes no matter what. |
I would be very happy to have a different built-in implementation which does something more basic, since this comes up quite often, but someone will have to implement it. @euroelessar if your internal impl is this basic thing perhaps you can just upstream it? |
Sounds good, let me have a look at it |
I created an issue a while ago pointing out the same problem: #13774 |
Hey @euroelessar, wanted to know if you have been able to make progress on a fix for this? |
I would also argue for not modifying x-b3-traceid, when it is provided by clients. When traces are stored in the same place across multiple systems, this breaks trace analysis tools - they group spans by x-b3-traceid, and that header's value changes for requests which should belong to the same trace. |
Hey @euroelessar @mattklein123, is this something we can expect in the upcoming release? We are trying to plan our course of action and would appreciate any update. |
I don't think anyone is working on this currently. I will try to get this done since it's a frequent request. |
Just wanted to note that at least some tracing implementations do their own sampling decision propagation, e.g. in Lightstep. |
For everyone watching this issue I'm going to fix this, but I'm trying to sort through some details. First, the request ID header mangling is behind this guard AFAICT: envoy/source/common/http/conn_manager_utility.cc Lines 261 to 263 in ea39e3c
So AFAICT no mangling will occur unless the user configures the So I assume everyone hitting this issue wants to use tracing orthogonally to request ID generation. Is that right? Or am I misunderstanding how the mangling is occurring? Assuming the real issue is people want to use tracing separate from request ID generation my suggestion is to do the following:
Let me know what folks think about this as I want to make sure we are solving the right problem here. |
Hey @mattklein123, thanks for your help on this. Congrats on the baby btw, hope you're all healthy and safe. Yes, we are using What we were considering was to have something along the lines of your option If you decide not to ship this within Envoy, we would actually start looking into the possibility of creating our own Extension with the same idea stated above. In a few cases the Today what we do is to revert the byte back from 9 to 4 using a I hope this provides some more clarity |
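The "revert the byte back from 9 to 4" workaround mentioned above could look roughly like the sketch below. This is only an illustration, not the commenter's actual mechanism (which is not shown in the thread). The function name `restore_version_char`, the string index 14, and the set of packed values are assumptions; only the 9 to 4 rewrite comes from the comment itself.

```python
# Hypothetical sketch of the "revert 9 back to 4" workaround.
# Assumptions (not confirmed by the thread): the trace status lives at
# string index 14 of the 36-character UUID, and "9", "a", "b" are the
# values a proxy may have packed there.
TRACE_POS = 14
UUID_LEN = 36
PACKED_VALUES = {"9", "a", "b"}  # assumed; the thread only mentions "9"
ORIGINAL_VERSION = "4"           # only valid if the caller sent a v4 UUID


def restore_version_char(request_id: str) -> str:
    """Put the version character back if it looks like a packed trace status."""
    if len(request_id) != UUID_LEN:
        return request_id  # not UUID-shaped, nothing to undo
    if request_id[TRACE_POS] not in PACKED_VALUES:
        return request_id  # already untouched
    return request_id[:TRACE_POS] + ORIGINAL_VERSION + request_id[TRACE_POS + 1:]


mangled = "0b2e4a54-6f1c-9d3a-9c1e-5a7f3b2d1e90"
print(restore_version_char(mangled))
```

Note the limitation: this only recovers the caller's ID if the caller always sends version-4 UUIDs. For any other ID scheme the original character is lost, which is why a configuration option is the more robust fix.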
Hi @mattklein123, thanks for working on this one. What you describe would solve our problem with the current things. |
Yes, this is my plan. I'm going to allow configuration to be passed to the built in UUID request ID extension. Initially I will just add configuration to disable trace status packing, but eventually we could also allow the header name itself to be configured. I think this will satisfy all of the problems that are being reported here. |
1) Promote the default UUID request_id implementation to an actual extension that is required in the build and wire up all documentation. 2) Add a configuration option to the extension that allows trace reason packing to be disabled (the default continues to be for it to be enabled to match existing behavior). 3) Update all documentation for the new behavior. 4) Substantial cleanup of these code paths for clarity and robustness. Fixes #11532 Signed-off-by: Matt Klein <mklein@lyft.com>
I posted #15248 which should fix this if anyone wants to take a look while it winds its way through code review. |
1) Promote the default UUID request_id implementation to an actual extension that is required in the build and wire up all documentation. 2) Add a configuration option to the extension that allows trace reason packing to be disabled (the default continues to be for it to be enabled to match existing behavior). 3) Update all documentation for the new behavior. 4) Substantial cleanup of these code paths for clarity and robustness. Fixes envoyproxy/envoy#11532 Signed-off-by: Matt Klein <mklein@lyft.com> Mirrored from https://github.com/envoyproxy/envoy @ 07c4c17be61c77d87d2c108b0775f2e606a7ae12
Hi @mattklein123, just to clarify something, based on the newly created doc: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/request_id/uuid/v3/uuid.proto#envoy-v3-api-msg-extensions-request-id-uuid-v3-uuidrequestidconfig So I would be able to just use:
And I will still get 100% sampling (but without header mutation) because of the |
@Brunomachadob yes that's correct. Let me know if it doesn't work for you. |
Yes, this will work for me. |
Do you have a full example EnvoyFilter that would set pack_trace_reason: false? We think this is set but still seeing the 14th nibble being changed. Thanks! |
@Brunomachadob @mattklein123 Do we have a sample EnvoyFilter? I tried multiple options, but none of them worked.
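For readers looking for the shape of the setting discussed above, here is a minimal sketch of the `request_id_extension` fragment with `pack_trace_reason` disabled, rendered as JSON. It is not a full EnvoyFilter. It assumes the fragment is placed inside the HTTP connection manager configuration, and uses the message type named in the linked `uuid.proto` documentation. The variable names are illustrative only.

```python
import json

# Sketch only: the request_id_extension fragment with trace reason packing
# disabled. Assumption: this sits inside the HTTP connection manager config.
request_id_extension = {
    "typed_config": {
        "@type": "type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig",
        "pack_trace_reason": False,
    }
}

hcm_fragment = {"request_id_extension": request_id_extension}

# JSON output is also valid YAML, so it can be pasted into a YAML config.
rendered = json.dumps(hcm_fragment, indent=2)
print(rendered)
```

If the character at position 14 still changes with this in place, it may be worth checking that the fragment is actually merged into the connection manager that handles the affected listener.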
Description
When providing an x-request-id header in the request (rather than have it injected by Envoy), a UUID has its 14th byte modified. This results in a different UUID than provided by the caller, breaking the x-request-id passed along a call chain.
At a cursory glance, I believe the code is here:
request_id_extension_uuid_impl.cc
This code doesn't distinguish either:
Illustrative flow:
Ask
Ideally, setTraceStatus() should only update trace status on valid UUIDs generated by Envoy itself, and should not update values provided/retained from caller.
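The behaviour described above, and the guard being asked for, can be sketched as follows. This is a simplified Python model, not Envoy's C++ implementation. The names `set_trace_status` and `set_trace_status_guarded`, the string index 14, and the early return for IDs that are not 36 characters long are assumptions based on the description and examples in this issue.

```python
# Simplified model of trace-status packing (not Envoy's actual code).
# Assumptions: the status overwrites string index 14 of a 36-character
# UUID, "4" means no trace and "9" means sampled, and IDs of any other
# length are left alone.
TRACE_POS = 14
UUID_LEN = 36
NO_TRACE = "4"
SAMPLED = "9"


def set_trace_status(request_id: str, status: str) -> str:
    """Current behaviour: overwrite the character regardless of who made the ID."""
    if len(request_id) != UUID_LEN:
        return request_id
    return request_id[:TRACE_POS] + status + request_id[TRACE_POS + 1:]


def set_trace_status_guarded(request_id: str, status: str, generated_by_proxy: bool) -> str:
    """Requested behaviour: only touch IDs the proxy generated itself."""
    if not generated_by_proxy:
        return request_id
    return set_trace_status(request_id, status)


client_id = "0b2e4a54-6f1c-4d3a-9c1e-5a7f3b2d1e90"
print(set_trace_status(client_id, SAMPLED))                  # ID differs from the caller's
print(set_trace_status_guarded(client_id, SAMPLED, False))   # ID is preserved
```

The guarded variant trades trace-status propagation for ID stability on caller-supplied IDs, which is the trade-off debated in the comments.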
Config
Seems to occur either naturally (when request is deemed to be x-request-internal) or when using the 'preserve_external_request_id' connection manager flag.
References
I believe it caused the behaviour seen in #6050
Examples/Reproduce
Used with lds.yaml:
Example - Request populated 'x-request-id', len != 36
Example - Request populated 'x-request-id', UUID with 14th byte == 4 (no trace)
(extraneous -v fields removed)