OpenTelemetry: Context lost on consumer retry and Baggage Propagation #1452
Comments
Hi @bschwehn, we did not consider the retry use case in our OpenTelemetry implementation. I am unsure whether handling retries would keep the overall tracing timeline readable, since a retried message involves multiple attempts or an eventual failure. The reason for using _contexts is that we chose not to inject the latest context into the Header in BeforeConsume, which both improves performance and reduces complexity. On the other hand, we overlooked contextual continuity when the consumer itself sends messages again. If you are interested, please refer to #1100. We would be very happy to accept a PR, and it would be great if you could help resolve #1100. Thanks very much.
Thanks for the quick reply. I have tested the scenario:
The Baggage and trace ID are forwarded for all the calls. But I will try to set up this scenario today to make sure.
What I am trying to accomplish is this:
I'll try to check the behaviour with a non-ASP.NET consumer and will then create a PR.
I believe we can then remove the _contexts dictionary entirely. It is already unused during publish. It is set here:
And removed here:
but the value is never used anymore. I will put this change into the PR as well, which will make it easier to talk about the change. But if you already see something wrong with the reasoning above, let me know.
Hi @bschwehn, I think you have a deeper understanding here than I do. Please go ahead; I look forward to your PR. Thank you very much.
Fixes an issue that caused context propagation to fail on message retry (#1452). Before, BeforeSubscribeInvoke removed the context, so it was no longer available on retry. The change also removes the _contexts dictionary: during Subscribe this handling didn't work on retry, and during Publish the dictionary value was no longer used. I have tested this with both aspnetcore services and a console subscriber that publishes during the subscribe, and context propagation was working fine. Thus #1100 is also closed, probably already in #1407. Co-authored-by: Benjamin Schwehn <benjamin.schwehn.contractor@ert.com>
Thanks! Feel free to @ me if this change seems to cause issues down the road!
Hi,
We ran into an issue where OpenTelemetry context propagation to the consumer does not work correctly on retry.
The issue seems to be that the context is extracted in BeforeConsume:
CAP/src/DotNetCore.CAP.OpenTelemetry/DiagnosticListener.cs
Line 149 in 67c882a
and then saved:
CAP/src/DotNetCore.CAP.OpenTelemetry/DiagnosticListener.cs
Line 172 in 67c882a
Then the first BeforeSubscribeInvoke removes the saved context:
CAP/src/DotNetCore.CAP.OpenTelemetry/DiagnosticListener.cs
Line 205 in 67c882a
On retries, BeforeConsume is not called again and the saved context is no longer available, which causes the issue.
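To make the sequence above concrete, here is a minimal, dependency-free sketch of the pattern. It is not CAP's code: `CachedContextListener`, `RetryDemo`, and the use of a plain string for the parent context are assumptions made for illustration; only the method names BeforeConsume and BeforeSubscribeInvoke and the `_contexts` field name are borrowed from the discussion.

```csharp
#nullable enable
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for the listener, NOT the real CAP DiagnosticListener.
public sealed class CachedContextListener
{
    // Parent context cached per message id (a plain string here for simplicity).
    private readonly ConcurrentDictionary<string, string> _contexts = new();

    // Runs once per received message: read the header and cache the value.
    public void BeforeConsume(string messageId, IDictionary<string, string> headers)
    {
        if (headers.TryGetValue("traceparent", out var parent))
        {
            _contexts[messageId] = parent;
        }
    }

    // Runs on every attempt: the first call removes the cached entry,
    // so any later attempt for the same message finds nothing.
    public string? BeforeSubscribeInvoke(string messageId)
    {
        return _contexts.TryRemove(messageId, out var parent) ? parent : null;
    }
}

public static class RetryDemo
{
    // Simulates one consume followed by a first attempt and one retry.
    public static (string? First, string? Retry) Run()
    {
        var listener = new CachedContextListener();
        var headers = new Dictionary<string, string>
        {
            ["traceparent"] = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
        };

        listener.BeforeConsume("msg-1", headers);
        var first = listener.BeforeSubscribeInvoke("msg-1");
        var retry = listener.BeforeSubscribeInvoke("msg-1"); // no BeforeConsume in between
        return (first, retry);
    }
}
```

In this model, `RetryDemo.Run()` yields a non-null parent for the first attempt and `null` for the retry, which mirrors the lost context described above.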
I have tried changing the code to not save the context and instead just extract it every time in BeforeSubscribeInvoke, and that solves our issue.
But I don't know why the context is saved in the first place -- is it just a performance improvement?
I can create a pull request if you think this way of fixing the issue is correct.
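The approach of extracting the context on every invoke could look roughly like the sketch below. It uses the OpenTelemetry .NET propagation API (`Propagators.DefaultTextMapPropagator.Extract`) but is not the actual patch: the `SubscriberTracing` class, the `Example.Subscriber` source name, the activity name, and the `IDictionary<string, string?>` header type are assumptions for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;
using OpenTelemetry.Context.Propagation;

public static class SubscriberTracing
{
    // Hypothetical source name, for illustration only.
    private static readonly ActivitySource Source = new("Example.Subscriber");

    // Re-reads the parent context from the message headers; nothing is cached,
    // so the result is the same on the first attempt and on every retry.
    public static PropagationContext ExtractParent(IDictionary<string, string?> headers)
    {
        return Propagators.DefaultTextMapPropagator.Extract(
            default,
            headers,
            static (carrier, key) =>
                carrier.TryGetValue(key, out var value) && value is not null
                    ? new[] { value }
                    : Array.Empty<string>());
    }

    // Intended to be called at the start of each subscriber invocation.
    // StartActivity returns null when no listener is sampling this source.
    public static Activity? StartInvokeActivity(IDictionary<string, string?> headers)
    {
        var parent = ExtractParent(headers);
        return Source.StartActivity(
            "subscriber invoke", ActivityKind.Internal, parent.ActivityContext);
    }
}
```

The trade-off is one header parse per attempt instead of one per message, in exchange for not keeping any per-message state that a retry could miss.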
I have also changed the Propagator from TraceContextPropagator
CAP/src/DotNetCore.CAP.OpenTelemetry/DiagnosticListener.cs
Line 26 in 67c882a
to Propagators.DefaultTextMapPropagator and added a
Baggage.Current = propagatedContext.Baggage;
which then automatically propagates Baggage.Current, which may also be useful for other people?
Instead of Propagators.DefaultTextMapPropagator, it can explicitly be set to
for baggage propagation, but as I understand it, Propagators.DefaultTextMapPropagator will allow users of CAP to enable or disable baggage propagation by calling Sdk.SetDefaultTextMapPropagator (https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Api/README.md?plain=1#L463)
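As a rough illustration of how an application could opt in or out of baggage propagation under this approach, the sketch below sets the default propagator with `Sdk.SetDefaultTextMapPropagator` and then assigns the extracted baggage to `Baggage.Current`. The `PropagationSetup` class, its method names, and the header dictionary type are hypothetical; the propagator types are the standard ones from OpenTelemetry .NET, and this is not necessarily the exact composition intended above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using OpenTelemetry;
using OpenTelemetry.Context.Propagation;

public static class PropagationSetup
{
    // Application opts in: propagate W3C trace context and baggage.
    public static void UseTraceContextAndBaggage()
    {
        Sdk.SetDefaultTextMapPropagator(new CompositeTextMapPropagator(
            new TextMapPropagator[]
            {
                new TraceContextPropagator(),
                new BaggagePropagator(),
            }));
    }

    // Application opts out of baggage: trace context only.
    public static void UseTraceContextOnly()
    {
        Sdk.SetDefaultTextMapPropagator(new TraceContextPropagator());
    }

    // Hypothetical helper: extract with whatever propagator is configured
    // and make the extracted baggage current for the subscriber.
    public static void RestoreBaggage(IDictionary<string, string?> headers)
    {
        var propagated = Propagators.DefaultTextMapPropagator.Extract(
            default,
            headers,
            static (carrier, key) =>
                carrier.TryGetValue(key, out var value) && value is not null
                    ? new[] { value }
                    : Array.Empty<string>());

        Baggage.Current = propagated.Baggage;
    }
}
```

Because the library code only reads `Propagators.DefaultTextMapPropagator`, the choice between the two setups stays with the application rather than being hard-coded in the listener.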