Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue with auto-injection of Correlation Identifiers (Serilog) #472

Merged
merged 9 commits into from Aug 16, 2019

Conversation

zacharycmontoya
Copy link
Collaborator

Scenario: Some time after the first span is activated, we have set the
logging context properties dd.trace_id=0 and dd.span_id=0. Application
code (using Serilog) then sets their own logging context properties. Then, a new span
is created and previously-set properties no longer exist in the logging context.

Problem: Our OnSpanActivated event immediately disposes the previously-set
correlation identifier properties before re-adding the new values.
Serilog has a strict correctness guarantee that requires properties be
modified in stack order. Since we remove properties further down the stack,
we end up losing properties.

Fix: For all logging libraries, do not set initial values
(dd.trace_id=0, dd.span_id=0). For Serilog, add and remove the two
properties on SpanOpened and on SpanClosed. This ensures that properties
only get added to/removed from the stack once.

@DataDog/apm-dotnet

Scenario: Some time after the first span is activated, we have set the
logging context properties dd.trace_id=0 and dd.span_id=0. Application
code then sets their own logging context properties. Then, a new span is
created and previously-set properties no longer exist in the logging context.

Problem: Our OnSpanActivated event immediately disposes the previously-set
correlation identifier properties before re-adding the new values.
Serilog has a strict correctness guarantee that requires properties be
modified in stack order. Since we remove properties further down the stack,
we end up losing properties.

Fix: For all logging libraries, do not set initial values
(dd.trace_id=0, dd.span_id=0). For Serilog, add and remove the two
properties on SpanOpened and on SpanClosed. This ensures that properties
only get added to/removed from the stack once.
@zacharycmontoya zacharycmontoya added this to the 1.6.2 milestone Aug 15, 2019
@zacharycmontoya zacharycmontoya requested a review from a team as a code owner August 15, 2019 15:38
@zacharycmontoya zacharycmontoya self-assigned this Aug 15, 2019
@zacharycmontoya zacharycmontoya added the area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) label Aug 15, 2019
@colin-higgins
Copy link
Member

Is it still theoretically possible for this to happen if they add correlation properties after we open a scope and then they dispose of them after we have closed a scope?
Is the window gone, or is it smaller?

return;
}

// if the scope that was just closed was the active scope,
// set its parent as the new active scope
_activeScope.Set(current.Parent);
SpanDeactivated?.Invoke(this, new SpanEventArgs(current.Span));
_activeScope.Set(scope.Parent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change to scope here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point we know that scope==current, and I wanted to have consistency in the variable name we used to make it easier to read. Everywhere we create an SpanEventArgs we use scope now so it seemed like a nice refactor.

@zacharycmontoya
Copy link
Collaborator Author

zacharycmontoya commented Aug 15, 2019

Serilog users should already be aware that they need to open and close their mapped contexts in the correct order, so they will likely be using using statements for them. If they do that (and they also use using statements for creating/disposing any custom spans), then this will always work.

@zacharycmontoya
Copy link
Collaborator Author

Colin and I chatted offline and realized that it will be really hard to ensure Serilog users don't get messed up. This PR is without a doubt an improvement because we only push/pop the correlation identifier properties when we open/close a span. While the other loggers will have a default value of 0, I've opted to not have a default value for Serilog because we would need some part of the managed profiler to inject the value early enough so as not to invalidate its stack behavior.

@zacharycmontoya zacharycmontoya merged commit efe193e into master Aug 16, 2019
@zacharycmontoya zacharycmontoya deleted the zach/bug-fix/correlationid branch August 16, 2019 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) type:bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants