Runtime context API #948

reyang · 2020-07-30T04:02:22Z

Fixes #.

Changes

Trying to build a foundation for us to address #893 (comment) and the 2nd point of #809 (comment).

I haven't got time to work on the Apply and Snapshot part which I've commented out in the PR (and leave them for another PR). These are required for writing test cases. I haven't got time to write test cases but here goes how to use it:

class Program
{
    static void Main()
    {
        // This is the default behavior for newer version of .NET (basically everything except for 4.5.2)
        // RuntimeContext.ContextSlotType = typeof(AsyncLocalRuntimeContextSlot<>);

        // This is the default behavior for .NET 4.5.2
        // RuntimeContext.ContextSlotType = typeof(RemotingRuntimeContextSlot<>);

        // Instead of using the default behavior, you can switch to other context management such like TLS
        RuntimeContext.ContextSlotType = typeof(ThreadLocalRuntimeContextSlot<>);

        RuntimeContext.RegisterSlot<int>("foo");
        RuntimeContext.RegisterSlot<double>("bar");

        RuntimeContext.SetValue("foo", 3);
        var x = RuntimeContext.GetValue<int>("foo");
        Console.WriteLine($"{x}");

        // This gives an idea how to propagate context explicitly to a custom thread
        var snapshot = RuntimeContext.Snapshot();
        var thread = new Thread(() =>
        {
            RuntimeContext.Apply(snapshot);
            DoSomeWork();
        });
        thread.Start();
        thread.Join();
    }
}

And to give some idea on how we can suppress instrumentation from the exporter:

// During the static initialization of the OpenTelemetry SDK
RuntimeContext.RegisterSlot<bool>("otel.suppress_instrumentation");

// Inside the Exporter
public override Task<ExportResult> ExportAsync(
    IEnumerable<Activity> batch, CancellationToken cancellationToken)
{
    var suppressInstrumentation = RuntimeContext.GetValue<bool>("otel.suppress_instrumentation");
    RuntimeContext.SetValue("otel.suppress_instrumentation", true);
    // do something
    // restore the flag, we can create some syntax sugar such as IDispose/using
    RuntimeContext.SetValue("otel.suppress_instrumentation", suppressInstrumentation);
    return Task.FromResult(ExportResult.Success);
}

// Inside the instrumentation or activity processor

if (RuntimeContext.GetValue<bool>("otel.suppress_instrumentation"))
{
    return;
}
ActualInstrumentation();

Please provide a brief description of the changes here. Update the
CHANGELOG.md for non-trivial changes.

For significant contributions please make sure you have completed the following items:

Design discussion issue #
Changes in public API reviewed

codecov · 2020-07-30T04:09:00Z

Codecov Report

Merging #948 into master will decrease coverage by 0.56%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #948      +/-   ##
==========================================
- Coverage   68.74%   68.17%   -0.57%     
==========================================
  Files         213      218       +5     
  Lines        5912     5955      +43     
  Branches      967      969       +2     
==========================================
- Hits         4064     4060       -4     
- Misses       1580     1627      +47     
  Partials      268      268

Impacted Files	Coverage Δ
...emetry.Api/Context/AsyncLocalRuntimeContextSlot.cs	`0.00% <0.00%> (ø)`
...elemetry.Api/Context/RemotingRuntimeContextSlot.cs	`0.00% <0.00%> (ø)`
src/OpenTelemetry.Api/Context/RuntimeContext.cs	`0.00% <0.00%> (ø)`
...rc/OpenTelemetry.Api/Context/RuntimeContextSlot.cs	`0.00% <0.00%> (ø)`
...metry.Api/Context/ThreadLocalRuntimeContextSlot.cs	`0.00% <0.00%> (ø)`
src/OpenTelemetry/Metrics/CounterMetricSdkBase.cs	`80.00% <0.00%> (-13.34%)`	⬇️
... and 1 more

CodeBlanch · 2020-07-30T04:26:05Z

I'm not going to lie, this is kind of a 🤯 The second part, where we're using it, looks great. The first part though, for us mere mortals, why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType?

reyang · 2020-07-30T04:29:06Z

I'm not going to lie, this is kind of a 🤯 The second part, where we're using it, looks great. The first part though, for us mere mortals, why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType?

There are many scenarios where big companies/customers have their own context management, especially when they have interop story with native code, or they have interaction with another runtime/component that has different context management model).

I would expect 99% of the users to just use it as a "context aware" dictionary, only 1% would need to specify a non-default implementation.

CodeBlanch

LGTM. We already have CorrelationContext & DistributedContext so I am a bit worried about just general confusion arising from having three "context" things. Maybe the "ContextSlot" classes should all be named "RuntimeContextSlot" as a hint to their relationship?

reyang · 2020-07-30T04:53:45Z

LGTM. We already have CorrelationContext & DistributedContext so I am a bit worried about just general confusion arising from having three "context" things.

The CorrelationContext and DistributedContext are something we'll have to rework. Most likely the names are going to change (e.g. Baggage), too.

Maybe the "ContextSlot" classes should all be named "RuntimeContextSlot" as a hint to their relationship?

That's a good point, let me rename them.

MikeGoldsmith

I agree this is a fairly advanced scenario but makes sense for those running an older version of .NET and/or require a customisable solution.

alanwest · 2020-07-30T17:51:17Z

@reyang this is great! Thanks for getting this started.

why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType

New Relic's .NET agent uses a number of storage mechanisms to flow context, and we have a similar abstraction to what this PR contains. We support the mechanisms in the PR: ThreadLocal, AsyncLocal, and the logical call context for older .NET. Just as another example - in ASP.NET applications we use HttpContext as the "ContextSlotType" for storing context in HttpContext.Items.

General questions:

I have had a fair share of experience battling memory leaks caused by the use of AsyncLocal.
I also have experience battling serialization issues regarding the improper use of storing arbitrary types in logical call context.

I'm not sure there are any easier answers here, though I am curious, @reyang, if you have encountered any issues with your experience on OpenCensus?

reyang · 2020-07-30T18:06:35Z

General questions:

I have had a fair share of experience battling memory leaks caused by the use of AsyncLocal.

I also have experience battling serialization issues regarding the improper use of storing arbitrary types in logical call context.

I'm not sure there are any easier answers here, though I am curious, @reyang, if you have encountered any issues with your experience on OpenCensus?

Not in OpenCensus .NET/C# as I only worked on OpenCensus Python.
I have done similar thing in .NET many years ago (around 2013 IIRC) where we've seen serialization issue in logical call context, across remote boundaries, and that's why you see the BitArray hack in my PR.
Regarding memory "leaks", this is exactly why I make the RuntimeContext.RegisterSlot explicit, and I stored/duplicated the slot name in the RuntimeContextSlot classes - it makes your job easier while using windbg/sos or memory analyzer.

alanwest · 2020-07-30T18:57:19Z

that's why you see the BitArray hack in my PR

Ah, thank you for drawing my attention closer to this. This is a nice way of handling things.

this is exactly why I make the RuntimeContext.RegisterSlot explicit, and I stored/duplicated the slot name in the RuntimeContextSlot classes

👍 cool, yea that makes sense

One more thing to highlight just so that it's in peoples' minds is that using ThreadLocal brings the risk of leaving it "dirty" in the event that it is used in context where a thread hop may occur (e.g., awaiting an async method). Doesn't look like this PR currently uses the ThreadLocalRuntimeContextSlot type. Does it make sense to keep this as an option at this point?

reyang · 2020-07-30T19:09:09Z

One more thing to highlight just so that it's in peoples' minds is that using ThreadLocal brings the risk of leaving it "dirty" in the event that it is used in context where a thread hop may occur (e.g., awaiting an async method). Doesn't look like this PR currently uses the ThreadLocalRuntimeContextSlot type. Does it make sense to keep this as an option at this point?

ThreadLocal is useful for PInvoke scenario. To be honest I don't know if OpenTelemetry .NET customers would use it and how deep they would go. In OpenCensus Python we did have customers who wanted to switch from async local to thread local as they are hosting Python and the host is using explicit threads for workloads.

alanwest · 2020-07-30T19:46:10Z

In OpenCensus Python we did have customers who wanted to switch from async local to thread local as they are hosting Python and the host is using explicit threads for workloads.

Interesting! I wasn't aware that Python as a notion of async local. I suppose the .NET use case would be similar in that it can be nice at times to be able to interrupt the "magic" of async local and prevent it from flowing context where you don't want it to flow - like dedicated background worker threads.

reyang · 2020-07-30T19:50:37Z

Interesting! I wasn't aware that Python as a notion of async local. I suppose the .NET use case would be similar in that it can be nice at times to be able to interrupt the "magic" of async local and prevent it from flowing context where you don't want it to flow - like dedicated background worker threads.

It is called contextvars. In OpenTelemetry we took similar approach and here goes the code FYI https://github.com/open-telemetry/opentelemetry-python/blob/8c4fca5c754bb98fcb0101e5865e5b7e70ecf659/opentelemetry-api/src/opentelemetry/context/__init__.py#L55

runtime context API

b3e0824

reyang requested a review from a team as a code owner July 30, 2020 04:02

reyang requested review from alanwest, CodeBlanch, rajkumar-rangaraj and cijothomas July 30, 2020 04:02

fix comments

a20bd40

CodeBlanch approved these changes Jul 30, 2020

View reviewed changes

rename

3c8be54

CodeBlanch mentioned this pull request Jul 30, 2020

added entityframeworkcore instrumentation #929

Closed

reyang added 2 commits July 29, 2020 23:18

changelog

ee7b5a7

changelog

d13ad3c

MikeGoldsmith approved these changes Jul 30, 2020

View reviewed changes

alanwest approved these changes Jul 30, 2020

View reviewed changes

Merge branch 'master' into reyang/context

dd687f2

cijothomas merged commit 3333957 into master Jul 31, 2020

cijothomas deleted the reyang/context branch July 31, 2020 03:45

reyang mentioned this pull request Jul 31, 2020

Add Sdk.SuppressInstrumentation #960

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime context API #948

Runtime context API #948

reyang commented Jul 30, 2020 •

edited

codecov bot commented Jul 30, 2020 •

edited

CodeBlanch commented Jul 30, 2020

reyang commented Jul 30, 2020 •

edited

CodeBlanch left a comment

reyang commented Jul 30, 2020

MikeGoldsmith left a comment

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020 •

edited

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020

Runtime context API #948

Runtime context API #948

Conversation

reyang commented Jul 30, 2020 • edited

Changes

codecov bot commented Jul 30, 2020 • edited

Codecov Report

CodeBlanch commented Jul 30, 2020

reyang commented Jul 30, 2020 • edited

CodeBlanch left a comment

Choose a reason for hiding this comment

reyang commented Jul 30, 2020

MikeGoldsmith left a comment

Choose a reason for hiding this comment

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020 • edited

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020

alanwest commented Jul 30, 2020

reyang commented Jul 30, 2020

reyang commented Jul 30, 2020 •

edited

codecov bot commented Jul 30, 2020 •

edited

reyang commented Jul 30, 2020 •

edited

reyang commented Jul 30, 2020 •

edited