Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime context API #948

Merged
merged 6 commits into from Jul 31, 2020
Merged

Runtime context API #948

merged 6 commits into from Jul 31, 2020

Conversation

reyang
Copy link
Member

@reyang reyang commented Jul 30, 2020

Fixes #.

Changes

Trying to build a foundation for us to address #893 (comment) and the 2nd point of #809 (comment).

I haven't got time to work on the Apply and Snapshot part which I've commented out in the PR (and leave them for another PR). These are required for writing test cases. I haven't got time to write test cases but here goes how to use it:

class Program
{
    static void Main()
    {
        // This is the default behavior for newer version of .NET (basically everything except for 4.5.2)
        // RuntimeContext.ContextSlotType = typeof(AsyncLocalRuntimeContextSlot<>);

        // This is the default behavior for .NET 4.5.2
        // RuntimeContext.ContextSlotType = typeof(RemotingRuntimeContextSlot<>);

        // Instead of using the default behavior, you can switch to other context management such like TLS
        RuntimeContext.ContextSlotType = typeof(ThreadLocalRuntimeContextSlot<>);

        RuntimeContext.RegisterSlot<int>("foo");
        RuntimeContext.RegisterSlot<double>("bar");

        RuntimeContext.SetValue("foo", 3);
        var x = RuntimeContext.GetValue<int>("foo");
        Console.WriteLine($"{x}");

        // This gives an idea how to propagate context explicitly to a custom thread
        var snapshot = RuntimeContext.Snapshot();
        var thread = new Thread(() =>
        {
            RuntimeContext.Apply(snapshot);
            DoSomeWork();
        });
        thread.Start();
        thread.Join();
    }
}

And to give some idea on how we can suppress instrumentation from the exporter:

// During the static initialization of the OpenTelemetry SDK
RuntimeContext.RegisterSlot<bool>("otel.suppress_instrumentation");

// Inside the Exporter
public override Task<ExportResult> ExportAsync(
    IEnumerable<Activity> batch, CancellationToken cancellationToken)
{
    var suppressInstrumentation = RuntimeContext.GetValue<bool>("otel.suppress_instrumentation");
    RuntimeContext.SetValue("otel.suppress_instrumentation", true);
    // do something
    // restore the flag, we can create some syntax sugar such as IDispose/using
    RuntimeContext.SetValue("otel.suppress_instrumentation", suppressInstrumentation);
    return Task.FromResult(ExportResult.Success);
}

// Inside the instrumentation or activity processor

if (RuntimeContext.GetValue<bool>("otel.suppress_instrumentation"))
{
    return;
}
ActualInstrumentation();

Please provide a brief description of the changes here. Update the
CHANGELOG.md for non-trivial changes.

For significant contributions please make sure you have completed the following items:

  • Design discussion issue #
  • Changes in public API reviewed

@codecov
Copy link

codecov bot commented Jul 30, 2020

Codecov Report

Merging #948 into master will decrease coverage by 0.56%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #948      +/-   ##
==========================================
- Coverage   68.74%   68.17%   -0.57%     
==========================================
  Files         213      218       +5     
  Lines        5912     5955      +43     
  Branches      967      969       +2     
==========================================
- Hits         4064     4060       -4     
- Misses       1580     1627      +47     
  Partials      268      268              
Impacted Files Coverage Δ
...emetry.Api/Context/AsyncLocalRuntimeContextSlot.cs 0.00% <0.00%> (ø)
...elemetry.Api/Context/RemotingRuntimeContextSlot.cs 0.00% <0.00%> (ø)
src/OpenTelemetry.Api/Context/RuntimeContext.cs 0.00% <0.00%> (ø)
...rc/OpenTelemetry.Api/Context/RuntimeContextSlot.cs 0.00% <0.00%> (ø)
...metry.Api/Context/ThreadLocalRuntimeContextSlot.cs 0.00% <0.00%> (ø)
src/OpenTelemetry/Metrics/CounterMetricSdkBase.cs 80.00% <0.00%> (-13.34%) ⬇️
... and 1 more

@CodeBlanch
Copy link
Member

I'm not going to lie, this is kind of a 🤯 The second part, where we're using it, looks great. The first part though, for us mere mortals, why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType?

@reyang
Copy link
Member Author

reyang commented Jul 30, 2020

I'm not going to lie, this is kind of a 🤯 The second part, where we're using it, looks great. The first part though, for us mere mortals, why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType?

There are many scenarios where big companies/customers have their own context management, especially when they have interop story with native code, or they have interaction with another runtime/component that has different context management model).

I would expect 99% of the users to just use it as a "context aware" dictionary, only 1% would need to specify a non-default implementation.

Copy link
Member

@CodeBlanch CodeBlanch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We already have CorrelationContext & DistributedContext so I am a bit worried about just general confusion arising from having three "context" things. Maybe the "ContextSlot" classes should all be named "RuntimeContextSlot" as a hint to their relationship?

@reyang
Copy link
Member Author

reyang commented Jul 30, 2020

LGTM. We already have CorrelationContext & DistributedContext so I am a bit worried about just general confusion arising from having three "context" things.

The CorrelationContext and DistributedContext are something we'll have to rework. Most likely the names are going to change (e.g. Baggage), too.

Maybe the "ContextSlot" classes should all be named "RuntimeContextSlot" as a hint to their relationship?

That's a good point, let me rename them.

Copy link
Member

@MikeGoldsmith MikeGoldsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a fairly advanced scenario but makes sense for those running an older version of .NET and/or require a customisable solution.

@alanwest
Copy link
Member

@reyang this is great! Thanks for getting this started.

why would we (or the user hosting us?) want to change the RuntimeContext.ContextSlotType

New Relic's .NET agent uses a number of storage mechanisms to flow context, and we have a similar abstraction to what this PR contains. We support the mechanisms in the PR: ThreadLocal, AsyncLocal, and the logical call context for older .NET. Just as another example - in ASP.NET applications we use HttpContext as the "ContextSlotType" for storing context in HttpContext.Items.


General questions:

  • I have had a fair share of experience battling memory leaks caused by the use of AsyncLocal.
  • I also have experience battling serialization issues regarding the improper use of storing arbitrary types in logical call context.

I'm not sure there are any easier answers here, though I am curious, @reyang, if you have encountered any issues with your experience on OpenCensus?

@reyang
Copy link
Member Author

reyang commented Jul 30, 2020

General questions:

  • I have had a fair share of experience battling memory leaks caused by the use of AsyncLocal.
  • I also have experience battling serialization issues regarding the improper use of storing arbitrary types in logical call context.

I'm not sure there are any easier answers here, though I am curious, @reyang, if you have encountered any issues with your experience on OpenCensus?

Not in OpenCensus .NET/C# as I only worked on OpenCensus Python.
I have done similar thing in .NET many years ago (around 2013 IIRC) where we've seen serialization issue in logical call context, across remote boundaries, and that's why you see the BitArray hack in my PR.
Regarding memory "leaks", this is exactly why I make the RuntimeContext.RegisterSlot explicit, and I stored/duplicated the slot name in the RuntimeContextSlot classes - it makes your job easier while using windbg/sos or memory analyzer.

@alanwest
Copy link
Member

that's why you see the BitArray hack in my PR

Ah, thank you for drawing my attention closer to this. This is a nice way of handling things.

this is exactly why I make the RuntimeContext.RegisterSlot explicit, and I stored/duplicated the slot name in the RuntimeContextSlot classes

👍 cool, yea that makes sense


One more thing to highlight just so that it's in peoples' minds is that using ThreadLocal brings the risk of leaving it "dirty" in the event that it is used in context where a thread hop may occur (e.g., awaiting an async method). Doesn't look like this PR currently uses the ThreadLocalRuntimeContextSlot type. Does it make sense to keep this as an option at this point?

@reyang
Copy link
Member Author

reyang commented Jul 30, 2020

One more thing to highlight just so that it's in peoples' minds is that using ThreadLocal brings the risk of leaving it "dirty" in the event that it is used in context where a thread hop may occur (e.g., awaiting an async method). Doesn't look like this PR currently uses the ThreadLocalRuntimeContextSlot type. Does it make sense to keep this as an option at this point?

ThreadLocal is useful for PInvoke scenario. To be honest I don't know if OpenTelemetry .NET customers would use it and how deep they would go. In OpenCensus Python we did have customers who wanted to switch from async local to thread local as they are hosting Python and the host is using explicit threads for workloads.

@alanwest
Copy link
Member

In OpenCensus Python we did have customers who wanted to switch from async local to thread local as they are hosting Python and the host is using explicit threads for workloads.

Interesting! I wasn't aware that Python as a notion of async local. I suppose the .NET use case would be similar in that it can be nice at times to be able to interrupt the "magic" of async local and prevent it from flowing context where you don't want it to flow - like dedicated background worker threads.

@reyang
Copy link
Member Author

reyang commented Jul 30, 2020

Interesting! I wasn't aware that Python as a notion of async local. I suppose the .NET use case would be similar in that it can be nice at times to be able to interrupt the "magic" of async local and prevent it from flowing context where you don't want it to flow - like dedicated background worker threads.

It is called contextvars. In OpenTelemetry we took similar approach and here goes the code FYI https://github.com/open-telemetry/opentelemetry-python/blob/8c4fca5c754bb98fcb0101e5865e5b7e70ecf659/opentelemetry-api/src/opentelemetry/context/__init__.py#L55

@cijothomas cijothomas merged commit 3333957 into master Jul 31, 2020
@cijothomas cijothomas deleted the reyang/context branch July 31, 2020 03:45
@reyang reyang mentioned this pull request Jul 31, 2020
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants