Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActivitySourceAdapter sampling is internal-only #1761

Conversation

CodeBlanch
Copy link
Member

@CodeBlanch CodeBlanch commented Jan 30, 2021

The .NET API allows anyone to register an ActivityListener like this:

            using var activityListener = new ActivityListener
            {
                ShouldListenTo = source => source.Name == myActivitySourceName,
                Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
            };

            ActivitySource.AddActivityListener(activityListener);

The Sample call will apply to any Activity sent through where ShouldListenTo resolves to true. In the case of multiple listeners, the highest level of sampling returned wins. This means users can participate in OpenTelemetry sampling if they so desire.

ActivitySourceAdapter however only runs the OpenTelemetry sampling.

Changes

ActivitySourceAdapter will now sample using all the registered listeners.

Example

This test case is a good example of what this fix accomplishes:

public void ActivitySourceAdapter_ExternalActivityListener_Test()
{
using var activityListener = new ActivityListener
{
ShouldListenTo = source => source.Name == this.activitySource.Name,
Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(activityListener);
this.testSampler.SamplingAction = (samplingParams) =>
new SamplingResult(SamplingDecision.Drop, null);
var activity = new Activity("test");
activity.Start();
this.activitySourceAdapter.Start(activity, ActivityKind.Internal, this.activitySource);
activity.Stop();
Assert.True(activity.IsAllDataRequested);
}

External ActivityListener impacting the OpenTelemetry sampling decision.

TODOs

  • CHANGELOG.md updated for non-trivial changes

@CodeBlanch CodeBlanch requested a review from a team as a code owner January 30, 2021 01:14
@CodeBlanch
Copy link
Member Author

I ran into this real-world trying to provide a utility which listens to all the Activity objects created for a specific TraceId: https://github.com/Macross-Software/core/blob/83791dd0500945b370ff6386ee8091d3790d7e51/ClassLibraries/Macross.OpenTelemetry.Extensions/Code/ActivityTraceListenerManager.cs#L130-L135

return new SampleResult(samplingResult, aco._samplerTags);
}
*/
private static Func<ActivitySource, Activity, ActivityContext, SampleResult> CreateCallSamplersFunc()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarekgh This was mega hard to put together 🤯 We don't have to expose anything in the public API, but if you could refactor the below block into a helper method that I could bind to it would be much easier and probably less fragile:

https://github.com/dotnet/runtime/blob/f2148079d4476073bb5e79277b557807ed0c9984/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/ActivitySource.cs#L195-L206

The main challenge with it as-is (primarily) is the closure. Have to hook into compiler-generated types to fire that.

@codecov
Copy link

codecov bot commented Jan 30, 2021

Codecov Report

Merging #1761 (0c44c9e) into main (416a6c1) will increase coverage by 0.21%.
The diff coverage is 95.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1761      +/-   ##
==========================================
+ Coverage   83.14%   83.36%   +0.21%     
==========================================
  Files         193      193              
  Lines        6005     6083      +78     
==========================================
+ Hits         4993     5071      +78     
  Misses       1012     1012              
Impacted Files Coverage Δ
...nTelemetry.Exporter.Jaeger/Implementation/Batch.cs 84.09% <ø> (+3.65%) ⬆️
...rc/OpenTelemetry.Exporter.Jaeger/JaegerExporter.cs 83.33% <72.72%> (-1.93%) ⬇️
src/OpenTelemetry/Trace/ActivitySourceAdapter.cs 96.96% <97.24%> (-0.09%) ⬇️
...emetry.Api/Internal/OpenTelemetryApiEventSource.cs 79.41% <0.00%> (-2.95%) ⬇️
...ZPages/Implementation/ZPagesExporterEventSource.cs 62.50% <0.00%> (+6.25%) ⬆️

@tarekgh
Copy link
Contributor

tarekgh commented Jan 30, 2021

@CodeBlanch could you explain what you are trying to do in general? why you want invoke all listeners?

CC @noahfalk

@CodeBlanch
Copy link
Member Author

@tarekgh Activity created using ActivitySource.StartActivity gets passed to ActivityListener ShouldListenTo, ActivityStarted, ActivityStopped, & Sample callbacks. Activity created any other way gets passed to ActivityListener ShouldListenTo, ActivityStarted, ActivityStopped callbacks but NOT Sample callback. This was one of the gaps we talked about in our sessions last year. In order to get around it, we have this ActivitySourceAdapter class. Its purpose is to take those Activity objects created some other way (DiagnosticSource, for example) and feed them through for sampling. Before this PR, we just ran them through the OpenTelemetry Sampler. But that creates a weird situation where people using ActivityListener can participate in some of OpenTelemetry, but not all of it. This PR addresses that by running these outside Activity objects through all the listeners in the same way ActivitySource.StartActivity does.

@tarekgh
Copy link
Contributor

tarekgh commented Jan 31, 2021

@CodeBlanch I think depending on the internals of the .NET implementation is fragile as we may (and likely) change it again when accommodating more APIs. I would say for this scenario, may be creating a new Activity using ActivitySource.StartActivity could be reasonable. Did you measure if this can be reasonable for the scenario you are talking about?

@CodeBlanch
Copy link
Member Author

@tarekgh Generally our goal is to have less allocations not more 😄 @cijothomas tried a bunch of different ways to make this work when he was building the ActivitySourceAdapter so I'll let him comment on that. I think if we had to decide between more allocations and supporting external sample interaction through ActivityListener, we would just drop the later.

Long term we are working with aspnetcore team to use ActivitySource by default. Same thing will need to happen with HttpClient diagnostics. And then anything else still using DiagnosticSource, etc. It is a big, maybe impossible effort, but if we can do that, the ActivitySourceAdapter won't be needed. Our ultimate goal is to not need it.

But really ActivitySourceAdapter and this whole migration effort (IMO) is to work around a strange behavior in the runtime. Let me phrase it as a question: What is the use case for Activity objects having two different flows through ActivityListener? That's basically why we need to do this. Today depending on how you create Activity, some will hit ActivityListener ShouldListenTo + ActivityStarted + ActivityStopped events while others (started through ActivitySource) hit ActivityListener ShouldListenTo + ActivityStarted + ActivityStopped + Sample events.

If everything flowed the same, we wouldn't need the adapter.

So the ask is: Can we modify runtime so all Activity objects flow through ActivityListener.Sample?

Could we make that work in a backward-compatible way? I think so. If using ActivitySource, Sample behavior works as it does right now deciding whether or not to create Activity and then setting the alldatarequested & recorded flags. If using any other mechanism, Activity should always be created, sample result should only impact the alldatarequested & recorded flags.

Just an idea 😄

Alternative idea would be for runtime to give us a method (on ActivitySource) to run an Activity instance through Sample and get back the decision. That's basically what this PR is doing.

@tarekgh
Copy link
Contributor

tarekgh commented Jan 31, 2021

Long term we are working with aspnetcore team to use ActivitySource by default. Same thing will need to happen with HttpClient diagnostics. And then anything else still using DiagnosticSource, etc. It is a big, maybe impossible effort, but if we can do that, the ActivitySourceAdapter won't be needed. Our ultimate goal is to not need it.

I am aware about this discussion but the question would be, does OTel is going to drop the support for the older versions of aspnet and Http Client?

So the ask is: Can we modify runtime so all Activity objects flow through ActivityListener.Sample?

I think the Sampling has to be done before creating the Activity object. This was the intention to avoid the allocation all together. In .NET 5.0 we were focusing more on the future direction. So old behavior just continue to work as it is but we enabled the new features and we didn't miss with the legacy support. As you pointed if users started to use ActivitySource then they wouldn't need to care about the legacy stuff. It is clear ActivityListener.Sample wil work with the activities created by ActivitySource. DiagnosticSource/Listener is the way to listen to old way created Activities and can be free to do whatever sampling they need to do with it.
Yes, we can do something in the runtime but I am seeing this is very low priority things to do compared to the requested work/features in our pipelines.

If using any other mechanism, Activity should always be created, sample result should only impact the alldatarequested & recorded flags.
Alternative idea would be for runtime to give us a method (on ActivitySource) to run an Activity instance through Sample and get back the decision. That's basically what this PR is doing.

I would say instead of complicating the whole thing by adding sampling support for already created Activities, we could just enable setting alldatarequested & recorded flags on the old created activities (even after starting it). this will have the same effect. Anyone interested to set these data there will just need to listen (using the DiagnosticsSource/Listener) and then set it. IMHO running the already created Activities to the sampler just to set these fields would be overkill.

@CodeBlanch
Copy link
Member Author

I would say instead of complicating the whole thing by adding sampling support for already created Activities, we could just enable setting alldatarequested & recorded flags on the old created activities (even after starting it). this will have the same effect. Anyone interested to set these data there will just need to listen (using the DiagnosticsSource/Listener) and then set it. IMHO running the already created Activities to the sampler just to set these fields would be overkill.

Interesting idea. I think we need to sample them because users could have registered all kinds of custom logic that will set those flags based on rules we won't have available? Still attempting to think through it 🤣

@cijothomas Any thoughts?

@@ -80,7 +84,7 @@ public void Start(Activity activity, ActivityKind kind, ActivitySource source)

Copy link
Contributor

@utpilla utpilla Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we achieve this by creating a temporary activity with the given source and use its TraceFlags and IsAllDataRequested information in conjunction with the tracer sampler information to update the activity correctly?

So something like below in the Start method would give us the information to decide the highest level of sampling for the activity.

var tempActivity = source.StartActivity("Test");
var isRecordedByOtherListeners = tempActivity.Recorded;
var isAllDataRequestedByOtherListeners = tempActivity.IsAllDataRequested;
tempActivity.Stop();

We would incur the cost of allocating a new activity object though.

@CodeBlanch CodeBlanch mentioned this pull request Feb 18, 2021
1 task
@CodeBlanch
Copy link
Member Author

Closing because ActivitySourceAdapter was removed. Back to the drawing board!

@CodeBlanch CodeBlanch closed this Feb 28, 2021
@CodeBlanch CodeBlanch deleted the activitysourceadapter-external-sample branch March 31, 2021 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants