Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Activity should have AddExceptionMethod #53641

Closed
DR9885 opened this issue Jun 2, 2021 · 61 comments · Fixed by #102905
Closed

Activity should have AddExceptionMethod #53641

DR9885 opened this issue Jun 2, 2021 · 61 comments · Fixed by #102905
Labels
api-approved API was approved in API review, it can be implemented area-System.Diagnostics.Activity blocking Marks issues that we want to fast track in order to unblock other important work feature-request in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@DR9885
Copy link

DR9885 commented Jun 2, 2021

Edited by @tarekgh

Proposal

This proposal aims to enable the recording of exceptions during tracing operations, in accordance with the OpenTelemetry Record Exception specification

namespace System.Diagnostics;

public class Activity
{
    public void RecordException(Exception exception, TagList tags = default, DateTimeOffset timestamp = default);
}

public delegate void ExceptionRecording(Activity activity, Exception exception, ref TagList tags);

public sealed class ActivityListener : IDisposable
{
    // Will be invoked when Activity.RecordException is called.
    public ExceptionRecording? ExceptionRecording { get; set; }
}

Notes

  • RecordException will save details of the exception into an ActivityEvent entry, adhering to the guidelines of the OpenTelemetry semantic convention.
  • RecordException captures information such as the stack trace, exception type name, and exception message, but not the exception object itself. This approach is chosen because the exception might be rethrown, potentially altering the stack trace. Additionally, exception objects can be large and complex, leading to performance issues.
  • Anyone interested in accessing the actual exception object during recording can use ActivityListener.ExceptionRecording, which provides a notification that includes the exception objects along with the tags to be recorded. When listeners are enabled, the behavior of RecordException will be as follows:
    • RecordException will iterate through all non-nullable ExceptionRecording listeners one by one, passing the exception object and the input tags list provided to RecordException. Listeners can add or modify tags as necessary.
    • If multiple registered listeners modify the same tags, the changes made by the last listener will prevail and be recorded in the event.
    • After calling all listeners, RecordException adds the populated event to the Activity.Events list. If any tags specified in the OpenTelemetry semantic convention are missing, they will be added to the tag list before the event is stored.
  • RecordException will always log the event, and there will be no option to suppress its addition.

The recorded event will have the name as exception and will include the following tags:


Usage Example

    using ActivitySource aSource =  new ActivitySource("SourceActivityListener");
    
    using ActivityListener listener = new ActivityListener();
    listener.ShouldListenTo = (activitySource) => object.ReferenceEquals(aSource, activitySource);
    listener.Sample = (ref ActivityCreationOptions<ActivityContext> activityOptions) => ActivitySamplingResult.AllData;
    listener.ExceptionRecording = (activity, exception, ref tags) => { /* access the exception and the tags */  };
    ActivitySource.AddActivityListener(listener);

    using Activity activity = aSource.StartActivity("AllDataRequestedActivity");
    activity.RecordException(new Exception("Exception to be recorded"), new TagList { { "exception.escaped", "true" } }, DateTimeOffset.UtcNow );
    
    ActivityEvent exceptionEvent = activity.Events.First();

Original Proposal

Activity should have AddExceptionMethod, rather than having to set these tags everytime.

        public static Activity AddException(this Activity activity, Exception ex)
        {
            return activity?.AddTag("event", "error")
                .AddTag("error", "true")
                .AddTag("error.kind", ex.GetType().Name)
                .AddTag("error.message", ex.Message)
                .AddTag("error.stack", ex.StackTrace);
        }

This should also have constants for these strings, for performance.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Diagnostics.Tracing untriaged New issue has not been triaged by the area owner labels Jun 2, 2021
@ghost
Copy link

ghost commented Jun 2, 2021

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.

Issue Details

Activity should have AddExceptionMethod, rather than having to set these tags everytime.

        public static Activity AddException(this Activity activity, Exception ex)
        {
            return activity?.AddTag("event", "error")
                .AddTag("error", "true")
                .AddTag("error.kind", ex.GetType().Name)
                .AddTag("error.message", ex.Message)
                .AddTag("error.stack", ex.StackTrace);
        }

This should also have constants for these strings, for performance.

Author: DR9885
Assignees: -
Labels:

area-System.Diagnostics.Tracing, untriaged

Milestone: -

@ghost
Copy link

ghost commented Jun 2, 2021

Tagging subscribers to this area: @tarekgh
See info in area-owners.md if you want to be subscribed.

Issue Details

Activity should have AddExceptionMethod, rather than having to set these tags everytime.

        public static Activity AddException(this Activity activity, Exception ex)
        {
            return activity?.AddTag("event", "error")
                .AddTag("error", "true")
                .AddTag("error.kind", ex.GetType().Name)
                .AddTag("error.message", ex.Message)
                .AddTag("error.stack", ex.StackTrace);
        }

This should also have constants for these strings, for performance.

Author: DR9885
Assignees: -
Labels:

area-System.Diagnostics.Activity

Milestone: -

@tarekgh
Copy link
Member

tarekgh commented Jun 2, 2021

Why can't use Activity.SetCustomProperty?

Something like:

activity.SetCustomProperty("exception", exceptionObject);

In general, for error reporting we followed up OpenTelemetry specs and provided Activity.SetStatus. So, you can also do:

activity.SetStatus(ActivityStatusCode.Error, exceptionObject.ToString());

I don't think adding a method for exception will be desirable for most scenarios as wouldn't be portable.

CC @noahfalk

@noahfalk
Copy link
Member

noahfalk commented Jun 2, 2021

There is a semantic convention for exceptions but it is currently marked experimental. I do not recommend we codify it into the BCL until it is more stable, but once it was stable AddException() seems like a useful API.

@DR9885
Copy link
Author

DR9885 commented Jun 9, 2021

Theres also a few tags listed on microsoft docs:

otel.status_code
otel.status_description

https://docs.microsoft.com/en-us/dotnet/core/diagnostics/distributed-tracing-instrumentation-walkthroughs#optional-add-status

... However ... I Just realized this is done by SetStatus, so maybe redundant to an earlier comment

Based on all comments though, It might be better if SetStatus did all of this:

  public static Activity SetStatus(this Activity activity, Status status, Exception ex = null)
        {
            return activity?.SetTag("event", "error")
                .SetTag("otel.status_code", status.ToString())
                .SetTag("otel.status_description", ex.Message)
                .SetTag("error", "true")
                .SetTag("error.kind", ex.GetType().Name)
                .SetTag("error.message", ex.Message)
                .SetTag("error.stack", ex.StackTrace);
        }

@CodeBlanch
Copy link
Contributor

@DR9885 Just to clarify this a bit, the OTel spec doesn't say to add tags for an exception, it says to add an event (with tags). Here is the extension method we have in the OTel.NET library for this: https://github.com/open-telemetry/opentelemetry-dotnet/blob/4c22707a98abf356fa4224ebbc872f3bad0d72c6/src/OpenTelemetry.Api/Trace/ActivityExtensions.cs#L95

But yes it would be great to have this API on activity so that users don't need the OpenTelemetry.Api dependency.

I was thinking the .NET API would be something simple like...

class Activity
{
   public Exception? Exception { get; set; }
}

...or...

class Activity
{
   public Exception? Exception { get; }

   public void RecordException(Exception exception) { ... }
}

And then the OpenTelemetry library (or anyone interested) could worry about what to do with it.

/cc @cijothomas

@cijothomas
Copy link
Contributor

Related issue : open-telemetry/opentelemetry-dotnet-contrib#1794
(There are privacy concerns about storing stack as activity event)

@tarekgh
Copy link
Member

tarekgh commented Mar 16, 2023

Reopen to track adding the API in the future.

@jamescrosswell
Copy link

Just adding my vote - would love to be able to see Exception(s) on the Activity somewhere. It's possible for multiple Exceptions to be thrown, caught and recorded for a single Activity right, so maybe it makes sense for this to live on Events within the Activity rather than directly on the Activity?

@tarekgh
Copy link
Member

tarekgh commented May 7, 2024

Linking related PR.

open-telemetry/semantic-conventions#970

@jamescrosswell
Copy link

Linking related PR.

open-telemetry/semantic-conventions#970

@tarekgh it's nice to see support for exceptions as Events. However it looks like the intention is still to "flatten" the stack trace to a string, which means a loss of information. There's no easy way to take that string and match it up with debug files or source maps to augment the telemetry with additional context, when it's available.

It would be amazing if there was some kind of OnRecordException callback that we could register to intercept the original exception... or possibly an Event.TryGetOriginalException method... any way to get the original exception really since what is available via OTel currently isn't sufficient for many use cases.

@KalleOlaviNiemitalo
Copy link

@jamescrosswell, what kind of debug files are you using that Exception.StackTrace does not parse already?

@jamescrosswell
Copy link

@jamescrosswell, what kind of debug files are you using that Exception.StackTrace does not parse already?

One example of something we do with raw exceptions is to generate enhanced stack frames using Ben.Demystifier.

There are plenty of other examples but the TLDR; is that StackTrace.ToString() is insufficient. We need structured exception data to work with.

@KalleOlaviNiemitalo
Copy link

I wonder what would happen if AddException just saved the Exception reference with the "exception.stacktrace" tag name. Would telemetry libraries implicitly call Exception.ToString() and thus get the OpenTelemetry-specified content?

@tarekgh
Copy link
Member

tarekgh commented May 8, 2024

This is the same direction I am thinking in. Is to store the exception object in the event. The only thing raised and not addressed is how to handle the rethrow case as the stack can change after recording it. This is what I was trying to get some feedback on how common this scenario or should we care much about it? I know there is some ideas in this issue to expose some callback in the activity listener to allow users get the notification when recording the exception.

@jamescrosswell
Copy link

jamescrosswell commented May 8, 2024

The only thing raised and not addressed is how to handle the rethrow case as the stack can change after recording it. This is what I was trying to get some feedback on how common this scenario or should we care much about it?

For our part, we only care about the first throw. Our SDK has some duplicate exception logic built into it to explicitly ignore exceptions that have already been processed. We're a bit of a special case though since we do crash reporting. We might see the exception when it's first caught (e.g. via Activity.RecordException) and then if it gets re-thrown (in order to ensure particular application logic) we might see it again via a global exception handler. When we get it via the global exception handler we know we've already processed it, so we just ignore...

There are other variants of this, like where the exception gets captured, wrapped in an AggregateException and re-thrown. Again, we just filter those out via duplicate exception detection.

There may be situations where people genuinely care about the re-thrown exception but I haven't bumped into any yet.

@tarekgh
Copy link
Member

tarekgh commented May 9, 2024

@jamescrosswell thanks, that is helpful.

Could you elaborate more on how Activity.RecordException will be used in your scenario?

@jamescrosswell
Copy link

Could you elaborate more on how Activity.RecordException will be used in your scenario?

Sentry has a number of capabilities. One of them is tracing and one of them is crash reporting. For crash reporting, there are a couple of different ways we can "detect" and capture the details of an exception when one occurs but, if people are using OpenTelemetry to instrument their applications, one of them would be this:

using (var activity = MyActivitySource.StartActivity("Foo"))
{
    try
    {
        Func();
    }
    catch (SomeException ex)
    {
        activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
        activity?.RecordException(ex);
    }
}

At the moment, when exceptions are captured in that way, we only have a subset of the information that we actually need when an exception is captured in order to be able to provide users with full context for the exception that has been detected.

By contrast, if people capture exceptions using the Sentry APIs (rather than the OpenTelemetry APIs), we can give them a whole bunch of extra information. We'd love to be able to provide equivalent functionality if people are instrumenting their applications using OpenTelemetry, but we can only do that if we have access to the original exception (not just StackTrace.ToString())

    try {
        // ... do some dangerous stuff
    }
    catch(Exception e) {
        // This captures the exception with a bunch of other context and sends it to a dashboard where users
        // can see other similar exceptions. They can upload debug files and source maps to Sentry that we use
        // to show the exact line of code in their application that threw the exception and we can even tell them 
        // the commit where that line of code was last changed - so potentially the PR that caused a bug
        SentrySdk.CaptureException(e); // <- This is our vendor specific equivalent of RecordException... 
    }

For our particular use case, the exception doesn't have to be stored as a property on the Activity or as an Activity.Event... we just need some way to be notified when an Exception occurs (e.g. via a callback). I guess, if we wanted, we could always use such a callback to record whatever exception was detected either on the Activity.Tags or ActivityEvent.Tags.

If you provided some kind of callback then, you'd give people the flexibility to be able to record or not record the exception against the Activity/Event (or to store some summary/transformation of that, if that's all they required).

@KalleOlaviNiemitalo
Copy link

some way to be notified when an Exception occurs

There is the AppDomain.FirstChanceException event, but it seems difficult to guarantee that the event handler won't throw more exceptions and cause a stack overflow, especially now that .NET doesn't support constrained execution regions (CER).

@jamescrosswell
Copy link

jamescrosswell commented May 9, 2024

There is the AppDomain.FirstChanceException event

Hm, interesting idea 🤔 . I see we already use that here... as a last resort that might work then, but I'd definitely rather get at the exception via the Activity, if that was possible.

@tarekgh
Copy link
Member

tarekgh commented May 21, 2024

I have updated the proposal and added some notes at the top. Please have a look and let me know if you have any feedback before we proceed with that proposal. Thanks!

@jamescrosswell
Copy link

Anyone interested in accessing the actual exception object during recording can use ActivityListener.ExceptionRecording, which provides a notification that includes the exception objects along with the tags to be recorded.

Yay! This would be fantastic ❤️

@tarekgh tarekgh added blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation labels May 24, 2024
@terrajobst
Copy link
Member

terrajobst commented May 28, 2024

Video

  • We prefer the er suffix for the delegate
  • We should use Add as the prefix as that's consistent with other methods on Activity
  • We concluded we don't need to make the parameters nullable
namespace System.Diagnostics;

public partial class Activity
{
    public void AddException(Exception exception, TagList tags = default, DateTimeOffset timestamp = default);
}

public delegate void ExceptionRecorder(Activity activity, Exception exception, ref TagList tags);

public partial class ActivityListener : IDisposable
{
    public ExceptionRecorder? ExceptionRecorder { get; set; }
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels May 28, 2024
@tarekgh
Copy link
Member

tarekgh commented May 30, 2024

@terrajobst @bartonjs, as we have decided to use the method name AddException instead of RecordException, and considering the pattern of other methods in the Activity class, the Add methods always return the Activity instead of void. I will follow this pattern and make AddException return Activity instead of void. Please let me know if you have any concerns about this. Thanks!

public Activity AddTag(string key, string? value) => AddTag(key, (object?)value);



public Activity AddBaggage(string key, string? value)

@terrajobst
Copy link
Member

Returning Activity makes sense, so:

namespace System.Diagnostics;

public partial class Activity
{
    public Activity AddException(Exception exception, TagList tags = default, DateTimeOffset timestamp = default);
}

public delegate void ExceptionRecorder(Activity activity, Exception exception, ref TagList tags);

public partial class ActivityListener : IDisposable
{
    public ExceptionRecorder? ExceptionRecorder { get; set; }
}

@tarekgh
Copy link
Member

tarekgh commented Jun 3, 2024

@terrajobst @bartonjs sorry to ping again, there was a suggestion to mark the tags parameter with in keyword to avoid copying the TagList struct. what do you think?

public Activity AddException(Exception exception, in TagList tags = default, DateTimeOffset timestamp = default);

@CodeBlanch

@bartonjs
Copy link
Member

bartonjs commented Jun 3, 2024

My initial thought is "using in goes against our guidelines". But that sounds familiar for this, and looking at the ref.cs it does appear that this struct is only ever passed as ref or in; so in would be consistent here, I guess.

@tarekgh
Copy link
Member

tarekgh commented Jun 3, 2024

Thanks @bartonjs!

I'll go ahead and apply that if @terrajobst has no objection.

@tarekgh
Copy link
Member

tarekgh commented Jun 3, 2024

I created the PR #103009 to update the tags parameter modifier.

@KalleOlaviNiemitalo
Copy link

It seems ActivitySource can call the ExceptionRecorder delegate multiple times with the same arguments, if another thread adds or removes listeners in parallel with Activity.AddException. This is more surprising for ExceptionRecorder, which is designed to have side effects on the TagList, than for ShouldListenTo. It will probably be OK but it should preferably be documented.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Diagnostics.Activity blocking Marks issues that we want to fast track in order to unblock other important work feature-request in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.