Support Exception Groups #37716

AbhiPrasad · 2022-08-11T17:23:53Z

Problem Statement

Typically, we are used to exceptions having a linear relationship to one another (an exception causes another, which causes another). We represent this by allowing folks to send in a list of exceptions for error events.

In several languages though, there is this idea of AggregateExceptions. With AggregateExceptions, we have error relationships where an AggregateException has multiple causes (a 1 to N relationship).

This was initially raised by @mortargrind in getsentry/sentry-javascript#5469, borrowing their example below:

Legend:

Error1 => Error2 means Error1 caused by Error2 (handled by existing Sentry data model)
Error1 [Error2, Error3] means Error1 aggregates Error2 and Error3 (would be handled by the hypothetical AggregatedErrors)

Sample error chain:

HighLevelError => AggregateException [LowLevelError1, LowLevelError2 => LowerLevelError1, LowLevelError3] => LowestLevelError

Typically these AggregateExceptions are produced when you have futures/async await/coroutines, so a set of tasks (and their possible exceptions) getting associated by the callee of the tasks.

There is no way in Sentry to represent the relationship between these errors.

Examples:

UPDATE

This feature is coming soon. Here is the list of items being tracked to complete it.

Sentry Backend & Relay

Give feedback

Sentry UI

Give feedback

feat(exception-group): Display related exceptions in UI #47176

Scope: Frontend
feat(exception-group): Move related exceptions out of the stack trace and sort properly #48167

Scope: Frontend
feat(exception-group): Auto-collapse child exception groups #48527

Scope: Frontend
Options

SDKs

Give feedback

Add support for ExceptionGroups sentry-python#2025
Rework handling of AggregateException to align with Sentry Exception Groups sentry-dotnet#2287
Handle AggregateErrors sentry-javascript#5469

Status: In Progress Type: Improvement
Options

Misc

Give feedback

rfc(feature): Exception Groups rfcs#79
Document SDK requirements for Exception Groups develop#932
Public announcement
Add docs for Exception Groups sentry-docs#7501

1 of 4

Product Area: Issues Team: Web Backend SDKs Team: Web Frontend SDKs Type: Content
Options

The text was updated successfully, but these errors were encountered:

polamjag · 2022-10-09T02:35:48Z

I think golang/go#53435 (comment) (multierror support in standard library) should be an example in Golang

andyljones · 2022-11-15T13:33:22Z

We've started using ExceptionGroups extensively in our codebase and support for this has become important to us. Sentry has become dramatically less useful to us now that both the UI stacktrace and the raw output are missing the original error.

sbv-csis · 2022-12-19T11:47:34Z

We've also recently started using python ExceptionGroups - if even just the raw stack trace contained the nice stringified exceptiongroup stacktrace that would be major improvement

therealarkin · 2023-01-30T22:37:49Z

@andyljones : Can you describe why Sentry has become less useful because of less of support? Thanks!

therealarkin · 2023-01-30T22:38:16Z

@sbv-csis What do you use ExceptionGroups for?

andyljones · 2023-01-31T01:38:54Z

@therealarkin Sentry only displays the top-level error in the tree. This means it hides the origin of any error that rises up as a group.

Here's an example:

import sentry_sdk
import trio

async def f():
    raise ValueError()
   
async def main():
    try:
         async with trio.open_nursery() as nursery:
             nursery.start_soon(f)
    except Exception as e:
        sentry_sdk.capture_exception(e)
        raise

trio.run(main)

This code runs f in a background task, and f raises an error.

The stack trace printed by this shows both the stack for the code that started the background task (main), and the actual background task (f).

  + Exception Group Traceback (most recent call last):
  |   File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3442, in run_code
  |     exec(code_obj, self.user_global_ns, self.user_ns)
  |   File "<ipython-input-10-f738d97c5b2e>", line 15, in <module>
  |     trio.run(main)
  |   File "/root/code/belt/belt/triospy.py", line 498, in run
  |     return _original_trio_run(*args, **kwargs)
  |   File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/trio/_core/_run.py", line 2010, in run
  |     raise runner.main_task_outcome.error
  |   File "<ipython-input-10-f738d97c5b2e>", line 9, in main
  |     async with trio.open_nursery() as nursery:
  |   File "/root/.pyenv/versions/3.10.8/lib/python3.10/site-packages/trio/_core/_run.py", line 850, in __aexit__
  |     raise combined_error_from_nursery
  | trio.NonBaseMultiError: ValueError()
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "<ipython-input-10-f738d97c5b2e>", line 5, in f
    |     raise ValueError()
    | ValueError
    +------------------------------------

Sentry meanwhile only shows where the error came out of the nursery

mattjohnsonpint · 2023-03-16T23:15:34Z

Hi. I'm working on the overall design for how Sentry will handle this, and I would appreciate some feedback on a few things.

A few quick notes first:

I'll use the term "exception group" throughout this, but I'm referring to all platforms that implement similar features, as called out in the top of this issue.
I lead the Sentry .NET SDKs and I'm most familiar with AggregateException in .NET.
We're designing the feature generically. .NET and Python will be implemented first.
Existing workarounds in .NET will likely be removed in favor of this new design. For sake of design consistency, we'll pretend Flatten aggregate exception sentry-dotnet#1672 didn't happen.

The first thing to decide on is what to do with a given exception group. Let's look at a Python example (taken from here):

raise ExceptionGroup("nested",
    [
        ValueError(654),
        ExceptionGroup("imports",
            [
                ImportError("no_such_module"),
                ModuleNotFoundError("another_module"),
            ]
        ),
        TypeError("int"),
    ]
)

  + Exception Group Traceback (most recent call last):
  |   ...
  | ExceptionGroup: nested (3 sub-exceptions)
  +-+---------------- 1 ----------------
    | ValueError: 654
    +---------------- 2 ----------------
    | ExceptionGroup: imports (2 sub-exceptions)
    +-+---------------- 1 ----------------
      | ImportError: no_such_module
      +---------------- 2 ----------------
      | ModuleNotFoundError: another_module
      +------------------------------------
    +---------------- 3 ----------------
    | TypeError: int
    +------------------------------------

What might we expect to see in Sentry? My thinking is that we would want four separate issues raised for ValueError, ImportError, ModuleNotFoundError and TypeError.

The exception groups themselves are less important, however they might have stack traces showing where they were fired, and those might need to be visible on each issue.

Now, if say for example that ImportError also had a chained exception of its own that was a an ExceptionGroup, I don't think that one needs to be expanded. We'd just take the first exception from the group and discard the rest. Reason being - the extra exceptions probably don't create any additional insight into troubleshooting the issue.

Does that make sense?

Also, I think that we should be thinking about an upper limit of how many top-level exceptions within an exception group we'll support. A small number (perhaps 10?) makes sense to me. I'm not sure that we would want hundreds to be sent, which could happen in the case of parallel work.

Would appreciate any other insights you may have. Thanks.

mattjohnsonpint · 2023-03-17T00:26:48Z

Next, assuming that we'd create separate issues as described above, how important would it be for those issues to be linked together in some way? In the example above, is it important that the four issues were raised together? Would you need a way to navigate from the ValueError to the TypeError (for example)?

If this is important, my idea is that we could assign each issue created for an exception group with the same unique identifier in a new field (perhaps group_id). We'd make that field indexed and searchable. Then we could update the UI to provide navigation to the issues list, filtering to all issues with the same group_id.

Or is that less important and we should just focus on getting actionable issues created with the right titles and groupings?

therealarkin · 2023-03-17T00:54:26Z

cc: @andyljones, @sbv-csis pls see @mattjohnsonpint comments above and feel free to chime in!

andyljones · 2023-03-17T06:21:04Z

Would appreciate any other insights you may have. Thanks.

My immediate instinct is to lean in the opposite direction to you - to just raise one ExceptionGroup error for the whole thing. My reason for this is that while you can be smart about dismantling an error group into components that are more semantically meaningful, it seems precarious - while the dismantling will be useful most of the time, the few times it becomes misleading will cost a lot of effort.

I obviously don't know much of Sentry internals, but my expectation is that the 80/20 MVP is just to have an equivalent of Python's ExceptionGroup stack trace in the Sentry UI. That'd take Sentry-with-exception-groups from 'unusable' to 'functional' for me immediately, and then anything on top of that would be a bonus.

mattjohnsonpint · 2023-03-17T17:27:41Z

@andyljones - If we were to go that route instead, how would you expect the issue to be titled on the issues list in Sentry?

Keep in mind:

We'd want to avoid having multiple unrelated issues all titled the same, such as "ExceptionGroup: nested (3 sub-exceptions)"
We'd want Sentry's issue grouping to work in a sensible manner, such that the quantity or order of sub-exceptions doesn't affect the way we present multiple events as a single issue.

andyljones · 2023-03-19T19:37:58Z

We'd want to avoid having multiple unrelated issues all titled the same,

I think my aesthetic is that this is fine? My feeling is that this is something to be dealt with by a combination of (a) users raising informative exception types from exception groups (b) maybe (maybe?) some kind of improved grouping system that can match on nested errors.

I'm being lead here by how exception group handling is done in the language itself.

mattjohnsonpint · 2023-03-24T06:58:40Z

Thanks for the feedback. Hopefully we can come up with a solution that will accurately reflect the way the exception group is handled in the language, and also fit in to the expected workflow for issues in Sentry.

I have a few more questions. We'll start with this one:

In .NET, an AggregateException inherits from Exception. All Exception types have a nullable InnerException property that when populated indicates the cause of the exception. With AggregateException, we also have an InnerExceptions array. In all cases of AggregateException, the InnerException will match InnerExceptions[0].

However in Python, things are slightly different. On all exceptions, there is both a __context__ and a __cause__ attribute. __context__ is set if an exception is raised while handling an exception. __cause__ is set if an exception is raised from another exception.

try:
    try:
        raise Exception("1")
    except Exception as e1:
        e2 = Exception("2")
        raise Exception("3") from e2
except Exception as e:
    print (e.__context__) # 1
    print (e.__cause__)   # 2
    print (e)             # 3

Both can exist on a single exception, though context is suppressed in traceback output if cause has been set.

.NET's InnerException is closest to Python's __cause__. .NET doesn't have the concept of "context".

So far so good. The interesting part is that with a Python ExceptionGroup, none of the provided exceptions are considered to be either the context or cause.

try:
    e1 = Exception("1")
    e2 = Exception("2")
    raise ExceptionGroup("group", [e1, e2])
except ExceptionGroup as eg:
    print (eg.__context__) # None
    print (eg.__cause__)   # None
    print (eg.exceptions)  # (Exception('1'), Exception('2'))

So in theory, one could have completely different context or cause than the exceptions in the group.

e1 = Exception("1")
e2 = Exception("2")
e3 = Exception("3")
e4 = Exception("4")

try:
    try:
        raise e1
    except:
        raise ExceptionGroup("group", [e3, e4]) from e2
except ExceptionGroup as eg:
    print (eg.__context__) # 1
    print (eg.__cause__)   # 2
    print (eg.exceptions)  # (Exception('3'), Exception('4'))

I'm not sure how to rationalize this. Does it mean that exceptions 3 and 4 were also contributing to the cause of the exception group? If we could only pick one exception to show as the next item in the chain, which would it be?

The Sentry Python SDK currently does the following:

if exc_value.__suppress_context__:
    cause = exc_value.__cause__
else:
    cause = exc_value.__context__

Should we simply expand that and say if there's no cause or context, use the first exception of the exception group? (That would align with .NET InnerException = InnerExceptions[0].). If so, which should take precedence?

antonpirker · 2023-03-24T09:23:49Z

I think there are two questions here for Python:

What exception in an exception group to pick as "the problem to fix" of the exception group
How to deal with chained exceptions in exception groups.

For 1) I think we just pick the first exception in eg.exeptions and see if the users complain that this is not what they wan.t

For 2) I guess we can ignore the eg.__context__ and eg.__cause because we are just presenting the user with one Exception and not the entire ExceptionGroup. And possible cause of the one Exception should be handled the same way as we deal now with chained exceptions.

This of course just the first step. To get some handling of ExceptionGroup out there and then we can collect feedback and improve the solution over time.

mattjohnsonpint · 2023-03-24T18:26:18Z

@antonpirker - Are you suggesting that __cause__ always be ignored on an ExceptionGroup, and we'd use exceptions[0] instead? Or should we use exceptions[0] only if __cause__ is None?

We need to pick a path, even if exception groups are located further down the chain. For example:

try:
    raise ValueError
except Exception as e1:
    try:
        raise ExceptionGroup("group", [Exception('A'), Exception('B')]) from e1
    except Exception as e2:
        raise TypeError from e2

Python will give the following output:

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    raise ValueError
ValueError

The above exception was the direct cause of the following exception:

  + Exception Group Traceback (most recent call last):
  |   File "test.py", line 5, in <module>
  |     raise ExceptionGroup("group", [Exception('A'), Exception('B')]) from e1
  | ExceptionGroup: group (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | Exception: A
    +---------------- 2 ----------------
    | Exception: B
    +------------------------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    raise TypeError from e2
TypeError

So in that case, the exception group is in the middle of the chain, and there's already a cause.

✅ The issue in Sentry would be for the TypeError, because it's the most recent exception. It would be the first exception shown on the issue (with the default sorting).
✅ The second exception in the chain shown in Sentry would be fore the ExceptionGroup, because it was the direct cause of the TypeError
❓ What is the third exception in the chain to show on the Sentry issue? Is it ValueError because it's the direct cause of the ExceptionGroup? Or is it Exception: A because it's the first item in the exception group? Does our answer change if __cause__ is None?

Answer that will help us understand how to build the exception tree in Python. One we figure that part out, I think I have a working plan for the rest. That would be as follows:

UPDATE: the below has been redacted because we've significantly deviated from this plan. Please see RFC 79 for details on the updated plan.

Phase 1 (what we'll commit to in the near term):

Issues in Sentry will be created for the "first path" through the tree of exceptions. In other words, by traversing from the root node to the first child until reaching a leaf. All other nodes of the tree would be discarded.

To handle issue titles and grouping, an SDK option will be provided to decide whether to keep the exceptions that are of the exception group type or not. The option name can be language specific (KeepAggregateExceptions in .NET, keep_exception_groups in Python, etc.).

The default should be false, meaning to discard the newest exception groups in the chain until reaching an exception that is not an exception group. (Any others further along in the chain would remain.)

Setting to true would opt-in to having issues raised for the entire chain, so the title would be similar to "ExceptionGroup: nested (3 sub-exceptions)" or "System.AggregateException: One or more errors occurred. (Test 1) (Test 2)", etc.

Elsewhere on the event we will include the entire exception group (as an extra, or message, or attachment, TBD). This will allow two things to happen:

When looking at the "first path" issue, you'll still be able to see some representation of the entire exception group in the Sentry UI.

It provides data we can use for Phase 2 (below).

Phase 2 (for future consideration - not committing to this just yet):

Sentry can run a background job when processing issues that will look for the exception group extra data.

If it finds any, and there's more than one top-level exception that should be its own issue, then it can split it apart, creating new issues. (It would clone the issue and replace the title and stack traces sections with the appropriate information from that part of the exception group tree.)

The net result would be that the SDKs would only send data for the one event, and that could create multiple issues in Sentry depending on what real exceptions were at the top-level in the exception groups.

mattjohnsonpint · 2023-03-24T18:47:43Z

Some further discussion on the Python specifics here: getsentry/sentry-python#1788 (comment)

mattjohnsonpint · 2023-03-26T23:16:48Z

Hi all. I have drafted an RFC with the proposed changes. Please review, and feel free to leave feedback on that PR (or discuss here, if preferred).
https://github.com/getsentry/rfcs/blob/main/text/0079-exception-groups.md

Thanks!

mattjohnsonpint · 2023-04-01T04:26:40Z

Hi everyone. After significant discussion internally, we've revised the plan yet again. The RFC is updated with the new approach. Thanks.

https://github.com/getsentry/rfcs/blob/main/text/0079-exception-groups.md

mattjohnsonpint · 2023-05-05T22:46:42Z

We're getting there! See the UPDATE section at top of this issue for current status. Thanks!

Implements issue grouping and titling changes required for processing exception groups per [Sentry RFC 79](https://github.com/getsentry/rfcs/blob/main/text/0079-exception-groups.md#sentry-issue-grouping). Part of #37716 A few notes: - The issue titling requirements were implemented by dropping a `main_exception_id` on the event data in the exception grouping logic, and then later using it in the `ErrorEvent.extract_metadata` function. If present, it uses the corresponding exception for metadata including the title. This is working well, but just wanted to check if this is safe, or if there's a better way. - In the RFC, I had indicated that disparate top-level exceptions would be grouped under their immediate parent exception, which might not necessarily be the root exception. This turned out to be difficult to implement, and didn't add much value. Instead, I always use the root exception. This seems to work well enough. I included a comment at the appropriate place in the code, in case it comes up in the future. - I only modified the `NewStyle` strategy. I'm not sure if `Legacy` or other strategies should be updated as well? - I fixed a related bug in `exception.py`, which was that the first exception was being used to create default issue tags instead of the last. This should be done regardless of exception groups, as it corrects the `handled` and `mechanism` event tags such that they are for the outermost (latest) exception. Tests are updated to match this change as well.

mattjohnsonpint · 2023-05-23T13:36:26Z

This feature is now available for SaaS (sentry.io) customers. It will also be included in the next Sentry Self-Hosted release (23.6.0, est. June 2023).

On the client side, it requires a supporting SDK. Currently this is supported in:

JavaScript will be coming soon.

Implements issue grouping and titling changes required for processing exception groups per [Sentry RFC 79](https://github.com/getsentry/rfcs/blob/main/text/0079-exception-groups.md#sentry-issue-grouping). Part of #37716 A few notes: - The issue titling requirements were implemented by dropping a `main_exception_id` on the event data in the exception grouping logic, and then later using it in the `ErrorEvent.extract_metadata` function. If present, it uses the corresponding exception for metadata including the title. This is working well, but just wanted to check if this is safe, or if there's a better way. - In the RFC, I had indicated that disparate top-level exceptions would be grouped under their immediate parent exception, which might not necessarily be the root exception. This turned out to be difficult to implement, and didn't add much value. Instead, I always use the root exception. This seems to work well enough. I included a comment at the appropriate place in the code, in case it comes up in the future. - I only modified the `NewStyle` strategy. I'm not sure if `Legacy` or other strategies should be updated as well? - I fixed a related bug in `exception.py`, which was that the first exception was being used to create default issue tags instead of the last. This should be done regardless of exception groups, as it corrects the `handled` and `mechanism` event tags such that they are for the outermost (latest) exception. Tests are updated to match this change as well.

AbhiPrasad · 2023-07-25T13:58:53Z

Closing this issue as we've added the necessary JS implementation here: getsentry/sentry-javascript#5469

We now have getsentry/sentry-docs#7501 and getsentry/develop#932 for docs, but that can be tracked separately there.

AbhiPrasad added Status: Backlog Feature: Errors labels Aug 11, 2022

AbhiPrasad mentioned this issue Aug 11, 2022

Handle AggregateErrors getsentry/sentry-javascript#5469

Closed

rodolfoBee mentioned this issue Dec 16, 2022

Add support for ExceptionGroup getsentry/sentry-python#1788

Closed

therealarkin assigned therealarkin and logicalbomb and unassigned therealarkin Jan 30, 2023

mattjohnsonpint mentioned this issue Mar 17, 2023

Rework handling of AggregateException getsentry/sentry-dotnet#2240

Closed

mattjohnsonpint changed the title ~~Support AggregateExceptions~~ Support Exception Groups Mar 24, 2023

mattjohnsonpint added Status: In Progress and removed Status: Backlog labels Mar 25, 2023

mattjohnsonpint mentioned this issue May 5, 2023

feat(grouping): Exception Groups #48653

Merged

mattjohnsonpint assigned mattjohnsonpint and unassigned logicalbomb May 5, 2023

mattjohnsonpint mentioned this issue May 22, 2023

Rework handling of AggregateException to align with Sentry Exception Groups getsentry/sentry-dotnet#2287

Merged

mattjohnsonpint removed their assignment Jun 20, 2023

lforst self-assigned this Jul 4, 2023

github-actions bot removed the Status: In Progress label Jul 20, 2023

AbhiPrasad mentioned this issue Jul 25, 2023

Add docs for Exception Groups getsentry/sentry-docs#7501

Open

AbhiPrasad closed this as completed Jul 25, 2023

github-actions bot locked and limited conversation to collaborators Aug 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Exception Groups #37716

Support Exception Groups #37716

AbhiPrasad commented Aug 11, 2022 •

edited

Loading

Sentry Backend & Relay

Sentry UI

SDKs

Misc

polamjag commented Oct 9, 2022

andyljones commented Nov 15, 2022

sbv-csis commented Dec 19, 2022

therealarkin commented Jan 30, 2023

therealarkin commented Jan 30, 2023

andyljones commented Jan 31, 2023

mattjohnsonpint commented Mar 16, 2023 •

edited

Loading

mattjohnsonpint commented Mar 17, 2023

therealarkin commented Mar 17, 2023

andyljones commented Mar 17, 2023

mattjohnsonpint commented Mar 17, 2023

andyljones commented Mar 19, 2023

mattjohnsonpint commented Mar 24, 2023

antonpirker commented Mar 24, 2023

mattjohnsonpint commented Mar 24, 2023 •

edited

Loading

Phase 1 (what we'll commit to in the near term):

Phase 2 (for future consideration - not committing to this just yet):

mattjohnsonpint commented Mar 24, 2023

mattjohnsonpint commented Mar 26, 2023 •

edited

Loading

mattjohnsonpint commented Apr 1, 2023 •

edited

Loading

mattjohnsonpint commented May 5, 2023

mattjohnsonpint commented May 23, 2023

AbhiPrasad commented Jul 25, 2023

Support Exception Groups #37716

Support Exception Groups #37716

Comments

AbhiPrasad commented Aug 11, 2022 • edited Loading

Problem Statement

UPDATE

Sentry Backend & Relay

Sentry UI

SDKs

Misc

polamjag commented Oct 9, 2022

andyljones commented Nov 15, 2022

sbv-csis commented Dec 19, 2022

therealarkin commented Jan 30, 2023

therealarkin commented Jan 30, 2023

andyljones commented Jan 31, 2023

mattjohnsonpint commented Mar 16, 2023 • edited Loading

mattjohnsonpint commented Mar 17, 2023

therealarkin commented Mar 17, 2023

andyljones commented Mar 17, 2023

mattjohnsonpint commented Mar 17, 2023

andyljones commented Mar 19, 2023

mattjohnsonpint commented Mar 24, 2023

antonpirker commented Mar 24, 2023

mattjohnsonpint commented Mar 24, 2023 • edited Loading

Phase 1 (what we'll commit to in the near term):

Phase 2 (for future consideration - not committing to this just yet):

mattjohnsonpint commented Mar 24, 2023

mattjohnsonpint commented Mar 26, 2023 • edited Loading

mattjohnsonpint commented Apr 1, 2023 • edited Loading

mattjohnsonpint commented May 5, 2023

mattjohnsonpint commented May 23, 2023

AbhiPrasad commented Jul 25, 2023

AbhiPrasad commented Aug 11, 2022 •

edited

Loading

mattjohnsonpint commented Mar 16, 2023 •

edited

Loading

mattjohnsonpint commented Mar 24, 2023 •

edited

Loading

mattjohnsonpint commented Mar 26, 2023 •

edited

Loading

mattjohnsonpint commented Apr 1, 2023 •

edited

Loading