Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that our docs present strict_exception_groups=True as the natural default #2929

Closed
Zac-HD opened this issue Jan 22, 2024 · 7 comments · Fixed by #2984
Closed

Ensure that our docs present strict_exception_groups=True as the natural default #2929

Zac-HD opened this issue Jan 22, 2024 · 7 comments · Fixed by #2984
Labels

Comments

@Zac-HD
Copy link
Member

Zac-HD commented Jan 22, 2024

Follow-up for #2886 (comment):

I think that we'll eventually want to rewrite this whole section from the new strict-by-default perspective, under the heading "Deprecated: non-strict ExceptionGroups" - to explain that it only exists for backwards-compatibility, will be removed in future, and that we recommend against it for all new code.

We should also skim through other mentions of ExceptionGroup to ensure that our messaging is consistent across all the docs.

@Zac-HD
Copy link
Member Author

Zac-HD commented Mar 13, 2024

Relatedly, we should document any design patterns we like or dislike for interfaces with an "inner nursery", c.f. python-trio/trio-websocket#132

The basic problems are that (1) forcing users to handle errors which may or may not be an ExceptionGroup has horrible ergonomics, and that (2) usually one or the other is more common, so it's really easy to end up with latent bugs from only handling the common kind. That's why we made strict_exception_groups=True the default!

Unfortunately, this also means that using a nursery internally means that error handling doesn't quite follow the black box rule - you can still catch an ungrouped exception with except*:, but most code has good reason to prefer except:.

Example patterns:

  1. Pass through the ExceptionGroup, letting users handle it.
    • Pros: this is the default behavior, so it's trivial to implement
    • Cons: forcing users to deal with maybe-a-group is unergonomic and constrains internal changes
  2. raise SomeInternalError(...) from group
    • Pros: reasonably simple; easy to implement with a context manager
    • Cons: users still have to deal with maybe getting a SomeInternalError instead of a group, so it's not clear that we've actually gained much - except Exception: will catch an ExceptionGroup directly, after all.
  3. Raise the contained exception, if there's only one, else raise a group
    • Pros: status quo ante
    • Cons: see above for why we deprecated this approach
  4. Raise the contained exception, if there's only one, else raise PleaseReportThisBugError
    • Pros: hides the implementation detail of an inner nursery
    • Cons: relatively few design patterns are incapable of raising concurrent errors without a bug.
  5. Avoid using an inner nursery
    • Pros: avoids the problem entirely
    • Cons: ...by avoiding concurrency. Good advice when it works but can't be our only solution.
  6. [requires new code] Offer a nursery mode where exceptions in background tasks raise SomeInternalError(...) from group but exceptions in the body of the nursery block are raised directly.
    • if an exception in the body triggers an error when a background task is cancelled, you'd get a SomeInternalError from ExceptionGroup(...) from body-error. Verbose but clear IMO.
    • Pros: allows you to implement a context manager which doesn't change the exception semantics of wrapped code.
  7. As (6), but implement downstream by carefully wrapping tasks in exception-catching proxies
    • Pros: nice semantics; doesn't require changes in Trio itself, and if useful this would justify work on (6)
    • Cons: tricky to implement, probably worse than anything other than (6) itself

...and that's probably a decent outline for the section, I suppose.

@belm0
Copy link
Member

belm0 commented Mar 28, 2024

It's scary that these approaches to the nursery-as-implementation-detail problem haven't been fleshed out before committing to strict exception groups.

I think the best practice for avoiding concurrent error headaches is to minimize their scope, converging on a winning exception in a local context wherever reasonable. Up until now, I've only had to deal with that in a few places in our app or libraries where a concurrent error might leak. Now every single use of a nursery (including as implementation detail by 3rd party libraries) needs to be bandaged over. It's a large blow to trio ergonomics.

(6) sounds interesting, but I'm not sure if the body/background distinction is sufficient. I'd have to see it and try to apply it.

The careful wrapping of API's alluded to in (7) has been the approach in my application all along, and has worked fine. For example, see multi_error_defer_to() and defer_to_cancelled() in trio-util. Notably, these can be decorators on API async functions. (We've only needed @defer_to_cancelled in our codebase.)

@belm0
Copy link
Member

belm0 commented Mar 29, 2024

I think it's related to the patterns discussion, so I'll share thoughts on how I might migrate trio-websocket to trio's new regime.

This library seems like it would be a simple enough case-- no async package dependencies, doesn't expose concurrent errors in its API-- yet how to reach the goal isn't obvious. The points to watch for:

  • exception catching within the implementation - anywhere there is an except: within an async context, it might need to be adjusted to handle an exception group. It's hard to know from just reading the code, since nursery use may be an implementation detail of any call or context manager-- as we are aware. (We'd need to rely on diligent documentation on every async definition in the world regarding potentially raising a transitive ExceptionGroup.)
  • maintaining the API's intention to not raise concurrent exceptions - there may be cases where nurseries are used internally, but the usage is such that concurrent exceptions don't happen. Now with strict exception groups, ExceptionGroup might be raised anyway for a lone exception, violating the API. There isn't a straightforward way to locate what parts of the code need adjustment.

Between these, the latter has me more concerned. At least with the former, the code to review is clearly marked with except.

So, along the lines of pattern (2), I might make a context manager that translates ExceptionGroup into an internal error, unless the group consists entirely of Cancelled. Then hope by way of unit tests, integration tests, and use in the field that any internal errors might be reported, and the holes eventually patched up.

Regarding pattern (2):

users still have to deal with maybe getting a SomeInternalError instead of a group, so it's not clear that we've actually gained much - except Exception: will catch an ExceptionGroup directly, after all

No one should be explicity catching an internal error of an API (a general catch is fine). I guess here I'm equating "internal error" with "error that the API does not expect to raise", so it's a bug to be fixed. It's important to use a dedicated error and make it clear that a bug should be reported.

@Zac-HD
Copy link
Member Author

Zac-HD commented Mar 31, 2024

It's scary that these approaches to the nursery-as-implementation-detail problem haven't been fleshed out before committing to strict exception groups.

I'm not delighted about this gap either, but #2785 (comment) explains why I think we had to do it: strict is the standard semantics across asyncio and anyio, and mixing strict and non-strict is a recipe for bugs as authors neglect the chance of a group. Unfortunately you might not notice in testing - it's the kind of bug that shows up under (production) load.

I think I'm less bothered by the worse ergonomics because those bugs have repeatedly bitten me, in both our own and upstream code, and I'm working in a large enough codebase that buying reliably-correct-ness for some convenience is net-positive tradeoff.

In cases where you know that only one error can every happen at a time (although how sure should we be that we'd notice if that changed?), it's not too hard to flatten the group - although it'd be nice to have a helper for that and raise a PleaseReportBug exception if there's more than one leaf. In cases where there can be multiple concurrent errors, I'm disinclined to hide that and would defer_to_cancelled() or just raise the whole group - except* is pretty nice to work with.

@belm0
Copy link
Member

belm0 commented Apr 8, 2024

I'm still struggling with how to mitigate strict exception groups within the trio libraries I maintain.

The worst thing is that strict_exception_groups was made a user-configurable option that can be set globally on run(). This means that the application can enable this mode before authors of any of the transitive dependencies have given thought to whether they want their API's to leak ExceptionGroup.

I can't declare a trio <= X dependency in my package as a stopgap while sorting things out (where version X enabled strict exceptions), again since strict_exception_groups is a global runtime option set by the app.

@A5rocks
Copy link
Contributor

A5rocks commented Apr 8, 2024

I haven't seen any reference of explicitly setting strict_exception_groups=False for internal nurseries (as a temp stop-gap for libraries: async with trio.open_nursery(strict_exception_groups=False) as nursery: ...). Is there a big obvious flaw with it? Maybe people avoid it cause it's deprecated?

I don't maintain a library that uses trio so if it seems like I'm missing something, I probably am :P

@belm0
Copy link
Member

belm0 commented Apr 8, 2024

explicitly setting strict_exception_groups=False for internal nurseries

I don't know how far you'd get, as there may be 3rd party things your code calls (within trio API itself or utility libs) that wrap nursery and don't give you access to the option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants