Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow a custom panic handler #1328

Merged
merged 1 commit into from
Dec 17, 2015
Merged

Conversation

sfackler
Copy link
Member

/// # Panics
///
/// Panics if called from a panicking thread. Note that this will be a nested
/// panic and therefore abort the process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a panic handler is invoked, perhaps it could be considered as "being taken"? That would mean that this function would never need to panic as if called while panicking it'd just continue to return the default handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could work, but the semantics would be a little weird - the handler can't actually have been taken since it may need to be running on other threads as well, so you'd run into a situation where take_handler doesn't return the thing that's actually registered, and the handler would still be registered

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait right, this is a global resource, so there can be concurrent panics. Nevermind!

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 19, 2015

/// Returns information about the location from which the panic originated,
/// if available.
pub fn location(&self) -> Option<Location> { ... }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah one thing I meant to ask, is there a reason this returns Location instead of &Location? In theory a Location could be at least Clone (storing a &'static str internally for the file perhaps) which would allow storing locations elsewhere if necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember why I chose a Location<'a> vs a &Location off the top of my head, but &Location seems fine as well.

@bstrie
Copy link
Contributor

bstrie commented Oct 21, 2015

Could this mechanism be used to support abort-on-panic semantics without the need for a compiler flag?

@sfackler
Copy link
Member Author

Yep, though you'd still have all of the panic infrastructure (landing pads, etc) in place, so we'd probably still want to have the flag. This would cover the use case of "I want to find out about panics via a process abort (maybe with debug data logged or something)", but not "I'm running in an environment that I don't want/can't have stack unwinding".

@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2015

@sfackler

-Z no-landing-pads causes a very ugly crash on panic with the default (unwinding) panic handler, but otherwise it is kind of orthogonal.

@aturon
Copy link
Member

aturon commented Nov 18, 2015

cc @rust-lang/lang @rust-lang/core -- while technically a "library" issue, this is a pretty central aspect of the language and should get scrutiny across these teams.

@aturon
Copy link
Member

aturon commented Nov 18, 2015

The API design here seems reasonable for the "global resource" approach to panic handling, and I agree with @sfackler that as long as panic handlers are an application-level concern, this probably works just fine. But I have a couple concerns.

Composability/library use

My biggest worry about the design is potential use within libraries; I agree with @alexcrichton that such use is basically inevitable. Global resources give you poor composability between libraries (think: global state like the current working directory).

As others have mentioned, the per-thread handler model could potentially deal with it. The alternatives section mentioned the per-thread model, but dismisses it essentially due to lack of inheritance. But we should be able to provide inheritance for threads spawned via the standard library at least. So I'd like to discuss the tradeoffs here a bit more deeply.

Relatedly, the RFC claims that it'd be easy to add per-thread handlers on top later. I'm curious exactly what you have in mind -- would it be an entirely separate layer prior to calling the global handler?

Pre- vs post-unwinding

The proposal here is to invoke the handler immediately upon panic, rather than after unwinding. IIRC this was motivated in part so that the call stack is available. But it's also a bit counterintuitive, and doesn't play well with recover -- even in cases that panics are "caught" and otherwise "handled" in some way, the generic global handler will still be run.

Is there some other way we could retain the needed stack information but invoke the handler after unwinding? Or is there rationale, perhaps for providing hooks on both ends of unwinding?

Abort semantics

Finally, there are two issues around abort that would be good to settle:

  • To what degree does this API tie us to the current "abort on double panic" semantics? I know some have argued against that semantics (@nikomatsakis among them, IIRC). I've heard talk of, for example, "bundling together" the payloads from multiple panics. It'd be nice to keep our options open.
  • Others have raised the issue of using this mechanism for an abort-on-panic semantics, but @sfackler points out that you probably want to invoke a separate compiler flag as well. It'd be nice to have a relatively clear and simple story about how to get abort-on-panic, and how that might play with the handlers here. (Not a blocker for this RFC, more something to discuss a bit just to be sure there's a reasonable plan.)

@aturon
Copy link
Member

aturon commented Nov 19, 2015

Follow-up: my previous comment sort of implied that inherited, thread-local handlers "solve" composability, but I think that's not really true; they just mitigate it. It's still possible to use multiple libraries that want to frob the thread-local handler, and they have to play together somehow.

@arielb1
Copy link
Contributor

arielb1 commented Nov 19, 2015

@aturon

Panic handlers are intended to work even without unwinding, and therefore they should run before it. If you want to ignore panics that are caught by recover, you should do your own recover at the event loop.

The interaction with double panics is annoying. I would prefer a version that has panic!("foo") being equivalent to

    let panic_info = PanicInfo {
        location: current_location!(),
        info: box "foo"
    };
    GLOBAL_PANIC_HANDLER.borrow()(&panic_info);
    unwind(panic_info);

In that case, a panic in a panic handler would lead to a safe death via stack overflow. If this is not desirable, a panic handler could have a recursive lock (we need to have some way to handle concurrent panics anyway).

@alexcrichton
Copy link
Member

🔔 This RFC is now entering its week-long final comment period 🔔

Note that this entering FCP at the same time as #1100 and the two will likely be decided upon as a unit.

@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Nov 19, 2015
@yberreby
Copy link

This is a more polished design than the one I proposed and didn't have time to revise, and supports adding thread-local handlers in the future.

👍

@sfackler
Copy link
Member Author

@aturon

I am not a fan of an inheritance based situation for a couple of reasons:

  • Many programming languages provide a global exception handler - C++, Java, Python, etc. As far as I can tell, zero languages provide this kind of handler inheritance scheme.
  • Threads not spawned via std::thread::spawn are left in some kind of limbo state with respect to panic handling. We could have some function you could call to register a handler after the fact, but that makes writing code that interoperates with other languages way weirder.

It's not totally clear to me what use cases you see for libraries wanting to mess with a global handler. From my experience in Java, the only libraries that mess with it are the ones that explicitly advertise that fact and take over the handler to e.g. forward uncaught exceptions from an Android app back out to some server for tracking. I can imagine we'd update the log crate to have a function that will replace the global handler with one that logs errors for people that want to do that. Poorly behaved libraries might do weird things, but that seems to me to fall on the same "just don't do that" line of a library calling std::process::exit randomly.

Libraries certainly might find it convenient to set custom handlers on specific threads that they create and maintain but that doesn't seem like a necessity - std::panic::recover is a better bet for thread pools and Java is the only language I've looked at that supports per-thread uncaught exception handlers.

Running handlers post-unwinding does seem like a better bet if we can figure out what to do with backtraces. @alexcrichton, is the stack walking part of backtrace generation fast enough that we can reasonably run it for every panic? If not, we might need a separate method to indicate that the handler is interested in backtrace info or something like that.

I don't think this ties us to abort on double panic any more than the Err value for joining a thread does - in both cases we have a Box<Any + Send> which right now implies one payload, though the thing inside of it could of course in the future be something wrapping a Vec<Box<Any + Send>> or whatever.

The reentrancy of the handler itself is a bit more subtle, but I don't feel super strongly about what the behavior is there. It could be treated as a double panic -> abort, or recursively call the handler, or skip the handler the second time around.

@arielb1
Copy link
Contributor

arielb1 commented Nov 20, 2015

@sfackler

From my point of view this is not supposed to be an unhandled exception handler because Rust does not support handled exceptions, it just supports recovery from exceptions. It is just the panic handler, and it should run when panic is called (for example, unwinding may not actually be supported, and even if it is, it may call undesirable destructors).

@alexcrichton
Copy link
Member

In terms of composability I think I prefer the ability to only add a handler and unregister just that handler (e.g. the Windows-like style), but in terms of implementation what's proposed here is more general (e.g. you can build the Windows style on this one). As a result, plus some of the comments @sfackler has made, I think I'd lean in favor of this strategy, acknowledging that the composability just isn't that great and more composable solutions can be built up externally. The upside of this proposal is that it's easy to understand and implement and provides at least the foundation for doing more pieces later.

Pre- vs post-unwinding

@aturon, this is an interesting point I hadn't considered before, yeah. I wouldn't necessarily say that there's a hard requirement that it runs pre-unwinding, it's just convenient today that the backtrace is available. That being said, it may not be totally unreasonable to have a scheme along the lines of the outermost recover block is the only one that will run the global handler. For example if you have nested recover blocks on one thread you'll never invoke the global handler until the outermost one catches the panic. We could easily set this up with TLS, and it means that FFI boundaries and thread boundaries would run the global handler, but manual calls to panic::recover wouldn't.

Along those lines, it may not be so critical that we distinguish pre vs post unwinding.

Abort semantics

I agree with @sfackler that I don't think this really changes our story here much. Even if we have to special case the panic-handler panicking that seems like it's not the end of the world, the desire to handle double panics I believe is motivated to work with "most code" rather than all code.

is the stack walking part of backtrace generation fast enough that we can reasonably run it for every panic?

@sfackler, I'm not actually quite sure, but panicking is so slow already it probably wouldn't matter if we just generated a backtrace. That being said I still think we should always have backtraces be optional because including libbacktrace is a nontrivial dependency of the standard library that may not always be desired.

@sfackler
Copy link
Member Author

Yep, I figure we'd make any backtrace accessor return an Option<Whatever> like PanicInfo::location does in this proposal. It's more of a question of whether we can unconditionally generate the trace when libbacktrace is compiled in or if a handler would have to opt-in in some way.

@aturon
Copy link
Member

aturon commented Nov 20, 2015

@sfackler Thanks for the detailed reply!

I think I'm persuaded on the global vs. thread-local issue. In favor of global:

  • the most common way for libraries to interact will be to simply chain handlers;
  • in principle a single global handler is enough to bootstrap other means of coordinating handlers;
  • it's dirt simple;
  • everybody does it.

And it's not like the thread-local version completely solves all composability problems.

There's some slight chance that there will arise multiple competing frameworks on top for composing handlers, which would be a shame, but even if that happened we could probably push to standardize on one, so I'm not too worried.

re: pre/post-unwinding, I'm coming around to @arielb1's perspective that this is more like an "on panic hook" rather than a "panic handler". If we're clear enough in the naming and docs, we can probably avoid confusion around when the callback is triggered, and it seems like all of the technical advantages are on the side of doing this pre-unwinding.

So at this point I'm pretty much 👍. Thanks again for the replies!

@arielb1
Copy link
Contributor

arielb1 commented Nov 21, 2015

@alexcrichton

I am not particularly sure about the behavior with panic::recover. The intended uses for it are

  • An FFI handler that re-packages panics across an FFI boundary.
  • A "worker thread" top-level handler to allow recycling the worker thread when a panic occurs.

In the first case, the panic handler should definitely run, as the intended behavior is that of a normal panic. In the second case, the panic handler should be cooperating with the top-level handler, so there should be no problem.

Anyway, the handler really should be run before the stack is unwound - a "send backtrace to a server" handler may want to use a logger allocated on the stack, and destroying it before the handler is run would be sad.

@alexcrichton
Copy link
Member

@arielb1 ah yeah that's a good point, I always seem to forget about the worker-thread-like situation!

@nikomatsakis
Copy link
Contributor

I feel uncomfortable with this RFC casually stating that a double panic leads to abort. I'd prefer to leave the proper handling of this case as an Unresolved Question, since I think that aspect of the language semantics is not entirely nailed down. That said, I don't think it really introduces a lot of complications -- I think of this as basically inserting a destructor that runs at the point of panic, before other destructors, and hence the semantics of a panic in that context seems to be no different from the semantics of a panic occurring in the middle of any other destructor. Currently, this aborts, and it may be that we cannot in fact change this (or find a more satisfactory semantics).

Just in general, I think that aborting in some situations seems like it might be the right thing. There is certainly precedent: it's quite close to what C++ does. If we are to embrace the abort, I could even imagine going a bit further, and adding the concept of "nounwind" functions, which will abort if unwinding passes through them. All destructors could then be marked nounwind, so that we wind up with a simple rule -- do not panic in a destructor (whether or not a panic has already occurred).

However, there are definitely use cases where an abort is kind of "game over", so I still find the idea that we can't recover in some more graceful manner unfortunate. It does seem like we won't ever be able to guarantee a "complete" recovery though (because the recovery process itself is not completing), so perhaps that would have to be something that the user opts into -- i.e., saying "it's ok if some destructors never complete (or never execute), please don't abort". (Also, this line of thinking argues against making unwinding during destructors a complete no-no.)

In this sense, executing the panic handler immediately on panic is good, because it gives us the opportunity to affect (and maybe configure, someday) how unwinding will proceed.

So I guess, all in all, I feel pretty comfortable with this RFC as a starting point, but I'd prefer to move the question of double panics to an unresolved question, something we can revisit when we stabilize.

///
/// Panics if called from a panicking thread. Note that this will be a nested
/// panic and therefore abort the process.
pub fn take_handler() -> Box<Fn(&PanicInfo) + 'static + Sync + Send> { ... }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect std::panic to be part of core and if so how can there be a Box here (which is in the alloc crate)? Or if not; how will this work in a no_std scenario?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The panicking infrastructure already allocates and is already in libstd.

no_std executables need to define some lang items to implement the functionality for panicking.

@alexcrichton
Copy link
Member

The libs team discussed this RFC during triage yesterday, and the conclusion was to merge. It looks like there is broad consensus around this strategy (especially as a first-but-unstable pass) modulo some careful wording about what it means to double-panic.

Thanks again for the RFC @sfackler!

Tracking issue

@alexcrichton alexcrichton merged commit e4fd4e6 into rust-lang:master Dec 17, 2015
@sfackler sfackler removed the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Dec 22, 2015
@Centril Centril added the A-panic Panics related proposals & ideas label Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-panic Panics related proposals & ideas T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.