Stop runtime on task panic #2002

mikhailOK · 2019-12-20T22:15:44Z

Version

tokio 0.2.6

Description

I'm indirectly using the tokio runtime with the basic scheduler (through actix 0.9.0).
It seems like tokio 0.1 would stop if any task panics, but 0.2.6 catches everything in task::harness::Harness::poll and the runtime keeps going.

Is there any way to get the old behavior of stopping the runtime?

carllerche · 2019-12-20T22:16:42Z

Are you able to abort on panic? You could also set a panic hook (std::panic::set_hook) that signals the root task (block_on) to exit.

mikhailOK · 2019-12-20T22:34:30Z

A panic hook that signals the root task works.
Is there a plan to add an API for passing a panic handler to Harness::poll, as opposed to using the std panic hook?
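
For readers looking for the shape of that workaround, here is a minimal sketch using only the standard library. A plain thread stands in for the spawned task and a blocking channel receive stands in for the root task; in a tokio program the root future passed to block_on would presumably wait on an async channel instead. The names install_signalling_hook and run_until_panic are made up for this illustration and are not part of tokio or std.

```rust
use std::panic;
use std::sync::{mpsc, Mutex};
use std::thread;

/// Installs a panic hook that reports every panic over `tx`
/// after running the previously installed hook.
/// (Hypothetical helper, named for this sketch only.)
fn install_signalling_hook(tx: mpsc::Sender<String>) {
    // The hook must be `Sync`; wrapping the sender in a `Mutex`
    // guarantees that regardless of toolchain version.
    let tx = Mutex::new(tx);
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        previous(info); // keep the usual diagnostic message
        if let Ok(tx) = tx.lock() {
            // Ignore send errors: the receiver may already be gone.
            let _ = tx.send(info.to_string());
        }
    }));
}

/// Stand-in for the root task: starts a worker, blocks until the hook
/// reports a panic, and returns that report.
fn run_until_panic() -> String {
    let (tx, rx) = mpsc::channel();
    install_signalling_hook(tx);

    // Stand-in for a spawned task that hits a bug.
    let worker = thread::spawn(|| {
        panic!("task failed");
    });

    let report = rx.recv().expect("panic hook was dropped without reporting");
    let _ = worker.join(); // the panic was already reported by the hook
    report
}

fn main() {
    let report = run_until_panic();
    eprintln!("shutting down after panic: {}", report);
    std::process::exit(1);
}
```

Note that the hook is process-wide, so it also fires for panics that are later caught with catch_unwind.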

Vlad-Shcherbina · 2020-01-30T12:48:41Z

Dealing with it in the panic handler is not the best option, because I may still want to catch panics explicitly in specific scopes while unexpected panics elsewhere terminate the whole program by default. It's an unpleasant surprise when they don't (see fail-fast).

Darksonn · 2020-07-25T09:49:55Z

Closing in favor of #2699.

Vlad-Shcherbina · 2020-07-25T11:03:25Z

It's not about tests. It's about panics being silently caught everywhere. In production too.

Anywhere else in Rust, if there is a panic in code not explicitly wrapped in catch_unwind, the whole program terminates with a diagnostic message. This is in line with Rust's emphasis on correctness. A panic usually indicates a bug in the code, and I don't want bugs to be silently ignored. I want bugs to be reported and fixed.

It is true that sometimes we need to catch panics to ensure robustness. For example, perhaps we don't want a panic in a request handler to terminate the whole web server program. But that's none of tokio's business! It's the web framework's or even the web application's business! It is possible to use tokio for something besides web applications, and in those use cases panics definitely shouldn't be silently ignored.

Consider reopening.

Darksonn · 2020-07-25T11:06:36Z

Regarding the "anywhere else in Rust" part, I will note that we are mirroring the behavior of std::thread. See also #1830 and #1879.

carllerche · 2020-07-25T18:25:31Z

tokio::spawn models thread::spawn. As @Darksonn mentioned, thread::spawn does not abort the process on panic. Spawned tasks are unwind-safe due to the Send + 'static bound.

In order to deviate from thread::spawn's behavior, we would need a compelling argument.

I could buy into a shutdown_on_panic flag to runtime given a compelling argument. One would have to explain why std's behavior is not sufficient (i.e. configure the process to abort on panic).
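
The std behavior being mirrored can be seen in a small sketch: a panic inside a thread started with thread::spawn does not terminate the process, it only surfaces as an Err when the JoinHandle is joined. The helper name spawned_thread_panicked is made up for this illustration.

```rust
use std::thread;

/// Runs `f` on a new thread and reports whether that thread panicked.
/// (Hypothetical helper, named for this sketch only.)
fn spawned_thread_panicked<F>(f: F) -> bool
where
    F: FnOnce() + Send + 'static,
{
    // `join` returns `Err(payload)` if the thread panicked.
    thread::spawn(f).join().is_err()
}

fn main() {
    let panicked = spawned_thread_panicked(|| panic!("boom"));
    println!("spawned thread panicked: {}", panicked);
    // The process is still alive; only the spawned thread unwound.
    println!("main thread is still running");
}
```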

s97712 · 2020-08-06T19:05:25Z

> tokio::spawn models thread::spawn. As @Darksonn mentioned, thread::spawn does not abort the process on panic. Spawned tasks are unwind-safe due to the Send + 'static bound.
>
> In order to deviate from thread::spawn's behavior, we would need a compelling argument.
>
> I could buy into a shutdown_on_panic flag to runtime given a compelling argument. One would have to explain why std's behavior is not sufficient (i.e. configure the process to abort on panic).

What about spawn_local? Should a panic in a "local task" end the current thread?

Darksonn · 2020-08-06T20:24:12Z

I don't think spawn_local and spawn should have different behaviour on this point.

hawkw · 2020-08-06T20:27:54Z

> What about spawn_local? Whether to consider to end the current thread when panic in the "local task"?

I think inconsistent behavior between spawn and spawn_local is not ideal; it would introduce more complexity and confusion.

> I could buy into a shutdown_on_panic flag to runtime given a compelling argument. One would have to explain why std's behavior is not sufficient (i.e. configure the process to abort on panic).

IMO, the main argument for a shutdown_on_panic flag is for test code. If assertions are made in code that ends up being run in a spawned task, the JoinHandles of all those spawned tasks must be awaited in the main test body to ensure panics from assertion failures are propagated. This can be unwieldy, and in some cases it's easy to misplace a JoinHandle and forget to await it, resulting in a test that passes even if an assertion fails, which is far from ideal. If there were a shutdown_on_panic flag, I would definitely use it in tests (and might want tokio::test to enable it).
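
To make the bookkeeping concrete, here is a minimal sketch of the "collect every handle and propagate" pattern, written against std threads so that it runs without any dependencies. The helper join_all_propagating is hypothetical; the same idea should carry over to awaiting tokio JoinHandles, which also yield a Result.

```rust
use std::panic;
use std::thread::{self, JoinHandle};

/// Joins every handle, then re-raises the first panic payload found, so a
/// failed assertion in any spawned thread fails the caller too.
/// (Hypothetical helper, named for this sketch only.)
fn join_all_propagating(handles: Vec<JoinHandle<()>>) {
    let mut first_panic = None;
    for handle in handles {
        if let Err(payload) = handle.join() {
            // Remember only the first failure, but keep joining the rest.
            if first_panic.is_none() {
                first_panic = Some(payload);
            }
        }
    }
    if let Some(payload) = first_panic {
        // Continue unwinding with the original payload in this thread.
        panic::resume_unwind(payload);
    }
}

fn main() {
    let handles = vec![
        thread::spawn(|| assert_eq!(2 + 2, 4)),
        thread::spawn(|| assert_eq!(2 + 2, 5, "assertion inside a spawned thread")),
    ];
    // Panics here, so a test body ending with this call would fail.
    join_all_propagating(handles);
}
```

Any handle that never makes it into the Vec is exactly the "misplaced JoinHandle" case described above: its panic goes unobserved.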

s97712 · 2020-08-07T06:31:02Z

If these tasks are running on the same thread and one of them panics, why do the other tasks keep working? That confuses me even more.

Darksonn · 2020-08-07T06:47:19Z

@s97712 Panics in spawned tasks are caught, just like they are for spawned threads in std. Tasks spawned with the ordinary tokio::spawn function also share their threads in some manner.

lamafab · 2021-12-01T14:01:13Z

Any progress on this?

Darksonn · 2021-12-01T21:48:31Z

No, there's currently no way to do this.

Venryx · 2022-01-24T19:06:01Z

Having a way to tell Tokio to "not catch" panics that occur in its threads seems like a useful feature to me.

My use case: I have my Rust program deployed in Kubernetes. When a panic occurs, I want my program to crash and close completely, so that Kubernetes can notice the crash and perform its regular handling (e.g. restarting the pod, unless it keeps crashing immediately, in which case it backs off for a while).

I looked through the source-code of Tokio, and could not find a way to directly achieve what I wanted. That said, here are some workarounds I have found.

Workaround 1

Enable Rust's "abort on panic" setting.

You can do this by either:
A) Adding the following to your root Cargo.toml file, where XXX is the profile name (for example dev or release):

[profile.XXX]
panic = "abort"

B) Or adding -C panic=abort to the rustflags.

You can control the granularity of the stack-traces logged to the console by setting the RUST_BACKTRACE environment variable:

RUST_BACKTRACE=0 # no backtraces
RUST_BACKTRACE=1 # partial backtraces
RUST_BACKTRACE=full # full backtraces

Workaround 2

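Add a custom panic hook, which receives the panic info, prints a backtrace (optionally), and then manually aborts your program (optionally):

// Required only on toolchains older than Rust 1.65, where std::backtrace was still unstable.
#![feature(backtrace)]

use std::backtrace::Backtrace;
use std::panic;

#[tokio::main]
async fn main() {
    //panic::always_abort();
    panic::set_hook(Box::new(|info| {
        // Backtrace::capture() respects RUST_BACKTRACE; force_capture() always captures.
        //let stacktrace = Backtrace::capture();
        let stacktrace = Backtrace::force_capture();
        println!("Got panic. @info:{}\n@stackTrace:{}", info, stacktrace);
        // Terminate the whole process so the panic is not swallowed by the runtime.
        std::process::abort();
    }));

    // [...]
}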

I like this approach better because it gives me control of how much of the stacktrace to print (they can be quite long!), as well as whether the panic is of a type that is worth calling abort() for.
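
As an illustration of the "whether the panic is worth calling abort() for" part, here is a minimal sketch of a hook that inspects the panic payload before deciding. The policy itself (messages starting with "recoverable:" do not abort) and the helper names are assumptions invented for this example, not an established convention.

```rust
use std::any::Any;
use std::panic;

/// Extracts the panic message if the payload is a `&str` or a `String`,
/// which covers panics raised through the `panic!` macro.
fn payload_message(payload: &(dyn Any + Send)) -> Option<&str> {
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        Some(*s)
    } else if let Some(s) = payload.downcast_ref::<String>() {
        Some(s.as_str())
    } else {
        None
    }
}

/// Hypothetical policy: abort on every panic except those whose message
/// starts with "recoverable:". Unknown payload types are treated as fatal.
fn should_abort(payload: &(dyn Any + Send)) -> bool {
    match payload_message(payload) {
        Some(msg) => !msg.starts_with("recoverable:"),
        None => true,
    }
}

/// Installs a hook that prints the default message, then aborts the
/// process when the policy says so.
fn install_selective_abort_hook() {
    let default_hook = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        default_hook(info);
        if should_abort(info.payload()) {
            std::process::abort();
        }
    }));
}

fn main() {
    install_selective_abort_hook();

    // Allowed by the policy, so the hook returns and catch_unwind sees it.
    let caught: Result<(), _> = panic::catch_unwind(|| panic!("recoverable: bad input"));
    assert!(caught.is_err());
    println!("still running after a recoverable panic");

    // Anything else aborts the process from inside the hook.
    panic!("invariant violated");
}
```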

The one main drawback is that the backtrace-generation code (Backtrace::capture()) was, at the time of writing, only available on Rust nightly (std::backtrace has since been stabilized in Rust 1.65).

If you want to use the backtrace generation on an older stable toolchain, you can, but it requires a hack where you set this environment variable: RUSTC_BOOTSTRAP=1 (as described here)

You can set that as a global environment variable, or have it set specifically for your cargo-build command.

For Docker: Just add an ENV RUSTC_BOOTSTRAP=1 line before your build commands (or use RUN RUSTC_BOOTSTRAP=1 <rest of command> for each command).

For rust-analyzer (in VSCode): Add this to your project's .vscode/settings.json file:

    "rust-analyzer.server.extraEnv": {"RUSTC_BOOTSTRAP": "1"}

bouk · 2022-11-10T14:20:25Z

Related: #4516

mikhailOK mentioned this issue Dec 20, 2019

A way to check if Runtime has panicked tasks #1999

Closed

mikhailOK mentioned this issue Dec 20, 2019

System::stop_on_panic is removed. actix/actix-net#80

Open

Darksonn added the A-tokio, C-feature-request, and M-runtime labels Jul 25, 2020

Darksonn closed this as completed Jul 25, 2020

Darksonn reopened this Jul 25, 2020

thomcc mentioned this issue Feb 1, 2022

Force test failure on async task panic (for now) build-trust/ockam#2479

Open

thomcc mentioned this issue Apr 7, 2022

Should we add an panic interface that reports the error via the panic handler but unconditionally aborts? rust-lang/project-error-handling#34

Open

iulianbarbu mentioned this issue Apr 19, 2023

cargo-shuttle: fix address in use error when service panicked in a previous run shuttle-hq/shuttle#805

Merged

bsbds mentioned this issue Jun 29, 2023

[Bug]: The test_no_root_user_do_admin_ops will pass despite the panic xline-kv/Xline#267

Closed
