tokio::spawn_blocking has slightly misleading documentation #2143

Closed
ghost opened this issue Jan 21, 2020 · 8 comments · Fixed by #2426
Labels
A-tokio Area: The main tokio crate M-blocking Module: tokio/task/blocking T-docs Topic: documentation

Comments

@ghost

ghost commented Jan 21, 2020

Version

tokio 0.2.9

Platform

Linux 5.4.13 amd64

Description

https://docs.rs/tokio/0.2.9/tokio/task/fn.spawn_blocking.html

Run the provided closure on a thread where blocking is acceptable.

In general, issuing a blocking call or performing a lot of compute in a future without yielding is not okay, as it may prevent the executor from driving other futures forward. A closure that is run through this method will instead be run on a dedicated thread pool for such blocking tasks without holding up the main futures executor.

This is missing the information that the thread pool for blocking tasks will spawn up to max_threads - core_threads (~512 by default) new threads on calls to spawn_blocking if no idle worker thread is available.

When I read this section of the documentation, I implicitly assumed behavior similar to the default runtime for futures, which has as many threads as the system has cores, but looking at the source code that doesn't seem to be the case.

In my case I was using it for heavy computation. I previously used tokio-threadpool, but on porting to 0.2 I switched to spawn_blocking, resulting in a hugely increased system load.

Knowing this, I can create a separate runtime for my compute-intensive workloads, but adding this information to the documentation would be helpful.
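
For reference, this is roughly what I mean by a separate runtime with a small cap (an untested sketch against the tokio 0.2 Builder API; heavy_work is just a placeholder for the actual computation):

use std::time::Duration;

// Placeholder for the actual CPU-heavy computation.
fn heavy_work(i: usize) -> usize {
    std::thread::sleep(Duration::from_millis(100));
    i
}

fn main() {
    // Dedicated runtime for the compute-heavy work. With core_threads(4) and
    // max_threads(8), the blocking pool is capped at roughly
    // max_threads - core_threads = 4 threads instead of the default ~512.
    let mut compute = tokio::runtime::Builder::new()
        .threaded_scheduler()
        .core_threads(4)
        .max_threads(8)
        .build()
        .expect("Failed to build compute runtime.");

    compute.block_on(async {
        let mut handles = Vec::new();
        for i in 0..32 {
            // spawn_blocking inside this runtime uses its capped blocking pool.
            handles.push(tokio::task::spawn_blocking(move || heavy_work(i)));
        }
        for handle in handles {
            handle.await.unwrap();
        }
    });
}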

I can try to make a documentation PR if you want, but I'm writing this issue first because I'm not 100% confident in my documentation-writing skills or that I got all the details correct.

@kosta

kosta commented Jan 21, 2020

I think the documentation should also note whether there is a limit on the number of tasks waiting to be executed if spawn_blocking is called many more times than there are threads available. This would indicate what failure mode is to be expected (memory usage / "stale work", panics, returned errors, etc.).
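
If it turns out to be unbounded, it might also be worth documenting how to bound it yourself. Something along these lines is what I would reach for (an untested sketch; heavy_work is just a placeholder):

// Bound the number of in-flight blocking tasks by processing them in
// fixed-size batches, so the internal queue never grows beyond one batch.
async fn run_bounded() {
    let inputs: Vec<usize> = (0..1000).collect();

    for batch in inputs.chunks(32) {
        let handles: Vec<_> = batch
            .iter()
            .map(|&i| tokio::task::spawn_blocking(move || heavy_work(i)))
            .collect();

        for handle in handles {
            handle.await.unwrap();
        }
    }
}

// Placeholder for the actual blocking work.
fn heavy_work(i: usize) -> usize {
    i * 2
}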

@ghost
Author

ghost commented Jan 21, 2020

These seem to be the relevant sections:

struct Shared {
    queue: VecDeque<Task>,
    num_th: usize,
    num_idle: u32,
    num_notify: u32,
    shutdown: bool,
    shutdown_tx: Option<shutdown::Sender>,
}

fn spawn(&self, task: Task, rt: &Handle) {
    let shutdown_tx = {
        let mut shared = self.inner.shared.lock().unwrap();

        if shared.shutdown {
            // Shutdown the task
            task.shutdown();

            // no need to even push this task; it would never get picked up
            return;
        }

        shared.queue.push_back(task);

        if shared.num_idle == 0 {
            // No threads are able to process the task.

            if shared.num_th == self.inner.thread_cap {
                // At max number of threads
                None
            } else {
                shared.num_th += 1;
                assert!(shared.shutdown_tx.is_some());
                shared.shutdown_tx.clone()
            }
        } else {
            // Notify an idle worker thread. The notification counter
            // is used to count the needed amount of notifications
            // exactly. Thread libraries may generate spurious
            // wakeups, this counter is used to keep us in a
            // consistent state.
            shared.num_idle -= 1;
            shared.num_notify += 1;
            self.inner.condvar.notify_one();
            None
        }
    };

    if let Some(shutdown_tx) = shutdown_tx {
        self.spawn_thread(shutdown_tx, rt);
    }
}

fn spawn_thread(&self, shutdown_tx: shutdown::Sender, rt: &Handle) {
    let mut builder = thread::Builder::new().name(self.inner.thread_name.clone());

    if let Some(stack_size) = self.inner.stack_size {
        builder = builder.stack_size(stack_size);
    }

    let rt = rt.clone();

    builder
        .spawn(move || {
            // Only the reference should be moved into the closure
            let rt = &rt;

            rt.enter(move || {
                rt.blocking_spawner.inner.run();
                drop(shutdown_tx);
            })
        })
        .unwrap();
}

The waiting tasks seem to be queued in a VecDeque, meaning that, if I understand correctly, the queue size is only limited by available system memory.

@Nemo157
Contributor

Nemo157 commented Feb 29, 2020

The current behaviour appears to be to panic when the pool is exceeded; running this example

#[tokio::main]
async fn main() {
    for i in 0..10000 {
        eprintln!("Running {}", i);
        tokio::task::spawn_blocking(|| std::thread::sleep(std::time::Duration::from_secs(1)));
    }
}

logs

Running 1
[...]
Running 508
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/libcore/result.rs:1188:5
backtrace
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:84
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:61
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1025
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:65
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:50
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:193
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:210
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:471
  11: rust_begin_unwind
             at src/libstd/panicking.rs:375
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:84
  13: core::result::unwrap_failed
             at src/libcore/result.rs:1188
  14: core::result::Result<T,E>::unwrap
             at /rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libcore/result.rs:956
  15: tokio::runtime::blocking::pool::Spawner::spawn_thread
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/blocking/pool.rs:181
  16: tokio::runtime::blocking::pool::Spawner::spawn
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/blocking/pool.rs:168
  17: tokio::runtime::blocking::pool::spawn_blocking
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/blocking/pool.rs:68
  18: tokio::task::blocking::spawn_blocking
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/task/blocking.rs:69
  19: playground::main::{{closure}}
             at src/main.rs:5
  20: <std::future::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libstd/future.rs:43
  21: tokio::runtime::enter::Enter::block_on
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/enter.rs:100
  22: tokio::runtime::thread_pool::ThreadPool::block_on
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/thread_pool/mod.rs:93
  23: tokio::runtime::Runtime::block_on::{{closure}}
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/mod.rs:413
  24: tokio::runtime::context::enter
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/context.rs:72
  25: tokio::runtime::handle::Handle::enter
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/handle.rs:34
  26: tokio::runtime::Runtime::block_on
             at ./.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/runtime/mod.rs:408
  27: playground::main
             at src/main.rs:1
  28: std::rt::lang_start::{{closure}}
             at /rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libstd/rt.rs:67
  29: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  30: std::panicking::try::do_call
             at src/libstd/panicking.rs:292
  31: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:78
  32: std::panicking::try
             at src/libstd/panicking.rs:270
  33: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  34: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  35: std::rt::lang_start
             at /rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libstd/rt.rs:67
  36: main
  37: __libc_start_main
  38: _start

@ghost
Author

ghost commented Mar 9, 2020

@Nemo157 I can't reproduce what you're seeing locally. I think this is just a limitation of the playground. They probably use cgroups or something similar to limit the number of threads a single program can create.

@ghost
Author

ghost commented Mar 9, 2020

fn main() {
    let mut runtime = tokio::runtime::Builder::new()
        .threaded_scheduler()
        .max_threads(20)
        .build().expect("Failed to build runtime.");
    runtime.block_on(spawn_blocking());
}

async fn spawn_blocking() {
    for i in 0..100 {
        eprintln!("Spawning {}", i);
        tokio::task::spawn_blocking(move || {
            println!("Running {}", i);
            std::thread::sleep(std::time::Duration::from_secs(1));
            println!("Finished {}", i);
        });
    }
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=fc67c17ebf449b2a55fff1445cf489d8

This shows that if the number of threads is limited manually, tasks are still spawned immediately and are queued to run once one of the threads becomes available.

@Nemo157
Contributor

Nemo157 commented Mar 9, 2020

Hmm, yes, running locally with a bit more debugging, I see that it runs ~500 of the blocking tasks at a time and completes successfully. It took a while to figure out how to get the info, but I confirmed that the playground has a 512-process limit in cgroups.

It seems like Tokio should handle this case gracefully as well, but that should be a separate issue.

@depombo

depombo commented Apr 20, 2020

I ran into the same confusion between the documentation and the source code. Thanks for pointing this out, @FSMaxB-dooshop! I have a compute-intensive workload too, and having two separate runtimes, a basic_scheduler runtime for IO operations and a threaded_scheduler capped at the number of CPU threads - 1, seems like it might be the way to go for me. I'm curious, what setup did you go with?
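
Concretely, something like this is what I have in mind (an untested sketch against the tokio 0.2 Builder; the worker count is hard-coded here, in practice it would come from something like the num_cpus crate):

fn main() {
    // Runtime dedicated to compute: a threaded scheduler with one worker per
    // CPU thread minus one (hard-coded to 7 here; in practice this number
    // would come from e.g. num_cpus).
    let compute_runtime = tokio::runtime::Builder::new()
        .threaded_scheduler()
        .core_threads(7)
        .build()
        .expect("Failed to build compute runtime.");

    // Small single-threaded runtime that drives the IO-bound futures.
    let mut io_runtime = tokio::runtime::Builder::new()
        .basic_scheduler()
        .enable_all()
        .build()
        .expect("Failed to build IO runtime.");

    io_runtime.block_on(async {
        // CPU-heavy jobs are spawned as ordinary tasks onto the compute
        // runtime, so they never monopolize the IO runtime's single thread.
        let job = compute_runtime.spawn(async { (0u64..1_000_000).sum::<u64>() });
        let total = job.await.unwrap();
        println!("sum = {}", total);
    });
}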

@Darksonn
Contributor

If anyone is up for writing a PR that addresses this in spawn_blocking's documentation, that would be great. A related piece of documentation was recently added to lib.rs in #2414, which you can use as inspiration. It could also be a good idea to link to that section from spawn_blocking.

@Darksonn Darksonn added A-tokio Area: The main tokio crate E-help-wanted Call for participation: Help is requested to fix this issue. M-blocking Module: tokio/task/blocking T-docs Topic: documentation labels Apr 20, 2020
palashahuja added a commit to palashahuja/tokio that referenced this issue Apr 22, 2020
The initial spawn_blocking documentation was unclear. This PR attempts to fix that.

Fixes tokio-rs#2143
@Darksonn Darksonn removed the E-help-wanted Call for participation: Help is requested to fix this issue. label Sep 25, 2020