Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Clarify CPU-bound tasks on different pools, rather than not tokio #4105

Merged
merged 4 commits into from
Sep 15, 2021

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Sep 13, 2021

Resolves #4104 (I am sorry for the massive writeup for a small doc change, but I think it is important to understand the motivation)

Motivation

(copied from #4104)

Is your feature request related to a problem? Please describe.
First of all, Tokio is awesome and very flexible -- we have used it extensively in InfluxDB IOx, Apache Arrow DataFusion and elsewhere and it works great. Thank you very much ❤️

In DataFusion, InfluxDB IOx and other applications, there is a mix of CPU bound and I/O work and we use thread pools to manage the execution of those tasks.

I realize the design goals and optimization point for tokio is a large number of IO tasks, but its core threading model and support for the async / Future machinery of the Rust language and ecosystem makes it a very compelling thread (task) pool implementation as well. We have used tokio effectively for both types of work and found significant value of doing so.

However, on many occasions over the last few years, the following note from https://docs.rs/tokio/1.11.0/tokio/#cpu-bound-tasks-and-blocking-code

If your code is CPU-bound and you wish to limit the number of threads used to run it, you should run it on another thread pool such as rayon

Has generated significant discussion and confusion as it implies to some that the tokio Runtime should never be used for CPU bound tasks.

I believe the intent of this section is to warn people against using the same thread pool (Runtime) for I/O and CPU bound work, which is definitely sage advice to avoid significant and potentially unbounded latencies responding to IO. However I don't think there is anything specific to tokio that prevents it being used for CPU bound work.

I think this advice may be from an earlier time when it wasn't possible (or easy?) to create a separate tokio Runtime, for CPU bound tasks (which indeed serves as a "dedicated thread pool"). We have created a wrapper to do this for us in IOx DedicatedExecutor and it has worked well for us in practice.

Examples of confusion / questions:

Solution

Clarify in the the documentation that the situation to avoid is using the same tokio threadpool, and point readers to the apis to create their own thread pools.

tokio/src/lib.rs Outdated
Comment on lines 208 to 211
//! to run it, you should use a separate thread pool such as provided by [rayon] or
//! a different [`Runtime`] than the one used to handle IO tasks.
//! If using rayon, you can use an [`oneshot`] channel to send the result back
//! to Tokio when the rayon task finishes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with making this change, but I think the phrasing here is not explicit enough. Maybe give a few more details on when exactly you can do this?

Copy link
Contributor Author

@alamb alamb Sep 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry I am not clear on your suggestion -- what do you mean "when exactly you can do this"? Do you mean

  1. "when use a oneshot channel"?
  2. "when you can use a different Runtime?"
  3. Something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want it to be more clear that the runtime should be dedicated to only CPU-bound tasks. It does sorta say that here, but I'd like it to be more precise about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted to clarify in f061eb2.

I have also been thinking of drafting a blog post on this topic, which perhaps help clarify the options in a longer format than available in the docstrings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For anyone who cares, said post is now available at https://thenewstack.io/using-rustlangs-async-tokio-runtime-for-cpu-bound-tasks/

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime T-docs Topic: documentation labels Sep 13, 2021
Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more along these lines.

tokio/src/lib.rs Outdated Show resolved Hide resolved
tokio/src/lib.rs Outdated Show resolved Hide resolved
Co-authored-by: Alice Ryhl <aliceryhl@google.com>
@Darksonn Darksonn merged commit 57563e2 into tokio-rs:master Sep 15, 2021
@alamb
Copy link
Contributor Author

alamb commented Sep 15, 2021

Thank you @Darksonn -- I think this change will reduce confusion going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime T-docs Topic: documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation misleading suggests tokio should not be used for CPU bound tasks
2 participants