Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch back to using one threadpool #7898

Merged

Conversation

illicitonion
Copy link
Contributor

@illicitonion illicitonion commented Jun 19, 2019

This reverts #7848 and fixes the performance issue found in it.

The first commit is just a revert (with a few manual merge cleanups), the second commit fixes the performance issue.

Fixes #7896

The previous method ended up reading lots of small byte buffers, and
then joining them quadratically. This just fills a buffer.
Copy link
Contributor

@ity ity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great - thank you for coming back to it!

@illicitonion illicitonion merged commit fa50e6e into pantsbuild:master Jun 19, 2019
@illicitonion illicitonion deleted the dwagnerhall/tokio/unrevert-clean branch June 19, 2019 23:32
illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.
illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.
illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.
illicitonion added a commit to twitter/pants that referenced this pull request Jul 5, 2019
tokio-fs has some weird semantics whereby using blocking in a larger
task (e.g. mapping stats from the result of a readdir) forces the work
to not be done in parallel.

I'm pretty sure we _can_ make tokio-fs do what we want efficiently, but
it's taking longer than I'd hoped, so I'm re-introducing the separate io
threadpool so we don't have a performance regression while I
investigate.

This reverts a small portion of pantsbuild#7898 manually.
illicitonion added a commit to twitter/pants that referenced this pull request Jul 5, 2019
tokio-fs has some weird semantics whereby using blocking in a larger
task (e.g. mapping stats from the result of a readdir) forces the work
to not be done in parallel.

I'm pretty sure we _can_ make tokio-fs do what we want efficiently, but
it's taking longer than I'd hoped, so I'm re-introducing the separate io
threadpool so we don't have a performance regression while I
investigate.

This reverts a small portion of pantsbuild#7898 manually.
illicitonion added a commit that referenced this pull request Jul 10, 2019
* Introduce task_executor

This is a single object we can pass around to allow all sorts of future
running to happen, with logging happening properly, rather than needing
to pass around different threadpools and manually sort out logging.

* Core has an Executor not a Runtime

* Scandir runs on io pool

* PosixFS uses IO pool

* ShardedLMDB uses io pool

* Calculate fingeprint on io pool

* Add TODO to move Executor some time

* Add docstring to spawn_on_io_pool

* Move Executor to its own crate

* fmt
illicitonion added a commit that referenced this pull request Jul 10, 2019
* Introduce task_executor

This is a single object we can pass around to allow all sorts of future
running to happen, with logging happening properly, rather than needing
to pass around different threadpools and manually sort out logging.

* Core has an Executor not a Runtime

* Scandir runs on io pool

* PosixFS uses IO pool

* ShardedLMDB uses io pool

* Calculate fingeprint on io pool

* Add TODO to move Executor some time

* Add docstring to spawn_on_io_pool

* Move Executor to its own crate

* fmt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Re-remove PosixFS threadpool
2 participants