Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch back to using one threadpool #7898

Merged

Conversation

Projects
None yet
2 participants
@illicitonion
Copy link
Contributor

commented Jun 19, 2019

This reverts #7848 and fixes the performance issue found in it.

The first commit is just a revert (with a few manual merge cleanups), the second commit fixes the performance issue.

Fixes #7896

illicitonion added some commits Jun 19, 2019

Use efficient file reading
The previous method ended up reading lots of small byte buffers, and
then joining them quadratically. This just fills a buffer.

@illicitonion illicitonion requested review from stuhood, ity and blorente Jun 19, 2019

fmt
@ity

ity approved these changes Jun 19, 2019

Copy link
Contributor

left a comment

looks great - thank you for coming back to it!

@illicitonion illicitonion merged commit fa50e6e into pantsbuild:master Jun 19, 2019

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details

@illicitonion illicitonion deleted the twitter:dwagnerhall/tokio/unrevert-clean branch Jun 19, 2019

illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019

Fix performance regression introduced by pantsbuild#7898
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.

illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019

Fix performance regression introduced by pantsbuild#7898
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.

illicitonion added a commit to twitter/pants that referenced this pull request Jul 3, 2019

Fix performance regression introduced by pantsbuild#7898
As per
https://docs.rs/tokio-threadpool/0.1.15/tokio_threadpool/fn.blocking.html
when you use a blocking block in a task, it blocks the _entire_ task,
not just the bit which you're marking blocking. This means that when
we're in one giant join_all calling read_dir on a bunch of directories,
we end up serializing them to only read one directory at a time.

This change introduces a new logging::Executor (a slightly weird place
for it, but it depends on the logging stuf, so...) which mirrors the
tokio::Runtime and the futures::sync::oneshot::spawn APIs to
conveniently allow blocking Futures to be spawned in their own tasks, so
that they don't hinder parallelism within a task.

illicitonion added a commit to twitter/pants that referenced this pull request Jul 5, 2019

Switch to use IO pool instead of tokio-fs
tokio-fs has some weird semantics whereby using blocking in a larger
task (e.g. mapping stats from the result of a readdir) forces the work
to not be done in parallel.

I'm pretty sure we _can_ make tokio-fs do what we want efficiently, but
it's taking longer than I'd hoped, so I'm re-introducing the separate io
threadpool so we don't have a performance regression while I
investigate.

This reverts a small portion of pantsbuild#7898 manually.

illicitonion added a commit to twitter/pants that referenced this pull request Jul 5, 2019

Switch to use IO pool instead of tokio-fs
tokio-fs has some weird semantics whereby using blocking in a larger
task (e.g. mapping stats from the result of a readdir) forces the work
to not be done in parallel.

I'm pretty sure we _can_ make tokio-fs do what we want efficiently, but
it's taking longer than I'd hoped, so I'm re-introducing the separate io
threadpool so we don't have a performance regression while I
investigate.

This reverts a small portion of pantsbuild#7898 manually.

illicitonion added a commit that referenced this pull request Jul 10, 2019

Fix performance regression introduced by #7898 (#8006)
* Introduce task_executor

This is a single object we can pass around to allow all sorts of future
running to happen, with logging happening properly, rather than needing
to pass around different threadpools and manually sort out logging.

* Core has an Executor not a Runtime

* Scandir runs on io pool

* PosixFS uses IO pool

* ShardedLMDB uses io pool

* Calculate fingeprint on io pool

* Add TODO to move Executor some time

* Add docstring to spawn_on_io_pool

* Move Executor to its own crate

* fmt

illicitonion added a commit that referenced this pull request Jul 10, 2019

Fix performance regression introduced by #7898 (#8006)
* Introduce task_executor

This is a single object we can pass around to allow all sorts of future
running to happen, with logging happening properly, rather than needing
to pass around different threadpools and manually sort out logging.

* Core has an Executor not a Runtime

* Scandir runs on io pool

* PosixFS uses IO pool

* ShardedLMDB uses io pool

* Calculate fingeprint on io pool

* Add TODO to move Executor some time

* Add docstring to spawn_on_io_pool

* Move Executor to its own crate

* fmt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.