executor: rewrite the work-stealing thread pool #1657
This patch is a ground-up rewrite of the existing work-stealing thread pool. The goal is to reduce overhead while simplifying the code where possible.

At a high level, the following architectural changes were made:

- The local run queues were switched to bounded circular buffer queues.
- Cross-thread synchronization was reduced.
- Task constructs were refactored to use a single allocation and to always include a join handle (#887).
- The logic for putting workers to sleep and waking them up was simplified.

This article goes into the details of the implementation and is helpful reading when reviewing this PR: https://tokio.rs/blog/2019-10-scheduler/
## Local run queues

Move away from crossbeam's implementation of the Chase-Lev deque. That implementation carries unnecessary overhead, as it supports capabilities that are not needed for the work-stealing thread pool. Instead, a fixed-size circular buffer is used for the local queue. When the local queue is full, half of the tasks contained in it are moved to the global run queue.
## Reduce cross-thread synchronization

This is done via many small improvements. Primarily, an upper bound is placed on the number of concurrent stealers. Limiting the number of stealers results in lower contention. Secondly, the rate at which workers are notified and woken up is throttled. This also reduces contention by preventing many threads from racing to steal work.
## Refactor task structure

Now that Tokio is able to target a Rust version that supports `std::alloc` as well as `std::task`, the pool is able to optimize how the task structure is laid out. Now, a single allocation per task is required, and a join handle is always provided, enabling the spawner to retrieve the result of the task (#887).
When possible, complexity is reduced in the implementation. This is done by using locks and other simpler constructs on cold paths. The set of sleeping workers is now represented as a `Mutex<VecDeque<usize>>`. Instead of optimizing access to this structure, we reduce how often the pool must access it.

Secondly, we have (temporarily) removed `threadpool::blocking`. This capability will come back later, but the original implementation was more complicated than necessary.
The thread pool benchmarks have improved significantly:
Old thread pool
New thread pool
Real-world benchmarks improve significantly as well. This is testing the hyper hello world server using
I tried out the branch with some tokio-postgres benchmarks and hit this panic:
I'm sorry to barge in like this. I'd just like to cross-post what I wrote on Reddit over here, where it might be a better fit.

In particular, I'd like to mention the
When shutting down the scheduler, the first step is to close the global (injection) queue. Once this is done, it is possible for a task to be notified from *outside* of a scheduler thread. This will attempt to push the task onto the injection queue but fail. When this happens, the task must be explicitly shut down, or bad things will happen.