Potential sync optimizations for single-threaded schedulers #1471
Description
I've been playing around with stdexec and had a thought about `sync_wait` and `async_scope` that I wanted to run by the community. I might be missing something obvious here, so please let me know if this doesn't make sense!
I noticed that `sync_wait` still uses mutexes and condition variables internally when it is used with `inline_scheduler`, even though the work is known to run on the calling thread. The same seems to happen with `async_scope` when all tasks run on a single thread.
I was wondering:
- Is there a way to optimize away this synchronization overhead when we know at compile time that everything will run on the same thread, or does one already exist?
- Does the sender type system provide enough information to detect these cases?
Quick example of what made me think about this (godbolt):
```cpp
#include <stdexec/execution.hpp>
#include <exec/inline_scheduler.hpp>

#include <utility>

int main(int argc, char* argv[]) {
  auto sdr = stdexec::schedule(exec::inline_scheduler{})
           | stdexec::then([argc] { return argc; })
           | stdexec::then([](auto n) { return 2 * n; });
  auto [res] = stdexec::sync_wait(std::move(sdr)).value();
  return res;
}
```

I expected the generated assembly to be equivalent, or extremely close, to simply writing:

```cpp
#include <optional>
#include <tuple>
#include <utility>

int main(int argc, char* argv[]) {
  auto then_f = [](auto n) { return 2 * n; };
  auto [res] = std::optional{std::tuple{std::move(then_f)(argc)}}.value();
  return res;
}
```

However, the generated code uses mutexes and condition variables.
I was thinking: could `sync_wait` use type traits from the active scheduler to determine whether it needs to be thread-safe? That way, it could avoid mutexes and condition variables in single-threaded cases.
The same could apply to `async_scope` instances that run in single-threaded environments.
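To make the type-trait idea more concrete, here is a minimal standalone sketch of what I have in mind. It does not use stdexec at all: `completes_inline`, `inline_like_scheduler`, `thread_like_scheduler`, `wait_state_for` and `run_and_wait` are all hypothetical names invented for this illustration, and I am not claiming this is how `sync_wait` is or should be implemented. It only shows that a compile-time trait on the scheduler could select between a lock-free wait state and a mutex/condition-variable one.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <thread>
#include <type_traits>
#include <utility>

// Hypothetical trait (not part of stdexec): true when a scheduler always
// runs the work on the calling thread before execute() returns.
template <class Scheduler>
struct completes_inline : std::false_type {};

struct no_handle {};

// Joins its thread on destruction so the work never outlives the wait state.
struct joining_thread {
  std::thread t;
  ~joining_thread() { if (t.joinable()) t.join(); }
};

// Stand-in schedulers for this sketch only (not the stdexec types).
struct inline_like_scheduler {
  template <class F>
  no_handle execute(F&& f) const { std::forward<F>(f)(); return {}; }
};
struct thread_like_scheduler {
  template <class F>
  joining_thread execute(F f) const { return {std::thread(std::move(f))}; }
};

template <>
struct completes_inline<inline_like_scheduler> : std::true_type {};

// Wait state without synchronization: valid only if set() has already run
// on this thread by the time wait() is called.
template <class T>
class unsynchronized_state {
  std::optional<T> value_;
 public:
  void set(T v) { value_.emplace(std::move(v)); }
  T wait() { return std::move(*value_); }
};

// Wait state for the general case: blocks until another thread calls set().
template <class T>
class synchronized_state {
  std::mutex mtx_;
  std::condition_variable cv_;
  std::optional<T> value_;
 public:
  void set(T v) {
    { std::lock_guard lock(mtx_); value_.emplace(std::move(v)); }
    cv_.notify_one();
  }
  T wait() {
    std::unique_lock lock(mtx_);
    cv_.wait(lock, [this] { return value_.has_value(); });
    return std::move(*value_);
  }
};

// The trait picks the wait state at compile time.
template <class Scheduler, class T>
using wait_state_for =
    std::conditional_t<completes_inline<Scheduler>::value,
                       unsynchronized_state<T>, synchronized_state<T>>;

static_assert(std::is_same_v<wait_state_for<inline_like_scheduler, int>,
                             unsynchronized_state<int>>);
static_assert(std::is_same_v<wait_state_for<thread_like_scheduler, int>,
                             synchronized_state<int>>);

// Simplified stand-in for a blocking wait: run f on sched, return its result.
template <class Scheduler, class F>
auto run_and_wait(Scheduler sched, F f) {
  using result_t = std::invoke_result_t<F&>;
  wait_state_for<Scheduler, result_t> state;
  // handle is declared after state, so it is destroyed (and joined) first.
  auto handle = sched.execute([&] { state.set(f()); });
  return state.wait();
}
```

With this sketch, `run_and_wait(inline_like_scheduler{}, [] { return 2 * 21; })` never touches a mutex, while the same call with `thread_like_scheduler{}` goes through the locked path. Whether real senders expose enough information to specialize such a trait reliably is exactly the open question.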
If my understanding of `sync_wait` and `inline_scheduler` is incorrect, please let me know!
Otherwise, it would be interesting to know whether such overhead is avoidable in single-threaded cases or if it's inherently tied to the design.
Thanks for any insights!