Implement "perfect" waking for impl Merge for Vec (1/2) #50
This keeps a vector of futures that are ready to be polled and only polls those. Also adds more testing to make sure the waker logic is correct.
reuse inline wakers
Merged #51 into this branch. See the patch for details, but TL;DR: I've managed to reuse the inline wakers, bringing the perf regression down to ~16%, which seems acceptable. On the upside: we appear to be scaling with [...]. We should optimize this later, but for now this seems like a really good start and I'd like to merge it!
}

impl Wake for StreamWaker {
    fn wake(self: std::sync::Arc<Self>) {
        if !self.readiness.lock().unwrap().set_ready(self.id) {
            self.parent.wake_by_ref()
            let parent = self.parent_waker.as_ref().expect("No parent waker was set");
It's probably okay to silently do nothing if the parent waker is not set.
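One way the suggestion above could look is sketched below. The field names `readiness`, `id`, and `parent_waker` come from the diff; the `Readiness` type, the `Option<Waker>` field type, and the `CountingWaker` helper are assumptions made for illustration and are not taken from the patch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical readiness set: one flag per substream.
struct Readiness {
    ready: Vec<bool>,
}

impl Readiness {
    fn new(len: usize) -> Self {
        Self { ready: vec![false; len] }
    }

    /// Marks `id` as ready and returns whether it was already ready.
    fn set_ready(&mut self, id: usize) -> bool {
        std::mem::replace(&mut self.ready[id], true)
    }
}

/// Assumed shape of the per-substream waker.
struct StreamWaker {
    id: usize,
    readiness: Arc<Mutex<Readiness>>,
    parent_waker: Option<Waker>,
}

impl Wake for StreamWaker {
    fn wake(self: Arc<Self>) {
        let was_ready = self.readiness.lock().unwrap().set_ready(self.id);
        if !was_ready {
            // If no parent waker has been set yet, record the readiness
            // and silently do nothing instead of panicking.
            if let Some(parent) = self.parent_waker.as_ref() {
                parent.wake_by_ref();
            }
        }
    }
}

/// Hypothetical helper that counts how often the parent is woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

With this shape, a wake that arrives before the parent waker is set still records readiness, so the substream would be picked up on the next poll.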
This changes the implementation of merge for Vec to keep a vector of readiness information for each substream. This means we can avoid polling streams that we know have not been woken up yet.

This also required significantly re-working the benchmark so wakers actually work right. I've tried to keep the new version true to the spirit of the old version.
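The readiness idea can be sketched roughly as follows. This is a simplified model, not the code from the patch: the `Readiness` layout, the `count` field, and the `poll_ready` helper are hypothetical, and the `poll_one` closure stands in for polling a real substream.

```rust
/// Hypothetical per-substream readiness flags plus a count of set flags.
struct Readiness {
    ready: Vec<bool>,
    count: usize,
}

impl Readiness {
    /// All substreams start out ready so that each gets polled at least once.
    fn new(len: usize) -> Self {
        Self { ready: vec![true; len], count: len }
    }

    /// Marks `id` as ready and returns whether it was already ready.
    fn set_ready(&mut self, id: usize) -> bool {
        let was_ready = std::mem::replace(&mut self.ready[id], true);
        if !was_ready {
            self.count += 1;
        }
        was_ready
    }

    /// Clears the flag for `id` and returns whether it was ready.
    fn clear_ready(&mut self, id: usize) -> bool {
        let was_ready = std::mem::replace(&mut self.ready[id], false);
        if was_ready {
            self.count -= 1;
        }
        was_ready
    }

    fn any_ready(&self) -> bool {
        self.count > 0
    }
}

/// Calls `poll_one` only for substreams flagged as ready and returns the
/// indexes that were polled, in order.
fn poll_ready<F: FnMut(usize)>(readiness: &mut Readiness, mut poll_one: F) -> Vec<usize> {
    let mut polled = Vec::new();
    for id in 0..readiness.ready.len() {
        // Stop scanning once no flagged substreams remain.
        if !readiness.any_ready() {
            break;
        }
        if readiness.clear_ready(id) {
            poll_one(id);
            polled.push(id);
        }
    }
    polled
}
```

In this model a substream that has not been woken since its last poll is skipped entirely. The scan over the flags is still linear in the number of substreams, which is the cost that an index queue would aim to avoid.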
Performance is currently a mixed bag, but I think there's room to do better. For small vectors, this change makes things much worse. Performance seems to get better as the vector size increases though.
I suspect in real life vectors will mostly be small, so this change is probably not what we want in the current iteration. I suspect we can improve things quite a bit though, such as by:
- Using a Deque of indexes for ready tasks instead of a bitvector. This way we don't have to search the whole readiness vector. Again, this will probably help more on the high end and may even hurt on the low end.
- Keeping a Vec<Arc<StreamWaker>> on the Merge struct. This will probably help across the board, but especially on smaller vecs.

Summary of performance changes: