-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce foreign wake latency #434
Conversation
a14cfa3
to
edab172
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, and great numbers!
My main concern is that the eventfd handling code has broken a lot in the past. It is tricky and hard to test. Because of that, I am not convinced if using the same eventfd for both wake up notifications is the way to go. We need to add more special cases to a code that is already full of them.
Have you considered using a separate eventfd ?
@@ -1604,14 +1601,16 @@ impl Reactor { | |||
membarrier::heavy(); | |||
let events = process_remote_channels() + self.flush_syscall_thread(); | |||
if events == 0 { | |||
self.link_rings_and_sleep(&mut main_ring, &self.eventfd_src) | |||
.expect("some error"); | |||
if self.eventfd_src.is_installed().unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we shouldn't register two eventfds for this?
The wakeup code is tricky, and you are effectively adding one more condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be better to check the eventfd before linking the rings, because if the eventfd is not installed, then we're not going to sleep, so there's no need to do the linking at all.
@@ -27,6 +27,9 @@ pub(crate) struct Header { | |||
/// Current state of the task. | |||
pub(crate) state: u8, | |||
|
|||
/// Latency matters or not | |||
pub(crate) latency_matters: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be better to represent it as Option<RawFd>
, and add Some(eventfd)
. None
would signal that we don't need to write to the eventfd.
@@ -221,6 +227,9 @@ where | |||
dbg_context!(ptr, "foreign", { | |||
let notifier = raw.notifier(); | |||
notifier.queue_waker(Waker::from_raw(Self::clone_waker(ptr))); | |||
if (*raw.header).latency_matters { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use a separate eventfd, we can then write if Some(fd) = latency_matters { write(fd) }
@glommer There seems to be an issue with the existing code here (https://github.com/DataDog/glommio/blob/master/glommio/src/sys/uring.rs#L997-L1016) in that if it failed to install the eventfd because of running out of sqe, the next time when the executor runs here, it will going to sleep without installing the eventfd, because the result of the eventfd_src was already taken during the last run. I'm not quite sure about it, maybe I missed something, but if the reasoning is correct, it could be solved by using the same eventfd for both foreign wake and sleep awakening. |
glommio/src/sys/uring.rs
Outdated
.or_else(Reactor::busy_ok) | ||
.is_err() | ||
{ | ||
// What to do here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can't install the eventfd, then sleep should be denied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, and even in the case EBUSY was returned.
None, | ||
None, | ||
); | ||
assert!(main_ring.install_eventfd(&eventfd_src)); | ||
|
||
if !eventfd_src.is_installed().unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
install_eventfd is already doing the outer check, so why do we need to check again ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reactor::wait
is directly called in sys::uring::tests::timeout_smoke_test
, so it is here to make the test pass.
Ok. It seems you are also moving the eventfd handler from the main ring to the latency ring in all cases. That makes it simpler to unify. This seems like an ok approach to me. The sleep code is tricky, but it is the kind of tricky that no amount of code reading will convince me that is ok... there are always hard to predict corner cases. @HippoBaro Would you mind taking this for a production spin? Please note that you have to remove |
I'd be happy to. I'll be OOO next week though so it may take a while before I get to that. |
edab172
to
4ec375d
Compare
Greetings @thirstycrow! It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Thank you! |
0b4e59b
to
c09df63
Compare
Greetings @thirstycrow! It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Thank you! |
17acb3e
to
fcde0e1
Compare
@glommer Two more commits were added to avoid unnecessary writes to the eventfd, which is reduced by 86% in the |
glommio/src/sys/mod.rs
Outdated
@@ -278,10 +271,10 @@ impl fmt::Debug for OsError { | |||
pub(crate) struct SleepNotifier { | |||
id: usize, | |||
eventfd: std::fs::File, | |||
memory: AtomicUsize, | |||
is_sleeping: AtomicBool, | |||
notified: AtomicBool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the difference between is_sleeping
and notified
?
Do you rely on them both being manipulated as part of the same routine ? I don't see it, but please clarify.
Usually when you have two different atomic variables that have to be set in a particular order, that's when Relaxed
memory order is no longer enough.
glommio/src/sys/mod.rs
Outdated
0 => None, | ||
x => Some(x as _), | ||
pub(crate) fn notify_if_sleeping(&self) { | ||
if self.is_sleeping.load(Ordering::Relaxed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very likely wrong.
Notice that this had Acquire
semantics before, and you are moving it to Relaxed.
There are two atomic variables now, which in most cases is a dead giveaway that stronger consistency is needed.
Note that if the order is Relaxed
, two threads can see those updates in arbitrary order. If you rely on a test for both variables to happen in a particular order, the last of them (inside notify_if_needed) should have Acquire
/ Release
semantics.
glommio/src/sys/mod.rs
Outdated
@@ -370,12 +360,11 @@ impl SleepNotifier { | |||
// This will allow this `eventfd` to be notified. This should not happen | |||
// for the placeholder (disconnected) case. | |||
assert_ne!(self.id, usize::MAX); | |||
self.memory | |||
.store(self.eventfd.as_raw_fd() as _, Ordering::SeqCst); | |||
self.is_sleeping.store(true, Ordering::Relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
You are reducing the memory ordering guarantees while making the code more dependent on ordering by adding another atomic variable.
If you really do believe those should be Relaxed
, please try to make a case as to why (but very likely this can't be Relaxed
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thirstycrow I am still failing to see why we need two atomic variables.
If you swap is_sleeping
to false, wouldn't that have the same effect? Sure the name would be misleading, but we just have to rename the variable.
The sleep code would set this to 1 and then back to 0 at wake_up. The notification code would swap it to 0 and write to the eventfd if it was previously 1. There is a race here in that we can notify twice if we swapped to 0 on the notification side, and woke up at the same time. But I think your code has the same race, and also this is benign if the eventfd is handled correctly -- it will cause an extra exit but won't affect correctness.
I got around to trying this out on our in-house app and although I didn't see improvements, I didn't see any pref regression either (I disabled The change looks really good to me so pending Glauber's approval, I'll gladly merge this 🙇 |
I am happy overall but I think we need to make sure we nail the memory ordering. That's how you end up with Heisenbugs... Let's spend some more cycles on that, and I want to know why you think |
pub(crate) fn notify_if_sleeping(&self) {
if self.is_sleeping.load(Ordering::Relaxed) {
self.notify_if_needed();
}
}
pub(crate) fn notify_if_needed(&self) {
if !self.notified.swap(true, Ordering::Relaxed) {
write_eventfd(self.eventfd_fd());
}
} |
Greetings @thirstycrow! It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Thank you! |
1bf8752
to
254aca4
Compare
Greetings @thirstycrow! It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Thank you! |
@glommer You're right, one boolean should be enough, the code is updated to work that way. |
We are still, as a side-effect, moving the memory order to relaxed where it was SeqCst before. I think it should be fine, but if you have the chance to run the shared channel torture benchmark that would be a good test. Other than that I'm ready to merge this once it passes all tests @HippoBaro I strongly suspect that the dependency bot is broken, though. Maybe let's disable it for now if it keeps breaking. |
The test failed at the 84 round out of 100 with There's 2 executors created in each round of test and 12 more eventfds shown by |
The bot is not broken, if you look at the stdout of the job, it fails because
|
@HippoBaro How about letting that bot checks security problems against master branch or only trigger it when Cargo.toml changes? Control only emerging problems and leave the inspection of existing problems elsewhere. |
@waynexia This is a great idea. Let me merge this for now (I don't want to block you), and I will take some time to revisit the bot when I get the chance. |
What does this PR do?
Reduce foreign wake latency by sending notifications via the event fd installed on the latency ring. Notifications will only be sent if the task being waken was spawned on a queue with latency requirement. A comparison between with and without notifications produced by the foreign_wake benchmark:
Motivation
What inspired you to submit this pull request?
Related issues
A list of issues either fixed, containing architectural discussions, otherwise relevant
for this Pull Request.
Additional Notes
Anything else we should know when reviewing?
Checklist
[] I have added unit tests to the code I am submitting
[] My unit tests cover both failure and success scenarios
[] If applicable, I have discussed my architecture