Notifications out of order #108
Heyo. I believe this is what's happening:
|
I've just checked out the code, and at the moment it is certainly possible for this to happen when user code re-issues notification requests for times that are less than the current frontier. I'll explain the trade-offs, and maybe we can talk through which answer is best for you; the good news is that it is all user-level code now, so if you need a certain tweak, it is much easier to do.
This does come at a bit of a cost, in that heaps are more expensive than sorting (which we are already doing) but it seems reasonable that if the I'll whip up something today and report back. |
So, some very minor complexities. The If we switch over to a binary heap, Rust only has a max-heap, which means we will need to wrap each element in order to flip the comparison, and also that it will be a fair bit harder to return a type from I'm poking at the code, and this might work out with a static function passed to map (whose type we can name), or we could write a custom Otherwise the changes seem pretty easy, really. Just Rust getting in the way. ;) |
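For reference, the "wrap each element to flip the comparison" trick mentioned above can be sketched with the standard library's `std::cmp::Reverse`. This is only an illustration, not the code on any branch: it assumes totally ordered `u64` times (real timely timestamps are only partially ordered), and `drain_ascending` is a made-up helper name.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper: returns `times` in ascending order by pushing each
/// one into the standard max-heap wrapped in `Reverse`, so that the smallest
/// time compares as the largest element and is popped first.
fn drain_ascending(times: &[u64]) -> Vec<u64> {
    let mut heap = BinaryHeap::new();
    for &time in times {
        heap.push(Reverse(time));
    }
    let mut ordered = Vec::with_capacity(times.len());
    // Each pop yields the minimum remaining time because of the wrapper.
    while let Some(Reverse(time)) = heap.pop() {
        ordered.push(time);
    }
    ordered
}
```

Calling `drain_ascending(&[3, 0, 2, 1])` pops the times smallest first. The wrapper is what makes the element type awkward to name in a return type, which is the complication described in the comment above.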
Something I noticed here, which I'm not clear on (and I don't think it's covered by this example): what do you think should be the semantics of requesting a notification at time Should the delivery be monotonic still, jumping back for the just-requested notification? Example:

    let index = worker.index();
    let mut input = InputHandle::new();
    let _probe = worker.dataflow(|scope| {
        let mut cap = None;
        scope.input_from(&mut input)
            .exchange(|x| *x)
            .unary_notify::<(), _, _>(Pipeline, "foo", Vec::new(),
                move |input, _, notificator| {
                    input.for_each(|time, _| {
                        if cap.is_none() && *time == RootTimestamp::new(0) {
                            cap = Some(time.clone()); // keep a capability for (Root, 0)
                        }
                        notificator.notify_at(time);
                    });
                    notificator.for_each(|curr, _, notif| {
                        println!("curr: {:?}", curr.time());
                        if *curr == RootTimestamp::new(2) {
                            // Use that capability to ask for a notification at a
                            // timestamp lower than the notification being delivered.
                            notif.notify_at(cap.take().unwrap());
                        }
                    });
                })
            .probe()
    });
    for round in 0..5 {
        if index == 0 {
            input.send(round);
        }
        input.advance_to(round + 1);
    }

Should this be (as it is currently):
or,

    curr: (Root, 0)
    curr: (Root, 1)
    curr: (Root, 2)
    curr: (Root, 0) <--
    curr: (Root, 3)
|
@utaal The thing I was just about to do would hop backward in time for you, but it does raise a good point: we can't actually guarantee that you see a non-decreasing sequence of notifications because you could interactively issue a decreasing sequence of notifications that will clear. So it seems the question is mostly about "should notificator tell you about notifications relative to the input frontier, or should it tell you about notifications relative to some internal capability frontier?". E.g. perhaps the example output above should have been
because we should deliver nothing beyond the time zero capability as long as you are still holding on to a time zero capability. This is possibly hard to do, because the capabilities still exist (in the notificator), but we need to know about the existence of extra-notificator capabilities (e.g. the difference between the operator holding a It's a good question and worth thinking about. I believe we can address Sebastian's issue with a "strict improvement" (perhaps with |
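To make the "internal capability frontier" reading concrete, here is a rough sketch of the delivery rule it would imply. Everything in it is an assumption for illustration: times are plain `u64` values rather than partially ordered timestamps, and `deliverable` and its parameters are hypothetical names, not part of timely's API. It also sidesteps the hard part noted above, namely how the notificator would learn about capabilities held outside of it.

```rust
/// Hypothetical rule: a notification at `time` may be delivered only if
/// (a) the input frontier has passed `time`, and
/// (b) no capability held outside the notificator is strictly earlier
///     than `time`, since that capability could still request an earlier
///     notification.
fn deliverable(time: u64, input_frontier: u64, held_capabilities: &[u64]) -> bool {
    let complete = time < input_frontier;
    let not_beyond_held = held_capabilities.iter().all(|&cap| time <= cap);
    complete && not_beyond_held
}
```

Under this rule, with a capability for time 0 still held, `deliverable(0, 3, &[0])` is true but `deliverable(1, 3, &[0])` is false; once the capability is dropped, `deliverable(1, 3, &[])` becomes true.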
Absolutely. I reported because while thinking about the issue @gandro reported, I realised I didn't know exactly what to expect in this other case.
Not sure this is that rare, actually. Sessionisation could totally use this, for example (emit the transaction tree at the timestamp it began, but only once we've waited for all leaves, or a timeout). |
I had indeed not considered the fact that you can force a jump backwards in time by holding on to a capability. I would be absolutely fine with an alternative user-space way of getting monotonic notifications, and with being required to write this myself, but I do think it's a very useful thing to have for many stateful operators. |
I'm writing a bit of the code, and it seems I have underestimated the challenge in doing this correctly. :) While we can use a heap for the At the moment I see a few options:
I think 2 + optional 3 sounds pretty good, and is a nice motivation to use I'm not aware of the downside of 2 + 3, in that I don't know what the workload is that wants to avoid the work linear in Edit: Alternate option number 4.: Provide the in-order property for |
From some offline conversation: [In However, each time we change As long as we have the clear separation of And actually, I'm looking at
Actually, one way to think of We can add a |
I've pushed a branch that intends (and appears) to deliver the least available notification from
https://github.com/frankmcsherry/timely-dataflow/tree/monotonic_notification
A thing worth checking out is the modified test, which attempts to exercise the out-of-order behavior that was previously problematic.
I think most of this comes at near-zero incremental cost. The observation, trickling out of the conversation @utaal posted above, is that as a There are still several pending questions about notificators that could be shaken out. Here are a few:
|
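As a rough illustration of "deliver the least available notification", here is a toy model that replays the scenario from the earlier example. It is not the code on the branch: it assumes totally ordered `u64` times and a single frontier value, and `PendingNotifications`, `next`, and `replay` are hypothetical names chosen for this sketch.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a notificator over totally ordered `u64` times.
struct PendingNotifications {
    pending: BinaryHeap<Reverse<u64>>,
}

impl PendingNotifications {
    fn new() -> Self {
        PendingNotifications { pending: BinaryHeap::new() }
    }

    /// Requests a notification at `time`.
    fn notify_at(&mut self, time: u64) {
        self.pending.push(Reverse(time));
    }

    /// Returns the least pending time strictly below `frontier`, if any.
    fn next(&mut self, frontier: u64) -> Option<u64> {
        let ready = matches!(self.pending.peek(), Some(&Reverse(time)) if time < frontier);
        if ready {
            self.pending.pop().map(|Reverse(time)| time)
        } else {
            None
        }
    }
}

/// Replays the scenario from the thread: notifications are requested for
/// times 0 through 3, and while handling time 2 the operator re-requests time 0.
fn replay() -> Vec<u64> {
    let mut notifications = PendingNotifications::new();
    for time in 0..4 {
        notifications.notify_at(time);
    }
    let mut delivered = Vec::new();
    let mut rerequested = false;
    // The frontier is assumed to have already advanced to 4.
    while let Some(time) = notifications.next(4) {
        delivered.push(time);
        if time == 2 && !rerequested {
            rerequested = true;
            notifications.notify_at(0);
        }
    }
    delivered
}
```

In this model `replay()` yields `[0, 1, 2, 0, 3]`: the re-requested time 0 is delivered immediately after 2 and before 3, which corresponds to the "hop backward" ordering discussed earlier in the thread.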
This looks great! I'll give it a shot and try to move the sessionization code to the new notificator API |
Ideally the API should be the same. If it seems like it is different, let me know! The intent was that |
Maybe I'm missing something, but the current version on the branch does not create the monotonic Notificator for |
All instances of Notificator are now monotonic. Instances of FrontierNotificator may not be, depending on how you use them. |
Ah, then I indeed misread it! That simplifies things of course! Thanks |
Sorry for the delay, but yes, it seems to do the trick. The old sessionization code works again without modifications. Thanks! |
Sweet; PR #109 has been merged, so I'll close down the issue! |
I was under the assumption that the order in which notifications are delivered to an operator follows the partial order defined on timestamps. At least this seemed to be the case in past versions of timely, and our sessionization code relies on it.
However, the following code (which, like sessionization, sometimes requests notifications for future times while handling a notification) observes notifications in a weird order:
Source code: