-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UV_RUN_ONCE regression after commit 6600954 #4260
Comments
Thanks for posting, I wasn't sure. IMHO any change to the loop diagram is a no-no since it shows a change in behavior, which many applications have grown to expect. |
I think originally this code was implemented on top of ev_check in libev, with the intent that check should run before IO callbacks and timers should run after IO. This was so that check handlers could correctly consume events from the poll handle before the IO callbacks (which might feed changes to those). The libuv version of this diagram has always been slightly incomplete and inaccurate however, since it ignores and combines several IO phases. It appeared that check was not intended for keeping the loop alive (in particular, close callbacks have always run after it, so any attempt to use it to keep the loop alive would have already been incorrect in the presence of close callbacks). There also appears to be some discussion about how the library is intended to be used by the Perl Coro for And specifically also covers how to implement your use (calling it "Abusing an ev_check watcher for its side-effect"): |
Not really. It was defined to run after the polling phase. Yes, there is no naming baggage, which was to some extent inherited from libev to be sure, but it is what it is. An interesting example is uv_idle_t, which has nothing to do with being idle. but since we promised to do v1.x forever, it is what it is.
The main building blocks were correct, however.
No, check was never meant to keep the loop alive. Check was defined to run last, which gave you a chance to do something to keep the loop alive. That something can be to start an idle handle for example. While the PR didn't break the API or ABI it did break userland applications, which I think we shouldn't do. I don't think this was intentional, we can fix it, bit we first need to understand how we wanrt to do that. |
It wasn't intentional and I agree we shouldn't be breaking userland applications. We've always told users that they shouldn't write code that depends on event loop internals, but I guess that's not realistic in all cases.
TBH I'm not following this. check handles still run, which should still create the idle handle, so the return from Unless somehow |
Arguably this is observable (and documented, in the diagram) behavior. IIRC we used to say that to Node land, not to those using libuv directly.
Let me try to rephrase: With the previous behavior:
With the current one:
|
Makes sense, and thanks for the example. That clears things up. When I made the change I thought the event loop diagram was only informative on how things are run, and didn't realize it was a guarantee on execution order (couldn't find any other documentation that check should always run last). Hypothetically it would make the most sense for timers to run after the poll phase, but that causes |
Is the consensus here to revert? |
Or is it to move check to be last again? |
I am confused though, since actually 'close' runs last in nodejs, so using check for the stated use case sounds like it would always have been wrong |
Please take a look at #4260 (comment) I think I did a better job explaining the situation there. Close callbacks don't have a relevance here. |
I may be missing something from the context, but why are promises that were started by close events (or one of the other run-once modes that nodejs may use at times) not relevant to correct uses of the framework? Shouldn't the code need to immediately ref the idle callback as soon as the promise is created, or otherwise will risk skipping the check and idle callbacks entirely in a variety of existing scenarios? |
I personally think that it would be better to move running check handles and closing handles after timers handles, but unfortunately I find this slightly confusing since if the old diagram was literal then the test should have failed already. The test asserts that the check handle runs before the timer handle. Which is a contradiction to what @saghul said was expected. So there's a discrepancy in what was documented and what is expected. Here's the diff on linux to make sure I'm clear about what I mean: diff --git a/src/unix/core.c b/src/unix/core.c
index 25c5181f..0e923ae8 100644
--- a/src/unix/core.c
+++ b/src/unix/core.c
@@ -460,5 +460,2 @@ int uv_run(uv_loop_t* loop, uv_run_mode mode) {
- uv__run_check(loop);
- uv__run_closing_handles(loop);
-
uv__update_time(loop);
@@ -466,2 +463,5 @@ int uv_run(uv_loop_t* loop, uv_run_mode mode) {
+ uv__run_check(loop);
+ uv__run_closing_handles(loop);
+
r = uv__loop_alive(loop); |
Not quite. It expects the check cb to be called before the timer fires because it's not due. Check here for example, in an older release: Line 385 in 4e69e33
We poll for the given timeout, then we run check handles, and since the loop was ran with UV_RUN_DEFAULT, we go back to the top and run the timers. |
Note I'm nor using Node here. It's a different app, but it so happens to also be a JavaScript runtime :-) In my case, close handles don't run any JS code, they just cleanup, so it's not possible for them to enqueue any work / promises. At this point I do have a workaround that is not ugly and does the job, but I raised the problem because it's very subtle and I think it could be hard to find in a large app for someone not intiamtely familiar with libuv's internals. I wrote my code based on my understanding of the guarantees libuv provides, which I documented in that diagram 9 years ago. For the past 9 years that has held true, until now. |
The test showcases a hypothetical race condition. Take the following change: diff --git a/test/test-timer-from-check.c b/test/test-timer-from-check.c
index e5f5cb2f..e50c3115 100644
--- a/test/test-timer-from-check.c
+++ b/test/test-timer-from-check.c
@@ -69,2 +69,3 @@ TEST_IMPL(timer_from_check) {
ASSERT_OK(uv_timer_start(&timer_handle, timer_cb, 50, 0));
+ uv_sleep(100);
ASSERT_OK(uv_run(uv_default_loop(), UV_RUN_DEFAULT)); This snippet causes the test to fail both before and after the change. The process being artificially hung up and preventing timers from running before reaching the timeout is unlikely to happen, but it's more about the mental modal of how the event loop is supposed to work. It's a demonstration of why I thought timer execution placement within the loop was only informative and not a guarantee. Because timer execution is unreliable by nature. This might be more of an argument to support why I'd prefer the change I suggested above, and to change the test, but seems that might open another can of worms. |
Did we reach a conclusion here? |
I'm somewhere in the neighbourhood of -0.5 on keeping the existing behavior. Maybe not many would run into problems, it's hard to tell. I wonder if internally separating I could maybe take crack at that this week or the next. |
IMO we should use the change from #4260 (comment), which would fix the problem @saghul is having, and we change the test |
Refs: 6600954
Here is how my code works in a nutshell:
After this change, timers run last, and thus my check handle hasn't run as the last thing, which is what I used to keep the loop alive by starting the idle handle. This means the loop could exit early, if new promises were created on a timer, since check handles no longer run afterwards.
Originally posted by @saghul in #3686 (comment)
The text was updated successfully, but these errors were encountered: