Fix DefaultExecutor not being able to exit #1876

Merged

elizarov merged 7 commits into develop from defaultDispatcherTimeout on Mar 30, 2020

Conversation

dkhalanskyjb (Collaborator):

Solves #856.

@elizarov (Contributor) left a comment:

While we are here fixing this code, I'd also suggest making the following performance improvement.

See, right now, looping over events works like this:

  1. The next event is taken from the queue and executed.
  2. nextTime is called to see when the event after this one happens (and the queue is analyzed for that). Assume we have an event in the queue, so
  3. We immediately go back to 1, where we look at the queue again to remove this event.

It seems inefficient to check the queue twice in a row. Instead, I'd suggest always returning 0 from processNextEvent when an event was just executed, and returning nextTime only when no event was executed.
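
To make the suggestion concrete, here is a minimal, self-contained sketch of the idea (a simplified stand-in, not the real EventLoopImplBase; the class and its trivial nextTime exist only for illustration):

```kotlin
import java.util.ArrayDeque

// Simplified stand-in for the event loop. The point of the suggestion:
// return 0 right after a task was executed instead of scanning the queue
// a second time via nextTime.
class SimplifiedEventLoop {
    private val queue = ArrayDeque<Runnable>()

    fun enqueue(task: Runnable) {
        queue.addLast(task)
    }

    // Nanoseconds until the next event; Long.MAX_VALUE means nothing is scheduled.
    private val nextTime: Long
        get() = if (queue.isEmpty()) Long.MAX_VALUE else 0L

    fun processNextEvent(): Long {
        val task = queue.pollFirst()
        if (task != null) {
            task.run()
            return 0L // an event was just executed; the caller should simply call us again
        }
        return nextTime // the queue is consulted only when nothing was executed
    }
}
```

With this change, a return value of Long.MAX_VALUE unambiguously means that nothing at all is scheduled, which matters for the shutdown logic discussed below.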

kotlinx-coroutines-core/common/src/EventLoop.common.kt (comment on an outdated diff, resolved)

```diff
@@ -249,9 +258,15 @@ internal abstract class EventLoopImplBase: EventLoopImplPlatform(), Delay {
         }
     }

-    override fun processNextEvent(): Long {
+    override fun popNextTask(onlyUnconfined: Boolean): Runnable? {
```

@elizarov (Contributor):

I cannot find a place where this method is called with onlyUnconfined = true.

@dkhalanskyjb (Collaborator, Author):

If we do this optimization, then the decomposition of process into the (poorly named) pop and run is not needed: I introduced it only so that DefaultExecutor could distinguish between a situation where tasks arrive more slowly than they are executed (and process always returns Long.MAX_VALUE) and one where there are no tasks at all (where process also always returns Long.MAX_VALUE).

Should I keep the decomposition, or should I revert it and instead implement the small change to the logic of process that also serves as a performance optimization?
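
For illustration, here is a rough, hypothetical sketch (not the actual DefaultExecutor code; the names KEEP_ALIVE_NANOS and workerLoop are made up) of how the worker loop could rely on the optimized return value alone: once Long.MAX_VALUE unambiguously means "nothing is scheduled", the keep-alive countdown can be started without the pop/run decomposition.

```kotlin
import java.util.concurrent.locks.LockSupport

// Hypothetical keep-alive, roughly mirroring the 1000 ms inactivity timeout
// mentioned in the commit message below.
const val KEEP_ALIVE_NANOS = 1_000_000_000L

// Assumed contract of processNextEvent: 0 if a task was just executed,
// a positive wait time if a task is scheduled for later,
// Long.MAX_VALUE if nothing is scheduled at all.
fun workerLoop(processNextEvent: () -> Long) {
    var shutdownDeadline = Long.MAX_VALUE
    while (true) {
        val wait = processNextEvent()
        if (wait == Long.MAX_VALUE) {
            // Nothing is scheduled: arm the keep-alive deadline, then park.
            val now = System.nanoTime()
            if (shutdownDeadline == Long.MAX_VALUE) shutdownDeadline = now + KEEP_ALIVE_NANOS
            if (now >= shutdownDeadline) return // idle long enough, the worker thread may exit
            LockSupport.parkNanos(shutdownDeadline - now)
        } else {
            shutdownDeadline = Long.MAX_VALUE // work was done or is due soon
            if (wait > 0) LockSupport.parkNanos(wait)
        }
    }
}
```

A real implementation would also unpark the worker when new work is submitted; that part is omitted from this sketch.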

@elizarov (Contributor) commented Mar 24, 2020:

@dkhalanskyjb

> If we do this optimization, then the decomposition of process into (poorly named) pop and run is not needed

It would also be great to avoid the decomposition. It will make the code easier to understand.

Commit message for the "Make `testDelayChannelBackpressure2` not fail" commit:

This test could in theory already fail on the second
`checkNotEmpty`: after the first `checkNotEmpty` has passed,
first, the ticker channel awakens to produce a new element, and
then the main thread awakens to check if it's there. However, if
the ticker channel is sufficiently slowed down, it may not produce
the element in time for the main thread to find it.

After introducing the change that allows the worker thread in
`DefaultExecutor` to shut down, the initial delay of 2500 ms is
sufficient for the worker to shut down (which, by default, happens
after 1000 ms of inactivity), and then the aforementioned race
condition worsens: additional time is required to launch a new
worker thread and it's much easier to miss the deadline.

Now, the delays are much shorter, meaning that the worker thread is
never inactive long enough to shut down.
Also adds the performance optimization. See the discussion on the
PR.
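
For context, here is a rough, hypothetical reconstruction of the shape of the test (the real `testDelayChannelBackpressure2` uses different delays and helper assertions; the numbers below are illustrative only), showing why a long initial delay interacts badly with the worker's 1000 ms keep-alive:

```kotlin
import kotlinx.coroutines.channels.ticker
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Illustrative numbers: short enough that the DefaultExecutor worker never stays
    // idle for the ~1000 ms needed to shut down between ticks.
    val channel = ticker(delayMillis = 100, initialDelayMillis = 150)

    delay(200)                             // with the old 2500 ms initial delay, the worker could shut down here
    check(channel.tryReceive().isSuccess)  // first "checkNotEmpty"

    delay(150)                             // give the ticker time to produce the next element
    check(channel.tryReceive().isSuccess)  // second "checkNotEmpty": racy if a fresh worker must be started first

    channel.cancel()
}
```

With the shortened delays the worker never stays idle for the full keep-alive period, so the race described above cannot occur.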
@elizarov merged commit 96b41b4 into develop on Mar 30, 2020
@elizarov deleted the defaultDispatcherTimeout branch on Mar 30, 2020, 15:54
qwwdfsad pushed a commit that referenced this pull request on Apr 24, 2020:

* Fix DefaultExecutor not being able to exit.
* Also adds the performance optimization. See the discussion on the PR.
* Add a stress test for the DefaultExecutor worker shutting down.
* Make `testDelayChannelBackpressure2` not fail (the race it guards against is explained in the commit message above).

Fixes #856
recheej pushed a commit to recheej/kotlinx.coroutines that referenced this pull request on Dec 28, 2020 (same commit message as above).