Switch to fully batched scheduling & add blocking task warning #127
Conversation
Any idea how this impacts performance?
On the performance comment. One of my concerns is about surrounding the execution of each task with timer calls. In my experience, timer calls aren't free and I've been burned by this sort of thing in the past (accidentally creating code that becomes "timer-bound" if you will). The uvloop document referenced makes a note about avoiding too many timer-related system calls as well. With that in mind, I might be inclined to make the timer-related warning optional.
Maybe just relax the timer to only measure the total time of the loop cycle instead of per-task? Sure, we would lose some information, but this is less of an overhead IMHO.
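The per-cycle alternative suggested here could look something like the following sketch. This is a hypothetical illustration of the idea, not curio's actual kernel code: a single pair of `time.monotonic()` calls brackets the whole batch, rather than one pair per task.

```python
# Hypothetical per-cycle timing: one time.monotonic() pair per loop cycle
# instead of one pair per task. Names and the 50 ms threshold are
# illustrative assumptions, not curio's real implementation.
import time
from collections import deque

def run_cycle(ready, max_cycle_time=0.05):
    """Run one batch of ready tasks, timing the cycle as a whole."""
    start = time.monotonic()
    for _ in range(len(ready)):      # only tasks ready at cycle start
        task = ready.popleft()
        task()                       # run one step of the task
    elapsed = time.monotonic() - start
    if elapsed > max_cycle_time:
        print(f"warning: loop cycle took {elapsed * 1000:.1f} ms")

ready = deque([lambda: None] * 4)
run_cycle(ready)
```

The trade-off is exactly as described: you learn that *some* task in the cycle was slow, but not which one, in exchange for a constant number of timer calls per cycle.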
I did some looking at the gevent source to see how it handles this. So far as I can tell, it is NOT batching tasks around the event loop quite like this. If I'm reading the source correctly (the …)
Looking at asyncio, it appears to be doing some kind of batching similar to what's seen in this PR. The implementation is a bit simpler. It simply computes the length of the ready queue when it starts scheduling and only schedules that many tasks, regardless of whether more tasks get added to the queue in the process. I think I like that better (you don't need to manage two separate queue objects). asyncio also includes some code to check for slow callbacks using timers. However, that code is only enabled if it's running in debug mode.
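The length-snapshot trick described here can be sketched in a few lines. This is a rough illustration of the idea, not asyncio's actual `_run_once` code:

```python
# Snapshot-based batching: take len() of the ready queue once, then run
# exactly that many tasks. Tasks appended mid-cycle wait for the next cycle,
# so no second queue object is needed.
from collections import deque

def schedule_one_cycle(ready):
    for _ in range(len(ready)):   # snapshot: ignores appends made below
        task = ready.popleft()
        task(ready)               # a task may append more work to `ready`

# A task that keeps re-adding itself still runs only once per cycle:
runs = []
def task(ready):
    runs.append(len(runs))
    ready.append(task)

q = deque([task])
schedule_one_cycle(q)
assert runs == [0] and len(q) == 1   # ran once; the re-added copy waits
```

Without the snapshot, the `while ready:` style of loop would let the self-scheduling task starve everything else indefinitely.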
I think we're making a …
I've created a wiki page where we can put some description of curio's event loop internals. I think we need some generalization to prevent big, chaotic fluctuations in the direction the event loop develops.
Although the …
@dabeaz, ok, if we stick with keeping all functions in a local namespace, we can at least introduce something like …
This commit makes curio's scheduling proceed in strict batches: each batch starts with polling for I/O, and if a task becomes runnable now, it won't actually run until the next batch. Basically this gives us the invariant that no single task can be scheduled twice without checking for I/O in between. This prevents various obscure pathologies (see gh-112), and brings us in line with how most (all?) other event loops work; see for example: http://docs.libuv.org/en/v1.x/design.html#the-i-o-loop
There's a lot more we could do than this in terms of metrics and debugging, but it's a start :-). 50 ms was chosen pretty much completely arbitrarily. I suspect it's too high, but we can always fix it later.
Starting with some simple scheduler microbenchmarks. For information about asv, see:

    https://asv.readthedocs.io/
    https://github.com/spacetelescope/asv

Or to get started:

    pip install asv
    cd benchmarks
    asv run
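For readers unfamiliar with asv's layout: benchmarks live in a module (conventionally `benchmarks/benchmarks.py` next to an `asv.conf.json`), and any method whose name starts with `time_` is treated as a timing benchmark. The class and body below are an illustrative stand-in, a pure-Python queue-draining loop, not the actual curio microbenchmarks added in this PR:

```python
# Hypothetical asv benchmark module. The time_* prefix and setup() hook are
# asv's naming conventions; the benchmark body itself is a stand-in.
from collections import deque

class SchedulerSuite:
    n_tasks = 1000

    def setup(self):
        # setup() runs before each timing repeat and is excluded from it.
        self.ready = deque(lambda: None for _ in range(self.n_tasks))

    def time_drain_ready_queue(self):
        ready = deque(self.ready)   # work on a copy; keep setup state intact
        while ready:
            ready.popleft()()
```

`asv run` then times each `time_*` method repeatedly and records the results per commit, which is what makes the trend-over-time graphs possible.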
Oo, that is clever. Rewrote this to use that trick.
I just added a new commit to this PR with the beginnings of a benchmark suite using asv. Seems like a useful thing to have in general. And I added two tiny pure-scheduler microbenchmarks. (Sorry this PR is getting so unfocused -- I can split it up if you want.) I then tried using these microbenchmarks to compare the version with the time_monotonic calls to the version without, and wasn't able to get any reliable effect -- the impact seems to be smaller than the trial-to-trial noise. I also tried using perf, which works harder to tease out small differences, and it told me that on average the version with the time_monotonic calls was faster. So, I can't put a precise number on it, but it seems that even on a microbenchmark that does nothing but scheduling, the slowdown is pretty small (less than, say, 10%?), or I'd have been able to detect it :-).
This actually looks really promising. Any idea how we can incorporate the graphs with results into curio's documentation page or somewhere publicly available? It would be really helpful to know the impact that each PR will have on performance before merging it (without actually running any benchmarks against the target branch in your local env to check this).
Well, if we want public results pages then someone has to set up a server
to run the benchmarks and generate the pages. The problem is that
benchmarks need a fairly stable, quiet system, so you can't really use
travis-ci or similar services. ASV is set up to support fairly ad hoc
benchmark running (you can do things like ask it: "please benchmark all
commits that are new since the last time I ran this", and then upload the
results files to a server). But it isn't as strong at measuring single
commits as it is at detecting trends over time, and it's hard to have an
automated system run benchmarks on PRs because of the security issues
around running untrusted code.
In the meantime, it's certainly possible to use ASV to test a PR locally!
I'm going to merge this so that it doesn't languish on Github. I want to modify it so that the time check is optional though -- only enabled in a debugging mode or if set to a non-zero value.
This PR makes curio's scheduling proceed in strict batches: each batch
starts with polling for I/O, and if a task becomes runnable now, it won't
actually run until the next batch. Basically this gives us the invariant
that no single task can be scheduled twice without checking for I/O in
between.
This prevents various obscure pathologies (see gh-112), and brings us in
line with how most (all?) other event loops work; see for example:
http://docs.libuv.org/en/v1.x/design.html#the-i-o-loop
It also adds a warning if any individual task blocks the event loop
for more than 50 ms. This value is pretty arbitrary, but at least it's
a start. Combined with the above change, this should at least be
sufficient to detect any gratuitous starvation issues.
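Putting the two changes in the description together, the loop structure can be sketched as follows. This is a minimal illustration of the described design (poll for I/O first, then run only the tasks that were ready when the batch started, warning on any task that blocks too long); the function names and the `selectors`-based polling are assumptions for illustration, not curio's actual kernel:

```python
# Sketch of a fully batched event loop with a blocking-task warning.
import selectors
import time
from collections import deque

LONGBLOCK_THRESHOLD = 0.05   # 50 ms, as in the PR (arbitrary for now)

def kernel_loop(selector, ready):
    while ready or selector.get_map():
        # 1. Poll for I/O first. Don't block if tasks are already runnable.
        timeout = 0 if ready else None
        for key, _ in selector.select(timeout):
            ready.append(key.data)          # wake the task waiting on this fd

        # 2. Run one strict batch. The range() snapshot means tasks that
        #    become runnable during this batch wait for the next poll,
        #    so no task runs twice without an I/O check in between.
        for _ in range(len(ready)):
            task = ready.popleft()
            start = time.monotonic()
            task()
            elapsed = time.monotonic() - start
            if elapsed > LONGBLOCK_THRESHOLD:
                print(f"warning: task blocked the loop "
                      f"for {elapsed * 1000:.0f} ms")
```

The warning check costs two `time.monotonic()` calls per task, which is the overhead the earlier comments in this thread debate making optional or per-cycle.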