add sugar methods/properties to AsyncResult #1548

minrk · 2012-04-04T02:47:04Z

Adds the following attributes / methods:

ar.wall_time = received - submitted
ar.serial_time = sum of serial computation time
ar.elapsed = time since submission (wall_time if done)
ar.progress = (int) number of sub-tasks that have completed
len(ar) = # of tasks
ar.wait_interactive(): prints progress

These are simple methods derived from the metadata/timestamps already created. But I've been persuaded by @wesm's practice of including simple methods that do useful (and/or cool) things.
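As a rough illustration of how these quantities follow from per-task timestamps, here is a minimal standalone sketch. It does not use IPython.parallel; the `tasks` list, the dictionary layout, and the helper functions are assumptions made for the example, and only the timestamp names (submitted, started, completed, received) come from this thread.

```python
from datetime import datetime, timedelta

# Hypothetical per-task metadata; only the timestamp names come from the PR.
t0 = datetime(2012, 4, 4, 12, 0, 0)
tasks = [
    {"submitted": t0, "started": t0 + timedelta(seconds=1),
     "completed": t0 + timedelta(seconds=4), "received": t0 + timedelta(seconds=5)},
    {"submitted": t0, "started": t0 + timedelta(seconds=2),
     "completed": t0 + timedelta(seconds=6), "received": t0 + timedelta(seconds=8)},
    # A task that has not finished yet.
    {"submitted": t0, "started": None, "completed": None, "received": None},
]


def progress(tasks):
    """Number of sub-tasks that have completed."""
    return sum(1 for t in tasks if t["completed"] is not None)


def serial_time(tasks):
    """Sum of the computation time of each finished task, in seconds."""
    return sum((t["completed"] - t["started"]).total_seconds()
               for t in tasks if t["completed"] is not None)


def wall_time(tasks):
    """Last 'received' minus first 'submitted', in seconds (all tasks must be done)."""
    return (max(t["received"] for t in tasks)
            - min(t["submitted"] for t in tasks)).total_seconds()


finished = tasks[:2]
print(progress(tasks), len(tasks), serial_time(tasks), wall_time(finished))
```

Comparing `serial_time` with `wall_time` gives a quick feel for how much was gained by running in parallel.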

This also required/revealed some minor fixes/cleanup to clear_output in some cases:

dedent base core.displaypub.clear_output, so it's actually defined in the class
clear_output publishes '\r\b', so it will clear terminal-like frontends that don't understand full clear_output behavior.
core.display.clear_output() still works, even outside an IPython session.
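For context on why a carriage return helps in terminal-like frontends, here is a small sketch of rewriting a single status line in place. It is not the code from this PR: the `print_progress` helper is hypothetical, and it uses only `'\r'` rather than the exact `'\r\b'` sequence mentioned above.

```python
import io
import sys


def print_progress(done, total, stream=sys.stdout):
    """Rewrite one status line in place on a terminal-like stream."""
    # '\r' moves the cursor back to the start of the line, so this write
    # overwrites whatever status was printed before.
    stream.write("\r%3i/%3i tasks finished" % (done, total))
    stream.flush()


# Write to an in-memory buffer so the example has no terminal dependency.
buf = io.StringIO()
for done in range(4):
    print_progress(done, 3, stream=buf)
buf.write("\n")
```

Passing `sys.stdout` (the default) instead of `buf` shows the line updating in place in a real terminal.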

minrk · 2012-04-04T04:57:08Z

At @fperez's request, I had a go at addressing the flicker when using clear_output in the notebook. I used a simple timeout, which gets cleared/flushed immediately on the next output-related action.

I won't object to moving that to a separate PR, but I did it here as it is related, and this new code makes significant use of the clear;print;repeat pattern for which the flicker is most problematic.
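The deferred-clear idea described above can be modeled in a few lines. This is a toy Python model, not the notebook's JavaScript: the `OutputArea` class and its method names are hypothetical, and the timer is replaced by an explicit `on_timeout()` call so the behavior is easy to follow.

```python
class OutputArea:
    """Toy model of an output area whose clear is deferred (hypothetical)."""

    def __init__(self):
        self.lines = []
        self.clear_pending = False

    def clear_output(self):
        # Do not blank the area yet; only remember that a clear was requested.
        self.clear_pending = True

    def _flush_clear(self):
        if self.clear_pending:
            self.lines = []
            self.clear_pending = False

    def append(self, text):
        # The next output-related action applies the pending clear immediately.
        self._flush_clear()
        self.lines.append(text)

    def on_timeout(self):
        # Stand-in for the timer firing with no new output: clear anyway.
        self._flush_clear()


area = OutputArea()
area.append("step 1")
area.clear_output()
visible_during_gap = list(area.lines)  # old content is still shown here
area.append("step 2")
```

Because the old content stays visible until the replacement arrives, a clear;print;repeat loop never shows an empty area between frames.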

minrk · 2012-04-04T04:59:02Z

This notebook shows a few examples of print, the new AR.wait_interactive, and plotting with clear_output, all of which seem better behaved than previously.

minrk · 2012-04-09T22:21:22Z

This PR now depends on PR #1563

fperez · 2012-04-14T04:13:25Z

@minrk, I merged #1563 but now it's producing some conflicts. Could you rebase this one? I'll be happy to review it quickly...

fperez · 2012-04-14T04:14:58Z

And as I commented in that other one, might as well make the notebook with examples part of the PR, so we beef up the notebooks we provide with the system for users. That will become even more useful once we start including their HTML in the docs, along with a link to the original .ipynb file, as we'll be able to point users to those pages in the docs.


minrk · 2012-04-14T04:48:42Z

rebased - I'll add the notebook in the morning, probably.

minrk · 2012-04-14T05:37:38Z

Progress notebook added, demos:

clear_output/print/flush
AsyncResult.wait_interactive
clear_output/display(Figure)
HTML/JS progress bar
PyMC ProgressBar class

fperez · 2012-04-14T07:30:04Z

docs/examples/notebooks/Progress.ipynb

+     "collapsed": false,
+     "input": [
+      "from IPython.core.display import clear_output",
+      "for i in range(10):",


It's missing an import of time here.

Ah, yes. I have import os, sys, time in my startup files, so I always forget to import them.

fperez · 2012-04-14T07:38:51Z

Minor comments inline, easy to fix up and we'll be good to go. This is awesome.


minrk · 2012-04-14T19:32:56Z

I made your edits to the notebook, and started a doc for the AsyncResult object. I really want to make it clear that the AsyncResult object itself is where most of our API lives. At least the nifty parts.

I have one question we should resolve before merging. In wall_time, I use received - submitted, which is the actual roundtrip time for the Client. The only problem is that the received timestamp is not especially reliable, particularly in interactive use cases. This is because the timestamp is made when the result is pulled off of the queue in the Client, as a result of Client.spin(). So if the result of the computation arrives while the user is sitting idle at an interactive prompt, or just performing some client-side computation, that time is added to the wall_time measure.

This is a larger issue for AsyncHubResults (those fetched from the Hub), where the received stamp could technically be days after the computation finished.

Possible partial solutions to this problem:

HubResults should use completed-submitted, ignoring result-reply overhead
all AsyncResults should use completed-submitted (most consistent, but probably not most useful/interesting)
HubResults should just not have a wall_time property
Add an analogue to received to the Hub's DB, and use that in HubResults
Leave it as-is, and let users deal with wall_time rarely being useful in HubResults (it would still be accurate if the HubResult is requested and waited upon while the computation is pending)

I'm personally inclined towards either one of the last two, but I want there to be a convenient

There are a couple of other timings that might be useful to add:

last completed - first started (actual time spent working, excluding overhead - I don't know what its name should be)
last completed - first submitted (wall_time excluding reply overhead, which may not be interesting or meaningful)
last received - first started (excludes start overhead, which could have just been waiting for other jobs in the queue)
idle time (total time spent on each engine not during a computation) - this might be computed per-engine to draw one of those pipeline concurrency figures.

I don't know how many of these I should actually include. Perhaps I should just make a method that makes it easy to do any of the last/first X - first/last Y deltas.
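One way such a generic "last/first X - first/last Y" helper could look is sketched below. This is a standalone illustration, not the method added in this PR: the name `stamp_delta`, its signature, and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta


def stamp_delta(start, end, start_key=min, end_key=max):
    """Seconds between two sets of timestamps.

    Each argument may be a single datetime or a list of them; lists are
    reduced with start_key / end_key (first start and last end by default).
    """
    if not isinstance(start, datetime):
        start = start_key(start)
    if not isinstance(end, datetime):
        end = end_key(end)
    return (end - start).total_seconds()


t0 = datetime(2012, 4, 14, 12, 0, 0)
submitted = [t0, t0]
started = [t0 + timedelta(seconds=1), t0 + timedelta(seconds=2)]
completed = [t0 + timedelta(seconds=4), t0 + timedelta(seconds=6)]
# The second result sat in the queue while the client was idle.
received = [t0 + timedelta(seconds=5), t0 + timedelta(seconds=60)]

working = stamp_delta(started, completed)              # last completed - first started
roundtrip = stamp_delta(submitted, received)           # last received - first submitted
no_reply_overhead = stamp_delta(submitted, completed)  # last completed - first submitted
```

The gap between `roundtrip` and `no_reply_overhead` in this made-up data shows how a late `received` stamp can inflate the roundtrip measure.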


minrk · 2012-04-14T22:26:56Z

Okay, review addressed I think:

spaces removed from filename
new properties documented
some basic tests added
I went with adding received to the Hub's DB for resolving the wall_time issue for HubResults
I added AsyncResult.timedelta() for comparing a pair of timestamp sets, which is used in AR.wall_time.

Did I miss anything else?


minrk · 2012-04-15T04:04:56Z

Added Client.spin_thread(interval) / stop_spin_thread() for running spin in a background thread, to keep zmq queue clear.
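The general pattern of calling a function from a background thread at a fixed interval can be sketched as follows. This is a generic illustration and not the Client implementation: the `BackgroundSpinner` class and the stand-in `spin` callable are hypothetical.

```python
import threading
import time


class BackgroundSpinner:
    """Call `spin` every `interval` seconds in a daemon thread (hypothetical)."""

    def __init__(self, spin, interval=1.0):
        self._spin = spin
        self._interval = interval
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        if self._thread is not None:
            raise RuntimeError("spinner is already running")
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Event.wait returns False on timeout and True once stop() sets the
        # event, so the loop spins at each interval until asked to stop.
        while not self._stop.wait(self._interval):
            self._spin()

    def stop(self):
        if self._thread is not None:
            self._stop.set()
            self._thread.join()
            self._thread = None


calls = []
spinner = BackgroundSpinner(lambda: calls.append(time.time()), interval=0.01)
spinner.start()
time.sleep(0.2)
spinner.stop()
```

Using an `Event` rather than `time.sleep` inside the loop means `stop()` takes effect promptly instead of waiting out a full interval.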

fperez · 2012-04-15T04:53:54Z

Great, merging now. Awesome job, thanks!

@wesm

Add sugar methods/properties to AsyncResult that are generically useful:

* `ar.wall_time` = received - submitted
* `ar.serial_time` = sum of serial computation time
* `ar.elapsed` = time since submission (wall_time if done)
* `ar.progress` = (int) number of sub-tasks that have completed
* `len(ar)` = # of tasks
* `ar.wait_interactive()`: prints progress

These are simple methods derived from the metadata/timestamps already created. But I've been persuaded by @wesm's practice of including simple methods that do useful (and/or cool) things.

This also required/revealed some minor fixes/cleanup to clear_output in some cases:

* dedent base `core.displaypub.clear_output`, so it's actually defined in the class
* clear_output publishes `'\r\b'`, so it will clear terminal-like frontends that don't understand full clear_output behavior.
* `core.display.clear_output()` still works, even outside an IPython session.

Added a new notebook that shows how to use these new methods and how to do simple animations/progress bars using `clear_output()`.

Added `Client.spin_thread(interval)` / `stop_spin_thread()` for running spin in a background thread, to keep zmq queue clear. This can be used to ensure that timing information is as accurate as possible (at the cost of having a background thread active).


minrk mentioned this pull request Apr 9, 2012

clear_output improvements #1563

Merged

fperez reviewed Apr 14, 2012

add Animations and Progress example notebook

f5e3cc0

demos intermediate progress with:

* clear_output/print/flush
* AsyncResult.wait_interactive
* clear_output/display(Figure)
* HTML/JS progress bar
* PyMC ProgressBar class

minrk added 5 commits April 14, 2012 15:21

remove spaces in progress notebook filename

978ac5b

appease crappy tools that can't deal with reasonable filenames.

add 'received' timestamp to DB

bd8a8ec

allows 'wall_time' to make sense in cases other than simple waiting AsyncResult.

add AsyncResult.timedelta

d004201

for computing various comparisons of timestamps in AsyncResults

test new AsyncResult properties

66cc8f8

document new AsyncResult properties

89a00db

add Client.spin_thread()

472cd24

runs Client.spin() in a background thread at a set interval

fperez merged commit e0b4311 into ipython:master Apr 15, 2012

minrk mentioned this pull request May 8, 2012

Progress indicator in the notebook (and perhaps the Qt console) #788

Closed
