Skip to content
This repository

Update parallel magics #1768

Merged
merged 23 commits into from almost 2 years ago

3 participants

Min RK Brian E. Granger Fernando Perez
Min RK
Owner
minrk commented May 27, 2012

I've updated the parallel magics, so they display more than just stdout. They now display all output, so you can do parallel plotting, and everything:

px plot

I need to do a pass on the docs to reflect this change and update a couple of tests before it gets in, but it's pretty much there already.

A largely aesthetic question:

The output is displayed in-order:

  1. engine stdout printed to stdout with '[stdout:eid]' prefix
  2. engine stderr printed to stderr with '[stderr:eid]' prefix
  3. displaypub outputs per-engine
  4. pyout for each engine

Alternatives:

  1. go through the whole shebang for each engine (so stdout/err/output/pyout are contiguous per engine).
  2. collate displaypub, so if there is a sequence of plots from each engine, the plots would go together.

Anyone have a preference on these, or still other options?

An example function that publishes on all channels:

In [9]: def test_outputs():
   ...:     print >> sys.stderr, "err"
   ...:     print "stdout"
   ...:     display(5)
   ...:     print >> sys.stderr, "err again"
   ...:     display('hello displaypub')
   ...:     print "out again"
   ...:     return "return value"

In [10]: dv['test_outputs'] = test_outputs

In [11]: %px test_outputs()
Parallel execution on engine(s): [0, 1]
[stdout: 0] stdout
out again
[stdout: 1] stdout
out again
[stderr: 0] err
err again
[stderr: 1] err
err again
5
'hello displaypub'
5
'hello displaypub'
[ 0] Out[15]: 'return value'
[ 1] Out[15]: 'return value'

In [12]:
added some commits May 27, 2012
Min RK cleanup parallel magics
* remove unused run_code override
* use formatting to build composite docstring
* update docstring
* remove `__init__` that had been reduced to a no-op
* raise UsageError on no active view
e561ea5
Min RK displaypub b64 encoding is non-destructive
`_encode_binary` helper fund encoded in-place, which is destructive
if the dict is ever re-used.

also prevents double-encoding, by only encoding bytes objects, since unicode objects are necessarily already encoded.
79ea1ea
Min RK add colored repr_pretty to ExecuteReply 3b53cb7
Min RK restore %result functionality
It now does what it once did in the days of Twisted long ago:
display the output of the most recent command (e.g. after async %px).
0b80d26
Min RK skip get_dict call in parallel magics 98b4db9
Brian E. Granger
Owner

I have similar logic in my DirectViewWidget that republishes output that comes back from the engines. Here is how I think we should handle this.

  • All of the output republishing logic should be put into a method of the DirectView or AyncResult class (not sure which is better).
  • That method should have a flag that can be use to control if the output is done a) by engine or b) by output type.
  • The paralllel magics can call that method to trigger the output republishing.
  • My DirectViewWidget can also call that method to get the same republishing.

Also, I think we want a cell magic that works like a stateless %autopx. Should that go into this PR? What should we call that new magic?

Brian E. Granger
Owner

Oh, I should also mention that the various magics and direct view widget can expose the flag the determines the order of the output.

Fernando Perez
Owner
fperez commented May 27, 2012

I agree with @ellisonbg that the collate/don't collate behavior needs to have a flag: this is unfortunately one of those things where the answer can legitimately be either, depending on what the user is trying to see. I can easily imagine cases where I'd want either, so making it easy to flip (and having a way for users set a default, if their use cases tend in one direction consistently) is probably the answer here.

Fernando Perez
Owner
fperez commented May 27, 2012

And btw, awesome job here. Once all of this lands in, it's going to be seriously impressive!

Min RK
Owner
minrk commented May 27, 2012

Anything that deals with results like 'republish' definitely belongs on AsyncResult, and not the View. I'll move the contents of _display_result to AysncResult.republish_ouput. That will have to be a tiny bit more careful, as it does not guarantee access to an IPython instance.

Min RK
Owner
minrk commented May 27, 2012

As for collation, I see three basic cases:

  1. group all outputs of each engine
  2. group each class of output (stdout/pyout/etc.) together across engines
  3. group each individual output across engines (collate the inner outputs list, which need not match in length)

Currently I am dong 2 (which is the easiest and most logical, to me), and I doubt we want to support all 3, so what should the flag mean?

Fernando Perez
Owner
fperez commented May 27, 2012

I think not doing 3 is sensible, and how about --group={engine/kind} to flag 1 and 2 respectively? Or --group-outputs, with -o as a shorthand alternative?

Min RK
Owner
minrk commented May 27, 2012

Interesting - I was expecting #3 to be the most likely variation on 2. I think colate=True|False is the best UI, but I suppose unless we are supporting only 1 and 3 or 2 and 3, we can't do that.

What I have currently is AsyncResult.display_outputs(group='type'), where group = engine | type | collate. for cases 1,2,3 respectively. How does that sound?

Brian E. Granger
Owner
Fernando Perez
Owner
fperez commented May 27, 2012
Min RK
Owner
minrk commented May 27, 2012

Okay, I've moved the functionality into AsyncResult.display_outputs, and the three choices for groupby are called: engine, type, order.

I haven't added options to the magics, and I'm not sure I want to, because we are still so bad at parsing magic arguments when the tail of the line should be Python code. For instance, the %result magic, ~100% of the times an arg is specified, it is a negative integer. Do we have parsing in place, where the following would work:

%result --group=engine -2

?

I don't think the value is worth the pain of parsing magic arguments.

As for the %%px magic, what do we want from it? I can imagine wanting to specify targets, but the parallel magics are currently associated with a View, not a Client. Do we want to change that? If we just want existing %px but for a cell, and exposing the above discussed flags (and at least block), that's obvious quite easy.

I would also greatly appreciate any suggestions on output presentation, because while it does contain everything, I don't really like how any of it looks other than stdout/err.

Min RK
Owner
minrk commented May 27, 2012

Also, has anyone actually used long options with magics before? Because the signature for longopts in magic.py is simply wrong.

Fernando Perez
Owner
fperez commented May 27, 2012

I don't think so. In general, we have a bit of a mishmash of approaches to options handling in the various magics.

Now that they have clean, standalone OO setup, it wouldn't be a bad idea to do some weed cleaning in there, and coming up with good canonical practices for all new magics. I couldn't go that far in the big PR due to time constraints...

Min RK
Owner
minrk commented May 27, 2012

I added a super simple %%px cell magic (accepts some group and blocking args), and a bug in Magic.parse_options is fixed and tested.

Fernando Perez
Owner
fperez commented May 28, 2012

Should we think about using perhaps more descriptive names here? Back when all of this was purely for terminal use, terseness was the overriding design criteria, because we were trying to minimize typing above all else. Now in the notebook, I think the balance shifts more towards long-term readability. Thoughts?

Fernando Perez
Owner
fperez commented May 28, 2012

Test results for commit a688afc merged into master
Platform: linux2

Not available for testing:

Fernando Perez
Owner
fperez commented May 28, 2012

I got the same failures in 2.x and 3.x, they look legit. Did you forget to push something perhaps?

Min RK
Owner
minrk commented May 28, 2012

The failures were just a couple of typos, and should be fixed. I have some tests to add before this is done, so I haven't been running the suite too rigorously yet.

As for verbosity - I would be very sad to lose %px a=5, which benefits hugely from its brevity, and the notebook accounts for a tiny fraction of my own usage of it (I think all of the existing parallel line magics are of negligible value in the notebook). The same cannot be said for use as a cell magic, so I think exposing the cell magic as %%parallel or something similarly verbose makes plenty of sense.

Fernando Perez
Owner
fperez commented May 28, 2012

Right, I was thinking mostly of cell magics, that will really see the bulk of their usage in the notebook. While we made it possible to use cell magics in the terminal, it's just like any other multiline construct: ok for a handful of lines, not what you want to use for extended coding.

Min RK
Owner
minrk commented May 28, 2012

Sounds good - I will split %px and %%parallel, mainly so they aren't sharing a docstring.

Fernando Perez
Owner
fperez commented May 28, 2012

Great! I'll play with it some tomorrow then.

Min RK
Owner
minrk commented May 28, 2012

okay, magics split and example notebook added.

Brian E. Granger
Owner
Min RK
Owner
minrk commented May 28, 2012

I think I would rather have a single name (%px and %%px) magic as they really do the same thing.

That makes sense, though I can see them potentially diverging, as more options are added to the cell magic, perhaps to select targets, etc. One thing I do not want to do is support flags/options on the %px line magic. Far too many things go wrong when we parse flags prior to Python code on the same line.

Brian E. Granger
Owner

I am fine with the flags/options diverging. But conceptually, at the core, the line and call versions do that same exact thing. I think this same issue will come up with other magics, such as %timeit. Even there I could imagine the cell and line versions having different options in the log run. But I really like keeping the same name for line and cell versions that do the same thing at the core.

Min RK
Owner
minrk commented May 28, 2012

That makes sense, I'll put it back as %%px, unless @fperez has a reason to object.

Fernando Perez
Owner
fperez commented May 28, 2012

No, not really. I don't think it's great for readability in production code, but we can leave it be for now and reconsider later with more experience.

Brian E. Granger
Owner

I am not actually attached to the name px, I just think the line and cell names should be the same/

Fernando Perez
Owner
fperez commented May 28, 2012

Since I've shifted most of my production usage to the notebook I've become more partial to longer, more descriptive names. But I'm not really too set on anything one way or another, so I'm happy leaving the decision up to Min, since he worked on the implementation. We can always revisit with more experience.

Min RK
Owner
minrk commented May 28, 2012

Okay, so then the choice is which should have priority:

  1. notebook readability for the cell magic (%parallel)
  2. line succinctness for the line magic (%px)
  3. historical precedence (people are not happy when we break backward compatibility because we didn't like the name).

I think %parallel is a bad name for the line magic, and %px is not a good name for the cell magic, so I would lean towards %px if they are to be kept the same.

Fernando Perez
Owner
fperez commented May 28, 2012

I'm ok with leaving it px for now, so we don't break people's habits... We may later come up with other things to do in a yet-to-be-writte parallel magic.

Brian E. Granger
Owner
Min RK
Owner
minrk commented May 28, 2012

Okay, then I'll just add some more tests and scan the docs tonight, then it should be ready.

If anyone wants to play with the output presentation and make suggestions, that would also be much appreciated.

Brian E. Granger
Owner
Fernando Perez
Owner
fperez commented May 28, 2012
Min RK
Owner
minrk commented May 28, 2012

a few tests added for the new stuff, and parallel magic section of the docs has been updated.

I'm still hoping for an aesthetic improvement on the output display, but I don't have any great ideas, and I don't want to block Brian's progress on that, so as far as I'm concerned, this is good to go.

Min RK skip ugly %2i formatting
I decided that misalignment when digits don't match is less ugly than the asymmetrical space caused by '[%2i]' with single-digit numbers, which are most common.
8da04fb
Min RK
Owner
minrk commented May 28, 2012

Test results for commit 53860c4 merged into master
Platform: darwin

  • python2.6: Failed, log at https://gist.github.com/2822753 (libraries not available: matplotlib pymongo qt tornado wx wx.aui)
  • python2.7: OK (libraries not available: wx wx.aui)
  • python3.2: OK (libraries not available: matplotlib pymongo qt wx wx.aui)

Not available for testing:

Min RK
Owner
minrk commented May 28, 2012

The only 2.6 failures above are the assert_ins discussed elsewhere.

Fernando Perez
Owner
fperez commented May 28, 2012

This looks solid, great job! I'm merging it now, so that we can move forward quickly with Brian's branch and stabilize this massive amount of new stuff. Thanks!

Fernando Perez fperez merged commit a136082 into from May 28, 2012
Fernando Perez fperez closed this May 28, 2012
Min RK minrk deleted the branch March 31, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Showing 23 unique commits by 1 author.

May 27, 2012
Min RK cleanup parallel magics
* remove unused run_code override
* use formatting to build composite docstring
* update docstring
* remove `__init__` that had been reduced to a no-op
* raise UsageError on no active view
e561ea5
Min RK displaypub b64 encoding is non-destructive
`_encode_binary` helper fund encoded in-place, which is destructive
if the dict is ever re-used.

also prevents double-encoding, by only encoding bytes objects, since unicode objects are necessarily already encoded.
79ea1ea
Min RK add colored repr_pretty to ExecuteReply 3b53cb7
Min RK restore %result functionality
It now does what it once did in the days of Twisted long ago:
display the output of the most recent command (e.g. after async %px).
0b80d26
Min RK skip get_dict call in parallel magics 98b4db9
Min RK update %result test to reflect new behavior e03ba8e
Min RK add engine_info to execute errors d72b23c
May 28, 2012
Min RK add AsyncResult.display_outputs
for republishing stdout/err/displaypub/pyout.
fe15831
Min RK test Magic.parse_options with long options f0c2b9e
Min RK add `%%px` cell magic 8003883
Min RK add %%px options to docstring c7afc3e
Min RK update %px tests with recent changes 58d39e0
Min RK split `%px` and `%%parallel` line/cell magics 8778c61
Min RK parallel cell magic is now `%%px` 091e1b5
Min RK return AsyncResult on nonblocking %px f31e7a0
Min RK during `%autopx` call display_outputs inside builtin_trap
it detects IPython with get_ipython NameErrors
527525b
Min RK add parallel magics notebook 30a685a
Min RK move some general parallel test utilities to clienttest 0ae639e
Min RK split parallel magics tests into own file
and add several more tests for new behavior
963e6af
Min RK add missing '>> sys.stderr' for one display_outputs case ab9572e
Min RK update parallel magics doc section 03a76f7
Min RK fix a print statement for py3 in %px test aa84e64
Min RK skip ugly %2i formatting
I decided that misalignment when digits don't match is less ugly than the asymmetrical space caused by '[%2i]' with single-digit numbers, which are most common.
8da04fb
Something went wrong with that request. Please try again.