newparallel branch (add zmq.parallel submodule) #254

Merged
merged 137 commits into from Apr 8, 2011

3 participants

@minrk
IPython member
minrk commented Jan 28, 2011

This shouldn't be merged for a while. This is mainly to provide a venue for the review conversation.

For the client API: zmq.parallel.client,view,remotefunction,asyncresult,dependency

For the controller: zmq.parallel.controller,scheduler,heartmonitor

For the engine: zmq.parallel.streamkernel,engine

General: zmq.forward, zmq.tunnel, zmq.parallel.streamsession

@fperez
IPython member
fperez commented Feb 22, 2011

Just opening a comment so I'm cc'd on all future comments here (it would be nice if you could subscribe to a pull/issue without commenting, like the 'nosy' list that roundup has or the equivalent feature in launchpad).

We'll need to start working on this monster soon. It's big and scary, but also powerful :)

@ellisonbg
IPython member

Min, I just saw that you had moved the code over to IPython.parallel. Very nice! One question: there are lots of modules now in IPython.parallel. Do you think it would make sense to make subpackages like IPython.parallel.client|controller|engine to localize the code for the different parts? Obviously we might want common stuff in IPython.parallel, but that would help the rest of us navigate the code more easily.

@minrk
IPython member
minrk commented Apr 7, 2011

Sure, that would make sense.

Logical subpackages:

client (client, asyncresult, view, dependency, remotefunction)
controller (hub, controller, *db, scheduler, heartmonitor)
engine (engine, streamkernel)
apps (*app, launcher, winhpc)

top level (cluster_dir, factory, streamsession)

There is some cross-linking:

scheduler also imports from dependency

engine also imports from heartmonitor

Does that mean they should be top-level?
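
To make the cross-linking concrete, here is a rough sketch that hard-codes the grouping listed above and reports which imports cross a subpackage boundary. The wildcard entries (*db, *app) are left out, only the two imports mentioned above are included, and the helper names `owner` and `cross_links` are made up for this illustration.

```python
# Proposed grouping from the list above (wildcards such as *db and *app omitted).
SUBPACKAGES = {
    "client": ["client", "asyncresult", "view", "dependency", "remotefunction"],
    "controller": ["hub", "controller", "scheduler", "heartmonitor"],
    "engine": ["engine", "streamkernel"],
    "apps": ["launcher", "winhpc"],
    "top level": ["cluster_dir", "factory", "streamsession"],
}

# Only the two cross-links mentioned above: (importing module, imported module).
IMPORTS = [("scheduler", "dependency"), ("engine", "heartmonitor")]


def owner(module):
    """Return the subpackage a module is assigned to in SUBPACKAGES."""
    for package, modules in SUBPACKAGES.items():
        if module in modules:
            return package
    raise KeyError(module)


def cross_links(imports):
    """Return (importer package, imported package, module) for imports that cross packages."""
    links = []
    for importer, imported in imports:
        src, dst = owner(importer), owner(imported)
        if src != dst:
            links.append((src, dst, imported))
    return links


for src, dst, module in cross_links(IMPORTS):
    print(f"{src} imports {module} from {dst}")
```

Both listed imports cross a boundary under this grouping, which is what prompts the question of whether dependency and heartmonitor belong at the top level instead.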

@ellisonbg
IPython member
@minrk
IPython member
minrk commented Apr 7, 2011

I did some reorganization.

  • engine has engine-only code
  • client has all the client-only code
  • hub has all the hub-only code
  • scheduler has scheduler and dependency
  • apps has *app, controller, launcher, logwatcher, etc.

I can merge 'hub+scheduler' into 'controller', but it made more sense to me to do from ...scheduler import dependency than from ...controller import dependency.

I can definitely put the *app.py code in engine/controller/etc., if you think that sounds better. If I do, it would mean adding 'log' and 'cluster' subpackages, and merging 'hub' and 'scheduler' into 'controller'.

Browse here:

https://github.com/ipython/ipython/tree/newparallel/IPython/parallel

@ellisonbg
IPython member
@fperez
IPython member
fperez commented Apr 7, 2011
@minrk
IPython member
minrk commented Apr 7, 2011

I presume you mean 'scheduler is only two files', because 'hub' is five. Honestly, the only reason they are separate is the point I mentioned earlier: 'from controller import dependency' doesn't feel right to me, and 'from scheduler import dependency' does. That said, user code will only ever import from IPython.parallel at the top level, as that's where the API resides, so it's less important to have an intuitive import path.

i.e.:

from IPython.parallel import Dependency, require

which both live in IPython.parallel.scheduler.dependency.
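
As a minimal sketch of why the internal path matters less to users, the snippet below builds a throwaway package in memory and re-exports two names at its top level. Everything here is a hypothetical stand-in: `demo_parallel` is not the real IPython package, and the placeholder `Dependency` and `require` only mimic the names in the import above.

```python
import sys
import types

# Throwaway in-memory package tree: demo_parallel.scheduler.dependency
# (hypothetical stand-ins, not the real IPython modules).
pkg = types.ModuleType("demo_parallel")
sub = types.ModuleType("demo_parallel.scheduler")
mod = types.ModuleType("demo_parallel.scheduler.dependency")


class Dependency:
    """Placeholder for a class defined deep inside the package."""


def require(*names):
    """Placeholder decorator that records the names it was given."""
    def decorator(func):
        func.required = names
        return func
    return decorator


# The objects live in the nested module...
mod.Dependency = Dependency
mod.require = require
sub.dependency = mod
pkg.scheduler = sub

# ...and the top-level package re-exports them, as a package __init__ might.
pkg.Dependency = mod.Dependency
pkg.require = mod.require

# Register the modules so the import statement below can find them.
for module in (pkg, sub, mod):
    sys.modules[module.__name__] = module

# User code only touches the top-level name.
from demo_parallel import Dependency as TopDependency, require as top_require

# Same objects, whichever subpackage they happen to live in.
same_class = TopDependency is mod.Dependency
same_func = top_require is mod.require
```

If the nested module were later moved from scheduler to controller, only the re-export lines would change; the user-facing import would stay the same.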

The Hub and Scheduler really are entirely decoupled - there are no links between the code in parallel.hub and parallel.scheduler, besides expected TCP connections. The only coupling of the two appears in the controller startup script, that configures, launches, and connects them both. You could definitely start a Hub without any schedulers, and schedulers without a hub, but we just haven't written those startup scripts.

As for your last question, I can certainly imagine schedulers growing, but I don't think we have any direct plans to really do so.

What do you think? Merge Hub+Scheduler into Controller? That would leave engine/controller/client, which has a certain appeal, but while it is possible (and the norm) to use the Hub+Schedulers in the same way as the Controller was used before, they really are separate.

Currently, I think I am leaning towards merging them, I guess.

@fperez
IPython member
fperez commented Apr 7, 2011
@fperez
IPython member
fperez commented Apr 7, 2011

@ellisonbg, Min and I had lunch and talked quite a bit about this. At this point, my vote is to go ahead with the merge, once Min finishes a few small things he has in-flight locally. There are still obviously improvements to be made, but since we nuked twisted already, I don't see a benefit to holding this in a branch any longer, given that the main outline is in pretty good shape and we agree on the fundamentals. This will also give everybody a chance to start testing it and possibly helping us with some of the necessary improvements.

So from my side, this is done. Brian, let us know if you have anything else you want to go over before merge, otherwise Min should go ahead when he's ready.

@ellisonbg
IPython member
@fperez
IPython member
fperez commented Apr 8, 2011
minrk and others added some commits Oct 13, 2010
@minrk minrk control channel progress c369179
@minrk minrk prep newparallel for rebase
This mainly involves checking out files @ 568f2f4, to allow for cleaner application of changes after that point, where there are
no longer name conflicts.
fcee637
@minrk minrk use new stream.flush() 48fd5d1
@minrk minrk heartmonitor uses flush 69e9c2c
@minrk minrk view decorators for syncing history/results a329cbf
@minrk minrk whitespace 5cee18e
@minrk minrk fixed buffer serialization for buffers below threshold 7992112
@minrk minrk added dependency decorator 5297768
@minrk minrk dependency cleanup 48c8946
@minrk minrk added dependencies & Python scheduler 48fdd06
@minrk minrk added squash_unicode to prevent unicode pollution of json 44caf84
@minrk minrk added zmq controller/engine entry points 3ab94ec
@minrk minrk scheduler progress 7f42766
@minrk minrk added simple cluster entry point d405cd6
@minrk minrk use print_function 77b4226
@minrk minrk ipclusterz notices if controller fails to start 9715461
@minrk minrk major cleanup of client code + purge_request implemented 077c1af
@minrk minrk general parallel code cleanup b8dd49c
@minrk minrk codeutil into zmq, to prevent IPython.kernel import f2c5960
@minrk minrk removed unicode from error dict 595c3ac
@minrk minrk basic LoadBalancedView, RemoteFunction fe4b48c
@minrk minrk use wrap_exception in controller, fix clear on kernel 89abd30
@minrk minrk added zmq.parallel to setupbase 1da17dd
@minrk minrk added bound arg to RemoteFunction e198829
@minrk minrk quiet down scheduler printing, fix dep_id check in update_dependencies f1b3eb2
@minrk minrk added py4science demos as examples + NetworkX DAG dependencies b78cec8
@minrk minrk add all completed task IDs to Scheduler.all_done 168e61b
@minrk minrk reorganized a few files 6596e8a
@minrk minrk ignore docs/build,_build 88a0371
@minrk minrk util files into utils dir 8473145
@minrk minrk Parallel kernel/engine startup looks a bit more like pykernel f6e1461
@minrk minrk Controller won't raise on invalid identity prefixes. 1028b8d
@minrk minrk added basic tunneling with ssh or paramiko 9b5cebb
@minrk minrk added preliminary ssh tunneling support for clients 99528f8
@minrk minrk Moved parallel test files to parallel subpackages
and tweaks related to tests
0bea148
@minrk minrk mostly docstrings 9d70cce
@minrk minrk added exec_key and fixed client.shutdown 1160d9f
@fperez fperez Allow argv and namespace control to be passed to engines/controller. cde7916
@fperez fperez Fix small bug when no key given to client 5bb27f6
@minrk minrk add timestamps all messages; fix reply on wrong channel bug. 4eb1ed7
@minrk minrk Started DB backend with mongoDB support. 7289c9f
@minrk minrk Clients can now shutdown the controller. ffd81f5
@minrk minrk add to some client docstrings, fix some typos 4d5f89a
@minrk minrk str(etype) 65bfa60
@minrk minrk adapt kernel/error.py to zmq, improve error propagation. b0d94c7
@minrk minrk some docstring cleanup 689bb47
@minrk minrk ignore PyCrypto RandPool warning on paramiko import 8e2ebb2
@minrk minrk clone parallel docs to parallelz a4b0811
@minrk minrk add map/scatter/gather/ParallelFunction from kernel 15b7567
@minrk minrk split pendingresult and remotefunction into own files, add view.map. 44e6eb1
@minrk minrk PendingResult->AsyncResult; match multiprocessing.AsyncResult api cd84656
@minrk minrk tweaks related to docs + add activate() for magics 5172055
@minrk minrk initial draft of core zmq.parallel docs c98ee05
@minrk minrk protect LBView.targets, AsyncResult._msg_ids -> .msg_ids 7d08ffd
@minrk minrk match return shape in AsyncResult to sync results e9e0d81
@minrk minrk docs include 'apply' 03b00f7
@minrk minrk added preliminary tests for zmq.parallel 39eab52
@minrk minrk multitarget returns list instead of dict 6431092
@minrk minrk parallelz updates a49b646
@minrk minrk improved client.get_results() behavior 6f48516
@minrk minrk Controller renamed to Hub (keeping ipcontrollerz) b484660
@minrk minrk update connection/message docs for newparallel 10b1790
@minrk minrk add rich AsyncResult behavior 84a3424
@minrk minrk propagate iopub to clients 8554e33
@minrk minrk improved logging + Hub,Engine,Scheduler are Configurable 600411f
@fperez fperez Add little soma workflow example e2a7b43
@minrk minrk Refactor newparallel to use Config system
This is working, but incomplete.
2c04431
@minrk minrk adapt kernel's ipcluster and Launchers to newparallel 2d79d3e
@minrk minrk tweak dagdeps for new AsyncResult objects 9f1a03a
@minrk minrk Improvements to dependency handling
Specifically:
  * add 'success_only' switch to Dependencies
  * Scheduler handles some cases where Dependencies are impossible to meet.
d51586b
@minrk minrk updated newparallel examples, moved into docs 8d078bc
@minrk minrk rework logging connections c221efd
@minrk minrk add timeout for unmet dependencies in task scheduler 23d1906
@minrk minrk tasks on engines when they die fail instead of hang
This is only true in the Python scheduler, and
not for any ZMQ scheduler (MUX,control,pure)
4094d44
@minrk minrk untwist PBS, WinHPC Launchers in newparallel ba75686
@minrk minrk persist connection data to disk as json 68dde43
@minrk minrk add ipcluster engines; fix ssh process shutdown 65f8e24
@minrk minrk StreamSession better handles invalid/missing keys 7fb6c41
@minrk minrk update process/security docs in parallelz ca65815
@minrk minrk parallelz doc updates, metadata bug fixed. 0ef3375
@minrk minrk ssh tunneling utils into IPython.external.ssh 2dbc452
@minrk minrk client.run is now like %run, not TaskClient.run 5bcc966
@minrk minrk update parallel demos for newparallel afabc88
@minrk minrk newparallel tweaks, fixes
* warning on unregistered engine
* Python LRU scheduler is now default
* Client registration includes task scheme
* Fix typos associated with some renaming
* fix demo links
* warning typo
86f3ca8
@minrk minrk update with new client registration reply 487a2d0
@minrk minrk dependency tweaks + dependency/scheduler docs 75d9c51
@minrk minrk allow load-balancing across subsets of engines a514d13
@minrk minrk fix DictDB/MongoDB arguments disagreement d865e85
@minrk minrk support iterating through map results as they arrive 46d7c9d
@minrk minrk small bugs e5447b0
@minrk minrk copy default kernel configs for newparallel 9a84a98
@minrk minrk add default ip<x>z_config files d34f193
@minrk minrk resort imports in a cleaner order d72427c
@minrk minrk initial loglevel back to INFO 9d2c45b
@minrk minrk use new ip<x>z_config.py defaults b6802e2
@minrk minrk add scripts for non-setuptools install of zmq.parallel 02c97fe
@minrk minrk API update involving map and load-balancing 498a93f
@minrk minrk Client -> HasTraits, update examples with API tweaks 154798b
@minrk minrk some initial tests for newparallel e0abb03
@minrk minrk fix/test pushed function globals 11c0dbe
@minrk minrk split get_results into get_result/result_status, add AsyncHubResult 7724ff7
@minrk minrk add zmq checking in iptest 60c800a
@minrk minrk testing fixes 4f574e1
@minrk minrk eliminate relative imports a0bfb1f
@minrk minrk add Reference object 7c24f2e
@minrk minrk cleanup pass f36800d
@minrk minrk launcher updates for PBS 8a8cbf5
@minrk minrk Add SQLite backend, DB backends are Configurable
also fix small numpy typo in newserialized
97ce9f8
@minrk minrk adjustments to PBS/SGE, SSH Launchers + docs update b18ddd8
@minrk minrk pickle length-0 arrays. 33bd9f2
@minrk minrk update mpi doc 194772e
@minrk minrk fix small client bugs + tests 5e3faa6
@minrk minrk more graceful handling of dying engines cfbd77b
@minrk minrk add sqlitedb backend 681a891
@minrk minrk add inter-engine communication example 864a845
@minrk minrk add message tracking to client, add/improve tests 62f8971
@minrk minrk reflect revised apply_bound pattern e5c3761
@minrk minrk Add wave2D example 1f49077
@minrk minrk remove all PAIR sockets, Merge registration+query 6115ed9
@minrk minrk update connections and diagrams for reduced sockets 8fb951e
@minrk minrk Update PBS/SGE launchers with 0.10.1 options and defaults 4286001
@minrk minrk copyright statements 35ad828
@minrk minrk pyzmq-2.1.3 related testing adjustments ee9089a
@minrk minrk wave2d example using single view, instead of repeated 'rc[:]' 540ed02
@minrk minrk Doc tweaks and updates 11592a7
@minrk minrk update API after sagedays29
tests, docs updated to match

* Client no longer has high-level methods (only in Views)
* module functions can be pushed
* clients can have a connection timeout
* dependencies have separate switches for success/failure, not just success_only
* add `with view.temp_flags(**flags):` for temporary flags

Also updated some docs and examples
e90463b
@minrk minrk add DirectView.importer contextmanager, demote targets to mutable flag
* @require now also takes modules, and will import
* IPython.zmq.parallel is the new entrypoint, not client
b5b9a12
@minrk minrk move IPython.zmq.parallel to IPython.parallel a6a0636
@minrk minrk add shutdown to Views 037d01b
@minrk minrk SGE test related fixes
* allow iopub messages to arrive first on Hub
* SQLite no longer commits immediately
* parallelwave example
* typos in launcher.py
9634efd
@minrk minrk add missing external.ssh to setupbase.py 4d0058f
@minrk minrk updates to docs and examples 45e272d
@minrk minrk move old parallel figures into newparallel dir 24641d1
@minrk minrk rebase IPython.parallel after removal of IPython.kernel
This commit removes all '*z' suffixes from scripts and docs,
as there is no longer conflict with IPython.kernel.
e950e62
@minrk minrk organize IPython.parallel into subpackages b9f5480
@minrk minrk add pymongo to iptest exclusions 8c88c90
@minrk minrk remove kernel examples already ported to newparallel 1a971fc
@minrk minrk merged commit 1a971fc into master Apr 8, 2011
@dwrensha dwrensha pushed a commit that referenced this pull request Jun 25, 2014
@andrewrk andrewrk improved streaming playback reliability
closes #254
closes #247
d33b6f9