parallelz doc updates, metadata bug fixed.

commit 0ef33756e30cd0c34f13aaf6d7020f09b54f567c 1 parent ca65815
@minrk minrk authored
3  IPython/zmq/parallel/asyncresult.py
@@ -30,6 +30,9 @@ class AsyncResult(object):
     Provides the same interface as :py:class:`multiprocessing.AsyncResult`.
     """
+
+    msg_ids = None
+
     def __init__(self, client, msg_ids, fname=''):
         self._client = client
         self.msg_ids = msg_ids
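The diff does not say why `msg_ids = None` was added at class level. One effect in plain Python is sketched below with a hypothetical `Demo` class that is not part of IPython: a class-level default lets the attribute resolve through normal lookup even before `__init__` assigns it, so a `__getattr__` fallback is never consulted for that name. Whether that is the motivation here is an assumption.

```python
class Demo(object):
    # Class-level default: found by normal attribute lookup on any instance.
    msg_ids = None

    def __init__(self, msg_ids=None):
        if msg_ids is not None:
            # The instance attribute shadows the class-level default.
            self.msg_ids = msg_ids

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails.
        raise AttributeError("fallback reached for %r" % name)


# Bypass __init__ to mimic an object whose constructor has not run yet.
bare = Demo.__new__(Demo)
default_value = bare.msg_ids      # resolved from the class, not __getattr__

full = Demo(['a', 'b'])
assigned_value = full.msg_ids     # resolved from the instance

try:
    bare.something_else
    fallback_reached = False
except AttributeError:
    fallback_reached = True
```

Names without a class-level default, such as `something_else` above, still go through `__getattr__`.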
4 IPython/zmq/parallel/client.py
@@ -455,8 +455,8 @@ def _extract_metadata(self, header, parent, content):
         md = {'msg_id' : parent['msg_id'],
               'received' : datetime.now(),
               'engine_uuid' : header.get('engine', None),
-              'follow' : parent['follow'],
-              'after' : parent['after'],
+              'follow' : parent.get('follow', []),
+              'after' : parent.get('after', []),
               'status' : content['status'],
               }
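The `client.py` hunk swaps direct indexing for `dict.get` with a default. A minimal sketch of the difference in plain Python follows. The `parent` dictionary and both helper functions are hypothetical, since the real message headers are not shown in this diff.

```python
# Hypothetical parent header with no 'follow' or 'after' keys;
# the real header layout is not shown in this diff.
parent = {'msg_id': 'abc123'}


def extract_strict(parent):
    # Old form: direct indexing raises KeyError when a key is absent.
    return {'follow': parent['follow'], 'after': parent['after']}


def extract_lenient(parent):
    # New form: a missing key falls back to an empty list.
    return {'follow': parent.get('follow', []),
            'after': parent.get('after', [])}


try:
    extract_strict(parent)
    strict_failed = False
except KeyError:
    strict_failed = True

lenient = extract_lenient(parent)
print(strict_failed, lenient)
```

With the lenient form, a message submitted without dependency information still yields a complete metadata dictionary.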
2  IPython/zmq/parallel/ipclusterapp.py
@@ -263,7 +263,7 @@ def create_default_config(self):
         self.default_config.Global.engine_launcher = \
             'IPython.zmq.parallel.launcher.LocalEngineSetLauncher'
         self.default_config.Global.n = 2
-        self.default_config.Global.delay = 1
+        self.default_config.Global.delay = 2
         self.default_config.Global.reset_config = False
         self.default_config.Global.clean_logs = True
         self.default_config.Global.signal = signal.SIGINT
75 docs/source/parallelz/parallel_intro.txt
@@ -53,7 +53,8 @@ Architecture overview
The IPython architecture consists of four components:
* The IPython engine.
-* The IPython controller.
+* The IPython hub.
+* The IPython schedulers.
* The controller client.
These components live in the :mod:`IPython.zmq.parallel` package and are
@@ -79,9 +80,9 @@ to the user.
IPython controller
------------------
-The IPython controller provides an interface for working with a set of engines. At a
-general level, the controller is a collection of processes to which IPython engines and
-clients can connect. The controller is composed of a :class:`Hub` and a collection of
+The IPython controller processes provide an interface for working with a set of engines.
+At a general level, the controller is a collection of processes to which IPython engines
+and clients can connect. The controller is composed of a :class:`Hub` and a collection of
:class:`Schedulers`. These Schedulers typically run in separate processes on the
same machine as the Hub, but they can run anywhere, from local threads to remote machines.
@@ -92,8 +93,8 @@ the client's :meth:`.Client.apply` method, with various arguments, or
constructing :class:`.View` objects to represent subsets of engines. The two
primary models for interacting with engines are:
-* A MUX interface, where engines are addressed explicitly.
-* A Task interface, where the Scheduler is trusted with assigning work to
+* A **Direct** interface, where engines are addressed explicitly.
+* A **LoadBalanced** interface, where the Scheduler is trusted with assigning work to
appropriate engines.
Advanced users can readily extend the View models to enable other
@@ -108,7 +109,7 @@ styles of parallelism.
The Hub
*******
-The center of an IPython cluster is the Controller Hub. This is the process that keeps
+The center of an IPython cluster is the Hub. This is the process that keeps
track of engine connections, schedulers, clients, as well as all task requests and
results. The primary role of the Hub is to facilitate queries of the cluster state, and
minimize the necessary information required to establish the many connections involved in
@@ -138,16 +139,42 @@ Security
IPython uses ZeroMQ for networking, which has provided many advantages, but
one of the setbacks is its utter lack of security [ZeroMQ]_. By default, no IPython
-connections are secured, but open ports only listen on localhost. The only
+connections are encrypted, but open ports only listen on localhost. The only
source of security for IPython is via ssh-tunnel. IPython supports both shell
-(`openssh`) and `paramiko` based tunnels for connections.
+(`openssh`) and `paramiko` based tunnels for connections. There is a key necessary
+to submit requests, but due to the lack of encryption, it does not provide
+significant security if loopback traffic is compromised.
In our architecture, the controller is the only process that listens on
network ports, and is thus the main point of vulnerability. The standard model
for secure connections is to designate that the controller listen on
-localhost, and use ssh-tunnels on the same machine to connect clients and/or
+localhost, and use ssh-tunnels to connect clients and/or
engines.
+To connect and authenticate to the controller an engine or client needs
+some information that the controller has stored in a JSON file.
+Thus, the JSON files need to be copied to a location where
+the clients and engines can find them. Typically, this is the
+:file:`~/.ipython/clusterz_default/security` directory on the host where the
+client/engine is running (which could be a different host than the controller).
+Once the JSON files are copied over, everything should work fine.
+
+Currently, there are two JSON files that the controller creates:
+
+ipcontroller-engine.json
+ This JSON file has the information necessary for an engine to connect
+ to a controller.
+
+ipcontroller-client.json
+ The client's connection information. This may not differ from the engine's,
+ but since the controller may listen on different ports for clients and
+ engines, it is stored separately.
+
+More details of how these JSON files are used are given below.
+
+A detailed description of the security model and its implementation in IPython
+can be found :ref:`here <parallelsecurity>`.
+
.. warning::
Even at its most secure, the Controller listens on ports on localhost, and
@@ -157,9 +184,6 @@ engines.
Controller is insecure. There is no way around this with ZeroMQ.
-.. TODO: edit parallelsecurity
-A detailed description of the security model and its implementation in IPython
-can be found :ref:`here <parallelsecurity>`.
Getting Started
===============
@@ -170,7 +194,7 @@ simply start a controller and engines on a single host using the
:command:`ipclusterz` command. To start a controller and 4 engines on your
localhost, just do::
- $ ipclusterz -n 4
+ $ ipclusterz start -n 4
More details about starting the IPython controller and engines can be found
:ref:`here <parallel_process>`
@@ -189,12 +213,21 @@ everything is working correctly, try the following commands:
Out[4]: set([0, 1, 2, 3])
In [5]: c.apply(lambda : "Hello, World", targets='all', block=True)
- Out[5]: {0: 'Hello, World', 1: 'Hello, World', 2: 'Hello, World', 3:
- 'Hello, World'}
+ Out[5]: [ 'Hello, World', 'Hello, World', 'Hello, World', 'Hello, World' ]
+
+
+When a client is created with no arguments, the client tries to find the corresponding
+JSON file in the local `~/.ipython/clusterz_default/security` directory. If it finds it,
+you are set. If you have put the JSON file in a different location or it has a different
+name, create the client like this:
+
+.. sourcecode:: ipython
+
+ In [2]: c = client.Client('/path/to/my/ipcontroller-client.json')
-Remember, a client needs to be able to see the Hub. So if they
-are on a different machine, and you have ssh access to that machine,
-then you would connect to it with::
+Remember, a client needs to be able to see the Hub's ports to connect. If the Hub is on a
+different machine, you may need an ssh server to tunnel access to that machine, in which
+case you would connect with:
.. sourcecode:: ipython
@@ -203,8 +236,8 @@ then you would connect to it with::
Where 'myhub.example.com' is the url or IP address of the machine on
which the Hub process is running.
-You are now ready to learn more about the :ref:`MUX
-<parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the
+You are now ready to learn more about the :ref:`Direct
+<parallelmultiengine>` and :ref:`LoadBalanced <paralleltask>` interfaces to the
controller.
.. [ZeroMQ] ZeroMQ. http://www.zeromq.org
32 docs/source/parallelz/parallel_multiengine.txt
@@ -1,10 +1,10 @@
.. _parallelmultiengine:
-===============================
-IPython's multiengine interface
-===============================
+==========================
+IPython's Direct interface
+==========================
-The multiengine interface represents one possible way of working with a set of
+The direct, or multiengine, interface represents one possible way of working with a set of
IPython engines. The basic idea behind the multiengine interface is that the
capabilities of each engine are directly and explicitly exposed to the user.
Thus, in the multiengine interface, each engine is given an id that is used to
@@ -19,7 +19,7 @@ To follow along with this tutorial, you will need to start the IPython
controller and four IPython engines. The simplest way of doing this is to use
the :command:`ipclusterz` command::
- $ ipclusterz -n 4
+ $ ipclusterz start -n 4
For more detailed information about starting the controller and engines, see
our :ref:`introduction <ip1par>` to using IPython for parallel computing.
@@ -36,16 +36,17 @@ module and then create a :class:`.Client` instance:
In [2]: rc = client.Client()
-This form assumes that the controller was started on localhost with default
-configuration. If not, the location of the controller must be given as an
-argument to the constructor:
+This form assumes that the default connection information (stored in
+:file:`ipcontroller-client.json` found in `~/.ipython/clusterz_default/security`) is
+accurate. If the controller was started on a remote machine, you must copy that connection
+file to the client machine, or enter its contents as arguments to the Client constructor:
.. sourcecode:: ipython
- # for a visible LAN controller listening on an external port:
- In [2]: rc = client.Client('tcp://192.168.1.16:10101')
- # for a remote controller at my.server.com listening on localhost:
- In [3]: rc = client.Client(sshserver='my.server.com')
+ # If you have copied the json connector file from the controller:
+ In [2]: rc = client.Client('/path/to/ipcontroller-client.json')
+ # for a remote controller at 10.0.1.5, visible from my.server.com:
+ In [3]: rc = client.Client('tcp://10.0.1.5:12345', sshserver='my.server.com')
To make sure there are engines connected to the controller, you can get a list
@@ -63,8 +64,8 @@ Quick and easy parallelism
In many cases, you simply want to apply a Python function to a sequence of
objects, but *in parallel*. The client interface provides a simple way
-of accomplishing this: useing the builtin :func:`map` and the ``@remote``
-function decorator.
+of accomplishing this: using the builtin :func:`map` and the ``@remote``
+function decorator, or the client's :meth:`map` method.
Parallel map
------------
@@ -72,7 +73,7 @@ Parallel map
Python's builtin :func:`map` function allows a function to be applied to a
sequence element-by-element. This type of code is typically trivial to
parallelize. In fact, since IPython's interface is all about functions anyway,
-you can just use the builtin :func:`map`, or a client's :map: method:
+you can just use the builtin :func:`map`, or a client's :meth:`map` method:
.. sourcecode:: ipython
@@ -179,7 +180,6 @@ blocks until the engines are done executing the command:
In [2]: rc.block=True
In [3]: dview = rc[:] # A DirectView of all engines
In [4]: dview['a'] = 5
-
In [5]: dview['b'] = 10
2  docs/source/parallelz/parallel_task.txt
@@ -24,7 +24,7 @@ To follow along with this tutorial, you will need to start the IPython
controller and four IPython engines. The simplest way of doing this is to use
the :command:`ipclusterz` command::
- $ ipclusterz -n 4
+ $ ipclusterz start -n 4
For more detailed information about starting the controller and engines, see
our :ref:`introduction <ip1par>` to using IPython for parallel computing.