In [None]:
%%html
<style type="text/css">
.rendered_html .float-diagram {
  height: 400px;
  margin-top: -4em;
  float: right;
}
</style>

# Overview and getting started

<img src="files/figs/wideView.png" class="float-diagram"/>

The Jupyter protocol provides a mechanism for doing *remote* execution.
IPython Parallel extends the same protocol to *parallel* remote execution for the IPython kernel.

## Architecture overview


The IPython Parallel architecture consists of four components:

-   The IPython engine
-   The IPython hub
-   The IPython schedulers
-   The cluster client

These components live in the `ipyparallel` package.

### IPython engine

The IPython engine is a Python instance that accepts Python commands over
a network connection. When multiple engines are started, parallel
and distributed computing become possible. An important property of an
IPython engine is that it blocks while user code is being executed. Read
on for how the IPython controller works around this to expose a clean
asynchronous API to the user.
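
The blocking behaviour described above can be illustrated with a small analogy. This sketch does not use `ipyparallel`; it assumes the standard-library `concurrent.futures` module as a stand-in, with a single worker thread playing the role of one engine. The names `user_code`, `release`, and `engine` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Event used to hold the task open, so the example does not depend on timing.
release = threading.Event()


def user_code(x):
    """Stand-in for long-running user code: busy until released."""
    release.wait()
    return x * x


# One worker thread plays the role of a single (blocking) engine.
engine = ThreadPoolExecutor(max_workers=1)

# Submitting returns a future immediately, even though the worker is busy.
future = engine.submit(user_code, 7)
pending_before_release = not future.done()

release.set()             # let the "engine" finish its task
result = future.result()  # block only at the point the value is needed
engine.shutdown()
```

The worker stays busy for as long as `user_code` runs, yet `submit` hands back a future right away. That is roughly the shape of an asynchronous API layered on top of a blocking worker.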

### IPython controller

The IPython controller processes provide an interface for working with a
set of engines. At a general level, the controller is a collection of
processes to which IPython engines and clients can connect. The
controller is composed of a `Hub` and a collection of
`Schedulers`, which may run in separate processes or in threads.

The controller provides a single point of contact for users who
wish to use the engines in the cluster. There are several
ways of working with a controller, but all of them
are implemented via the `View.apply` method, after
constructing `View` objects to represent different collections of engines.
The two primary models for interacting with engines are:

-   A **Direct** interface, where engines are addressed explicitly.
-   A **LoadBalanced** interface, where the Scheduler is trusted with
    assigning work to appropriate engines.

Advanced users can readily extend the View models to enable other styles
of parallelism.
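
To make the difference between the two models concrete, here is a toy model in plain Python. It is not how `ipyparallel` is implemented: the classes `ToyEngine`, `ToyDirectView`, and `ToyLoadBalancedView` are hypothetical, and the "pick the least-loaded engine" rule is an assumption chosen for illustration.

```python
class ToyEngine:
    """Hypothetical stand-in for an engine: runs a call and counts its work."""

    def __init__(self, engine_id):
        self.id = engine_id
        self.load = 0

    def run(self, f, *args, **kwargs):
        self.load += 1
        return f(*args, **kwargs)


class ToyDirectView:
    """Direct model: the caller names the engines and each one runs the call."""

    def __init__(self, engines):
        self.engines = list(engines)

    def apply(self, f, *args, **kwargs):
        return [e.run(f, *args, **kwargs) for e in self.engines]


class ToyLoadBalancedView:
    """Load-balanced model: a scheduler picks one engine per call."""

    def __init__(self, engines):
        self.engines = list(engines)

    def apply(self, f, *args, **kwargs):
        # Assumed policy for this sketch: choose the engine with the least work.
        target = min(self.engines, key=lambda e: e.load)
        return target.run(f, *args, **kwargs)


def mul(a, b):
    return a * b


engines = [ToyEngine(i) for i in range(4)]
direct_results = ToyDirectView(engines).apply(mul, 5, 6)         # one result per engine
balanced_result = ToyLoadBalancedView(engines).apply(mul, 5, 6)  # a single result
loads = [e.load for e in engines]
```

In this sketch the direct call produces a list with one entry per addressed engine, while the load-balanced call produces a single value from whichever engine the scheduler chose.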

### IPython client and views

There is one primary object, the `Client`, for
connecting to a cluster. For each execution model, there is a
corresponding `View`. These views allow users to
interact with a set of engines through that model's interface. Here are the two
default views:

-   The `DirectView` class for explicit addressing.
-   The `LoadBalancedView` class for destination-agnostic
    scheduling.

## Getting Started

### Starting the IPython controller and engines

To follow along with this tutorial, you will need to start the IPython
controller and four IPython engines. The simplest way of doing this is
with the [clusters tab](/#clusters),
or you can use the `ipcluster` command in a terminal:

    $ ipcluster start -n 4

`ipcluster` can also start the controller and engines through a number of
launchers and batch systems, which are beyond the scope of this tutorial, including:

* SGE
* PBS
* LSF
* MPI
* SSH
* WinHPC

More information on starting and configuring the IPython cluster is available in
[the IPython.parallel docs](http://ipython.org/ipython-doc/stable/parallel/parallel_process.html).
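
As a possible alternative to the `ipcluster` command, recent releases of `ipyparallel` (version 7 and later, which is an assumption about your installation) provide a `Cluster` class that can start the controller and engines from Python. The following is a minimal sketch rather than a recommended setup. The helper name `run_on_temporary_cluster` is hypothetical, and the import is deferred so that defining the function does not require `ipyparallel` or a running cluster.

```python
def run_on_temporary_cluster(f, *args, n=4):
    """Start n local engines, call f(*args) on every engine, then shut down.

    Assumes ipyparallel >= 7, where Cluster can be used as a context manager
    that yields a connected Client and stops the cluster on exit.
    """
    import ipyparallel as ipp  # deferred import: only needed when called

    with ipp.Cluster(n=n) as rc:
        # rc[:] is a DirectView on all engines; apply_sync waits for results.
        return rc[:].apply_sync(f, *args)
```

For example, `run_on_temporary_cluster(os.getpid)` should return one process id per engine. Because the cluster is stopped when the `with` block exits, this pattern suits short scripts more than an interactive session like this tutorial.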

Once you have started the IPython controller and one or more engines,
you are ready to use the engines to do something useful. 

To make sure everything is working correctly, let's do a very simple demo:

In [None]:
import ipyparallel as parallel
rc = parallel.Client()
rc.block = True

In [None]:
rc.ids

In [None]:
def mul(a,b):
    return a*b

In [None]:
def summary():
    """summarize some info about this process"""
    import os
    import socket
    import sys
    return {
        'cwd': os.getcwd(),
        'Python': sys.version,
        'hostname': socket.gethostname(),
        'pid': os.getpid(),
    }

In [None]:
mul(5, 6)

In [None]:
summary()

What does it look like to call this function remotely?

Just turn `f(*args, **kwargs)` into `view.apply(f, *args, **kwargs)`!

In [None]:
rc[0].apply(mul, 5, 6)

In [None]:
rc[0].apply(summary)

And the same thing in parallel?

In [None]:
rc[:].apply(mul, 5, 6)

In [None]:
rc[:].apply(summary)

Python has a builtin `map` for calling a function once per element of one or more sequences of arguments:

In [None]:
list(map(mul, range(1,10), range(2,11)))
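
When `map` is given two sequences, as in the serial call above, it pairs their elements positionally, much like `zip`. The short check below spells that out in plain Python, with `mul` redefined so that the block is self-contained:

```python
def mul(a, b):
    return a * b


# map with two iterables calls mul(a, b) for each positional pair.
via_map = list(map(mul, range(1, 10), range(2, 11)))

# The same computation written out with zip.
via_zip = [mul(a, b) for a, b in zip(range(1, 10), range(2, 11))]
```

Both lists hold the products 1*2, 2*3, and so on up to 9*10.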

So how do we do this in parallel?

In [None]:
view = rc.load_balanced_view()
list(view.map(mul, range(1,20), range(2,21)))

And a preview of parallel magics:

In [None]:
%%px
import os, socket
print(os.getpid())
print(socket.gethostname())

Now let's get into some more detail about how to use IPython for [remote execution](tutorial/Remote Execution.ipynb).