# The basic interface for remote computation with IPython

A Client is the low-level object which manages your connection to the various Schedulers and the Hub.
Everything you do passes through one of these objects, either indirectly or directly.

It has an `ids` property, which is always an up-to-date list of the integer engine IDs currently available.

In [1]:
import os,sys,time
import numpy

import ipyparallel as ipp
rc = ipp.Client()

In [2]:
rc.ids

[0, 1, 2, 3]

The most basic function of the **Client** is to create the **View** objects,
which are the interfaces for actual communication with the engines.

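There are two basic models for working with engines: addressing specific engines directly (a DirectView) or letting a scheduler assign work to whichever engine is free (a LoadBalancedView). Let's start with the simplest case for remote execution, a DirectView of one engine: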

In [3]:
e0 = rc[0] # index-access of a client gives us a DirectView
e0.block = True # let's start synchronous
e0

<DirectView 0>

It's all about:

```python
view.apply(f, *args, **kwargs)
```

We want the interface for remote and parallel execution to be as natural as possible.
And what's the most natural unit of execution?  Code!  Simply define a function,
just as you would use locally, and instead of calling it, pass it to `view.apply()`,
with the remaining arguments just as you would have passed them to the function.

In [4]:
def get_norms(A, levels=[2]):
    """get all the requested norms for an array"""
    norms = {}
    
    for level in levels:
        norms[level] = numpy.linalg.norm(A, level)
    return norms

A = numpy.random.random(1024)
get_norms(A, levels=[1,2,3,numpy.inf])

{1: 519.38110952531167,
 2: 18.614004390578689,
 3: 6.3766767872130785,
 inf: 0.99937396005763779}

To call this remotely, simply replace '`get_norms(`' with '`e0.apply(get_norms,`'. The same substitution generally works for turning any local call into a remote one.

Note that this will probably raise a `NameError` for `numpy`, because the engine has not imported it yet:

In [5]:
e0.apply(get_norms, A, levels=[1,2,3,numpy.inf])

RemoteError: NameError(name 'numpy' is not defined)

The simplest way to import numpy on the engines is the `%px` magic:

In [6]:
%px import numpy

But if you want to import modules locally and on the engines at the same time, you can use `view.sync_imports()`:

In [7]:
with e0.sync_imports():
    import numpy

importing numpy on engine(s)


In [9]:
e0.apply(get_norms, A, levels=[1,2,3,numpy.inf])

{1: 519.38110952531167,
 2: 18.614004390578689,
 3: 6.3766767872130785,
 inf: 0.99937396005763779}
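
The `NameError` seen earlier follows from where the function's global names are looked up: on the engine, not in your session. The sketch below imitates that locally with the standard library only. `engine_ns` is a hypothetical stand-in for an engine namespace, and rebuilding the function with `types.FunctionType` is just a way to illustrate the lookup rule, not a description of how ipyparallel ships functions.

```python
import builtins
import math
import types

def hypot(x, y):
    """Uses the global name `math`, much as get_norms uses `numpy`."""
    return math.sqrt(x * x + y * y)

# Hypothetical stand-in for an engine namespace that has not imported math.
engine_ns = {"__builtins__": builtins}

# Same code object, but global names now resolve in engine_ns.
remote_like = types.FunctionType(hypot.__code__, engine_ns)

try:
    remote_like(3, 4)
    error = None
except NameError as exc:
    error = str(exc)
print(error)

# The effect of importing the module on the engine.
engine_ns["math"] = math
result = remote_like(3, 4)
print(result)
```

Another option is to put the import inside the function body, so that it runs wherever the function runs.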

Functions don’t have to be interactively defined; you can use module functions as well:

In [8]:
e0.apply(numpy.linalg.norm, A, 2)

18.614004390578689
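
The calling convention of `apply` (the function first, then its arguments) has the same shape as `Executor.submit` in the standard library's `concurrent.futures`. The sketch below uses a thread pool purely as a local analogy, so it runs without a cluster. The pure-Python `get_norms` here is a simplified stand-in for the NumPy version above, not a replacement for it.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def get_norms(values, levels=(2,)):
    """Compute the requested p-norms of a flat sequence of numbers."""
    norms = {}
    for level in levels:
        if level == math.inf:
            norms[level] = max(abs(v) for v in values)
        else:
            norms[level] = sum(abs(v) ** level for v in values) ** (1 / level)
    return norms

values = [3.0, 4.0]

# Local call.
local = get_norms(values, levels=[1, 2, math.inf])

# Same call, handed to something else to run: function first, then arguments.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(get_norms, values, levels=[1, 2, math.inf])
    submitted = future.result()

print(local == submitted)
```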

### execute and run

You can also run files or strings with `run` and `execute`
respectively.

For instance, I have a script `myscript.py` that defines a function
`mysquare`:

```python
import math
import numpy
import sys

a=5

def mysquare(x):
    return x*x
```

I can run that remotely, just like I can locally with `%run`, and then I
will have `mysquare()`, and any imports and globals from the script in the
engine's namespace:

In [10]:
%pycat myscript.py

import math
import numpy
import sys

a=5

def mysquare(x):
    return x*x


In [None]:
%run myscript.py

In [11]:
e0.run("myscript.py")

<AsyncResult: execute:finished>

In [12]:
e0.execute("b=mysquare(a)")

<AsyncResult: execute:finished>

In [13]:
e0['a']

5

In [14]:
e0['b']

25
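
One way to picture `run` and `execute`: both evaluate source code against the engine's namespace, and the names they define stay there for later calls. The sketch below imitates this with `exec` on a plain dictionary. `engine_ns` is a hypothetical stand-in for the engine namespace, and this is a mental model rather than ipyparallel's implementation.

```python
# Contents of a script like myscript.py, inlined so the sketch is self-contained.
script_source = """
import math
a = 5

def mysquare(x):
    return x * x
"""

engine_ns = {}  # hypothetical stand-in for the engine's namespace

exec(script_source, engine_ns)      # roughly: e0.run("myscript.py")
exec("b = mysquare(a)", engine_ns)  # roughly: e0.execute("b=mysquare(a)")

# Roughly: e0['a'] and e0['b']
print(engine_ns["a"], engine_ns["b"])
```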

## Working with the engine namespace

The namespace on the engine is accessible to your functions as
`globals`. So if you want to work with values that persist in the engine namespace, you just use
global variables.

In [18]:
def inc_a(increment):
    global a
    a += increment

print("   %2i" % e0['a'])
e0.apply(inc_a, 5)
print(" +  5")
print(" = %2i" % e0['a'])

   20
 +  5
 = 25


In [19]:
a = 2
inc_a(4)
a

6

And just like the rest of Python, you don’t have to declare a variable `global` if you only read it and never assign to it:

In [20]:
def mul_by_a(b):
    return a*b

e0.apply(mul_by_a, 10)

250
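
The behaviour above, where `inc_a` changes `a` on the engine while a local `a` is left alone, comes down to which namespace serves as the function's globals. Here is a local imitation, assuming a hypothetical `engine_ns` dictionary in place of the engine namespace. The rebuilding step via `types.FunctionType` is only an illustration of the idea.

```python
import builtins
import types

def inc_a(increment):
    global a
    a += increment

def mul_by_a(b):
    return a * b  # reading a global needs no declaration

a = 2  # the local value, which should stay untouched

# Hypothetical stand-in for the engine namespace, where a is 20.
engine_ns = {"__builtins__": builtins, "a": 20}

# Same code objects, but their globals are now engine_ns.
remote_inc_a = types.FunctionType(inc_a.__code__, engine_ns)
remote_mul_by_a = types.FunctionType(mul_by_a.__code__, engine_ns)

remote_inc_a(5)
product = remote_mul_by_a(10)
print(engine_ns["a"], product, a)
```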

If you want to perform several operations on the same data, you don’t want to send it every time. For this, we have a `Reference` class. A Reference is just a wrapper for an identifier; when it is deserialized on the engine, it is replaced by the object of that name in the engine namespace.

In [21]:
def is_it_a(b):
    return a is b

e0.apply(is_it_a, 5)

False

In [22]:
e0.apply(is_it_a, ipp.Reference('a'))

True

`ipp.Reference` is useful to avoid repeated data movement.
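
The difference between passing data and passing a reference can be imitated locally. In this sketch, `Ref`, `deliver`, and `engine_ns` are all hypothetical stand-ins: ordinary arguments are round-tripped through `pickle` to mimic being sent over the wire, while a `Ref` is resolved by name in the namespace.

```python
import pickle

class Ref:
    """Hypothetical stand-in for ipp.Reference: it carries only a name."""
    def __init__(self, name):
        self.name = name

def deliver(arg, engine_ns):
    """Imitate what the engine ends up with for one argument."""
    if isinstance(arg, Ref):
        return engine_ns[arg.name]          # looked up by name on arrival
    return pickle.loads(pickle.dumps(arg))  # ordinary data arrives as a copy

def is_it_a(b, engine_ns):
    return engine_ns["a"] is b

data = list(range(5))
engine_ns = {"a": data}

by_value = is_it_a(deliver(data, engine_ns), engine_ns)
by_reference = is_it_a(deliver(Ref("a"), engine_ns), engine_ns)
print(by_value, by_reference)
```

The by-value call compares against an equal copy, so the identity check fails; the by-reference call gets the very object already stored under `a`.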

## Moving data around

In addition to calling functions and executing code on engines, you can
transfer Python objects to and from your IPython session and the
engines. In IPython, these operations are called `push` (sending an
object to the engines) and `pull` (getting an object from the
engines).

`push` takes a dictionary, which is used to update the remote namespace:

In [23]:
e0.push(dict(a=1.03234, b=3453))

`pull` takes one or more keys:

In [24]:
e0.pull('a')

1.03234

In [25]:
e0.pull(('b','a'))

[3453, 1.03234]

### Dictionary interface

Treating a DirectView like a dictionary results in push/pull operations:

In [27]:
e0['a'] = list(range(5))
e0.execute('b = a[::-1]')
e0['b']

[4, 3, 2, 1, 0]

`get()` and `update()` work as well.

### Exercise: Remote matrix operations

Can you get the eigenvalues (`numpy.linalg.eigvals`) and norms (`numpy.linalg.norm`) of an array that's already on e0?

In [28]:
A = numpy.random.random((16,16))
A = A.dot(A.T)
e0['A'] = A

In [29]:
numpy.linalg.eigvals(A)

array([  5.81123779e+01,   4.53010620e+00,   3.20752972e+00,
         3.06123187e+00,   2.13805323e+00,   1.70146182e+00,
         1.10901237e+00,   9.78731510e-01,   9.14248851e-01,
         6.67435609e-01,   1.14621883e-04,   2.29021560e-02,
         7.24785221e-02,   1.53508907e-01,   2.77292206e-01,
         2.27526718e-01])

In [30]:
numpy.linalg.norm(A, 2)

58.112377904224964

In [33]:
e0.apply(numpy.linalg.eigvals, ipp.Reference('A'))

array([  5.81123779e+01,   4.53010620e+00,   3.20752972e+00,
         3.06123187e+00,   2.13805323e+00,   1.70146182e+00,
         1.10901237e+00,   9.78731510e-01,   9.14248851e-01,
         6.67435609e-01,   1.14621883e-04,   2.29021560e-02,
         7.24785221e-02,   1.53508907e-01,   2.77292206e-01,
         2.27526718e-01])

In [34]:
e0.apply(numpy.linalg.norm, ipp.Reference('A'), 2)

58.112377904224964

In [35]:
def both(A):
    return numpy.linalg.eigvals(A), numpy.linalg.norm(A, 2)

In [36]:
e0.apply(both, ipp.Reference('A'))


(array([  5.81123779e+01,   4.53010620e+00,   3.20752972e+00,
          3.06123187e+00,   2.13805323e+00,   1.70146182e+00,
          1.10901237e+00,   9.78731510e-01,   9.14248851e-01,
          6.67435609e-01,   1.14621883e-04,   2.29021560e-02,
          7.24785221e-02,   1.53508907e-01,   2.77292206e-01,
          2.27526718e-01]), 58.112377904224964)

# Asynchronous execution

We have covered the basic methods for running code remotely, but we have been using `block=True`.  We can also do non-blocking execution.

In [37]:
e0.block = False

In non-blocking mode, `apply` submits the command to be executed and
then returns an `AsyncResult` object immediately. The `AsyncResult`
object gives you a way of getting the result at a later time through its
`get()` method.

The AsyncResult object provides a superset of the interface in [`multiprocessing.pool.AsyncResult`](http://docs.python.org/library/multiprocessing#multiprocessing.pool.AsyncResult).
See the official Python documentation for more.

In [38]:
def wait(t):
    import time
    tic = time.time()
    time.sleep(t)
    return time.time()-tic

In [39]:
ar = e0.apply(wait, 10)
ar

<AsyncResult: wait>

`ar.ready()` tells us whether the result is ready:

In [40]:
ar.ready()

False

`ar.get()` blocks until the result is ready or, if a timeout is specified, until that timeout is reached:

In [41]:
ar.get(1)

TimeoutError: Result not ready.

In [42]:
%time ar.get()

CPU times: user 22 µs, sys: 1 µs, total: 23 µs
Wall time: 26 µs


10.004847049713135
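
Since the interface is modeled on `multiprocessing.pool.AsyncResult`, the same `ready()` / `get(timeout)` pattern can be tried without a cluster. The sketch below uses the standard library's `ThreadPool` as a local analogy. Note one difference in calling convention: `multiprocessing`'s `apply_async` takes the arguments as a tuple, whereas the view's `apply` takes them inline. The shorter sleep is only there to keep the example quick.

```python
import time
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool

def wait(t):
    tic = time.time()
    time.sleep(t)
    return time.time() - tic

with ThreadPool(processes=1) as pool:
    ar = pool.apply_async(wait, (0.5,))  # returns immediately
    ready_at_once = ar.ready()           # the task is still sleeping

    try:
        ar.get(timeout=0.05)             # give up after 50 ms
        timed_out = False
    except TimeoutError:
        timed_out = True

    elapsed = ar.get()                   # blocks until the result arrives
    ready_after_get = ar.ready()

print(ready_at_once, timed_out, ready_after_get)
```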

For convenience, you can choose blocking or non-blocking behavior for a single call with `apply_sync` and `apply_async`, regardless of the view's `block` setting:

In [43]:
e0.apply_sync(os.getpid)

23698

In [44]:
ar = e0.apply_async(os.getpid)
ar

<AsyncResult: getpid>

In [45]:
ar.get()

23698

In [46]:
ar.metadata

{'after': [],
 'completed': datetime.datetime(2017, 6, 28, 14, 49, 5, 589643, tzinfo=tzutc()),
 'data': {},
 'engine_id': 0,
 'engine_uuid': '16810d92-48ab975d414a438df85f0098',
 'error': None,
 'execute_input': None,
 'execute_result': None,
 'follow': [],
 'msg_id': 'ae229bde-dc91b9bf2f5177a885fbe156',
 'outputs': [],
 'received': datetime.datetime(2017, 6, 28, 14, 49, 5, 593520, tzinfo=tzutc()),
 'started': datetime.datetime(2017, 6, 28, 14, 49, 5, 588775, tzinfo=tzutc()),
 'status': 'ok',
 'stderr': '',
 'stdout': '',
 'submitted': datetime.datetime(2017, 6, 28, 14, 49, 5, 560464, tzinfo=tzutc())}

In [47]:
ar.wall_time

0.033056

In [48]:
ar.serial_time

0.000868
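
These two timings can be checked against the metadata printed above. For this single task, the values are consistent with `wall_time` being `received - submitted` and `serial_time` being `completed - started`. The sketch below redoes that arithmetic with the timestamps copied from the output; treating this as the general definition is an assumption worth checking against the ipyparallel documentation.

```python
from datetime import datetime, timezone

def stamp(microsecond):
    """Timestamps from ar.metadata above; only the microseconds differ."""
    return datetime(2017, 6, 28, 14, 49, 5, microsecond, tzinfo=timezone.utc)

submitted = stamp(560464)
started = stamp(588775)
completed = stamp(589643)
received = stamp(593520)

# Time from submitting the task to having the result back on the client.
wall_time = (received - submitted).total_seconds()

# Time the engine spent actually running the task.
serial_time = (completed - started).total_seconds()

print(wall_time, serial_time)
```

The gap between the two is the overhead of scheduling and messaging, which dominates for a call as cheap as `os.getpid`.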

Now that we have the basic interface covered, we can really get going [in Parallel](Multiplexing.ipynb).