# Experiments as Iterators: asyncio in Science

<center>
<ul>
<li>Daniel Allan</li>
<li>Thomas Caswell</li>
<li>Kenneth Lauer</li>
</ul>

<p>Brookhaven National Lab</p>
<p>Source: https://github.com/NSLS-II</p>
<p>Project Documentation: https://NSLS-II.github.io</p>
</center>

## Origin of this Project

National Synchrotron Light Source II at Brookhaven National Lab

![NSLS-II](https://www.bnl.gov/ps/images/NSLS2-arial-1080px.jpg)

## NSLS-II

* 60 semi-independent research groups (10 so far)
* Scaling up to 19 Pb/year in "expensive pixels"
* No sacred data formats...
* ... but one validatable, extensible (NoSQL) schema for all:
  * metadata
  * data or *references* to data

## Data Acquisition Software Design Goals

* Integrate with the **scipy stack**.
* Support **streaming** data analysis.
* Capture metadata to record
  * a detailed **snapshot** of the hardware (all experiment state);
  * and the scientist's **intention**, the meaning of the measurements.
* Make datasets **searchable** with rich queries on metadata and data.
* Avoid inventing a domain-specific language as much as possible.

## Layers of Software, from the bottom up

* EPICS: Experimental Physics and Industrial Control System
* ophyd, our device abstraction layer
* bluesky, our experiment specification and execution framework

### EPICS

```python
In [1]: import epics

In [2]: epics.caget('SOME_INSCRUTABLE_DEVICE_ID')
Out[2]: 5.0
```

Old-style data acquisition program:

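```python
import epics

for i in range(5):
    try:
        epics.caput('MOTOR_ID', i)          # move the motor
        value = epics.caget('DETECTOR_ID')  # read the detector
        # bespoke I/O code
    except:
        # bespoke cleanup to ensure hardware safety
        raise  # re-raise so the failure is not silently swallowed
```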

Metadata is stuffed into filenames or custom headers.

### ophyd (device abstraction layer)

```python
In [3]: import ophyd

In [4]: motor = ophyd.EpicsMotor('MOTOR_ID', name='motor')

In [5]: motor.read()
Out[5]: {'motor':
            {'value': 5.0,
             'timestamp': 1468325228.751564}}
         
In [6]: motor.set(6.0)
```

Devices are expected to support a common interface: `read`, `set`, `stop`, ....

Our devices talk to EPICS, but yours could talk to LabVIEW, a Raspberry Pi, etc.
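As a rough illustration of that common interface, a device with no EPICS behind it might look like the sketch below. `SimulatedMotor` and its attributes are hypothetical names invented here. It does not subclass anything from ophyd and only mimics the shape of the `read()` output shown above.

```python
import time


class SimulatedMotor:
    """Hypothetical stand-in device; not part of ophyd."""

    def __init__(self, name):
        self.name = name
        self._position = 0.0

    def read(self):
        # Same shape as the reading above: {name: {'value', 'timestamp'}}
        return {self.name: {'value': self._position,
                            'timestamp': time.time()}}

    def set(self, position):
        # A real device would start a move here and report when it finishes.
        self._position = float(position)

    def stop(self):
        # Nothing to halt in this simulation.
        pass


motor = SimulatedMotor('motor')
motor.set(6.0)
print(motor.read()['motor']['value'])
```

Anything that offers these methods, whatever it talks to underneath, can in principle be driven by the same experiment code.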

Devices provide human-friendly names (good for analysis) and a hierarchical structure.

```python
class MultiAxisMirror(ophyd.Device):
    x = ophyd.Component(ophyd.EpicsMotor, ':X')
    y = ophyd.Component(ophyd.EpicsMotor, ':Y')
    pitch = ophyd.Component(ophyd.EpicsMotor, ':P')


mirror = MultiAxisMirror('SOME_ID', name='mirror')

In [1]: mirror.read()
Out[1]: {'mirror_x': {'value': 1.0, ...},
   ...:  'mirror_y': {'value': 1.5, ...},
   ...:  'mirror_pitch': {'value': 0.3, ...}}
```


### bluesky (experiment specification/execution)

New-style data acquisition program:

```python
from bluesky.plans import (open_run, close_run,
                           abs_set, trigger_and_read)

def plan():
    "scan 'motor' from 0 to 4 while reading 'detector'"
    yield from open_run(some_metadata_dict)
    for i in range(5):
        yield from abs_set(motor, i)
        yield from trigger_and_read([detector])
    yield from close_run()
```

The interpreter-like RunEngine performs I/O, safe hardware cleanup, and more.

```python
from bluesky import RunEngine
RE = RunEngine({})

RE(plan())  # execute
```

### A One-Slide Crash Course in ``yield`` and ``yield from``

```python
# Python 2.2+ (PEP 255); send() added in 2.5 (PEP 342)

def g():
    # g is a 'generator'
    yield 1
    yield 2

a = g()  # a is a 'generator instance'

list(a)
```

```
[1, 2]
```

```python
# Python 3.3+ (PEP 380)

def h():
    yield 0
    yield from g()
    yield 4

b = h()

list(b)
```

```
[0, 1, 2, 4]
```

Go watch James Powell's *Generators Will Free Your Mind* on YouTube!
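The RunEngine versions later in this talk rely on one more generator feature, `send()`, which PEP 342 added. A minimal sketch, using a made-up generator named `echo`:

```python
def echo():
    # The value passed to send() becomes the result of the yield expression.
    received = yield 'ready'
    while received is not None:
        received = yield 'got %r' % (received,)


gen = echo()
first = gen.send(None)      # a fresh generator must be started with None
second = gen.send('hello')  # resumes the generator and hands it 'hello'
print(first, '|', second)
```

This two-way channel is what lets a plan ask the engine for a value and then decide what to do next.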

## The rest of this talk:

(1) What exactly is happening with `RunEngine`

(2) Neat implications of expressing a science experiment as a generator

## The RunEngine from Scratch

In Which We Build Progressively More Complex Implementations

### A 'plan' is an iterable of 'messages'

A `Msg` is a `namedtuple` with four fields:

* a **command**, given as a string, e.g., ``'set'`` or ``'sleep'``
* a target **obj**, e.g., ``motor`` (if applicable)
* positional **args**
* **kwargs**

```python
from bluesky import Msg

Msg('sleep', None, 1)
```

```
sleep: (None), (1,), {}
```
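For readers without bluesky installed, a stand-in with the same four fields could be sketched as follows. This `Msg` is a hypothetical imitation written for this page, and its default `repr` differs from the output shown above.

```python
from collections import namedtuple


class Msg(namedtuple('Msg', ['command', 'obj', 'args', 'kwargs'])):
    """Hypothetical stand-in for bluesky's Msg, for illustration only."""
    __slots__ = ()

    def __new__(cls, command, obj=None, *args, **kwargs):
        # Collect extra positional and keyword arguments into the last two fields.
        return super().__new__(cls, command, obj, args, kwargs)


msg = Msg('set', 'motor', 5, wait=True)
print(msg.command, msg.obj, msg.args, msg.kwargs)
```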

### Version 0: the simplest possible RunEngine

```python
import time
from bluesky import Msg

function_map = {'sleep': lambda msg: time.sleep(*msg.args)}

def RE_v0(plan):
    for msg in plan:
        print('PROCESSING: %r' % (msg,))
        func = function_map[msg.command]
        func(msg)

sleepy_plan = [Msg('sleep', None, 1)]

RE_v0(sleepy_plan)
```

```
PROCESSING: sleep: (None), (1,), {}
```


### Version 1: a RunEngine that supports adaptive plan logic

```python
from bluesky.utils import ensure_generator

def RE_v1(plan):
    plan = ensure_generator(plan)
    last_result = None

    while True:
        try:
            # Hand the previous result back to the plan and get the next message.
            msg = plan.send(last_result)
        except StopIteration:
            break
        print('PROCESSING: %r' % (msg,))
        func = function_map[msg.command]
        last_result = func(msg)

function_map['sum'] = lambda msg: sum(msg.args)
```

```python
def adding_plan():
    "Ask the RunEngine to add two numbers. Print the result."
    yield Msg('sleep', None, 1)
    ret = yield Msg('sum', None, 3, 4)
    print('RECEIVED:', ret)

RE_v1(adding_plan())
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (3, 4), {}
RECEIVED: 7
```


```python
def adaptive_adding_plan():
    "Keep adding 3 until the result is greater than 8."
    ret = 1
    while True:
        yield Msg('sleep', None, 1)
        ret = yield Msg('sum', None, ret, 3)
        print('RECEIVED:', ret)
        if ret > 8:
            print('we are done')
            break

RE_v1(adaptive_adding_plan())
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (1, 3), {}
RECEIVED: 4
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (4, 3), {}
RECEIVED: 7
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (7, 3), {}
RECEIVED: 10
we are done
```


### Version 2: refactor as a callable class

A class, unlike a simple function, gives us access to the internal state.

```python
class RunEngine_v2:
    def __call__(self, plan):
        self._run(plan)

    def _run(self, plan):
        plan = ensure_generator(plan)
        last_result = None

        while True:
            try:
                msg = plan.send(last_result)
            except StopIteration:
                break
            print('PROCESSING: %r' % (msg,))
            func = function_map[msg.command]
            last_result = func(msg)
```

```python
# It still works.
RE_v2 = RunEngine_v2()
RE_v2(adaptive_adding_plan())
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (1, 3), {}
RECEIVED: 4
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (4, 3), {}
RECEIVED: 7
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (7, 3), {}
RECEIVED: 10
we are done
```


### Version 3: a RunEngine that supports interruptions / resuming

* The RunEngine still manages the main loop, processing the plan
* asyncio provides an outer event loop that manages multiple frames of execution

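```python
import asyncio

loop = asyncio.get_event_loop()

# Reimplement all command functions as coroutines.
# Note: these are generator-based coroutines. `@asyncio.coroutine` was
# removed in Python 3.11 and the `loop` argument of `asyncio.sleep` in 3.10.

@asyncio.coroutine
def _sum(msg):
    return sum(msg.args)

@asyncio.coroutine
def _sleep(msg):
    yield from asyncio.sleep(*msg.args, loop=loop)
```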

```python
class RunEngine_v3:
    def __init__(self):
        # Reuse the module-level loop, which _sleep is also bound to.
        self.loop = loop
        self.coroutine_map = {'sleep': _sleep,
                              'sum': _sum}

    def __call__(self, plan):
        self._task = self.loop.create_task(self._run(plan))
        self.loop.run_until_complete(self._task)

        if self._task.done() and not self._task.cancelled():
            exc = self._task.exception()
            if exc is not None:
                raise exc

    @asyncio.coroutine
    def _run(self, plan):
        plan = ensure_generator(plan)
        last_result = None

        while True:
            try:
                msg = plan.send(last_result)
            except StopIteration:
                break
            print('PROCESSING: %r' % (msg,))
            coroutine = self.coroutine_map[msg.command]
            # Yield control to the event loop while the command runs.
            last_result = yield from coroutine(msg)
```

```python
# And it still works
RE_v3 = RunEngine_v3()
RE_v3(adaptive_adding_plan())
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (1, 3), {}
RECEIVED: 4
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (4, 3), {}
RECEIVED: 7
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (7, 3), {}
RECEIVED: 10
we are done
```
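On current Python versions the same design can be written with `async def` and `await`. The following is a sketch under that assumption, not the bluesky implementation. `RunEngineAsync`, the local `Msg` stand-in, `adding_plan`, and `results` are names made up for this example, and the sleep duration is zero so that it runs instantly.

```python
import asyncio
from collections import namedtuple

# Hypothetical stand-in for bluesky.Msg so the sketch has no dependencies.
Msg = namedtuple('Msg', ['command', 'obj', 'args'])


async def _sum(msg):
    return sum(msg.args)


async def _sleep(msg):
    await asyncio.sleep(*msg.args)


class RunEngineAsync:
    def __init__(self):
        self.coroutine_map = {'sleep': _sleep, 'sum': _sum}
        self.processed = []

    def __call__(self, plan):
        # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
        return asyncio.run(self._run(plan))

    async def _run(self, plan):
        last_result = None
        while True:
            try:
                msg = plan.send(last_result)
            except StopIteration:
                break
            self.processed.append(msg.command)
            coroutine = self.coroutine_map[msg.command]
            last_result = await coroutine(msg)


results = []


def adding_plan():
    "Keep adding 3 until the result is greater than 8."
    total = 1
    while total <= 8:
        yield Msg('sleep', None, (0,))
        total = yield Msg('sum', None, (total, 3))
        results.append(total)


RE = RunEngineAsync()
RE(adding_plan())
print(results)
```

The plan itself stays an ordinary generator. Only the engine and its command functions become coroutines.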


```python
from bluesky import RunEngine  # finally, the real thing

# Make a RunEngine.
RE = RunEngine({})

# Teach it our toy command, 'sum'.
@asyncio.coroutine
def _sum(msg):
    return sum(msg.args)

RE.register_command('sum', _sum)

# Make it verbose.
RE.msg_hook = lambda msg: print("PROCESSING:", msg)


RE(adaptive_adding_plan())
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (1, 3), {}
RECEIVED: 4
PROCESSING: sleep: (None), (1,), {}
A 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.

Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:

resume()  --> will resume the scan
 abort()  --> will kill the scan with an 'aborted' state to indicate
              the scan was interrupted
  stop()  --> will kill the scan with a 'finished' state to indicate
              the scan stopped normally

Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.

Pausing...


[]
```

```python
RE.resume()
```

```
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (1, 3), {}
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (4, 3), {}
RECEIVED: 7
PROCESSING: sleep: (None), (1,), {}
PROCESSING: sum: (None), (7, 3), {}
RECEIVED: 10
we are done


[]
```
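Because the plan is a generator, it keeps its own position between messages, which is what makes pausing cheap. The toy below is a sketch of that idea only, under several assumptions. `PausableEngine`, `request_pause`, and `counting_plan` are invented names, the pause is triggered by a message in the plan rather than by Ctrl+C, and unlike the real RunEngine output above it does not rewind to a checkpoint.

```python
import asyncio
from collections import namedtuple

Msg = namedtuple('Msg', ['command', 'obj', 'args'])  # hypothetical stand-in


class PausableEngine:
    def __init__(self):
        self.state = 'idle'
        self._plan = None
        self._last_result = None
        self._pause_requested = False

    def request_pause(self):
        self._pause_requested = True

    def __call__(self, plan):
        self._plan = plan
        self._last_result = None
        return self.resume()

    def resume(self):
        # Continue the same generator from wherever it last stopped.
        self._pause_requested = False
        self.state = 'running'
        asyncio.run(self._run())
        return self.state

    async def _run(self):
        while not self._pause_requested:
            try:
                msg = self._plan.send(self._last_result)
            except StopIteration:
                self.state = 'idle'
                return
            self._last_result = await self._handle(msg)
        self.state = 'paused'

    async def _handle(self, msg):
        if msg.command == 'pause':
            self.request_pause()
            return None
        if msg.command == 'sum':
            await asyncio.sleep(0)
            return sum(msg.args)
        raise KeyError(msg.command)


results = []


def counting_plan():
    total = yield Msg('sum', None, (1, 3))
    yield Msg('pause', None, ())
    total = yield Msg('sum', None, (total, 3))
    results.append(total)


engine = PausableEngine()
state_after_call = engine(counting_plan())
state_after_resume = engine.resume()
print(state_after_call, state_after_resume, results)
```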

## Why is it useful to think of an experiment as an iterator?

* The experiment must be given as an *expression*, not a *statement*
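One way to read this: a plan is just a value that yields messages, so it can be inspected or rewritten before any hardware moves. A sketch of both ideas, with made-up helper names (`summarize`, `double_sleeps`) and a local `Msg` stand-in rather than anything from bluesky:

```python
from collections import namedtuple

Msg = namedtuple('Msg', ['command', 'obj', 'args'])  # hypothetical stand-in


def simple_plan():
    for position in range(3):
        yield Msg('set', 'motor', (position,))
        yield Msg('sleep', None, (1,))


def summarize(plan):
    "List the commands a plan would issue, without executing any of them."
    return [msg.command for msg in plan]


def double_sleeps(plan):
    "Wrap a plan, doubling the duration of every 'sleep' message."
    # This simple wrapper does not forward values sent back by an engine,
    # so it only suits plans that ignore results (non-adaptive plans).
    for msg in plan:
        if msg.command == 'sleep':
            msg = msg._replace(args=(msg.args[0] * 2,))
        yield msg


commands = summarize(simple_plan())
slow = list(double_sleeps(simple_plan()))
print(commands)
print(slow[1])
```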

## Three different times metadata can be injected

* before the experiment starts, in a global stash
* when an experimental plan is written
* interactively, when an experimental plan is executed
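A sketch of how three such sources could be combined into one record. The precedence used here (the most specific, latest source wins) is an assumption made for illustration, not a statement of how the RunEngine resolves conflicts. All variable names and the `'neutral'` value are hypothetical.

```python
from collections import ChainMap

global_md = {'user': 'dan', 'mood': 'neutral'}           # set before the experiment
plan_md = {'plan_name': 'demo plan', 'num_readings': 1}  # written into the plan
call_md = {'mood': 'optimistic'}                         # given at execution time

# ChainMap looks keys up left to right, so earlier mappings win on conflict.
start_md = dict(ChainMap(call_md, plan_md, global_md))
print(sorted(start_md.items()))
```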

```python
from bluesky import RunEngine, Msg
RE = RunEngine({'user': 'dan'})

plan = [Msg('open_run', plan_name='demo plan', num_readings=1),
        Msg('close_run')]

RE(plan, print, mood='optimistic')
```

```
start {'plan_name': 'demo plan', 'plan_type': 'list', 'uid': 'eb5d8aae-16f5-4bef-bac7-ab607b5037ae', 'num_readings': 1, 'scan_id': 2, 'time': 1468246826.637021}
stop {'exit_status': 'success', 'reason': '', 'uid': 'dbb3b54c-cce4-4fb0-9f26-e49e7e3a10a5', 'run_start': 'eb5d8aae-16f5-4bef-bac7-ab607b5037ae', 'time': 1468246826.639693}


['eb5d8aae-16f5-4bef-bac7-ab607b5037ae']
```

## Document Model

* An "Event" is a group of readings that, for scientific purposes, are synchronous.

## Exception handling

```python
from bluesky.plans import finalize_wrapper

finalize_wrapper()
```
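The general idea behind plan-level cleanup can be sketched in plain Python: wrap the plan in `try`/`finally`, and have the engine throw handler errors back into the generator so the `finally` block gets to yield its cleanup messages. Everything here (`with_cleanup`, `run`, the handlers, the `Msg` stand-in) is hypothetical and is not the signature or implementation of bluesky's `finalize_wrapper`.

```python
from collections import namedtuple

Msg = namedtuple('Msg', ['command', 'args'])  # hypothetical stand-in


def with_cleanup(plan, cleanup_plan):
    "Run `plan`, then always run `cleanup_plan()`, even after an error."
    try:
        yield from plan
    finally:
        yield from cleanup_plan()


def run(plan, handlers):
    "Toy engine: errors raised by a handler are thrown back into the plan."
    result, error = None, None
    while True:
        try:
            if error is not None:
                pending, error = error, None
                msg = plan.throw(pending)  # lets the plan's finally block run
            else:
                msg = plan.send(result)
        except StopIteration:
            return
        try:
            result = handlers[msg.command](msg)
        except Exception as exc:
            result, error = None, exc


log = []


def move(msg):
    log.append(('move', msg.args))
    if msg.args[0] > 1:
        raise ValueError('limit switch hit')


def park(msg):
    log.append(('park', msg.args))


def scan():
    for position in range(5):
        yield Msg('move', (position,))


def park_motor():
    yield Msg('park', (0,))


try:
    run(with_cleanup(scan(), park_motor), {'move': move, 'park': park})
except ValueError as exc:
    outcome = 'failed: %s' % exc
else:
    outcome = 'ok'
print(log, outcome)
```

In this sketch the cleanup message is processed before the original error reaches the caller, so the motor is parked even though the scan failed partway through.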