# What's New - 2017 Holiday Edition

* Better automated visualization and peak statistics via the "Best-Effort Callback" and hints
* More convenient ways to access saved data
* Easier-to-use "supplemental data": baseline readings and asynchronous acquisition (monitoring and flying)

## New releases

Bluesky and Ophyd are (more) stable -- v1.0.0!

In [None]:
import bluesky
bluesky.__version__

In [None]:
import ophyd
ophyd.__version__

Data Broker is... getting there!

In [None]:
import databroker
databroker.__version__

## Setup (nothing up my sleeves...)

There's no IPython profile or external startup script. For educational purposes, we'll type it all out here in the notebook.

In [None]:
# Make plotting interactive in the notebook.
%matplotlib notebook
from bluesky.utils import install_nb_kicker
install_nb_kicker()

# Create a RunEngine.
from bluesky import RunEngine
RE = RunEngine({})

from bluesky.plans import scan
from ophyd.sim import det, motor  # simulated Devices

# Simulate the motor movement.
motor.delay = 0.5

In [None]:
d = [det]  # my list of detectors

## Better automated visualization and peak statistics via the "Best-Effort Callback" and hints

### The Bad Old Days

Recall that a vanilla RunEngine gives no useful visual feedback. In bluesky jargon, there are no callbacks subscribed to receive the data.

In [None]:
RE(scan(d, motor, -1, 1, 5))
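
For orientation, a callback in this sense is any callable that accepts a document name and the document itself. The toy below mimics the subscribe-and-dispatch pattern with a hypothetical `ToyDispatcher` class and hand-written documents. It is a sketch of the idea, not bluesky's RunEngine, and the document contents are illustrative assumptions.

```python
class ToyDispatcher:
    """Hypothetical stand-in for the subscribe/emit pattern (not bluesky code)."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, func):
        # Register a callable with the signature func(name, doc).
        self._callbacks.append(func)

    def emit(self, name, doc):
        # Hand every document to every subscribed callback.
        for func in self._callbacks:
            func(name, doc)


received = []


def record(name, doc):
    # A minimal callback: remember which kinds of documents arrived.
    received.append(name)


dispatcher = ToyDispatcher()

# With nothing subscribed, documents are emitted but nobody sees them.
dispatcher.emit('start', {'plan_name': 'scan'})

# Once a callback is subscribed, it receives every later document.
dispatcher.subscribe(record)
for name, doc in [('start', {'plan_name': 'scan'}),
                  ('event', {'data': {'det': 1.0, 'motor': -1.0}}),
                  ('stop', {'exit_status': 'success'})]:
    dispatcher.emit(name, doc)

print(received)
```

The first `emit` is lost, which is the toy equivalent of running a scan with no callbacks subscribed.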

Getting a table and a plot required either:
* writing custom plans that subscribed certain callbacks (for example, `ct` and `ascan` did this)
* passing in callbacks as arguments, such as

  ```python
  RE(scan(d, motor, -1, 1, 5), [LiveTable(['det', 'motor']), LivePlot('det', 'motor')])
  ```
  
Sometimes this level of power and flexibility is awesome, but usually it is obnoxiously verbose just to get a table and a plot!

_There must be a better way!_

### A Table and a Plot for Every Occasion

In [None]:
# Set up the BestEffortCallback.
from bluesky.callbacks.best_effort import BestEffortCallback
bec = BestEffortCallback()
RE.subscribe(bec)  # Attach bec to the RunEngine itself, applying it to all future executions.

peaks = bec.peaks  # just an alias for less typing

In [None]:
RE(scan(d, motor, -1, 1, 5))

How does the callback decide what to plot, and which columns to show? There are often multiple choices, and not enough room to just show everything.

In [None]:
motor.describe()

The Device(s) (``det``, ``motor``) and the plan (``scan``) provide **hints**.

In [None]:
# Devices and Signals have a new (optional) 'hints' attribute.

motor.hints

In [None]:
# Plans provide 'hints' in their metadata. We can see that by printing it.

def print_hints_metadata(name, doc):
    if name == 'start':
        # Prints hints if they exist.
        print('HINTS:', doc.get('hints', 'NO HINTS WERE GIVEN'))

RE(scan(d, motor, -1, 1, 5), print_hints_metadata)

### About Hints

* The hints are not always guaranteed to be correct -- hence the "Best-Effort" in Best-Effort Callback.
* They are intentionally generic, intended to be future-proof.
* They are an experimental feature that will likely be extended and changed in the future.
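
To make the idea concrete, here is a rough sketch of how a consumer might use hints to decide which columns to show. The dictionaries are hand-written stand-ins shaped like the output of `describe()` and `hints`, and `choose_columns` is a hypothetical helper. The Best-Effort Callback's real logic may differ.

```python
def choose_columns(described_fields, hints):
    """Hypothetical helper: prefer hinted fields, else fall back to everything."""
    hinted = [f for f in hints.get('fields', []) if f in described_fields]
    # Hints are optional, so fall back to all fields when none are usable.
    return hinted or sorted(described_fields)


# Stand-ins for what a device reports (assumed shapes, for illustration only).
described = {'motor': {}, 'motor_setpoint': {}}

print(choose_columns(described, {'fields': ['motor']}))  # hinted field only
print(choose_columns(described, {}))                     # no hints: show all
```

Because hints are only suggestions, the fallback branch matters as much as the hinted one.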

### Bonus Feature: Click the plot and hit P (capital P!)

Peak stats are always on in the background. Of course they don't always make physical sense; it's up to you to decide whether to look at them.

In [None]:
peaks

Here's a simple plan that uses `peaks` to implement what SPEC users know as "cen" -- moving the motor to the center of the peak.

In [None]:
from bluesky.plan_stubs import mv

def cen():
    pos = peaks['cen']['det']
    print(f'Moving motor to {pos}')
    yield from mv(motor, pos)

In [None]:
RE(cen())
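
For intuition about what a peak "center" can mean, the sketch below estimates it two simple ways on synthetic data: the position of the maximum and the intensity-weighted center of mass. This is a standalone illustration using only the standard library. The data and helper names are made up, and it is not necessarily how `bec.peaks` computes its statistics.

```python
import math


def position_of_max(xs, ys):
    # Return the x value where y is largest.
    return max(zip(ys, xs))[1]


def center_of_mass(xs, ys):
    # Intensity-weighted mean position.
    return sum(x * y for x, y in zip(xs, ys)) / sum(ys)


# Synthetic Gaussian peak centered at 0.3, sampled on an even grid.
xs = [i / 10 for i in range(-20, 21)]
ys = [math.exp(-((x - 0.3) ** 2) / (2 * 0.3 ** 2)) for x in xs]

print(position_of_max(xs, ys))
print(round(center_of_mass(xs, ys), 3))
```

On a clean, symmetric peak the two estimates agree; on noisy or lopsided data they can differ, which is one reason peak statistics need a sanity check before a motor is moved to them.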

## More convenient ways to access saved data

In [None]:
# For demo purposes, we set up a Broker backed by a temporary directory
# containing JSON files and a sqlite database.
# In production, a Broker is usually backed by a Mongo database.
import os
import tempfile
tempdir = tempfile.mkdtemp()
config = {
    'description': 'temporary',
    'metadatastore': {
        'module': 'databroker.headersource.sqlite',
        'class': 'MDS',
        'config': {
            'directory': tempdir,
            'timezone': 'US/Eastern'}
    },
    'assets': {
        'module': 'databroker.assets.sqlite',
        'class': 'Registry',
        'config': {
            'dbpath': os.path.join(tempdir, 'assets.sqlite')}
    }
}
from databroker import Broker
db = Broker.from_config(config)

# Send all data from RE into db.
RE.subscribe(db.insert)

In [None]:
# Use a more complex (realistic) simulated detector that has some configuration.
from ophyd.sim import det_with_count_time
det_with_count_time.configuration_attrs.append('count_time')
det_with_count_time.count_time.set(1)

# Take some data. Pretty boring data.
RE(scan([det_with_count_time], motor, -1, 1, 5))

### The Old Way (it still works, but it's rarely the best way)

In [None]:
h = db[-1]
db.get_table(h)

Why should I have to type `db` twice?

### The New Way

In [None]:
h = db[-1]
h.table()

In [None]:
# or, if you don't need the Header for anything else, do it in one line
db[-1].table()

### There are a lot of new, convenient methods hanging off of `Header`.

See [this section](https://nsls-ii.github.io/databroker/api.html#the-header-object) of the recently revamped databroker documentation.

Quick Hits:

In [None]:
h.fields()  # i.e. columns in the table

In [None]:
h.data('det_intensity')  # lazy access to one column of data

In [None]:
list(h.data('det_intensity'))

In [None]:
h.devices()  # i.e. names of devices, which is useful for...

In [None]:
# ...accessing device configuration metadata
h.config_data('det')

Confused about the difference between `h.devices()` and `h.fields()`? `h.devices()` gives the names of the Devices involved...

In [None]:
motor.name

...and `h.fields()` gives the labels of the readings that they provided. Some of these labels might be the same as the Device names, as is the case with our example `motor`.

In [None]:
motor.describe()

### And of course we still have the classics

Everything we know before we start talking to hardware is in `Header.start`. (This is effectively just a Python dictionary with some tricks to make it display nicely in the notebook.)

In [None]:
h.start

And everything we only know at the end of a run is in `Header.stop`.

In [None]:
h.stop

There's more. We'll revisit `Header` after the next topic.

## Easier-to-use "supplemental data": baseline readings and asynchronous acquisition (monitoring and flying)

In [None]:
# Get some more simulated Devices and scatter them about.
from ophyd.sim import motor1, motor2, motor3
motor1.set(13)
motor2.set(-2)
motor3.set(42)

### The Old Way

It was so painful and inconsistent that I'm not even going to show you.

### The New Way

In [None]:
# Set up SupplementalData.
from bluesky import SupplementalData
sd = SupplementalData()
RE.preprocessors.append(sd)
# All plans executed by RE will now be modified (preprocessed) by sd.

In [None]:
sd

Before we do anything new, let's remember what we see when we do a scan.

In [None]:
RE(scan([det], motor, -1, 1, 5))

### Baseline Readings

Now, at the beginning and end of every run, record the positions of these motors.

In [None]:
sd.baseline = [motor1, motor2, motor3]

In [None]:
# same as above
RE(scan([det], motor, -1, 1, 5))

Notice

``New stream: 'baseline'``

as well as the boxes of readings. How can we access that data later? It doesn't show up in the table!

In [None]:
db[-1].table()

In [None]:
db[-1].table('primary')  # the default

In [None]:
db[-1].table('baseline')

If the table gets too wide, it's handy to know how to take the transpose of a `DataFrame`.

In [None]:
db[-1].table('baseline').T

### Asynchronous Monitoring

In [None]:
# The `rand` Signal updates to a random number at irregular intervals.
from ophyd.sim import SynPeriodicSignal
import random
rand = SynPeriodicSignal(name='rand', func=random.random, period=0.2, period_jitter=0.1)

In [None]:
sd.monitors = [rand]

In [None]:
RE(scan([det], motor, -1, 1, 10))

Notice

``New stream: 'rand_monitor'``

and a new figure. But the Best-Effort Callback doesn't show us the numbers from monitor readings. (There's just not enough room.)

In [None]:
h = db[-1]
h.table('rand_monitor')

What streams do we have? Another good `Header` attribute to know:

In [None]:
h.stream_names
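
As a mental model for streams, each event points at a descriptor, and the descriptor carries the stream name. The sketch below groups hand-written, heavily simplified documents by stream. The uids and values are invented for illustration, and real documents carry many more keys.

```python
from collections import defaultdict

# Simplified, hand-written documents (illustrative only).
documents = [
    ('descriptor', {'uid': 'd1', 'name': 'baseline'}),
    ('descriptor', {'uid': 'd2', 'name': 'primary'}),
    ('descriptor', {'uid': 'd3', 'name': 'rand_monitor'}),
    ('event', {'descriptor': 'd1', 'data': {'motor1': 13}}),
    ('event', {'descriptor': 'd2', 'data': {'det': 0.6, 'motor': -1.0}}),
    ('event', {'descriptor': 'd3', 'data': {'rand': 0.42}}),
    ('event', {'descriptor': 'd2', 'data': {'det': 1.0, 'motor': 0.0}}),
]

stream_of = {}               # descriptor uid -> stream name
events = defaultdict(list)   # stream name -> list of readings

for name, doc in documents:
    if name == 'descriptor':
        stream_of[doc['uid']] = doc['name']
    elif name == 'event':
        # Look up the stream through the descriptor the event refers to.
        events[stream_of[doc['descriptor']]].append(doc['data'])

print(sorted(events))
print(len(events['primary']))
```

Seen this way, asking for the table of a given stream amounts to picking one of these groups and laying its readings out as rows.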

### Resampling data on time to compare across streams

`Header.table` returns a `pandas.DataFrame`. Pandas is a great library for handling time series data.

How can we plot `rand` vs `motor` to see if they are correlated?

In [None]:
import pandas as pd
df = pd.concat([h.table('rand_monitor').set_index('time'),
                h.table('primary').set_index('time')], axis=0)
df

Look at where the `NaN`s (indicating missing data) are. We have a block matrix. Let's sort it by time.

In [None]:
df.sort_index()

In [None]:
df.sort_index().ffill()  # 'forward-fill' the last non-empty value
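
If the combination of sorting and forward-filling feels opaque, this standard-library sketch reproduces the idea on two tiny hand-made streams: merge the rows by time, then carry the last known value of each column forward. The timestamps and values are invented, and `forward_fill` is a hypothetical helper, not a pandas function.

```python
def forward_fill(rows, columns):
    """Hypothetical helper: fill missing values with the last one seen."""
    last = {col: None for col in columns}
    filled = []
    for time, values in sorted(rows, key=lambda row: row[0]):
        last.update(values)  # overwrite only the columns present in this row
        filled.append((time, dict(last)))
    return filled


# Two streams sampled at different times (made-up numbers).
primary = [(0.0, {'motor': -1.0}), (1.0, {'motor': 0.0}), (2.0, {'motor': 1.0})]
monitor = [(0.4, {'rand': 0.11}), (0.9, {'rand': 0.57}), (1.6, {'rand': 0.83})]

result = forward_fill(primary + monitor, ['motor', 'rand'])
for time, row in result:
    print(time, row)
```

The first row keeps a gap for `rand` because no monitor reading has arrived yet; forward-filling can only look backward in time.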

Now that `rand` and `motor` have a shared time base, we can plot them against each other.

In [None]:
df.sort_index().ffill().plot(x='motor', y='rand')

Other more sophisticated possibilities....

In [None]:
df.sort_index().interpolate()  # linearly interpolate (more advanced options are available)

In [None]:
df.sort_index().ffill().groupby('motor').mean()  # Average 'rand' over each 'motor' point.

### Reviewing and removing Supplemental Data sources

In [None]:
sd

In [None]:
sd.monitors

In [None]:
sd.monitors.clear()  # or just sd.monitors = []

In [None]:
sd.monitors

### Hiding baseline readings (but still taking them)

In [None]:
RE(scan(d, motor, -1, 1, 10))

In [None]:
bec.disable_baseline()  # turns off VISUALIZATION only

In [None]:
RE(scan(d, motor, -1, 1, 10))

We know that baseline data is still being recorded because ``New stream: 'baseline'`` is still there, and we can of course access the data.

In [None]:
db[-1].table('baseline')

Use `bec.<TAB>` to see other options for tuning the Best-Effort Callback. It will become much more extensible/customizable in later versions.