# Accessing Saved Data

## Configuration

This setup code would normally live in a startup script that runs automatically, so the user would not have to write or run it by hand.

In [None]:
%matplotlib notebook
%run startup.py

# Set up simulated hardware.
from ophyd.sim import SynAxis, SynGauss
motor = SynAxis(name='motor')
det = SynGauss('det', motor, 'motor', center=0, Imax=1,
               noise='uniform', sigma=1, noise_multiplier=0.1)

## Data Acquisition

### Execute a scan, retrieve the data as a table, and export it to CSV

In [None]:
RE(scan([det], motor, -1, 1, 5))

Access the most recent run, and get the data as a table (a `pandas.DataFrame`).

In [None]:
header = db[-1]
header.table()

In [None]:
header.table().to_csv('my_data.csv')

In [None]:
!cat my_data.csv

### Use metadata to generate a nice filename

Most of the useful metadata is stored in the "Run Start document," which we can access as `header.start`. Let's see what's in there. (See [this page of the bluesky documentation](https://nsls-ii.github.io/bluesky/documents.html) for more about "documents".)

In [None]:
header.start

In [None]:
def export_csv(header):
    filename = "{plan_name}_{num_points}.csv".format(**header.start)
    header.table().to_csv(filename)
    print("Exported data to", filename)

In [None]:
export_csv(header)
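
The filename trick in `export_csv` relies only on Python's `str.format` with dictionary unpacking. The sketch below shows the mechanism in isolation, using a plain dict as a stand-in for `header.start`; the values in `start_doc` are made up for illustration and are not output from a real run.

```python
# A plain dict standing in for header.start.
# These values are illustrative assumptions, not data from a real run.
start_doc = {
    "plan_name": "scan",
    "num_points": 5,
    "scan_id": 1,  # keys the template does not mention are simply ignored
}

# Each {name} placeholder is filled from the keyword argument of the same name.
filename = "{plan_name}_{num_points}.csv".format(**start_doc)
print(filename)  # scan_5.csv
```

If the template names a key that the dict lacks, `str.format` raises a `KeyError`, so the template should only use keys that every run is expected to have.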

Execute a new scan with a different number of points.

In [None]:
RE(scan([det], motor, -1, 1, 8))

In [None]:
export_csv(db[-1])

### Provide metadata and use it in the filename

In [None]:
def export_csv2(header):
    """Export to CSV. Expect header to have 'operator' and 'purpose' metadata."""
    filename = "{operator}_{plan_name}_{num_points}_{purpose}.csv".format(**header.start)
    header.table().to_csv(filename)
    print("Exported data to", filename)

In [None]:
# When RE receives extra keyword arguments it does not recognize,
# it captures them as metadata.
RE(scan([det], motor, -1, 1, 8), purpose='calibration', operator='Dan')

In [None]:
export_csv2(db[-1])
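
`export_csv2` expects every run to carry `operator` and `purpose` metadata, and it fails with a `KeyError` on runs that lack them. One possible way to tolerate older runs is to substitute a placeholder for missing keys. The sketch below is an assumption-laden illustration: `DefaultingDict` and `build_filename` are hypothetical helpers, and a plain dict stands in for `header.start`.

```python
class DefaultingDict(dict):
    """Hypothetical helper: supply a placeholder for any missing key."""

    def __missing__(self, key):
        return "unknown"


def build_filename(start_doc):
    """Hypothetical helper: fill the export_csv2 template, tolerating missing keys."""
    template = "{operator}_{plan_name}_{num_points}_{purpose}.csv"
    # format_map looks keys up directly in the mapping, so __missing__ is honored.
    return template.format_map(DefaultingDict(start_doc))


# Stand-in for a Run Start document that has no 'purpose' entry (made-up values).
start_doc = {"plan_name": "scan", "num_points": 8, "operator": "Dan"}
print(build_filename(start_doc))  # Dan_scan_8_unknown.csv
```

Whether a silent placeholder or a loud error is preferable depends on how strictly the metadata convention should be enforced.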

In [None]:
def overnight():
    "A multi-run plan. Each run gets different 'purpose' metadata."
    yield from scan([det], motor, -1, 1, 10, md={'purpose': 'calibration'})
    # open shutter or something
    yield from scan([det], motor, -1, 1, 10, md={'purpose': 'rough measurement'})
    yield from scan([det], motor, -1, 1, 100, md={'purpose': 'fine measurement'})

In [None]:
RE(overnight(), operator='Dan')
headers = db[-3:]  # grab the last three runs
for header in headers:
    export_csv2(header)

In [None]:
run_ids = RE(overnight(), operator='Dan')  # stash the unique IDs of these runs...
headers = db[run_ids]  # ... and use them to look up the data
for header in headers:
    export_csv2(header)

### Search for runs using user-specified metadata and plot search results together

In [None]:
fig, ax = plt.subplots()

for header in db(operator='Dan'):
    label = header.start['scan_id']
    data = header.table()
    ax.plot('motor', 'det', label=label, data=data)

ax.legend()

In [None]:
fig, ax = plt.subplots()

for header in db(purpose='fine measurement'):
    label = header.start['scan_id']
    data = header.table()
    ax.plot('motor', 'det', label=label, data=data)

ax.legend()

## Exercises

1. Experiment with writing variations of `export_csv`. Try writing one that sorts files into subdirectories based on operator name. (Hint: make directories in advance using `!mkdir DIRECTORY_NAME`.)
2. Try various search queries with `db()`.
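
As a starting point for the first exercise, directories can also be created from Python instead of with `!mkdir`. The sketch below builds a per-operator path with the standard-library `pathlib`; `build_export_path` is a hypothetical helper, a plain dict stands in for `header.start`, and a temporary directory is used so that running the example leaves nothing behind.

```python
import tempfile
from pathlib import Path


def build_export_path(start_doc, base_dir="."):
    """Hypothetical helper: return base_dir/<operator>/<plan_name>_<num_points>.csv."""
    directory = Path(base_dir) / start_doc["operator"]
    # Create the operator's directory if needed; do nothing if it already exists.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / "{plan_name}_{num_points}.csv".format(**start_doc)


# Stand-in for header.start (made-up values).
start_doc = {"operator": "Dan", "plan_name": "scan", "num_points": 8}

with tempfile.TemporaryDirectory() as tmp:
    path = build_export_path(start_doc, base_dir=tmp)
    relative = path.relative_to(tmp).as_posix()
    print(relative)  # Dan/scan_8.csv
```

The returned path could then be passed to `header.table().to_csv(...)`, which accepts path objects as well as strings.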