# Unpack the external data archive

*APS Training for Bluesky Data Acquisition*.

**Objective**

We have been provided a ZIP file (`class_data_examples.zip`) containing data that was exported by `databroker` from a different MongoDB database.  Install that data locally for use on this workstation.

## Extract data from archive to local directory

Unzip the data into `~/data`.  The archive creates the subdirectory `~/data/class_data_examples`.

In [1]:
import os
import zipfile

In [2]:
path = os.path.join(os.environ["HOME"], "data")
print(f"{path = }")

path = '/home/apsu/data'


In [3]:
with zipfile.ZipFile("class_data_examples.zip", "r") as zf:
    # zf.printdir()
    zf.extractall(path=path)
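
Before extracting, it can help to confirm which directories the archive will create. The following is a minimal sketch using only the standard library; the helper name `extract_archive` is hypothetical and not part of the training material. It collects the top-level entries of the ZIP file, extracts everything into the destination directory, and returns the entry names.

```python
import os
import zipfile


def extract_archive(zip_path, dest):
    """Extract zip_path into dest; return the sorted top-level entry names."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        # Member names in a ZIP file always use "/" as the separator,
        # so the first component is the top-level file or directory.
        top_level = sorted({name.split("/")[0] for name in zf.namelist()})
        zf.extractall(path=dest)
    return top_level
```

For this lesson, `extract_archive("class_data_examples.zip", path)` would be expected to report `class_data_examples` among the top-level entries, since that is the directory used in the following steps.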

## Prepare databroker configuration

Prepare the databroker configuration (`.yml`) file using a command from [databroker-pack](https://blueskyproject.io/databroker-pack/usage.html#unpacking-a-packed-catalog).  There are two choices: unpack a few runs into a local directory (`inplace`) or unpack many runs into a MongoDB database (`mongo_normalized`).  Since we have fewer than 100 files, we choose the `inplace` installation.

In [4]:
!databroker-unpack inplace /home/apsu/data/class_data_examples class_data_examples

Placed configuration file at /home/apsu/.local/share/intake/databroker_unpack_class_data_examples.yml


Show the contents of this configuration.

In [5]:
!cat /home/apsu/.local/share/intake/databroker_unpack_class_data_examples.yml

sources:
  class_data_examples:
    args:
      paths:
      - /home/apsu/data/class_data_examples/documents/*.msgpack
      root_map:
        a1a737180dacd6b068e429c62e66ab2d: /home/apsu/data/class_data_examples/external_files/a1a737180dacd6b068e429c62e66ab2d
    driver: bluesky-msgpack-catalog
    metadata:
      generated_by:
        library: databroker_pack
        version: 0.3.0
      relative_paths:
      - ./documents/*.msgpack
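
The configuration only works if the `paths` glob matches the extracted document files and each `root_map` target directory exists. A minimal sketch of that check, using only the standard library, follows. The function name `check_catalog_args` is hypothetical; the dictionary passed to it is copied by hand from the `args` section shown above and would need to be adjusted if the data were unpacked elsewhere.

```python
import glob
import os


def check_catalog_args(args):
    """Return (count of files matched by args['paths'], missing root_map dirs)."""
    n_files = sum(len(glob.glob(pattern)) for pattern in args.get("paths", []))
    missing = [
        target
        for target in args.get("root_map", {}).values()
        if not os.path.isdir(target)
    ]
    return n_files, missing


# Values copied from the configuration file shown above.
base = "/home/apsu/data/class_data_examples"
key = "a1a737180dacd6b068e429c62e66ab2d"
args = {
    "paths": [f"{base}/documents/*.msgpack"],
    "root_map": {key: f"{base}/external_files/{key}"},
}
n_files, missing = check_catalog_args(args)
print(f"{n_files = }, {missing = }")
```

A file count of zero, or a non-empty `missing` list, suggests the archive was extracted somewhere other than where the configuration points.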


## Test it with databroker

Test that the configuration exists and is readable by *databroker*.  First the import.

In [6]:
import databroker

Show the available catalog names:

In [7]:
list(databroker.catalog)

['class_data_examples', 'training']

Pick the new catalog, `class_data_examples`, and show how many runs it contains.

In [8]:
cat = databroker.catalog["class_data_examples"]
print(f"{len(cat) = }")

len(cat) = 59


Show information about this catalog.

In [11]:
cat.metadata

{'generated_by': {'library': 'databroker_pack', 'version': '0.3.0'},
 'relative_paths': ['./documents/*.msgpack'],
 'catalog_dir': '/home/apsu/.local/share/intake/'}

In [12]:
cat.describe()

{'name': 'class_data_examples',
 'container': 'catalog',
 'plugin': ['bluesky-msgpack-catalog'],
 'driver': ['bluesky-msgpack-catalog'],
 'description': '',
 'direct_access': 'forbid',
 'user_parameters': [],
 'metadata': {'generated_by': {'library': 'databroker_pack',
   'version': '0.3.0'},
  'relative_paths': ['./documents/*.msgpack']},
 'args': {'paths': ['/home/apsu/data/class_data_examples/documents/*.msgpack'],
  'root_map': {'a1a737180dacd6b068e429c62e66ab2d': '/home/apsu/data/class_data_examples/external_files/a1a737180dacd6b068e429c62e66ab2d'}}}

Use the `listruns()` function from *apstools* to get a listing of the runs.

In [9]:
from apstools.utils import listruns
listruns(db=cat, num=59)

catalog: class_data_examples


|   | scan_id | time | plan_name | detectors |
|---|---------|------|-----------|-----------|
| 0 | 90 | 2021-03-06 14:16:41 | scan | [noisy] |
| 1 | 89 | 2021-03-06 14:15:35 | scan | [noisy] |
| 2 | 88 | 2021-03-06 14:14:45 | scan | [noisy] |
| 3 | 87 | 2021-03-06 14:13:44 | scan | [noisy] |
| 4 | 86 | 2021-03-06 14:10:46 | rel_scan | [noisy] |
| 5 | 85 | 2021-03-06 14:10:43 | rel_scan | [noisy] |
| 6 | 84 | 2021-03-06 14:10:37 | rel_scan | [noisy] |
| 7 | 83 | 2021-03-06 14:10:19 | rel_scan | [noisy] |
| 8 | 82 | 2021-03-03 10:01:32 | count | [adsimdet] |
| 9 | 81 | 2021-03-03 09:50:41 | count | [adsimdet] |
