# Tiled Python Client Demonstration

This notebook demonstrates Python clients accessing a tiled data server running on `localhost`.  The server provides two databroker catalogs: `bdp2022` and `20idb_usaxs`.

Show two types of Python client:

- The `requests` package (a third-party HTTP library, not part of the Python Standard Library).
- The `tiled` package from the Bluesky Framework.

For each type of client, show some specific queries and responses.

* [x] Find all runs in a catalog between these two ISO8601 dates.
* [x] Find run(s) which match given metadata.
* [x] Get overall metadata from given run.
* [x] What are the data streams in this run?
* [x] What is the metadata for this stream?
* [ ] Get the data from the data stream named primary (the canonical main data).
* [ ] What about loose matches?  Maybe not now.  Might require some deeper expertise.

## Client using `requests` package

Using the `requests` package, query the tiled server's HTTP API by assembling a URI.  The tiled server responds and we return the response body decoded from JSON.  We'll let Python report any exceptions that might occur.

In [53]:
import requests

def requests_tiled(server, catalog, api="/api/v1/node/search", suffix="", port=8000):
    return requests.get(f"http://{server}:{port}{api}/{catalog}{suffix}").json()
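When a query misbehaves, it can help to look at the URI before sending it.  The sketch below assembles the same string as `requests_tiled()` without contacting the server; `tiled_url()` is a hypothetical helper introduced here for illustration, not part of `tiled` or `requests`.

```python
def tiled_url(server, catalog, api="/api/v1/node/search", suffix="", port=8000):
    """Return the URI that requests_tiled() would fetch (hypothetical helper)."""
    return f"http://{server}:{port}{api}/{catalog}{suffix}"


# Inspect the URI for a search limited to one result.
url = tiled_url("localhost", "bdp2022", suffix="?page[limit]=1")
print(url)
```

If stricter error handling is wanted, one option is `response = requests.get(url, timeout=10)` followed by `response.raise_for_status()` before calling `response.json()`, so that HTTP errors surface as exceptions instead of as unexpected JSON.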

As a convenience, make a function that converts a date and time string in ISO-8601 format into the Unix epoch timestamp (floating-point seconds) needed for tiled's API.  A string without a UTC offset is interpreted in the local time zone of the computer running this code.

In [54]:
import datetime

def isotime_to_timestamp(isotime):
    return datetime.datetime.fromisoformat(isotime).timestamp()
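Because a naive ISO-8601 string is converted using the local time zone, the same text can give different timestamps on different machines.  The following is a minimal sketch of a variant that makes the zone explicit; the function name `isotime_to_timestamp_tz` and the choice of UTC as the default are assumptions for illustration only.

```python
import datetime


def isotime_to_timestamp_tz(isotime, tzinfo=datetime.timezone.utc):
    """Convert ISO-8601 text to epoch seconds.

    Text without a UTC offset is assumed to be in `tzinfo` (hypothetical
    default: UTC) rather than in the machine's local time zone.
    """
    dt = datetime.datetime.fromisoformat(isotime)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=tzinfo)
    return dt.timestamp()


utc_midnight = isotime_to_timestamp_tz("2022-05-01")
with_offset = isotime_to_timestamp_tz("2022-05-01T00:00:00-05:00")
print(utc_midnight, with_offset)
```

A string that already carries an offset (such as `-05:00`) keeps that offset; the `tzinfo` argument is only applied to naive input.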

We'll search the BDP project's databroker catalog, known to the tiled server (running on workstation `localhost`) by the text name `bdp2022`.

In [55]:
server = "localhost"
catalog = "bdp2022"

### Find runs within range of dates

Define the ends of the time span for the search query:

In [56]:
# Find all runs in a catalog between these two ISO8601 dates.
start_time = "2022-05-01"
end_time = "2022-11-01"
tz = "US/Central"

Using the `requests` package, ask the tiled server for all runs in the catalog that match the time range.

Here, we build up the URI suffix in parts to expose how the search query is constructed.  The response is a Python dictionary.  We won't print the entire dictionary here since it likely contains a lot of information, perhaps too much to show in full.

In [62]:
r = requests_tiled(
    server, catalog, suffix=(
        "?page[limit]=0"  # 0: find all matching runs
        f"&filter[time_range][condition][since]={isotime_to_timestamp(start_time)}"
        f"&filter[time_range][condition][until]={isotime_to_timestamp(end_time)}"
        f"&filter[time_range][condition][timezone]={tz}"
        "&sort=time"
    )
)
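The suffix above is a plain URL query string.  As an alternative sketch, the same string can be assembled from a dictionary with `urllib.parse.urlencode` from the standard library, which makes each filter term easier to see.  The helper name `time_range_suffix()` is hypothetical, and the timestamps are example values only.

```python
from urllib.parse import urlencode


def time_range_suffix(since, until, timezone, limit=0):
    """Build the query string for a time_range search (hypothetical helper)."""
    params = {
        "page[limit]": limit,  # 0: find all matching runs
        "filter[time_range][condition][since]": since,
        "filter[time_range][condition][until]": until,
        "filter[time_range][condition][timezone]": timezone,
        "sort": "time",
    }
    # Keep brackets and slashes unescaped so the result reads like the
    # hand-built suffix; other special characters are percent-encoded.
    return "?" + urlencode(params, safe="[]/")


suffix = time_range_suffix(1651363200.0, 1667260800.0, "US/Central")
print(suffix)
```

The result can be passed as the `suffix` argument of `requests_tiled()`.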

Summarize the results (in object `r`):

In [58]:
print(f'Search of {catalog=} has {len(r["data"])} runs.')
xref = dict(First=0, Last=-1)
for k, v in xref.items():
    md = r["data"][v]["attributes"]["metadata"]
    # md keys: start  stop  summary
    # summary key is composed by tiled server
    plan_name = md["summary"]["plan_name"]
    scan_id = md["summary"]["scan_id"]
    started = md["summary"]["datetime"]
    print(f"{k:5s} run: {started=} {scan_id=} {plan_name=}")

Search of catalog='bdp2022' has 397 runs.
First run: started='2022-05-03T08:37:21.510276' scan_id=1596 plan_name='take_image'
Last  run: started='2022-09-08T13:54:25.178280' scan_id=1960 plan_name='push_images'
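The same summary is printed after each search in this notebook, so it could be wrapped in a function.  The sketch below is one way to do that; `summarize_runs()` is a hypothetical helper, and the response dictionary is made up to mimic the nesting (`data` → `attributes` → `metadata` → `summary`) seen above.  It also returns an empty list when nothing matched, a case where indexing `r["data"][0]` would raise `IndexError`.

```python
def summarize_runs(response, labels=None):
    """Return one summary line per labelled run (hypothetical helper)."""
    labels = labels or {"First": 0, "Last": -1}
    runs = response["data"]
    if not runs:
        return []  # nothing matched the search
    lines = []
    for label, index in labels.items():
        summary = runs[index]["attributes"]["metadata"]["summary"]
        lines.append(
            f"{label:5s} run: started={summary['datetime']!r} "
            f"scan_id={summary['scan_id']!r} plan_name={summary['plan_name']!r}"
        )
    return lines


def _fake_run(started, scan_id, plan_name):
    """Build a made-up run with only the keys this sketch reads."""
    summary = {"datetime": started, "scan_id": scan_id, "plan_name": plan_name}
    return {"attributes": {"metadata": {"summary": summary}}}


fake_response = {
    "data": [
        _fake_run("2022-01-01T00:00:00", 1, "demo_a"),
        _fake_run("2022-01-02T00:00:00", 2, "demo_b"),
    ]
}
lines = summarize_runs(fake_response)
print("\n".join(lines))
```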


### Find runs matching a given plan name

Find run(s) which match some given metadata.  In this search, let's find all the runs that match a given `plan_name`.  Let's use the most recent `plan_name` from the previous results.

In [68]:
# Find run(s) which match given metadata: given plan_name
print(f"Search for {plan_name=}")
r = requests_tiled(
    server, catalog, suffix=(
        "?page[limit]=0"  # 0: all matching
        "&filter[eq][condition][key]=plan_name"
        f'&filter[eq][condition][value]="{plan_name}"'
        "&sort=time"
    )
)

Search for plan_name='push_images'
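Note the double quotes placed around the value in the query above.  Assuming the server reads the value as JSON text (an assumption based on that quoting, not verified here), `json.dumps` can produce the right form for both strings and numbers.  The helper `eq_filter()` is hypothetical.

```python
import json


def eq_filter(key, value):
    """Build the 'eq' filter terms, JSON-encoding the value (hypothetical helper)."""
    return (
        f"&filter[eq][condition][key]={key}"
        f"&filter[eq][condition][value]={json.dumps(value)}"
    )


by_plan = eq_filter("plan_name", "push_images")  # string: quoted
by_scan = eq_filter("scan_id", 1960)  # number: not quoted
print(by_plan)
print(by_scan)
```

Values containing characters such as `&` or spaces would additionally need percent-encoding, for example with `urllib.parse.quote`.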


In [61]:
print(f'Search of {catalog=} has {len(r["data"])} runs.')
xref = dict(First=0, Last=-1)
for k, v in xref.items():
    md = r["data"][v]["attributes"]["metadata"]
    # md keys: start  stop  summary
    # summary key is composed by tiled server
    plan_name = md["summary"]["plan_name"]
    scan_id = md["summary"]["scan_id"]
    started = md["summary"]["datetime"]
    print(f"{k:5s} run: {started=} {scan_id=} {plan_name=}")

Search of catalog='bdp2022' has 125 runs.
First run: started='2022-07-15T23:14:54.974411' scan_id=1 plan_name='push_images'
Last  run: started='2022-09-08T13:54:25.178280' scan_id=1960 plan_name='push_images'


### Show a run's metadata

Let's show the various metadata available from a Bluesky *run*.  We'll use the last run from the previous search.

In [64]:
run = r["data"][-1]  # most recent run from previous results

The `run` object is a dictionary.  The interesting keys are:

key | content
:--- | :---
`id` | `uid`, the unique identifier of this `run` (used by the database)
`attributes` | contents of this `run`

The `attributes` contents are a dictionary with these interesting keys (there are other keys, as well):

key | content
:--- | :---
`metadata` | metadata dictionary of this `run`

The `metadata` dictionary has these keys:

key | content
:--- | :---
`start` | Metadata created as the run started (includes user-supplied, scan-specific, facility-specific, and bluesky metadata).  The `start` dictionary keys will vary between runs and catalogs.  Only a few are always expected, including `uid`, `time`, and `versions`.
`stop` | Metadata about how the run ended (exit status, reason if there was a problem, stream names, end time stamp).
`summary` | Additional high-level summary composed by the tiled server, with the ISO8601 start date and the run duration.

Note: the run's data streams are obtained by a different query, using the run's `uid`.  Keep track of the `uid` for that reason.
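As a small illustration of the nesting described in the tables above, the sketch below pulls a few values out of a run dictionary.  The dictionary is made up and holds only a handful of keys, and `run_overview()` is a hypothetical helper.  Since `start` keys vary between runs, optional keys are read with `.get()`.

```python
def run_overview(run):
    """Collect a few values from a run dictionary (hypothetical helper)."""
    md = run["attributes"]["metadata"]
    stop = md.get("stop") or {}  # tolerate a missing or empty stop document
    return {
        "uid": run["id"],
        "time": md["start"]["time"],
        "plan_name": md["start"].get("plan_name"),  # user-supplied, may be absent
        "exit_status": stop.get("exit_status"),
    }


# Made-up run with the same layout as an entry of r["data"].
fake_run = {
    "id": "00000000-0000-0000-0000-000000000000",
    "attributes": {
        "metadata": {
            "start": {
                "uid": "00000000-0000-0000-0000-000000000000",
                "time": 1662663265.0,
                "plan_name": "push_images",
            },
            "stop": {"exit_status": "success", "num_events": {"primary": 1}},
            "summary": {"plan_name": "push_images", "scan_id": 1},
        }
    },
}
overview = run_overview(fake_run)
print(overview)
```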

To show the structure of this dictionary, evaluate the object and let Python display its value.

In [67]:
run["attributes"]["metadata"]

{'start': {'uid': 'ae762f9c-4933-4aa4-a720-147f4aaab6fd',
  'time': 1662663265.17828,
  'versions': {'apstools': '1.6.2',
   'bluesky': '1.8.3',
   'bluesky_queueserver': '0.0.15',
   'databroker': '1.2.5',
   'epics': '3.5.0',
   'h5py': '3.7.0',
   'matplotlib': '3.5.2',
   'numpy': '1.20.3',
   'ophyd': '1.6.4',
   'pyRestTable': '2020.0.3',
   'spec2nexus': '2021.2.1'},
  'databroker_catalog': 'bdp2022',
  'login_id': 'bdp@terrier.xray.aps.anl.gov',
  'beamline_id': 'BDP',
  'instrument_name': 'APS-U Beamline Data Pipelines project in 2022',
  'proposal_id': 'bdp2022',
  'milestone': 'BDP M6 demo',
  'pid': 13229,
  'scan_id': 1960,
  'plan_type': 'generator',
  'plan_name': 'push_images',
  'purpose': 'push TIFF files to PVaccess PV',
  'num_images': 10000,
  'frame_rate': 200.0,
  'run_time': 60.0,
  'datetime': '2022-09-08 13:54:24.802956',
  'client': 'DM/workflows/example-05/qserver_client.py',
  'session': 'M6 demo'},
 'stop': {'run_start': 'ae762f9c-4933-4aa4-a720-147f4aaab6

### Search for runs containing given text

It is possible to search a catalog for runs containing given text.  Here is one example searching for `M9`.  The `case_sensitive` option controls whether upper and lower case are distinguished; it is set to `True` here.

In [85]:
search_text = "M9"
case_sensitive = True
r = requests_tiled(
    server, catalog, suffix=(
        "?page[limit]=0"  # 0: all matching
        f"&filter[fulltext][condition][text]={search_text}"
        f"&filter[fulltext][condition][case_sensitive]={str(case_sensitive).lower()}"
        "&sort=time"
    )
)

In [86]:
print(f'Search of {catalog=} has {len(r["data"])} runs which contain "{search_text}".')
xref = dict(First=0, Last=-1)
for k, v in xref.items():
    md = r["data"][v]["attributes"]["metadata"]
    # md keys: start  stop  summary
    # summary key is composed by tiled server
    plan_name = md["summary"]["plan_name"]
    scan_id = md["summary"]["scan_id"]
    started = md["summary"]["datetime"]
    print(f"{k:5s} run: {started=} {scan_id=} {plan_name=}")

Search of catalog='bdp2022' has 75 runs which contain "M9".
First run: started='2022-11-11T01:34:27.938719' scan_id=1961 plan_name='m9_push_images'
Last  run: started='2022-11-23T11:17:32.495794' scan_id=2035 plan_name='m9_push_images'


### What data streams are available with this run?

Use the last run from the previous search.  The stream names are in the `stop` metadata, where the number of data events is shown for each stream.

In [89]:
stop_md = r["data"][-1]["attributes"]["metadata"]["stop"]
streams = list(stop_md["num_events"].keys())
uid = stop_md["run_start"]
print(f'Run {uid=} has {streams=}')

Run uid=a1233634-1259-438f-b9f0-f77c26f48f54 has streams=['primary']
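The lookup above assumes the run has a `stop` document with a `num_events` dictionary.  The sketch below wraps it in a hypothetical helper, `stream_names()`, that returns an empty list when that information is absent (for example, if a run never finished; this is an assumption about such runs, not something observed in this catalog).  The run dictionaries are made up.

```python
def stream_names(run):
    """Return the stream names recorded in a run's stop metadata (hypothetical helper)."""
    stop_md = run["attributes"]["metadata"].get("stop") or {}
    return list(stop_md.get("num_events", {}))


# Made-up runs: one finished with two streams, one without a stop document.
finished = {
    "attributes": {
        "metadata": {"stop": {"num_events": {"baseline": 2, "primary": 10}}}
    }
}
unfinished = {"attributes": {"metadata": {"stop": None}}}

names = stream_names(finished)
print(names, stream_names(unfinished))
```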


### What is the metadata for the `primary` stream of this run?

In [109]:
stream_name = streams[0]
r = requests_tiled(
    server, catalog,
    api="/api/v1/node/metadata",
    suffix=(
        f"/{uid}"
        f"/{stream_name}"
    )
)

In [119]:
print(f'Run {uid=}, {stream_name=} has {len(r["data"])=} data descriptor attributes')
for i, descriptor in enumerate(r["data"]["attributes"]):
    print(len(descriptor), descriptor)
    # md = descriptor["attributes"]["metadata"]
    # print(f"descriptor {i} has {len(md)} metadata keys.")
raise RuntimeError("W-I-P at this point.")

Run uid='a1233634-1259-438f-b9f0-f77c26f48f54', stream_name='primary' has len(r["data"])=4 data descriptor attributes
9 ancestors
16 structure_family
5 specs
8 metadata
9 structure
7 sorting
10 references


RuntimeError: W-I-P at this point.

Show one of the non-trivial keys, `metadata`:

In [104]:
r["data"]["attributes"]["metadata"]

{'descriptors': [{'run_start': 'a1233634-1259-438f-b9f0-f77c26f48f54',
   'time': 1669223872.8681245,
   'data_keys': {},
   'uid': '43eb9fb2-7900-47aa-96a6-face7e7f5fca',
   'name': 'primary',
   'configuration': {'m9_flyer': {'data': {'m9_flyer_frame_rate': 1000.0,
      'm9_flyer_num_images': 12000,
      'm9_flyer_position_chunk_size': 100},
     'timestamps': {'m9_flyer_frame_rate': 1669223852.4870157,
      'm9_flyer_num_images': 1669223852.4887526,
      'm9_flyer_position_chunk_size': 1669223852.4916937},
     'data_keys': {'m9_flyer_frame_rate': {'source': 'PV:bdpgp:gp:float3',
       'dtype': 'number',
       'shape': [],
       'units': '',
       'lower_ctrl_limit': 0.0,
       'upper_ctrl_limit': 0.0,
       'precision': 4},
      'm9_flyer_num_images': {'source': 'PV:bdpgp:gp:int3',
       'dtype': 'integer',
       'shape': [],
       'units': '',
       'lower_ctrl_limit': 0,
       'upper_ctrl_limit': 0},
      'm9_flyer_position_chunk_size': {'source': 'PV:bdpgp:gp:in

### Get the data from the `primary` data stream

To get the data (the `primary` stream holds the canonical main data), we need to switch from the search endpoint used so far by default, `/api/v1/node/search`, to `/api/v1/node/full`, and specify the format of the result.  One format is `json`.

In [108]:
data_format = "json"
r = requests_tiled(
    server, catalog,
    api="/api/v1/node/full",  # request the full data, not a search
    suffix=(
        f"/{uid}"
        f"/{stream_name}"
        "/data"
        # f"?format={data_format}"
    )
)

In [None]:
for i, data in enumerate(r["data"]):
    print()

TBA: the tiled server currently raises this exception (for the area detector image):

```text
event_model.UndefinedAssetSpecification: "Resource document with uid c89a0a1b-7195-46fd-8d27-64f7d94e3cf7 refers to spec 'AD_HDF5' which is not defined in the Filler's handler registry."
```

## Client using `tiled` package

In [4]:
from tiled.client import from_uri
from tiled.client.cache import Cache
import tiled.queries
from tiled.utils import tree