# Getting started using data collected at CSX

## Notebooks created by CSX beamline staff
## Trust only the [source](https://github.com/ambarb/csx_primer_notebooks/blob/main/CSX_data_what_is_what.ipynb) of this notebook
Report issues with this notebook on [github](https://github.com/ambarb/csx_primer_notebooks/issues) (aside from github rendering)

```python
import databroker
databroker.__version__


'2.0.0b23'
```

## What is what
### essentially the name of things in `bluesky` and how to retrieve specific data using `databroker`
* what is the key name for the thing want to plot or look up
    * the **theta motor** can be extracted using `tardis_theta` within the `databroker` API
* whether we recommend you plot/use readback or setpoint
    * the beamline **energy motor** has a noisy readback so **ONLY use the setpoint** 

| Beamline Componenet | bluesky root name | standard databroker field to access for plotting| movement details |
|--|--|--|--|
|**sample positions**   |   |  | |
| sample x | `sx` | `sx` | parallel to sample surface, transverse to incident beam if tardis.theta = 0, + inboard|
| sample y | `sy` | `sy_readback`| parallel to sample surface, parallel to incident beam if tardis.theta = 0, + away from source|
| sample z | `sz` | `sz_readback`| vertical to sample surface, vertically to if tardis.theta = 0, + "up"|
|   |   |  | |
|   |   |  | |
|**x-ray light control**   |   |  | |
| incident wavelength/energy  | `pgm.energy`  | `pgm_energy_setpoint` | `_readback` is also available, but performance is better indicated by `_setpoint` |
| x-ray polarization  | `epu2.phase`  | `epu2_phase_readback` | horizontal linear = 0.0, veritcal linear = 24.6, circular - see beamline staff|
|   |   |  | |
|   |   |  | |
|**diffractometer positions**   |   |  | |
| theta  | `tardis.theta`  | `tardis_theta` | incident angle for six circle geometry  |
| delta  | `tardis.delta`  | `tardis_delta` | detector arm position in "vertical plane" for six circle geometry when gamma = 0 |
| gamma  | `tardis.gamma`  | `tardis_gama` | detector arm position in "horizontal plane" for six circle geometry when delta = 0 |
|  H | `tardis.h`  | `tardis.h` | calculated each time, so not really a readback/setpoint |
|  K | `tardis.k`  | `tardis.k` | calculated each time, so not really a readback/setpoint|
|  L | `tardis.l`  | `tardis.l` | calculated each time, so not really a readback/setpoint|
|  UB | `tardis.UB`  | `tardis_UB` in `descriptors` | other API's to convert diffractometer angles per pixel to HKL per pixel|
|  lattice |   | `sample` dictionary in `start` document | other API's to convert diffractometer angles per pixel to HKL per pixel|
|  E_offset |   |  | manual offset of beamline energy to "correct" or calibrate incident energy |
|   |   |  | |
|   |   |  | |
|**sample environment - temperature**   |   |  | |
| sample control temperature  | `stemp.temp.A.T`  | `stemp_temp_A_T` | heater is close to this Si diode inside the cryostat (not in tardis vacuum space)|
| sample temperature  | `stemp.temp.B.T`  | `stemp_temp_B_T` | Si diode is close to sample |
| sample setpoint  | `stemp.ctrl1`  | _ususally not recorded in datastore_ | can be moved like a motor|
|   |   |  | |
|   |   |  | |
|**in-chamber optics**   |   |  | |
|  <u>prior to 06-14-2023</u> |   |  | |
|  pinhole/OSA x |  `nanop.bx` | `nanop_bx_user_setpoint` | horizontal (+ outboard, before sample); OSA = typically 0; 5,8,10um = ~6 or 7|
| pinhole/OSA y |  `nanop.by` |  `nanop_by_user_setpoint` | parallel to beam (+ closer to sample, before sample)|
|  pinhole/OSA z  |  `nanop.bz` | `nanop_bz_user_setpoint`  | vertical (+ up, before sample, sample at 0); in beam = 0 +/- .05|
|  ZP x |  `nanop.bx` | `nanop_bx_user_setpoint` | horizontal (+ outboard, before sample), typically near 0|
|  ZP y |  `nanop.by` |  `nanop_by_user_setpoint` | parallel to beam (+ closer to sample, before sample, sample at 0)|
|  ZP z |  `nanop.bz` | `nanop_bz_user_setpoint`  | vertical  (+ up, before sample); in beam = 0 +/- .05|
|   |   |  | |
|  <u>after 06-14-2023</u> |   |  | |
|  optic/OSA x |  `nanop.bx` | `nanop_bx_user_setpoint` | horizontal (+ outboard, before sample); OSA = typically 0 / optic ~ 2.5|
|  optic/OSA y |  `nanop.by` |  `nanop_by_user_setpoint` | parallel to beam (+ closer to sample, before sample)|
|  optic/OSA z  |  `nanop.bz` | `nanop_bz_user_setpoint`  | vertical (+ up, before sample, sample at 0); in beam = 0 +/- .1 / optic ~ -2 |
|  pinhole/ZP x |  `nanop.bx` | `nanop_bx_user_setpoint` | horizontal (+ outboard, before sample), typically near 0 / pinhole 5,8,10um ~ 2.5|
|  pinhole/ZP y |  `nanop.by` |  `nanop_by_user_setpoint` | parallel to beam (+ closer to sample, before sample, sample at 0)|
|  pinhole/ZP z |  `nanop.bz` | `nanop_bz_user_setpoint`  | vertical  (+ up, before sample); in beam = 0 +/- .1 / pinhole 5,8,10um ~ -2 |
|   |   |  | |
|   |   |  | |
|**exit-slit size**   |   |  | |
|  <u>prior to 09-21-2022</u> |   |  | |
| exitslit horz gap  |  `slt3.x` | `slt3_x` | 2mm hole = -15.00 ; 50um = - 6.08 ; 20um =   2.65 ; 10um =  11.27  |
| exitslit vert gap   | `slt3.y`  | `slt3_y` |2mm hole = - 0.30 ; 50um = - 0.25 ; 20um = - 0.25 ; 10um = - 0.10  |
|   |   |  | |
|  <u>after 09-21-2022</u> |   |  | |
| exitslit horz gap  |  `slt3.x` | `slt3_x` | 2mm hole = - 8.52 ; 50um =   0.00 ; 20um =   8.74 ; 10um =  17.38  |
| exitslit vert gap   | `slt3.y`  | `slt3_y` |2mm hole = - 0.05 ; 50um =   0.00 ; 20um =   0.02 ; 10um =   0.20  |

### Generally
**in-chamber optics** are **"out"** if the **z** value is **> 3.5 mm**

## Configure your notebook for data access and plot control
This notebook will show you how to access the csx datastore using the databroker API.  Currently, we use what the databroker project refers to as version 1 for the access methodology. 
https://nsls-ii.github.io/databroker/ 
* See Version 1 interface and its tutorial. https://nsls-ii.github.io/databroker/v1/tutorial.html 

In [1]:
from databroker import Broker
db = Broker.named('csx')

%matplotlib inline 
###use non-interactive plots
# %matplotlib widget 
###use interactive plots

OBJECT CACHE: Will use up to 121_499_065_958 bytes (15% of total physical RAM)


You may control the plotting method at anytime.  If you are new to jupyter notebooks, then you may see that memory error notifications appear over time when using `widget` if you are not managing the plots appropriately.  You can clear this error by briefly switching from `widget` to `inline` and then back to widget.  There is no need to restart your kernel.

Please note the usage of `'csx'` in the definition of `db`. Once a central datastore exists, you can simultaenously access data collected at other beamlines in the same notebook. This is partially the case on our facility jupyterhub.

## Accessing information
* `scan_id` are usually utilized by most users as this is a sequential number that has some context to the experiment
* `uid` or the **unique id** may also be used.  At minimum, one will need to use the first 6 characters
* All scan data is returned as an object that the API referes to as `header`

In [2]:
header = db[150959]
print(f'{header["start"]["uid"]} \n\tOR')
print(f'{header.start["uid"]} \n\tOR')

header = db['851a80bc']
print(f'{header.start["scan_id"]}')

851a80bc-c5cc-49fb-8c2d-1fd9e7c470f7 
	OR
851a80bc-c5cc-49fb-8c2d-1fd9e7c470f7 
	OR
150959


## Data format
The `header` consists of 3 documents that contains information about the scan (aka metadata) https://nsls-ii.github.io/bluesky/documents.html
* `start` 
    * "default" information that is user/beamline/experiment dependant
    * "default" information that is scan (or plan) dependant https://nsls-ii.github.io/bluesky/plans.html
    * optional information user configured metadata "on-the-fly" at collection time 
* `stop`
    * "default" information about the data collected 
    * `list(num_events)` contains the **steam names** that can be called by the databroker `table` function
    * `exit_status` can be used to filter "good" from "bad" data during collection but this is not infallible 
    * `reason` is populated if scan encounters a problem or the user prematurely stops the scan
* `descriptors`
    * "default" information about the motors and detectors used during the scan.  items like exposure time will be found here

In [3]:
h = db[150959]
h.start


0,1
beamline_id,CSX
detectors,['dif_beam']
fccd_intersection,"[440, 431]"
group,CSX Team
hints,"dimensions [[['sz_readback'], 'primary']]"
motors,['sz']
nanop_setup,"ANT_MIT ZP, 50 mum OSA and 10 um pinhole"
num_intervals,20
num_points,21
plan_args,"args [""SamplePosVirtualMotor(prefix='XF:23ID1-ES{Dif-Ax:SZ}', name='sz', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=[], limits=None, egu='')"", -0.2, 0.2]  detectors [""StandardCam(prefix='XF:23ID1-ES{Dif-Cam:Beam}', name='dif_beam', read_attrs=['stats1', 'stats1.total', 'stats2', 'stats2.total', 'stats3', 'stats3.total', 'stats4', 'stats4.total', 'stats5', 'stats5.total'], configuration_attrs=['cam', 'cam.acquire_period', 'cam.acquire_time', 'cam.image_mode', 'cam.manufacturer', 'cam.model', 'cam.num_exposures', 'cam.num_images', 'cam.trigger_mode', 'stats1', 'stats2', 'stats3', 'stats4', 'stats5'])""]  num 21  per_step None"

0,1
dimensions,"[[['sz_readback'], 'primary']]"

0,1
args,"[""SamplePosVirtualMotor(prefix='XF:23ID1-ES{Dif-Ax:SZ}', name='sz', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=[], limits=None, egu='')"", -0.2, 0.2]"
detectors,"[""StandardCam(prefix='XF:23ID1-ES{Dif-Cam:Beam}', name='dif_beam', read_attrs=['stats1', 'stats1.total', 'stats2', 'stats2.total', 'stats3', 'stats3.total', 'stats4', 'stats4.total', 'stats5', 'stats5.total'], configuration_attrs=['cam', 'cam.acquire_period', 'cam.acquire_time', 'cam.image_mode', 'cam.manufacturer', 'cam.model', 'cam.num_exposures', 'cam.num_images', 'cam.trigger_mode', 'stats1', 'stats2', 'stats3', 'stats4', 'stats5'])""]"
num,21
per_step,

0,1
args,"[""SamplePosVirtualMotor(prefix='XF:23ID1-ES{Dif-Ax:SZ}', name='sz', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=[], limits=None, egu='')"", -0.2, 0.2]"
num,21

0,1
composition,razor
type,laser cryo exp

0,1
bluesky,1.6.3
ophyd,1.5.1


In [4]:
h = db[150959]
h.stop


0,1
exit_status,success
num_events,baseline 2  primary 21
reason,
run_start,851a80bc-c5cc-49fb-8c2d-1fd9e7c470f7
time,"1 year, 6 months ago (2021-12-17T14:24:58.356668)"
uid,4fc79d8f-2edc-43de-97cf-f13dfd397c4b

0,1
baseline,2
primary,21


In [5]:
print(f'The stream names available for this scan are:\n\t {list(h["stop"]["num_events"])}')

The stream names available for this scan are:
	 ['baseline', 'primary']


## Access recorded data "during the scan"
* The **primary** stream contains the tabulated data recorded during a typical scan
* databroker returns a pandas dataframe that can be thought of as an excel sheet or table in a .csv file
* The number of points can be confirmed in the **start document** (`h.start.num_points`)
* XPCS scans, dark images, flatfield images have 1 "point" with muliple images per point
* "motor" scans usualy have more than 1 point and the returned table data contains the "motor positions" and "detector values"
* `list(table)` to see what objects can be plotted / analyzed.  
* Unlike spec, anything that is not being "scanned" as a motor argument must be explicity added to the detector list in bluesky 
* At CSX, we use mostly the [pre-assembled bluesky plans](https://blueskyproject.io/bluesky/plans.html)
* Depending on the motor, you may want to plot the readback or the setpoint.  Read further for more details.

In [6]:
table = h.table("primary")
print(len(table))
print(h.start["num_points"])


21
21


In [7]:
print(h.start["detectors"])
print(h.start["motors"])

['dif_beam']
['sz']


In [8]:
table

Unnamed: 0_level_0,dif_beam_stats1_total,dif_beam_stats2_total,dif_beam_stats3_total,dif_beam_stats4_total,dif_beam_stats5_total,sz_readback,sz_setpoint,time
seq_num,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
1,27298461.0,9100658.0,44215721.0,18221.0,615101511.0,0.6,0.6,2021-12-17 19:23:39.437476608
2,26805433.0,8748789.0,42954021.0,18275.0,612776771.0,0.62,0.62,2021-12-17 19:23:43.418611200
3,26240236.0,8387524.0,41582186.0,18269.0,610226687.0,0.64,0.64,2021-12-17 19:23:47.320774912
4,25737975.0,8061553.0,40291398.0,18225.0,607703344.0,0.66,0.66,2021-12-17 19:23:51.169652736
5,25106213.0,7702543.0,38839788.0,18380.0,604735292.0,0.68,0.68,2021-12-17 19:23:55.066531072
6,24401410.0,7360982.0,37341224.0,18288.0,601467844.0,0.7,0.7,2021-12-17 19:23:59.070247168
7,23635620.0,7013856.0,35776319.0,18233.0,598168688.0,0.72,0.72,2021-12-17 19:24:03.058582784
8,22882576.0,6695959.0,34307025.0,18102.0,594579817.0,0.74,0.74,2021-12-17 19:24:07.002060288
9,22109278.0,6383190.0,32826426.0,18051.0,591362215.0,0.76,0.76,2021-12-17 19:24:10.939093760
10,21220573.0,6074828.0,31315546.0,18067.0,587316636.0,0.78,0.78,2021-12-17 19:24:14.799731712


## Access before / after scan "snapshot" of beamline components
* the **baseline** stream contains the before `[1]` and after `[2]`
* use pandas internal functions 
* .mean() in case of noisy readings or if you expect the value to not change

In [9]:
tbl_b = h.table("baseline")
print(f'{tbl_b["pgm_energy_setpoint"][1]:.2f} is the beamline energy setpoint BEFORE the scan ')
print(f'{tbl_b["pgm_energy_setpoint"][2]:.2f} is the beamline energy setpoint AFTER  the scan ')
print()
print(f'{tbl_b["stemp_temp_A_T"].mean():.2f} is the average cryostat temperature or "control" temperature')
print(f'{tbl_b["stemp_temp_B_T"].mean():.2f} is the average sample')
print(f'{tbl_b["stemp_temp_A_T"].diff()[2]:.2f} is the delta control temperature between start and stop of scan')
print(f'{tbl_b["stemp_temp_A_T"].diff()[2]:.2f} is the delta sample temperature between start and stop of scan')
print()
print(f'{tbl_b["epu1_phase_setpoint"].mean():.2f} is the phase of the undulator that control x-ray light polarization')

709.00 is the beamline energy setpoint BEFORE the scan 
709.00 is the beamline energy setpoint AFTER  the scan 

50.25 is the average cryostat temperature or "control" temperature
30.17 is the average sample
-11.64 is the delta control temperature between start and stop of scan
-11.64 is the delta sample temperature between start and stop of scan

0.00 is the phase of the undulator that control x-ray light polarization


In [10]:
#list(tbl_b)

## Access descriptors configuration information
* There are two ways: 
    * explicit access:
        * 0 is used in `h.descriptors[0]` to access the descriptors associated iwth the `primary` stream. 
        * 1 is the `baseline` stream.
    * helper function access:
        * `h.descriptors[0].get_config()`
        * See Configuration Utilities: https://nsls-ii.github.io/databroker/v1/api.html#configuration-utilities 
        * Not avaliable yet with version of databroker on CSX servers.
* if you cannot remember on the fly, repeatedly use the `list` function until you find what you need.



In [11]:
list(h.descriptors[0])

['run_start',
 'time',
 'data_keys',
 'uid',
 'configuration',
 'name',
 'hints',
 'object_keys']

In [12]:
h = db[126365]
fccd_setup = h.descriptors[0]["configuration"]["fccd"]["data"]
print(f'{fccd_setup["fccd_cam_acquire_period"]} is the time between frames in seconds')

0.3375 is the time between frames in seconds


## Using recorded metadata for the CSX FastCCD (fccd) 
* `list` will show extensitve information.  
* we will highlight the most important information
    * The raw images are recorded but we can provide information on how to concatentate your image if you don't have it
    * The roi positions are referenced to the image shown at the beamline.  
    * roi5 is the entire concatanated image
    * For most data, the gain is set to `0` or **autogain**
        * which is the most sensitive setting unless 
        * in **autogain**, the fccd's internal software will decrease the gain per pixel based on counting statistics
        * note that the **"stats"** in the **primary stream** are NOT gain corrected
    * Currently, the "zero" of delta and gamma on the fccd are defined in the **start document**.  This is a reasonable approxiation.
    * The `inout` device should be inserted to create dark images.  It's stutus is found in the **baseline** data stream.

In [13]:
#list(fccd_setup)

In [14]:
print(f'{fccd_setup["fccd_cam_acquire_period"]:^6} is the time between consecutive frames for a single "point"')
print(f'{fccd_setup["fccd_cam_acquire_time"]:^6} is the time between the detector is integrating photons for a single exposure')
print(f'{fccd_setup["fccd_cam_num_images"]:^6} is the number of frames collected for a single "point"')
print()
#print(f'{fccd_setup["fccd_roi1_name_"]:^6} is name of roi1 (user configured, often empty)')
#print(f'{fccd_setup["fccd_roi1_size_x"]:^6} is horizontal (x) size  of roi1')
#print(f'{fccd_setup["fccd_roi1_size_y"]:^6} is vertical (y) size  of roi1')
#print(f'{fccd_setup["fccd_roi1_min_xyz_min_x"]:^6} is horizontal (x) start of roi1')
#print(f'{fccd_setup["fccd_roi1_min_xyz_min_y"]:^6} is vertical (y) start of roi1')
#print()
#print(f'{fccd_setup["fccd_cam_fcric_gain"]:^6} is the gain setting of the FCCD')
print()
print(f'{h["start"]["fccd_intersection"]}  is the approximate intersection of the direct beam at delta, gamma = 0, 0 = x, y')
print()
print(f'{h.table("baseline")["inout_status"][1]:^10} is the status of the beam-block to control dark or light images')

0.3375 is the time between consecutive frames for a single "point"
0.2575 is the time between the detector is integrating photons for a single exposure
 200   is the number of frames collected for a single "point"


[440, 431]  is the approximate intersection of the direct beam at delta, gamma = 0, 0 = x, y

Not Inserted is the status of the beam-block to control dark or light images


### If you have HKL setup,  you can access additional information

```python
header.descriptors[1]["configuration"]["tardis"]["data"]["tardis_UB"]

for i, lat_par in enumerate(["a","b", "c", chr(945), chr(946), chr(947)]):
    print(f'{lat_par} = {header.start["sample"]["lattice"][i]}')
    
header.start["sample"]["lattice"]
```