# smalldata_tools intro & links
smalldata_tools is a package of analysis tools built on MPIDataSource and the psana.Detector class.
It supports analysis in two modes: smalldata/event-based and cube/binned (the latter needs to be expanded for LCLS-2).
For the event-based mode, the two most important concepts for hdf5 production are the "DefaultDetector" and the "DetObject" for experiment-specific data reduction. These will be covered in these notebooks.
Documentation is available on [confluence](https://confluence.slac.stanford.edu/display/PSDM/smalldata_tools%3A+Analysis+tools+for+aligned+data).
We are also starting to build out [notebooks](https://confluence.slac.stanford.edu/display/PSDM/2.+Example+analysis+notebooks) to help with analyses of particular measurement types or advanced uses of the hdf5 files (summing of data over runs with proper error treatment, etc.).

This example demonstrates what the producer files used in the standard ARP-based smalldata production do. The actual producer files contain many more lines for a variety of convenience options in addition to the 'main' ideas shown here.
smalldata_tools is built on top of MPIDataSource, which allows us to write code in a simple loop, test it on a single core, and then run it with MPI on hundreds of cores without any code changes.

MPIDataSource will write out an hdf5 file with event-based (time-sorted for LCLS1!) and 'summed'/single datasets. Some of the standard data (e.g. GEM, BLD, event codes) will be included by default.

# Add default Detectors 
Now we use the first concept of smalldata_tools: expand the list of always-saved fields to include regularly used detectors whose event-based data is small. Where applicable, use the names colloquially used by the beamline scientists (ipm2, ...).
Translate technical event codes (which vary by hutch and possibly even by experiment) to standard binary flags: lightStatus/[xray,laser].
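The translation from event codes to binary flags can be pictured with a small standalone sketch. It is not the smalldata_tools implementation: the function name `light_status` and the code sets are hypothetical. Code 162 is the 'beamOff' code used later in this notebook, while the laser-off code 91 is an assumed value chosen only for illustration.

```python
# Hypothetical drop codes; the real ones vary by hutch and experiment.
XRAY_OFF_CODES = {162}  # 162 appears as 'beamOff' later in this notebook
LASER_OFF_CODES = {91}  # assumed value, for illustration only


def light_status(event_codes):
    """Return binary flags: 1 if the beam was present, 0 if a drop code fired."""
    codes = set(event_codes)
    return {
        'xray': int(not (codes & XRAY_OFF_CODES)),
        'laser': int(not (codes & LASER_OFF_CODES)),
    }


both_on = light_status([40, 140])       # no drop code present
xray_dropped = light_status([40, 162])  # X-ray drop code present
```

The point of the flags is that downstream analysis code can filter on lightStatus/xray and lightStatus/laser without knowing which event codes a given hutch uses.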

In [None]:
import numpy as np
import holoviews as hv
hv.extension('bokeh')
from tqdm import tqdm
import h5py as h5
from pathlib import Path
import tables
import sys

import psana as ps
sys.path.append('/sdf/group/lcls/ds/tools/smalldata_tools/latest')
from smalldata_tools.SmallDataUtils import defaultDetectors, detData

run = 320 
exp = 'xpptut15'

dsname = 'exp={}:run={}'.format(exp, run)  # only two placeholders, so only two arguments

ds = ps.MPIDataSource(dsname)

small_data = ds.small_data('./UserMtg24.h5', gather_interval=5)
default_dets = defaultDetectors(exp[:3])

max_evt = 5
ds.break_after(max_evt) # stop iteration after max_evt events (break statements do not work reliably with MPIDataSource).
for nevt,evt in enumerate(ds.events()):
    if nevt>=max_evt:
        break

    def_data = detData(default_dets, evt)
    
    small_data.event(def_data)
small_data.close()

dat_def = tables.open_file('./UserMtg24.h5').root
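The per-event data is a nested dictionary (detector name, then field name), and the same nesting shows up as group/dataset paths in the hdf5 file. The sketch below only illustrates that correspondence. The helper `flatten` is hypothetical and is not how MPIDataSource handles the data internally, and the values in `example` are made up.

```python
def flatten(data, prefix=''):
    """Flatten a nested dict into {'group/dataset': value} pairs."""
    flat = {}
    for key, value in data.items():
        path = f'{prefix}/{key}' if prefix else key
        if isinstance(value, dict):
            # Descend into the sub-dictionary, extending the path
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Made-up values shaped like the per-event dictionary
example = {'ipm2': {'sum': 1.2}, 'lightStatus': {'xray': 1, 'laser': 0}}
paths = flatten(example)
```

Here `paths` contains the keys 'ipm2/sum', 'lightStatus/xray' and 'lightStatus/laser', matching attribute access such as `dat_def.ipm2.sum` on the opened file.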

# Read the file
Print some of the values of the 'default' data saved in the hdf5 files.
lightStatus/[xray,laser] are available for all hutches; the typically used event codes are set up in smalldata_tools.
tt/ttCorr (etc.) are also available for all instruments where a timetool has been integrated.
ipm2/sum is i0 data measured by a common-component device. The names of the typically used devices differ hutch by hutch and may also change from experiment to experiment (though there typically are only 2-3 options).

In [None]:
print('laser:', dat_def.lightStatus.laser.read())
print('i0 (ipm2):',dat_def.ipm2.sum.read())
print('timetool:',dat_def.tt.ttCorr.read())
dat_def.ipm2.sum.shape
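A typical next step is to split the i0 values by light status. The following is a minimal pure-Python sketch with made-up numbers standing in for the arrays read above. The helper `mean_by_flag` is hypothetical and not part of smalldata_tools.

```python
# Made-up values standing in for dat_def.lightStatus.laser and dat_def.ipm2.sum
laser = [1, 0, 1, 1, 0]
i0 = [2.0, 1.0, 4.0, 3.0, 2.0]


def mean_by_flag(values, flags, wanted):
    """Average the values whose flag equals `wanted`; NaN if none match."""
    selected = [v for v, f in zip(values, flags) if f == wanted]
    return sum(selected) / len(selected) if selected else float('nan')


laser_on_mean = mean_by_flag(i0, laser, 1)   # averages 2.0, 4.0, 3.0
laser_off_mean = mean_by_flag(i0, laser, 0)  # averages 1.0, 2.0
```

With NumPy arrays, the same selection can be written as a boolean mask, e.g. `i0[laser == 1].mean()`.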

# Add the timetool waveform to the default detectors
There are a few default detectors that could be adjusted (e.g. giving the analog inputs special names).
Another option would be to apply a better calibration to the timetool (here, we take calibration runs to translate a pixel position into a ps measurement).
The most typically (though still rarely) needed detector of this type is the raw timetool trace. It is only needed if something has gone wrong in the online analysis; otherwise simply use the values in tt(/ttCorr).
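The pixel-to-ps calibration mentioned above can be sketched as a low-order polynomial applied to the fitted edge position. This assumes a polynomial form. The function name `pixel_to_ps` and the coefficients are hypothetical and would in practice come from a calibration run.

```python
def pixel_to_ps(pixel, coeffs):
    """Evaluate c0 + c1*pixel + c2*pixel**2 + ... using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * pixel + c
    return result


# Hypothetical coefficients (c0, c1, c2): offset in ps, ps per pixel, ps per pixel^2
calib = (-1.0, 0.002, 0.0)
delay_ps = pixel_to_ps(500, calib)  # close to 0 ps with these example coefficients
```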

This example also demonstrates how to retrieve the data to be stored in the hdf5 file, process it and store the new result as well. 

The ttRaw detector extracts the timetool traces as defined in the DAQ. In this example, we also add a new parameter 'myVar' to the 'ttRaw' dictionary (obviously a placeholder for more meaningful code...).

The timetool reprocessing code in smalldata_tools will do nothing until a core gets a reference; then it will run the fitting code on the ratio. We can add other code to do this here as well.
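To make the idea of working on the ratio concrete, here is a rough standalone sketch. It assumes the ratio is signal divided by reference, and it replaces the actual fit with a crude stand-in: picking the steepest change between neighbouring pixels. The function `edge_position` and the traces are hypothetical and do not reproduce the smalldata_tools fitting code.

```python
def edge_position(signal, reference):
    """Return (ratio, index) where index marks the steepest step in the ratio.

    Assumes equal-length traces with a nonzero reference in every pixel.
    """
    ratio = [s / r for s, r in zip(signal, reference)]
    # Absolute change between neighbouring pixels; a crude stand-in for a fit
    steps = [abs(b - a) for a, b in zip(ratio, ratio[1:])]
    return ratio, steps.index(max(steps))


# Made-up traces: the signal drops to 60% of the reference after pixel 3
reference = [10.0] * 8
signal = [10.0, 10.0, 10.0, 10.0, 6.0, 6.0, 6.0, 6.0]
ratio, edge = edge_position(signal, reference)
```

Dividing by the reference removes the shape of the unpumped spectrum, so the remaining step in the ratio is what a fit would be run on.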

tqdm gives you a rough idea of how fast the job will run (although the comparison is not really perfect)

In [None]:
from smalldata_tools.SmallDataDefaultDetector import ttRawDetector

ttRawDet = ttRawDetector(env=ds.env())
#ttRawDet.setPars({'beamOff':[-137], 'refitData': True, 'kind': 'stepUp'})
ttRawDet.setPars({'beamOff':[162]})  # old data before SXR beam sharing
default_dets.append(ttRawDet)

small_data = ds.small_data('./UserMtg_smd_deftt.h5', gather_interval=5)

max_evt = 5 #because this fails when writing the file. 
userDict = {}
for nevt,evt in tqdm(enumerate(ds.events())):
    if nevt>=max_evt:
        break

    def_data = detData(default_dets, evt)

    # retrieve data for further processing in this event loop
    ttRaw_wf = def_data['ttRaw']['tt_signal']
    
    #store the result you'd like to keep
    def_data['ttRaw']['myVar'] = np.nanmax(ttRaw_wf)
    
    small_data.event(def_data)
small_data.close()

dat_deftt = tables.open_file('./UserMtg_smd_deftt.h5').root

## Check default data - timetool trace as example

In [None]:
hv.Points(dat_deftt.ttRaw.tt_signal.read(-1).squeeze(), label='signal').options(width=800) *\
hv.Points(dat_deftt.ttRaw.tt_reference.read(-1).squeeze(),label='references').options(width=800)