# Nh5 Event Model

## Notes on building an event model based on HDF5, PyTables and pandas

In [1]:
from __future__ import print_function

### GATE event model

GATE event model (EM) can be exposed to python (via pyROOT). On the other hand:

1. Since the GATE objects were designed in C++, the objects exposed to python are "C++ translations" performed by pyROOT. This comes at the cost of losing reflection/introspection. pyROOT objects can only be manipulated through their C++ interfaces, which often do not translate obviously to python (maps are a good example).

2. GATE objects are stored using ROOT. Depending on your faith you may consider this a blessing or a curse. 

3. A description of the GATE EM is here:

http://next.ific.uv.es:8888/nextsw/dstbuilder/wikis/EventModel

## Nh5 Event Model

### Raw Data objects

#### Monte Carlo Raw Data (MCRD), True Waveforms (TWF), Raw Waveforms (RWF), and Corrected (restored) Waveforms (CWF)

1. For detailed studies of the Energy Plane FEE, waveforms (bins of 1 ns) in photoelectrons (pes) are needed. We refer to these as Monte Carlo Raw Data (MCRD). The MCRD is the input to the Simulation of the Energy Plane Response (SIERPE), which convolves the MCRD with the response of the Energy Plane FEE and outputs True Waveforms and Raw Waveforms.

2. There are also MCRD for the SiPMs, in bins of 1 mus. The SiPM MCRD is the input to the simulation of the SiPM noise, which in turn creates RWF for the SiPMs (also binned in 1 mus).

3. TWF are simply MCRD which are: a) Zero Suppressed (ZS); b) binned at 1 mus (to save space and to match PMTs and SiPMs, e.g., for PMAPS).

4. PMT RWF are the output of the FEE + DAQ. They are obtained by convolving the MCRD with the LPF and HPF characteristic of the Energy Plane FEE. They are also decimated by the DAQ. The output of the FEE + DAQ is in adc counts.

5. SiPM RWF are the output of the simulation of the SiPM, binned in 1 mus (as the input).

6. CWF are the output of the Digital Baseline Restoration (BLR) algorithm. The output of the BLR is also in adc counts.

7. BLR waveforms are PMT waveforms which have been corrected by BLR in the FPGA, or, alternatively, simulated without the effect of the HPF by SIERPE.
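
The relation between MCRD and TWF described in item 3 can be sketched with numpy. This is only a toy illustration: the helper names `rebin` and `zero_suppress`, the waveform contents and the order of the two operations (rebin first, then zero suppress) are assumptions of this sketch, not the actual code that produces the TWF.

```python
import numpy as np

def rebin(waveform, stride):
    """Sum consecutive groups of `stride` samples (an incomplete tail is dropped)."""
    n_bins = len(waveform) // stride
    return waveform[:n_bins * stride].reshape(n_bins, stride).sum(axis=1)

def zero_suppress(waveform):
    """Return (indices, values) of the non-zero bins only."""
    indices = np.flatnonzero(waveform)
    return indices, waveform[indices]

# Toy MCRD (hypothetical numbers): 5 mus of 1 ns bins, in pes, almost empty.
mcrd = np.zeros(5000, dtype=np.int32)
mcrd[1200:1210] = 3   # 10 ns with 3 pes per ns, inside the second 1 mus bin
mcrd[2999] = 1        # a single pes in the last ns of the third 1 mus bin

rebinned = rebin(mcrd, 1000)                   # 1 ns bins -> 1 mus bins
twf_index, twf_pes = zero_suppress(rebinned)   # keep only non-empty bins

print(rebinned)             # pes per 1 mus bin
print(twf_index, twf_pes)   # zero-suppressed representation
```

Rebinning by summation preserves the total number of pes, so the sum of `twf_pes` equals the sum of the original `mcrd`.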



http://localhost:8888/notebooks/Notebooks/SIERPE.ipynb

### Structure of the Nh5 files and data flow

1. For detailed MC simulation of the EP one needs to produce (using NEXUS) MCRD files. Waveforms in bins of 1 ns are very large (uncompressed MCRD files weigh about 10 MB per event), but compression is very effective here, since most of the WF contents are zeros. The actual size of a compressed event is of the order of 250 kB.

2. DIOMIRA takes MCRD files as input and writes h5f files which contain:
    a. PMT RWF waveforms (SIERPE with HPF on)
    b. PMT BLR waveforms (SIERPE with HPF off)
    c. SiPM RWF (noise simulation of the SiPMs)
    d. PMT TWF (MCRD for PMTs, ZS and rebinned at 1 mus)
    e. SiPM TWF (MCRD for SiPMs, ZS, bins of 1 mus)
    f. In addition, DIOMIRA copies the true Monte Carlo tables (true particles and tracks) from the MCRD files.
     
3. RWF are passed through DBLR, which produces corrected (baseline restored) waveforms. The resulting CWF are also stored in the DST file.
4. Notice that NEXUS can also produce CWF directly, corresponding to the case where DBLR is performed in the FPGA (in this case the DAQ also produces CWF).
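
The compression argument in item 1 can be checked on a toy event. The sketch below uses `zlib` from the standard library as a stand-in for the compression filters an HDF5 library would apply; the number of sensors, the waveform length and the pulse shapes are arbitrary assumptions, so the ratio it prints is illustrative and is not the figure quoted above.

```python
import zlib
import numpy as np

# Toy event (hypothetical sizes): 12 sensors x 100 mus of 1 ns bins, 16-bit integers.
n_sensors, n_bins = 12, 100000
event = np.zeros((n_sensors, n_bins), dtype=np.int16)
event[:, 10000:10500] = 2    # a short pulse
event[:, 50000:60000] = 5    # a longer pulse

raw = event.tobytes()             # uncompressed bytes
packed = zlib.compress(raw, 4)    # moderate compression level
ratio = len(raw) / len(packed)

# The compression is lossless: the original array is recovered exactly.
restored = np.frombuffer(zlib.decompress(packed), dtype=event.dtype).reshape(event.shape)

print(len(raw), len(packed), ratio)
```

Real waveforms fluctuate inside the pulses, so they compress less well than this idealised example; the point is only that long runs of zeros cost almost nothing once compressed.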

### Decimate or resum (rebin)

NEXUS + art produce WF in bins of 25 ns, while RWF, CWF and TWF are DECIMATED in bins of 25 ns from the original MCRD WF. The difference is relevant, since the DAQ also produces decimated WF. In practice it is very simple to convert a decimated WF into a "rebinned" WF: rebinned = decimated * 25. It is important to get the bookkeeping right. The total energy of a decimated WF is sum(WF) x 25 (25 ns is the decimation factor), while the total energy of a rebinned WF is just sum(WF).
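
The bookkeeping can be made concrete with a short numpy sketch. The function names and the toy waveform are assumptions made for illustration, and decimation is modelled here as plain down-sampling (one sample kept out of every 25), which may differ from what the DAQ does in detail. The relation rebinned = decimated * 25 holds exactly only when the waveform is constant inside each 25 ns bin, as in this example; for a real waveform it is an approximation.

```python
import numpy as np

DECIMATION = 25  # number of 1 ns samples per output bin

def decimate(waveform, factor=DECIMATION):
    """Keep one sample out of every `factor` (plain down-sampling)."""
    return waveform[::factor]

def rebin(waveform, factor=DECIMATION):
    """Sum every `factor` consecutive samples (an incomplete tail is dropped)."""
    n_bins = len(waveform) // factor
    return waveform[:n_bins * factor].reshape(n_bins, factor).sum(axis=1)

# Toy 1 ns waveform that is flat inside each 25 ns bin (hypothetical values).
levels = np.array([0, 2, 5, 3, 0, 0, 1, 0])
wf = np.repeat(levels, DECIMATION)   # 200 samples of 1 ns

decimated = decimate(wf)
rebinned = rebin(wf)

energy_decimated = decimated.sum() * DECIMATION   # sum(WF) x 25
energy_rebinned = rebinned.sum()                  # just sum(WF)

print(decimated, rebinned)
print(energy_decimated, energy_rebinned, wf.sum())
```

Both energy estimates agree with the sum of the original 1 ns waveform here, which is the consistency check to keep in mind when mixing decimated and rebinned WF.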