# Nh5 Event Model

## Notes on building an event model based on HDF5, PyTables and pandas

In [1]:
from __future__ import print_function

### GATE event model

The GATE event model (EVM) can be exposed to Python via pyROOT. However:

1. Since the GATE objects were designed in C++, the objects exposed to Python are "C++ translations" performed by pyROOT. This comes at the cost of losing reflection/introspection: pyROOT objects can only be manipulated through their C++ interfaces, which often do not translate obviously to Python (maps are a good example).

2. GATE objects are stored using ROOT. Depending on your faith you may consider this a blessing or a curse. 

3. A description of GATE EVM is here:

http://next.ific.uv.es:8888/nextsw/dstbuilder/wikis/EventModel

## Nh5 Event Model

### Raw Data objects

#### Monte Carlo Raw Data (MCRD), True Waveforms (TWF), Raw Waveforms (RWF), Base Line Restored (BLR) waveforms and Corrected Waveforms (CWF)

1. For detailed studies of the Energy Plane front-end electronics (FEE), waveforms in photoelectrons (pes) with bins of 1 ns are needed. We refer to these as Monte Carlo Raw Data (MCRD). The PMT MCRD is the input to the Simulation of the Energy Plane Response (SIERPE), which convolves the MCRD with the response of the Energy Plane FEE and outputs True Waveforms, BLR Waveforms and Raw Waveforms.

2. There are also MCRD for the SiPMs, in bins of 1 µs. The SiPM MCRD is the input to the simulation of the SiPM noise, which in turn creates RWF for the SiPMs (also binned in 1 µs).

3. PMT TWF are simply MCRD which are: a) Zero Suppressed (ZS); b) binned at 1 µs (to save space and to match PMTs and SiPMs, e.g. for PMAPS).

4. PMT RWF are the output of the FEE + DAQ. They are obtained by convolving the MCRD with the low-pass filter (LPF) and high-pass filter (HPF) characteristic of the Energy Plane FEE. They are also decimated by the DAQ. The output of the FEE + DAQ is in ADC counts.

5. PMT BLR are also output of the FEE + DAQ, but applying only the LPF, i.e., they correspond to the response of an electronics that would not distort the WF, or to the output of a perfectly restored baseline (thus the name Base Line Restored, or BLR).

6. We save TWF, RWF and BLR waveforms for the PMTs. In the Monte Carlo data, the three sets allow detailed comparisons to quantify the effect of noise and peak-finding on the resolution (comparing TWF and BLR) and the effect of convolution/deconvolution (RWF are later deconvolved using a BLR algorithm to produce corrected waveforms, CWF, which can be compared with BLR waveforms).

7. Notice that the DETSIM/ART chain works directly with "perfect" ZS-BLR waveforms. The studies of GML, PF and others show very good energy resolution, but many critical effects are being skipped. 

8. In the data produced by the DAQ we will save BLR waveforms and RWF too. In real data, BLR waveforms are PMT waveforms whose baseline has been restored by the BLR algorithm in the FPGA, while RWF are the direct output of the DAQ.
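The relation between MCRD and TWF described in item 3 can be sketched with NumPy. This is only an illustration under stated assumptions: the function names `rebin` and `zero_suppress`, the toy waveform, and the choice of rebinning before zero suppression are all hypothetical and not taken from the actual DIOMIRA code.

```python
import numpy as np


def rebin(wf, factor=1000):
    """Sum consecutive groups of `factor` samples (1 ns -> 1 µs for factor=1000)."""
    n = (len(wf) // factor) * factor  # drop an incomplete trailing bin, if any
    return wf[:n].reshape(-1, factor).sum(axis=1)


def zero_suppress(wf, threshold=0):
    """Keep only the samples above `threshold`; return their indices and values."""
    idx = np.flatnonzero(wf > threshold)
    return idx, wf[idx]


# Toy MCRD waveform (hypothetical): 5 µs of 1 ns bins, in photoelectrons.
mcrd = np.zeros(5000, dtype=np.int32)
mcrd[1200:1210] = 2  # 10 samples of 2 pes inside the second microsecond
mcrd[3999] = 1       # a single pe in the last sample of the fourth microsecond

coarse = rebin(mcrd, factor=1000)       # one value per microsecond
times_mus, pes = zero_suppress(coarse)  # sparse (time, charge) representation

print(coarse.tolist())     # [0, 20, 0, 1, 0]
print(times_mus.tolist())  # [1, 3]
print(pes.tolist())        # [20, 1]
```

Summing (rather than averaging) the 1 ns samples keeps the total number of photoelectrons unchanged, which is what one would want if the TWF are later compared with BLR waveforms in terms of energy.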



### Structure of the Nh5 files and data flow

1. For detailed MC simulation of the EP one needs to produce MCRD files (using NEXUS). Waveforms with 1 ns bins are very large, so MCRD files weigh about 10 MB per event if not compressed. Compression is very effective here, however, since most of the WF contents are zeros: the actual size of the event is of the order of 336 kB.

2.1 DIOMIRA takes MCRD files as input and writes h5f files which contain:

    a. PMT RWF waveforms (SIERPE with HPF on) -- name of field: pmtrwf

    b. PMT BLR waveforms (SIERPE with HPF off) -- name of field: pmtblr

    c. SiPM RWF (noise simulation of the SiPMs) -- name of field: sipmrwf

    d. PMT TWF (MCRD for PMTs, ZS and rebinned at 1 µs) -- name of field: pmttwf

    e. SiPM TWF (MCRD for SiPMs, ZS, bins of 1 µs) -- name of field: sipmtwf

    f. In addition, DIOMIRA copies the true Monte Carlo tables (true particles and tracks) from the MCRD files, a table with the FEE parameters, and tables with geometry and calibration information for PMTs and SiPMs.

2.2 When pmtrwf, pmtblr and sipmrwf are saved as Int32, compression is also effective (many zeros can be taken away). Since pmtrwf and pmtblr are smaller than pmtrd, the size of the event is of the order of 500 kB.

2.3 The DAQ (real data) produces an h5f file that corresponds to the DIOMIRA output without the TWFs. That is:

    a. PMT RWF (BLR off in FPGA) -- name of field: pmtrwf

    b. PMT BLR (BLR on in FPGA) -- name of field: pmtblr

    c. SiPM RWF -- name of field: sipmrwf

2.4 Thus, the file produced by DIOMIRA is equivalent to the file produced by the DAQ, except for the additional tables related to true waveforms (and Monte Carlo tracks, etc.).
    
     
3. ISIDORA reads the h5f file produced by either DIOMIRA or the DAQ. The pmtrwf waveforms are then passed through the BLR algorithm to produce CWF (output field: pmtcwf), which are added to the file. The resulting file should weigh less than 1 MB per event.
    
4. DOROTEA reads the h5f file, produces PMAPS and saves them into an additional table (PMAP table).

5. ZAIRA uses PMAPS to produce a Krypton analysis.
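The compression argument made in items 1 and 2.2 can be checked qualitatively with a toy example. The sketch below uses `zlib` from the Python standard library as a stand-in for whatever compression filter the h5f files actually use (the notes do not say which one). The buffer length, hit occupancy, pedestal and noise values are hypothetical, chosen only to produce a mostly-zero waveform and a waveform of small ADC values stored as Int32.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the toy numbers are reproducible


def compressed_fraction(wf):
    """Compressed size divided by raw size for one waveform (zlib, level 4)."""
    raw = wf.tobytes()
    return len(zlib.compress(raw, 4)) / len(raw)


# Toy MCRD-like waveform: 1 ns bins, almost all zeros.
n_bins = 600_000  # hypothetical buffer length, not taken from the text
mcrd = np.zeros(n_bins, dtype=np.int32)
hits = rng.choice(n_bins, size=2_000, replace=False)  # hypothetical occupancy
mcrd[hits] = rng.integers(1, 5, size=hits.size)

# Toy RWF-like waveform: small ADC values stored as Int32, so the two upper
# bytes of every sample are zero (hypothetical pedestal and noise).
rwf = rng.normal(loc=2500, scale=2, size=24_000).round().astype(np.int32)

fractions = {"mcrd": compressed_fraction(mcrd), "rwf": compressed_fraction(rwf)}
for name, frac in fractions.items():
    print(f"{name}: compressed to {100 * frac:.1f}% of the raw size")
```

The sparse waveform shrinks to a small fraction of its raw size because long runs of zeros are cheap to encode. The Int32 waveform also shrinks, because each sample carries far fewer than 32 bits of information. The absolute numbers depend entirely on the toy parameters and should not be read as estimates of real event sizes.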