# Reducing R1 to DL1

This notebook outlines the process of obtaining the `*_dl1.h5` file, which contains the parameters extracted from the calibrated R1 waveforms.

The DL1 information is stored as a pandas.DataFrame in HDF5 format. A DataFrame is an object that acts as a table. It works with numpy functions and makes it easy to select rows by the value of a column. Learn about pandas.DataFrame at: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe

Each column in the table corresponds to a different parameter that characterises the waveform. Each row in the table corresponds to a different pixel or event.
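
To make the table layout concrete, here is a minimal sketch of a DataFrame with one row per event and pixel. The column names `event` and `pixel` and all values are made up for illustration and are not the actual DL1 schema; only `charge_cc` appears later in this tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table: 2 events x 3 pixels, one row per (event, pixel) pair.
# Column names and values are illustrative, not the real DL1 contents.
n_events, n_pixels = 2, 3
df = pd.DataFrame({
    "event": np.repeat(np.arange(n_events), n_pixels),  # event index
    "pixel": np.tile(np.arange(n_pixels), n_events),    # pixel index
    "charge_cc": [10.0, 12.0, 11.0, 50.0, 48.0, 52.0],  # made-up charges
})

# numpy functions operate directly on a column
mean_charge = np.mean(df["charge_cc"])

# Select all rows belonging to a single pixel
pixel_1 = df.loc[df["pixel"] == 1]

# Average the charge over pixels, separately for each event
charge_per_event = df.groupby("event")["charge_cc"].mean()

print(df)
print(charge_per_event)
```

The same selection and grouping patterns should carry over to a real DL1 table once it has been loaded, whatever its actual column names are.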

For this tutorial you need TargetDriver, TargetIO and TargetCalib installed.

## Setup

Prepare your machine and environment by following the instructions at: https://forge.in2p3.fr/projects/gct/wiki/Installing_CHEC_Software

If you will only be reading DL1 files and do not wish to install the TARGET libraries, you can skip this tutorial.

Check the installation was successful by running these lines:

In [None]:
import target_driver
import target_io
import target_calib
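
If one of the imports above fails, it can help to see which modules are missing in one pass. The following is a small sketch using only the standard library; the helper `find_missing` is a hypothetical name introduced here, not part of the TARGET software.

```python
import importlib.util

# Module names taken from the import cell above
required = ["target_driver", "target_io", "target_calib"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(required)
if missing:
    print("Missing TARGET modules:", ", ".join(missing))
else:
    print("All TARGET modules found")
```

This only checks that the modules can be located; a successful `import` remains the real test that the installation works.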

## Files

To run this tutorial you must download a reference dataset (using the username and password Rich has sent around in emails/Slack). The file required for this tutorial is a calibrated R1 file from a ~50 p.e. illumination run.

In [None]:
username = '***'
pw = '***'
r1_url = 'https://www.mpi-hd.mpg.de/personalhomes/white/checs/data/d0000_ReferenceData/Run17473_r1.tio'

In [None]:
!mkdir -p refdata
!wget --user $username --password $pw -P refdata $r1_url

In [None]:
r1_path = "refdata/Run17473_r1.tio"

## Data Reduction

Once you have the R1 (calibrated waveforms) file, you can extract charge and other parameters from the waveforms. This is where CHECLabPy comes into play. The extract_dl1.py script lets you specify a reduction method and produces an HDF5 file containing a table in which each column holds a parameter extracted per event and pixel.

In [None]:
!extract_dl1 -h

In [None]:
!extract_dl1 -f $r1_path

## Config File

As you can see from the output above, a default `WaveformReducerChain` was built from the `columns` of the different `WaveformReducers`. You can configure which columns are included in the DL1 file by specifying a path to a YAML config file with the `-c` option.

To generate an example config file, one can use the generate_dl1_config executable:

In [None]:
!generate_dl1_config -h

In [None]:
!generate_dl1_config

In [None]:
from CHECLabPy.data import get_file
config_path = get_file("extractor_config.yml")

In [None]:
!less $config_path

Every available column is listed in the config file, along with its docstring, its default setting, and the description of the `WaveformReducer` the column belongs to.

Let's create our own config file, which only activates a single column.

In [None]:
!echo "charge_cc: True" > $config_path
!cat $config_path

In [None]:
!extract_dl1 -f $r1_path -c $config_path

As you can see, only the single column was included in the `WaveformReducerChain`.
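
The same kind of config file can also be written from Python instead of with `echo`. The sketch below writes it to a separate, hypothetical path (`my_dl1_config.yml` in a temporary directory), so the file returned by `get_file` is left untouched. It assumes the config only needs simple `name: True` entries, as in the cell above.

```python
import os
import tempfile

# Hypothetical location for a custom config; any writable path would do
custom_config_path = os.path.join(tempfile.mkdtemp(), "my_dl1_config.yml")

# Columns to switch on; "charge_cc" is the column used earlier in this tutorial
columns = {"charge_cc": True}

# Write one "name: value" line per column
with open(custom_config_path, "w") as f:
    for name, enabled in columns.items():
        f.write(f"{name}: {enabled}\n")

# Show what was written
with open(custom_config_path) as f:
    print(f.read())
```

The resulting path could then be passed to `extract_dl1` with the `-c` option in place of `$config_path`.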