<div class="alert alert-block alert-warning">
<b>IMPORTANT:</b> Colored boxes are for editing instructions and must be removed before adding the new notebook to the benchmarking suite.
</div>

# Title - Analysis stage - Data Level

<div class="alert alert-block alert-info">
    <li> <b>Title example:</b> <em> Shower geometry reconstruction - TRAINING - DL2a </em> </li>
    <li> <b>path/filename:</b> <em> protopipe/benchmarks/notebooks/analysis-stage/benchmarks_DLXy_Title.ipynb </em> </li>
    <li> <b>Data level:</b> should be short and easy (e.g. image cleaning, direction reconstruction, etc.) </li>
    <li> <b>Analysis stage:</b> will in general depend on the analysis workflow, but some are already in use, such as <em> TRAINING, MODELS, DL2, DL3 </em> </li>
    <li> <b>Datasample names:</b> should reflect the particle type and the analysis stage as an integer, e.g. <em> gamma1 </em> or <em> proton2 </em> </li>
    <li> <b> Cell tags:</b> in principle these notebooks are made to show results, so make sure that input cells are removed or hidden and that the only headers that will appear under HTML are those with the name of the benchmarks (the results repository will perform this automatically) - go <a href="https://jupyterbook.org/interactive/hiding.html">here</a> for a list of cell tags to use </li>
    
</div>

**Recommended datasample(s):** `datasample name` (dataset used to XXX)

**Data level(s):** DLXy (user-friendly short description) + ...

**Description:**

This notebook contains plots and benchmark proposals from the _protopipe_ pipeline related to ...

**Requirements and steps to reproduce:**

- get XXX data generated using `protopipe-XXX` (or ctapipe-process + XXX.json)
- execute the notebook with `protopipe-BENCHMARK`:

`protopipe-BENCHMARK launch --config_file configs/benchmarks.yaml -n TRAINING/benchmarks_DLXy_title`

To obtain the list of all available parameters add ``--help-notebook``.

**Development and testing:**

As with any other part of _protopipe_ and being part of the official repository, this notebook can be further developed by any interested contributor.   
The execution of this notebook is not currently automatic; it must be done locally by the user _before_ pushing a pull request.  
Please, **strip the output before pushing**.

## Table of contents
- [Benchmark name 1](#Benchmark-name-1)
    - [Benchmark name 1.1](#Benchmark-name-1.1)

## Imports

<div class="alert alert-block alert-info">
    <li> Import only what is strictly necessary to run the notebook </li>
    <li> Prefer Python's standard library </li>
    <li> Optional imports (e.g. seaborn and uproot) should be performed after the injection of the parameters from papermill (see later) </li> <br>

   The following cell gives an example of what might be useful.
</div>

In [None]:
# Python Standard Library
import sys
from pathlib import Path

# Data handling
import tables
import pandas
import numpy as np

# Plotting
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from matplotlib.pyplot import rc
import matplotlib.style as style
from cycler import cycler

# protopipe I/O and benchmarking API
from protopipe.pipeline.io import get_camera_names, read_protopipe_TRAINING_per_tel_type
from protopipe.benchmarks.utils import string_to_boolean, get_fig_size

## Input data

<div class="alert alert-block alert-info">
    <li> The next cell contains an <b>example</b> list of parameters </li> 
    <li> Additional parameters can be injected by the user at runtime via the CLI </li>
</div>

In [None]:
# Parametrized cell

# General I/O options
analyses_directory = None
output_directory = Path.cwd() # default output directory for plots
analysis_name = None
input_filename = None # Name of the file produced with protopipe

# CTAMARS or other ROOT-based data to load (if none, just remove)
load_CTAMARS = True # Enable to compare the CTAN analysis done with CTAMARS (Release 2019)
indir_CTAMARS = None  # Path to CTAMARS data (if load_CTAMARS is True)
input_file_name_CTAMARS = "CTA_check_dl2_4L15M.root" # Name of the CTAMARS reference file to use (if load_CTAMARS is True)
input_simtel_filepath = None # simtel file used to plot telescope positions

# Comparison between protopipe analyses
export_data = True # If True export data in CSV format
superimpose = False # If True superimpose results from 'analysis_name_2' data files (requires 'export_data'=True)
analysis_name_2 = None

# Plotting options (see benchmarks.yaml)
export_plots = False # if True, save plots in a format (PNG by default) into a "plots" folder
plots_format = "png"
plots_scale = None
use_seaborn = False # If True import seaborn and apply global settings from config file

<div class="alert alert-block alert-info">
     <li> At runtime, <em> papermill </em> will inject the final values of all parameters after this cell. </li>
    <li> The cell after that should convert the injected parameters to their intended types (parameters injected from the CLI arrive as strings) and perform the optional imports. </li>
</div>
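
As a rough illustration of why this conversion is needed: a value injected from the CLI arrives as a string, and any non-empty string (including `"False"`) is truthy in Python. The sketch below uses a hypothetical helper, `to_bool`, which is not protopipe's `string_to_boolean`; it only assumes that booleans are spelled `True` or `False` (case-insensitive).

```python
def to_bool(value):
    """Convert a CLI-style value to a bool (hypothetical helper, for illustration only)."""
    if isinstance(value, bool):
        return value  # already a boolean, e.g. an untouched notebook default
    text = str(value).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


# A non-empty string is always truthy, so the raw injected value cannot be used in an `if`
injected = "False"
raw_truthiness = bool(injected)
converted = to_bool(injected)
print(raw_truthiness, converted)
```

Raising on unrecognized spellings, rather than silently defaulting, makes a mistyped CLI option fail early instead of changing which plots are produced.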

In [None]:
# Handle boolean variables (papermill reads them as strings)
[load_CTAMARS,
 use_seaborn,
 export_data,
 export_plots,
 superimpose] = string_to_boolean([load_CTAMARS,
                                   use_seaborn,
                                   export_data,
                                   export_plots,
                                   superimpose])
if use_seaborn:
    try:
        import seaborn as sns
    except ImportError:
        sys.exit("ERROR: seaborn was enabled, but it doesn't seem to be installed in this environment.")

if load_CTAMARS:
    try:
        import uproot
    except ImportError:
        sys.exit("ERROR: ROOT-based data was requested, but uproot doesn't seem to be installed in this environment.")

<div class="alert alert-block alert-info">
    <li> The next cell makes sure the required filenames are defined by prioritizing <code>benchmarks.yaml</code>, then the notebook itself, otherwise an error is raised. </li>
    <li> The data input process could be refactored under <code>protopipe.benchmarks.operations</code>. </li>
</div>
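
The fallback logic of the next cell could be factored out along the following lines. This is a minimal sketch under stated assumptions: `resolve_input_filename` is a hypothetical name (it does not exist in `protopipe.benchmarks.operations`), `configured_names` stands for the dictionary of file names coming from `benchmarks.yaml`, and the file name in the example is made up.

```python
def resolve_input_filename(explicit_name, configured_names, key):
    """Return explicit_name if set, else configured_names[key]; raise if neither is available.

    Hypothetical helper, shown only to illustrate the fallback order.
    """
    if explicit_name is not None:
        return explicit_name
    try:
        return configured_names[key]
    except (KeyError, TypeError):
        # KeyError: the key is missing; TypeError: no dictionary was provided at all
        raise ValueError(
            f"The input file for '{key}' is undefined: "
            "set it in the configuration file or pass it via the CLI."
        ) from None


example_config = {"TRAINING_energy_gamma": "example_file.h5"}  # made-up file name
resolved = resolve_input_filename(None, example_config, "TRAINING_energy_gamma")
print(resolved)
```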

In [None]:
if input_filename is None:
    try:
        input_filename = input_filenames["TRAINING_energy_gamma"]
    except (NameError, KeyError):
        raise ValueError("The name of the input file is undefined: please use benchmarks.yaml or define it using the CLI.")

if input_simtel_filepath is None:
    try:
        simtel_entry = input_filenames["simtel"]
        # an empty string means that no simtel file was provided
        input_simtel_filepath = Path(simtel_entry) if simtel_entry else None
    except (NameError, KeyError, TypeError):
        input_simtel_filepath = None # a warning is printed later
else:
    input_simtel_filepath = Path(input_simtel_filepath)

# only if all required datafiles are defined, then start the data input process
input_directory = Path(analyses_directory) / analysis_name / Path("data/TRAINING/for_energy_estimation/gamma")
cameras = get_camera_names(input_directory, input_filename)
data = read_protopipe_TRAINING_per_tel_type(input_directory, input_filename, cameras)

<div class="alert alert-block alert-info">
    <li> The next 2 cells set up the output folders for plots and data (if exported) and the final plotting settings </li>
    <li> The code contained in the plotting setting cell could be refactored under <code>protopipe.benchmarks.plots</code>. </li>
</div>

In [None]:
# First we check if a "plots" folder exists already.  
# If not, we create it.
if export_plots:
    plots_folder = Path(output_directory) / "plots"
    plots_folder.mkdir(parents=True, exist_ok=True)

# Next we check if a "data" folder exists already.  
# If not, we create it.
if export_data:
    data_folder = Path(output_directory) / "data"
    data_folder.mkdir(parents=True, exist_ok=True)

if superimpose:
    input_directory_data_2 = Path(analyses_directory) / analysis_name_2 / "benchmarks_results/TRAINING"

In [None]:
# Plot aesthetics settings

scale = matplotlib_settings["scale"] if plots_scale is None else float(plots_scale)

style.use(matplotlib_settings["style"])
cmap = matplotlib_settings["cmap"]

if matplotlib_settings["style"] == "seaborn-colorblind":
    
    colors_order = ['#0072B2', '#D55E00', '#F0E442', '#009E73', '#CC79A7', '#56B4E9']
    rc('axes', prop_cycle=cycler(color=colors_order))

if use_seaborn:
    import seaborn as sns

    theme = seaborn_settings["theme"]
    # Fall back to defaults for any theme option missing from the config file
    context = theme.get("context", "talk")
    style_name = theme.get("style", "whitegrid")
    font_scale = theme.get("font_scale", 1.0)

    sns.set_theme(context=context,
                  style=style_name,
                  palette=theme.get("palette", None),
                  font=theme.get("font", "Fira Sans"),
                  font_scale=font_scale,
                  color_codes=theme.get("color_codes", True)
                  )
    
    sns.set_style(style_name, rc=seaborn_settings["rc_style"])
    sns.set_context(context, font_scale=font_scale)

## Prepare data

<div class="alert alert-block alert-info">
    <li> Here you should prepare the data in the format that you need to build all the benchmarks of the notebook </li>
    <li> This part can be more or less long depending on how much refactored code is available (either from <em>protopipe</em> or <em>ctapipe</em>) </li>
    <li> This is also the place to define (or overwrite) reconstructed and true <b>energy bins</b> (it would be good to define these first in <code>benchmarks.yaml</code>, as for the input data files, so that all benchmarks share the same energy settings by default) </li>
</div>
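
For instance, logarithmically spaced energy bins could be defined as sketched below. The energy range, the unit (TeV) and the number of bins per decade are illustrative assumptions, not values taken from `benchmarks.yaml` or from _protopipe_.

```python
import numpy as np

# Hypothetical binning settings; in practice these could come from benchmarks.yaml
energy_min_tev = 0.01
energy_max_tev = 100.0
bins_per_decade = 5

n_decades = np.log10(energy_max_tev / energy_min_tev)
n_edges = int(round(n_decades * bins_per_decade)) + 1

# Logarithmically spaced bin edges and the geometric mean of each pair of adjacent edges
energy_edges = np.logspace(np.log10(energy_min_tev), np.log10(energy_max_tev), n_edges)
energy_centers = np.sqrt(energy_edges[:-1] * energy_edges[1:])
```

Defining the edges once and deriving the centers from them keeps the true and reconstructed energy axes consistent across all benchmarks of the notebook.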

## Benchmark name 1
[back to top](#Table-of-contents)

- this benchmark shows ...
- it is expected that ...

<div class="alert alert-block alert-info">
    <li> all benchmarks should be done in the same way </li>
    <ol>
        <li> define figure </li>
        <li> define data to use </li>
        <li> make plot </li>
        <li> (optionally) export data </li>
        <li> (optionally) save plot </li>
    </ol>
    <li> figures should always be initialized with <code>protopipe.benchmarks.utils.get_fig_size</code> </li>
    <li> code to export data and plots should be refactored (improved) and called from the relevant <code>protopipe.benchmarks</code> modules (the next cell shows an example for a 1D plot) </li>
    <li> an optional cell to define variables specific to the benchmark can be added just before the actual plot cell (though such an operation on data should be refactored under <code>protopipe.benchmarks.operations</code> if long; otherwise it can be done directly inside the benchmarking cell) </li>
    
</div>

```python

benchmark_name = "benchmark example"
fig = plt.figure(figsize=get_fig_size(ratio=4/3., scale=scale))

# Get data

X = ...
Y = ...

# (Optionally) Export data used for plot
# This can be refactored as explained above
if export_data:
    data_to_write = np.array([X,Y])
    np.savetxt(data_folder / f"{benchmark_name}_protopipe_{analysis_name}.csv",
               data_to_write.T,
               delimiter=',',
               header="X quantity [unit name], Y quantity [unit name]")

# Make plot

options = {}
plotting_function(X, Y, **options)

# (Optionally) Superimpose same benchmark from a different analysis
# This can be refactored as explained above
if superimpose:
    data_2 = np.genfromtxt(
        input_directory_data_2 / f"data/{benchmark_name}_protopipe_{analysis_name_2}.csv",
        delimiter=',',
        filling_values=True).T
    plt.plot(data_2[0], data_2[1], '-.', label = f"protopipe {analysis_name_2}")

# all other plot options should be added at the very end
plt.xlabel("X quantity [unit name]")
plt.ylabel("Y quantity [unit name]")
plt.legend()
plt.grid() # take care when using this with seaborn enabled

# (Optionally) Export plot
if export_plots:
    plt.savefig(plots_folder / f"{benchmark_name}_protopipe_{analysis_name}.{plots_format}")
    
None # to remove clutter by matplotlib objects
```
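
The export and superimpose steps of the cell above could be refactored along these lines. This is only a sketch: `xy_filename`, `export_xy` and `load_xy` are hypothetical names, not existing `protopipe.benchmarks` functions, and the file-name pattern simply mirrors the one used above. The numbers in the round trip are made up.

```python
from pathlib import Path
import tempfile

import numpy as np


def xy_filename(benchmark_name, analysis_name):
    """File-name pattern used by the example cell (hypothetical helper)."""
    return f"{benchmark_name}_protopipe_{analysis_name}.csv"


def export_xy(folder, benchmark_name, analysis_name, x, y, header):
    """Write two equally long 1D arrays as CSV columns and return the file path."""
    path = Path(folder) / xy_filename(benchmark_name, analysis_name)
    np.savetxt(path, np.column_stack([x, y]), delimiter=",", header=header)
    return path


def load_xy(folder, benchmark_name, analysis_name):
    """Read back the two columns written by export_xy."""
    path = Path(folder) / xy_filename(benchmark_name, analysis_name)
    x, y = np.genfromtxt(path, delimiter=",").T
    return x, y


# Round trip with made-up numbers in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    export_xy(tmp, "benchmark example", "analysis_A",
              [1.0, 2.0, 3.0], [10.0, 20.0, 30.0],
              header="X quantity [unit name], Y quantity [unit name]")
    x_read, y_read = load_xy(tmp, "benchmark example", "analysis_A")
```

Sharing one file-name helper between the writer and the reader keeps the superimposed analysis pointing at the same files that the export step produces.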

## Benchmark name 1.1
[back to top](#Table-of-contents)