# ONNX vs PyTorch performance example

This example compares inference time for denoising models defined in the
`DUNEdn` package.

The models are implemented in PyTorch and exported to ONNX format. Inference is
then run separately with each version of the model.

The elapsed time of batch prediction is measured for both the PyTorch and the
ONNX models at several batch sizes.
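
The measurement described above can be sketched with the standard library alone. This is a minimal illustration, not the implementation used by `compare_performance_onnx`, whose internals are not shown in this notebook. The names `time_batches`, `toy_predict` and `toy_make_batch` are hypothetical and exist only for this sketch.

```python
import time
from statistics import mean


def time_batches(predict, make_batch, batch_sizes, nb_batches=2):
    """Time `predict` on `nb_batches` batches for each batch size.

    `predict` and `make_batch` are placeholders (assumptions) for a model's
    prediction function and an input generator.
    """
    records = []
    for batch_size in batch_sizes:
        batch = make_batch(batch_size)
        elapsed = []
        for _ in range(nb_batches):
            start = time.perf_counter()  # monotonic, high-resolution clock
            predict(batch)
            elapsed.append(time.perf_counter() - start)
        records.append({"batch_size": batch_size, "mean_seconds": mean(elapsed)})
    return records


# Toy stand-ins (hypothetical): a "model" that sums each input row.
def toy_predict(batch):
    return [sum(row) for row in batch]


def toy_make_batch(batch_size):
    return [[1.0] * 16 for _ in range(batch_size)]


timings = time_batches(toy_predict, toy_make_batch, [32, 64, 128])
for record in timings:
    print(record)
```

When timing PyTorch on a GPU, kernels run asynchronously, so the device is usually synchronized (for example with `torch.cuda.synchronize()`) before each clock reading. The sketch above omits this because it runs on the CPU only.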

The image below shows the pipeline adopted for the current example. 
  

<div style="text-align:center">
<img src="assets/performance.png" alt="ONNX performance example" width="60%" />
</div>

In [None]:
from pathlib import Path
import pandas as pd
from assets.functions import (
    prepare_folders_and_paths,
    check_in_output_folder,
    compare_performance_onnx,
    plot_comparison_catplot,
)
from dunedn.inference.hitreco import DnModel
from dunedn.utils.utils import load_runcard

Define user inputs.

The user might want to tweak the following variables to experiment with the `DUNEdn` package.

- **modeltype** -> available options: `cnn`, `gcnn`, `uscg`.
- **version**  -> available options: `v08`, `v09`  
  The dataset version the model was trained on.  
  For `cnn` and `gcnn` networks, only version `v08` is available.
- **pytorch_dev** -> available options: `cpu`, `cuda:0` or `cuda:id`.  
  The device hosting the PyTorch computation.  
  It is recommended to run PyTorch on a GPU.  
  The default `batch_size` settings ensure that the computation fits on a 16 GB GPU.  
- **base_folder** -> the output folder.  
  Ensure you have write permission for this folder.
- **ckpt_folder** -> the checkpoint folder.  
  Ensure the folder has the structure explained in the package documentation.
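
The constraints listed above can be checked before any model is loaded. The following is a minimal sketch, not part of the `DUNEdn` package: `AVAILABLE_VERSIONS` and `check_user_inputs` are hypothetical names, and the table assumes that `uscg` supports both versions, as the list above implies.

```python
# Hypothetical lookup table built from the options listed above.
# Assumption: `uscg` is available for both `v08` and `v09`.
AVAILABLE_VERSIONS = {
    "cnn": ("v08",),
    "gcnn": ("v08",),
    "uscg": ("v08", "v09"),
}


def check_user_inputs(modeltype, version):
    """Raise ValueError if the (modeltype, version) pair is not listed."""
    if modeltype not in AVAILABLE_VERSIONS:
        raise ValueError(
            f"Unknown modeltype {modeltype!r}, "
            f"expected one of {sorted(AVAILABLE_VERSIONS)}"
        )
    if version not in AVAILABLE_VERSIONS[modeltype]:
        raise ValueError(
            f"Version {version!r} is not available for {modeltype!r}, "
            f"expected one of {AVAILABLE_VERSIONS[modeltype]}"
        )
    return modeltype, version


print(check_user_inputs("cnn", "v08"))
```

Failing early with a descriptive message is usually easier to debug than a missing-checkpoint error raised later during model loading.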

In [None]:
# user inputs
modeltype = "cnn"
version = "v08"
pytorch_dev = "cpu"  # device hosting PyTorch computation
base_folder = Path("../../output/tmp")
ckpt_folder = Path(f"../dunedn_checkpoints/{modeltype}_{version}")

# set up the environment
folders, paths = prepare_folders_and_paths(modeltype, version, base_folder, ckpt_folder)

Create output directories

In [None]:
check_in_output_folder(folders)

Model loading: PyTorch

In [None]:
setup = load_runcard(base_folder / "cards/runcard.yaml")  # settings
model = DnModel(setup, modeltype, ckpt_folder)
print(f"Loaded model from {ckpt_folder} folder")

Model loading: ONNX

In [None]:
# export the PyTorch model to ONNX format
model.onnx_export(folders["onnx_save"])

In [None]:
# load the exported ONNX model
model_onnx = DnModel(setup, modeltype, folders["onnx_save"], should_use_onnx=True)
print(f"Loaded model from {folders['onnx_save']} folder")

## PyTorch vs ONNX Performance

The goal is to compare the inference time of the two models for different input batch sizes.

The collected inference timings are loaded into a `pandas.DataFrame` for easier manipulation.

In [None]:
batch_size_list = [32, 64, 128, 256, 512, 1024]
nb_batches = 2
performance_df = compare_performance_onnx(
    model, model_onnx, pytorch_dev, batch_size_list, nb_batches
)
performance_df.to_csv(paths["performance_csv"])

In [None]:
# reload the saved timings from disk and plot the comparison
performance_df = pd.read_csv(paths["performance_csv"])
plot_comparison_catplot(performance_df, folders["plot"], with_graphics=True)