
ftoralesacosta/GSGM_for_EIC_Calo

 
 


Fast Point Cloud Diffusion (FPCD)

This is a fork of the official implementation of the FPCD paper, which uses a diffusion model to generate particle jets and progressive distillation to accelerate generation. This fork is intended for generating point-cloud calorimeter data. The calorimeter is designed for the upcoming Electron-Ion Collider (EIC); more information about it can be found here.

Visualization of FPCD

Calorimeter Data

The idea of this repository is to exploit the similarity between jets and their constituents on one hand and calorimeter clusters and their cells on the other. We take calorimeter simulations done for the EPIC detector at the upcoming EIC and convert them to HDF5 files that closely resemble the JetNet data format. The calorimeter HDF5 files are made in the eic/generate_data repository using this converter. The data format is shown in the table below. The integer below each feature name is the index of that feature in the dataset, given for convenience.

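Dataset "cluster":

| Feature | $P_\mathrm{Gen.}$ | $\theta_\mathrm{Gen.}$ | $\sum E_\mathrm{cells}$ | $N_\mathrm{cells}$ |
|---------|-------------------|------------------------|-------------------------|--------------------|
| Index   | 0                 | 1                      | 2                       | 3                  |

Dataset "hcal_cells":

| Feature | E | X | Y | Z | mask |
|---------|---|---|---|---|------|
| Index   | 0 | 1 | 2 | 3 | 4    |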
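
As a rough illustration of how the feature indices in the table above might be used, the sketch below names each index and recomputes the number of cells and the summed cell energy from the mask. The array shape `(n_clusters, max_cells, 5)`, the zero padding, and the convention that a non-zero mask marks a real cell are assumptions made for this example, not something confirmed by the converter; `summarize_cells` is a hypothetical helper, and the array is synthetic stand-in data rather than real simulation output.

```python
import numpy as np

# Feature indices, copied from the table above.
CLUSTER_P_GEN, CLUSTER_THETA_GEN, CLUSTER_SUM_E, CLUSTER_N_CELLS = 0, 1, 2, 3
CELL_E, CELL_X, CELL_Y, CELL_Z, CELL_MASK = 0, 1, 2, 3, 4


def summarize_cells(cells):
    """Return (n_cells, sum_e) per cluster from a padded cell array.

    Assumes `cells` has shape (n_clusters, max_cells, 5) and that a
    non-zero mask entry marks a real (non-padded) cell.
    """
    mask = cells[..., CELL_MASK] > 0
    n_cells = mask.sum(axis=-1)
    sum_e = (cells[..., CELL_E] * mask).sum(axis=-1)
    return n_cells, sum_e


# Synthetic stand-in data: 2 clusters, padded to 4 cells each.
cells = np.zeros((2, 4, 5))
cells[0, :3] = [[1.0, 10, 20, 30, 1],
                [2.0, 11, 21, 31, 1],
                [0.5, 12, 22, 32, 1]]
cells[1, :1] = [[4.0, 5, 6, 7, 1]]

n_cells, sum_e = summarize_cells(cells)
```

With real files, the same function could be applied to an array loaded from the "hcal_cells" dataset (for example with h5py), and the results compared against columns `CLUSTER_N_CELLS` and `CLUSTER_SUM_E` of "cluster" as a consistency check, provided the assumed layout holds.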

Training a new model

To train a new model from scratch, first obtain data from the eic/generate_data repository. Then convert the ROOT file(s) to HDF5 with this converter. The baseline model can be trained with:

cd scripts
python train.py [--big]

with the optional --big flag to choose between the 30- and 150-particle datasets. After training the baseline model, you can train the distilled models with:

python train.py --distill --factor 2

This step trains a model that reduces the overall number of time steps by a factor of 2. You can then load the distilled model as the next teacher and run the training with --factor 4, and so on, halving the number of evaluation steps used during generation at each stage.
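
The halving described above can be laid out as a schedule. The sketch below is only a planning aid under stated assumptions: `distillation_schedule` is a hypothetical helper that is not part of this repository, and the base step count of 512 is an illustrative value, not the number of steps the baseline model actually uses. The command strings reuse only the `--distill` and `--factor` flags shown above.

```python
def distillation_schedule(base_steps, n_stages):
    """List (factor, remaining_steps, command) for successive distillation stages.

    Each stage doubles the factor, so the sampler uses half as many
    steps as the teacher from the previous stage.
    """
    if base_steps % (2 ** n_stages) != 0:
        raise ValueError("base_steps must be divisible by 2**n_stages")
    schedule = []
    for stage in range(1, n_stages + 1):
        factor = 2 ** stage
        command = f"python train.py --distill --factor {factor}"
        schedule.append((factor, base_steps // factor, command))
    return schedule


# Hypothetical baseline with 512 sampling steps, distilled three times.
plan = distillation_schedule(512, 3)
for factor, steps, command in plan:
    print(f"factor {factor}: {steps} steps -> {command}")
```

The stages have to be run in order, since each one uses the model from the previous stage as its teacher.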

To reproduce the plots provided in the paper, you can run:

python plot_jet.py [--distill --factor 2] --sample

The command generates new observations, with optional flags to load the distilled models. If you already have samples generated and stored, omit the --sample flag to skip generation.

Plotting and Metrics

The calculation of the physics-inspired metrics is taken directly from the JetNet repository, which therefore also needs to be cloned. Note that our implementation uses TensorFlow, while the physics-inspired metrics are implemented in PyTorch.

Our distillation model is partially based on a PyTorch implementation.
