GageDeZoort/gnns-for-tracking

Graph Neural Networks for Charged Particle Tracking

This repository is focused on applying graph neural networks (GNNs) to the task of charged particle track reconstruction using the high-pileup TrackML dataset:

  • TrackML @ Kaggle: https://www.kaggle.com/c/trackml-particle-identification
  • TrackML @ Codalab: https://competitions.codalab.org/competitions/20112

TrackML data is a 3D point cloud of tracker hits with associated truth information about the particles that generate them. The goal of GNN-based tracking workflows is to embed track hits as graph nodes and apply GNNs to cluster hits belonging to the same particle. This repo focuses on two complementary strategies: edge classification to predict hit associations, and object condensation to cluster hits and predict track properties.

Graph Construction

Base directory: graph_construction/. TrackML provides several truth quantities about the particles producing track hits in each event, for example transverse momentum, vertex, and charge. Each track hit is uniquely associated with a particle ID, so we can calculate additional information about each track at truth level.

  • measure_particle_properties.py produces a dataframe of truth information corresponding to each particle, including transverse momentum, charge, transverse impact parameter, number of hits, number of layers hit, and whether the particle skips a layer. Particles that produce hits in three or more layers, do not skip a layer, and follow a physical trajectory are labeled as reconstructable.

    Example usage:
    python measure_particle_properties.py -i /trackml_data/train_1 -o particle_properties --n-workers=3

  • slurm/measure_particle_properties.{py,slurm} are provided to produce particle dataframes as a set of batch jobs via Slurm.
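As a rough illustration of the per-particle bookkeeping described above, the following sketch groups hits by particle ID and derives transverse momentum, hit and layer counts, and a simplified reconstructable flag. The input layout (dicts with `particle_id`, `layer`, `px`, `py`), the consecutive integer layer numbering, and the helper name `summarize_particles` are assumptions made for this example, not the interface of measure_particle_properties.py. The check that a particle follows a physical trajectory is omitted.

```python
import math
from collections import defaultdict


def summarize_particles(hits, min_layers=3):
    """Group hits by particle ID and derive simple truth-level properties.

    `hits` is a list of dicts with keys particle_id, layer, px, py
    (a hypothetical layout chosen for this sketch).
    """
    by_particle = defaultdict(list)
    for hit in hits:
        by_particle[hit["particle_id"]].append(hit)

    summary = {}
    for pid, phits in by_particle.items():
        layers = sorted({h["layer"] for h in phits})
        # Assumes layers are numbered consecutively along the trajectory,
        # so a gap in the sorted layer list means a skipped layer.
        skips_layer = any(b - a > 1 for a, b in zip(layers, layers[1:]))
        # Transverse momentum from the truth momentum components.
        pt = math.hypot(phits[0]["px"], phits[0]["py"])
        summary[pid] = {
            "pt": pt,
            "n_hits": len(phits),
            "n_layers": len(layers),
            "skips_layer": skips_layer,
            "reconstructable": len(layers) >= min_layers and not skips_layer,
        }
    return summary


hits = [
    {"particle_id": 1, "layer": 0, "px": 0.6, "py": 0.8},
    {"particle_id": 1, "layer": 1, "px": 0.6, "py": 0.8},
    {"particle_id": 1, "layer": 2, "px": 0.6, "py": 0.8},
    {"particle_id": 2, "layer": 0, "px": 0.3, "py": 0.4},
    {"particle_id": 2, "layer": 2, "px": 0.3, "py": 0.4},
]
props = summarize_particles(hits)
```

Here particle 1 crosses three consecutive layers and is flagged reconstructable, while particle 2 skips a layer and is not.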

The following scripts build tracker hit graphs from a set of TrackML event files and corresponding particle property dataframes.

  • build_graphs.py produces graphs containing track hits embedded as nodes with features (r, phi, z, u, v), where u and v are coordinates in conformal space, and edge features (dr, dphi, dz, dR), where dR is the hit-hit distance in eta-phi space. Hits are assigned to particle IDs and the track parameters belonging to that particle ID at truth level. Edges are drawn via a set of geometric selections specified in configs/build_graphs.yaml.

    Example usage:
    python build_graphs.py /configs/build_graphs.yaml --n-workers=3

  • slurm/build_graphs_job.{py,slurm} are provided to produce graphs as a set of batch jobs via Slurm.
  • slurm/optimize_graph_construction_params.{py,slurm} are provided to submit batch jobs corresponding to different geometric selections and report the corresponding graph construction efficiency (n_true/n_true_possible edges) and purity (n_true/n_total edges).
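The node features (r, phi, z, u, v) and edge features (dr, dphi, dz, dR) listed for build_graphs.py can be sketched as follows. This is a minimal illustration under stated assumptions: the conformal mapping is taken to be u = x/(x² + y²), v = y/(x² + y²), pseudorapidity is computed as asinh(z/r), and the function names are hypothetical rather than those used in the repository.

```python
import math


def node_features(x, y, z):
    """Return (r, phi, z, u, v) for a hit at Cartesian position (x, y, z)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    phi = math.atan2(y, x)
    # Conformal mapping assumed here: u = x / (x^2 + y^2), v = y / (x^2 + y^2).
    u, v = x / r2, y / r2
    return r, phi, z, u, v


def pseudorapidity(r, z):
    """eta = asinh(z / r), equivalent to -ln(tan(theta / 2))."""
    return math.asinh(z / r)


def edge_features(hit_a, hit_b):
    """Return (dr, dphi, dz, dR) for an edge from hit_a to hit_b.

    Each hit is a sequence whose first three entries are (r, phi, z).
    """
    r_a, phi_a, z_a = hit_a[:3]
    r_b, phi_b, z_b = hit_b[:3]
    dr = r_b - r_a
    # Wrap the azimuthal difference into [-pi, pi).
    dphi = (phi_b - phi_a + math.pi) % (2 * math.pi) - math.pi
    dz = z_b - z_a
    deta = pseudorapidity(r_b, z_b) - pseudorapidity(r_a, z_a)
    # Hit-hit distance in eta-phi space.
    dR = math.hypot(deta, dphi)
    return dr, dphi, dz, dR


inner = node_features(30.0, 0.0, 10.0)
outer = node_features(0.0, 70.0, 25.0)
dr, dphi, dz, dR = edge_features(inner, outer)
```

Wrapping dphi matters for hit pairs that straddle the phi = ±pi boundary, where a naive difference would be close to 2pi instead of close to zero.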

The graph construction routine employs several functions available in utils/graph_building_utils.py and utils/hit_processing_utils.py (see below).
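The graph construction efficiency (n_true/n_true_possible edges) and purity (n_true/n_total edges) reported by the parameter optimization scripts can be illustrated with a small sketch. The function name and the representation of edges as pairs of hit indices are assumptions made for this example; the repository's own bookkeeping may differ.

```python
def edge_efficiency_and_purity(selected_edges, true_edges):
    """Compare the edges kept by a geometric selection with the truth edges.

    selected_edges: iterable of (i, j) hit-index pairs that passed the cuts.
    true_edges: iterable of (i, j) pairs linking hits of the same particle,
        i.e. every true edge that could possibly be drawn.
    """
    selected = set(selected_edges)
    possible = set(true_edges)
    n_true = len(selected & possible)
    # Efficiency: fraction of possible true edges that survive the selection.
    efficiency = n_true / len(possible) if possible else 0.0
    # Purity: fraction of selected edges that are true.
    purity = n_true / len(selected) if selected else 0.0
    return efficiency, purity


true_edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
selected_edges = [(0, 1), (1, 2), (3, 4), (0, 4), (2, 5)]
eff, pur = edge_efficiency_and_purity(selected_edges, true_edges)
```

Tighter geometric cuts generally raise purity at the cost of efficiency, which is the trade-off the optimization jobs scan over.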

GNN Inference

Models

Training

Two training scripts, train_IN.py and train_TCN.py, are located at the head of the directory. train_IN.py focuses on training IN-based edge classification architectures with no learned track finding step. train_TCN.py focuses on the object condensation approach, optionally including track parameter predictions. Each script may be run stand-alone or through a batch job.

Example usage:
python train_TCN.py -i graphs/train1_ptmin0p8/ --n-train=10000 --n-test=2000 --learning-rate=0.0001

Additionally, hyperparameter job arrays may be submitted using the scripts in hyperparameter_scans.

Inference

Track-Finding

Utils

  • hit_processing_utils.py contains a set of helper functions for opening TrackML events.
  • graph_building_utils.py contains functions that select hits given a set of truth cuts, select edges given a set of geometric cuts, split the detector into multiple sectors, and correct edge truth levels in the case that multiple barrel-endcap layer connections are possible.
  • data_utils.py contains a set of functions for loading in graphs, building train/test/validation partitions, organizing graph datasets and dataloaders, and a custom GraphDataset extension of the PyTorch Geometric Dataset class.
  • inference_utils.py contains functions relevant to training and testing various GNN algorithms.
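To make the train/test/validation partitioning mentioned for data_utils.py concrete, here is a minimal sketch of splitting a list of graph files into disjoint, reproducible partitions. The function name `partition_graphs`, its parameters, and the file names are hypothetical and are not taken from the repository.

```python
import random


def partition_graphs(graph_files, n_train, n_test, n_val=0, seed=0):
    """Split graph files into disjoint train/test/validation lists."""
    # Sort first so the result does not depend on directory listing order.
    files = sorted(graph_files)
    if n_train + n_test + n_val > len(files):
        raise ValueError("requested more graphs than are available")
    # A seeded shuffle keeps the partitions reproducible across runs.
    random.Random(seed).shuffle(files)
    train = files[:n_train]
    test = files[n_train:n_train + n_test]
    val = files[n_train + n_test:n_train + n_test + n_val]
    return train, test, val


# Hypothetical file names, used only for illustration.
files = [f"graphs/event{i:04d}.npz" for i in range(10)]
train, test, val = partition_graphs(files, n_train=6, n_test=2, n_val=2)
```

Fixing the seed means the same events land in the same partition on every run, which keeps test results comparable between training jobs.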
