yvpan/neuralNetworkReconstructor
# Neural Network Reconstructor
A TensorFlow/Keras-based neural-network reconstruction framework for the Askaryan Radio Array (ARA), designed to reconstruct ultra-high-energy neutrino event geometry from NuRadioMC simulation outputs.
The project trains multi-output neural networks on simulated detector observables (ray travel times, amplitudes, launch/receive vectors, and polarization information) to estimate event-level reconstruction targets: vertex position, neutrino direction, and shower energy.
---
## Overview
The Askaryan Radio Array (ARA) detects ultra-high-energy neutrinos through radio signals emitted in Antarctic ice. Reconstructing neutrino events requires inferring geometry and direction from detector-level observables across multiple antennas and ray solutions.
This repository provides tools to:
- Read NuRadioMC / NuRadioReco simulation outputs
- Visualize simulated input and detector-output distributions
- Train neural-network reconstruction models
- Perform cross-validation and grid search
- Generate diagnostic plots for reconstruction performance
- Evaluate prediction errors for vertex and directional quantities
---
## Repository Structure
```text
neuralNetworkReconstructor/
│
├── bin/
│   ├── nnRecon.py                 # Main training, inference, and plotting script
│   ├── visualize_sim_input.py     # Visualize NuRadioMC input distributions
│   └── visualize_sim_output.py    # Visualize detector-output observables
│
├── data/
│   ├── 1e18.5_n1e5_0.hdf5
│   └── 1e18.5_n1e5_ARA02_0.hdf5.XFDTD
│
├── install.sh                     # Python package installation script
├── gridSearch.sh                  # Hyperparameter grid search
├── xvalidation.sh                 # 10-fold cross-validation workflow
├── nnReconTrain.sh                # Example training command
├── nnReconPlotRR.sh               # Plot vertex-distance reconstruction
├── nnReconPlotTT.sh               # Plot vertex-zenith reconstruction
├── nnReconPlotPP.sh               # Plot vertex-azimuth reconstruction
├── nnReconPlotZE.sh               # Plot neutrino-zenith reconstruction
├── nnReconPlotAZ.sh               # Plot neutrino-azimuth reconstruction
├── nnReconPlotSH.sh               # Plot shower-energy reconstruction
├── visInput.sh                    # Example input-visualization command
└── visOutput.sh                   # Example output-visualization command
```
---
## Main Reconstruction Targets
The model supports reconstruction and visualization for:
| Label | Target | Loss |
|---|---|---|
| `rr` | Horizontal distance from station center to interaction vertex | MSPE (mean squared percentage error) |
| `tt` | Vertex zenith angle from station center | MSE |
| `pp` | Vertex azimuth angle from station center, reconstructed via `cos(pp)` and `sin(pp)` | MSE |
| `ze` | Neutrino zenith angle | MSE |
| `az` | Neutrino azimuth angle, reconstructed via `cos(az)` and `sin(az)` | MSE |
| `sh` | Shower energy | MSE |
The network is implemented as a multi-output Keras model with separate output branches for these quantities.
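
The `cos`/`sin` encoding used for the azimuth targets avoids the discontinuity at the 0/2π wrap-around, where two nearly identical directions have very different raw angle values. The following is a minimal sketch of that idea, not code taken from `nnRecon.py`; the function names `encode_angle` and `decode_angle` are hypothetical.

```python
import math


def encode_angle(phi):
    """Map an angle in radians to its (cos, sin) pair."""
    return math.cos(phi), math.sin(phi)


def decode_angle(c, s):
    """Recover the angle in [0, 2*pi) from a (cos, sin) pair.

    atan2 depends only on the direction of (c, s), so a predicted pair
    that is not exactly unit length still decodes to an angle.
    """
    return math.atan2(s, c) % (2.0 * math.pi)


# Two azimuths on either side of the wrap-around: far apart as raw
# angles, close together as (cos, sin) pairs.
a, b = math.radians(359.0), math.radians(1.0)
raw_gap = abs(a - b)
ca, sa = encode_angle(a)
cb, sb = encode_angle(b)
encoded_gap = math.hypot(ca - cb, sa - sb)
```

With this encoding, an MSE loss on the two components penalizes a 2° miss across the wrap-around as a small error instead of a nearly full-circle one.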
---
## Model Architecture
The core model is implemented in:
```text
bin/nnRecon.py
```
Key features include:
- TensorFlow/Keras functional API
- Shared or branch-specific convolutional feature extraction
- 2D convolution layers over antenna/string observables
- Periodic padding for detector-geometry-aware angular structure
- Dense downstream prediction branches
- Multi-output reconstruction targets
- Custom mean-squared percentage error loss for vertex distance
- Weighted multi-task loss
- Model checkpointing
- Learning-rate reduction on validation plateau
- Optional input dropout/noise studies
The active training configuration focuses on vertex reconstruction while retaining architecture support for direction and shower-energy outputs.
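
Two of the features listed above can be illustrated in a few lines of plain Python. This is a sketch under assumptions rather than the implementation in `nnRecon.py`: the percentage error is written as a fraction (no factor of 100), the padding wraps the last axis of a nested list, and the names `mspe` and `periodic_pad` are hypothetical. The README does not state which axis the real model wraps or how the loss is normalized.

```python
def mspe(y_true, y_pred):
    """Mean squared percentage error, expressed as a fraction.

    Each residual is divided by the true value, so a 10% miss costs the
    same at short and long vertex distances. True values must be nonzero.
    """
    terms = [((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


def periodic_pad(rows, pad):
    """Pad each row by wrapping `pad` entries around from the opposite end."""
    if pad == 0:
        return [list(row) for row in rows]
    return [list(row[-pad:]) + list(row) + list(row[:pad]) for row in rows]


# A 10% overestimate at two very different distances (arbitrary units).
loss = mspe([100.0, 1000.0], [110.0, 1100.0])

# One row of four entries, wrapped by one entry on each side.
padded = periodic_pad([[1, 2, 3, 4]], 1)
```

Wrapping before a convolution lets a kernel see the first and last entries as neighbors, which suits quantities arranged around a ring.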
---
## Input Features
The code reads NuRadioMC/NuRadioReco-style HDF5 or NumPy simulation outputs and extracts detector observables including:
- Event IDs
- Interaction vertex coordinates
- Neutrino azimuth and zenith
- Energy and flavor
- Inelasticity
- Signal-to-noise ratios
- Maximum amplitudes
- Envelope amplitudes
- Ray-solution amplitudes
- Peak frequencies
- Travel times
- Receive vectors
- Launch vectors
- Polarization vectors
The default training configuration uses VPol timing features from direct and reflected ray solutions.
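
The vertex targets `rr`, `tt`, and `pp` are geometric functions of the interaction vertex coordinates listed above. The sketch below shows one way to derive them, assuming Cartesian coordinates, a zenith angle measured from the upward vertical, and a station center passed in explicitly. The function name `vertex_targets` and the `station` parameter are hypothetical, and the conventions used in `nnRecon.py` may differ.

```python
import math


def vertex_targets(x, y, z, station=(0.0, 0.0, 0.0)):
    """Return (rr, tt, pp) for a vertex relative to a station center.

    rr: horizontal distance, in the same length unit as the inputs
    tt: zenith angle in radians (0 = straight up, pi = straight down)
    pp: azimuth angle in radians, in [0, 2*pi)
    """
    dx, dy, dz = x - station[0], y - station[1], z - station[2]
    rr = math.hypot(dx, dy)
    tt = math.atan2(rr, dz)
    pp = math.atan2(dy, dx) % (2.0 * math.pi)
    return rr, tt, pp


# A vertex level with the station, and one directly below it.
rr, tt, pp = vertex_targets(3.0, 4.0, 0.0)
rr_below, tt_below, pp_below = vertex_targets(0.0, 0.0, -10.0)
```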
---
## Installation
Install the required packages:
```bash
bash install.sh
```
The project uses:
- Python
- NumPy
- SciPy
- h5py
- pandas
- scikit-learn
- matplotlib
- seaborn
- TensorFlow 2.3
- Keras 2.4
- NuRadioMC
- NuRadioReco
- radiotools
---
## Visualize Simulation Inputs
To inspect NuRadioMC input distributions:
```bash
bash visInput.sh
```
or run directly:
```bash
python ./bin/visualize_sim_input.py ./data/1e18.5_n1e5_0.hdf5
```
This produces diagnostic plots of simulated event parameters such as vertex location, direction, flavor, energy, and interaction properties.
---
## Visualize Simulation Outputs
To inspect detector-level simulation outputs:
```bash
bash visOutput.sh
```
or run directly:
```bash
python ./bin/visualize_sim_output.py ./data/1e18.5_n1e5_ARA02_0.hdf5.XFDTD
```
This generates plots for detector observables such as amplitude distributions, travel-time distributions, direction distributions, correlation matrices, and scatter plots.
---
## Train the Reconstruction Network
Run the example training script:
```bash
bash nnReconTrain.sh
```
Equivalent direct usage:
```bash
python ./bin/nnRecon.py \
<input_file_1> <input_file_2> ... \
./plots/nnRecon/ \
train \
<input_mode> <num_layers> <num_nodes> <epochs> <batch_size> <fold> <amp_noise_factor> <time_noise_factor>
```
Example:
```bash
python ./bin/nnRecon.py \
./data/*_ARA02_*.hdf5.XFDTD \
./plots/nnRecon/ \
train \
0 3 60 100 64 0 0 0
```
Arguments:
```text
input files        NuRadioMC/NuRadioReco simulation files
output directory   directory for plots and model weights
label              train, rr, tt, pp, ze, az, or sh
input_mode         0 = HDF5, 1 = NumPy, 2 = CSV, 3 = custom format
num_layers         number of convolutional and dense layers
num_nodes          number of nodes/filters per layer
epochs             training epochs
batch_size         batch size
fold               fold index for validation/testing
amp_noise_factor   amplitude-noise scaling factor
time_noise_factor  timing-noise scaling factor
```
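
The positional layout above (any number of input files, followed by ten fixed arguments) can be expressed with `argparse` from the standard library. This is a hypothetical sketch of such a parser, not how `nnRecon.py` currently reads its arguments. The argument names mirror the table, and treating the two noise factors as floats is an assumption.

```python
import argparse

LABELS = ["train", "rr", "tt", "pp", "ze", "az", "sh"]


def build_parser():
    """Build a parser for the positional layout described in this README."""
    parser = argparse.ArgumentParser(prog="nnRecon.py")
    # nargs="+" takes as many leading arguments as it can while still
    # leaving exactly one value for each positional that follows.
    parser.add_argument("input_files", nargs="+")
    parser.add_argument("output_dir")
    parser.add_argument("label", choices=LABELS)
    for name in ("input_mode", "num_layers", "num_nodes",
                 "epochs", "batch_size", "fold"):
        parser.add_argument(name, type=int)
    for name in ("amp_noise_factor", "time_noise_factor"):
        parser.add_argument(name, type=float)
    return parser


# Same values as the training example, with two placeholder input files.
args = build_parser().parse_args(
    ["a.hdf5", "b.hdf5", "./plots/nnRecon/", "train",
     "0", "3", "60", "100", "64", "0", "0", "0"]
)
```

A parser like this rejects an unknown label or a non-numeric layer count before any data is loaded.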
Model weights are saved as:
```text
allPairsWeights_<config>.hdf5
```
inside the output directory.
---
## Generate Reconstruction Plots
After training, run the plotting scripts:
```bash
bash nnReconPlotRR.sh
bash nnReconPlotTT.sh
bash nnReconPlotPP.sh
bash nnReconPlotZE.sh
bash nnReconPlotAZ.sh
bash nnReconPlotSH.sh
```
These scripts load the trained weights and generate reconstruction diagnostics for the selected target.
---
## Cross-Validation
Run 10-fold cross-validation:
```bash
bash xvalidation.sh
```
This trains and evaluates the model across folds for each reconstruction target.
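
The `fold` argument selects which part of the data is held out. How `nnRecon.py` partitions events is not documented here, so the following is only a sketch of one common scheme, a round-robin assignment by event index; the function name `kfold_indices` is hypothetical.

```python
def kfold_indices(n_events, n_folds, fold):
    """Split event indices into (train, held_out) lists for one fold.

    Event i belongs to fold i % n_folds, so every event is held out in
    exactly one of the n_folds runs.
    """
    if not 0 <= fold < n_folds:
        raise ValueError("fold must satisfy 0 <= fold < n_folds")
    train, held_out = [], []
    for i in range(n_events):
        (held_out if i % n_folds == fold else train).append(i)
    return train, held_out


# Fold 3 of a 10-fold split over 25 events.
train_idx, test_idx = kfold_indices(25, 10, 3)
```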
---
## Hyperparameter Grid Search
Run a grid search over model depth, width, epochs, and batch size:
```bash
bash gridSearch.sh
```
The grid search explores combinations of:
- Number of layers
- Number of nodes
- Number of epochs
- Batch size
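
A grid search of this kind enumerates the Cartesian product of the candidate values and launches one training run per combination. The sketch below builds the corresponding command lines in the argument order documented in this README; the grid values are placeholders and are not the ones used in `gridSearch.sh`.

```python
import itertools

# Placeholder candidate values, chosen only for illustration.
GRID = {
    "num_layers": [2, 3],
    "num_nodes": [30, 60],
    "epochs": [100],
    "batch_size": [32, 64],
}


def grid_commands(grid, input_glob, output_dir):
    """Yield one training command line per hyperparameter combination."""
    keys = ["num_layers", "num_nodes", "epochs", "batch_size"]
    for layers, nodes, epochs, batch in itertools.product(
        *(grid[key] for key in keys)
    ):
        # input_mode, fold, and both noise factors are fixed at 0 here.
        yield (
            f"python ./bin/nnRecon.py {input_glob} {output_dir} train "
            f"0 {layers} {nodes} {epochs} {batch} 0 0 0"
        )


commands = list(
    grid_commands(GRID, "./data/*_ARA02_*.hdf5.XFDTD", "./plots/nnRecon/")
)
```

The number of runs is the product of the list lengths, so each extra candidate value multiplies the total training time.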
---
## Output Diagnostics
The training and plotting pipeline generates diagnostics such as:
- Model architecture diagrams
- Training/validation loss curves
- Combined multi-output loss curves
- Feature correlation heatmaps
- Feature scatter matrices
- Prediction-vs-truth plots
- Error histograms
- Energy-dependent reconstruction-error plots
- Quantile-based error summaries
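
A quantile-based error summary reports, for example, the error below which 68% of events fall. The following is a minimal standard-library sketch of that calculation. The quantile levels, the linear interpolation rule, and the function names are assumptions, not taken from the plotting code.

```python
def quantile(values, q):
    """Linearly interpolated quantile of `values` for 0 <= q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def error_summary(truth, pred, levels=(0.5, 0.68, 0.9)):
    """Return {level: quantile of |pred - truth|} for each requested level."""
    abs_err = [abs(p - t) for t, p in zip(truth, pred)]
    return {level: quantile(abs_err, level) for level in levels}


# Five events with absolute errors of 1, 2, 3, 4, and 5.
summary = error_summary([0.0] * 5, [1.0, -2.0, 3.0, -4.0, 5.0])
```

For angular targets, the difference should be wrapped into the shortest arc before taking the absolute value.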
---
## Research Context
This project was developed for neural-network-based reconstruction of ultra-high-energy neutrino events in the ARA detector.
It supports the study of whether supervised learning on NuRadioMC simulations can reconstruct the following quantities from detector-level radio observables:
- Interaction vertex location
- Signal-arrival geometry
- Neutrino arrival direction
- Shower-energy-related quantities
---
## Notes
This repository is research code. Several paths in the shell scripts point to local simulation directories and should be updated before running on a new machine.
The included example data files can be used to test visualization scripts and inspect expected input formats.
---
## Potential Future Improvements
- Refactor `nnRecon.py` into modular data, model, training, and plotting components
- Add command-line argument parsing with `argparse`
- Add configuration files for experiments
- Add unit tests for preprocessing and target construction
- Add clearer support for multiple detector stations
- Add benchmark tables for reconstruction resolution
- Add saved example output plots
- Add Docker or Conda environment support
- Add documentation for HDF5/NumPy input schemas
---
## Disclaimer
This repository is intended for scientific research and educational use.