# Neural Network Reconstructor

A TensorFlow/Keras-based neural-network reconstruction framework for the Askaryan Radio Array (ARA), designed to reconstruct ultra-high-energy neutrino event geometry from NuRadioMC simulation outputs.

The project trains multi-output neural networks on simulated detector observables, such as ray travel times, amplitudes, launch/receive vectors, and polarization information, to estimate event-level targets including vertex position, neutrino direction, and shower energy.

---

## Overview

The Askaryan Radio Array (ARA) detects ultra-high-energy neutrinos through the radio signals their interactions produce in Antarctic ice. Reconstructing an event requires inferring its vertex geometry and arrival direction from detector-level observables across multiple antennas and ray solutions.

This repository provides tools to:

- Read NuRadioMC / NuRadioReco simulation outputs
- Visualize simulated input and detector-output distributions
- Train neural-network reconstruction models
- Perform cross-validation and grid search
- Generate diagnostic plots for reconstruction performance
- Evaluate prediction errors for vertex and directional quantities

---

## Repository Structure

```text
neuralNetworkReconstructor/
│
├── bin/
│   ├── nnRecon.py                  # Main training, inference, and plotting script
│   ├── visualize_sim_input.py      # Visualize NuRadioMC input distributions
│   └── visualize_sim_output.py     # Visualize detector-output observables
│
├── data/
│   ├── 1e18.5_n1e5_0.hdf5
│   └── 1e18.5_n1e5_ARA02_0.hdf5.XFDTD
│
├── install.sh                      # Python package installation script
├── gridSearch.sh                   # Hyperparameter grid search
├── xvalidation.sh                  # 10-fold cross-validation workflow
├── nnReconTrain.sh                 # Example training command
├── nnReconPlotRR.sh                # Plot vertex-distance reconstruction
├── nnReconPlotTT.sh                # Plot vertex-zenith reconstruction
├── nnReconPlotPP.sh                # Plot vertex-azimuth reconstruction
├── nnReconPlotZE.sh                # Plot neutrino-zenith reconstruction
├── nnReconPlotAZ.sh                # Plot neutrino-azimuth reconstruction
├── nnReconPlotSH.sh                # Plot shower-energy reconstruction
├── visInput.sh                     # Example input-visualization command
└── visOutput.sh                    # Example output-visualization command
```

---

## Main Reconstruction Targets

The model supports reconstruction and visualization for:

| Label | Target | Loss |
|---|---|---|
| `rr` | Distance from station center to interaction vertex | MSPE |
| `tt` | Vertex zenith angle from station center | MSE |
| `pp` | Vertex azimuth angle from station center, reconstructed via `cos(pp)` and `sin(pp)` | MSE |
| `ze` | Neutrino zenith angle | MSE |
| `az` | Neutrino azimuth angle, reconstructed via `cos(az)` and `sin(az)` | MSE |
| `sh` | Shower energy | MSE |

The network is implemented as a multi-output Keras model with separate output branches for these quantities.
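
The `cos`/`sin` targets for `pp` and `az` avoid the discontinuity of a raw angle at the 0/2π boundary. The following is a minimal NumPy sketch of that encoding and its inverse, not the code in `nnRecon.py`; the function names `encode_angle` and `decode_angle` are hypothetical.

```python
import numpy as np


def encode_angle(angle_rad):
    """Map angles to (cos, sin) pairs so that 0 and 2*pi share one target."""
    angle_rad = np.asarray(angle_rad, dtype=float)
    return np.stack([np.cos(angle_rad), np.sin(angle_rad)], axis=-1)


def decode_angle(cos_sin):
    """Recover angles in [0, 2*pi) from (cos, sin) pairs.

    arctan2 only uses the ratio of the two components, so predictions
    that are not exactly unit length still decode to a valid angle.
    """
    cos_sin = np.asarray(cos_sin, dtype=float)
    angle = np.arctan2(cos_sin[..., 1], cos_sin[..., 0])
    return np.mod(angle, 2.0 * np.pi)


# Round trip for angles on both sides of the wrap-around point.
truth = np.deg2rad([1.0, 179.0, 359.0])
recovered = decode_angle(encode_angle(truth))
print(np.rad2deg(recovered))
```

With this encoding, a prediction of 359° for a true azimuth of 1° is penalized as a small error rather than a 358° one.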

---

## Model Architecture

The core model is implemented in:

```text
bin/nnRecon.py
```

Key features include:

- TensorFlow/Keras functional API
- Shared or branch-specific convolutional feature extraction
- 2D convolution layers over antenna/string observables
- Periodic padding for detector-geometry-aware angular structure
- Dense downstream prediction branches
- Multi-output reconstruction targets
- Custom mean-squared percentage error loss for vertex distance
- Weighted multi-task loss
- Model checkpointing
- Learning-rate reduction on validation plateau
- Optional input dropout/noise studies
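
Periodic padding can be illustrated with plain NumPy. One plausible reading of the feature above is that the strings surround the station center, so the first and last string should be treated as neighbors by a convolution. The sketch below assumes a hypothetical 4 × 4 antenna/string grid and a hypothetical helper `periodic_pad`; the actual layer in `nnRecon.py` may differ.

```python
import numpy as np


def periodic_pad(features, pad=1, axis=1):
    """Wrap `pad` entries around `axis` so the last entry along that
    axis becomes a neighbor of the first one."""
    features = np.asarray(features)
    pad_width = [(0, 0)] * features.ndim
    pad_width[axis] = (pad, pad)
    return np.pad(features, pad_width, mode="wrap")


# Hypothetical layout: rows = antennas on a string, columns = strings.
grid = np.arange(16).reshape(4, 4)
padded = periodic_pad(grid, pad=1, axis=1)
print(padded.shape)  # (4, 6)
```

After padding, a kernel of width 3 applied along the string axis produces an output of the original width, and the edge columns see their wrapped neighbors instead of zeros.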

The active training configuration focuses on vertex reconstruction while retaining architecture support for direction and shower-energy outputs.
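
As a rough illustration of the loss terms, the sketch below writes a mean-squared percentage error and a weighted multi-task sum in NumPy. The exact definition in `nnRecon.py` (for example, whether the relative error is scaled by 100) is not stated here, so the form used below, the `eps` guard, and the weights are assumptions.

```python
import numpy as np


def mspe(y_true, y_pred, eps=1e-12):
    """Mean of ((y_true - y_pred) / y_true)^2; eps guards against division by zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel = (y_true - y_pred) / np.maximum(np.abs(y_true), eps)
    return float(np.mean(rel ** 2))


def mse(y_true, y_pred):
    """Plain mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


def weighted_total(losses, weights):
    """Weighted sum of per-output losses (a simple multi-task objective)."""
    return sum(weights[name] * losses[name] for name in losses)


# A 10% miss at 100 m and at 1000 m contributes the same MSPE term.
rr_true = np.array([100.0, 1000.0])
rr_pred = np.array([110.0, 1100.0])
losses = {"rr": mspe(rr_true, rr_pred), "tt": mse([0.5, 1.0], [0.6, 1.1])}
weights = {"rr": 1.0, "tt": 0.5}  # hypothetical weights
total = weighted_total(losses, weights)
print(losses, total)
```

A relative loss of this kind keeps distant vertices from dominating the distance branch, which is a common motivation for preferring it over MSE when the target spans a wide range.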

---

## Input Features

The code reads NuRadioMC/NuRadioReco-style HDF5 or NumPy simulation outputs and extracts detector observables including:

- Event IDs
- Interaction vertex coordinates
- Neutrino azimuth and zenith
- Energy and flavor
- Inelasticity
- Signal-to-noise ratios
- Maximum amplitudes
- Envelope amplitudes
- Ray-solution amplitudes
- Peak frequencies
- Travel times
- Receive vectors
- Launch vectors
- Polarization vectors

The default training configuration uses VPol timing features from direct and reflected ray solutions.
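
To see which of these observables a given file actually contains, the datasets in an HDF5 file can be listed with `h5py`. This is a generic inspection sketch; `summarize_hdf5` is a hypothetical helper, and no dataset names are assumed.

```python
import h5py


def summarize_hdf5(path):
    """Return {dataset_path: (shape, dtype)} for every dataset in the file."""
    summary = {}

    def visit(name, obj):
        # visititems walks groups recursively; record datasets only.
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return summary
```

For example, `summarize_hdf5("./data/1e18.5_n1e5_0.hdf5")` returns a dictionary that can be printed to compare the stored arrays against the list above.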

---

## Installation

Install the required packages:

```bash
bash install.sh
```

The project uses:

- Python
- NumPy
- SciPy
- h5py
- pandas
- scikit-learn
- matplotlib
- seaborn
- TensorFlow 2.3
- Keras 2.4
- NuRadioMC
- NuRadioReco
- radiotools

---

## Visualize Simulation Inputs

To inspect NuRadioMC input distributions:

```bash
bash visInput.sh
```

or run directly:

```bash
python ./bin/visualize_sim_input.py ./data/1e18.5_n1e5_0.hdf5
```

This produces diagnostic plots of simulated event parameters such as vertex location, direction, flavor, energy, and interaction properties.

---

## Visualize Simulation Outputs

To inspect detector-level simulation outputs:

```bash
bash visOutput.sh
```

or run directly:

```bash
python ./bin/visualize_sim_output.py ./data/1e18.5_n1e5_ARA02_0.hdf5.XFDTD
```

This generates plots for detector observables such as amplitude distributions, travel-time distributions, direction distributions, correlation matrices, and scatter plots.

---

## Train the Reconstruction Network

Run the example training script:

```bash
bash nnReconTrain.sh
```

Equivalent direct usage:

```bash
python ./bin/nnRecon.py \
    <input_file_1> <input_file_2> ... \
    ./plots/nnRecon/ \
    train \
    <input_mode> <num_layers> <num_nodes> <epochs> <batch_size> <fold> <amp_noise_factor> <time_noise_factor>
```

Example:

```bash
python ./bin/nnRecon.py \
    ./data/*_ARA02_*.hdf5.XFDTD \
    ./plots/nnRecon/ \
    train \
    0 3 60 100 64 0 0 0
```

Arguments:

```text
input files          NuRadioMC/NuRadioReco simulation files
output directory     directory for plots and model weights
label                train, rr, tt, pp, ze, az, or sh
input_mode           0 = HDF5, 1 = NumPy, 2 = CSV, 3 = custom format
num_layers           number of convolutional and dense layers
num_nodes            number of nodes/filters per layer
epochs               training epochs
batch_size           batch size
fold                 fold index for validation/testing
amp_noise_factor     amplitude-noise scaling factor
time_noise_factor    timing-noise scaling factor
```
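
Because the input files come first and the remaining arguments are positional, the order matters. The sketch below mirrors the layout above with `argparse` to make that order explicit. It is a hypothetical wrapper, not the parsing used in `nnRecon.py`, and the integer/float types are assumptions.

```python
import argparse

LABELS = ("train", "rr", "tt", "pp", "ze", "az", "sh")


def build_parser():
    """Positional parser mirroring the documented argument order."""
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper around the nnRecon.py arguments."
    )
    parser.add_argument("input_files", nargs="+", help="simulation files")
    parser.add_argument("output_dir", help="directory for plots and weights")
    parser.add_argument("label", choices=LABELS)
    parser.add_argument("input_mode", type=int)
    parser.add_argument("num_layers", type=int)
    parser.add_argument("num_nodes", type=int)
    parser.add_argument("epochs", type=int)
    parser.add_argument("batch_size", type=int)
    parser.add_argument("fold", type=int)
    parser.add_argument("amp_noise_factor", type=float)
    parser.add_argument("time_noise_factor", type=float)
    return parser


# Same values as the example command, with two placeholder input files.
args = build_parser().parse_args(
    ["a.hdf5", "b.hdf5", "./plots/nnRecon/", "train",
     "0", "3", "60", "100", "64", "0", "0", "0"]
)
print(args.input_files, args.num_layers, args.num_nodes, args.epochs)
```

Read this way, the example command trains a network with 3 layers of 60 nodes for 100 epochs at batch size 64, using fold 0 and no added amplitude or timing noise.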

Model weights are saved as:

```text
allPairsWeights_<config>.hdf5
```

inside the output directory.

---

## Generate Reconstruction Plots

After training, run the plotting scripts:

```bash
bash nnReconPlotRR.sh
bash nnReconPlotTT.sh
bash nnReconPlotPP.sh
bash nnReconPlotZE.sh
bash nnReconPlotAZ.sh
bash nnReconPlotSH.sh
```

These scripts load the trained weights and generate reconstruction diagnostics for the selected target.

---

## Cross-Validation

Run 10-fold cross-validation:

```bash
bash xvalidation.sh
```

This trains and evaluates the model across folds for each reconstruction target.
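
The way a `fold` index can select the held-out events is sketched below with NumPy. The shuffling, seed, and helper name `fold_indices` are assumptions for illustration; the split used by `nnRecon.py` may be constructed differently.

```python
import numpy as np


def fold_indices(n_events, fold, n_folds=10, seed=0):
    """Return (train_idx, val_idx) for one fold of a shuffled k-fold split."""
    if not 0 <= fold < n_folds:
        raise ValueError(f"fold must be in [0, {n_folds}), got {fold}")
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_events)
    # The same seed yields the same chunks for every fold index.
    chunks = np.array_split(shuffled, n_folds)
    val_idx = chunks[fold]
    train_idx = np.concatenate([c for i, c in enumerate(chunks) if i != fold])
    return train_idx, val_idx


train_idx, val_idx = fold_indices(n_events=1000, fold=0)
print(len(train_idx), len(val_idx))  # 900 100
```

Running all ten fold indices with a fixed seed holds out every event exactly once, so the per-fold errors can be combined into a single resolution estimate.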

---

## Hyperparameter Grid Search

Run a grid search over model depth, width, epochs, and batch size:

```bash
bash gridSearch.sh
```

The grid search explores combinations of:

- Number of layers
- Number of nodes
- Number of epochs
- Batch size
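
A grid over these four parameters is a Cartesian product. The sketch below enumerates it with `itertools.product` and formats one training command per combination; the grid values are hypothetical and are not taken from `gridSearch.sh`.

```python
import itertools

# Hypothetical search values for illustration only.
grid = {
    "num_layers": [2, 3, 4],
    "num_nodes": [30, 60],
    "epochs": [50, 100],
    "batch_size": [64, 128],
}


def grid_commands(grid, input_glob, output_dir, input_mode=0, fold=0):
    """Yield one nnRecon.py training command per hyperparameter combination."""
    keys = ["num_layers", "num_nodes", "epochs", "batch_size"]
    for layers, nodes, epochs, batch in itertools.product(*(grid[k] for k in keys)):
        yield (
            f"python ./bin/nnRecon.py {input_glob} {output_dir} train "
            f"{input_mode} {layers} {nodes} {epochs} {batch} {fold} 0 0"
        )


commands = list(
    grid_commands(grid, "./data/*_ARA02_*.hdf5.XFDTD", "./plots/nnRecon/")
)
print(len(commands))  # 24
print(commands[0])
```

The number of runs grows multiplicatively with each added value, so it is worth checking the count before launching the search.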

---

## Output Diagnostics

The training and plotting pipeline generates diagnostics such as:

- Model architecture diagrams
- Training/validation loss curves
- Combined multi-output loss curves
- Feature correlation heatmaps
- Feature scatter matrices
- Prediction-vs-truth plots
- Error histograms
- Energy-dependent reconstruction-error plots
- Quantile-based error summaries
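
A quantile-based error summary can be computed as sketched below. The choice of the 16%, 50%, and 84% quantiles and of a 68% containment value is an assumption, as are the helper names; angular errors are wrapped into [-180°, 180°) before summarizing so that differences across the 0°/360° boundary stay small.

```python
import numpy as np


def wrap_angle_deg(delta):
    """Wrap angle differences in degrees into [-180, 180)."""
    return (np.asarray(delta, dtype=float) + 180.0) % 360.0 - 180.0


def error_summary(pred, truth, quantiles=(0.16, 0.5, 0.84), angular=False):
    """Summarize prediction errors by quantiles and 68% absolute containment."""
    err = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    if angular:
        err = wrap_angle_deg(err)
    values = np.quantile(err, quantiles)
    return {
        "quantiles": dict(zip(quantiles, values.tolist())),
        "abs68": float(np.quantile(np.abs(err), 0.68)),
    }


# Azimuth example in degrees: both events are off by 2 degrees after wrapping.
summary = error_summary(pred=[359.0, 12.0], truth=[1.0, 10.0], angular=True)
print(summary)
```

Quantiles are less sensitive than a standard deviation to the long tails that reconstruction errors often show.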

---

## Research Context

This project was developed for neural-network-based reconstruction of ultra-high-energy neutrino events in the ARA detector.

It supports studying whether supervised learning on NuRadioMC simulations can reconstruct the following quantities from detector-level radio observables:

- Interaction vertex location
- Signal-arrival geometry
- Neutrino arrival direction
- Shower-energy-related quantities

---

## Notes

This repository is research code. Several paths in the shell scripts point to local simulation directories and should be updated before running on a new machine.

The included example data files can be used to test visualization scripts and inspect expected input formats.

---

## Potential Future Improvements

- Refactor `nnRecon.py` into modular data, model, training, and plotting components
- Add command-line argument parsing with `argparse`
- Add configuration files for experiments
- Add unit tests for preprocessing and target construction
- Add clearer support for multiple detector stations
- Add benchmark tables for reconstruction resolution
- Add saved example output plots
- Add Docker or Conda environment support
- Add documentation for HDF5/NumPy input schemas

---

## Disclaimer

This repository is intended for scientific research and educational use.
