Lattice field theory research facility. Three independent particle physics projects sharing a common infrastructure: CUDA-accelerated gauge computations, normalizing flow sampling, and a sovereign experiment tracking system.
Target hardware: RTX 5060 Ti 16 GB (sm_120, Blackwell). No cloud. No notebooks-as-a-service. Bare metal.
| engine | action value | time | speedup |
|---|---|---|---|
| JAX (CPU/GPU) | -5.1589670 | 829 ms | baseline |
| CUDA C++ kernel | -5.1589675 | 0.578 ms | 354× |
| precision diff | 4.77e-07 | — | float32-level agreement |
HMC thermalization: 200 configs on 8×8 lattice, β=1.0, acceptance rate 0.98. Plaquette energy: 0.4505 ± 0.0779.
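The HMC update behind these numbers (leapfrog integration plus a Metropolis accept/reject) can be sketched in plain NumPy for the same 8×8 compact U(1) system. `lattice_hmc.py` implements this in JAX, so everything below — the 2D-only force, step size, and trajectory length — is an illustrative stand-in, not the repo's code:

```python
import numpy as np

def plaq_angles(theta):
    """Plaquette angles on a 2D LxL lattice; theta[mu, x, y] is the
    link angle in direction mu. np.roll gives periodic boundaries."""
    t0, t1 = theta
    return t0 + np.roll(t1, -1, axis=0) - np.roll(t0, -1, axis=1) - t1

def action(theta, beta):
    return -beta * np.cos(plaq_angles(theta)).sum()

def force(theta, beta):
    """dS/dtheta: each link appears in two plaquettes with opposite signs."""
    s = np.sin(plaq_angles(theta))
    f = np.empty_like(theta)
    f[0] = beta * (s - np.roll(s, 1, axis=1))   # +plaq(n) - plaq(n - nu)
    f[1] = beta * (np.roll(s, 1, axis=0) - s)   # +plaq(n - mu) - plaq(n)
    return f

def hmc_step(theta, beta, rng, n_steps=10, eps=0.1):
    """One HMC trajectory: Gaussian momenta, leapfrog, Metropolis test."""
    p = rng.standard_normal(theta.shape)
    h_old = 0.5 * (p ** 2).sum() + action(theta, beta)
    th = theta.copy()
    p = p - 0.5 * eps * force(th, beta)          # half kick
    for _ in range(n_steps - 1):
        th = th + eps * p                        # drift
        p = p - eps * force(th, beta)            # full kick
    th = th + eps * p
    p = p - 0.5 * eps * force(th, beta)          # final half kick
    h_new = 0.5 * (p ** 2).sum() + action(th, beta)
    if rng.uniform() < np.exp(h_old - h_new):
        return th, True
    return theta, False

rng = np.random.default_rng(0)
theta = np.zeros((2, 8, 8))          # cold start, matching the 8x8 run above
accepted = 0
for _ in range(100):
    theta, ok = hmc_step(theta, beta=1.0, rng=rng)
    accepted += ok
print(accepted / 100)                # acceptance rate
```

With this small step size the acceptance rate sits near 1, consistent with the 0.98 quoted above; in practice `eps` and `n_steps` are tuned to trade integration error against trajectory length.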
GNN Autoencoder on CERN Open Data (b→sℓℓ decays). ROC-AUC: 0.765 on 4-feature latent space. Quantum Boltzmann Machine background model via PennyLane.
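The GNN itself requires PyTorch Geometric, but the anomaly-scoring scheme behind the ROC-AUC figure can be sketched with NumPy alone: score each event by its reconstruction error, then compute the AUC via the rank-sum identity. The `reconstruct` function here is a zero stand-in for the trained decoder, and all numbers are illustrative:

```python
import numpy as np

def anomaly_scores(x, reconstruct):
    """Per-event anomaly score = squared reconstruction error."""
    return ((x - reconstruct(x)) ** 2).sum(axis=1)

def roc_auc(scores_bkg, scores_sig):
    """ROC-AUC via the Mann-Whitney U identity (no threshold sweep)."""
    all_s = np.concatenate([scores_bkg, scores_sig])
    ranks = all_s.argsort().argsort() + 1          # ranks 1..N, no ties
    n_b, n_s = len(scores_bkg), len(scores_sig)
    u = ranks[n_b:].sum() - n_s * (n_s + 1) / 2
    return u / (n_b * n_s)

rng = np.random.default_rng(0)
bkg = rng.normal(0, 1, (500, 4))       # 4 features, like the latent space
sig = rng.normal(0, 2, (200, 4))       # broader "anomalous" events
recon = lambda x: np.zeros_like(x)     # stand-in for the trained decoder
auc = roc_auc(anomaly_scores(bkg, recon), anomaly_scores(sig, recon))
print(round(auc, 3))
```

A real run would replace `recon` with the autoencoder's encode/decode pass over the event graphs.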
Sequential QUBO unfolding of Jacobian peak distributions. Sliding window → Cirq/PennyLane solver. Benchmarked against classical SVD baseline.
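The QUBO formulation can be sketched as follows: encode each truth bin in a few binary variables and minimize ||R·t − m||², where R is the detector response and m the measured spectrum. This is an illustrative NumPy version with a brute-force solver standing in for the Cirq/PennyLane backend, on a window small enough to enumerate:

```python
import itertools
import numpy as np

def unfolding_qubo(R, m, n_bits=3):
    """QUBO for min ||R t - m||^2 with t_j = sum_k 2^k x_{j,k}, x binary.
    Returns (Q, E) with t = E @ x; x^T Q x equals the objective up to
    the constant ||m||^2 (linear terms folded onto the diagonal)."""
    n = R.shape[1]
    E = np.zeros((n, n * n_bits))
    for j in range(n):
        E[j, j * n_bits:(j + 1) * n_bits] = 2.0 ** np.arange(n_bits)
    A = R @ E
    Q = A.T @ A - 2 * np.diag(A.T @ m)   # x_i^2 = x_i for binary x
    return Q, E

def brute_force(Q):
    """Exhaustive minimizer; a quantum solver would take Q instead."""
    best, best_x = np.inf, None
    for bits in itertools.product([0, 1], repeat=Q.shape[0]):
        x = np.array(bits, float)
        val = x @ Q @ x
        if val < best:
            best, best_x = val, x
    return best_x

R = np.array([[0.8, 0.2], [0.2, 0.8]])  # toy 2-bin smearing matrix
truth = np.array([5.0, 2.0])
m = R @ truth                           # noiseless measured spectrum
Q, E = unfolding_qubo(R, m)
t = E @ brute_force(Q)
print(t)  # -> [5. 2.]
```

The sliding-window idea is then to solve such a QUBO for a handful of adjacent bins at a time, fixing the rest, and sweep the window across the spectrum.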
QUANTUM_LAB/
├── P1_LHCB/
│ ├── main.py ← orchestrator (ingestion → GNN → QBM)
│ ├── ingestion.py ← uproot + graph construction
│ ├── gnn_autoencoder.py ← PyTorch Geometric autoencoder
│ └── qbm_pennylane.py ← Quantum Boltzmann Machine
│
├── P2_WBOSON/
│ ├── data_generator.py ← Jacobian peak toy data
│ ├── quantum_unfolder.py ← QUBO sliding window solver
│ └── benchmark.py ← SVD vs quantum comparison
│
├── P3_G2/
│ ├── main.py ← orchestrator (HMC → CNF → TN → CUDA)
│ ├── lattice_hmc.py ← JAX Hybrid Monte Carlo for U(1)
│ ├── cnf_flow.py ← Continuous Normalizing Flow (PyTorch)
│ ├── tn_quimb.py ← Tensor network contraction (quimb)
│ ├── cuda_accelerator.py ← CuPy RawKernel C++ plaquette action
│ └── visualization_3d.py ← PyVista/VTK bare-metal 3D rendering
│
├── tracking.py ← SQLite sovereign experiment tracker
├── query.py ← CLI experiment query interface
├── plot_metrics.py ← Offline metric visualization
└── experiments.db ← Local experiment database
| layer | technology |
|---|---|
| gauge sampling | JAX (HMC with leapfrog integrator) |
| generative model | PyTorch + torchdiffeq (Neural ODE / CNF) |
| tensor networks | quimb + cotengra (PEPS, CTMRG, BMPS) |
| GPU acceleration | CuPy RawKernel (native C++ CUDA) |
| particle data | uproot (ROOT → NumPy) |
| quantum circuits | PennyLane + Cirq |
| 3D visualization | PyVista / VTK (bare-metal OpenGL) |
| experiment tracking | SQLite (no MLflow, no cloud) |
The plaquette action kernel maps each lattice site to one CUDA thread. The compact U(1) gauge action is computed as:
S = -β Σ_{n, μ<ν} Re[ U_μ(n) · U_ν(n+μ̂) · U_μ(n+ν̂)* · U_ν(n)* ]
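For U(1), each link is a pure phase, so the action reduces to a sum of cosines of plaquette angles. A plain NumPy sketch for a 2D L×L lattice of link angles (illustrative only; the repo's hot path is the CuPy RawKernel below):

```python
import numpy as np

def plaquette_action(theta, beta):
    """Wilson action for compact U(1) on a 2D LxL lattice.

    theta has shape (2, L, L): theta[mu, x, y] is the link angle in
    direction mu at site (x, y). np.roll implements periodic boundaries.
    """
    t0, t1 = theta[0], theta[1]
    # plaquette angle: theta_0(n) + theta_1(n+mu) - theta_0(n+nu) - theta_1(n)
    plaq = t0 + np.roll(t1, -1, axis=0) - np.roll(t0, -1, axis=1) - t1
    return -beta * np.cos(plaq).sum()

# Cold start (all links = identity): every plaquette gives cos(0) = 1,
# so S = -beta * L^2.
cold = np.zeros((2, 8, 8))
print(plaquette_action(cold, 1.0))  # -64.0
```

The CUDA kernel evaluates the same per-site cosine, with `field_cos`/`field_sin` storing the link phases.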
Optimization via Nsight Compute:
Initial profiling of the JAX implementation with ncu (Nsight Compute) revealed a severe memory-bandwidth bottleneck: uncoalesced global memory accesses and high kernel-launch overhead.
To resolve this, I rewrote the computation as a native C++ RawKernel. Each thread computes one plaquette, using shared memory and a block-level reduction (via __syncthreads()) to minimize VRAM round trips. This restructuring increased occupancy and delivered a 354× speedup over the JAX baseline, with results agreeing to within 5e-07.
```cuda
__global__ void compute_plaquette_kernel(
    const float* field_cos, const float* field_sin,
    float* block_results, int L, float beta)
```

Every experiment run is logged to experiments.db (SQLite) with:
- Parameters (JSON): lattice size, β, epochs, bond dimension
- Metrics (JSON): plaquette energy, acceptance rate, loss curves
- Artifacts: paths to generated plots and configs
- Timestamp and run notes
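The pattern needs nothing beyond the standard library. A minimal sketch of such a tracker — the actual schema and helpers in tracking.py may differ:

```python
import json
import sqlite3
import time

SCHEMA = """CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts REAL NOT NULL,
    project TEXT,
    params TEXT,     -- JSON
    metrics TEXT,    -- JSON
    artifacts TEXT,  -- JSON list of file paths
    notes TEXT
)"""

def log_run(conn, project, params, metrics, artifacts=(), notes=""):
    """Append one run; parameterized query, JSON-encoded payloads."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO runs (ts, project, params, metrics, artifacts, notes) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), project, json.dumps(params), json.dumps(metrics),
         json.dumps(list(artifacts)), notes),
    )
    conn.commit()

def last_runs(conn, n=10):
    """Most recent runs, decoded back to Python dicts."""
    rows = conn.execute(
        "SELECT project, params, metrics FROM runs ORDER BY ts DESC LIMIT ?",
        (n,)).fetchall()
    return [(p, json.loads(a), json.loads(b)) for p, a, b in rows]

conn = sqlite3.connect(":memory:")   # tracking.py writes experiments.db instead
log_run(conn, "P3_G2", {"L": 8, "beta": 1.0},
        {"plaquette": 0.4505, "acceptance": 0.98},
        artifacts=["plots/plaquette_history.png"], notes="thermalization run")
print(last_runs(conn, 1))
```

query.py then amounts to a thin argparse wrapper over queries like `last_runs`.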
No MLflow. No localhost servers. No web dashboards. Query with `python query.py`.
# full pipeline (HMC → CNF → TN → CUDA benchmark)
python P3_G2/main.py
# LHCb anomaly detection
python P1_LHCB/main.py
# W-boson unfolding benchmark
python P2_WBOSON/benchmark.py
# query experiment history
python query.py --last 10

jax[cuda12]
torch
torchdiffeq
torch_geometric
quimb
cotengra
cupy-cuda12x
pennylane
cirq
uproot
pyvista
- CryptoTN-GPU — GPU tensor networks for quantum biology
- KHAOS — BCI kernel with CUDA DSP
- quantum-geo-metrology — geophysical quantum computing
Antonio Rodríguez (QuantumDrizzy) · research software engineer