Initial public-beta release of NVIDIA ALCHEMI Toolkit, a GPU-first Python
framework for AI-driven atomic simulation workflows.
Core Data Layer
- AtomicData — Pydantic-backed graph representation of atomic systems
(positions, atomic numbers, masses, node/edge properties) with factory
constructorsfrom_atoms()(ASE) andfrom_structure()(pymatgen). - Batch — GPU-resident graph batch with
MultiLevelStoragebackend
supporting node-, edge-, and system-level tensors. Lazybatch_idx/batch_ptr,
index_select,append, andfrom_data_listfor efficient batching. - Zarr I/O —
AtomicDataZarrWriterandAtomicDataZarrReaderwith
configurable Zstd compression, chunking, and sharding for high-throughput
trajectory storage. - Dataset & DataLoader — CUDA-stream prefetching, async I/O, and
drop-inDataLoaderreplacement yieldingBatchobjects.
Model Wrappers
All wrappers implement BaseModelMixin with a unified ModelConfig for
capability declaration and runtime control.
- DemoModelWrapper — Lightweight test/demo model (point-cloud energy +
autograd forces). - MACEWrapper — MACE equivariant neural network; supports foundation
checkpoints; COO neighbor format; conservative forces via autograd. - AIMNet2Wrapper — AIMNet2 atom-in-molecule network; energy, forces,
charges, stress; MATRIX neighbor format; NSE auto-detection. - LennardJonesModelWrapper — Warp-accelerated single-species LJ with
analytical forces and optional virial stress. - EwaldModelWrapper — Real + reciprocal space Ewald summation for
periodic charged systems; k-vector caching; hybrid analytical forces. - PMEModelWrapper — Particle Mesh Ewald (FFT-based, O(N log N)) for
large periodic systems. - DFTD3ModelWrapper — DFT-D3(BJ) dispersion correction with
auto-downloaded reference parameters and cutoff smoothing. - PipelineModelWrapper — Compose multiple models into groups with
independent derivative strategies (autograd vs. analytical).
Dynamics Engine
- BaseDynamics — Abstract base orchestrating model evaluation, integrator
updates, hook dispatch, convergence detection, and inflight batching. - 9 hook insertion points per step (
DynamicsStageenum):BEFORE_STEP,
BEFORE_PRE_UPDATE,AFTER_PRE_UPDATE,BEFORE_COMPUTE,AFTER_COMPUTE,
BEFORE_POST_UPDATE,AFTER_POST_UPDATE,AFTER_STEP,ON_CONVERGE. - ConvergenceHook — Flexible convergence criteria with
from_fmax()
convenience constructor and per-system masking.
Integrators
- NVE — Velocity Verlet; symplectic, time-reversible, energy-conserving.
- NVTLangevin — BAOAB Langevin dynamics with Ornstein-Uhlenbeck
thermostat for canonical sampling. - NVTNoseHoover — Nosé-Hoover chain thermostat with Yoshida-Suzuki
factorization; deterministic and ergodic. - NPT — Martyna-Tobias-Klein isothermal-isobaric with dual Nosé-Hoover
chains (particle + cell DOFs). - NPH — MTK isenthalpic-isobaric without thermostat.
Optimizers
- FIRE — Fast Inertial Relaxation Engine with adaptive timestep.
- FIREVariableCell — FIRE with NPH-like variable-cell propagation.
- FIRE2 — Improved FIRE (Shuang et al. 2020) with better restart
conditions and modified velocity mixing. - FIRE2VariableCell — FIRE2 with variable-cell structural relaxation.
Built-in Hooks
Dynamics hooks (nvalchemi.dynamics.hooks):
LoggingHook— Per-graph scalar statistics with thread-pooled I/O and
optional CUDA stream prefetch.NaNDetectorHook— Immediate NaN/Inf detection in forces and energy.MaxForceClampHook— Clamps force magnitudes to prevent numerical
explosions.EnergyDriftMonitorHook— Cumulative energy drift tracking with
configurable thresholds (absolute and per-atom-per-step).FreezeAtomsHook— Freezes selected atoms by category during MD.SnapshotHook— Periodic full-state snapshots to aDataSink.ConvergedSnapshotHook— Snapshot on convergence.ProfilerHook— Per-stage wall-clock profiling with NVTX annotations
and CSV output.AlignCellHook— Upper-triangular cell alignment for variable-cell
optimization.
General hooks (nvalchemi.hooks):
NeighborListHook— On-the-fly neighbor list construction/refresh with
Verlet skin buffer; MATRIX and COO formats.WrapPeriodicHook— GPU-accelerated PBC wrapping via Warp kernel.BiasedPotentialHook— External bias potentials for enhanced sampling
(umbrella sampling, metadynamics, etc.).
Multi-stage Pipelines
- FusedStage (
+operator) — Compose dynamics stages on a single GPU
with shared forward pass and masked updates per sub-stage. - DistributedPipeline (
|operator) — Distribute stages across GPU
ranks with blocking inter-rank communication. - SizeAwareSampler — Bin-packing inflight batching that respects
max_atoms,max_edges, andmax_batch_sizeconstraints. - Data sinks —
HostMemory(CPU),GPUBuffer(device),ZarrData
(persistent disk) for capturing pipeline outputs.
GPU Primitives
All low-level kernels built on
nvalchemi-toolkit-ops
via NVIDIA Warp:
- Velocity Verlet position/velocity updates
- BAOAB Langevin half-steps
- Nosé-Hoover chain integration
- MTK barostat (NPT/NPH) cell and position propagation
- FIRE/FIRE2 coordinate and cell steps
- Kinetic energy and velocity initialization
- Neighbor list rebuild with Verlet skin
- Cell alignment to upper-triangular form
Developer & Agent Experience
- 20 worked examples across four tiers (basic, intermediate, advanced,
distributed) covering data structures, optimization, MD ensembles,
Zarr I/O, inflight batching, custom hooks, model composition, Ewald
electrostatics, and multi-GPU pipelines. - 7 Claude Code agent skills (
.claude/skills/) for guided workflows:
model wrapping, data structures, data storage, dynamics API, dynamics
hooks, dynamics implementation, and engineering scoping. OptionalDependencyguards for graceful degradation when MACE, AIMNet2,
ASE, or pymatgen are not installed.
Requirements
- Python 3.11–3.13
- PyTorch >= 2.5.1
nvalchemi-toolkit-ops[torch]>= 0.3.1- Optional:
[mace],[aimnet],[ase],[pymatgen]extras