Collected small changes and fixes#5068
Merged
Merged
Conversation
A non-KOKKOS style running inside a KOKKOS run expects x2lamda()/lamda2x() to update the coordinates it reads on the host, but the DomainKokkos overrides transform only the Kokkos views. Whenever those views do not alias the legacy atom->x array (host builds at single/mixed precision, and every GPU build), a host kspace style with a triclinic box, e.g. pppm/cg which has no /kk variant, reads stale Cartesian coordinates against the lamda-space grid mapping after its in-compute x2lamda() call and aborts with "Out of range atoms - cannot compute PPPM". Setup escapes because VerletKokkos::setup() runs under auto_sync; the timestep loop does not. This is the KSpaceStyle:pppm_cg_tri_slab failure of the KOKKOS serial single/mixed precision CI jobs. Add TransformView::need_sync_legacy() (is a Kokkos view newer than the legacy host view?) and use it in the four DomainKokkos::x2lamda()/ lamda2x() array overloads: when the legacy host copy is current on entry, sync it again on exit so the caller-visible host state stays coherent. Device-resident flows (verlet_kokkos, fix_nh_kokkos, min_kokkos, ...) enter with the host copy already stale and skip the extra sync, so the per-step device path is unaffected. Double precision host builds alias legacy and Kokkos views and were never affected; the added sync folds to a no-op there. The kokkos_omp (present since the test was added) and kokkos_gpu (added 2026-06-28) entries in the pppm_cg_tri_slab skip list were this same staleness in its other guises; remove both so the test locks the fix in on all KOKKOS backends. Verified: full unittest suite green on clean serial single/mixed/double builds, KSpace+FixTimestep green on OpenMP single (kokkos_omp unskipped) and HIP gfx1103 double (kokkos_gpu unskipped). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
FixPrecessionSpin::init() converted the Zeeman field intensity with H_field *= gyro in place. Fix::init() runs once per run command, so every additional run compounded the conversion and scaled the field by another factor of gyro ~= 0.176 rad.THz/T: any multi-run input using precession/spin zeeman sampled a different field in the second and later runs, and a run continued from a restart file diverged from the same run performed without interruption. Store the converted intensity in a new member (hfield) instead, like the anisotropy constants already do (Kah = Ka/hbar), so init() is idempotent and H_field keeps the user-supplied value in Tesla. Found by the new fix-timestep-precession_spin.yaml test: its restart leg could never match the continuous reference trajectory. Verified by comparing an uninterrupted 8-step run against a 4-step run plus restart continuation, which are now bit-identical. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
PairSpinNeel::init_one() mirrored all eight coefficient arrays to the transposed type pair but not cut_spin_neel[j][i], unlike all other spin pair styles which mirror their cutoff array. With more than one atom type, compute() and compute_single_pair() then compared rsq against an uninitialized cutoff for half of the type pairs, producing nondeterministic forces and energies. Found by the new spin-pair-neel.yaml test failing intermittently (9 out of 30 identical serial runs); valgrind --track-origins pointed at the rsq <= local_cut2 tests. 30/30 stable and valgrind-clean after the fix. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
…s.cpp Doxygen could not match the explicit instantiations of utils::bounds() and utils::bounds_typelabel() nor the explicit specializations of utils::join<T>() in utils.cpp to their template declarations in utils.h, emitting 13 "no matching file member found" warnings. The two anon-namespace forward declarations of re_match()/re_find() used /** */ doxygen comments without \param/\return, emitting 4 more "parameters not documented" warnings. Wrap the instantiation/specialization blocks in \cond ... \endcond so doxygen skips them (their documentation lives on the utils.h template declarations, which breathe resolves), and demote the two forward-declaration comments to plain /* */. doc/doxygen-warn.log goes from 17 warnings to 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
jtclemm
approved these changes
Jul 3, 2026
jtclemm
left a comment
Collaborator
There was a problem hiding this comment.
I do not understand the changes to domain_kokkos.cpp, that is probably for @stanmoore1 to review, but other than that the changes look correct to me.
Member
Author
I have a suggestion from @stanmoore1 for a more general fix for the issue and it will replace this specific workaround. Currently testing it... |
Replace the targeted fix from commit 237dab0 (keep legacy host coords current in DomainKokkos::x2lamda/lamda2x) with the general mechanism suggested by Stan Moore in review: when a style without KOKKOS support runs inside a KOKKOS run, enable kokkos->auto_sync for the duration of its compute so that every sync()/modified() it triggers - for example through the virtual DomainKokkos x2lamda/lamda2x overrides - writes modifications through to the legacy host arrays the style reads and writes. This is the same pattern ModifyKokkos already applies to every fix hook invocation and it covers any host-visible per-atom data, not just coordinates. Apply the wrapping to the pair, bond, angle, dihedral, improper, and kspace compute invocations in VerletKokkos::run(), MinKokkos::energy_force(), DynamicalMatrixKokkos::update_force(), and ThirdOrderKokkos::update_force(). The setup paths already run with auto_sync enabled and need no change. The KSpace base class was missing the kokkosable member that all other style base classes have; add it and set it in PPPMKokkos. Remove the now superseded TransformView::need_sync_legacy() and the conditional legacy sync from the DomainKokkos coordinate transforms. Verified: KSpaceStyle:pppm_cg_tri_slab (which runs the non-KOKKOS pppm/cg in a triclinic KOKKOS run and aborted before the original fix) passes on KOKKOS serial single and mixed precision builds and on a HIP GPU build; full unittest suite green on the serial single precision build; KSpace and FixTimestep suites green on serial mixed and double precision and HIP builds. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
…mand pair_compute_neighlist() enables the neigh_thread setting on GPUs for systems with up to 16000 atoms when the pair style runs with newton off, without marking it as user selected. That state persists in the KokkosLMP class instance across a clear command, so when clear re-applies the package defaults and command line settings, a package command with "newton on" that was accepted at startup now aborts with "Must use 'newton off' with KOKKOS package option 'neigh/thread on'" even though the user never enabled neigh/thread. To reproduce: run a small system with newton off on a GPU, issue a clear command, and set up a newton on run in the same session. Reset the setting at the top of the package command processing unless it was selected with the neigh/thread option; the heuristic re-evaluates it on the next pair style compute anyway. Found by activating the newton off legs of the kokkos_gpu force-style tests: all 76 yaml files with KOKKOS-capable pair styles aborted in a later test stage after the newton off leg had triggered the heuristic. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
Add support for magnetic (atom_style spin) systems to the pair style and fix timestep test drivers, following the model of the ellipsoid special case: - new yaml blocks init_mag_forces/run_mag_forces (per-atom fm) and run_spin (per-atom sp incl. norm), captured automatically when the atom style provides them and compared wherever forces (pair tester) and positions/velocities/torques (fix tester) are compared - new 'timestep' yaml keyword so the fix tester can use a timestep suitable for spin dynamics in metal units instead of the previous hard-coded 0.25 (which remains the default) - a 'spin' tag: such yaml files must define fix nve/spin in their post_commands (the spin pair styles compute mechanical forces only when the fix is present), so the pair tester run phase does not add another time integrator; newton off legs are skipped via the established newton_pair pre_commands pin since spin pair styles require newton on - remove the hard-coded MOLECULE package gate in the fix tester; the atom style prerequisites in every yaml file already cover it - new inputs in.spin, data.spin (54-atom perturbed bcc iron cell with randomized spins, two atom types), coeffs.spin - pair style references for spin/exchange, spin/exchange/biquadratic, spin/dmi, spin/neel, spin/magelec, spin/dipole/cut, and hybrid/overlay eam/fs + spin/exchange; fix references for nve/spin, precession/spin, and setforce/spin, registered as SpinPairStyle:* and FixTimestep:* ctest cases These tests found the fix precession/spin zeeman compounding bug and the pair spin/neel uninitialized cutoff bug fixed in the two previous commits. No test for fix langevin/spin yet: it does not store the RNG state in restart files, so its restart trajectory cannot match. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
The newton off blocks in the omp leg and the shared kokkos test body of test_pair_style checked "if (newton_pair == 0)" on the still newton on instance and only re-initialized with newton off inside that block, so the comparisons had been dead code since 2021 (the plain leg re-initializes first and then checks whether the pair style forced newton back on). Restructure both legs to match the plain leg. Also correct the comparison of the freshly initialized state inside the previously dead omp block, which compared the initial forces against the run_forces reference, and remove a duplicated magnetic force check that had slipped into the plain and kokkos legs. With this change the newton off state of every pair style with an /omp or /kk variant is compared against the reference data. Results across all 394 pair and kspace yaml files: no failures in the omp and kokkos_omp legs (OpenMP build), no failures in the kokkos_serial leg (KOKKOS serial single precision build; the pre-existing pppm_tip4p_ad failure is unrelated), and no failures in the kokkos_gpu leg (HIP build) after the preceding fix for the stale neigh/thread state. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RZmFEmpx4SmV2vaUWBF4Yk
The four std::sort comparators for the per-particle contact lists in FixSurfaceGlobal::post_force() and PairSurfGranular::compute() took their ContactSurf arguments by value. The struct holds three std::vector members besides ~200 bytes of plain data, so every comparison in these per-atom hot loops copied both arguments including up to six heap allocations. Take them by const reference. Found by Coverity Scan (CIDs 1661276, 1661387, 1661110, 1661564, 1661250, 1661984, 1661398, 1661585). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
The loop over the connections of a line's second endpoint iterated with the connection count of the first endpoint, reading past the end of neigh_p2[] whenever a line has more connections at endpoint 1 than at endpoint 2 and crashing FixSurfaceGlobal::init() for 2d global surfaces with per-molecule motion (e.g. the in.line.gran.global example). The 3d twin block pairs its counts correctly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
The four sort call sites for per-particle contact lists in FixSurfaceGlobal::post_force() and PairSurfGranular::compute() carried duplicated lambda comparators encoding the same 6-level ordering (overlap, priority, normal alignment, center-of-mass distance, index), in two variants: before the connection pre-walk the normal signs are not yet assigned and the alignment is compared in absolute value. Move them into two static comparators on FixSurface implemented via one shared function, so the pair style and the fix are guaranteed to rank multi-contact configurations identically, and return bool instead of int. Verified bit-identical on the tribox/line global and tri/line local example runs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
Setting the fflag/nside/aflag (and ewhich/pwhich) connection flags between two connected lines or triangles was implemented in ten nearly identical blocks: one per endpoint (2d) or edge (3d) of the global and local surface fixes, differing only in which endpoints form the edge and which per-connection arrays are written. Move the shared computation into two FixSurface methods, point_connection2d() and edge_connection3d(); the callers keep their genuinely different endpoint matching (point indices for global surfaces, coordinate comparison for distributed local surfaces) and now report inconsistent surface connectivity instead of leaving flags unset when no shared point is found. The classification values (FLAT/NONFLAT, CONCAVE/CONVEX, SAME_SIDE/OPPOSITE_SIDE) move from per-file enum copies into the FixSurface class so the helpers and both subclasses share one definition. Verified bit-identical on the tribox/line global and tri/line local example runs (only source line numbers in warning messages shift). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
Extend the connection-flag consolidation to the third duplicated family: the six corner blocks (three per file) that set the cwhich/nside/fflag values for tris sharing only a corner point. The shared computation (normal sides from the sign of the normal dot product plus the flatness test) moves into FixSurface::corner_connection3d(); the callers keep their differing corner matching and now report inconsistent surface connectivity for an unmatched corner instead of leaving cwhich unset. Also drop the local variable declarations that became unused in the connection functions after the consolidations. Verified bit-identical on the tribox/line global and tri/line local example runs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01YNvuaNpJNkbCKaAE8ngPQv
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This pull request contains multiple small fixes and improvements
Related Issue(s)
N/A
Author(s)
Axel Kohlmeyer, Temple U
Licensing
By submitting this pull request, I agree, that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).
Artificial Intelligence (AI) Tools Usage
The KOKKOS package changes were created by Claude Opus 4.8 to correct failing unittests
The SPIN package changes were created by Claude Fable to correct failing unittests
Backward Compatibility
N/A
Implementation Notes
The following individual changes are included:
Post Submission Checklist
build system