refactor: pair exclude_types as canonical NeighborGraph transform; dpa1 graph path supports exclude_types (decision #18)#5733
Conversation
…ftmax Built on the existing xp_maximum_at (no new array_api helper needed). Part of NeighborGraph PR-D (graph-native attention).
Segment-based (global (E,E) boolean deliberately avoided): compact eager form for carry-all graphs + shape-static nonzero-free form for the center-major static layout (jit/export/make_fx traceable). Part of NeighborGraph PR-D; PR-E angles reuse (unordered, no-self).
…r > 0) DescrptBlockSeAtten.call_graph grows _graph_attention: the dense per-center (nnei, nnei) attention square becomes the edge-pair axis (center_edge_pairs, ordered + self-included), softmax over keys becomes segment_softmax grouped by the query edge. Op-for-op mirror of GatedAttentionLayer.call (head_dim QKV slicing, normalize q/k/v, temperature/scaling, smooth shift trick, post-softmax sw and dotr weighting, residual + LayerNorm per layer). - shape-static adapter path (static_nnei threaded from the dense call adapter): bit-exact vs the dense body, rtol 1e-12, full flag matrix (attn_layer 1/2 x dotr x smooth x normalize x temperature, binding and non-binding sel). - carry-all (compact) graphs: exact for non-smooth; for smooth the dense branch keeps sel-padding slots in the softmax denominator (dense output is sel-DEPENDENT, up to ~1e-4) — the carry-all form drops those phantom terms by design (user decision 2026-07-03), pinned by a clean-divergence test. - edge_env_mat(return_sw=True) exposes the per-edge switch (zeroed on padding) for the smooth branch. - uses_graph_lower: attention configs are now graph-eligible (concat tebd, no exclude_types still required).
…ial parity
- test_make_fx_graph_attn: graph forward + autograd.grad at attn_layer=2
traces under make_fx for BOTH smooth branches (the shape-static
center_edge_pairs form is nonzero-free) — required since pt_expt compiled
training routes eligible models through the graph lower.
- model-level graph-vs-legacy lower parity now parametrized over
attn_layer {0, 2} (energy/force/virial/atom_virial, 1e-12 CPU).
- eligibility pins: attention+concat is graph-eligible; se_atten_v2
(tebd_input_mode='strip') correctly stays dense (strip = later PR;
the plan's 'se_atten_v2 inherits for free' did not hold).
- linear-model weight tests: pin smooth_type_embedding=False — the standard (graph-routed, carry-all) and linear (graph-ineligible, dense) submodels otherwise differ by the accepted smooth-attention denominator divergence (~1e-6), which is a route artifact, not a weight-combination bug. - new binding-sel sanity: carry-all graph attention diverges from the sel-truncated dense path when sel binds (spec decision deepmodeling#17).
…rity) neighbor_list=None now takes the carry-all graph default for eligible attention models; explicit World-1 builders take the legacy dense route. With smooth attention the two routes differ by design (PR-D), so the route-equivalence tests pin smooth_type_embedding=False.
for more information, see https://pre-commit.ci
…SymInts The compact (carry-all) pair enumeration used nonzero + tensor-repeat with Python control flow on their data-dependent sizes, so the attention graph lower failed torch.export with GuardOnDataDependentSymNode. Register those sizes as unbacked SymInt sizes (new torch-free xp_hint_dynamic_size shim, no-op for numpy/jax), take the empty-input fast paths only on concrete int shapes, build iotas via cumsum(ones)-1 (the array_api_compat arange wrapper branches on the length in Python), and skip the policy-compression nonzero when no filter applies (include_self and ordered - the attention default). Eager numpy/torch results are unchanged.
…> 0)
With the compact pair enumeration unbacked-SymInt-traceable, the carry-all
attention graph lower now exports to a graph-form .pt2 unchanged in ABI
(same 5-tensor NeighborGraph schema, dynamic edge axis) and with carry-all
semantics preserved (no sel truncation, unlike the dense-adapter nlist-form
export). Update the stale freeze-gate message (attention is eligible), add a
symbolic-trace merge gate at attn_layer in {0,2}, parametrize the DeepEval
graph .pt2 fixture over attn_layer (both artifacts: dynamic sizes, PBC and
non-PBC, 1e-10 vs the sel-capped dense reference at non-binding sel), and
add a single-atom zero-real-edge runtime test (the R==0 extreme of the
unbacked sizes).
The dense call is wrapped in @cast_precision, but the graph route's only float input (edge_vec) lives inside the NeighborGraph dataclass where the decorator cannot see it, so non-global-precision models (e.g. float32) crashed with a double-vs-float matmul on the graph route while the dense route worked. Cast edge_vec down to the descriptor precision on entry and the outputs back to the caller's dtype on exit (differentiable, so the model-level force autograd is unaffected). Add an fp32 graph-vs-dense route parity test at attn_layer 0 and 2.
…nt-wide NaN A masked entry whose raw logit exceeds the unmasked per-segment max by more than the exp overflow threshold (~709 fp64 / ~88 fp32) overflowed exp() to inf, and the post-hoc inf * 0 mask multiply produced nan, which the denominator sum then spread across the entire segment. Shift data_for_max (masked entries already -inf, exp(-inf) == 0 exactly) instead of the raw data; the mask multiply stays as a defensive no-op. Regression test with a masked logit 1e5 above the unmasked max. Addresses CodeRabbit review.
…sh eligibility wording Address OutisLi review on deepmodeling#5715: - Notes on DescrptDPA1.call_graph (+ pointers on uses_graph_lower and the freeze lower_kind docstring): for smooth_type_embedding=True the carry-all graph attention intentionally drops the dense layout's sel-padding terms from the softmax denominator - sel-independent semantics that differ from the legacy dense lower by up to ~1e-4; bit-tight dense parity holds for the non-smooth branch and for the static_nnei dense-adapter realization. - Update the stale 'dpa1 attn_layer == 0' eligibility wording (freeze() docstring, training-path predicate/comment, dpmodel/pt_expt graph-lower docstrings) to the actual contract: mixed-types dpa1/se_atten with concat type embedding and no exclude_types, attention layers included; call_common docs no longer imply unconditional dense parity.
…he pt_expt graph default and legacy escape hatch
…ct=True) when angle fields are present
…exclusion; drop eligibility gate
…layer and type_one_side Add @pytest.mark.parametrize for attn_layer in [0, 2] and type_one_side in [False, True] to test_exclude_types_graph_parity. Also adds the missing parity assertion (graph vs dense at rtol=atol=1e-12, non-binding sel). Uses smooth_type_embedding=False to avoid the known by-design softmax denominator divergence in the dense smooth path.
Descriptor-level exclude_types is now graph-eligible (fully supported via apply_pair_exclusion). Remove 'no exclude_types' from four docstrings/error messages that list graph eligibility conditions. The gate condition was removed in the NeighborGraph implementation; only tebd_input_mode='concat' restriction remains. - deepmd/pt_expt/entrypoints/main.py: freeze_model docstring (~502) + ValueError message (~589) - deepmd/dpmodel/model/make_model.py: forward docstring (~317) - deepmd/pt_expt/train/training.py: _model_uses_graph_lower docstring (~591)
build_neighbor_graph, build_neighbor_graph_ase, build_neighbor_graph_vesin, build_neighbor_graph_nv all gain optional keyword-only pair_excl=None and compact=False; default path = geometric search then apply_pair_exclusion. _call_common_graph in pt_expt make_model wires atomic_model.pair_excl to every builder call so model-level pair_exclude_types is applied at build time (the atomic-model seam backstop stays as idempotent identity). Oracle tests assert set-equality of the valid-edge set between builder(pair_excl=X) and builder() + separate apply_pair_exclusion(X), for dense (2 id + 3 oracle cases) and ase (2 cases); vesin gets 4 new tests (2 identity, 2 oracle, parametrized over periodic).
…n for array_api_strict compat
…_neighbor_list/strategies (A4) Extract the inline pair-exclusion from base_atomic_model.forward_common_atomic into apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl) in nlist.py. The seam is refactored to call the named helper (idempotent backstop remains). Add pair_excl=None to: - build_neighbor_list (dpmodel, nlist.py) - DefaultNeighborList.build - VesinNeighborList.build (pt_expt) - NvNeighborList.build (pt; CUDA-only, API parity) - NeighborList base class signature 12 new unit tests covering: None/empty identity, excluded pairs -> -1, -1 slot preservation, ghost-atom types, idempotence, torch namespace smoke, build_neighbor_list oracle equivalence, DefaultNeighborList oracle, VesinNeighborList oracle. NvNeighborList CUDA-only (not validated locally).
…cross-refs Serialize the model-level pair_exclude_types into metadata.json so the C++ DeepPotPTExpt loader can rebuild the flat (ntypes+1)^2 keep table and re-apply the same pair-exclusion transform at the ingestion seam. Add See Also cross-references between the Python transforms (apply_pair_exclusion, apply_pair_exclusion_nlist) and their forthcoming C++ twins.
Add buildPairExcludeTable / applyPairExclusion (graph) / applyPairExclusionNlist (dense) in commonPT.h, structurally mirroring the Python transforms (apply_pair_exclusion, apply_pair_exclusion_nlist) with the same argument order and variable names (type_ij, keep). DeepPotPTExpt::init rebuilds the flat (ntypes+1)^2 keep table from the pair_exclude_types metadata; the seam applies it before every model call (graph and dense, LAMMPS and standalone) as an idempotent backstop to the exclusion already compiled into the .pt2.
gen_dpa1_pairexcl.py exports graph-route and dense-route DPA1(attn_layer=0) .pt2 models with model-level pair_exclude_types=[[0,1]] plus a no-exclusion baseline; references come from Python DeepEval of each model. test_deeppot_dpa1_pairexcl_ptexpt.cc validates both C++ ingestion routes (applyPairExclusion / applyPairExclusionNlist) against the Python references at 1e-10 (fp64), cross-checks graph==dense, and proves the exclusion is active by comparing against the empty-table baseline. Wired into test_cc_local.sh. 8/8 tests pass locally.
…e and forward_common_atomic_graph
The SpinModel backbone (dpa1 attn_layer=0) was being routed to the carry-all graph path by the default-flip (decision deepmodeling#17) introduced in the graph-pair-exclude branch. Virtual/placeholder types injected by get_spin_model double the atom density, making the system sel-binding; the carry-all graph keeps neighbors the capped dense nlist discards, so SpinModel.call_common diverged from call_common_lower (the dense lower used by pt_expt eager inference) by ~2e-6. Fix: pass neighbor_graph_method='legacy' when SpinModel.call_common invokes the backbone, forcing the dense-nlist path until spin-graph support is explicitly implemented. The graph .pt2 export path already has a fail-fast guard for spin (serialization.py:963). Adds a regression test pinning that: - SpinModel.call_common total energy equals backbone(legacy) on the same spin-doubled inputs (exact bit-identity). - The backbone in graph mode gives a DIFFERENT energy at this density, confirming the fixture exercises the diverging regime.
The pair-exclude branch removed exclude_types from DescrptDPA1.uses_graph_lower(), which previously kept spin backbones (they inject exclude_types) on the dense path. As a side effect the descriptor-level dispatch routed spin .pt2/.pte export through the graph kernel, whose scatter/atomic_add tripped a torch-inductor CPU codegen assertion (23 errors in test_deep_eval_spin.py). Add an explicit disable_graph_lower() knob on DescrptDPA1 and set it structurally in SpinModel.__init__ (covers get_spin_model and both dpmodel/pt_expt deserialize paths, so it survives serialize round trips). The flag is not serialized; re-derived at construction. The neighbor_graph_method=legacy kwarg in call_common is kept as belt-and-braces.
for more information, see https://pre-commit.ci
📝 WalkthroughWalkthroughThis PR adds model-level and descriptor-level pair-type exclusion support across dense neighbor lists and neighbor-graph builders (dpmodel, pt, pt_expt), extends DPA1 with graph-native attention (including exclusion-aware masking), disables graph-lower for spin models via legacy routing, and wires equivalent pair-exclusion enforcement into the C++ pt_expt inference seam with new tests and a torch-version guard for graph tracing. ChangesPython pair exclusion and graph-native DPA1 attention
C++ pt_expt pair-exclusion ingestion seam
Estimated code review effort: 4 (Complex) | ~75 minutes Sequence Diagram(s)sequenceDiagram
participant DescrptDPA1
participant DescrptBlockSeAtten
participant CenterEdgePairs
participant SegmentSoftmax
DescrptDPA1->>DescrptBlockSeAtten: call_graph(graph, atype, static_nnei)
DescrptBlockSeAtten->>DescrptBlockSeAtten: apply_pair_exclusion(graph, atype, pair_excl)
DescrptBlockSeAtten->>CenterEdgePairs: center_edge_pairs(dst, edge_mask, static_nnei)
DescrptBlockSeAtten->>DescrptBlockSeAtten: _graph_attention(gg, rr, dst, sw_e)
DescrptBlockSeAtten->>SegmentSoftmax: segment_softmax(scores, query_edge, mask)
SegmentSoftmax-->>DescrptBlockSeAtten: normalized attention weights
DescrptBlockSeAtten-->>DescrptDPA1: grrg, rot_mat
sequenceDiagram
participant DeepPotPTExptInit
participant DeepPotPTExptCompute
participant PairExcludeTable
participant ExportedModel
DeepPotPTExptInit->>PairExcludeTable: buildPairExcludeTable(ntypes, pair_exclude_types)
DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusion(edge_index, edge_mask, atype)
DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusionNlist(nlist, atype_ext)
PairExcludeTable-->>ExportedModel: filtered edge_mask / nlist
ExportedModel-->>DeepPotPTExptCompute: energy, forces, virial
Possibly related PRs
Suggested labels: Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| atype = np.array([[0, 1, 0, 1, 0]], dtype=np.int64) | ||
| ds = DescrptDPA1(rcut=4.0, rcut_smth=0.5, sel=[200], ntypes=2, attn_layer=0) | ||
| ft = InvarFitting("energy", 2, ds.get_dim_out(), 1, mixed_types=True) | ||
| am = DPAtomicModel(ds, ft, type_map=["a", "b"]) |
| try: | ||
| import ase # noqa: F401 | ||
| except ImportError as e: | ||
| import unittest |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
source/tests/pt_expt/descriptor/test_dpa1.py (1)
117-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winPass
mappingtotorch.export.exporthere
test_exportablestill goes through the dense fallback becauseTestCaseSingleFrameWithNlistsetsnloc=3andnall=4, while this export call passes nomapping. That means the newexclude_typescase only covers the legacy dense exclusion mask, not the graph-nativeapply_pair_exclusionpath. Addmappingto the exported inputs so the parametrization exercises the intended route.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/pt_expt/descriptor/test_dpa1.py` around lines 117 - 147, test_exportable is missing the graph mapping input, so it still exercises the dense fallback instead of the graph-native exclusion path. Update the export setup in test_exportable to pass mapping into torch.export.export alongside dd0 and the existing inputs, using the test fixture’s mapping source so the exclude_types parametrization covers apply_pair_exclusion as intended.
🧹 Nitpick comments (7)
deepmd/dpmodel/model/make_model.py (1)
316-322: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDocstring update looks accurate; stale example nearby.
Matches the new DPA1 graph-native attention behavior (attention layers now included in graph eligibility). Note the unchanged
_call_common_graphexception message a few dozen lines below ("e.g. dpa1 attn_layer=0") is now a narrower example than what this docstring describes — consider updating that message text for consistency in a follow-up.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/dpmodel/model/make_model.py` around lines 316 - 322, The exception message in _call_common_graph is now too narrow compared with the updated graph-native attention behavior described in the nearby docstring. Update the message text in _call_common_graph so it reflects the broader DPA1 attention-layer graph eligibility instead of only referencing the old “e.g. dpa1 attn_layer=0” example, keeping the wording consistent with the behavior documented in make_model.py.source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py (1)
166-179: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winTest comment overstates what is actually verified.
The comment says this block will "verify excluded pairs contribute sw == 0" and "check ... call_graph sw channel", but
call_graphonly returns(grrg, rot_mat)— it has noswoutput — and the actual assertions only check for NaN/Inf, not the claimed masking behavior. The earlierout[4]vsref[4]comparison (lines 162-164) already indirectly validates exclusion parity forswvia the dense reference, so this block is largely redundant and its comment is misleading about intent/coverage. Either remove the stale comment or replace it with an assertion that actually validates zeroed contributions from excluded pairs (e.g., inspect the block'sedge_mask/sw_eviase_atten.call_graphdirectly).♻️ Suggested comment fix (minimal)
- if exclude_types: - # verify excluded pairs contribute sw == 0 in the dense reference - # (atype=[0,1,0,1] -> pairs (0,1) and (1,0) should be masked) - # sw shape: (nf, nloc, nnei, 1); just check the graph output is also 0 - # for excluded-pair edges by checking call_graph sw channel + if exclude_types: + # additional sanity check on the raw call_graph output (no sw + # channel here; exclusion parity for sw is already verified via + # out[4] vs ref[4] above). graph = from_dense_quartet(ext_coord, nlist, mapping, compact=False) atype_local = self.atype.reshape(-1) - grrg_g, rot_mat_g = dd.call_graph( + grrg_g, _rot_mat_g = dd.call_graph( graph, atype_local, type_embedding=dd.type_embedding.call() ) # no nan/inf in output with exclusions applied assert not np.any(np.isnan(grrg_g)) assert not np.any(np.isinf(grrg_g))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py` around lines 166 - 179, The comment in test_dpa1_call_graph_descriptor is misleading because this block does not verify excluded-pair sw masking; call_graph only returns grrg and rot_mat, and the current assertions only check NaN/Inf. Update the test by either removing/rephrasing the stale comment to match the actual coverage, or add a real assertion for zeroed excluded-pair contributions by checking the relevant sw/edge-mask path through se_atten.call_graph or the returned graph data. The earlier out[4] vs ref[4] comparison already covers sw parity, so keep this block focused on what it truly validates.source/tests/common/dpmodel/test_neighbor_graph_builder.py (1)
419-427: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueRedundant
import unittest.
unittestis already imported at the top of this file; the local re-import inside theexceptblock is unnecessary.🧹 Proposed cleanup
`@classmethod` def setUpClass(cls) -> None: try: import ase # noqa: F401 except ImportError as e: - import unittest - raise unittest.SkipTest("ase not installed") from e🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py` around lines 419 - 427, Remove the redundant local import inside test_neighbor_graph_builder’s setUpClass method: the file already imports unittest, so keep the ImportError handling but drop the inner import and use the existing unittest.SkipTest reference when ase is missing.Source: Linters/SAST tools
deepmd/dpmodel/utils/neighbor_graph/graph.py (1)
192-194: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDocstring overstates what
compact=Truereplaces.The parameter doc says
edge_index,edge_vec,angle_index,angle_maskare all replaced whencompact=True. In practiceangle_index/angle_maskare never touched by the compact branch — the function only reaches the compaction step after confirming both areNone(otherwise it raisesNotImplementedError). Listing them as "replaced" could mislead a future implementer extending angle-compaction support into thinking this path already handles it.📝 Suggested doc fix
graph - The neighbor graph; only ``edge_mask`` (and, if ``compact=True``, - ``edge_index``, ``edge_vec``, ``angle_index``, ``angle_mask``) are - replaced. + The neighbor graph; only ``edge_mask`` (and, if ``compact=True``, + ``edge_index`` and ``edge_vec``) are replaced. ``angle_index`` / + ``angle_mask`` are never touched — compaction is rejected outright + when either is present (see the ``compact`` behavior below).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/dpmodel/utils/neighbor_graph/graph.py` around lines 192 - 194, The docstring for the neighbor graph parameter overstates the effect of compact=True by implying that angle_index and angle_mask are also replaced. Update the documentation in graph.py to say compact mode only compacts edge_index and edge_vec (along with edge_mask), and make it clear that angle_index and angle_mask are not handled by this branch because the code path only proceeds when they are None.deepmd/dpmodel/utils/neighbor_graph/ase_builder.py (1)
154-163: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winPin device explicitly when converting
atypeforapply_pair_exclusion.
xp = array_api_compat.array_namespace(coord)followed byxp.asarray(atype)doesn't pin a device, unlike the analogouspair_exclwiring innv_graph_builder.pyandvesin_graph_builder.py, which both usetorch.as_tensor(atype, device=<coord's device>). Ifatypeisn't already a tensor on the same device ascoord(e.g. a CPU/numpyatypepaired with a CUDAcoord),xp.asarraywill silently produce a CPU tensor, which will then device-mismatch againstgraph.edge_index/edge_maskinsideapply_pair_exclusion.🔧 Suggested fix
if pair_excl is not None: import array_api_compat xp = array_api_compat.array_namespace(coord) - atype_flat = xp.reshape(xp.asarray(atype), (-1,)) + dev = array_api_compat.device(coord) + atype_flat = xp.reshape(xp.asarray(atype, device=dev), (-1,)) graph = apply_pair_exclusion(graph, atype_flat, pair_excl, compact=compact) return graph🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py` around lines 154 - 163, The atype conversion in the ASE neighbor graph path is not explicitly pinned to coord’s device, so apply_pair_exclusion can receive tensors on the wrong device. Update the ase_builder flow that builds graph and handles pair_excl to convert atype the same way as the nv_graph_builder and vesin_graph_builder paths: derive the device from coord and create atype on that device before flattening and passing it into apply_pair_exclusion. This keeps the device consistent with graph.edge_index and graph.edge_mask.source/tests/common/dpmodel/test_graph_atomic_parity.py (1)
318-344: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueDrop the unused model scaffolding.
amis never referenced here, soDescrptDPA1,InvarFitting, andDPAtomicModelcan be removed from this test.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/common/dpmodel/test_graph_atomic_parity.py` around lines 318 - 344, The test builds unused model scaffolding that is never referenced, so remove the dead setup from test_apply_pair_exclusion_idempotent. Eliminate the DescrptDPA1, InvarFitting, and DPAtomicModel construction (including the am variable) and keep only the inputs actually needed for extend_input_and_build_neighbor_list, from_dense_quartet, and apply_pair_exclusion. Make sure the test still covers both the empty and non-empty pair_exclude_types branches.Sources: Coding guidelines, Linters/SAST tools
source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc (1)
101-159: 🎯 Functional Correctness | 🔵 Trivial | 🏗️ Heavy liftTest coverage gap: LAMMPS
InputNlistingestion route not exercised for pair-exclusion.
check_against_ref/allTYPED_TESTs call the 6-argdp.compute(ener, force, virial, coord, atype, box), which routes toDeepPotPTExpt's standalone (no-nlist,build_nlist-based)compute()overload. The LAMMPS-styleInputNlistoverload — the actual pair-style ingestion seam, which cachesedge_index_tensor/firstneigh_tensoratago==0and recomputes geometry viacompactEdgeTensorsevery step before callingapplyPairExclusion/applyPairExclusionNlist— is never invoked here. A bug isolated to that branch's node/edge tensor construction (e.g. themulti_rank ? nall_real : nlocnode-count selection feedingapplyPairExclusion) wouldn't be caught by this suite.Consider adding a case that drives the
InputNlistoverload (mirroring the pattern intest_deeppot_dpa1_graph_ptexpt.cc) withpair_exclude_typesset, so both C++ ingestion entry points are validated against the Python reference.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc` around lines 101 - 159, Add coverage for the LAMMPS-style InputNlist ingestion path in this pair-exclusion test, because the current check_against_ref and TYPED_TESTs only exercise the 6-arg DeepPot::compute route. Introduce a test that calls the InputNlist compute overload on DeepPotPTExpt, using pair_exclude_types and matching the pattern used in test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node tensor caching and applyPairExclusion/applyPairExclusionNlist branch are validated against the Python reference.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/pt_expt/entrypoints/main.py`:
- Around line 571-576: Update the stale inline comment in main by removing the
“no type exclusion” restriction so it matches the current graph-eligibility
behavior. Keep the note aligned with `_model_uses_graph_lower` in training.py
and the nearby ValueError message: describe graph lower as opt-in for
graph-eligible models (dpa1 with concat tebd, attention layers, and supported
exclude_types) and preserve the rest of the fail-fast/per-atom-virial
explanation.
In `@doc/model/train-se-atten.md`:
- Around line 160-164: Update the pt_expt training doc sentence describing
graph-eligible descriptors so it no longer says descriptor-level exclude_types
disqualifies the carry-all neighbor-graph path. Use the surrounding
se_atten/neighbor_graph_method explanation to state that mixed-type descriptors
with tebd_input_mode "concat" and no descriptor-level compression remain
graph-eligible, while exclude_types is not a blocking condition anymore. Keep
the dense-vs-graph parity note tied to smooth_type_embedding and attn_layer, but
make the eligibility rule consistent with the current behavior exercised by
test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().
---
Outside diff comments:
In `@source/tests/pt_expt/descriptor/test_dpa1.py`:
- Around line 117-147: test_exportable is missing the graph mapping input, so it
still exercises the dense fallback instead of the graph-native exclusion path.
Update the export setup in test_exportable to pass mapping into
torch.export.export alongside dd0 and the existing inputs, using the test
fixture’s mapping source so the exclude_types parametrization covers
apply_pair_exclusion as intended.
---
Nitpick comments:
In `@deepmd/dpmodel/model/make_model.py`:
- Around line 316-322: The exception message in _call_common_graph is now too
narrow compared with the updated graph-native attention behavior described in
the nearby docstring. Update the message text in _call_common_graph so it
reflects the broader DPA1 attention-layer graph eligibility instead of only
referencing the old “e.g. dpa1 attn_layer=0” example, keeping the wording
consistent with the behavior documented in make_model.py.
In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py`:
- Around line 154-163: The atype conversion in the ASE neighbor graph path is
not explicitly pinned to coord’s device, so apply_pair_exclusion can receive
tensors on the wrong device. Update the ase_builder flow that builds graph and
handles pair_excl to convert atype the same way as the nv_graph_builder and
vesin_graph_builder paths: derive the device from coord and create atype on that
device before flattening and passing it into apply_pair_exclusion. This keeps
the device consistent with graph.edge_index and graph.edge_mask.
In `@deepmd/dpmodel/utils/neighbor_graph/graph.py`:
- Around line 192-194: The docstring for the neighbor graph parameter overstates
the effect of compact=True by implying that angle_index and angle_mask are also
replaced. Update the documentation in graph.py to say compact mode only compacts
edge_index and edge_vec (along with edge_mask), and make it clear that
angle_index and angle_mask are not handled by this branch because the code path
only proceeds when they are None.
In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc`:
- Around line 101-159: Add coverage for the LAMMPS-style InputNlist ingestion
path in this pair-exclusion test, because the current check_against_ref and
TYPED_TESTs only exercise the 6-arg DeepPot::compute route. Introduce a test
that calls the InputNlist compute overload on DeepPotPTExpt, using
pair_exclude_types and matching the pattern used in
test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node tensor caching and
applyPairExclusion/applyPairExclusionNlist branch are validated against the
Python reference.
In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py`:
- Around line 166-179: The comment in test_dpa1_call_graph_descriptor is
misleading because this block does not verify excluded-pair sw masking;
call_graph only returns grrg and rot_mat, and the current assertions only check
NaN/Inf. Update the test by either removing/rephrasing the stale comment to
match the actual coverage, or add a real assertion for zeroed excluded-pair
contributions by checking the relevant sw/edge-mask path through
se_atten.call_graph or the returned graph data. The earlier out[4] vs ref[4]
comparison already covers sw parity, so keep this block focused on what it truly
validates.
In `@source/tests/common/dpmodel/test_graph_atomic_parity.py`:
- Around line 318-344: The test builds unused model scaffolding that is never
referenced, so remove the dead setup from test_apply_pair_exclusion_idempotent.
Eliminate the DescrptDPA1, InvarFitting, and DPAtomicModel construction
(including the am variable) and keep only the inputs actually needed for
extend_input_and_build_neighbor_list, from_dense_quartet, and
apply_pair_exclusion. Make sure the test still covers both the empty and
non-empty pair_exclude_types branches.
In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py`:
- Around line 419-427: Remove the redundant local import inside
test_neighbor_graph_builder’s setUpClass method: the file already imports
unittest, so keep the ImportError handling but drop the inner import and use the
existing unittest.SkipTest reference when ase is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f66e001c-837b-4b14-a6b8-84d867cc8a56
📒 Files selected for processing (48)
deepmd/dpmodel/array_api.pydeepmd/dpmodel/atomic_model/base_atomic_model.pydeepmd/dpmodel/descriptor/dpa1.pydeepmd/dpmodel/model/make_model.pydeepmd/dpmodel/model/spin_model.pydeepmd/dpmodel/utils/__init__.pydeepmd/dpmodel/utils/default_neighbor_list.pydeepmd/dpmodel/utils/neighbor_graph/__init__.pydeepmd/dpmodel/utils/neighbor_graph/ase_builder.pydeepmd/dpmodel/utils/neighbor_graph/builder.pydeepmd/dpmodel/utils/neighbor_graph/env.pydeepmd/dpmodel/utils/neighbor_graph/graph.pydeepmd/dpmodel/utils/neighbor_graph/pairs.pydeepmd/dpmodel/utils/neighbor_graph/segment.pydeepmd/dpmodel/utils/neighbor_list.pydeepmd/dpmodel/utils/nlist.pydeepmd/pt/utils/nv_nlist.pydeepmd/pt_expt/entrypoints/main.pydeepmd/pt_expt/model/make_model.pydeepmd/pt_expt/train/training.pydeepmd/pt_expt/utils/nv_graph_builder.pydeepmd/pt_expt/utils/serialization.pydeepmd/pt_expt/utils/vesin_graph_builder.pydeepmd/pt_expt/utils/vesin_neighbor_list.pydoc/model/train-se-atten.mdsource/api_cc/include/DeepPotPTExpt.hsource/api_cc/include/commonPT.hsource/api_cc/src/DeepPotPTExpt.ccsource/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.ccsource/install/test_cc_local.shsource/tests/common/dpmodel/test_apply_pair_exclusion.pysource/tests/common/dpmodel/test_apply_pair_exclusion_nlist.pysource/tests/common/dpmodel/test_center_edge_pairs.pysource/tests/common/dpmodel/test_dpa1_call_graph_block.pysource/tests/common/dpmodel/test_dpa1_call_graph_descriptor.pysource/tests/common/dpmodel/test_dpa1_graph_attention_parity.pysource/tests/common/dpmodel/test_graph_atomic_parity.pysource/tests/common/dpmodel/test_neighbor_graph_builder.pysource/tests/common/dpmodel/test_segment_softmax.pysource/tests/common/dpmodel/test_spin_model_legacy_routing.pysource/tests/infer/gen_dpa1_pairexcl.pysource/tests/pt_expt/descriptor/test_dpa1.pysource/tests/pt_expt/infer/test_graph_deepeval.pysource/tests/pt_expt/model/test_dpa1_graph_lower.pysource/tests/pt_expt/model/test_linear_model.pysource/tests/pt_expt/utils/test_graph_pt2_metadata.pysource/tests/pt_expt/utils/test_neighbor_list.pysource/tests/pt_expt/utils/test_vesin_graph_builder.py
| # The graph lower is opt-in and only valid for graph-eligible models | ||
| # (dpa1 with concat tebd and no type exclusion; attention layers included | ||
| # -- the carry-all pair enumeration exports via unbacked SymInts). Fail | ||
| # fast with a clear message rather than emitting a broken .pt2. Enable the | ||
| # per-atom virial for the graph form -- it is near-free there (one extra | ||
| # scatter off the single shared backward). |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Stale comment: "no type exclusion" contradicts this PR's own change.
This inline comment still describes graph eligibility as "dpa1 with concat tebd and no type exclusion", but the PR's whole premise (per stack context: "dpa1 graph path supports exclude_types... removing the prior eligibility gate") is that exclude_types is now supported on the graph path. The companion predicate _model_uses_graph_lower in training.py (and its docstring) no longer mentions any exclusion restriction, and the ValueError message a few lines below (Line 587) also drops the exclusion caveat — only this comment is stale.
📝 Proposed fix
- # The graph lower is opt-in and only valid for graph-eligible models
- # (dpa1 with concat tebd and no type exclusion; attention layers included
- # -- the carry-all pair enumeration exports via unbacked SymInts). Fail
- # fast with a clear message rather than emitting a broken .pt2. Enable the
- # per-atom virial for the graph form -- it is near-free there (one extra
- # scatter off the single shared backward).
+ # The graph lower is opt-in and only valid for graph-eligible models
+ # (dpa1 with concat tebd, incl. attention layers and exclude_types --
+ # the carry-all pair enumeration exports via unbacked SymInts). Fail
+ # fast with a clear message rather than emitting a broken .pt2. Enable the
+ # per-atom virial for the graph form -- it is near-free there (one extra
+ # scatter off the single shared backward).📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # The graph lower is opt-in and only valid for graph-eligible models | |
| # (dpa1 with concat tebd and no type exclusion; attention layers included | |
| # -- the carry-all pair enumeration exports via unbacked SymInts). Fail | |
| # fast with a clear message rather than emitting a broken .pt2. Enable the | |
| # per-atom virial for the graph form -- it is near-free there (one extra | |
| # scatter off the single shared backward). | |
| # The graph lower is opt-in and only valid for graph-eligible models | |
| # (dpa1 with concat tebd, incl. attention layers and exclude_types -- | |
| # the carry-all pair enumeration exports via unbacked SymInts). Fail | |
| # fast with a clear message rather than emitting a broken .pt2. Enable the | |
| # per-atom virial for the graph form -- it is near-free there (one extra | |
| # scatter off the single shared backward). |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/pt_expt/entrypoints/main.py` around lines 571 - 576, Update the stale
inline comment in main by removing the “no type exclusion” restriction so it
matches the current graph-eligibility behavior. Keep the note aligned with
`_model_uses_graph_lower` in training.py and the nearby ValueError message:
describe graph lower as opt-in for graph-eligible models (dpa1 with concat tebd,
attention layers, and supported exclude_types) and preserve the rest of the
fail-fast/per-atom-virial explanation.
| In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. | ||
| The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`. | ||
| When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path. | ||
| Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly. | ||
|
|
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Stale "no exclude_types" eligibility gate contradicts this PR's own change.
This new sentence states that descriptor-level exclude_types disqualifies a descriptor from the graph-eligible/carry-all path in pt_expt. But this cohort explicitly removes that gate: test_exclude_types_graph_eligible_and_parity in source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py asserts dd.uses_graph_lower() is True when exclude_types is set, with the comment "gate: with any exclude list the descriptor must now be graph-eligible." The PR objective also states the eligibility gate for exclude_types was removed. This doc line will mislead users about which models route through the carry-all graph path.
📝 Proposed fix
-In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list.
+In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. Descriptor-level `exclude_types` no longer disqualifies a descriptor from this path.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. | |
| The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`. | |
| When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path. | |
| Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly. | |
| In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. Descriptor-level `exclude_types` no longer disqualifies a descriptor from this path. | |
| The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`. | |
| When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path. | |
| Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@doc/model/train-se-atten.md` around lines 160 - 164, Update the pt_expt
training doc sentence describing graph-eligible descriptors so it no longer says
descriptor-level exclude_types disqualifies the carry-all neighbor-graph path.
Use the surrounding se_atten/neighbor_graph_method explanation to state that
mixed-type descriptors with tebd_input_mode "concat" and no descriptor-level
compression remain graph-eligible, while exclude_types is not a blocking
condition anymore. Keep the dense-vs-graph parity note tied to
smooth_type_embedding and attn_layer, but make the eligibility rule consistent
with the current behavior exercised by
test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #5733 +/- ##
==========================================
- Coverage 81.29% 81.20% -0.09%
==========================================
Files 990 992 +2
Lines 111019 111372 +353
Branches 4235 4250 +15
==========================================
+ Hits 90252 90444 +192
- Misses 19243 19401 +158
- Partials 1524 1527 +3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Pair
exclude_typesas a canonical NeighborGraph transform (decision #18)Makes pair-type exclusion a single canonical transform applied once at the neighbor-list/graph build seam, and uses it to add descriptor-level
exclude_typessupport to the dpa1 graph path (removing that eligibility gate), consistently across dpmodel, pt_expt, jax, and the C++ inference path.What changed
apply_pair_exclusion(graph, atype, pair_excl, *, compact=False)indeepmd/dpmodel/utils/neighbor_graph/: ANDsPairExcludeMask.build_edge_exclude_maskintograph.edge_mask.compact=Falseis mask-only (shape-static, export/AOTI-safe);compact=Truedrops masked edges (eager-only; raises on angle-carrying graphs). Idempotent.base_atomic_model) refactored to call the transform for model-levelpair_exclude_types; stays as idempotent backstop.exclude_types: theNotImplementedErrorand theuses_graph_lower()exclude condition are removed; exclusion applied insideDescrptBlockSeAtten.call_graphbefore the segment sums. Graph-vs-dense parity at non-binding sel is exact (rtol=atol=1e-12, attn_layer 0 and 2, type_one_side both).build_neighbor_graphand the pt_expt graph builders (dense/ase/vesin/nv) gain optionalpair_excl/compactwith default post-search application;_call_common_graphpasses model-level excludes at build time; oracle set-equality tests per available builder.apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl)extracted from the inline seam code;build_neighbor_list+ Vesin/Nv/Default strategies gainpair_excl;return_mode='edges'+pair_exclfails fast.buildPairExcludeTable/applyPairExclusion/applyPairExclusionNlistinsource/api_cc/include/commonPT.h, mirroring the Python transforms (same arg order/variable names, cross-referenced docs);pair_exclude_typesserialized into.pt2metadata.jsonand rebuilt inDeepPotPTExpt::init. Exclusion is compiled into the exported graph (traced seam); the C++ call is an idempotent backstop. New gtest (8 tests) vs Python DeepEval reference at 1e-10.apply_pair_exclusionuseslogical_and+ bool cast (array_api_strict rejectedbool*bool), caught by the jax/strict consistency rows now traversing the graph path.Known limitations
pair_exclpath has no local oracle test (CUDA-only); to be validated on a GPU box.build_edge_exclude_maskstill returns int32 (bool cast at call sites; follow-up).🤖 Generated with Claude Code
Spin routing
Spin models auto-inject
exclude_types(virtual/placeholder types) into their backbone descriptor; before this PR that condition accidentally kept spin on the dense path. With exclude_types now graph-eligible, spin backbones flipped onto the carry-all graph route, which (a) diverges from the sel-capped reference on sel-binding spin systems and (b) trips a torch-inductor scatter codegen assertion during spin.pt2export. Fixed explicitly:DescrptDPA1.disable_graph_lower()(not serialized; re-derived structurally) is set inSpinModel.__init__— the single choke point coveringget_spin_model,SpinModel.deserialize, and the pt_expt spin classes — plus a belt-and-bracesneighbor_graph_method="legacy"atSpinModel.call_common. Regression tests pin the routing and its serialize→deserialize survival; the full spin export suite (23) and spin checkpoint-interop suite (12) are green.Verification
Full pt_expt suite: 1196 passed / 39 skipped / 3 failed — the 3 failures (
test_dpa4_freeze_to_pt2,test_dpa4_deep_eval_*) are byte-identical on the base commit (pre-existing torch-inductor dpa4 export issue on this box, unrelated). dpmodel exclusion suites 69 passed; consistency dpa1 99 passed/63 skipped (incl. jax + array_api_strict exclude rows); C++Dpa1PairExclgtest 8/8.Summary by CodeRabbit
New Features
Bug Fixes