refactor: pair exclude_types as canonical NeighborGraph transform; dpa1 graph path supports exclude_types (decision #18) by wanghan-iapcm · Pull Request #5733 · deepmodeling/deepmd-kit

wanghan-iapcm · 2026-07-05T00:57:46Z

Pair `exclude_types` as a canonical NeighborGraph transform (decision #18)

Stacked on #5715 (NeighborGraph PR-D). Only the commits after 6fc45bd26 belong to this PR; will rebase onto master once #5715 merges.

Makes pair-type exclusion a single canonical transform applied once at the neighbor-list/graph build seam, and uses it to add descriptor-level exclude_types support to the dpa1 graph path (removing that eligibility gate), consistently across dpmodel, pt_expt, jax, and the C++ inference path.

What changed

Canonical graph transform apply_pair_exclusion(graph, atype, pair_excl, *, compact=False) in deepmd/dpmodel/utils/neighbor_graph/: ANDs PairExcludeMask.build_edge_exclude_mask into graph.edge_mask. compact=False is mask-only (shape-static, export/AOTI-safe); compact=True drops masked edges (eager-only; raises on angle-carrying graphs). Idempotent.
Atomic-model seam (base_atomic_model) refactored to call the transform for model-level pair_exclude_types; stays as idempotent backstop.
dpa1 graph path supports descriptor-level exclude_types: the NotImplementedError and the uses_graph_lower() exclude condition are removed; exclusion applied inside DescrptBlockSeAtten.call_graph before the segment sums. Graph-vs-dense parity at non-binding sel is exact (rtol=atol=1e-12, attn_layer 0 and 2, type_one_side both).
Build-time exclusion (dispatcher): build_neighbor_graph and the pt_expt graph builders (dense/ase/vesin/nv) gain optional pair_excl/compact with default post-search application; _call_common_graph passes model-level excludes at build time; oracle set-equality tests per available builder.
Dense-nlist port: apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl) extracted from the inline seam code; build_neighbor_list + Vesin/Nv/Default strategies gain pair_excl; return_mode='edges' + pair_excl fails fast.
C++ twin: buildPairExcludeTable / applyPairExclusion / applyPairExclusionNlist in source/api_cc/include/commonPT.h, mirroring the Python transforms (same arg order/variable names, cross-referenced docs); pair_exclude_types serialized into .pt2 metadata.json and rebuilt in DeepPotPTExpt::init. Exclusion is compiled into the exported graph (traced seam); the C++ call is an idempotent backstop. New gtest (8 tests) vs Python DeepEval reference at 1e-10.
Fix: apply_pair_exclusion uses logical_and + bool cast (array_api_strict rejected bool*bool), caught by the jax/strict consistency rows now traversing the graph path.

Known limitations

nv builder's pair_excl path has no local oracle test (CUDA-only); to be validated on a GPU box.
Input statistics remain on the dense path (graph-native stats is a separate follow-on).
smooth_type_embedding + exclude parity untestable at 1e-12 (pre-existing dense sel-padding divergence, feat(dpmodel): graph-native se_atten attention (NeighborGraph PR-D) #5715).
Multi-rank (mpirun) exclusion shares the same seam but has no dedicated gtest.
build_edge_exclude_mask still returns int32 (bool cast at call sites; follow-up).

🤖 Generated with Claude Code

Spin routing

Spin models auto-inject exclude_types (virtual/placeholder types) into their backbone descriptor; before this PR that condition accidentally kept spin on the dense path. With exclude_types now graph-eligible, spin backbones flipped onto the carry-all graph route, which (a) diverges from the sel-capped reference on sel-binding spin systems and (b) trips a torch-inductor scatter codegen assertion during spin .pt2 export. Fixed explicitly: DescrptDPA1.disable_graph_lower() (not serialized; re-derived structurally) is set in SpinModel.__init__ — the single choke point covering get_spin_model, SpinModel.deserialize, and the pt_expt spin classes — plus a belt-and-braces neighbor_graph_method="legacy" at SpinModel.call_common. Regression tests pin the routing and its serialize→deserialize survival; the full spin export suite (23) and spin checkpoint-interop suite (12) are green.

Verification

Full pt_expt suite: 1196 passed / 39 skipped / 3 failed — the 3 failures (test_dpa4_freeze_to_pt2, test_dpa4_deep_eval_*) are byte-identical on the base commit (pre-existing torch-inductor dpa4 export issue on this box, unrelated). dpmodel exclusion suites 69 passed; consistency dpa1 99 passed/63 skipped (incl. jax + array_api_strict exclude rows); C++ Dpa1PairExcl gtest 8/8.

Summary by CodeRabbit

New Features
- Added support for pair-type exclusions across neighbor-list and neighbor-graph generation.
- Enabled graph-based attention workflows for more descriptor configurations, including attention layers.
- Added export checks to better guide compatible graph tracing on supported torch versions.
Bug Fixes
- Improved consistency between dense and graph-based results, including attention and exclusion handling.
- Fixed spin-model routing to use the supported legacy path for inference.
- Reduced numerical mismatches in graph exports and improved behavior for empty or masked neighbor cases.

…ftmax Built on the existing xp_maximum_at (no new array_api helper needed). Part of NeighborGraph PR-D (graph-native attention).

Segment-based (global (E,E) boolean deliberately avoided): compact eager form for carry-all graphs + shape-static nonzero-free form for the center-major static layout (jit/export/make_fx traceable). Part of NeighborGraph PR-D; PR-E angles reuse (unordered, no-self).

…r > 0) DescrptBlockSeAtten.call_graph grows _graph_attention: the dense per-center (nnei, nnei) attention square becomes the edge-pair axis (center_edge_pairs, ordered + self-included), softmax over keys becomes segment_softmax grouped by the query edge. Op-for-op mirror of GatedAttentionLayer.call (head_dim QKV slicing, normalize q/k/v, temperature/scaling, smooth shift trick, post-softmax sw and dotr weighting, residual + LayerNorm per layer). - shape-static adapter path (static_nnei threaded from the dense call adapter): bit-exact vs the dense body, rtol 1e-12, full flag matrix (attn_layer 1/2 x dotr x smooth x normalize x temperature, binding and non-binding sel). - carry-all (compact) graphs: exact for non-smooth; for smooth the dense branch keeps sel-padding slots in the softmax denominator (dense output is sel-DEPENDENT, up to ~1e-4) — the carry-all form drops those phantom terms by design (user decision 2026-07-03), pinned by a clean-divergence test. - edge_env_mat(return_sw=True) exposes the per-edge switch (zeroed on padding) for the smooth branch. - uses_graph_lower: attention configs are now graph-eligible (concat tebd, no exclude_types still required).

…ial parity - test_make_fx_graph_attn: graph forward + autograd.grad at attn_layer=2 traces under make_fx for BOTH smooth branches (the shape-static center_edge_pairs form is nonzero-free) — required since pt_expt compiled training routes eligible models through the graph lower. - model-level graph-vs-legacy lower parity now parametrized over attn_layer {0, 2} (energy/force/virial/atom_virial, 1e-12 CPU). - eligibility pins: attention+concat is graph-eligible; se_atten_v2 (tebd_input_mode='strip') correctly stays dense (strip = later PR; the plan's 'se_atten_v2 inherits for free' did not hold).

- linear-model weight tests: pin smooth_type_embedding=False — the standard (graph-routed, carry-all) and linear (graph-ineligible, dense) submodels otherwise differ by the accepted smooth-attention denominator divergence (~1e-6), which is a route artifact, not a weight-combination bug. - new binding-sel sanity: carry-all graph attention diverges from the sel-truncated dense path when sel binds (spec decision deepmodeling#17).

…rity) neighbor_list=None now takes the carry-all graph default for eligible attention models; explicit World-1 builders take the legacy dense route. With smooth attention the two routes differ by design (PR-D), so the route-equivalence tests pin smooth_type_embedding=False.

for more information, see https://pre-commit.ci

…SymInts The compact (carry-all) pair enumeration used nonzero + tensor-repeat with Python control flow on their data-dependent sizes, so the attention graph lower failed torch.export with GuardOnDataDependentSymNode. Register those sizes as unbacked SymInt sizes (new torch-free xp_hint_dynamic_size shim, no-op for numpy/jax), take the empty-input fast paths only on concrete int shapes, build iotas via cumsum(ones)-1 (the array_api_compat arange wrapper branches on the length in Python), and skip the policy-compression nonzero when no filter applies (include_self and ordered - the attention default). Eager numpy/torch results are unchanged.

…> 0) With the compact pair enumeration unbacked-SymInt-traceable, the carry-all attention graph lower now exports to a graph-form .pt2 unchanged in ABI (same 5-tensor NeighborGraph schema, dynamic edge axis) and with carry-all semantics preserved (no sel truncation, unlike the dense-adapter nlist-form export). Update the stale freeze-gate message (attention is eligible), add a symbolic-trace merge gate at attn_layer in {0,2}, parametrize the DeepEval graph .pt2 fixture over attn_layer (both artifacts: dynamic sizes, PBC and non-PBC, 1e-10 vs the sel-capped dense reference at non-binding sel), and add a single-atom zero-real-edge runtime test (the R==0 extreme of the unbacked sizes).

The dense call is wrapped in @cast_precision, but the graph route's only float input (edge_vec) lives inside the NeighborGraph dataclass where the decorator cannot see it, so non-global-precision models (e.g. float32) crashed with a double-vs-float matmul on the graph route while the dense route worked. Cast edge_vec down to the descriptor precision on entry and the outputs back to the caller's dtype on exit (differentiable, so the model-level force autograd is unaffected). Add an fp32 graph-vs-dense route parity test at attn_layer 0 and 2.

…nt-wide NaN A masked entry whose raw logit exceeds the unmasked per-segment max by more than the exp overflow threshold (~709 fp64 / ~88 fp32) overflowed exp() to inf, and the post-hoc inf * 0 mask multiply produced nan, which the denominator sum then spread across the entire segment. Shift data_for_max (masked entries already -inf, exp(-inf) == 0 exactly) instead of the raw data; the mask multiply stays as a defensive no-op. Regression test with a masked logit 1e5 above the unmasked max. Addresses CodeRabbit review.

…sh eligibility wording Address OutisLi review on deepmodeling#5715: - Notes on DescrptDPA1.call_graph (+ pointers on uses_graph_lower and the freeze lower_kind docstring): for smooth_type_embedding=True the carry-all graph attention intentionally drops the dense layout's sel-padding terms from the softmax denominator - sel-independent semantics that differ from the legacy dense lower by up to ~1e-4; bit-tight dense parity holds for the non-smooth branch and for the static_nnei dense-adapter realization. - Update the stale 'dpa1 attn_layer == 0' eligibility wording (freeze() docstring, training-path predicate/comment, dpmodel/pt_expt graph-lower docstrings) to the actual contract: mixed-types dpa1/se_atten with concat type embedding and no exclude_types, attention layers included; call_common docs no longer imply unconditional dense parity.

…he pt_expt graph default and legacy escape hatch

deepmodeling#18)

…ct=True) when angle fields are present

…exclusion; drop eligibility gate

…layer and type_one_side Add @pytest.mark.parametrize for attn_layer in [0, 2] and type_one_side in [False, True] to test_exclude_types_graph_parity. Also adds the missing parity assertion (graph vs dense at rtol=atol=1e-12, non-binding sel). Uses smooth_type_embedding=False to avoid the known by-design softmax denominator divergence in the dense smooth path.

Descriptor-level exclude_types is now graph-eligible (fully supported via apply_pair_exclusion). Remove 'no exclude_types' from four docstrings/error messages that list graph eligibility conditions. The gate condition was removed in the NeighborGraph implementation; only tebd_input_mode='concat' restriction remains. - deepmd/pt_expt/entrypoints/main.py: freeze_model docstring (~502) + ValueError message (~589) - deepmd/dpmodel/model/make_model.py: forward docstring (~317) - deepmd/pt_expt/train/training.py: _model_uses_graph_lower docstring (~591)

build_neighbor_graph, build_neighbor_graph_ase, build_neighbor_graph_vesin, build_neighbor_graph_nv all gain optional keyword-only pair_excl=None and compact=False; default path = geometric search then apply_pair_exclusion. _call_common_graph in pt_expt make_model wires atomic_model.pair_excl to every builder call so model-level pair_exclude_types is applied at build time (the atomic-model seam backstop stays as idempotent identity). Oracle tests assert set-equality of the valid-edge set between builder(pair_excl=X) and builder() + separate apply_pair_exclusion(X), for dense (2 id + 3 oracle cases) and ase (2 cases); vesin gets 4 new tests (2 identity, 2 oracle, parametrized over periodic).

…y parity

…test

…n for array_api_strict compat

…_neighbor_list/strategies (A4) Extract the inline pair-exclusion from base_atomic_model.forward_common_atomic into apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl) in nlist.py. The seam is refactored to call the named helper (idempotent backstop remains). Add pair_excl=None to: - build_neighbor_list (dpmodel, nlist.py) - DefaultNeighborList.build - VesinNeighborList.build (pt_expt) - NvNeighborList.build (pt; CUDA-only, API parity) - NeighborList base class signature 12 new unit tests covering: None/empty identity, excluded pairs -> -1, -1 slot preservation, ghost-atom types, idempotence, torch namespace smoke, build_neighbor_list oracle equivalence, DefaultNeighborList oracle, VesinNeighborList oracle. NvNeighborList CUDA-only (not validated locally).

…cross-refs Serialize the model-level pair_exclude_types into metadata.json so the C++ DeepPotPTExpt loader can rebuild the flat (ntypes+1)^2 keep table and re-apply the same pair-exclusion transform at the ingestion seam. Add See Also cross-references between the Python transforms (apply_pair_exclusion, apply_pair_exclusion_nlist) and their forthcoming C++ twins.

Add buildPairExcludeTable / applyPairExclusion (graph) / applyPairExclusionNlist (dense) in commonPT.h, structurally mirroring the Python transforms (apply_pair_exclusion, apply_pair_exclusion_nlist) with the same argument order and variable names (type_ij, keep). DeepPotPTExpt::init rebuilds the flat (ntypes+1)^2 keep table from the pair_exclude_types metadata; the seam applies it before every model call (graph and dense, LAMMPS and standalone) as an idempotent backstop to the exclusion already compiled into the .pt2.

gen_dpa1_pairexcl.py exports graph-route and dense-route DPA1(attn_layer=0) .pt2 models with model-level pair_exclude_types=[[0,1]] plus a no-exclusion baseline; references come from Python DeepEval of each model. test_deeppot_dpa1_pairexcl_ptexpt.cc validates both C++ ingestion routes (applyPairExclusion / applyPairExclusionNlist) against the Python references at 1e-10 (fp64), cross-checks graph==dense, and proves the exclusion is active by comparing against the empty-table baseline. Wired into test_cc_local.sh. 8/8 tests pass locally.

…e and forward_common_atomic_graph

The SpinModel backbone (dpa1 attn_layer=0) was being routed to the carry-all graph path by the default-flip (decision deepmodeling#17) introduced in the graph-pair-exclude branch. Virtual/placeholder types injected by get_spin_model double the atom density, making the system sel-binding; the carry-all graph keeps neighbors the capped dense nlist discards, so SpinModel.call_common diverged from call_common_lower (the dense lower used by pt_expt eager inference) by ~2e-6. Fix: pass neighbor_graph_method='legacy' when SpinModel.call_common invokes the backbone, forcing the dense-nlist path until spin-graph support is explicitly implemented. The graph .pt2 export path already has a fail-fast guard for spin (serialization.py:963). Adds a regression test pinning that: - SpinModel.call_common total energy equals backbone(legacy) on the same spin-doubled inputs (exact bit-identity). - The backbone in graph mode gives a DIFFERENT energy at this density, confirming the fixture exercises the diverging regime.

The pair-exclude branch removed exclude_types from DescrptDPA1.uses_graph_lower(), which previously kept spin backbones (they inject exclude_types) on the dense path. As a side effect the descriptor-level dispatch routed spin .pt2/.pte export through the graph kernel, whose scatter/atomic_add tripped a torch-inductor CPU codegen assertion (23 errors in test_deep_eval_spin.py). Add an explicit disable_graph_lower() knob on DescrptDPA1 and set it structurally in SpinModel.__init__ (covers get_spin_model and both dpmodel/pt_expt deserialize paths, so it survives serialize round trips). The flag is not serialized; re-derived at construction. The neighbor_graph_method=legacy kwarg in call_common is kept as belt-and-braces.

for more information, see https://pre-commit.ci

coderabbitai · 2026-07-05T01:01:55Z

📝 Walkthrough

Walkthrough

This PR adds model-level and descriptor-level pair-type exclusion support across dense neighbor lists and neighbor-graph builders (dpmodel, pt, pt_expt), extends DPA1 with graph-native attention (including exclusion-aware masking), disables graph-lower for spin models via legacy routing, and wires equivalent pair-exclusion enforcement into the C++ pt_expt inference seam with new tests and a torch-version guard for graph tracing.

Changes

Python pair exclusion and graph-native DPA1 attention

Layer / File(s)	Summary
Array-API and segment primitives `deepmd/dpmodel/array_api.py`, `deepmd/dpmodel/utils/neighbor_graph/segment.py`, `source/tests/common/dpmodel/test_segment_softmax.py`	Adds `xp_hint_dynamic_size` for export size hints and `segment_max`/`segment_softmax` mask-aware reductions with tests.
Center edge-pair enumeration `deepmd/dpmodel/utils/neighbor_graph/pairs.py`, `source/tests/common/dpmodel/test_center_edge_pairs.py`	Implements `center_edge_pairs` in compact and shape-static modes for attention edge enumeration.
NeighborGraph pair-exclusion primitive `deepmd/dpmodel/utils/neighbor_graph/graph.py`, `.../env.py`, `.../__init__.py`, `source/tests/common/dpmodel/test_apply_pair_exclusion.py`	Adds `apply_pair_exclusion`, `edge_env_mat(return_sw=...)`, module exports, and masking/compaction/idempotence tests.
Dense neighbor-list pair-exclusion `deepmd/dpmodel/utils/{nlist,neighbor_list,default_neighbor_list,__init__}.py`, `deepmd/pt/utils/nv_nlist.py`, `deepmd/pt_expt/utils/vesin_neighbor_list.py`, `source/tests/common/dpmodel/test_apply_pair_exclusion_nlist.py`	Adds `apply_pair_exclusion_nlist` and `pair_excl` parameters across `build_neighbor_list` and neighbor-list builder classes.
Neighbor-graph builder wiring `deepmd/dpmodel/utils/neighbor_graph/{ase_builder,builder}.py`, `deepmd/pt_expt/utils/{nv_graph_builder,vesin_graph_builder}.py`, `source/tests/{common/dpmodel/test_neighbor_graph_builder.py, pt_expt/utils/test_vesin_graph_builder.py}`	Threads `pair_excl`/`compact` through neighbor-graph builders with oracle-equivalence tests.
Atomic-model exclusion integration `deepmd/dpmodel/atomic_model/base_atomic_model.py`, `source/tests/common/dpmodel/test_graph_atomic_parity.py`	Replaces inline exclusion masking with `apply_pair_exclusion_nlist`/`apply_pair_exclusion` calls.
DPA1 graph-native attention `deepmd/dpmodel/descriptor/dpa1.py`, `deepmd/dpmodel/model/make_model.py`, `doc/model/train-se-atten.md`, `source/tests/common/dpmodel/test_dpa1_*`, `source/tests/pt_expt/descriptor/test_dpa1.py`	Adds `_graph_lower_disabled` eligibility, `static_nnei`, `_graph_attention`/`_graph_attention_one_layer`, and exclusion-aware `sw` masking.
SpinModel legacy routing `deepmd/dpmodel/model/spin_model.py`, `source/tests/common/dpmodel/test_spin_model_legacy_routing.py`	Disables graph-lower on spin backbone descriptor and forces `neighbor_graph_method="legacy"`.
pt_expt wiring and torch-version guard `deepmd/pt_expt/{entrypoints/main.py, model/make_model.py, train/training.py, utils/serialization.py}`, `source/tests/pt_expt/utils/test_graph_pt2_metadata.py`	Forwards `pair_excl` into builders, adds `check_graph_trace_torch_version` guard, and exports `pair_exclude_types` metadata.
pt_expt graph-lower parity tests `source/tests/pt_expt/{model/test_dpa1_graph_lower.py, infer/test_graph_deepeval.py, model/test_linear_model.py, utils/test_neighbor_list.py}`	Extends parity tests for attention/exclusion, single-atom edges, float32 route, and pins `smooth_type_embedding=False`.

C++ pt_expt pair-exclusion ingestion seam

Layer / File(s)	Summary
Pair-exclusion table and application `source/api_cc/include/{DeepPotPTExpt.h, commonPT.h}`, `source/api_cc/src/DeepPotPTExpt.cc`	Adds `pair_exclude_table_`, `buildPairExcludeTable`/`applyPairExclusion`/`applyPairExclusionNlist`, applied in graph and dense compute paths.
C++ test suite and model generator `source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc`, `source/tests/infer/gen_dpa1_pairexcl.py`, `source/install/test_cc_local.sh`	Adds a model generator producing graph/nlist/baseline `.pt2` artifacts and a GoogleTest suite validating ingestion-route parity and exclusion activation.

Estimated code review effort: 4 (Complex) | ~75 minutes

Sequence Diagram(s)

sequenceDiagram
  participant DescrptDPA1
  participant DescrptBlockSeAtten
  participant CenterEdgePairs
  participant SegmentSoftmax
  DescrptDPA1->>DescrptBlockSeAtten: call_graph(graph, atype, static_nnei)
  DescrptBlockSeAtten->>DescrptBlockSeAtten: apply_pair_exclusion(graph, atype, pair_excl)
  DescrptBlockSeAtten->>CenterEdgePairs: center_edge_pairs(dst, edge_mask, static_nnei)
  DescrptBlockSeAtten->>DescrptBlockSeAtten: _graph_attention(gg, rr, dst, sw_e)
  DescrptBlockSeAtten->>SegmentSoftmax: segment_softmax(scores, query_edge, mask)
  SegmentSoftmax-->>DescrptBlockSeAtten: normalized attention weights
  DescrptBlockSeAtten-->>DescrptDPA1: grrg, rot_mat

sequenceDiagram
  participant DeepPotPTExptInit
  participant DeepPotPTExptCompute
  participant PairExcludeTable
  participant ExportedModel
  DeepPotPTExptInit->>PairExcludeTable: buildPairExcludeTable(ntypes, pair_exclude_types)
  DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusion(edge_index, edge_mask, atype)
  DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusionNlist(nlist, atype_ext)
  PairExcludeTable-->>ExportedModel: filtered edge_mask / nlist
  ExportedModel-->>DeepPotPTExptCompute: energy, forces, virial

Possibly related PRs

deepmodeling/deepmd-kit#5491: Extends the same neighbor-list builders (VesinNeighborList, DefaultNeighborList/build_neighbor_list) that this PR adds pair_excl support to.
deepmodeling/deepmd-kit#5583: Both PRs heavily modify deepmd/dpmodel/descriptor/dpa1.py's uses_graph_lower/call_graph routing and signatures.
deepmodeling/deepmd-kit#5714: Extends the same pt_expt carry-all NeighborGraph builders (nv_graph_builder.py, vesin_graph_builder.py) with pair_excl/compact and apply_pair_exclusion.

Suggested labels: enhancement, Python, C++

Suggested reviewers: OutisLi, iProzd

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the main change: canonical pair-exclusion handling and DPA1 graph-path support for exclude_types.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

+    atype = np.array([[0, 1, 0, 1, 0]], dtype=np.int64)
+    ds = DescrptDPA1(rcut=4.0, rcut_smth=0.5, sel=[200], ntypes=2, attn_layer=0)
+    ft = InvarFitting("energy", 2, ds.get_dim_out(), 1, mixed_types=True)
+    am = DPAtomicModel(ds, ft, type_map=["a", "b"])


+        try:
+            import ase  # noqa: F401
+        except ImportError as e:
+            import unittest


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

source/tests/pt_expt/descriptor/test_dpa1.py (1)
117-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Pass mapping to torch.export.export here

test_exportable still goes through the dense fallback because TestCaseSingleFrameWithNlist sets nloc=3 and nall=4, while this export call passes no mapping. That means the new exclude_types case only covers the legacy dense exclusion mask, not the graph-native apply_pair_exclusion path. Add mapping to the exported inputs so the parametrization exercises the intended route.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt_expt/descriptor/test_dpa1.py` around lines 117 - 147,
test_exportable is missing the graph mapping input, so it still exercises the
dense fallback instead of the graph-native exclusion path. Update the export
setup in test_exportable to pass mapping into torch.export.export alongside dd0
and the existing inputs, using the test fixture’s mapping source so the
exclude_types parametrization covers apply_pair_exclusion as intended.

🧹 Nitpick comments (7)

deepmd/dpmodel/model/make_model.py (1)
316-322: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Docstring update looks accurate; stale example nearby.

Matches the new DPA1 graph-native attention behavior (attention layers now included in graph eligibility). Note the unchanged _call_common_graph exception message a few dozen lines below ("e.g. dpa1 attn_layer=0") is now a narrower example than what this docstring describes — consider updating that message text for consistency in a follow-up.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/model/make_model.py` around lines 316 - 322, The exception
message in _call_common_graph is now too narrow compared with the updated
graph-native attention behavior described in the nearby docstring. Update the
message text in _call_common_graph so it reflects the broader DPA1
attention-layer graph eligibility instead of only referencing the old “e.g. dpa1
attn_layer=0” example, keeping the wording consistent with the behavior
documented in make_model.py.
source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py (1)
166-179: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Test comment overstates what is actually verified.

The comment says this block will "verify excluded pairs contribute sw == 0" and "check ... call_graph sw channel", but call_graph only returns (grrg, rot_mat) — it has no sw output — and the actual assertions only check for NaN/Inf, not the claimed masking behavior. The earlier out[4] vs ref[4] comparison (lines 162-164) already indirectly validates exclusion parity for sw via the dense reference, so this block is largely redundant and its comment is misleading about intent/coverage. Either remove the stale comment or replace it with an assertion that actually validates zeroed contributions from excluded pairs (e.g., inspect the block's edge_mask/sw_e via se_atten.call_graph directly).
♻️ Suggested comment fix (minimal)
-        if exclude_types:
-            # verify excluded pairs contribute sw == 0 in the dense reference
-            # (atype=[0,1,0,1] -> pairs (0,1) and (1,0) should be masked)
-            # sw shape: (nf, nloc, nnei, 1); just check the graph output is also 0
-            # for excluded-pair edges by checking call_graph sw channel
+        if exclude_types:
+            # additional sanity check on the raw call_graph output (no sw
+            # channel here; exclusion parity for sw is already verified via
+            # out[4] vs ref[4] above).
             graph = from_dense_quartet(ext_coord, nlist, mapping, compact=False)
             atype_local = self.atype.reshape(-1)
-            grrg_g, rot_mat_g = dd.call_graph(
+            grrg_g, _rot_mat_g = dd.call_graph(
                 graph, atype_local, type_embedding=dd.type_embedding.call()
             )
             # no nan/inf in output with exclusions applied
             assert not np.any(np.isnan(grrg_g))
             assert not np.any(np.isinf(grrg_g))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py` around lines
166 - 179, The comment in test_dpa1_call_graph_descriptor is misleading because
this block does not verify excluded-pair sw masking; call_graph only returns
grrg and rot_mat, and the current assertions only check NaN/Inf. Update the test
by either removing/rephrasing the stale comment to match the actual coverage, or
add a real assertion for zeroed excluded-pair contributions by checking the
relevant sw/edge-mask path through se_atten.call_graph or the returned graph
data. The earlier out[4] vs ref[4] comparison already covers sw parity, so keep
this block focused on what it truly validates.
source/tests/common/dpmodel/test_neighbor_graph_builder.py (1)
419-427: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant import unittest.

unittest is already imported at the top of this file; the local re-import inside the except block is unnecessary.
🧹 Proposed cleanup
     `@classmethod`
     def setUpClass(cls) -> None:
         try:
             import ase  # noqa: F401
         except ImportError as e:
-            import unittest
-
             raise unittest.SkipTest("ase not installed") from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py` around lines 419
- 427, Remove the redundant local import inside test_neighbor_graph_builder’s
setUpClass method: the file already imports unittest, so keep the ImportError
handling but drop the inner import and use the existing unittest.SkipTest
reference when ase is missing.
Source: Linters/SAST tools
deepmd/dpmodel/utils/neighbor_graph/graph.py (1)
192-194: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Docstring overstates what compact=True replaces.

The parameter doc says edge_index, edge_vec, angle_index, angle_mask are all replaced when compact=True. In practice angle_index/angle_mask are never touched by the compact branch — the function only reaches the compaction step after confirming both are None (otherwise it raises NotImplementedError). Listing them as "replaced" could mislead a future implementer extending angle-compaction support into thinking this path already handles it.
📝 Suggested doc fix
     graph
-        The neighbor graph; only ``edge_mask`` (and, if ``compact=True``,
-        ``edge_index``, ``edge_vec``, ``angle_index``, ``angle_mask``) are
-        replaced.
+        The neighbor graph; only ``edge_mask`` (and, if ``compact=True``,
+        ``edge_index`` and ``edge_vec``) are replaced. ``angle_index`` /
+        ``angle_mask`` are never touched — compaction is rejected outright
+        when either is present (see the ``compact`` behavior below).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/utils/neighbor_graph/graph.py` around lines 192 - 194, The
docstring for the neighbor graph parameter overstates the effect of compact=True
by implying that angle_index and angle_mask are also replaced. Update the
documentation in graph.py to say compact mode only compacts edge_index and
edge_vec (along with edge_mask), and make it clear that angle_index and
angle_mask are not handled by this branch because the code path only proceeds
when they are None.
deepmd/dpmodel/utils/neighbor_graph/ase_builder.py (1)
154-163: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Pin device explicitly when converting atype for apply_pair_exclusion.

xp = array_api_compat.array_namespace(coord) followed by xp.asarray(atype) doesn't pin a device, unlike the analogous pair_excl wiring in nv_graph_builder.py and vesin_graph_builder.py, which both use torch.as_tensor(atype, device=<coord's device>). If atype isn't already a tensor on the same device as coord (e.g. a CPU/numpy atype paired with a CUDA coord), xp.asarray will silently produce a CPU tensor, which will then device-mismatch against graph.edge_index/edge_mask inside apply_pair_exclusion.
🔧 Suggested fix
     if pair_excl is not None:
         import array_api_compat

         xp = array_api_compat.array_namespace(coord)
-        atype_flat = xp.reshape(xp.asarray(atype), (-1,))
+        dev = array_api_compat.device(coord)
+        atype_flat = xp.reshape(xp.asarray(atype, device=dev), (-1,))
         graph = apply_pair_exclusion(graph, atype_flat, pair_excl, compact=compact)
     return graph
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py` around lines 154 - 163,
The atype conversion in the ASE neighbor graph path is not explicitly pinned to
coord’s device, so apply_pair_exclusion can receive tensors on the wrong device.
Update the ase_builder flow that builds graph and handles pair_excl to convert
atype the same way as the nv_graph_builder and vesin_graph_builder paths: derive
the device from coord and create atype on that device before flattening and
passing it into apply_pair_exclusion. This keeps the device consistent with
graph.edge_index and graph.edge_mask.
source/tests/common/dpmodel/test_graph_atomic_parity.py (1)
318-344: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Drop the unused model scaffolding. am is never referenced here, so DescrptDPA1, InvarFitting, and DPAtomicModel can be removed from this test.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_graph_atomic_parity.py` around lines 318 -
344, The test builds unused model scaffolding that is never referenced, so
remove the dead setup from test_apply_pair_exclusion_idempotent. Eliminate the
DescrptDPA1, InvarFitting, and DPAtomicModel construction (including the am
variable) and keep only the inputs actually needed for
extend_input_and_build_neighbor_list, from_dense_quartet, and
apply_pair_exclusion. Make sure the test still covers both the empty and
non-empty pair_exclude_types branches.
Sources: Coding guidelines, Linters/SAST tools
source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc (1)
101-159: 🎯 Functional Correctness | 🔵 Trivial | 🏗️ Heavy lift

Test coverage gap: LAMMPS InputNlist ingestion route not exercised for pair-exclusion.

check_against_ref/all TYPED_TESTs call the 6-arg dp.compute(ener, force, virial, coord, atype, box), which routes to DeepPotPTExpt's standalone (no-nlist, build_nlist-based) compute() overload. The LAMMPS-style InputNlist overload — the actual pair-style ingestion seam, which caches edge_index_tensor/firstneigh_tensor at ago==0 and recomputes geometry via compactEdgeTensors every step before calling applyPairExclusion/applyPairExclusionNlist — is never invoked here. A bug isolated to that branch's node/edge tensor construction (e.g. the multi_rank ? nall_real : nloc node-count selection feeding applyPairExclusion) wouldn't be caught by this suite.

Consider adding a case that drives the InputNlist overload (mirroring the pattern in test_deeppot_dpa1_graph_ptexpt.cc) with pair_exclude_types set, so both C++ ingestion entry points are validated against the Python reference.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc` around lines 101 -
159, Add coverage for the LAMMPS-style InputNlist ingestion path in this
pair-exclusion test, because the current check_against_ref and TYPED_TESTs only
exercise the 6-arg DeepPot::compute route. Introduce a test that calls the
InputNlist compute overload on DeepPotPTExpt, using pair_exclude_types and
matching the pattern used in test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node
tensor caching and applyPairExclusion/applyPairExclusionNlist branch are
validated against the Python reference.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/pt_expt/entrypoints/main.py`:
- Around line 571-576: Update the stale inline comment in main by removing the
“no type exclusion” restriction so it matches the current graph-eligibility
behavior. Keep the note aligned with `_model_uses_graph_lower` in training.py
and the nearby ValueError message: describe graph lower as opt-in for
graph-eligible models (dpa1 with concat tebd, attention layers, and supported
exclude_types) and preserve the rest of the fail-fast/per-atom-virial
explanation.

In `@doc/model/train-se-atten.md`:
- Around line 160-164: Update the pt_expt training doc sentence describing
graph-eligible descriptors so it no longer says descriptor-level exclude_types
disqualifies the carry-all neighbor-graph path. Use the surrounding
se_atten/neighbor_graph_method explanation to state that mixed-type descriptors
with tebd_input_mode "concat" and no descriptor-level compression remain
graph-eligible, while exclude_types is not a blocking condition anymore. Keep
the dense-vs-graph parity note tied to smooth_type_embedding and attn_layer, but
make the eligibility rule consistent with the current behavior exercised by
test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().

---

Outside diff comments:
In `@source/tests/pt_expt/descriptor/test_dpa1.py`:
- Around line 117-147: test_exportable is missing the graph mapping input, so it
still exercises the dense fallback instead of the graph-native exclusion path.
Update the export setup in test_exportable to pass mapping into
torch.export.export alongside dd0 and the existing inputs, using the test
fixture’s mapping source so the exclude_types parametrization covers
apply_pair_exclusion as intended.

---

Nitpick comments:
In `@deepmd/dpmodel/model/make_model.py`:
- Around line 316-322: The exception message in _call_common_graph is now too
narrow compared with the updated graph-native attention behavior described in
the nearby docstring. Update the message text in _call_common_graph so it
reflects the broader DPA1 attention-layer graph eligibility instead of only
referencing the old “e.g. dpa1 attn_layer=0” example, keeping the wording
consistent with the behavior documented in make_model.py.

In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py`:
- Around line 154-163: The atype conversion in the ASE neighbor graph path is
not explicitly pinned to coord’s device, so apply_pair_exclusion can receive
tensors on the wrong device. Update the ase_builder flow that builds graph and
handles pair_excl to convert atype the same way as the nv_graph_builder and
vesin_graph_builder paths: derive the device from coord and create atype on that
device before flattening and passing it into apply_pair_exclusion. This keeps
the device consistent with graph.edge_index and graph.edge_mask.

In `@deepmd/dpmodel/utils/neighbor_graph/graph.py`:
- Around line 192-194: The docstring for the neighbor graph parameter overstates
the effect of compact=True by implying that angle_index and angle_mask are also
replaced. Update the documentation in graph.py to say compact mode only compacts
edge_index and edge_vec (along with edge_mask), and make it clear that
angle_index and angle_mask are not handled by this branch because the code path
only proceeds when they are None.

In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc`:
- Around line 101-159: Add coverage for the LAMMPS-style InputNlist ingestion
path in this pair-exclusion test, because the current check_against_ref and
TYPED_TESTs only exercise the 6-arg DeepPot::compute route. Introduce a test
that calls the InputNlist compute overload on DeepPotPTExpt, using
pair_exclude_types and matching the pattern used in
test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node tensor caching and
applyPairExclusion/applyPairExclusionNlist branch are validated against the
Python reference.

In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py`:
- Around line 166-179: The comment in test_dpa1_call_graph_descriptor is
misleading because this block does not verify excluded-pair sw masking;
call_graph only returns grrg and rot_mat, and the current assertions only check
NaN/Inf. Update the test by either removing/rephrasing the stale comment to
match the actual coverage, or add a real assertion for zeroed excluded-pair
contributions by checking the relevant sw/edge-mask path through
se_atten.call_graph or the returned graph data. The earlier out[4] vs ref[4]
comparison already covers sw parity, so keep this block focused on what it truly
validates.

In `@source/tests/common/dpmodel/test_graph_atomic_parity.py`:
- Around line 318-344: The test builds unused model scaffolding that is never
referenced, so remove the dead setup from test_apply_pair_exclusion_idempotent.
Eliminate the DescrptDPA1, InvarFitting, and DPAtomicModel construction
(including the am variable) and keep only the inputs actually needed for
extend_input_and_build_neighbor_list, from_dense_quartet, and
apply_pair_exclusion. Make sure the test still covers both the empty and
non-empty pair_exclude_types branches.

In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py`:
- Around line 419-427: Remove the redundant local import inside
test_neighbor_graph_builder’s setUpClass method: the file already imports
unittest, so keep the ImportError handling but drop the inner import and use the
existing unittest.SkipTest reference when ase is missing.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f66e001c-837b-4b14-a6b8-84d867cc8a56

📥 Commits

Reviewing files that changed from the base of the PR and between ffe57a3 and 1b5c75d.

📒 Files selected for processing (48)

deepmd/dpmodel/array_api.py
deepmd/dpmodel/atomic_model/base_atomic_model.py
deepmd/dpmodel/descriptor/dpa1.py
deepmd/dpmodel/model/make_model.py
deepmd/dpmodel/model/spin_model.py
deepmd/dpmodel/utils/__init__.py
deepmd/dpmodel/utils/default_neighbor_list.py
deepmd/dpmodel/utils/neighbor_graph/__init__.py
deepmd/dpmodel/utils/neighbor_graph/ase_builder.py
deepmd/dpmodel/utils/neighbor_graph/builder.py
deepmd/dpmodel/utils/neighbor_graph/env.py
deepmd/dpmodel/utils/neighbor_graph/graph.py
deepmd/dpmodel/utils/neighbor_graph/pairs.py
deepmd/dpmodel/utils/neighbor_graph/segment.py
deepmd/dpmodel/utils/neighbor_list.py
deepmd/dpmodel/utils/nlist.py
deepmd/pt/utils/nv_nlist.py
deepmd/pt_expt/entrypoints/main.py
deepmd/pt_expt/model/make_model.py
deepmd/pt_expt/train/training.py
deepmd/pt_expt/utils/nv_graph_builder.py
deepmd/pt_expt/utils/serialization.py
deepmd/pt_expt/utils/vesin_graph_builder.py
deepmd/pt_expt/utils/vesin_neighbor_list.py
doc/model/train-se-atten.md
source/api_cc/include/DeepPotPTExpt.h
source/api_cc/include/commonPT.h
source/api_cc/src/DeepPotPTExpt.cc
source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc
source/install/test_cc_local.sh
source/tests/common/dpmodel/test_apply_pair_exclusion.py
source/tests/common/dpmodel/test_apply_pair_exclusion_nlist.py
source/tests/common/dpmodel/test_center_edge_pairs.py
source/tests/common/dpmodel/test_dpa1_call_graph_block.py
source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py
source/tests/common/dpmodel/test_dpa1_graph_attention_parity.py
source/tests/common/dpmodel/test_graph_atomic_parity.py
source/tests/common/dpmodel/test_neighbor_graph_builder.py
source/tests/common/dpmodel/test_segment_softmax.py
source/tests/common/dpmodel/test_spin_model_legacy_routing.py
source/tests/infer/gen_dpa1_pairexcl.py
source/tests/pt_expt/descriptor/test_dpa1.py
source/tests/pt_expt/infer/test_graph_deepeval.py
source/tests/pt_expt/model/test_dpa1_graph_lower.py
source/tests/pt_expt/model/test_linear_model.py
source/tests/pt_expt/utils/test_graph_pt2_metadata.py
source/tests/pt_expt/utils/test_neighbor_list.py
source/tests/pt_expt/utils/test_vesin_graph_builder.py

coderabbitai · 2026-07-05T01:19:20Z

+    # The graph lower is opt-in and only valid for graph-eligible models
+    # (dpa1 with concat tebd and no type exclusion; attention layers included
+    # -- the carry-all pair enumeration exports via unbacked SymInts). Fail
+    # fast with a clear message rather than emitting a broken .pt2. Enable the
+    # per-atom virial for the graph form -- it is near-free there (one extra
+    # scatter off the single shared backward).


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Stale comment: "no type exclusion" contradicts this PR's own change.

This inline comment still describes graph eligibility as "dpa1 with concat tebd and no type exclusion", but the PR's whole premise (per stack context: "dpa1 graph path supports exclude_types... removing the prior eligibility gate") is that exclude_types is now supported on the graph path. The companion predicate _model_uses_graph_lower in training.py (and its docstring) no longer mentions any exclusion restriction, and the ValueError message a few lines below (Line 587) also drops the exclusion caveat — only this comment is stale.

📝 Proposed fix

- # The graph lower is opt-in and only valid for graph-eligible models - # (dpa1 with concat tebd and no type exclusion; attention layers included - # -- the carry-all pair enumeration exports via unbacked SymInts). Fail - # fast with a clear message rather than emitting a broken .pt2. Enable the - # per-atom virial for the graph form -- it is near-free there (one extra - # scatter off the single shared backward). + # The graph lower is opt-in and only valid for graph-eligible models + # (dpa1 with concat tebd, incl. attention layers and exclude_types -- + # the carry-all pair enumeration exports via unbacked SymInts). Fail + # fast with a clear message rather than emitting a broken .pt2. Enable the + # per-atom virial for the graph form -- it is near-free there (one extra + # scatter off the single shared backward).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# The graph lower is opt-in and only valid for graph-eligible models

# (dpa1 with concat tebd and no type exclusion; attention layers included

# -- the carry-all pair enumeration exports via unbacked SymInts). Fail

# fast with a clear message rather than emitting a broken .pt2. Enable the

# per-atom virial for the graph form -- it is near-free there (one extra

# scatter off the single shared backward).

# The graph lower is opt-in and only valid for graph-eligible models

# (dpa1 with concat tebd, incl. attention layers and exclude_types --

# the carry-all pair enumeration exports via unbacked SymInts). Fail

# fast with a clear message rather than emitting a broken .pt2. Enable the

# per-atom virial for the graph form -- it is near-free there (one extra

# scatter off the single shared backward).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt_expt/entrypoints/main.py` around lines 571 - 576, Update the stale inline comment in main by removing the “no type exclusion” restriction so it matches the current graph-eligibility behavior. Keep the note aligned with `_model_uses_graph_lower` in training.py and the nearby ValueError message: describe graph lower as opt-in for graph-eligible models (dpa1 with concat tebd, attention layers, and supported exclude_types) and preserve the rest of the fail-fast/per-atom-virial explanation.

coderabbitai · 2026-07-05T01:19:20Z

+In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list.
+The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`.
+When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path.
+Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly.
+


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Stale "no exclude_types" eligibility gate contradicts this PR's own change.

This new sentence states that descriptor-level exclude_types disqualifies a descriptor from the graph-eligible/carry-all path in pt_expt. But this cohort explicitly removes that gate: test_exclude_types_graph_eligible_and_parity in source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py asserts dd.uses_graph_lower() is True when exclude_types is set, with the comment "gate: with any exclude list the descriptor must now be graph-eligible." The PR objective also states the eligibility gate for exclude_types was removed. This doc line will mislead users about which models route through the carry-all graph path.

📝 Proposed fix

-In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. +In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. Descriptor-level `exclude_types` no longer disqualifies a descriptor from this path.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list.

The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`.

When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path.

Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly.

In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. Descriptor-level `exclude_types` no longer disqualifies a descriptor from this path.

The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`.

When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path.

Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@doc/model/train-se-atten.md` around lines 160 - 164, Update the pt_expt training doc sentence describing graph-eligible descriptors so it no longer says descriptor-level exclude_types disqualifies the carry-all neighbor-graph path. Use the surrounding se_atten/neighbor_graph_method explanation to state that mixed-type descriptors with tebd_input_mode "concat" and no descriptor-level compression remain graph-eligible, while exclude_types is not a blocking condition anymore. Keep the dense-vs-graph parity note tied to smooth_type_embedding and attn_layer, but make the eligibility rule consistent with the current behavior exercised by test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().

codecov · 2026-07-05T02:03:04Z

Codecov Report

❌ Patch coverage is 92.83820% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.20%. Comparing base (40d7a49) to head (1b5c75d).
⚠️ Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
.../api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc	79.31%	12 Missing ⚠️
deepmd/pt_expt/utils/nv_graph_builder.py	0.00%	5 Missing ⚠️
deepmd/pt/utils/nv_nlist.py	40.00%	3 Missing ⚠️
deepmd/pt_expt/model/make_model.py	60.00%	2 Missing ⚠️
source/api_cc/src/DeepPotPTExpt.cc	86.66%	0 Missing and 2 partials ⚠️
deepmd/dpmodel/descriptor/dpa1.py	98.55%	1 Missing ⚠️
deepmd/dpmodel/utils/neighbor_graph/env.py	80.00%	1 Missing ⚠️
source/api_cc/include/commonPT.h	98.03%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5733      +/-   ##
==========================================
- Coverage   81.29%   81.20%   -0.09%     
==========================================
  Files         990      992       +2     
  Lines      111019   111372     +353     
  Branches     4235     4250      +15     
==========================================
+ Hits        90252    90444     +192     
- Misses      19243    19401     +158     
- Partials     1524     1527       +3

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Han Wang and others added 30 commits July 5, 2026 00:03

feat(dpmodel): segment_max + numerically-stable mask-aware segment_so…

962399b

…ftmax Built on the existing xp_maximum_at (no new array_api helper needed). Part of NeighborGraph PR-D (graph-native attention).

[pre-commit.ci] auto fixes from pre-commit.com hooks

1c28aa5

for more information, see https://pre-commit.ci

fix(pt_expt): fail fast on torch < 2.6 for graph attention tracing

d49b9cb

docs(pt_expt): numpydoc sections for check_graph_trace_torch_version

7c65935

test+docs: pin smooth-attention graph-vs-dense divergence; document t…

6fc45bd

…he pt_expt graph default and legacy escape hatch

feat(dpmodel): canonical apply_pair_exclusion graph transform (decision

ce98ea5

deepmodeling#18)

fix(dpmodel): raise NotImplementedError in apply_pair_exclusion(compa…

fffca36

…ct=True) when angle fields are present

refactor(dpmodel): atomic-model pair exclusion via apply_pair_exclusion

ed6334e

feat(dpmodel): dpa1 graph path supports exclude_types via apply_pair_…

6c2b007

…exclusion; drop eligibility gate

test(pt_expt): pair_exclude_types graph-vs-legacy parity + vacuity check

2711fb3

docs: add notes on nv_graph_builder pair_excl oracle gap

45b0e30

test(pt_expt): graph-route exclude_types coverage (parity + make_fx)

f4e4691

test(pt_expt): descriptor-level exclude_types export + graph-vs-legac…

4c02666

…y parity

fix: add reduced-virial parity assertion in descriptor exclude_types …

b842263

…test

fix(neighbor_graph): use logical_and+bool cast in apply_pair_exclusio…

f7cdb46

…n for array_api_strict compat

fix(dpmodel): guard edges+pair_excl, fix base docstring (A4 review)

4a9b6b5

Han Wang added 6 commits July 5, 2026 02:11

docs: exclude_types is graph-native; fix stale comments in _call_dens…

5adae7e

…e and forward_common_atomic_graph

dosubot Bot added the enhancement label Jul 5, 2026

github-actions Bot added Python C++ Docs labels Jul 5, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

1b5c75d

for more information, see https://pre-commit.ci

github-advanced-security AI found potential problems Jul 5, 2026

View reviewed changes

coderabbitai Bot reviewed Jul 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor: pair exclude_types as canonical NeighborGraph transform; dpa1 graph path supports exclude_types (decision #18)#5733

refactor: pair exclude_types as canonical NeighborGraph transform; dpa1 graph path supports exclude_types (decision #18)#5733
wanghan-iapcm wants to merge 37 commits into
deepmodeling:masterfrom
wanghan-iapcm:feat-graph-pair-exclude

wanghan-iapcm commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jul 5, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jul 5, 2026

Uh oh!

coderabbitai Bot Jul 5, 2026

Uh oh!

codecov Bot commented Jul 5, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

wanghan-iapcm commented Jul 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pair exclude_types as a canonical NeighborGraph transform (decision #18)

What changed

Known limitations

Spin routing

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wanghan-iapcm commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

Pair `exclude_types` as a canonical NeighborGraph transform (decision #18)

coderabbitai Bot commented Jul 5, 2026 •

edited

Loading

codecov Bot commented Jul 5, 2026 •

edited

Loading