<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added an option to control whether output statistics are computed or loaded across atomic models. * **Bug Fixes** * More robust parameter transfer during fine‑tuning to handle renamed branches and missing pretrained keys. * **Refactor** * Revised output-statistics workflow and refined per‑type output bias application in composite models. * **Tests** * Simplified linear-model bias checks and added a ZBL finetuning test path. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: anyangml <anyangpeng.ca@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
fix eta computation code <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Improved ETA accuracy in training/validation progress logs by adapting calculations to recent step intervals, reducing misleading estimates early in runs. * Consistent behavior across both backends, providing more reliable remaining-time estimates without changing any public interfaces. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
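One way to read "adapting calculations to recent step intervals" is that the remaining time is derived from the pace of the most recent logging interval, using the number of steps that interval actually covered rather than a nominal display frequency. The sketch below illustrates that idea only; the function name and its parameters are hypothetical and are not taken from the DeePMD-kit source.

```python
def estimate_eta_seconds(current_step, total_steps, interval_steps, interval_seconds):
    """Estimate remaining wall time from the most recent logging interval.

    All names here are hypothetical. `interval_steps` is the number of steps
    actually covered since the previous log line, which early in a run can be
    smaller than the nominal display frequency.
    """
    if interval_steps <= 0:
        raise ValueError("interval_steps must be positive")
    remaining_steps = max(total_steps - current_step, 0)
    # Multiply before dividing so that round numbers stay exact in floating point.
    return remaining_steps * interval_seconds / interval_steps


# First log line after a single step that took 0.5 s, with 1000 steps left.
print(estimate_eta_seconds(1, 1001, 1, 0.5))
# Later log line: the last 100 steps took 20 s, with 900 steps left.
print(estimate_eta_seconds(100, 1000, 100, 20.0))
```

Dividing by the real interval length avoids the early-run distortion that appears when the time for a short first interval is treated as if it covered a full display period.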
…ms (#4869)

When virtual atoms are used, the property output of each virtual atom is `0`.

- For energy and other extensive properties this is harmless, because a contribution of `0` does not change the total.
- For intensive properties it produces a wrong result. For example, if a frame has two real atoms and two virtual atoms, the atomic property contributions are `[2, 2, 0, 0]` (the atomic property of a virtual atom is always 0). The final property should be `(2+2)/real_atoms = 2`, not `(2+2)/total_atoms = 1`.

This PR fixes the bug described above.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Models now provide accessors to retrieve property names and their fitting network; property fitting nets expose output definitions.
* **Bug Fixes**
  * Intensive property reduction respects atom masks so padded/dummy atoms are ignored, keeping results invariant to padding.
* **Tests**
  * Added PyTorch, JAX, and core tests validating consistent behavior with padded atoms.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
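The arithmetic in the example above can be reproduced with a small NumPy sketch of a mask-aware mean. The function name, the array shapes, and the 1/0 mask convention are assumptions made for illustration; this is not the DeePMD-kit implementation.

```python
import numpy as np


def reduce_intensive(atom_property, mask):
    """Average per-atom values over real atoms only (hypothetical helper).

    atom_property: shape (nframes, natoms, dim)
    mask:          shape (nframes, natoms), 1 for real atoms, 0 for virtual ones
    Assumes every frame contains at least one real atom.
    """
    mask = mask.astype(atom_property.dtype)
    total = (atom_property * mask[..., None]).sum(axis=1)  # (nframes, dim)
    n_real = mask.sum(axis=1, keepdims=True)                # (nframes, 1)
    return total / n_real


# One frame with two real atoms and two virtual atoms, as in the description.
prop = np.array([[[2.0], [2.0], [0.0], [0.0]]])
mask = np.array([[1, 1, 0, 0]])

naive = prop.mean(axis=1)               # divides by total_atoms
masked = reduce_intensive(prop, mask)   # divides by real_atoms
print(naive, masked)
```

Because the divisor counts only masked-in atoms, appending more virtual atoms to the frame leaves the masked result unchanged, while the naive mean keeps shrinking.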
Fix version finding in pip and CMake; pin TF to <2.20 on Windows; fix TENSORFLOW_ROOT in the CI. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added compatibility with TensorFlow 2.20+ via runtime version detection and generated version macros. - Bug Fixes - Clearer errors when a specified TensorFlow root is invalid. - Improved version-parsing fallback for newer TensorFlow releases. - Tightened Windows CPU wheel constraint to avoid incompatible versions. - Chores - Updated devcontainer scripts and CI workflows to more reliably locate TensorFlow without importing it directly. - Linked TensorFlow during version checks to ensure accurate detection. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn> Signed-off-by: Jinzhe Zeng <njzjz@qq.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The unit test for padding atoms (PyTorch backend) sometimes fails with output like:
```
Mismatched elements: 1 / 2 (50%)
Max absolute difference among violations: 1.97471693e-08
Max relative difference among violations: 6.45619919e-07
ACTUAL: array([[-0.236542],
[ 0.030586]])
DESIRED: array([[-0.236542],
[ 0.030586]])
= 1 failed, 15442 passed, 4135 skipped, 97877 deselected, 224 warnings in 2825.25s (0:47:05) =
```
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Tests**
- Adjusted numerical comparison assertions to use both absolute and
relative tolerances in padding-related tests.
- Aligns checks between computed results and references, improving
resilience to minor floating-point variation.
- Reduces intermittent test failures across environments and dependency
versions.
- No impact on features, performance, or user workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
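The failure quoted above is consistent with a purely relative check: the absolute difference of about 2e-08 on a value near 0.03 gives a relative difference of about 6.5e-07. The sketch below reproduces that situation with the rounded values from the log and shows how adding an absolute tolerance changes the outcome. The tolerance values are assumptions chosen for illustration; the values actually used in the tests are not stated here.

```python
import numpy as np

# Rounded values copied from the failure log; the second entry is perturbed by
# the reported absolute difference.
desired = np.array([[-0.236542], [0.030586]])
actual = desired.copy()
actual[1, 0] += 1.97471693e-08


def passes(rtol, atol):
    """Return True if assert_allclose accepts `actual` against `desired`."""
    try:
        np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol)
    except AssertionError:
        return False
    return True


# assert_allclose checks |actual - desired| <= atol + rtol * |desired|.
strict_ok = passes(rtol=1e-7, atol=0.0)    # relative-only check (assumed values)
relaxed_ok = passes(rtol=1e-7, atol=1e-7)  # relative plus absolute (assumed values)
print(strict_ok, relaxed_ok)
```

For values close to zero the relative term becomes very small, so an absolute tolerance is what keeps floating-point noise of this size from failing the comparison.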
…t in dpa3 document (#4887) update paddle installation scripts and custom border op error message <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Updated installation guides to reference PaddlePaddle 3.1.1 for CUDA 12.6, CUDA 11.8, and CPU; added nightly pre-release install examples. * Refined training docs wording and CINN note; added Paddle backend guidance and explicit OP-install instructions in DPA3 docs. * **Chores** * Improved error messages when custom Paddle operators are unavailable, adding clearer install instructions and links to documentation. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: HydrogenSulfate <490868991@qq.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fix #4877. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Improved build compatibility with PyTorch 2.8+ on UNIX-like systems (excluding macOS) by aligning the default ABI selection with PyTorch’s behavior. This reduces potential linker/runtime issues when building against newer PyTorch versions. Behavior on other platforms and with older PyTorch remains unchanged. No runtime functionality changes for end users. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Training entrypoints now accept YAML configuration files in addition to JSON, offering more flexibility when launching training. * Unified configuration loading across frameworks for consistent behavior (PyTorch, Paddle, TensorFlow). * Backward compatible: existing JSON-based workflows continue to work unchanged. * **Tests** * Added coverage to verify YAML input produces the expected training output. * Improved test cleanup to remove generated artifacts after execution. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
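A unified loader that accepts both formats can be sketched as a dispatch on the file suffix. This is a minimal illustration under stated assumptions: `load_config` is a hypothetical name, the suffix-based dispatch is a guess at one reasonable design rather than the project's actual logic, and the YAML branch relies on the third-party PyYAML package.

```python
import json
from pathlib import Path


def load_config(path):
    """Load a training configuration from JSON or YAML (hypothetical helper)."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        import yaml  # PyYAML; imported lazily so JSON-only use needs no extra dependency

        return yaml.safe_load(text)
    raise ValueError(f"unsupported configuration format: {path.suffix!r}")
```

Keeping the JSON branch untouched is what makes such a change backward compatible: existing `.json` inputs follow exactly the same path as before.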
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/releases">actions/checkout's releases</a>.</em></p> <blockquote> <h2>v5.0.0</h2> <h2>What's Changed</h2> <ul> <li>Update actions checkout to use node 24 by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li> <li>Prepare v5.0.0 release by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2238">actions/checkout#2238</a></li> </ul> <h2>⚠️ Minimum Compatible Runner Version</h2> <p><strong>v2.327.1</strong><br /> <a href="https://github.com/actions/runner/releases/tag/v2.327.1">Release Notes</a></p> <p>Make sure your runner is updated to this version or newer to use this release.</p> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v4...v5.0.0">https://github.com/actions/checkout/compare/v4...v5.0.0</a></p> <h2>v4.3.0</h2> <h2>What's Changed</h2> <ul> <li>docs: update README.md by <a href="https://github.com/motss"><code>@motss</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li> <li>Add internal repos for checking out multiple repositories by <a href="https://github.com/mouismail"><code>@mouismail</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li> <li>Documentation update - add recommended permissions to Readme by <a href="https://github.com/benwells"><code>@benwells</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li> <li>Adjust positioning of user email note and permissions heading by <a href="https://github.com/joshmgross"><code>@joshmgross</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li> <li>Update README.md by <a href="https://github.com/nebuk89"><code>@nebuk89</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li> <li>Update CODEOWNERS for actions by <a href="https://github.com/TingluoHuang"><code>@TingluoHuang</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li> <li>Update package dependencies by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li> <li>Prepare release v4.3.0 by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2237">actions/checkout#2237</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/motss"><code>@motss</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li> <li><a href="https://github.com/mouismail"><code>@mouismail</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li> <li><a href="https://github.com/benwells"><code>@benwells</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li> <li><a href="https://github.com/nebuk89"><code>@nebuk89</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li> <li><a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li> </ul> <p><strong>Full Changelog</strong>: <a 
href="https://github.com/actions/checkout/compare/v4...v4.3.0">https://github.com/actions/checkout/compare/v4...v4.3.0</a></p> <h2>v4.2.2</h2> <h2>What's Changed</h2> <ul> <li><code>url-helper.ts</code> now leverages well-known environment variables by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li> <li>Expand unit test coverage for <code>isGhes</code> by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v4.2.1...v4.2.2">https://github.com/actions/checkout/compare/v4.2.1...v4.2.2</a></p> <h2>v4.2.1</h2> <h2>What's Changed</h2> <ul> <li>Check out other refs/* by commit if provided, fall back to ref by <a href="https://github.com/orhantoy"><code>@orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/Jcambass"><code>@Jcambass</code></a> made their first contribution in <a href="https://redirect.github.com/actions/checkout/pull/1919">actions/checkout#1919</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v4.2.0...v4.2.1">https://github.com/actions/checkout/compare/v4.2.0...v4.2.1</a></p> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <h2>V5.0.0</h2> <ul> <li>Update actions checkout to use node 24 by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li> </ul> <h2>V4.3.0</h2> <ul> <li>docs: update README.md by <a href="https://github.com/motss"><code>@motss</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li> <li>Add internal repos for checking out multiple repositories by <a href="https://github.com/mouismail"><code>@mouismail</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li> <li>Documentation update - add recommended permissions to Readme by <a href="https://github.com/benwells"><code>@benwells</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li> <li>Adjust positioning of user email note and permissions heading by <a href="https://github.com/joshmgross"><code>@joshmgross</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li> <li>Update README.md by <a href="https://github.com/nebuk89"><code>@nebuk89</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li> <li>Update CODEOWNERS for actions by <a href="https://github.com/TingluoHuang"><code>@TingluoHuang</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li> <li>Update package dependencies by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li> </ul> <h2>v4.2.2</h2> 
<ul> <li><code>url-helper.ts</code> now leverages well-known environment variables by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li> <li>Expand unit test coverage for <code>isGhes</code> by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li> </ul> <h2>v4.2.1</h2> <ul> <li>Check out other refs/* by commit if provided, fall back to ref by <a href="https://github.com/orhantoy"><code>@orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li> </ul> <h2>v4.2.0</h2> <ul> <li>Add Ref and Commit outputs by <a href="https://github.com/lucacome"><code>@lucacome</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li> <li>Dependency updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>- <a href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>, <a href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li> </ul> <h2>v4.1.7</h2> <ul> <li>Bump the minor-npm-dependencies group across 1 directory with 4 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li> <li>Bump actions/checkout from 3 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li> <li>Check out other refs/* by commit by <a href="https://github.com/orhantoy"><code>@orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li> <li>Pin actions/checkout's own workflows to a known, good, stable version. 
by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li> </ul> <h2>v4.1.6</h2> <ul> <li>Check platform to set archive extension appropriately by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li> </ul> <h2>v4.1.5</h2> <ul> <li>Update NPM dependencies by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li> <li>Bump github/codeql-action from 2 to 3 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li> <li>Bump actions/setup-node from 1 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li> <li>Bump actions/upload-artifact from 2 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li> <li>README: Suggest <code>user.email</code> to be <code>41898282+github-actions[bot]@users.noreply.github.com</code> by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1707">actions/checkout#1707</a></li> </ul> <h2>v4.1.4</h2> <ul> <li>Disable <code>extensions.worktreeConfig</code> when disabling <code>sparse-checkout</code> by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1692">actions/checkout#1692</a></li> <li>Add dependabot config by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/1688">actions/checkout#1688</a></li> <li>Bump the minor-actions-dependencies group with 2 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1693">actions/checkout#1693</a></li> <li>Bump word-wrap from 1.2.3 to 1.2.5 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1643">actions/checkout#1643</a></li> </ul> <h2>v4.1.3</h2> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/actions/checkout/commit/08c6903cd8c0fde910a37f88322edcfb5dd907a8"><code>08c6903</code></a> Prepare v5.0.0 release (<a href="https://redirect.github.com/actions/checkout/issues/2238">#2238</a>)</li> <li><a href="https://github.com/actions/checkout/commit/9f265659d3bb64ab1440b03b12f4d47a24320917"><code>9f26565</code></a> Update actions checkout to use node 24 (<a href="https://redirect.github.com/actions/checkout/issues/2226">#2226</a>)</li> <li>See full diff in <a href="https://github.com/actions/checkout/compare/v4...v5">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.12.8 → v0.12.9](astral-sh/ruff-pre-commit@v0.12.8...v0.12.9) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ix bugs mentioned in issue #4906 (#4908) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added optional support to pass a communication dictionary through lower-level model computations across energy, dipole, DOS, polarization, and related models. This enables advanced workflows while remaining fully backward compatible. - Refactor - Standardized internal propagation of the communication dictionary across sub-models to ensure consistent behavior. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…etup (#4911)

This PR adds comprehensive development support for GitHub Copilot agents working in the DeePMD-kit codebase.

## What's included

**Comprehensive Copilot Instructions (`.github/copilot-instructions.md`)**

- Complete build workflow with exact timing expectations (67s Python build, 164s C++ build)
- Virtual environment setup and dependency installation for all backends (TensorFlow, PyTorch, JAX, Paddle)
- **Optimized testing guidance**: Emphasizes single test execution (~8-13 seconds) over full test suite (60+ minutes) for faster development feedback
- Linting and formatting with ruff (1 second execution)
- Multiple validation scenarios for CLI, Python interface, and training workflows
- Directory structure reference and key file locations
- Critical warnings with specific timeout recommendations to prevent premature cancellation
- **Conventional commit specification**: Guidelines for commit messages and PR titles following `type(scope): description` format

**Automated Environment Setup (`.github/workflows/copilot-setup-steps.yml`)**

- Pre-configures Python environment using uv for fast dependency management
- Installs TensorFlow CPU and PyTorch automatically
- Builds the DeePMD-kit package with all dependencies
- Sets up pre-commit hooks for code quality
- Validates installation to ensure environment readiness

**Development Efficiency Features**

- All commands tested and validated with accurate timing measurements
- Imperative tone throughout for clear action items
- Copy-paste ready validation scenarios
- Gitignore rules to prevent temporary test files from being committed

## Key improvements for Copilot agents

- **Faster iteration**: Single test recommendations instead of 60+ minute full test suites
- **Automated setup**: No manual environment configuration needed
- **Precise expectations**: Exact timing guidance prevents timeout issues during builds
- **Multi-backend support**: Complete coverage of TensorFlow, PyTorch, JAX, and Paddle workflows
- **Consistent commit standards**: Enforces conventional commit specification for all changes

The instructions enable any GitHub Copilot agent to work effectively in this codebase from a fresh clone with precise expectations for build times, test execution, and validation workflows.

Fixes #4910.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…truction warnings (#4907)

This PR fixes deprecation warnings that occur when `torch.tensor()` or `paddle.to_tensor()` is called on existing tensor objects:

**PyTorch warning:**

```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
```

**PaddlePaddle warning:**

```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach(), rather than paddle.to_tensor(sourceTensor).
```

## Root Cause

The warnings were being triggered in multiple locations:

1. **PyTorch**: Test cases were passing tensor objects directly to ASE calculators, which internally convert them using `torch.tensor()`
2. **PaddlePaddle**: Similar issues in `eval_model` function and `to_paddle_tensor` utility, plus a TypeError where `tensor.to()` method was incorrectly using `place=` instead of `device=`

## Solution

**For PyTorch:**

- Modified test cases to convert tensor inputs to numpy arrays before passing to ASE calculators
- Removed redundant tensor handling in `to_torch_tensor` utility function since the non-numpy check already handles tensors by returning them as-is

**For PaddlePaddle:**

- Added proper type checking in `eval_model` function to handle existing tensors with `clone().detach()`
- Removed redundant tensor handling in `to_paddle_tensor` utility function, applying the same optimization as PyTorch
- Fixed TypeError by changing `place=` to `device=` in all `tensor.to()` method calls (PaddlePaddle's tensor `.to()` method expects `device=` parameter, while `paddle.to_tensor()` correctly uses `place=`)

## Changes Made

1. **`source/tests/pt/test_calculator.py`**: Fixed `TestCalculator` and `TestCalculatorWithFparamAparam` to convert PyTorch tensors to numpy arrays before passing to ASE calculator
2. **`deepmd/pt/utils/utils.py`**: Removed redundant tensor-specific handling in `to_torch_tensor` function
3. **`source/tests/pd/common.py`**: Updated `eval_model` function with type checking for PaddlePaddle tensors and fixed `tensor.to()` method calls to use `device=` instead of `place=`
4. **`deepmd/pd/utils/utils.py`**: Removed redundant tensor-specific handling in `to_paddle_tensor` function for consistency with PyTorch

Both utility functions now use a simplified approach where the `if not isinstance(xx, np.ndarray): return xx` check handles all non-numpy inputs (including tensors) by returning them unchanged, eliminating the need for separate tensor-specific code paths.

This change is backward compatible and maintains the same functionality while eliminating both deprecation warnings and TypeErrors, improving code consistency between PyTorch and PaddlePaddle backends.

Fixes #3790.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…put format (#4903)

This PR implements a new command-line interface for evaluating descriptors using trained DeePMD models, addressing the feature request for making the `eval_descriptor` function available from the command line.

## Overview

The new `dp eval-desc` command allows users to generate descriptor matrices from their models using a simple CLI interface, similar to the existing `dp test` command.

## Usage

```bash
# Basic usage
dp eval-desc -m model.pb -s /path/to/system

# With custom output directory
dp eval-desc -m model.pth -s /path/to/system -o my_descriptors

# Using datafile with multiple systems
dp eval-desc -m model.pb -f systems_list.txt -o desc_output

# For multi-task models
dp eval-desc -m model.pth -s system_dir --head task_branch
```

## Output Format

Descriptors are saved as NumPy `.npy` files in 3D format (nframes, natoms, ndesc) preserving the natural structure of the data with separate dimensions for frames, atoms, and descriptor components. This format maintains the original data organization and is suitable for various analysis workflows.

## Implementation Details

The implementation follows the same architectural pattern as the existing `dp test` command:

- **CLI Parser**: Added argument parser in `deepmd/main.py` with options for model (`-m`), system (`-s`), datafile (`-f`), output (`-o`), and model branch (`--head`)
- **Command Routing**: Integrated into the entrypoints system in `deepmd/entrypoints/main.py`
- **Core Functionality**: New `eval_desc.py` module that uses `DeepEval.eval_descriptor()` to generate descriptors and saves them as `.npy` files in their natural 3D format
- **Documentation**: Updated user guide and API documentation with output format details
- **Testing**: Comprehensive tests following the pattern of existing `dp test` functionality

Fixes #4503.
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
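The 3D `(nframes, natoms, ndesc)` layout described for `dp eval-desc` output can be handled directly with NumPy. The sketch below uses a synthetic array and a hypothetical file name in a temporary directory, since the actual output file naming is not specified here; only the axis order is taken from the description.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for a descriptor file; the sizes are arbitrary.
nframes, natoms, ndesc = 3, 4, 8
descriptors = np.arange(nframes * natoms * ndesc, dtype=np.float64).reshape(
    nframes, natoms, ndesc
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "system.npy")  # hypothetical file name
    np.save(path, descriptors)
    loaded = np.load(path)

# Axis 0 indexes frames, axis 1 atoms, axis 2 descriptor components.
per_frame = loaded.mean(axis=1)                   # (nframes, ndesc): average over atoms
per_atom = loaded.reshape(-1, loaded.shape[-1])   # (nframes * natoms, ndesc)
print(loaded.shape, per_frame.shape, per_atom.shape)
```

Keeping frames and atoms on separate axes lets either view be derived with a single reduction or reshape, which would not be possible once the two were flattened together.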
Bumps [actions/upload-pages-artifact](https://github.com/actions/upload-pages-artifact) from 3 to 4. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/upload-pages-artifact/releases">actions/upload-pages-artifact's releases</a>.</em></p> <blockquote> <h2>v4.0.0</h2> <h2>What's Changed</h2> <ul> <li>Potentially breaking change: hidden files (specifically dotfiles) will not be included in the artifact by <a href="https://github.com/tsusdere"><code>@tsusdere</code></a> in <a href="https://redirect.github.com/actions/upload-pages-artifact/pull/102">actions/upload-pages-artifact#102</a> If you need to include dotfiles in your artifact: instead of using this action, create your own artifact according to these requirements <a href="https://github.com/actions/upload-pages-artifact?tab=readme-ov-file#artifact-validation">https://github.com/actions/upload-pages-artifact?tab=readme-ov-file#artifact-validation</a></li> <li>Pin <code>actions/upload-artifact</code> to SHA by <a href="https://github.com/heavymachinery"><code>@heavymachinery</code></a> in <a href="https://redirect.github.com/actions/upload-pages-artifact/pull/127">actions/upload-pages-artifact#127</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/upload-pages-artifact/compare/v3.0.1...v4.0.0">https://github.com/actions/upload-pages-artifact/compare/v3.0.1...v4.0.0</a></p> <h2>v3.0.1</h2> <h1>Changelog</h1> <ul> <li>Group tar's output to prevent it from messing up action logs <a href="https://github.com/SilverRainZ"><code>@SilverRainZ</code></a> (<a href="https://redirect.github.com/actions/upload-pages-artifact/issues/94">#94</a>)</li> <li>Update README.md <a href="https://github.com/uiolee"><code>@uiolee</code></a> (<a href="https://redirect.github.com/actions/upload-pages-artifact/issues/88">#88</a>)</li> <li>Bump the non-breaking-changes group with 1 update <a href="https://github.com/dependabot"><code>@dependabot</code></a> 
(<a href="https://redirect.github.com/actions/upload-pages-artifact/issues/92">#92</a>)</li> <li>Update Dependabot config to group non-breaking changes <a href="https://github.com/JamesMGreene"><code>@JamesMGreene</code></a> (<a href="https://redirect.github.com/actions/upload-pages-artifact/issues/91">#91</a>)</li> <li>Bump actions/checkout from 3 to 4 <a href="https://github.com/dependabot"><code>@dependabot</code></a> (<a href="https://redirect.github.com/actions/upload-pages-artifact/issues/76">#76</a>)</li> </ul> <p>See details of <a href="https://github.com/actions/upload-pages-artifact/compare/v3.0.0...v3.0.1">all code changes</a> since previous release.</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/actions/upload-pages-artifact/commit/7b1f4a764d45c48632c6b24a0339c27f5614fb0b"><code>7b1f4a7</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-pages-artifact/issues/127">#127</a> from heavymachinery/pin-sha</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/4cc19c7d3f3e6c87c68366501382a03c8b1ba6db"><code>4cc19c7</code></a> Pin <code>actions/upload-artifact</code> to SHA</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/2d163be3ddce01512f3eea7ac5b7023b5d643ce1"><code>2d163be</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-pages-artifact/issues/107">#107</a> from KittyChiu/main</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/c70484322b1c476728dcd37fac23c4dea2a0c51a"><code>c704843</code></a> fix: linted README</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/9605915f1d2fc79418cdce4d5fbe80511c457655"><code>9605915</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-pages-artifact/issues/106">#106</a> from KittyChiu/kittychiu/update-readme-1</li> <li><a 
href="https://github.com/actions/upload-pages-artifact/commit/e59cdfe6d6b061aab8f0619e759cded914f3ab03"><code>e59cdfe</code></a> Update README.md</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/a2d67043267d885050434d297d3dd3a3a14fd899"><code>a2d6704</code></a> doc: updated usage section in readme</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/984864e7b70fb5cb764344dc9c4b5c087662ef50"><code>984864e</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-pages-artifact/issues/105">#105</a> from actions/Jcambass-patch-1</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/45dc78884ca148c05eddcd8ac0a804d3365e9014"><code>45dc788</code></a> Add workflow file for publishing releases to immutable action package</li> <li><a href="https://github.com/actions/upload-pages-artifact/commit/efaad07812d4b9ad2e8667cd46426fdfb7c22e22"><code>efaad07</code></a> Merge pull request <a href="https://redirect.github.com/actions/upload-pages-artifact/issues/102">#102</a> from actions/hidden-files</li> <li>Additional commits viewable in <a href="https://github.com/actions/upload-pages-artifact/compare/v3...v4">compare view</a></li> </ul> </details>
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Avoid specifying pin_memory for test DataLoaders to eliminate warnings when no accelerator is available. #4874 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Tests** * Updated test configurations to rely on default memory pinning behavior in data loading, improving compatibility across environments. * Simplified test setup parameters to reduce potential flakiness and align with framework defaults. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.12.9 → v0.12.10](astral-sh/ruff-pre-commit@v0.12.9...v0.12.10) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Upgraded PyTorch to 2.8 across CPU and CUDA 12.x environments for improved compatibility and stability. * Updated development container to download the matching LibTorch 2.8 CPU bundle. * Refreshed CI pipelines (build, test, analysis) to install and validate against PyTorch 2.8. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jinzhe Zeng <njzjz@qq.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ent overflow (#4924)

The `numel` function in the Paddle backend was using `int` for computing tensor element counts, which can overflow for large tensors. This fix changes the return type and intermediate calculations to `size_t` to handle larger tensor sizes safely.

## Problem

The original implementation multiplied tensor dimensions as `int` values:

```cpp
int numel(const paddle_infer::Tensor& x) const {
  // TODO: There might be a overflow problem here for multiply int numbers.
  int ret = 1;
  std::vector<int> x_shape = x.shape();
  for (std::size_t i = 0, n = x_shape.size(); i < n; ++i) {
    ret *= x_shape[i];  // Can overflow for large tensors
  }
  return ret;
}
```

For large tensors (e.g., shape `[50000, 50000, 10]` = 25 billion elements), this causes integer overflow and returns negative values.

## Solution

- Changed return type from `int` to `size_t`
- Changed intermediate calculations to use `size_t` with explicit casting
- Updated all calling sites to use `size_t` variables
- Removed the TODO comment since the overflow issue is now resolved

```cpp
size_t numel(const paddle_infer::Tensor& x) const {
  size_t ret = 1;
  std::vector<int> x_shape = x.shape();
  for (std::size_t i = 0, n = x_shape.size(); i < n; ++i) {
    ret *= static_cast<size_t>(x_shape[i]);  // Safe from overflow
  }
  return ret;
}
```

The `size_t` type can handle up to 2^64 elements on 64-bit systems (vs 2^31 for `int`), making it appropriate for tensor element counts. This change is backward compatible since `std::vector::resize()` and other consumers already accept `size_t`.

Fixes #4551.
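The arithmetic behind the overflow can be checked with a small Python sketch. It is only an illustration, not the Paddle backend code: `wrap_int32`, `numel_int32`, and `numel_wide` are hypothetical helper names, and the two's-complement wraparound is an assumption about typical hardware behavior (in C++, signed overflow is formally undefined).

```python
import math

INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reinterpret an integer as a two's-complement 32-bit value (assumed wraparound)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def numel_int32(shape):
    """Element count accumulated in a simulated 32-bit signed integer."""
    ret = 1
    for dim in shape:
        ret = wrap_int32(ret * dim)
    return ret


def numel_wide(shape):
    """Element count with arbitrary-precision integers (stands in for size_t here)."""
    return math.prod(shape)


shape = [50000, 50000, 10]
print(numel_wide(shape) > INT32_MAX)             # the true count does not fit in int32
print(numel_int32(shape) == numel_wide(shape))   # the simulated 32-bit result is wrong
```

The true count for this shape is 25,000,000,000, which exceeds the 32-bit signed maximum of 2,147,483,647, so any 32-bit accumulator has to produce an incorrect value.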
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Support gradient accumulation for the Paddle backend.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Configurable gradient accumulation (`acc_freq`) that defers optimizer updates, optional gradient clipping, and multi-GPU gradient synchronization to the configured interval; `acc_freq=1` preserves prior behavior.
- **Documentation**
  - Added argument docs and a Paddle backend notice describing `acc_freq`.
- **Tests**
  - Added tests exercising gradient accumulation and updated test cleanup.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
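A minimal sketch of what gradient accumulation with an `acc_freq` interval means, written as a toy scalar SGD loop in plain Python. This is not the Paddle trainer: the function `train`, the learning rate, and the choice to average (rather than sum) gradients over the window are assumptions made for illustration.

```python
def train(grads_per_step, lr=0.1, acc_freq=1):
    """Toy SGD on one scalar weight; the optimizer steps every `acc_freq` batches.

    Assumption: gradients are averaged over the accumulation window.
    """
    weight = 0.0
    accumulated = 0.0
    num_updates = 0
    for step, grad in enumerate(grads_per_step, start=1):
        accumulated += grad / acc_freq       # accumulate instead of stepping
        if step % acc_freq == 0:             # window is full: apply one update
            weight -= lr * accumulated
            accumulated = 0.0
            num_updates += 1
    return weight, num_updates


grads = [1.0, 1.0, 1.0, 1.0]
print(train(grads, acc_freq=1)[1])  # one optimizer update per batch
print(train(grads, acc_freq=2)[1])  # one optimizer update per two batches
```

With `acc_freq=1` the loop reduces to ordinary per-batch updates, which matches the statement that `acc_freq=1` preserves prior behavior.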
Introduces model branch alias and info fields to model configuration, adds utility functions for handling model branch dictionaries, and updates related modules to use alias-based lookup and provide detailed branch information. Enhances multi-task model usability and improves logging of available model branches.

Example:

```
dp --pt show 0415_compat_new.pt model-branch
[2025-08-14 10:05:54,246] DEEPMD WARNING To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2025-08-14 10:05:59,122] DEEPMD INFO This is a multitask model
[2025-08-14 10:05:59,122] DEEPMD INFO Available model branches are ['Dai2023Alloy', 'Zhang2023Cathode', 'Gong2023Cluster', 'Yang2023ab', 'UniPero', 'Huang2021Deep-PBE', 'Liu2024Machine', 'Zhang2021Phase', 'Jinag2021Accurate', 'Chen2023Modeling', 'Wen2021Specialising', 'Wang2022Classical', 'Wang2022Tungsten', 'Wu2021Deep', 'Huang2021Deep-PBEsol', 'Transition1x', 'Wang2021Generalizable', 'Wu2021Accurate', 'MPTraj', 'Li2025APEX', 'Shi2024SSE', 'Tuo2023Hybrid', 'Unke2019PhysNet', 'Shi2024Electrolyte', 'ODAC23', 'Alex2D', 'OMAT24', 'SPICE2', 'OC20M', 'OC22', 'Li2025General', 'RANDOM'], where 'RANDOM' means using a randomly initialized fitting net.
[2025-08-14 10:05:59,125] DEEPMD INFO Detailed information:
+-----------------------+------------------------------+--------------------------------+--------------------------------+
| Model Branch          | Alias                        | description                    | observed_type                  |
+-----------------------+------------------------------+--------------------------------+--------------------------------+
| Dai2023Alloy          | Alloys, Domains_Alloy        | The dataset contains           | ['La', 'Fe', 'Ho', 'Cu', 'Sn', |
|                       |                              | structure-energy-force-virial  | 'Cd', 'Y', 'Be', 'V', 'Sm',    |
|                       |                              | data for 53 typical metallic   | 'In', 'Pr', 'Mo', 'Mn', 'Gd',  |
|                       |                              | elements in alloy systems,     | 'Ru', 'Nd', 'Li', 'Tm', 'K',   |
|                       |                              | including ~9000 intermetallic  | 'Pt', 'Ir', 'Na', 'Hf', 'Dy',  |
|                       |                              | compounds and FCC, BCC, HCP    | 'Ca', 'Nb', 'Au', 'Sr', 'Si',  |
|                       |                              | structures. It consists of two | 'Ge', 'Co', 'W', 'Cr', 'Zn',   |
|                       |                              | parts: DFT-generated relaxed   | 'Ag', 'Ti', 'Ni', 'Zr', 'Pd',  |
|                       |                              | and deformed structures, and   | 'Os', 'Ta', 'Rh', 'Sc', 'Tb',  |
|                       |                              | randomly distorted structures  | 'Al', 'Ga', 'Re', 'Lu', 'Er',  |
|                       |                              | produced covering pure metals, | 'Mg', 'Ce', 'Pb']              |
|                       |                              | solid solutions, and           |                                |
|                       |                              | intermetallics with vacancies. |                                |
+-----------------------+------------------------------+--------------------------------+--------------------------------+
| OMAT24                | Default, Materials, Omat24   | OMat24 is a large-scale open   | ['La', 'Fe', 'Cu', 'Cd', 'Be', |
|                       |                              | dataset containing over 110    | 'Ar', 'V', 'Sm', 'In', 'Pm',   |
|                       |                              | million DFT calculations       | 'Pr', 'Mn', 'Ru', 'He', 'Nd',  |
|                       |                              | spanning diverse structures    | 'Th', 'Pa', 'K', 'Pt', 'Yb',   |
|                       |                              | and compositions. It is        | 'Dy', 'Sr', 'Co', 'Np', 'Cr',  |
|                       |                              | designed to support AI-driven  | 'Tl', 'Br', 'Se', 'Ni', 'Zr',  |
|                       |                              | materials discovery by         | 'Pu', 'O', 'Xe', 'Tb', 'Ga',   |
|                       |                              | providing broad and deep       | 'Lu', 'H', 'Ne', 'Er', 'Ce',   |
|                       |                              | coverage of chemical space.    | 'I', 'Kr', 'Ho', 'Cs', 'Sn',   |
|                       |                              |                                | 'Rb', 'Y', 'N', 'F', 'Mo',     |
|                       |                              |                                | 'Gd', 'B', 'Li', 'Tm', 'Sb',   |
|                       |                              |                                | 'Ir', 'Hf', 'Na', 'Ca', 'Nb',  |
|                       |                              |                                | 'Au', 'As', 'Si', 'Ge', 'W',   |
|                       |                              |                                | 'Zn', 'Hg', 'Ag', 'Bi', 'Ti',  |
|                       |                              |                                | 'Os', 'Cl', 'Pd', 'P', 'U',    |
|                       |                              |                                | 'Tc', 'Ta', 'Ba', 'Rh', 'Sc',  |
|                       |                              |                                | 'C', 'S', 'Te', 'Al', 'Re',    |
|                       |                              |                                | 'Eu', 'Mg', 'Pb', 'Ac']        |
+-----------------------+------------------------------+--------------------------------+--------------------------------+
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * Alias-based multi-task branch selection for evaluation and fine-tuning; new API to query model alias/branch info; `show` now prints a detailed model-branch table.
* **Documentation**
  * Model config gains optional fields to declare branch aliases and per-branch info (PyTorch-only).
* **Examples**
  * Added a two-task PyTorch example demonstrating aliases, shared components, and per-branch info.
* **Tests**
  * Tests include the new example and now filter out table-like show output.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
Signed-off-by: Duo <50307526+iProzd@users.noreply.github.com>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
…4916)

This PR implements a feature request to skip all GitHub workflows on push events for bot-created branches to avoid redundant CI runs and save resources.

## Problem

Bot-created branches (`copilot/*`, `dependabot/*`, and `pre-commit-ci-update-config`) currently trigger workflows on both push events and when PRs are created. This creates duplicate CI runs since the same tests will run again when the PR is opened, wasting CI time and resources.

## Solution

Added `branches-ignore` patterns to workflow files that have push triggers to skip the following branch patterns:

- `copilot/**` - GitHub Copilot branches
- `dependabot/**` - Dependabot dependency update branches
- `pre-commit-ci-update-config` - Pre-commit CI configuration update branches

## Changes Made

Updated 8 workflow files with bot branch ignore patterns:

- `build_cc.yml`, `build_wheel.yml`, `codeql.yml`, `package_c.yml`, `test_cc.yml`, `test_python.yml` - Added bot branch patterns to existing `branches-ignore` lists
- `copilot-setup-steps.yml` - Added `branches-ignore` alongside existing `paths` filter
- `mirror_gitee.yml` - Converted from array syntax to explicit push configuration with `branches-ignore`

The `todo.yml` workflow was left unchanged since it only runs on the `devel` branch, making bot branch exclusions unnecessary.

Example of the change:

```yaml
on:
  push:
    branches-ignore:
      - "gh-readonly-queue/**" # existing
      - "copilot/**" # new
      - "dependabot/**" # new
      - "pre-commit-ci-update-config" # new
```

## Impact

- ✅ Bot branches will skip workflows on push events but still trigger them when PRs are created
- ✅ Normal development branches continue to trigger workflows as expected
- ✅ Reduces unnecessary CI runs and resource usage
- ✅ Maintains full test coverage through PR-triggered workflows
- ✅ All workflow files maintain valid YAML syntax

Fixes #4915.
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added per-atom weighting for force evaluation: computes and reports weighted MAE/RMSE alongside unweighted metrics, includes weighted metrics in system-average summaries, logs weighted force metrics, and safely handles zero-weight cases. Also propagates the per-atom weight field into reporting. - Tests - Added end-to-end tests validating weighted vs unweighted force MAE/RMSE and verifying evaluator outputs when using per-atom weight masks. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
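As a rough illustration of per-atom weighted force metrics, the sketch below computes MAE and RMSE over force components with an optional weight per atom. It is not the evaluator code from this change: the function name `force_errors`, the normalization by the summed weights, and the `(0.0, 0.0)` fallback for all-zero weights are assumptions.

```python
import math


def force_errors(pred, ref, atom_weights=None):
    """Return (mae, rmse) over force components, optionally weighted per atom.

    pred, ref: one [fx, fy, fz] list per atom.
    atom_weights: one non-negative weight per atom (defaults to 1.0 for every atom).
    """
    if atom_weights is None:
        atom_weights = [1.0] * len(pred)
    abs_sum = sq_sum = weight_sum = 0.0
    for p, r, w in zip(pred, ref, atom_weights):
        for pc, rc in zip(p, r):
            diff = pc - rc
            abs_sum += w * abs(diff)
            sq_sum += w * diff * diff
            weight_sum += w
    if weight_sum == 0.0:
        return 0.0, 0.0  # assumed fallback: avoid dividing by zero
    return abs_sum / weight_sum, math.sqrt(sq_sum / weight_sum)


pred = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
ref = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
print(force_errors(pred, ref))              # unweighted: both atoms count
print(force_errors(pred, ref, [1.0, 0.0]))  # mask out the second atom
```

Using a 0/1 weight mask restricts the metrics to the selected atoms, while the unweighted call reproduces the usual component-wise MAE/RMSE.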
… and tests (#4936)

- [x] Add comprehensive type hints to core modules excluding backends and tests
- [x] **Fixed type annotation issues from code review:**
  - Fixed `head` parameter type from `Any` to `str` in calculator.py
  - Fixed `neighbor_list` parameter type to use proper ASE NeighborList type annotation
  - Fixed `**kwargs` type from `object` to `Any` in deep_polar.py
  - Fixed `write_model_devi_out` return type from `None` to `np.ndarray` to match actual return value
  - Fixed `get_natoms_vec` return type from `list[int]` to `np.ndarray` to match actual return type
  - Fixed `_get_natoms_2` return type from `list[int]` to `tuple[int, np.ndarray]` to match actual return values
  - Fixed `make_index` return type from `dict[str, int]` to `str` to match actual return value
  - Added missing imports for type annotations (ASE NeighborList, Any)

**Current status:** All type annotation suggestions from code review have been addressed. All ruff checks pass with zero violations.

---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
This pull request extends the testing functionality in DeepMD by allowing users to specify training and validation data directly via input JSON files, in addition to existing system and datafile options. It updates the command-line interface, the main test logic, and adds comprehensive tests to cover these new features, including support for recursive glob patterns when selecting systems from JSON files.

### Feature enhancements to testing data sources

* The `test` function in `deepmd/entrypoints/test.py` now accepts `train_json` and `valid_json` arguments, allowing users to specify training or validation systems for testing via input JSON files. It processes these files to extract system paths, including support for recursive glob patterns. The function also raises an error if no valid data source is specified.
* **The command-line interface in `deepmd/main.py` is updated to add `--train-data` and `--valid-data` arguments for the test subparser, enabling direct specification of input JSON files for training and validation data.**

### Test coverage improvements

* New and updated tests in `source/tests/pt/test_dp_test.py` verify the ability to run tests using input JSON files for both training and validation data, including cases with recursive glob patterns. This ensures robust handling of various data source configurations.
* Additional argument parser tests in `source/tests/common/test_argument_parser.py` confirm correct parsing of the new `--train-data` and `--valid-data` options.

### Internal code improvements

* Refactored imports and type annotations in `deepmd/entrypoints/test.py` to support the new functionality and improve code clarity.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- New Features
  - Added support for supplying test systems via JSON files, including selecting training or validation data.
  - Introduced CLI options `--train-data` and `--valid-data` for the test command.
  - Supports resolving relative paths from JSON and optional recursive glob patterns.
- Changes
  - Test command now requires at least one data source (JSON, data file, or system); clearer errors when none or no systems found.
- Tests
  - Expanded test coverage for JSON-driven inputs and recursive glob patterns; refactored helpers for improved readability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------
Signed-off-by: Chun Cai <amoycaic@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Implements TensorFlow support for the `dp change-bias` command with proper checkpoint handling and variable restoration. This brings the TensorFlow backend to feature parity with the PyTorch implementation.

## Key Features

- **Checkpoint file support**: Handles individual checkpoint files (`.ckpt`, `.meta`, `.data`, `.index`) and frozen models (`.pb`)
- **Proper variable restoration**: Variables are correctly restored from checkpoints using session initialization before bias modification
- **User-defined bias support**: Supports `-b/--bias-value` option with proper validation against model type_map
- **Data-based bias calculation**: Leverages existing `change_energy_bias_lower` functionality for automatic bias computation
- **Checkpoint preservation**: Saves modified variables to separate checkpoint directory for continued training
- **Cross-backend consistency**: Same CLI interface and functionality as the PyTorch backend

## Before vs After

**Variable restoration**:

- Before: `Change energy bias of ['O', 'H'] from [0. 0.] to [calculated values]` (variables never restored)
- After: `Change energy bias of ['O', 'H'] from [-93.57 -187.15] to [-93.60 -187.19]` (proper restoration)

**Output**: Creates both updated checkpoint files and a frozen model for continued training

**Documentation**: Comprehensive documentation covering both TensorFlow and PyTorch backends with examples and backend-specific details

The implementation includes comprehensive test coverage with real model training to validate functionality without mocks.

Fixes #4018.
--------- Signed-off-by: Jinzhe Zeng <njzjz@qq.com> Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com> Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
This PR implements comprehensive type annotation coverage for the deepmd.pt PyTorch backend and resolves critical TorchScript compilation errors that prevented model deployment.

## Type Annotation Enforcement

Added complete type annotations to all deepmd.pt module functions, eliminating 7,030+ ANN violations across 107 Python files. This provides:

- Better IDE support and code maintainability
- Consistent typing standards throughout the PyTorch backend
- Enhanced developer experience with clear function signatures

## TorchScript Compilation Fixes

Resolved multiple TorchScript compilation errors that prevented model deployment:

```python
# Before: TorchScript compilation failed
sw.to(dtype=env.GLOBAL_PT_FLOAT_PRECISION)  # Error on Optional[Tensor]

# After: Proper None handling
sw.to(dtype=env.GLOBAL_PT_FLOAT_PRECISION) if sw is not None else None
```

Key fixes include:

- Added proper None checks before `.to()` calls on `Optional[torch.Tensor]` values
- Resolved issues across all descriptor types (SE-A, SE-T, SE-T-TEBD, DPA1, DPA2, DPA3)
- Fixed abstract method patterns that conflicted with TorchScript compilation
- Corrected return type annotations in SpinModel to accurately reflect Optional types

## Pre-commit Compliance

- Fixed deprecated type annotation imports (Dict→dict, Tuple→tuple)
- Resolved import ordering and undefined name issues
- Removed unnecessary imports and improved code consistency
- All pre-commit checks now pass with zero violations

The PyTorch backend now has complete type coverage and full TorchScript deployment compatibility, enabling production model serving scenarios.
--------- Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com> Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Fix #3672.

Fixes backend conversion issues for dipole models when using the `sel_type` parameter. The `dp convert-backend` command was failing due to missing serialization support for `None` networks and incomplete dipole fitting serialization.

- [x] Fix NetworkCollection serialization to handle `None` networks
- [x] Add missing `@variables` dictionary for DipoleFittingSeA PyTorch compatibility
- [x] Include `sel_type` in serialized data for proper backend conversion
- [x] Fix TF fitting deserialization to skip `None` networks
- [x] Add comprehensive tests for `sel_type` parameter
- [x] Remove duplicate test classes and merge parameterized tests
- [x] Clean up accidentally committed test output files
- [x] Refactor additional_data property to return dictionary directly
- [x] Resolve merge conflicts in .gitignore after rebase

All tests pass and the `dp convert-backend` command now works for dipole models with the `sel_type` parameter. The branch has been successfully rebased against the latest devel branch with all conflicts resolved.

---------
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
…pecific model instances (#4931)

- [x] Add new `get_model()` method to `DeepEval` backends for accessing backend-specific model instances
- [x] Fix JAX backend implementation to return JAX model instance
- [x] Fix all backends to return backend-specific models instead of dpmodel conversion
- [x] Fix PyTorch .pth model test to expect TorchScript ScriptModule
- [x] Add type annotations to all `get_model()` methods

## Type annotations added

Added proper return type annotations to all `get_model()` methods:

- **Abstract base class**: `-> Any` (backend-agnostic)
- **High-level wrapper**: `-> Any` (delegates to backend)
- **PyTorch backend**: `-> "BaseModel"` (PyTorch model instance)
- **TensorFlow backend**: `-> "tf.Graph"` (TensorFlow graph)
- **JAX backend**: `-> Any` (JAX model instance)
- **Paddle backend**: `-> "BaseModel"` (Paddle model instance)
- **dpmodel backend**: `-> "BaseModel"` (dpmodel BaseModel)

The type imports are properly placed in `TYPE_CHECKING` blocks to avoid runtime import issues while providing proper type hints for development tools and static analysis.

All backends now return their native model types as documented, providing users with access to backend-specific capabilities while maintaining proper type safety.

---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.12.10 → v0.12.11](astral-sh/ruff-pre-commit@v0.12.10...v0.12.11) - [github.com/pre-commit/mirrors-clang-format: v20.1.8 → v21.1.0](pre-commit/mirrors-clang-format@v20.1.8...v21.1.0) <!--pre-commit.ci end--> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.15.5 → v0.15.6](astral-sh/ruff-pre-commit@v0.15.5...v0.15.6) - [github.com/pre-commit/mirrors-clang-format: v22.1.0 → v22.1.1](pre-commit/mirrors-clang-format@v22.1.0...v22.1.1) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
#5302)

## Summary

- Add `dp freeze` support for the pt_expt backend, enabling checkpoint `.pt` → exported `.pte` conversion
- Add end-to-end tests for both `dp freeze` and `dp test` with `.pte` models

## Background

The pt_expt backend can export models to `.pte` via `deserialize_to_file()`, and `dp test` can already load `.pte` models through the registered `DeepEval`. However, `dp freeze` was not wired up: calling `dp freeze -b pt-expt` hit `RuntimeError: Unsupported command 'freeze'`.

## Changes

**`deepmd/pt_expt/entrypoints/main.py`**

- Add `freeze()` function: loads `.pt` checkpoint → reconstructs model via `get_model` + `ModelWrapper` → serializes → exports to `.pte` via `deserialize_to_file`
- Wire `freeze` command in `main()` dispatcher with checkpoint directory resolution and `.pte` default suffix

**`source/tests/pt_expt/test_dp_freeze.py`** (new)

- `test_freeze_pte`: verify `.pte` file is created from checkpoint
- `test_freeze_main_dispatcher`: test `main()` CLI dispatcher with freeze command
- `test_freeze_default_suffix`: verify non-`.pte` output suffix is corrected to `.pte`

**`source/tests/pt_expt/test_dp_test.py`** (new)

- `test_dp_test_system`: test `dp test` with `-s` system path, verify `.e.out`, `.f.out`, `.v.out` outputs
- `test_dp_test_input_json`: test `dp test` with `--valid-data` JSON input

## Test plan

- [x] `python -m pytest source/tests/pt_expt/test_dp_freeze.py -v` (3 passed)
- [x] `python -m pytest source/tests/pt_expt/test_dp_test.py -v` (2 passed)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added a "freeze" CLI command to convert PyTorch checkpoints into portable .pte model files, with output filename normalization and sensible default naming; multi-task head usage now emits a clear unsupported message.
* **Tests**
  * Added unit tests for the freeze command and CLI dispatch behavior.
  * Added integration tests validating end-to-end dp_test workflows using frozen models.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
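The output filename normalization mentioned above can be pictured with a small sketch. The helper name `normalize_output` is hypothetical, and replacing (rather than appending) the suffix is an assumption; the actual `freeze()` code may behave differently.

```python
from pathlib import Path


def normalize_output(output: str, suffix: str = ".pte") -> str:
    """Return `output` with its suffix forced to `suffix` (hypothetical helper)."""
    path = Path(output)
    if path.suffix != suffix:
        # Assumption: a wrong or missing suffix is replaced, not appended to.
        path = path.with_suffix(suffix)
    return str(path)


print(normalize_output("frozen_model.pth"))
```

A name that already ends in `.pte` is returned unchanged.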
For the standard deviation of `fparam`/`aparam`, $\sigma = \sqrt{\frac{1}{N}
\sum_{i=1}^{N} (x_i - \bar{x})^2}=\sqrt{\frac{\sum x_i^2}{N} - \left(
\frac{\sum x_i}{N} \right)^2}$.
When all `fparam`/`aparam` values are equal in one dimension,
$\frac{\sum x_i^2}{N} - \left( \frac{\sum x_i}{N} \right)^2$ should equal
zero.
However, floating-point round-off sometimes makes it a very small negative
number (for example, -1e-18), so $\sqrt{\frac{\sum x_i^2}{N} -
\left( \frac{\sum x_i}{N} \right)^2}$ becomes `nan`.
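A minimal NumPy sketch of the guard described here: clamp the variance at zero before taking the square root. The function name `stable_std` and the sample data are illustrative assumptions, not the code changed in this PR.

```python
import numpy as np


def stable_std(x: np.ndarray) -> np.ndarray:
    """Per-dimension std via E[x^2] - E[x]^2, clamped at zero (illustrative)."""
    mean = x.mean(axis=0)
    mean_sq = (x * x).mean(axis=0)
    # Round-off can make this difference slightly negative for constant columns.
    var = mean_sq - mean * mean
    # Clamp before the square root so the result is a finite number, not nan.
    return np.sqrt(np.maximum(var, 0.0))


# Hypothetical data: the first column is constant, the second varies.
fparam = np.array([[0.1, 1.0], [0.1, 2.0], [0.1, 3.0]])
std = stable_std(fparam)
```

For the constant column the result is zero or a negligibly small positive number, never `nan`.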
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved numerical stability in variance/std calculations by ensuring
intermediate variance values are non-negative before taking the square
root. This prevents occasional floating-point round-off from producing
`nan` and yields more reliable statistical outputs for
edge-case inputs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
#5325)

- Fix natoms[0] -> natoms in generalized force branch (natoms is int)
- Replace xp.einsum with array-API-compatible xp.sum + broadcasting
- Fix return type annotation of Loss.call and EnergyLoss.call from dict[str, Array] to tuple[Array, dict[str, Array]]
- Add TestEnerGF consistency test for generalized force code path
- Add dpmodel-level unit tests for EnergyLoss (basic, aecoeff, generalized force, huber, serialize round-trip)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **Bug Fixes**
  * Enhanced numerical accuracy in energy loss force calculations through optimized computation methods.
* **Tests**
  * Added comprehensive test coverage for energy loss calculations, including generalized coordinate scenarios.
  * Expanded multi-backend compatibility validation across TensorFlow, PyTorch, JAX, Array API, and Paddle.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
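The einsum replacement listed above can be illustrated as follows. The subscripts `"bij,bi->bj"`, the array names, and the shapes are assumptions chosen for the sketch; the PR text does not show the actual expression.

```python
import numpy as np

rng = np.random.default_rng(0)
nframes, ndof, ngen = 2, 6, 3  # hypothetical sizes
drdq = rng.normal(size=(nframes, ndof, ngen))  # assumed d(r)/d(q) layout
force = rng.normal(size=(nframes, ndof))

# einsum form: contract over the Cartesian degree-of-freedom axis i.
gen_force_einsum = np.einsum("bij,bi->bj", drdq, force)

# Same contraction with only broadcasting and a sum, both of which are
# available in array-API namespaces that lack einsum.
gen_force_sum = np.sum(drdq * force[:, :, None], axis=1)
```

Both forms yield an array of shape `(nframes, ngen)`.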
## Summary

- Add `FrozenModel` to pt_expt backend for loading pre-frozen model files (`.pte`, `.pth`, `.dp`)
- Create dpmodel-level `FrozenModel` (`NativeOP` + `BaseModel`) with all delegation methods, so pt_expt wraps it via `@torch_module` instead of duplicating code
- pt_expt `FrozenModel` handles `.pte` natively via `serialize_from_file`, falls back to generic backend detection for other formats
- Add pt_expt support to frozen model consistency test

## Test plan

- [x] Cross-backend consistency test (`source/tests/consistent/model/test_frozen.py`): pt_expt consistent_with_ref and self_consistent pass
- [x] Existing pt/tf frozen model tests unaffected

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added support for loading and using frozen model files across workflows and exposed a FrozenModel in the Python API.
  * Broadened backend compatibility to include an additional experiment backend for frozen models.
* **Tests**
  * Added/updated tests to validate frozen-model loading and evaluation across supported backends, including the new experiment backend.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
1. refactor name-based routing
2. add slice mode for HybridMuon opt
3. add Magma-lite damping for Muon path

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * HybridMuon gains routing modes (slice, 2d, flat), name-aware routing for biases/Adam variants, and a magma_muon option for Magma-lite damping. Optimizer now accepts named parameters; deprecated 2D-only options removed.
* **Documentation**
  * Updated optimizer docs to describe new routing modes, magma_muon and flash_muon options, and adjusted lr_adjust default.
* **Tests**
  * Expanded tests for routing modes, Magma damping, and state compatibility; some legacy tests consolidated.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Problem

- DPA3-Omol-Large is already published on model hubs but is not exposed through `dp pretrained`.
- Users currently cannot download or resolve it via a built-in model alias.

Change

- add `DPA3-Omol-Large` to the built-in pretrained model registry
- include Hugging Face / hf-mirror / ModelScope download URLs and the model sha256
- update the pretrained-model docs and add alias/backend coverage in tests

Notes

- The SHA256 (`dc4d252b31450b41eb3546cc48f640ad0831c0b5d069ce27d996e0ff58fc037a`) was taken from the Hugging Face LFS object for `DPA3-Omol-Large.pt`.
- In this environment I only ran lightweight local validation (`py_compile` + an AST-based registry check). I did not run the full project test suite because the repo test environment was not fully provisioned here.

Authored by OpenClaw (model: gpt-5.4)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added "DPA3-Omol-Large" as a new pretrained model, available from multiple download mirrors for improved accessibility and reliability.
* **Documentation**
  * Updated pretrained model examples to include "DPA3-Omol-Large".
* **Tests**
  * Added tests to validate recognition and alias normalization for the new model name.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
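For readers who want to check a downloaded model file against the SHA256 quoted above, a generic standard-library sketch follows. The helper names are hypothetical and this is not the verification code used by `dp pretrained`.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare against an expected hex digest, ignoring letter case."""
    return sha256_of_file(path) == expected_hex.lower()
```

Usage would look like `matches_checksum(Path("DPA3-Omol-Large.pt"), "dc4d25...")` with the full digest from the notes.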
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Introduced loss_func ("mse" or "mae") to select MSE vs MAE for
energy/force/virial/atom losses.
* Added f_use_norm to enable vector‑norm MAE behavior when allowed.
* **Validation**
* Enforced that f_use_norm is only valid when use_huber is enabled or
loss_func="mae"; invalid combos are rejected.
* **Tests**
* Extended loss tests and skipping logic to cover loss_func and
f_use_norm combinations.
* **Documentation**
* Updated docs to describe loss_func and resulting metric names (rmse_*
vs mae_*).
* **Chores**
* New options are persisted in serialized configurations.
* **Notes**
* Some backends currently only support "mse" (MAE not yet available
everywhere).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
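To make the difference between the options concrete, here is a toy force loss in plain Python. It is one plausible reading of the release note, not the DeePMD-kit implementation: the function name is hypothetical, the Huber case is left out, and the exact definition of the vector-norm MAE is an assumption.

```python
import math


def force_loss(pred, label, loss_func="mse", f_use_norm=False):
    """Toy force loss over lists of per-atom 3-vectors (illustrative only)."""
    if f_use_norm and loss_func != "mae":
        # Mirrors the validation rule above, ignoring the use_huber case.
        raise ValueError("f_use_norm requires loss_func='mae' in this sketch")
    diffs = [[p - l for p, l in zip(pv, lv)] for pv, lv in zip(pred, label)]
    if loss_func == "mse":
        comps = [d * d for v in diffs for d in v]
        return sum(comps) / len(comps)
    if loss_func == "mae":
        if f_use_norm:
            # Assumed meaning: average the Euclidean norm of each error vector.
            norms = [math.sqrt(sum(d * d for d in v)) for v in diffs]
            return sum(norms) / len(norms)
        comps = [abs(d) for v in diffs for d in v]
        return sum(comps) / len(comps)
    raise ValueError(f"unknown loss_func: {loss_func!r}")
```

With a single error vector `(3, 4, 0)`, the component-wise MAE averages `3`, `4`, and `0`, while the norm variant uses the vector length `5`.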
This is a **breaking change**: bump the model data version of
`LinearEnergyAtomicModel` from 2 to 3 because of the bug fix.
## Summary
- Add `LinearEnergyModel` to pt_expt backend, enabling linear
combination of multiple sub-models
- Add `get_linear_model` factory in pt_expt for constructing from config
dicts
- Fix bugs in dpmodel/pt shared code:
- `get_linear_model` (pt) not propagating `type_map` to sub-models
- `LinearEnergyAtomicModel` (dpmodel) missing `weights` parameter,
causing deserialization failure
- `_compute_weight` calling `array_api_compat.array_namespace()` with
Python list and using numpy dtype with torch
## Test plan
- [x] Cross-backend consistency test
(`source/tests/consistent/model/test_linear_ener.py`) — pt vs pt_expt,
with parameterized exclude types
- [x] Existing dpmodel/pt linear model tests still pass
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added LinearEnergyModel and a "linear_ener" fitting path to combine
multiple sub-models.
* Configurable weighting when combining sub-model energies: "mean",
"sum", or a custom vector; weights are validated and stored.
* **Behavior**
* Sub-model type mappings are propagated when omitted.
* Model serialization now persists weight settings and advances the
model version for compatibility.
* **Tests**
* Added cross-backend and unit tests validating weighting behaviors,
outputs, and selector updates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
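The "mean", "sum", and custom-vector weighting described in the release notes can be sketched in a few lines. The function names are hypothetical and the sketch combines scalar energies only; the real model operates on per-atom arrays.

```python
def resolve_weights(weights, n_models):
    """Turn a weight spec into one coefficient per sub-model (illustrative)."""
    if weights == "mean":
        return [1.0 / n_models] * n_models
    if weights == "sum":
        return [1.0] * n_models
    if isinstance(weights, (list, tuple)):
        if len(weights) != n_models:
            raise ValueError("need exactly one weight per sub-model")
        return [float(w) for w in weights]
    raise ValueError(f"unsupported weights spec: {weights!r}")


def combine(energies, weights):
    """Weighted linear combination of sub-model energies."""
    coeffs = resolve_weights(weights, len(energies))
    return sum(c * e for c, e in zip(coeffs, energies))
```

For two sub-models with energies `1.0` and `3.0`, "mean" gives `2.0` and "sum" gives `4.0`.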
## Summary
- Implement `SpinModel` and `SpinEnergyModel` in the pt_expt backend,
supporting spin degrees of freedom for magnetic systems
- Make dpmodel `SpinModel` array-API compatible so the same code works
across numpy/torch/jax backends
- Add spin virial correction (`coord_corr_for_virial`) to dpmodel and
pt_expt, matching the pt backend
- Fix `get_spin_model` in dpmodel to not mutate the caller's input data
dict (pt backend already used `deepcopy`)
## Changes
### dpmodel (`deepmd/dpmodel/model/`)
- `spin_model.py`: Replace all `np.*` operations with `array_api_compat`
equivalents (`xp.concat`, `xp.where`, `xp.zeros` with `device=`, slicing
instead of `xp.split`). Add `compute_or_load_stat` and virial correction
support via
`coord_corr_for_virial` / `extended_coord_corr`.
- `make_model.py`: Thread `coord_corr_for_virial` through `call_common`
→ `model_call_from_call_lower` (extends to ghost atoms via mapping) →
`call_common_lower` → `forward_common_atomic`.
- `model.py`: Add `copy.deepcopy(data)` in `get_spin_model` to prevent
in-place mutation of input dict.
### pt_expt (`deepmd/pt_expt/model/`)
- `spin_model.py` (new): `@torch_module` wrapper inheriting from dpmodel
`SpinModel`.
- `spin_ener_model.py` (new): `SpinEnergyModel` with `forward()` /
`forward_lower()` / `forward_lower_exportable()` providing user-facing
output translation.
- `make_model.py`, `transform_output.py`: Accept `extended_coord_corr`
for virial correction.
### Tests
- `test_spin_ener_model.py` (new): Unit tests for output keys/shapes,
serialize/deserialize round-trip, dpmodel consistency, force
finite-difference, virial finite-difference, and `torch.export`
exportability.
- `test_spin_ener.py`: Cross-backend consistency tests for
`call`/`call_lower`, `compute_or_load_stat`, and load-from-file. Virial
output now compared across pt and pt_expt.
## Test plan
- [x] `python -m pytest source/tests/pt_expt/model/ -v` — all 28 tests
pass
- [x] `python -m pytest source/tests/consistent/model/test_spin_ener.py
-v` — all 12 tests pass (18 skipped for uninstalled backends)
- [x] Force and virial verified by finite-difference tests
- [x] `torch.export.export` verified on `forward_lower_exportable`
- [x] `compute_or_load_stat` load-from-file verified across
dp/pt/pt_expt
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added SpinEnergyModel with exportable lower-level forward,
energy/force/virial outputs, and compute_or_load_stat preprocessing.
* Optional virial coordinate-correction can be supplied and is
propagated through forward paths.
* **Bug Fixes**
* Prevented in-place mutation of input data during model preparation.
* **Tests**
* Expanded tests for exportable workflows, force/virial validation,
multi-backend (including PT_EXPT) and array‑API strict modes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: Duo <50307526+iProzd@users.noreply.github.com>
## Summary

This PR completely restructures the `learning-rate.md` documentation to improve clarity, organization, and accuracy. The previous version had content scattered across sections with significant repetition. The new structure follows a user-centric approach: quick start → configuration reference → mathematical theory.

## Key Changes

### Structural Improvements

- **Reorganized section order**: Quick Start → Parameters → Schedule Types → Warmup → Mathematical Theory → Migration Guide
- **Eliminated content duplication**: Removed redundant formulas between Theory and Instructions sections

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **Documentation**
  * Reworked learning-rate guide into a structured, example-driven reference with Quick Start, explicit exponential and cosine schedules, and JSON configuration examples.
  * Added a Notation/Theory section, clear warmup formulas and mutual‑exclusivity rules, unified parameter descriptions (start_lr/stop_lr/stop_lr_ratio/decay_steps) and smooth vs stepped behavior.
  * Expanded migration guidance for versions prior to 3.1.3 and refreshed references.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Make long-running training progress easier to read by keeping the relative ETA and appending a concise absolute finish time across the pt, pd, tf, and pt_expt backends. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Training logs now include both remaining ETA and an estimated local finish time (YYYY-MM-DD HH:MM). * Timezone-aware local timestamps are shown across training frameworks for clearer cross-region monitoring and more consistent periodic timing output. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
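A small sketch of the idea, assuming a hypothetical helper `format_eta` and an invented log wording; only the `YYYY-MM-DD HH:MM` finish-time format comes from the description above.

```python
from datetime import datetime, timedelta


def format_eta(eta_seconds, now=None):
    """Format a relative ETA plus an absolute local finish time (illustrative)."""
    if now is None:
        # Timezone-aware local time.
        now = datetime.now().astimezone()
    finish = now + timedelta(seconds=eta_seconds)
    remaining = str(timedelta(seconds=int(eta_seconds)))
    return f"eta = {remaining} (finish at {finish:%Y-%m-%d %H:%M})"


print(format_eta(3660, now=datetime(2024, 1, 1, 12, 0)))
```

Passing `now` explicitly keeps the helper deterministic in tests; in a training loop it would be omitted.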
## Summary

- Add model compression (embedding net tabulation) for the pt_expt backend, matching the existing pt backend capability
- Compressed models replace embedding net forward passes with polynomial lookup tables via C++ custom ops (`tabulate_fusion_se_*`), significantly speeding up inference
- Support all compressible descriptors: `se_e2_a`, `se_r`, `se_t`, `se_t_tebd`, `dpa1`, `se_atten_v2`, `dpa2` (hybrid delegates automatically)

### Key changes

**Infrastructure:**

- `deepmd/pt_expt/utils/tabulate_ops.py`: Register `torch.library.register_fake` for all 5 custom ops to enable `torch.export`/`make_fx` tracing through compressed forward paths
- `deepmd/pt_expt/utils/tabulate.py`: `DPTabulate` subclass that detects descriptor type via serialized data (avoids `isinstance` checks against pt-specific classes)
- `deepmd/pt_expt/entrypoints/compress.py`: Entry point: load `.pte` → deserialize → `enable_compression()` → re-export `.pte`

**Descriptors:** Each gets `enable_compression()` + `@cast_precision` `call()` override with compressed branch using the appropriate custom op.

**dpmodel, compression state serialization (breaking version bumps):**

The pt_expt backend persists models via `serialize()` → `model.json` → `deserialize()` (the `.pte` format), unlike pt/tf which use native framework save mechanisms (torch.jit.save / tf.saved_model) that capture the full runtime state. This means compression state (tabulated polynomial coefficients, precomputed type embeddings) must survive the serialize/deserialize round-trip for compressed `.pte` models to work.

Each compressible descriptor's serialization version is bumped when the model is compressed. **Uncompressed models continue to use the old version**, so there is no breakage for existing uncompressed model files. All backends (pt, pd, tf) accept the new version in `deserialize()` and simply ignore the `"compress"` key.

| Descriptor | Version bump | Added fields |
|---|---|---|
| `se_e2_a` | 2 → 3 | `compress_data`, `compress_info` |
| `se_r` | 2 → 3 | `compress_data`, `compress_info` |
| `se_t` | 2 → 3 | `compress_data`, `compress_info` |
| `se_t_tebd` | 1 → 2 | `compress_data`, `compress_info`, `type_embd_data` |
| `dpa1` | 2 → 3 | `type_embd_data`, `geo_compress`, `compress_data`/`info` (if geo) |
| `se_atten_v2` | 2 → 3 | `type_embd_data`, `geo_compress`, `compress_data`/`info` (if geo) |
| `dpa2` | 3 → 4 | compress dict inside `repinit_variable` |

**dpmodel:** Initialize `self.compress = False` in all descriptor `__init__` methods.

## Test plan

- [x] `source/tests/pt_expt/model/test_model_compression.py`: end-to-end compress → serialize → deserialize → eval
- [x] `source/tests/pt_expt/descriptor/`: compressed forward, consistency, exportable, make_fx tests for all descriptors
- [x] `source/tests/consistent/descriptor/`: cross-backend consistency tests pass with bumped versions

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added descriptor compression functionality to reduce model size and optimize memory usage during inference.
  * Introduced `compress` CLI command to enable tabulated embedding optimization on frozen trained models.
  * Enhanced descriptor serialization with improved version compatibility across multiple backends.
* **Tests**
  * Added comprehensive test coverage for compressed descriptor forward passes and model compression workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
## Summary

- Add `--finetune`, `--model-branch`, and `--use-pretrain-script` support to `dp --pt-expt train`, mirroring the pt backend's finetune flow (load pretrained checkpoint, change type map, selective weight copy, output bias adjustment)
- Support finetuning from both `.pt` checkpoints and frozen `.pte` models (embed `model_params` in `.pte` during freeze for `--use-pretrain-script`)
- Fix a bug in dpmodel's `base_atomic_model.change_type_map` where `out_bias`/`out_std` were not extended before remapping when the new type map introduces unseen types, causing `IndexError` with negative remap indices

## Usage examples

```bash
# Finetune from a .pt checkpoint
dp --pt-expt train input.json --finetune pretrained.pt

# Finetune from a frozen .pte model
dp --pt-expt train input.json --finetune pretrained.pte

# Copy descriptor/fitting config from pretrained model
dp --pt-expt train input.json --finetune pretrained.pt --use-pretrain-script

# Finetune from a multi-task pretrained model (select a branch)
dp --pt-expt train input.json --finetune pretrained.pt --model-branch Default

# Re-initialize fitting net randomly (only keep descriptor weights)
dp --pt-expt train input.json --finetune pretrained.pt --model-branch RANDOM
```

## Files changed

| File | Change |
|------|--------|
| `deepmd/pt_expt/utils/finetune.py` | **New**: `get_finetune_rules()` for pt_expt, supports `.pt` and `.pte` |
| `deepmd/pt_expt/entrypoints/main.py` | Wire `--finetune`/`--model-branch`/`--use-pretrain-script` through `train()` → `get_trainer()` → `Trainer`; pass `model_params` to `.pte` during freeze |
| `deepmd/pt_expt/train/training.py` | Finetune weight loading in `Trainer.__init__` (`.pt` and `.pte`); `model_change_out_bias()` |
| `deepmd/pt_expt/utils/serialization.py` | Embed/extract `model_params.json` in `.pte` archive |
| `deepmd/dpmodel/atomic_model/base_atomic_model.py` | Fix `change_type_map` to extend `out_bias`/`out_std` for new types (array-api compatible) |
| `source/tests/pt_expt/test_finetune.py` | **New**: 9 tests covering bias adjustment, type map change, CLI dispatch, `.pte` finetune, `--use-pretrain-script`, `random_fitting`, inherited weight consistency |
| `source/tests/consistent/model/test_ener.py` | Add `test_change_type_map_new_type` verifying `out_bias`/`out_std` extension across dp, pt, pt_expt |

## Test plan

- [x] `python -m pytest source/tests/pt_expt/test_finetune.py -v` (9 passed)
- [x] `python -m pytest source/tests/pt_expt/test_training.py -v` (11 passed, no regression)
- [x] `python -m pytest source/tests/consistent/model/test_ener.py -k change_type_map -v` (3 passed)
- [x] `python -m pytest source/tests/consistent/descriptor/test_se_e2_a.py -v` (351 passed, no regression)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Fine-tuning workflow: supply pretrained checkpoints, select branch, and toggle pretrain-script behavior
  * Automatic expansion of atom type maps (new types get zero bias and unit std) while preserving existing mappings
  * Improved finetune resume: selective merging of pretrained descriptor/fitting weights and bias-adjustment modes
  * Export/import embeds/restores model metadata to/from artifacts
* **Tests**
  * Unit and end-to-end tests for finetuning, bias adjustment, type-map expansion, and frozen-artifact scenarios
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
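The `change_type_map` fix can be pictured with a simplified sketch: extend the per-type statistics for unseen types first (zero bias, unit std), then reorder. The function name and the flat one-value-per-type lists are assumptions; the real arrays carry more dimensions.

```python
def remap_bias_std(out_bias, out_std, old_type_map, new_type_map):
    """Extend per-type stats for unseen types, then reorder (illustrative)."""
    unseen = [t for t in new_type_map if t not in old_type_map]
    extended_map = list(old_type_map) + unseen
    # New types get zero bias and unit std, as stated in the release note.
    bias = list(out_bias) + [0.0] * len(unseen)
    std = list(out_std) + [1.0] * len(unseen)
    # Every lookup succeeds because the lists were extended before remapping.
    remap = [extended_map.index(t) for t in new_type_map]
    return [bias[i] for i in remap], [std[i] for i in remap]


bias, std = remap_bias_std([-1.0, -2.0], [0.5, 0.6], ["O", "H"], ["H", "C", "O"])
```

Here `C` is new, so it receives bias `0.0` and std `1.0`, while `H` and `O` keep their original values in the new order.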
…PA2/DPA3 support (#5298)

Add C/C++ inference support for the `.pt2` (torch.export / AOTInductor) backend, covering all major descriptor types: SE_E2_A, DPA1, DPA2, and DPA3.

### C/C++ inference backend (`DeepPotPTExpt`)

- New `DeepPotPTExpt` backend that loads `.pt2` models via `torch::inductor::AOTIModelContainerRunnerCpu`
- Supports PBC, NoPbc, fparam/aparam, multi-frame batching, atomic energy/virial, LAMMPS neighbor list (with ghost atoms, 2rc padding, type selection)
- Registered alongside existing PT/TF/JAX/PD backends via the `.pt2` file extension

### dpmodel fixes for torch.export compatibility

- Replace `[:, :nloc]` slicing with `xp_take_first_n()` in DPA1, DPA2, DPA3, and repflows/repformers: the original slicing creates `Ne(nall, nloc)` shape constraints that fail when `nall == nloc` (NoPbc case)
- Replace flat `(nf*nall,)` indexing in `dpa1.py` and `exclude_mask.py` with `xp_take_along_axis`
- Replace `xp.reshape(mapping, (nframes, -1, 1))` with `xp.expand_dims` in repflows/repformers: the `-1` resolves to `nall` during tracing

### pt_expt serialization

- `.pt2` export via `torch.export.export` → `aot_compile` → package as zip
- Python inference via `torch._inductor.aoti_load_package`

### Bug fix in all C++ backends

- Fix ghost-to-local mapping when virtual atoms are present: the old code `mapping[ii] = lmp_list.mapping[fwd_map[ii]]` used post-filter indices as original indices; fixed to `mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii]]]`
- Fix use-after-free in `DeepPotPTExpt.cc` where `torch::from_blob` referenced a local vector after it went out of scope

### Test infrastructure

- Model generation scripts (`gen_dpa1.py`, `gen_dpa2.py`, `gen_dpa3.py`, `gen_fparam_aparam.py`) that build from dpmodel config → serialize → export to both `.pth` and `.pt2` with identical weights
- Remove pre-committed `.pth` files; regenerate in CI via `convert-models.sh`
- C++ tests for all descriptor types: SE_E2_A, DPA1, DPA2, DPA3 (both `.pth` and `.pt2`, PBC + NoPbc, double + float)
- Python unit tests for pt_expt inference (`test_deep_eval.py`)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added support for PyTorch exportable (.pt2) models and runtime detection, enabling AOTInductor-based inference across interfaces.
* **Bug Fixes**
  * Improved neighbor/embedding extraction and broadcasting to increase backend export compatibility and robustness.
* **Tests**
  * Added extensive C++ and Python test suites and reference-generation scripts to validate .pt2 inference paths and cross-format consistency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
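The corrected ghost-to-local mapping expression can be traced in plain Python. This sketch assumes `fwd_map` maps an original index to its post-filter index (`-1` when the atom is removed) and `bkw_map` maps a post-filter index back to the original one; the helper names and the five-atom example are hypothetical.

```python
def build_maps(keep):
    """Build forward/backward index maps from a keep mask (illustrative)."""
    fwd_map, bkw_map = [], []
    for orig, kept in enumerate(keep):
        if kept:
            fwd_map.append(len(bkw_map))  # original index -> filtered index
            bkw_map.append(orig)          # filtered index -> original index
        else:
            fwd_map.append(-1)            # removed (virtual) atom
    return fwd_map, bkw_map


def remap_after_filter(mapping, fwd_map, bkw_map):
    """mapping[ii] = fwd_map[mapping_orig[bkw_map[ii]]] for every kept atom."""
    return [fwd_map[mapping[orig]] for orig in bkw_map]


# Atoms 0-2 are local, 3-4 are ghosts imaging atoms 0 and 2; atom 1 is virtual.
orig_mapping = [0, 1, 2, 0, 2]
fwd_map, bkw_map = build_maps([True, False, True, True, True])
new_mapping = remap_after_filter(orig_mapping, fwd_map, bkw_map)
```

Each filtered index is first translated back to its original index, looked up in the original mapping, and only then translated forward again, which is the order the fix restores.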
## Summary

Add `dp --pt-expt change-bias` command to adjust the output bias (energy shift per atom type) of pt_expt models without retraining. This brings the pt_expt backend to parity with pt/tf/pd backends for this feature.

### Supported input/output formats

| Input | Output | Notes |
|-------|--------|-------|
| `.pt` checkpoint | `.pt` checkpoint | Modify bias before freezing |
| `.pte` frozen model | `.pte` frozen model | Round-trip: deserialize → modify bias → re-export |

### Bias modes

- **Data-based** (`-s <data_dir>` or `-f <data_file>`): compute new bias from data via linear regression (`change-by-statistic` or `set-by-statistic`)
- **User-defined** (`-b 0.1 3.2 ...`): set bias values directly

### Implementation details

**`deepmd/pt_expt/entrypoints/main.py`**: `change_bias()` function + CLI dispatch:

- `.pt` input: `torch.load` → extract `model_params` from `_extra_state` → `get_model()` → `ModelWrapper.load_state_dict()` → apply bias → `torch.save()`
- `.pte` input: `serialize_from_file()` → `BaseModel.deserialize()` → apply bias → `model.serialize()` → `deserialize_to_file()`
- Data loading uses pt_expt's own pipeline: `DeepmdDataSystem` + `make_stat_input` (numpy-based, from `deepmd.utils.model_stat`)
- When `numb_batch=0` (default), uses all available batches via `max(data.get_nbatches())`

**`deepmd/pt_expt/train/training.py`**: `model_change_out_bias()` helper:

- Logs old/new bias values after calling the dpmodel-inherited `change_out_bias()`
- Simpler than pt's version: no `DPModelCommon`/`compute_input_stats` check needed since pt_expt models inherit dpmodel's implementation directly

### Usage

```bash
# Change bias using data (checkpoint)
dp --pt-expt change-bias model.ckpt.pt -s /path/to/data -o updated.pt

# Change bias using data file list
dp --pt-expt change-bias model.ckpt.pt -f systems.txt -o updated.pt

# Set bias to specific values
dp --pt-expt change-bias model.ckpt.pt -b 0.1 3.2 -o updated.pt

# Change bias on frozen model
dp --pt-expt change-bias frozen.pte -s /path/to/data -o updated.pte
```

## Test plan

- [x] `python -m pytest source/tests/pt_expt/test_change_bias.py -v`: 4 end-to-end CLI tests:
  - `test_change_bias_with_data`: bias changes when using `-s` flag
  - `test_change_bias_with_data_sys_file`: bias changes when using `-f` flag
  - `test_change_bias_with_user_defined`: exact match with user-specified values
  - `test_change_bias_frozen_pte`: freeze → change-bias on `.pte` → verify bias changed
- [x] `python -m pytest source/tests/consistent/model/test_ener.py -k test_change_out_bias -v`: cross-backend consistency passes

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Added a `change-bias` command to adjust a model's output bias by supplying values or computing statistics from systems; supports checkpoint and frozen model formats while preserving model metadata.
  * Added a model bias-update helper to apply/statistically derive bias adjustments consistently.
* **Tests**
  * End-to-end tests for data-driven, file-list, user-specified, and frozen-model workflows.
  * Added tests ensuring fitting-statistics are computed when expected and improved fixtures to clear leaked device contexts.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
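The data-based mode computes the bias by linear regression. A generic sketch of that idea: solve a least-squares problem in which each frame's energy is modelled as the sum of per-type biases times per-type atom counts. The function name and the data are hypothetical, and the sketch fits the labels directly; how `change-by-statistic` and `set-by-statistic` differ is not shown here.

```python
import numpy as np


def fit_energy_bias(type_counts, energies):
    """Least-squares per-type energy bias from per-frame atom counts (sketch)."""
    counts = np.asarray(type_counts, dtype=float)  # shape (nframes, ntypes)
    e = np.asarray(energies, dtype=float)          # shape (nframes,)
    bias, *_ = np.linalg.lstsq(counts, e, rcond=None)
    return bias


# Hypothetical frames with two atom types and energies built from a known bias.
counts = [[2, 1], [1, 1], [0, 2]]
true_bias = np.array([0.1, 3.2])
energies = np.asarray(counts, dtype=float) @ true_bias
bias = fit_energy_bias(counts, energies)
```

Because the synthetic energies are exactly linear in the counts, the fit recovers the bias used to generate them.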