Releases: deepmodeling/deepmd-kit
v3.1.0
What's Changed
Highlights
DPA3
DPA3 is an advanced interatomic potential leveraging the message-passing architecture. Designed as a large atomic model (LAM), DPA3 is tailored to integrate and simultaneously train on datasets from various disciplines, encompassing diverse chemical and materials systems across different research domains. Its model design ensures exceptional fitting accuracy and robust generalization within and beyond the training domain. Furthermore, DPA3 maintains energy conservation and respects the physical symmetries of the potential energy surface, making it a dependable tool for a wide range of scientific applications.
Refer to examples/water/dpa3/input_torch.json
for the training script. After training, the PyTorch model can be converted to the JAX model.
PaddlePaddle backend
The PaddlePaddle backend features a similar Python interface to the PyTorch backend, ensuring compatibility and flexibility in model development. PaddlePaddle has introduced dynamic-to-static functionality and PaddlePaddle JIT compiler (CINN) in DeePMD-kit, which allow for dynamic shapes and higher-order differentiation. The dynamic-to-static functionality automatically captures the user’s dynamic graph code and converts it into a static graph. After conversion, the CINN compiler is used to optimize the computational graph, thereby enhancing the efficiency of model training and inference. In experiments with the DPA-2 model, we achieved approximately a 40% reduction in training time compared to the dynamic graph, effectively improving the model training efficiency.
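As a rough sketch of what "capturing the user's dynamic graph code and converting it into a static graph" means, the toy tracer below records operations from one symbolic execution of a Python function and replays them as a fixed program. This is purely illustrative and is not PaddlePaddle's actual dynamic-to-static or CINN machinery.

```python
# Toy illustration of dynamic-to-static conversion (NOT PaddlePaddle's
# implementation): run the user's function once with symbolic nodes,
# record every operation into a static graph, then replay that graph.
class Node:
    _graph = []  # recorded ops, in execution order

    def __init__(self, value=None, op=None, inputs=()):
        self.value, self.op, self.inputs = value, op, inputs

    def _record(self, op, other):
        out = Node(op=op, inputs=(self, other))
        Node._graph.append(out)
        return out

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)


def to_static(fn, n_inputs):
    """Trace `fn` once and return a replayable static program."""
    Node._graph = []
    args = [Node() for _ in range(n_inputs)]
    out = fn(*args)  # tracing: fills Node._graph
    graph = Node._graph

    def compiled(*values):
        env = dict(zip(args, values))
        for node in graph:
            a, b = (env.get(x, x.value) for x in node.inputs)
            env[node] = a + b if node.op == "add" else a * b
        return env[out]

    return compiled


# "dynamic-graph" code written by the user, captured into a static program
f = to_static(lambda x, y: x * y + x, n_inputs=2)
print(f(3.0, 4.0))  # 15.0
```

Once the whole program is available as a graph, a compiler such as CINN can optimize it ahead of execution, which is where the reported training speedup comes from.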
Breaking changes
Other new features
- feat(pt/dp): support case embedding and sharable fitting by @iProzd in #4417
- feat(pt): train with energy Hessian by @1azyking in #4169
- feat: add new batch size rules for large systems by @caic99 in #4659
- feat: add method to access fele in pppm/dplr by @HanswithCMY in #4452
- feat (tf/pt): add atomic weights to tensor loss by @ChiahsinChu in #4466
- feat(pt): add `trainable` to property fitting by @ChiahsinChu in #4599
- Feat(pt): Support fitting_net input statistics. by @Chengqian-Zhang in #4504
- feat(jax): Hessian by @njzjz in #4649
- feat: add plugin mode for data modifier by @ChiahsinChu in #4621
- feat(pt): add eta message for pt backend by @HydrogenSulfate in #4725
- feat: add huber loss by @iProzd in #4684
- feat(pt): add AdamW for pt training by @iProzd in #4757
- Feat:support customized rglob by @anyangml in #4763
- feat(pt/pd): add size option to dp show by @iProzd in #4783
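Among the features above, the Huber loss (#4684) blends a quadratic penalty for small residuals with a linear one for large residuals, making training more robust to outliers. The NumPy sketch below illustrates the concept only; it is not the exact DeePMD-kit implementation.

```python
# Minimal sketch of the Huber loss concept (illustrative, not the
# DeePMD-kit implementation behind #4684).
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Quadratic for |residual| <= delta, linear beyond it."""
    r = np.abs(pred - target)
    quad = 0.5 * r**2                  # smooth near zero
    lin = delta * (r - 0.5 * delta)    # bounded slope for outliers
    return np.where(r <= delta, quad, lin).mean()

pred = np.array([0.0, 0.5, 3.0])
target = np.zeros(3)
print(huber_loss(pred, target, delta=1.0))  # 0.875 (mean of 0, 0.125, 2.5)
```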
All changes in v3.0.1, v3.0.2, and v3.0.3 are included.
Contributors
- @iProzd #4417 #4655 #4419 #4609 #4633 #4647 #4675 #4684 #4730 #4757 #4754 #4756 #4760 #4778 #4781 #4783 #4792
- @pre-commit-ci #4420 #4449 #4464 #4473 #4497 #4521 #4539 #4552 #4566 #4574 #4579 #4596 #4602 #4611 #4645 #4660 #4672 #4690 #4699 #4708 #4712 #4719 #4723 #4736 #4748 #4767 #4779 #4791
- @njzjz #4482 #4483 #4484 #4507 #4619 #4410 #4438 #4442 #4446 #4459 #4485 #4479 #4508 #4534 #4531 #4542 #4550 #4553 #4557 #4561 #4565 #4570 #4575 #4547 #4582 #4613 #4624 #4558 #4638 #4636 #4640 #4649 #4668 #4680 #4720 #4728 #4738 #4692 #4700 #4704 #4702 #4717 #4724 #4726 #4729 #4735 #4753 #4774 #4765 #4776 #4775 #4766 #4780 #4786 #4794
- @Chengqian-Zhang #4471 #4504 #4639
- @HydrogenSulfate #4418 #4489 #4673 #4302 #4439 #4414 #4480 #4493 #4488 #4512 #4467 #4514 #4617 #4556 #4656 #4694 #4701 #4715 #4725 #4768 #4770
- @QuantumMisaka #4510
- @1azyking #4169
- @caic99 #4535 #4615 #4659 #4434 #4426 #4435 #4433 #4437 #4463 #4505 #4478 #4541 #4513 #4597 #4622 #4662 #4669 #4677 #4678 #4688 #4687 #4737 #4747 #4746 #4761 #4772 #4773 #4784 #4751 #4790
- @dependabot #4408 #4630
- @anyangml #4423 #4432 #4587 #4763
- @HanswithCMY #4452
- @ChiahsinChu #4466 #4538 #4599 #4621
- @RMeli #4577
- @Yi-FanLi #4581
- @wanghan-iapcm #4653
- @SumGuo-88 #4593
- @SigureMo #4664
- @njzjz-bot #4796
New Contributors
- @HanswithCMY made their first contribution in #4452
- @QuantumMisaka made their first contribution in #4510
- @1azyking made their first contribution in #4169
- @RMeli made their first contribution in #4577
- @SumGuo-88 made their first contribution in #4593
- @SigureMo made their first contribution in #4664
Full Changelog: v3.0.0...v3.1.0
v3.1.0rc0
What's Changed
Highlights
DPA-3
DPA-3 is an advanced interatomic potential leveraging the message-passing architecture. Designed as a large atomic model (LAM), DPA-3 is tailored to integrate and simultaneously train on datasets from various disciplines, encompassing diverse chemical and materials systems across different research domains. Its model design ensures exceptional fitting accuracy and robust generalization within and beyond the training domain. Furthermore, DPA-3 maintains energy conservation and respects the physical symmetries of the potential energy surface, making it a dependable tool for a wide range of scientific applications.
Refer to examples/water/dpa3/input_torch.json
for the training script. After training, the PyTorch model can be converted to the JAX model.
PaddlePaddle backend
The PaddlePaddle backend features a similar Python interface to the PyTorch backend, ensuring compatibility and flexibility in model development. PaddlePaddle has introduced dynamic-to-static functionality and PaddlePaddle JIT compiler (CINN) in DeePMD-kit, which allow for dynamic shapes and higher-order differentiation. The dynamic-to-static functionality automatically captures the user’s dynamic graph code and converts it into a static graph. After conversion, the CINN compiler is used to optimize the computational graph, thereby enhancing the efficiency of model training and inference. In experiments with the DPA-2 model, we achieved approximately a 40% reduction in training time compared to the dynamic graph, effectively improving the model training efficiency.
Breaking changes
Other new features
- feat(pt/dp): support case embedding and sharable fitting by @iProzd in #4417
- feat(pt): train with energy Hessian by @1azyking in #4169
- feat: add new batch size rules for large systems by @caic99 in #4659
- feat: add method to access fele in pppm/dplr by @HanswithCMY in #4452
- feat (tf/pt): add atomic weights to tensor loss by @ChiahsinChu in #4466
- feat(pt): add `trainable` to property fitting by @ChiahsinChu in #4599
- Feat(pt): Support fitting_net input statistics. by @Chengqian-Zhang in #4504
- feat(jax): Hessian by @njzjz in #4649
- feat: add plugin mode for data modifier by @ChiahsinChu in #4621
- feat(pt): add eta message for pt backend by @HydrogenSulfate in #4725
- feat: add huber loss by @iProzd in #4684
- feat(pt): add AdamW for pt training by @iProzd in #4757
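The energy-Hessian features above (#4169, #4649) concern the matrix of second derivatives of the energy with respect to atomic coordinates. The sketch below computes that quantity by central finite differences on a hypothetical toy spring potential, only to illustrate what is being fitted; DeePMD-kit's implementations are backend-specific and do not use this finite-difference scheme.

```python
# Illustrative energy Hessian (d2E/dx_i dx_j) via central finite
# differences on a toy potential; not DeePMD-kit's implementation.
import numpy as np

def energy(x):
    # toy potential: two coordinates joined by a unit spring
    return 0.5 * (x[1] - x[0]) ** 2

def hessian(f, x, eps=1e-4):
    n = x.size
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.eye(n)[i] * eps
            e_j = np.eye(n)[j] * eps
            # central second difference, exact for quadratic potentials
            h[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return h

x = np.array([0.0, 1.3])
print(np.round(hessian(energy, x), 6))  # approximately [[1, -1], [-1, 1]]
```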
All changes in v3.0.1, v3.0.2, and v3.0.3 are included.
Contributors
- @iProzd #4417 #4655 #4419 #4609 #4633 #4647 #4675 #4684 #4730 #4757 #4754 #4756 #4760 #4778
- @pre-commit-ci #4420 #4449 #4464 #4473 #4497 #4521 #4539 #4552 #4566 #4574 #4579 #4596 #4602 #4611 #4645 #4660 #4672 #4690 #4699 #4708 #4712 #4719 #4723 #4736 #4748 #4767
- @njzjz #4482 #4483 #4484 #4507 #4619 #4410 #4438 #4442 #4446 #4459 #4485 #4479 #4508 #4534 #4531 #4542 #4550 #4553 #4557 #4561 #4565 #4570 #4575 #4547 #4582 #4613 #4624 #4558 #4638 #4636 #4640 #4649 #4668 #4680 #4720 #4728 #4738 #4692 #4700 #4704 #4702 #4717 #4724 #4726 #4729 #4735 #4753 #4774 #4765 #4776 #4775 #4766
- @Chengqian-Zhang #4471 #4504 #4639
- @HydrogenSulfate #4418 #4489 #4673 #4302 #4439 #4414 #4480 #4493 #4488 #4512 #4467 #4514 #4617 #4556 #4656 #4694 #4701 #4715 #4725 #4768 #4770
- @QuantumMisaka #4510
- @1azyking #4169
- @caic99 #4535 #4615 #4659 #4434 #4426 #4435 #4433 #4437 #4463 #4505 #4478 #4541 #4513 #4597 #4622 #4662 #4669 #4677 #4678 #4688 #4687 #4737 #4747 #4746 #4761 #4772 #4773
- @dependabot #4408 #4630
- @anyangml #4423 #4432 #4587
- @HanswithCMY #4452
- @ChiahsinChu #4466 #4538 #4599 #4621
- @RMeli #4577
- @Yi-FanLi #4581
- @wanghan-iapcm #4653
- @SumGuo-88 #4593
- @SigureMo #4664
New Contributors
- @HanswithCMY made their first contribution in #4452
- @QuantumMisaka made their first contribution in #4510
- @1azyking made their first contribution in #4169
- @RMeli made their first contribution in #4577
- @SumGuo-88 made their first contribution in #4593
- @SigureMo made their first contribution in #4664
Full Changelog: v3.0.0...v3.1.0rc0
v3.0.3
What's Changed
Breaking changes
- breaking(wheel): bump minimal macos version to 11.0 (#4704)
Bugfixes
- fix(tf): fix dplr Python inference (#4753)
- fix: data type of nloc, nall-nloc in the input of border_op (#4653)
- fix(data): Throw error when data's element is not present in `input.json/type_map` (#4639)
- fix(ase): avoid duplicate stress calculation for ase calculator (#4633)
- fix(pt): improve OOM detection (#4638)
- fix(tf): always use float64 for the global tensor (#4735)
- fix(jax): set `default_matmul_precision` to `tensorfloat32` (#4726)
- fix(jax): fix NaN in sigmoid grad (#4724)
- fix: fix compatibility with CMake 4.0 (#4680)
CI/CD
- fix(CI): set CMAKE_POLICY_VERSION_MINIMUM environment variable (#4692)
- CI: bump PyTorch to 2.7 (#4717)
- fix(tests): fix tearDownClass and release GPU memory (#4702)
- fix(CI): upgrade setuptools to fix its compatibility with wheel (#4700)
Full Changelog: v3.0.2...v3.0.3
v3.1.0a0
What's Changed
Highlights
DPA-3
DPA-3 is an advanced interatomic potential leveraging the message-passing architecture. Designed as a large atomic model (LAM), DPA-3 is tailored to integrate and simultaneously train on datasets from various disciplines, encompassing diverse chemical and materials systems across different research domains. Its model design ensures exceptional fitting accuracy and robust generalization within and beyond the training domain. Furthermore, DPA-3 maintains energy conservation and respects the physical symmetries of the potential energy surface, making it a dependable tool for a wide range of scientific applications.
Refer to examples/water/dpa3/input_torch.json
for the training script. After training, the PyTorch model can be converted to the JAX model.
PaddlePaddle backend
The PaddlePaddle backend features a similar Python interface to the PyTorch backend, ensuring compatibility and flexibility in model development. PaddlePaddle has introduced dynamic-to-static functionality and PaddlePaddle JIT compiler (CINN) in DeePMD-kit, which allow for dynamic shapes and higher-order differentiation. The dynamic-to-static functionality automatically captures the user’s dynamic graph code and converts it into a static graph. After conversion, the CINN compiler is used to optimize the computational graph, thereby enhancing the efficiency of model training and inference. In experiments with the DPA-2 model, we achieved approximately a 40% reduction in training time compared to the dynamic graph, effectively improving the model training efficiency.
Other new features
- feat(pt/dp): support case embedding and sharable fitting by @iProzd in #4417
- feat(pt): train with energy Hessian by @1azyking in #4169
- feat: add new batch size rules for large systems by @caic99 in #4659
- feat: add method to access fele in pppm/dplr by @HanswithCMY in #4452
- feat (tf/pt): add atomic weights to tensor loss by @ChiahsinChu in #4466
- feat(pt): add `trainable` to property fitting by @ChiahsinChu in #4599
- Feat(pt): Support fitting_net input statistics. by @Chengqian-Zhang in #4504
- feat(jax): Hessian by @njzjz in #4649
- feat: add plugin mode for data modifier by @ChiahsinChu in #4621
All changes in v3.0.1 and v3.0.2 are included.
Contributors
- @iProzd #4417 #4655 #4419 #4609 #4633 #4647 #4675
- @pre-commit-ci #4420 #4449 #4464 #4473 #4497 #4521 #4539 #4552 #4566 #4574 #4579 #4596 #4602 #4611 #4645 #4660 #4672
- @njzjz #4482 #4483 #4484 #4507 #4619 #4410 #4438 #4442 #4446 #4459 #4485 #4479 #4508 #4534 #4531 #4542 #4550 #4553 #4557 #4561 #4565 #4570 #4575 #4547 #4582 #4613 #4624 #4558 #4638 #4636 #4640 #4649 #4668 #4680
- @Chengqian-Zhang #4471 #4504 #4639
- @HydrogenSulfate #4418 #4489 #4673 #4302 #4439 #4414 #4480 #4493 #4488 #4512 #4467 #4514 #4617 #4556 #4656
- @QuantumMisaka #4510
- @1azyking #4169
- @caic99 #4535 #4615 #4659 #4434 #4426 #4435 #4433 #4437 #4463 #4505 #4478 #4541 #4513 #4597 #4622 #4662 #4669 #4677 #4678
- @dependabot #4408 #4630
- @anyangml #4423 #4432 #4587
- @HanswithCMY #4452
- @ChiahsinChu #4466 #4538 #4599 #4621
- @RMeli #4577
- @Yi-FanLi #4581
- @wanghan-iapcm #4653
- @SumGuo-88 #4593
- @SigureMo #4664
New Contributors
- @HanswithCMY made their first contribution in #4452
- @QuantumMisaka made their first contribution in #4510
- @1azyking made their first contribution in #4169
- @RMeli made their first contribution in #4577
- @SumGuo-88 made their first contribution in #4593
- @SigureMo made their first contribution in #4664
Full Changelog: v3.0.0...v3.1.0a0
v3.0.2
What's Changed
This patch version only contains minor features, bug fixes, enhancements, and documentation improvements.
New features
Enhancement
- Perf: replace unnecessary `torch.split` with indexing by @caic99 in #4505
- Perf: use F.linear for MLP by @caic99 in #4513
- chore: improve neighbor stat log by @njzjz in #4561
- chore: bump pytorch to 2.6.0 by @njzjz in #4575
Bugfix
- Fix: Modify docs of DPA models by @QuantumMisaka in #4510
- fix(pt): fix clearing the list in set_eval_descriptor_hook by @njzjz in #4534
- [fix bug] load atomic_*.npy for tf tensor model by @ChiahsinChu in #4538
- fix: lower `num_workers` to 4 by @caic99 in #4535
- fix: fix YAML conversion by @njzjz in #4565
- fix(cc): remove C++ 17 usage by @njzjz in #4570
- Fix version in DeePMDConfigVersion.cmake by @RMeli in #4577
- fix(pt): detach computed descriptor tensor to prevent OOM by @njzjz in #4547
- fix(pt): throw errors for GPU tensors and the CPU OP library by @njzjz in #4582
- use variable to store the bias of atomic polarizability by @Yi-FanLi in #4581
- Fix: pt tensor loss label name by @anyangml in #4587
- CI: pin jax to 0.5.0 by @njzjz in #4613
- fix(array-api): fix xp.where errors by @njzjz in #4624
Documentation
- docs: fix the header of the scaling test table by @njzjz in #4507
- docs: add `sphinx.configuration` to .readthedocs.yml by @njzjz in #4553
- docs: add v3 paper citations by @njzjz in #4619
- docs: add PyTorch Profiler support details to TensorBoard documentation by @caic99 in #4615
CI/CD
New Contributors
- @QuantumMisaka made their first contribution in #4510
- @RMeli made their first contribution in #4577
Full Changelog: v3.0.1...v3.0.2
v3.0.1
This patch version only contains bug fixes, enhancements, and documentation improvements.
What's Changed
Enhancements
- Perf: print summary on rank 0 (#4434)
- perf: optimize training loop (#4426)
- chore: refactor training loop (#4435)
- Perf: remove redundant checks on data integrity (#4433)
- Perf: use fused Adam optimizer (#4463)
Bug fixes
- Fix: add model_def_script to ZBL (#4423)
- fix: add pairtab compression (#4432)
- fix(tf): pass type_one_side & exclude_types to DPTabulate in `se_r` (#4446)
- fix: print dlerror if dlopen fails (#4485)
Documentation
- chore(pt): update multitask example (#4419)
- docs: update DPA-2 citation (#4483)
- docs: update deepmd-gnn URL (#4482)
- docs: fix a minor typo on the title of install-from-c-library.md (#4484)
Other Changes
- build(deps): bump pypa/cibuildwheel from 2.21 to 2.22 by @dependabot in #4408
Full Changelog: v3.0.0...v3.0.1
v3.0.0
DeePMD-kit v3: Multiple-backend Framework, DPA-2 Large Atomic Model, and Plugin Mechanisms
After eight months of public tests, we are excited to present the first stable version of DeePMD-kit v3, an advanced version that enables deep potential models with TensorFlow, PyTorch, or JAX backends. Additionally, DeePMD-kit v3 introduces support for the DPA-2 model, a novel architecture optimized for large atomic models. This release enhances the plugin mechanisms, making it easier to integrate and develop new models.
Highlights
Multiple-backend framework: TensorFlow, PyTorch, and JAX support
DeePMD-kit v3 adds a versatile, pluggable framework providing consistent training and inference experience across multiple backends. Version 3.0.0 includes:
- TensorFlow backend: Known for its computational efficiency with a static graph design.
- PyTorch backend: A dynamic graph backend that simplifies model extension and development.
- DP backend: Built with NumPy and Array API, a reference backend for development without heavy deep-learning frameworks.
- JAX backend: Based on the DP backend via Array API, a static graph backend.
Features | TensorFlow | PyTorch | JAX | DP |
---|---|---|---|---|
Descriptor local frame | ✅ | |||
Descriptor se_e2_a | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e2_r | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3 | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3_tebd | ✅ | ✅ | ✅ | |
Descriptor DPA1 | ✅ | ✅ | ✅ | ✅ |
Descriptor DPA2 | ✅ | ✅ | ✅ | |
Descriptor Hybrid | ✅ | ✅ | ✅ | ✅ |
Fitting energy | ✅ | ✅ | ✅ | ✅ |
Fitting dipole | ✅ | ✅ | ✅ | ✅ |
Fitting polar | ✅ | ✅ | ✅ | ✅ |
Fitting DOS | ✅ | ✅ | ✅ | ✅ |
Fitting property | ✅ | ✅ | ✅ | |
ZBL | ✅ | ✅ | ✅ | ✅ |
DPLR | ✅ | |||
DPRc | ✅ | ✅ | ✅ | ✅ |
Spin | ✅ | ✅ | ✅ | |
Gradient calculation | ✅ | ✅ | ✅ | |
Model training | ✅ | ✅ | ||
Model compression | ✅ | ✅ | ||
Python inference | ✅ | ✅ | ✅ | ✅ |
C++ inference | ✅ | ✅ | ✅ |
Critical features of the multiple-backend framework include the ability to:
- Train models using different backends with the same training data and input script, allowing backend switching based on your efficiency or convenience needs.
# Training a model using the TensorFlow backend
dp --tf train input.json
dp --tf freeze
dp --tf compress
# Training a model using the PyTorch backend
dp --pt train input.json
dp --pt freeze
dp --pt compress
- Convert models between backends using `dp convert-backend`, with backend-specific file extensions (e.g., `.pb` for TensorFlow and `.pth` for PyTorch).
# Convert from a TensorFlow model to a PyTorch model
dp convert-backend frozen_model.pb frozen_model.pth
# Convert from a PyTorch model to a TensorFlow model
dp convert-backend frozen_model.pth frozen_model.pb
# Convert from a PyTorch model to a JAX model
dp convert-backend frozen_model.pth frozen_model.savedmodel
# Convert from a PyTorch model to the backend-independent DP format
dp convert-backend frozen_model.pth frozen_model.dp
- Run inference across backends via interfaces like `dp test`, Python/C++/C interfaces, or third-party packages (e.g., dpdata, ASE, LAMMPS, AMBER, Gromacs, i-PI, CP2K, OpenMM, ABACUS, etc.).
# In a LAMMPS file:
# run LAMMPS with a TensorFlow backend model
pair_style deepmd frozen_model.pb
# run LAMMPS with a PyTorch backend model
pair_style deepmd frozen_model.pth
# run LAMMPS with a JAX backend model
pair_style deepmd frozen_model.savedmodel
# Calculate model deviation using different models
pair_style deepmd frozen_model.pb frozen_model.pth frozen_model.savedmodel out_file md.out out_freq 100
- Add a new backend to DeePMD-kit much more quickly if you want to contribute to DeePMD-kit.
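The pluggable-backend idea can be pictured as a registry that maps backend names and file extensions to implementations. The sketch below is purely illustrative of that pattern; it is not DeePMD-kit's actual backend API, and all class and function names here are hypothetical.

```python
# Illustrative registry pattern for pluggable backends (hypothetical names;
# NOT DeePMD-kit's actual backend API).
BACKENDS = {}

def register_backend(name):
    """Class decorator that adds a backend to the global registry."""
    def deco(cls):
        BACKENDS[name] = cls
        return cls
    return deco

@register_backend("tf")
class TensorFlowBackend:
    suffix = ".pb"

@register_backend("pt")
class PyTorchBackend:
    suffix = ".pth"

def backend_for_model(filename):
    """Pick a backend by the model file's extension."""
    for cls in BACKENDS.values():
        if filename.endswith(cls.suffix):
            return cls
    raise ValueError(f"no registered backend can load {filename}")

print(backend_for_model("frozen_model.pth").__name__)  # PyTorchBackend
```

With such a registry, contributing a new backend reduces to registering one more class that implements the common interface, which is why the framework makes backends comparatively quick to add.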
DPA-2 model: a large atomic model as a multi-task learner
The DPA-2 model offers a robust architecture for large atomic models (LAM), accurately representing diverse chemical systems for high-quality simulations. In this release, DPA-2 can be trained using the PyTorch backend, supporting both single-task (see `examples/water/dpa2`) and multi-task (see `examples/water_multi_task/pytorch_example`) training schemes. DPA-2 is available for Python/C++ inference in the JAX backend.
The DPA-2 descriptor comprises two parts, `repinit` and `repformer`.
The PyTorch backend supports training strategies for large atomic models, including:
- Parallel training: Train large atomic models on multiple GPUs for efficiency.
torchrun --nproc_per_node=4 --no-python dp --pt train input.json
- Multi-task training: train large atomic models across a broad range of data computed at different DFT levels with shared descriptors. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
- Fine-tuning: train a pre-trained large atomic model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line.
Plugin mechanisms for external models
In version 3.0.0, plugin capabilities have been implemented to support the development and integration of potential energy models using TensorFlow, PyTorch, or JAX backends, leveraging DeePMD-kit's trainer, loss functions, and interfaces. A plugin example is deepmd-gnn, which supports training the MACE and NequIP models in DeePMD-kit with the familiar commands.
dp --pt train mace.json
dp --pt freeze
dp --pt test -m frozen_model.pth -s ../data/
Other new features
- Descriptor se_e3_tebd. (#4066)
- Fitting the property (#3867).
- New training parameters: `max_ckpt_keep` (#3441), `change_bias_after_training` (#3993), and `stat_file`.
- New command line interface: `dp change-bias` (#3993) and `dp show` (#3796).
- Support generating JSON schema for integration with VSCode (#3849).
- The latest LAMMPS version (stable_29Aug2024_update1) is supported. (#4088, #4179)
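Once a JSON schema for the training input has been generated, it can be wired into VSCode through the standard `json.schemas` setting so the editor validates and autocompletes `input.json`. The snippet below is a hedged sketch of a `.vscode/settings.json`; the schema filename `deepmd-schema.json` is a hypothetical placeholder for whatever file you generate.

```json
{
  "json.schemas": [
    {
      "fileMatch": ["input.json"],
      "url": "./deepmd-schema.json"
    }
  ]
}
```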
Breaking changes
- The deepmodeling conda channel is deprecated. Use the conda-forge channel instead. (#3462, #4385)
- The offline package and conda packages for CUDA 11 are dropped.
- Python 3.7 and 3.8 supports are dropped. (#3185, #4185)
- The minimal versions of deep learning frameworks: TensorFlow 2.7, PyTorch 2.1, JAX 0.4.33, and NumPy 1.21.
- We require all model files to have the correct filename extension for all interfaces so a corresponding backend can load them. TensorFlow model files must end with the `.pb` extension.
- Bias is removed by default from type embedding. (#3958)
- The spin model is refactored, and its usage in the LAMMPS module has been changed. (#3301, #4321)
- Multi-task training support is removed from the TensorFlow backend. (#3763)
- The `set_prefix` key is deprecated. (#3753)
- `dp test` now uses all sets for training and testing. In previous versions, only the last set was used as the test set in `dp test`. (#3862)
- The Python module structure is fully refactored. The old `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. (#3177, #3178)
- The Python class `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`. (#3390)
- C++ 11 support is dropped. (#4068)
For other changes, refer to Full Changelog: v2.2.11...v3.0.0
Contributors
The PyTorch backend was developed in the dptech-corp/deepmd-pytorch repository, and then it was fully merged into the deepmd-kit repository in #3180. Contributors to the deepmd-pytorch repository:
- @20171130
- @CaRoLZhangxy
- @amcadmus
- @guolinke
- @iProzd
- @nahso
- @njzjz
- @qin2xue3jian4
- @shishaochen
- @zjgemi
Contributors to the deepmd-kit repository:
- @CaRoLZhangxy: #3162 #3287 #3337 #3375 #3379 #3434 #3436 #3612 #3613 #3614 #3656 #3657 #3740 #3780 #3917 #3919 #4209 #4237
- @Chengqian-Zhang: #3615 #3796 #3828 #3840 #3867 #3912 #4120 #4145 #4280
- @ChiahsinChu: #4246 #4248
- @Cloudac7: #4031
- @HydrogenSulfate: #4117
- @LiuGroupHNU: #3978
- @Mancn-Xu: #3567
- @Yi-FanLi: #3822 #4013 #4084 #4283
- @anyangml: #3192 #3210 #3212 #3248 #3266 #3281 #3296 #3309 #3314 #3321 #3327 #3338 #3351 #3362 #3376 #3385 #3398 #3410 #3426 #3432 #3435 #3447 #3451 #3452 #3468 #3485 #3486 #3575 #3584 #3654 #3662 #3663 #3706 #3757 #3759 #3812 #3824 #3876 #3946 #3975 #4194 #4205 #4292 #4296 #4335 #4339 #4370 #4380
- @caic99: #3465 #4165 #4401
- @chazeon: #3473 #3652 #3653 #3739
- @cherryWangY: #3877 #4227 #4297 #4298 #4299 #4300
- @dependabot: #3231 #3312 #3446 #3487 #3777 #3882 #4045 #4127 #4374
- @hztttt: #3762
- @iProzd: #3180 #3203 #3245 #3261 #3301 #3355 #3359 #3367 #3371 #3378 #3380 #3387 #3388 #3409 #3411 #3441 #3442 #3445 #3456 #3480 #3569 #3571 #3573 #3607 #3616 #3619 #3696 #3698 #3712 #3717 #3718 #3725 #3746 #3748 #3758 #3763 #3768 #3773 #3774 #3775 #3781 #3782 #3785 #3803 #3813 #3814 #3815 #3826 #3837 #3841 #3842 #3843 #3873 #3906 #3914 #3916 #3925 #3926 #3927 #3933 #3944 #3945 #3957 #3958 #3967 #3971 #39...
v3.0.0rc0
DeePMD-kit v3: Multiple-backend Framework, DPA-2 Large Atomic Model, and Plugin Mechanisms
We are excited to present the first release candidate of DeePMD-kit v3, an advanced version that enables deep potential models with TensorFlow, PyTorch, or JAX backends. Additionally, DeePMD-kit v3 introduces support for the DPA-2 model, a novel architecture optimized for large atomic models. This release enhances the plugin mechanisms, making it easier to integrate and develop new models.
Highlights
Multiple-backend framework: TensorFlow, PyTorch, and JAX support
DeePMD-kit v3 adds a versatile, pluggable framework providing consistent training and inference experience across multiple backends. Version 3.0.0 includes:
- TensorFlow backend: Known for its computational efficiency with a static graph design.
- PyTorch backend: A dynamic graph backend that simplifies model extension and development.
- DP backend: Built with NumPy and Array API, a reference backend for development without heavy deep-learning frameworks.
- JAX backend: Based on the DP backend via Array API, a static graph backend.
Features | TensorFlow | PyTorch | JAX | DP |
---|---|---|---|---|
Descriptor local frame | ✅ | |||
Descriptor se_e2_a | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e2_r | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3 | ✅ | ✅ | ✅ | ✅ |
Descriptor se_e3_tebd | ✅ | ✅ | ✅ | |
Descriptor DPA1 | ✅ | ✅ | ✅ | ✅ |
Descriptor DPA2 | ✅ | ✅ | ✅ | |
Descriptor Hybrid | ✅ | ✅ | ✅ | ✅ |
Fitting energy | ✅ | ✅ | ✅ | ✅ |
Fitting dipole | ✅ | ✅ | ✅ | ✅ |
Fitting polar | ✅ | ✅ | ✅ | ✅ |
Fitting DOS | ✅ | ✅ | ✅ | ✅ |
Fitting property | ✅ | ✅ | ✅ | |
ZBL | ✅ | ✅ | ✅ | ✅ |
DPLR | ✅ | |||
DPRc | ✅ | ✅ | ✅ | ✅ |
Spin | ✅ | ✅ | ✅ | |
Gradient calculation | ✅ | ✅ | ✅ | |
Model training | ✅ | ✅ | ||
Model compression | ✅ | ✅ | ||
Python inference | ✅ | ✅ | ✅ | ✅ |
C++ inference | ✅ | ✅ | ✅ |
Critical features of the multiple-backend framework include the ability to:
- Train models using different backends with the same training data and input script, allowing backend switching based on your efficiency or convenience needs.
# Training a model using the TensorFlow backend
dp --tf train input.json
dp --tf freeze
dp --tf compress
# Training a model using the PyTorch backend
dp --pt train input.json
dp --pt freeze
dp --pt compress
- Convert models between backends using `dp convert-backend`, with backend-specific file extensions (e.g., `.pb` for TensorFlow and `.pth` for PyTorch).
# Convert from a TensorFlow model to a PyTorch model
dp convert-backend frozen_model.pb frozen_model.pth
# Convert from a PyTorch model to a TensorFlow model
dp convert-backend frozen_model.pth frozen_model.pb
# Convert from a PyTorch model to a JAX model
dp convert-backend frozen_model.pth frozen_model.savedmodel
# Convert from a PyTorch model to the backend-independent DP format
dp convert-backend frozen_model.pth frozen_model.dp
- Run inference across backends via interfaces like `dp test`, Python/C++/C interfaces, or third-party packages (e.g., dpdata, ASE, LAMMPS, AMBER, Gromacs, i-PI, CP2K, OpenMM, ABACUS, etc.).
# In a LAMMPS file:
# run LAMMPS with a TensorFlow backend model
pair_style deepmd frozen_model.pb
# run LAMMPS with a PyTorch backend model
pair_style deepmd frozen_model.pth
# run LAMMPS with a JAX backend model
pair_style deepmd frozen_model.savedmodel
# Calculate model deviation using different models
pair_style deepmd frozen_model.pb frozen_model.pth frozen_model.savedmodel out_file md.out out_freq 100
- Add a new backend to DeePMD-kit much more quickly if you want to contribute to DeePMD-kit.
DPA-2 model: Towards a universal large atomic model for molecular and material simulation
The DPA-2 model offers a robust architecture for large atomic models (LAM), accurately representing diverse chemical systems for high-quality simulations. In this release, DPA-2 is trainable in the PyTorch backend, with an example configuration available in `examples/water/dpa2`. DPA-2 is available for Python inference in the JAX backend.
The DPA-2 descriptor comprises two parts, `repinit` and `repformer`.
The PyTorch backend supports training strategies for large atomic models, including:
- Parallel training: Train large atomic models on multiple GPUs for efficiency.
torchrun --nproc_per_node=4 --no-python dp --pt train input.json
- Multi-task training: train large atomic models across a broad range of data computed at different DFT levels with shared descriptors. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
- Fine-tuning: train a pre-trained large atomic model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line.
Plugin mechanisms for external models
In v3.0.0, plugin capabilities allow you to develop models with TensorFlow, PyTorch, or JAX, leveraging DeePMD-kit's trainer, loss functions, and interfaces. A plugin example is deepmd-gnn, which supports training the MACE and NequIP models in DeePMD-kit with the familiar commands.
dp --pt train mace.json
dp --pt freeze
dp --pt test -m frozen_model.pth -s ../data/
Other new features
- Descriptor se_e3_tebd. (#4066)
- Fitting the property (#3867).
- New training parameters: `max_ckpt_keep` (#3441), `change_bias_after_training` (#3993), and `stat_file`.
- New command line interface: `dp change-bias` (#3993) and `dp show` (#3796).
- Support generating JSON schema for integration with VSCode (#3849).
- The latest LAMMPS version (stable_29Aug2024_update1) is supported. (#4088, #4179)
Breaking changes
- Python 3.7 and 3.8 supports are dropped. (#3185, #4185)
- We require all model files to have the correct filename extension for all interfaces so a corresponding backend can load them. TensorFlow model files must end with the `.pb` extension.
- Bias is removed by default from type embedding. (#3958)
- The spin model is refactored, and its usage in the LAMMPS module has been changed. (#3301, #4321)
- Multi-task training support is removed from the TensorFlow backend. (#3763)
- The `set_prefix` key is deprecated. (#3753)
- `dp test` now uses all sets for training and testing. In previous versions, only the last set was used as the test set in `dp test`. (#3862)
- The Python module structure is fully refactored. The old `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. (#3177, #3178)
- The Python class `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`. (#3390)
- C++ 11 support is dropped. (#4068)
For other changes, refer to Full Changelog: v2.2.11...v3.0.0rc0
Contributors
The PyTorch backend was developed in the dptech-corp/deepmd-pytorch repository, and then it was fully merged into the deepmd-kit repository in #3180. Contributors to the deepmd-pytorch repository:
- @20171130
- @CaRoLZhangxy
- @amcadmus
- @guolinke
- @iProzd
- @nahso
- @njzjz
- @qin2xue3jian4
- @shishaochen
- @zjgemi
Contributors to the deepmd-kit repository:
- @CaRoLZhangxy: #3162 #3287 #3337 #3375 #3379 #3434 #3436 #3612 #3613 #3614 #3656 #3657 #3740 #3780 #3917 #3919 #4209 #4237
- @Chengqian-Zhang: #3615 #3796 #3828 #3840 #3867 #3912 #4120 #4145 #4280
- @ChiahsinChu: #4246 #4248
- @Cloudac7: #4031
- @HydrogenSulfate: #4117
- @LiuGroupHNU: #3978
- @Mancn-Xu: #3567
- @Yi-FanLi: #3822 #4013 #4084 #4283
- @anyangml: #3192 #3210 #3212 #3248 #3266 #3281 #3296 #3309 #3314 #3321 #3327 #3338 #3351 #3362 #3376 #3385 #3398 #3410 #3426 #3432 #3435 #3447 #3451 #3452 #3468 #3485 #3486 #3575 #3584 #3654 #3662 #3663 #3706 #3757 #3759 #3812 #3824 #3876 #3946 #3975 #4194 #4205 #4292 #4335 #4339
- @caic99: #3465 #4165
- @chazeon: #3473 #3652 #3653 #3739
- @cherryWangY: #3877 #4227 #4297 #4298 #4299 #4300
- @dependabot: #3231 #3312 #3446 #3487 #3777 #3882 #4045 #4127
- @hztttt: #3762
- @iProzd: #3180 #3203 #3245 #3261 #3301 #3355 #3359 #3367 #3371 #3378 #3380 #3387 #3388 #3409 #3411 #3441 #3442 #3445 #3456 #3480 #3569 #3571 #3573 #3607 #3616 #3619 #3696 #3698 #3712 #3717 #3718 #3725 #3746 #3748 #3758 #3763 #3768 #3773 #3774 #3775 #3781 #3782 #3785 #3803 #3813 #3814 #3815 #3826 #3837 #3841 #3842 #3843 #3873 #3906 #3914 #3916 #3925 #3926 #3927 #3933 #3944 #3945 #3957 #3958 #3967 #3971 #3976 #3992 #3993 #4006 #4007 #4015 #4066 #4089 #4138 #4139 #4148 #4162 #4222 #4223 #4224 #4225 #4243 #4244 #4321 #4323 #4324 #4344 #4353 #4354
- @iid-ccme: #4340
- @nahso: #3726 #3727
- @njzjz: #3164 #3167 #3169 #3170 #3171 #3172 #3173 #3174 #3175 #3176 #3177 #3178 #3179 #3181 #3185 #3186 #3187 #3191 #3193 #3194 #3195 #3196 #3198 #3200 #3201 #3204 #3205 #3206 #3207 #3213 #3217 #3220 #3221 #3222 #3223 #3226 #3228 #3229 #3237 #3238 #3239 #3243 #3244 #3247 #3249 #3250 #325...
v3.0.0b4
What's Changed
Breaking changes
- breaking: drop C++ 11 by @njzjz in #4068
- breaking(pt/dp): tune new sub-structures for DPA2 by @iProzd in #4089
The default values of the new options `g1_out_conv` and `g1_out_mlp` are set to `True`. The behavior in previous versions corresponds to `False`.
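To reproduce the pre-#4089 behavior, the two options can be set back explicitly in the training input. A sketch of the relevant fragment, assuming the options live under the `repformer` sub-section of the `dpa2` descriptor (the exact nesting may differ across versions, and all other keys are omitted):

```json
{
  "model": {
    "descriptor": {
      "type": "dpa2",
      "repformer": {
        "g1_out_conv": false,
        "g1_out_mlp": false
      }
    }
  }
}
```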
New features
- feat(pt): support property fitting by @Chengqian-Zhang in #3867
- feat(pt/dp): support three-body type embedding by @iProzd in #4066
- feat: load customized OP library in the C++ interface by @njzjz in #4073
- feat: make `dp neighbor-stat --type-map` optional by @njzjz in #4049
- feat: directional nlist by @wanghan-iapcm in #4052
- feat(pt): support `eval_typeebd` for `DeepEval` by @njzjz in #4110
- feat: `DeepEval.get_model_def_script` and common `dp show` by @njzjz in #4131
- chore: support preset bias of atomic model output by @wanghan-iapcm in #4116
- feat(jax): support neural networks in #4156
Enhancement
- fix: bump LAMMPS to stable_29Aug2024 by @njzjz in #4088
- chore(pt): clean up dead code by @wanghan-iapcm in #4142
- chore(pt): make comm_dict for dpa2 noncompulsory when nghost is 0 by @njzjz in #4144
- Set ROCM_ROOT to ROCM_PATH when it exists by @sigbjobo in #4150
- chore(pt): move deepmd.pt.infer.deep_eval.eval_model to tests by @njzjz in #4153
Documentation
- docs: improve docs for environment variables by @njzjz in #4070
- docs: dynamically generate command outputs by @njzjz in #4071
- docs: improve error message for inconsistent type maps by @njzjz in #4074
- docs: add multiple packages to `intersphinx_mapping` by @njzjz in #4075
- docs: document CMake variables using Sphinx styles by @njzjz in #4079
- docs: update ipi installation command by @njzjz in #4081
- docs: fix the default value of `DP_ENABLE_PYTORCH` by @njzjz in #4083
- docs: fix definition of `se_e3` by @njzjz in #4113
- docs: update DeepModeling URLs by @njzjz-bot in #4119
- docs(pt): examples for new dpa2 model by @iProzd in #4138
Bugfix
- fix: fix PT AutoBatchSize OOM bug and merge execute_all into base by @njzjz in #4047
- fix: replace `datetime.datetime.utcnow`, which is deprecated, by @njzjz in #4067
- fix: fix LAMMPS MPI tests with mpi4py 4.0.0 by @njzjz in #4032
- fix(pt): invalid type_map when multitask training by @Cloudac7 in #4031
- fix: manage testing models in a standard way by @njzjz in #4028
- fix(pt): fix ValueError when array byte order is not native by @njzjz in #4100
- fix(pt): convert `torch.__version__` to `str` when serializing by @njzjz in #4106
- fix(tests): fix `skip_dp` by @njzjz in #4111
- fix: wrap `log_path` with `Path` by @HydrogenSulfate in #4117
- fix: bugs in unit tests for property fitting by @Chengqian-Zhang in #4120
- fix: type of the preset out bias by @wanghan-iapcm in #4135
- fix(pt): fix zero inputs for LayerNorm by @njzjz in #4134
- fix(pt/dp): share params of repinit_three_body by @iProzd in #4139
- fix(pt): move entry point from deepmd.pt.model to deepmd.pt by @njzjz in #4146
- fix: fix DPH5Path.glob for new keys by @njzjz in #4152
- fix(pt): make state_dict safe for weights_only by @iProzd in #4148
- fix(pt): fix compute_output_stats_global when atomic_output is None by @njzjz in #4155
- fix(pt ut): make separated uts deterministic by @iProzd in #4162
- fix(pt): finetuning property/dipole/polar/dos fitting with multi-dimensional data causes error by @Chengqian-Zhang in #4145
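The `datetime.datetime.utcnow` replacement in #4067 reflects a general Python 3.12 deprecation: `utcnow()` returns a naive datetime and is slated for removal. A generic sketch of the timezone-aware replacement (not DeePMD-kit's actual code):

```python
from datetime import datetime, timezone

# datetime.utcnow() is deprecated since Python 3.12 because it returns a
# naive datetime; the recommended replacement is timezone-aware.
now = datetime.now(timezone.utc)

print(now.tzinfo)  # timezone.utc
```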
Dependency updates
- chore(deps): bump scikit-build-core to 0.9.x by @njzjz in #4038
- build(deps): bump pypa/cibuildwheel from 2.19 to 2.20 by @dependabot in #4045
- build(deps): bump pypa/cibuildwheel from 2.20 to 2.21 by @dependabot in #4127
CI/CD
- ci: add `include-hidden-files` to `actions/upload-artifact` by @njzjz in #4095
- ci: test Python 3.12 by @njzjz in #4059
- CI(codecov): do not notify until all reports are ready by @njzjz in #4136
Full Changelog: v3.0.0b3...v3.0.0b4