
Commit

Merge pull request #6 from deepmodeling/devel
Devel update
iProzd committed Apr 23, 2021
2 parents 7374920 + 7defc15 commit 7f45c18
Showing 76 changed files with 4,336 additions and 653 deletions.
79 changes: 28 additions & 51 deletions README.md
@@ -5,48 +5,38 @@

# Table of contents
- [About DeePMD-kit](#about-deepmd-kit)
- [Highlights in v2.0](#highlights-in-deepmd-kit-v20)
- [Highlighted features](#highlighted-features)
- [License and credits](#license-and-credits)
- [Deep Potential in a nutshell](#deep-potential-in-a-nutshell)
- [Download and install](#download-and-install)
- [Use DeePMD-kit](#use-deepmd-kit)
- [Code structure](#code-structure)
- [Troubleshooting](#troubleshooting)

# About DeePMD-kit
DeePMD-kit is a package written in Python/C++, designed to minimize the effort required to build deep learning based models of interatomic potential energy and force fields and to perform molecular dynamics (MD). This brings new hope to addressing the accuracy-versus-efficiency dilemma in molecular simulations. Applications of DeePMD-kit span from finite molecules to extended systems and from metallic systems to chemically bonded systems.

For more information, check the [documentation](https://deepmd.readthedocs.io/).

## Highlights in DeePMD-kit v2.0

* [Model compression](doc/use-deepmd-kit.md#compress-a-model). Accelerates model inference by 4-15 times.
* [New descriptors](doc/use-deepmd-kit.md#write-the-input-script). Including [`se_e2_r`](doc/train-se-e2-r.md) and [`se_e3`](doc/train-se-e3.md).
* [Hybridization of descriptors](doc/train-hybrid.md). A hybrid descriptor constructed by concatenating several descriptors.
* Atom type embedding.
* Training and inference of the dipole (vector) and polarizability (matrix).
* Splitting of training and validation datasets (see the sketch after this list).
* Optimized training on GPUs.
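
A minimal sketch of what the training/validation split looks like in a v2.0 training input is given below; the `training_data`/`validation_data` keys are not shown in this diff, so treat the snippet as illustrative, and the system paths are placeholders.
```json
"training": {
    "training_data": {
        "systems": ["../data/data_0", "../data/data_1"],
        "batch_size": "auto"
    },
    "validation_data": {
        "systems": ["../data/data_2"],
        "batch_size": 1
    }
}
```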


## Highlighted features
* **interfaced with TensorFlow**, one of the most popular deep learning frameworks, making the training process highly automatic and efficient. In addition, TensorBoard can be used to visualize the training procedure.
* **interfaced with high-performance classical MD and quantum (path-integral) MD packages**, i.e., LAMMPS and i-PI, respectively.
* **implements the Deep Potential series models**, which have been successfully applied to finite and extended systems, including organic molecules, metals, semiconductors, and insulators.
* **implements MPI and GPU support**, making it highly efficient for high-performance parallel and distributed computing.
* **highly modularized**, easy to adapt to different descriptors for deep learning based potential energy models.

## Code structure
The code is organized as follows:

* `data/raw`: tools manipulating the raw data files.

* `examples`: example json parameter files.

* `source/3rdparty`: third-party packages used by DeePMD-kit.

* `source/cmake`: cmake scripts for building.

* `source/ipi`: source code of i-PI client.

* `source/lib`: source code of DeePMD-kit library.

* `source/lmp`: source code of the LAMMPS module.

* `source/op`: TensorFlow op implementations, working with the library.

* `source/train`: Python modules and scripts for training and testing.


## License and credits
The project DeePMD-kit is licensed under [GNU LGPLv3.0](./LICENSE).
If you use this code in any future publications, please cite this using
@@ -87,43 +77,30 @@ A quick-start on using DeePMD-kit can be found [here](doc/use-deepmd-kit.md).
A full [document](doc/train-input.rst) on options in the training input script is available.


# Troubleshooting
Problems may occur because of differences between computers and systems. Some common circumstances are listed below.
If other unexpected problems occur, you are welcome to contact us for help.
# Code structure
The code is organized as follows:

## Model compatibility
* `data/raw`: tools manipulating the raw data files.

When the version of DeePMD-kit used to train the model differs from the version of DeePMD-kit running the MD, one has the problem of model compatibility.
* `examples`: examples.

DeePMD-kit guarantees that codes with the same major and minor revisions are compatible. That is to say, v0.12.5 is compatible with v0.12.0, but not with v0.11.0 or v1.0.0.
* `deepmd`: DeePMD-kit python modules.

## Installation: inadequate versions of gcc/g++
Sometimes the available gcc/g++ is of version <4.9. If you have a gcc/g++ of version >4.9, say 7.2.0, you may choose to use it by doing
```bash
export CC=/path/to/gcc-7.2.0/bin/gcc
export CXX=/path/to/gcc-7.2.0/bin/g++
```
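
To double-check which compilers will be picked up, one can, for example, run
```bash
$CC --version
$CXX --version
```
Note that the exported variables only take effect for a fresh `cmake` configuration, not for an already-configured build directory.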
* `source/api_cc`: source code of DeePMD-kit C++ API.

If, for whatever reason, you only have a gcc/g++ of version 4.8.5, you can still compile all parts of TensorFlow and most parts of DeePMD-kit; i-PI will be disabled automatically.
* `source/ipi`: source code of i-PI client.

## Installation: build files left in DeePMD-kit
When you try to build DeePMD-kit a second time, files produced by the previous build may cause the build to fail. You may clear them by
```bash
cd build
rm -r *
```
and redo the `cmake` process.
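
Equivalently, one can rebuild from a clean build directory. A sketch under the usual from-source setup; the variables `$tensorflow_root` and `$deepmd_root` are placeholders for your TensorFlow C++ prefix and install prefix, and the cmake options shown are only illustrative:
```bash
cd $deepmd_source_dir/source
rm -rf build && mkdir build && cd build
cmake -DTENSORFLOW_ROOT=$tensorflow_root -DCMAKE_INSTALL_PREFIX=$deepmd_root ..
make -j4
make install
```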
* `source/lib`: source code of DeePMD-kit library.

* `source/lmp`: source code of the LAMMPS module.

## MD: cannot run LAMMPS after installing a new version of DeePMD-kit
This typically happens when you install a new version of DeePMD-kit, copy the generated `USER-DEEPMD` directly into the LAMMPS source code folder, and re-install LAMMPS.
* `source/op`: TensorFlow op implementations, working with the library.

To solve this problem, it suffices to first remove `USER-DEEPMD` from the LAMMPS source code by
```bash
make no-user-deepmd
```
and then install the new `USER-DEEPMD`.

If this does not solve your problem, try decompressing the LAMMPS source tarball and installing LAMMPS from scratch again, which typically should be very fast.
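
For reference, re-enabling the package afterwards follows the same pattern. A sketch assuming the traditional make-based LAMMPS build; the machine target is whatever you normally use:
```bash
# run from the LAMMPS src/ directory after copying in the new USER-DEEPMD
make yes-user-deepmd
make mpi
```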

# Troubleshooting

See the [troubleshooting page](doc/troubleshooting.md).


[1]: http://www.global-sci.com/galley/CiCP-2017-0213.pdf
12 changes: 10 additions & 2 deletions deepmd/loss/tensor.py
@@ -78,6 +78,9 @@ def build (self,
polar_hat = label_dict[self.label_name]
polar = model_dict[self.tensor_name]

# YWolfeee: get the 2 norm of label, i.e. polar_hat
normalized_term = tf.sqrt(tf.reduce_sum(tf.square(polar_hat)))

# YHT: added for global / local dipole combination
l2_loss = global_cvt_2_tf_float(0.0)
more_loss = {
@@ -117,7 +120,7 @@ def build (self,
self.l2_loss_global_summary = tf.summary.scalar('l2_global_loss',
tf.sqrt(more_loss['global_loss']) / global_cvt_2_tf_float(atoms))

# YHT: should only consider atoms with dipole, i.e. atoms
# YWolfeee: should only consider atoms with dipole, i.e. atoms
# atom_norm = 1./ global_cvt_2_tf_float(natoms[0])
atom_norm = 1./ global_cvt_2_tf_float(atoms)
global_loss *= atom_norm
@@ -128,7 +131,12 @@ def build (self,
self.l2_l = l2_loss

self.l2_loss_summary = tf.summary.scalar('l2_loss', tf.sqrt(l2_loss))
return l2_loss, more_loss

# YWolfeee: loss normalization, do not influence the printed loss,
# just change the training process
#return l2_loss, more_loss
return l2_loss / normalized_term, more_loss


def eval(self, sess, feed_dict, natoms):
atoms = 0
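
The change above scales the returned training loss by the 2-norm of the label tensor. A minimal NumPy sketch of the idea, illustrative only and not the DeePMD-kit API:
```python
import numpy as np

def normalized_l2_loss(pred: np.ndarray, label: np.ndarray) -> float:
    """L2 loss divided by the 2-norm of the label, mirroring the diff above."""
    l2 = np.sum((pred - label) ** 2)
    norm = np.sqrt(np.sum(label ** 2))  # plays the role of normalized_term
    return l2 / norm

# example: a 3-component (dipole-like) label
label = np.array([1.0, 2.0, 2.0])
pred = np.array([1.1, 1.9, 2.2])
print(normalized_l2_loss(pred, label))
```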
12 changes: 6 additions & 6 deletions deepmd/train/trainer.py
@@ -113,22 +113,22 @@ def _init_param(self, jdata):
# elif fitting_type == 'wfc':
# self.fitting = WFCFitting(fitting_param, self.descrpt)
elif fitting_type == 'dipole':
if descrpt_type == 'se_a':
if descrpt_type == 'se_e2_a':
self.fitting = DipoleFittingSeA(**fitting_param)
else :
raise RuntimeError('fitting dipole only supports descrptors: se_a')
raise RuntimeError('fitting dipole only supports descrptors: se_e2_a')
elif fitting_type == 'polar':
# if descrpt_type == 'loc_frame':
# self.fitting = PolarFittingLocFrame(fitting_param, self.descrpt)
if descrpt_type == 'se_a':
if descrpt_type == 'se_e2_a':
self.fitting = PolarFittingSeA(**fitting_param)
else :
raise RuntimeError('fitting polar only supports descrptors: loc_frame and se_a')
raise RuntimeError('fitting polar only supports descrptors: loc_frame and se_e2_a')
elif fitting_type == 'global_polar':
if descrpt_type == 'se_a':
if descrpt_type == 'se_e2_a':
self.fitting = GlobalPolarFittingSeA(**fitting_param)
else :
raise RuntimeError('fitting global_polar only supports descrptors: loc_frame and se_a')
raise RuntimeError('fitting global_polar only supports descrptors: loc_frame and se_e2_a')
else :
raise RuntimeError('unknow fitting type ' + fitting_type)

27 changes: 24 additions & 3 deletions deepmd/utils/argcheck.py
@@ -393,19 +393,40 @@ def loss_ener():
Argument("relative_f", [float,None], optional = True, doc = doc_relative_f)
]

# YWolfeee: Modified to support tensor type of loss args.
def loss_tensor(default_mode):
if default_mode == "local":
doc_global_weight = "The prefactor of the weight of global loss. It should be larger than or equal to 0. If not provided, training will be atomic mode, i.e. atomic label should be provided."
doc_local_weight = "The prefactor of the weight of atomic loss. It should be larger than or equal to 0. If it's not provided and global weight is provided, training will be global mode, i.e. global label should be provided. If both global and atomic weight are not provided, training will be atomic mode, i.e. atomic label should be provided."
return [
Argument("pref_weight", [float,int], optional = True, default = None, doc = doc_global_weight),
Argument("pref_atomic_weight", [float,int], optional = True, default = None, doc = doc_local_weight),
]
else:
doc_local_weight = "The prefactor of the weight of atomic loss. It should be larger than or equal to 0. If not provided, training will be global mode, i.e. global label should be provided."
doc_global_weight = "The prefactor of the weight of global loss. It should be larger than or equal to 0. If it's not provided and atomic weight is provided, training will be atomic mode, i.e. atomic label should be provided. If both global and atomic weight are not provided, training will be global mode, i.e. global label should be provided."
return [
Argument("pref_weight", [float,int], optional = True, default = None, doc = doc_global_weight),
Argument("pref_atomic_weight", [float,int], optional = True, default = None, doc = doc_local_weight),
]

def loss_variant_type_args():
doc_loss = 'The type of the loss. \n\.'
doc_loss = 'The type of the loss. The loss type should be set to the fitting type or left unset.\n\.'


return Variant("type",
[Argument("ener", dict, loss_ener())],
[Argument("ener", dict, loss_ener()),
Argument("dipole", dict, loss_tensor("local")),
Argument("polar", dict, loss_tensor("local")),
Argument("global_polar", dict, loss_tensor("global"))
],
optional = True,
default_tag = 'ener',
doc = doc_loss)


def loss_args():
doc_loss = 'The definition of loss function. The type of the loss depends on the type of the fitting. For fitting type `ener`, the prefactors before energy, force, virial and atomic energy losses may be provided. For fitting type `dipole`, `polar` and `global_polar`, the loss may be an empty `dict` or unset.'
doc_loss = 'The definition of loss function. The loss type should be set to the fitting type or left unset.\n\.'
ca = Argument('loss', dict, [],
[loss_variant_type_args()],
optional = True,
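
With the arguments above, a tensor loss switches between atomic and global training via the two prefactor weights. A hypothetical `loss` section of the training input, using only the keys defined in this diff (values illustrative):
```json
"loss": {
    "type": "polar",
    "pref_atomic_weight": 1.0
}
```
Per the docstrings, providing only `pref_weight` instead selects global training, and omitting both falls back to the default mode of the loss type (atomic for `dipole`/`polar`, global for `global_polar`).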
41 changes: 41 additions & 0 deletions doc/data-conv.md
@@ -0,0 +1,41 @@
# Data


In this example we will convert DFT-labeled data stored in the VASP `OUTCAR` format into the data format used by DeePMD-kit. The example `OUTCAR` can be found in the directory.
```bash
$deepmd_source_dir/examples/data_conv
```


## Definition

DeePMD-kit organizes data in **`systems`**. Each `system` is composed of a number of **`frames`**. One may roughly view a `frame` as a snapshot of an MD trajectory, but it does not necessarily come from an MD simulation. A `frame` records the coordinates and types of the atoms, the cell vectors (if periodic boundary conditions are assumed), the energy, the atomic forces and the virial. Note that the `frames` in one `system` share the same number of atoms of the same types.



## Data conversion

It is convenient to use [dpdata](https://github.com/deepmodeling/dpdata) to convert data generated by DFT packages into the data format used by DeePMD-kit.

To install one can execute
```bash
pip install dpdata
```

An example of converting [VASP](https://www.vasp.at/) data in `OUTCAR` format to DeePMD-kit data can be found at
```
$deepmd_source_dir/examples/data_conv
```

Switch to that directory; then the data can be converted with the following Python script:
```python
import dpdata
dsys = dpdata.LabeledSystem('OUTCAR')
dsys.to('deepmd/npy', 'deepmd_data', set_size = dsys.get_nframes())
```

The `get_nframes()` method gets the number of frames in the `OUTCAR`, and the argument `set_size` enforces that the set size is equal to the number of frames in the system, i.e. only one `set` is created in the `system`.
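
If several smaller sets are preferred instead of a single one, a variant using only the `dpdata` calls already shown above would be:
```python
import dpdata

dsys = dpdata.LabeledSystem('OUTCAR')
# split the frames into sets of at most 200 frames each
dsys.to('deepmd/npy', 'deepmd_data', set_size=200)
```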

The data in DeePMD-kit format is stored in the folder `deepmd_data`.
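
For orientation, the resulting folder typically looks something like the sketch below; this is an illustration of the DeePMD-kit npy layout rather than output generated from the example, and the exact files depend on which labels are present in the `OUTCAR`.
```
deepmd_data
├── type.raw
├── type_map.raw
└── set.000
    ├── box.npy
    ├── coord.npy
    ├── energy.npy
    ├── force.npy
    └── virial.npy
```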

A list of all [supported data formats](https://github.com/deepmodeling/dpdata#load-data) and more features of `dpdata` can be found on the [official website](https://github.com/deepmodeling/dpdata).
25 changes: 25 additions & 0 deletions doc/train-hybrid.md
@@ -0,0 +1,25 @@
# Train a Deep Potential model using descriptor `"hybrid"`

This descriptor hybridizes multiple descriptors to form a new one. For example, given a list of descriptors denoted by D_1, D_2, ..., D_N, the hybrid descriptor is the concatenation of the list, i.e. D = (D_1, D_2, ..., D_N).

To use this descriptor in DeePMD-kit, one first sets the `type` to `"hybrid"`, then provides the definitions of the constituent descriptors as the items of the `list`:
```json
"descriptor" :{
"type": "hybrid",
"list" : [
{
"type" : "se_e2_a",
...
},
{
"type" : "se_e2_r",
...
}
]
},
```

A complete training input script of this example can be found in the directory
```bash
$deepmd_source_dir/examples/water/hybrid/input.json
```