Merge pull request #795 from AnguseZhang/master
Merge devel to master
AnguseZhang committed Jul 11, 2022
2 parents 0767dce + 5d5cb2f commit 8dea29e
Showing 78 changed files with 2,933 additions and 928 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -41,3 +41,4 @@ dbconfig.json
.idea/*
_build
tests/generator/calypso_test_path
doc/api/
47 changes: 19 additions & 28 deletions README.md
@@ -373,7 +373,6 @@ In `PARAM`, you can specialize the task as you expect.

"_comment": " that's all ",
"numb_models": 4,
"train_param": "input.json",
"default_training_param": {
"model": {
"type_map": [
@@ -499,9 +498,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
| **use_ele_temp** | int | 0 | Currently only supported with fp_style vasp. 0 (default): no electron temperature. 1: electron temperature as frame parameter. 2: electron temperature as atom parameter.
| *#Data*
| init_data_prefix | String | "/sharedext4/.../data/" | Prefix of initial data directories
| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here.
| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here. Systems will be detected recursively in the directories.
| ***sys_format*** | String | "vasp/poscar" | Format of initial data. It will be `vasp/poscar` if not set.
| init_multi_systems | Boolean | false | If set to `true`, `init_data_sys` directories should contain sub-directories of various systems. DP-GEN will regard all of these sub-directories as inital data systems.
| init_batch_size | String of integer | [8] | Each number is the batch size of the corresponding system for training in `init_data_sys`. One recommended rule for setting `sys_batch_size` and `init_batch_size` is that `batch_size` multiplied by the number of atoms of the structure should be larger than 32. If set to `auto`, the batch size will be 32 divided by the number of atoms. |
| sys_configs_prefix | String | "/sharedext4/.../data/" | Prefix of `sys_configs`
| **sys_configs** | List of list of string | [<br />["/sharedext4/.../POSCAR"], <br />["....../POSCAR"]<br />] | Containing directories of structures to be explored in iterations.Wildcard characters are supported here. |
@@ -515,10 +513,10 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
| *#Exploration*
| **model_devi_dt** | Float | 0.002 (recommend) | Timestep for MD |
| **model_devi_skip** | Integer | 0 | Number of structures skipped for fp in each MD
| **model_devi_f_trust_lo** | Float or List of float | 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
| **model_devi_f_trust_hi** | Float or List of float | 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
| **model_devi_v_trust_lo** | Float or List of float | 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
| **model_devi_v_trust_hi** | Float or List of float | 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
| **model_devi_f_trust_lo** | Float or List of float or Dict[str, float] | 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
| **model_devi_f_trust_hi** | Float or List of float or Dict[str, float] | 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
| **model_devi_v_trust_lo** | Float or List of float or Dict[str, float] | 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
| **model_devi_v_trust_hi** | Float or List of float or Dict[str, float] | 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
| model_devi_adapt_trust_lo | Boolean | False | Adaptively determines the lower trust levels of force and virial. This option should be used together with `model_devi_numb_candi_f`, `model_devi_numb_candi_v` and optionally with `model_devi_perc_candi_f` and `model_devi_perc_candi_v`. `dpgen` will make two sets: 1. From the frames with force model deviation lower than `model_devi_f_trust_hi`, select `max(model_devi_numb_candi_f, model_devi_perc_candi_f*n_frames)` frames with largest force model deviation. 2. From the frames with virial model deviation lower than `model_devi_v_trust_hi`, select `max(model_devi_numb_candi_v, model_devi_perc_candi_v*n_frames)` frames with largest virial model deviation. The union of the two sets is made as candidate dataset|
| model_devi_numb_candi_f | Int | 10 | See `model_devi_adapt_trust_lo`.|
| model_devi_numb_candi_v | Int | 0 | See `model_devi_adapt_trust_lo`.|
@@ -537,7 +535,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
| **model_devi_jobs["ensembles"]** | String | "nvt" | Determining which ensemble used in MD, **options** include “npt” and “nvt”. |
| model_devi_jobs["neidelay"] | Integer | "10" | delay building until this many steps since last build |
| model_devi_jobs["taut"] | Float | "0.1" | Coupling time of thermostat (ps) |
| model_devi_jobs["taup"] | Float | "0.5" | Coupling time of barostat (ps)
| model_devi_jobs["taup"] | Float | "0.5" | Coupling time of barostat (ps) |
| model_devi_jobs["model_devi_f_trust_lo"] <br> model_devi_jobs["model_devi_f_trust_hi"] <br> model_devi_jobs["model_devi_v_trust_lo"] <br> model_devi_jobs["model_devi_v_trust_hi"] | Float or Dict[str, float] | | See the global model_devi settings above, such as **model_devi_f_trust_lo**. If a dict, it should be set for each index in `sys_idx`, respectively. |
| *#Labeling*
| **fp_style** | string | "vasp" | Software for First Principles. **Options** include “vasp”, “pwscf”, “siesta” and “gaussian” up to now. |
| **fp_task_max** | Integer | 20 | Maximum of structures to be calculated in `02.fp` of each iteration. |
@@ -571,7 +570,7 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
| **user_fp_params** | Dict | |Parameters for cp2k calculation. Find details at manual.cp2k.org. Only the kind section must be set before use. We assume that you have basic knowledge of cp2k input.
| **external_input_path** | String | | Conflict with key:user_fp_params, use the template input provided by user, some rules should be followed, read the following text in detail.
| *fp_style == ABACUS*
| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. find detail [Here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory.
| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. find detail [Here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory. You can also set `KPT` file by adding `k_points` that corresponds to a list of six integers in this dictionary.
| **fp_orb_files** | List | |List of atomic orbital files. The files should be in pseudopotential directory.
| **fp_dpks_descriptor** | String | |DeePKS descriptor file name. The file should be in pseudopotential directory.
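
The trust-level keys in the table above now accept a float, a list, or a `Dict[str, float]`. The following is a minimal sketch of how such a value could be resolved for one system. The helper name `resolve_trust_level` is hypothetical, and the assumption that dict keys are system indices written as strings is ours, not something this diff states.

```python
from typing import Dict, List, Union

TrustLevel = Union[float, List[float], Dict[str, float]]


def resolve_trust_level(value: TrustLevel, sys_idx: int) -> float:
    """Return the trust level that applies to the system with index sys_idx.

    Hypothetical helper for illustration; not taken from the DP-GEN sources.
    """
    if isinstance(value, dict):
        # Assumption: dict keys are system indices written as strings.
        return float(value[str(sys_idx)])
    if isinstance(value, list):
        # A list carries one entry per index in `sys_configs`.
        return float(value[sys_idx])
    # A plain number applies to every system.
    return float(value)
```

With this sketch, `resolve_trust_level([0.05, 0.10], 1)` and `resolve_trust_level({"0": 0.05, "1": 0.10}, 1)` both pick the value for the second system.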

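The selection rule described for `model_devi_adapt_trust_lo` can be sketched as follows. This is an illustrative reading of the table text only: the function names are hypothetical, the product `perc * n_frames` is truncated to an integer as an assumption, and the real DP-GEN implementation may differ.

```python
from typing import List, Sequence, Set


def _top_below(devi: Sequence[float], trust_hi: float, numb: int, perc: float) -> Set[int]:
    """Indices of the frames below trust_hi with the largest deviation."""
    # Assumption: the product is truncated to an integer frame count.
    n_pick = max(numb, int(perc * len(devi)))
    below = [i for i, d in enumerate(devi) if d < trust_hi]
    below.sort(key=lambda i: devi[i], reverse=True)
    return set(below[:n_pick])


def pick_candidates(devi_f: Sequence[float], devi_v: Sequence[float],
                    f_trust_hi: float, v_trust_hi: float,
                    numb_candi_f: int = 10, numb_candi_v: int = 0,
                    perc_candi_f: float = 0.0, perc_candi_v: float = 0.0) -> List[int]:
    """Union of the force-based and virial-based candidate sets (hypothetical helper)."""
    set_f = _top_below(devi_f, f_trust_hi, numb_candi_f, perc_candi_f)
    set_v = _top_below(devi_v, v_trust_hi, numb_candi_v, perc_candi_v)
    return sorted(set_f | set_v)
```

Frames at or above the upper trust level never enter either set, so the candidate list only contains frames the models still partly agree on.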
@@ -1016,7 +1015,6 @@ Here is an example of `param.json` for QM7 dataset:
"auto"
],
"numb_models": 4,
"train_param": "input.json",
"default_training_param": {
"model": {
"type_map": [
@@ -1086,7 +1084,6 @@ Here is an example of `param.json` for QM7 dataset:
},
"_comment": "that's all"
},
"use_clusters": true,
"fp_style": "gaussian",
"shuffle_poscar": false,
"fp_task_max": 1000,
@@ -1109,7 +1106,7 @@ Here is an example of `param.json` for QM7 dataset:
}
```

Here `pick_data` is the data to simplify and currently only supports `MultiSystems` containing `System` with `deepmd/npy` format, and `use_clusters` should always be `true`. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo`, `e_trust_hi` mean the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` mean the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are as the same as those of generator.
Here `pick_data` is the directory to data to simplify where the program recursively detects systems `System` with `deepmd/npy` format. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo`, `e_trust_hi` mean the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` mean the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are as the same as those of generator.
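
To make "recursively detects systems" concrete, here is a minimal sketch of such a directory scan. It assumes that a `deepmd/npy` system is a directory holding a `type.raw` file and at least one `set.*` sub-directory. The function name `find_deepmd_systems` is hypothetical, and DP-GEN's own detection logic is not shown in this diff.

```python
import os
from typing import List


def find_deepmd_systems(root: str) -> List[str]:
    """Return every directory under root that looks like a deepmd/npy system.

    Hypothetical helper: a directory qualifies if it has `type.raw`
    and at least one `set.*` sub-directory (an assumption about the layout).
    """
    systems = []
    for dirpath, dirnames, filenames in os.walk(root):
        has_types = "type.raw" in filenames
        has_sets = any(name.startswith("set.") for name in dirnames)
        if has_types and has_sets:
            systems.append(dirpath)
    return sorted(systems)
```

Under these assumptions, pointing `pick_data` at a parent directory would pick up every nested system without listing each one by hand.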


## Set up machine
@@ -1139,7 +1136,7 @@ an example of new dpgen's machine.json
```json
{
"api_version": "1.0",
"train": [
"train":
{
"command": "dp",
"machine": {
@@ -1163,9 +1160,8 @@ an example of new dpgen's machine.json
"para_deg": 3,
"source_list": ["/home/user1234/deepmd.1.2.4.env"]
}
}
],
"model_devi":[
},
"model_devi":
{
"command": "lmp",
"machine":{
@@ -1186,9 +1182,8 @@ an example of new dpgen's machine.json
"group_size": 5,
"source_list": ["/home/user1234/deepmd.1.2.4.env"]
}
}
],
"fp":[
},
"fp":
{
"command": "vasp_std",
"machine":{
@@ -1210,7 +1205,6 @@ an example of new dpgen's machine.json
"source_list": ["~/vasp.env"]
}
}
]
}
```
note1: the key "local_root" in dpgen's machine.json is always `./`
@@ -1222,7 +1216,7 @@ When switching into a new machine, you may modifying the `MACHINE`, according to
An example for `MACHINE` is:
```json
{
"train": [
"train":
{
"machine": {
"batch": "slurm",
@@ -1245,9 +1239,8 @@ An example for `MACHINE` is:
"qos": "data"
},
"command": "USERPATH/dp"
}
],
"model_devi": [
},
"model_devi":
{
"machine": {
"batch": "slurm",
@@ -1271,9 +1264,8 @@ An example for `MACHINE` is:
},
"command": "lmp_serial",
"group_size": 1
}
],
"fp": [
},
"fp":
{
"machine": {
"batch": "slurm",
@@ -1300,7 +1292,6 @@ An example for `MACHINE` is:
"command": "vasp_gpu",
"group_size": 1
}
]
}
```
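
The hunks above change the README examples so that `train`, `model_devi` and `fp` hold a single object instead of a one-element list. A minimal sketch for converting an existing file to the new shape might look like this. The helper `unwrap_single_item_lists` is hypothetical and not part of DP-GEN, and whether DP-GEN still accepts the list form is not stated in this diff.

```python
import json
from typing import Any, Dict

STAGES = ("train", "model_devi", "fp")


def unwrap_single_item_lists(mdata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of mdata where each stage is a dict rather than a 1-item list.

    Hypothetical migration helper; raises if a stage lists several environments,
    because picking one of them silently would lose information.
    """
    converted = dict(mdata)
    for stage in STAGES:
        value = converted.get(stage)
        if isinstance(value, list):
            if len(value) != 1:
                raise ValueError(f"{stage!r} has {len(value)} entries; choose one manually")
            converted[stage] = value[0]
    return converted


if __name__ == "__main__":
    old = {"train": [{"command": "dp"}], "fp": [{"command": "vasp_std"}]}
    print(json.dumps(unwrap_single_item_lists(old), indent=2))
```

The input dictionary is left untouched, so the old and new layouts can be compared before anything is written back to disk.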
Following table illustrates which key is needed for three types of machine: `train`,`model_devi` and `fp`. Each of them is a list of dicts. Each dict can be considered as an independent environment for calculation.
2 changes: 2 additions & 0 deletions conda/meta.yaml
@@ -28,6 +28,7 @@ requirements:
- ase
- GromacsWrapper
- custodian
- netCDF4

run:
- python >=3.6
@@ -40,6 +41,7 @@ requirements:
- ase
- GromacsWrapper
- custodian
- netCDF4

test:
imports:
32 changes: 27 additions & 5 deletions doc/conf.py
@@ -40,17 +40,20 @@

extensions = [
'deepmodeling_sphinx',
'dargs.sphinx',
"sphinx_rtd_theme",
'myst_parser',
'sphinx.ext.autosummary',
'sphinx.ext.viewcode',
'sphinxarg.ext',
]


# Tell sphinx what the primary language being documented is.
primary_domain = 'cpp'
primary_domain = 'py'

# Tell sphinx what the pygments highlight language should be.
highlight_language = 'cpp'
highlight_language = 'py'

#
myst_heading_anchors = 4
@@ -81,9 +84,28 @@
autosummary_generate = True
master_doc = 'index'

intersphinx_mapping = {
"python": ("https://docs.python.org/", None),
"dargs": ("https://docs.deepmodeling.com/projects/dargs/en/latest/", None),
"dpdata": ("https://docs.deepmodeling.com/projects/dpdata/en/latest/", None),
"dpdispatcher": ("https://docs.deepmodeling.com/projects/dpdispatcher/en/latest/", None),
"ase": ("https://wiki.fysik.dtu.dk/ase/", None),
"numpy": ("https://docs.scipy.org/doc/numpy/", None),
"pamatgen": ("https://pymatgen.org/", None),
"monty": ("https://guide.materialsvirtuallab.org/monty/", None),
"paramiko": ("https://docs.paramiko.org/en/stable/", None),
"custodian": ("https://cloudcustodian.io/docs/", None),
"GromacsWrapper": ("https://gromacswrapper.readthedocs.io/en/latest/", None),
}


def run_apidoc(_):
from sphinx.ext.apidoc import main
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
cur_dir = os.path.abspath(os.path.dirname(__file__))
module = os.path.join(cur_dir, "..", "dpgen")
main(['-M', '--tocfile', 'api', '-H', 'DP-GEN API', '-o', os.path.join(cur_dir, "api"), module, '--force'])

def generate_arginfo(app):
subprocess.check_output((sys.executable, "gen_arginfo.py"), universal_newlines=True)

def setup(app):
app.connect('builder-inited', generate_arginfo)
app.connect('builder-inited', run_apidoc)
5 changes: 0 additions & 5 deletions doc/gen_arginfo.py

This file was deleted.

58 changes: 53 additions & 5 deletions doc/index.rst
@@ -2,13 +2,60 @@
DPGEN's documentation
==========================

.. _parameters::
.. _overview::

.. toctree::
:maxdepth: 2
:caption: Parameters
:caption: Overview

overview/cli


.. _installation::

.. toctree::
:maxdepth: 2
:caption: Installation


.. _run::

.. toctree::
:maxdepth: 2
:caption: Run

run/run-process.rst
run/param.rst
run/mdata.rst

.. _init::

.. toctree::
:maxdepth: 2
:caption: Init

init/init-bulk-mdata
init/init-surf-mdata
init/init-reaction
init/init-reaction-jdata
init/init-reaction-mdata

.. _autotest::

.. toctree::
:maxdepth: 2
:caption: Autotest


.. _simplify::

.. toctree::
:maxdepth: 2
:caption: Simplify

simplify/simplify-jdata
simplify/simplify-mdata

run-mdata.rst

.. _tutorial:

@@ -17,16 +64,17 @@ DPGEN's documentation
:caption: Tutorial
:glob:

toymodels/*
Tutorials <https://tutorials.deepmodeling.com/en/latest/Tutorials/DP-GEN/>


.. _Contribution:

.. toctree::
:maxdepth: 2
:caption: Contribution Guild
:caption: Contribution Guide

README.md
api/api

* :ref:`genindex`
* :ref:`modindex`
6 changes: 6 additions & 0 deletions doc/init/init-bulk-mdata.rst
@@ -0,0 +1,6 @@
dpgen init_bulk machine parameters
==================================

.. dargs::
:module: dpgen.data.arginfo
:func: init_bulk_mdata_arginfo
6 changes: 6 additions & 0 deletions doc/init/init-reaction-jdata.rst
@@ -0,0 +1,6 @@
dpgen init_reaction parameters
======================================

.. dargs::
:module: dpgen.data.arginfo
:func: init_reaction_jdata_arginfo
6 changes: 6 additions & 0 deletions doc/init/init-reaction-mdata.rst
@@ -0,0 +1,6 @@
dpgen init_reaction machine parameters
======================================

.. dargs::
:module: dpgen.data.arginfo
:func: init_reaction_mdata_arginfo
21 changes: 21 additions & 0 deletions doc/init/init-reaction.md
@@ -0,0 +1,21 @@
# init_reaction

`dpgen init_reaction` is a workflow to initialize data for reactive systems of small gas-phase molecules. The workflow was introduced in the "Initialization" section of [Energy & Fuels, 2021, 35 (1), 762–769](https://doi.org/10.1021/acs.energyfuels.0c03211).

To start the workflow, one needs a box containing reactive systems. The following packages are required for each step:
- Exploring: [LAMMPS](https://github.com/lammps/lammps)
- Sampling: [MDDatasetBuilder](https://github.com/tongzhugroup/mddatasetbuilder)
- Labeling: [Gaussian](https://gaussian.com/)

The Exploring step uses LAMMPS [pair_style reaxff](https://docs.lammps.org/latest/pair_reaxff.html) to run a short ReaxMD NVT MD simulation. In the Sampling step, molecular clusters are taken and the k-means clustering algorithm is applied to remove redundancy, as described in [Nature Communications, 11, 5713 (2020)](https://doi.org/10.1038/s41467-020-19497-z). The Labeling step calculates energies and forces using the Gaussian package.

An example of `reaction.json` is given below:

```{literalinclude} ../../examples/init/reaction.json
:language: json
:linenos:
```

For detailed parameters, see [parameters](init-reaction-jdata.rst) and [machine parameters](init-reaction-mdata.rst).

The generated data can be used to continue the DP-GEN concurrent learning workflow. Read [Energy & Fuels, 2021, 35 (1), 762–769](https://doi.org/10.1021/acs.energyfuels.0c03211) for details.
6 changes: 6 additions & 0 deletions doc/init/init-surf-mdata.rst
@@ -0,0 +1,6 @@
dpgen init_surf machine parameters
==================================

.. dargs::
:module: dpgen.data.arginfo
:func: init_surf_mdata_arginfo
7 changes: 7 additions & 0 deletions doc/overview/cli.rst
@@ -0,0 +1,7 @@
Command line interface
======================

.. argparse::
:module: dpgen.main
:func: main_parser
:prog: dpgen
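
The `argparse` directive from sphinx-argparse imports the module given by `:module:` and calls the function given by `:func:`, which is expected to return an `argparse.ArgumentParser`. The sketch below shows the general shape such a function needs. The subcommand and argument names are illustrative only and are not copied from `dpgen.main`.

```python
import argparse


def main_parser() -> argparse.ArgumentParser:
    """Build and return the parser without parsing anything.

    Illustrative shape only; the real dpgen.main.main_parser may differ.
    """
    parser = argparse.ArgumentParser(
        prog="dpgen",
        description="Illustrative command line interface with subcommands.",
    )
    subparsers = parser.add_subparsers(dest="command")

    # Hypothetical subcommand taking a parameter file and an optional machine file.
    init_reaction = subparsers.add_parser(
        "init_reaction", help="initialize data for reactive systems"
    )
    init_reaction.add_argument("PARAM", help="parameter file (JSON)")
    init_reaction.add_argument("MACHINE", nargs="?", default=None,
                               help="machine file (JSON), optional")
    return parser


if __name__ == "__main__":
    print(main_parser().parse_args())
```

Keeping parser construction separate from `parse_args()` is what lets the documentation build introspect the options without running the program.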
2 changes: 2 additions & 0 deletions doc/requirements.txt
@@ -2,6 +2,8 @@ sphinx>=4.0.2
recommonmark
sphinx_rtd_theme
sphinx_markdown_tables
sphinx-argparse
myst-parser
deepmodeling_sphinx
dargs>=0.3.1
.
4 changes: 0 additions & 4 deletions doc/run-mdata.rst

This file was deleted.
