Merge pull request #40 from njzjz/master
refactor documentation
njzjz committed Jun 5, 2021
2 parents 7fadb30 + 4463083 commit f6fa6ed
Showing 7 changed files with 149 additions and 143 deletions.
14 changes: 14 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,14 @@
## How to contribute

DPDispatcher welcomes everyone, whether an individual or an organization, to use it under the LGPL-3.0 License.

Contributions are welcome and greatly appreciated! Every little bit helps, and credit will always be given.

If you want to contribute to DPDispatcher, open an issue, submit a pull request, leave a comment on the GitHub discussion board, or contact the DeepModeling team.

Any form of improvement is welcome, for example:

- use, star or fork dpdispatcher
- improve the documents
- report or fix bugs
- request, discuss or implement features
154 changes: 13 additions & 141 deletions README.md
@@ -1,152 +1,24 @@
Original files:
[https://www.yuque.com/xingyeyongtantiao/dpdispatcher/rdydgb](https://www.yuque.com/xingyeyongtantiao/dpdispatcher/rdydgb)

developers' discussion (temporarily in Chinese):
[https://www.yuque.com/docs/share/08ab09f3-f84d-4ed3-b777-9e0c791963b6?#](https://www.yuque.com/docs/share/08ab09f3-f84d-4ed3-b777-9e0c791963b6?#)

# DPDispatcher

## Introduction

### Short introduction
DPDispatcher is a Python package that generates input scripts for HPC (High-Performance Computing) scheduler systems (Slurm/PBS/LSF/dpcloudserver), submits these scripts to the HPC system, and pokes the jobs until they finish.

DPDispatcher will monitor (poke) these jobs until they finish and download the result files (if the jobs are running on remote systems connected by SSH).

### The set of abstractions provided by DPDispatcher

`Task` class, which represents a command to be run on the batch job system, as well as the essential files needed by the command.

`Submission` class, which represents a collection of jobs defined on the HPC system.
There may also be common files to be uploaded by them.
DPDispatcher will create and submit these jobs when a `Submission` instance executes its `run_submission` method.
This method will poke until the jobs finish, then return.

`Job` class, used by the `Submission` class, which represents a job on the HPC system.
`Submission` will automatically generate each `Job`'s submission script for the HPC system from the `Task` and `Resources`.

`Resources` class, which represents the computing resources for each job within a `Submission`.

## How to contribute
DPDispatcher welcomes everyone, whether an individual or an organization, to use it under the LGPL-3.0 License.

Contributions are welcome and greatly appreciated! Every little bit helps, and credit will always be given.

If you want to contribute to DPDispatcher, open an issue, submit a pull request, leave a comment on the GitHub discussion board, or contact the DeepModeling team.

Any form of improvement is welcome, for example:

- use, star or fork dpdispatcher
- improve the documents
- report or fix bugs
- request, discuss or implement features

For more information, check the [documentation](https://dpdispatcher.readthedocs.io/).

## Installation

DPDispatcher is maintained by DeepModeling's developers, and other contributors are welcome.
DPDispatcher can be installed with `pip`:

```bash
pip install dpdispatcher
```

## Example

```python3
machine = Machine.load_from_json('machine.json')
resources = Resources.load_from_json('resources.json')

# Alternatively, load both from a single combined file:
# with open('compute.json', 'r') as f:
#     compute_dict = json.load(f)
# machine = Machine.load_from_dict(compute_dict['machine'])
# resources = Resources.load_from_dict(compute_dict['resources'])

task0 = Task.load_from_json('task.json')

task1 = Task(command='cat example.txt', task_work_path='dir1/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task2 = Task(command='cat example.txt', task_work_path='dir2/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task3 = Task(command='cat example.txt', task_work_path='dir3/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task4 = Task(command='cat example.txt', task_work_path='dir4/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')

task_list = [task0, task1, task2, task3, task4]

submission = Submission(work_base='lammps_md_300K_5GPa/',
    machine=machine,
    resources=resources,
    task_list=task_list,
    forward_common_files=['graph.pb'],
    backward_common_files=[]
)

# submission.register_task_list(task_list=task_list)

submission.run_submission(clean=False)
```

An example `Resources` for the GPU2080Ti queue:
```python3
resources = Resources(number_node=1,
    cpu_per_node=8,
    gpu_per_node=2,
    queue_name="GPU2080TI",
    group_size=12,
    custom_flags=[
        "#SBATCH --mem=32G",
        # "#SBATCH --account=deepmodeling",
        # "#SBATCH --cluster=gpucluster",
    ],
    strategy={'if_cuda_multi_devices': True},
    para_deg=3,
    source_list=["~/deepmd.env"],
)
```
machine.json:
```json
{
    "machine_type": "Slurm",
    "context_type": "SSHContext",
    "local_root": "/home/user123/workplace/22_new_project/",
    "remote_root": "~/dpdispatcher_work_dir/",
    "remote_profile": {
        "hostname": "39.106.xx.xxx",
        "username": "user1",
        "port": 22,
        "timeout": 10
    }
}
```

resources.json:
```json
{
    "number_node": 1,
    "cpu_per_node": 4,
    "gpu_per_node": 1,
    "queue_name": "GPUV100",
    "group_size": 5
}
```

task.json:
```json
{
    "command": "lmp -i input.lammps",
    "task_work_path": "bct-0/",
    "forward_files": [
        "conf.lmp",
        "input.lammps"
    ],
    "backward_files": [
        "log.lammps"
    ],
    "outlog": "log",
    "errlog": "err"
}
```

## Usage

See [Getting Started](https://dpdispatcher.readthedocs.io/en/latest/getting-started.html) for usage.

## Contributing

DPDispatcher is maintained by DeepModeling's developers, and other contributors are welcome.
See [Contributing Guide](CONTRIBUTING.md) to become a contributor! 🤓
4 changes: 2 additions & 2 deletions doc/conf.py
@@ -18,8 +18,8 @@
# -- Project information -----------------------------------------------------

project = 'DPDispatcher'
copyright = '2020, Deep Potential'
author = 'Deep Potential'
copyright = '2020, Deep Modeling'
author = 'Deep Modeling'


# -- General configuration ---------------------------------------------------
102 changes: 102 additions & 0 deletions doc/getting-started.md
@@ -0,0 +1,102 @@
# Getting Started

DPDispatcher provides the following classes:

- `Task` class, which represents a command to be run on the batch job system, as well as the essential files needed by the command.
- `Submission` class, which represents a collection of jobs defined on the HPC system.
  There may also be common files to be uploaded by them.
  DPDispatcher will create and submit these jobs when a `Submission` instance executes its `run_submission` method.
  This method will poke until the jobs finish, then return.
- `Job` class, used by the `Submission` class, which represents a job on the HPC system.
  `Submission` will automatically generate each `Job`'s submission script for the HPC system from the `Task` and `Resources`.
- `Resources` class, which represents the computing resources for each job within a `Submission`.

You can use DPDispatcher in a Python script to submit five tasks:

```python
from dpdispatcher import Machine, Resources, Task, Submission

machine = Machine.load_from_json('machine.json')
resources = Resources.load_from_json('resources.json')

task0 = Task.load_from_json('task.json')

task1 = Task(command='cat example.txt', task_work_path='dir1/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task2 = Task(command='cat example.txt', task_work_path='dir2/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task3 = Task(command='cat example.txt', task_work_path='dir3/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')
task4 = Task(command='cat example.txt', task_work_path='dir4/', forward_files=['example.txt'], backward_files=['out.txt'], outlog='out.txt')

task_list = [task0, task1, task2, task3, task4]

submission = Submission(work_base='lammps_md_300K_5GPa/',
    machine=machine,
    resources=resources,
    task_list=task_list,
    forward_common_files=['graph.pb'],
    backward_common_files=[]
)

submission.run_submission(clean=False)
```
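The two JSON files can also be kept together in one combined file and split on load; DPDispatcher's `Machine.load_from_dict` and `Resources.load_from_dict` serve this in the package API, while the combined-file name and layout here are illustrative. A stdlib sketch of the round trip:

```python
import json

# Illustrative combined layout: one compute.json holding both sections.
machine_dict = {"machine_type": "Slurm", "context_type": "SSHContext"}
resources_dict = {"number_node": 1, "cpu_per_node": 4, "group_size": 5}
compute = {"machine": machine_dict, "resources": resources_dict}

# Serialize and re-parse, as if writing and reading compute.json.
text = json.dumps(compute, indent=4)
loaded = json.loads(text)

# Both sections survive the round trip unchanged.
assert loaded["machine"] == machine_dict
assert loaded["resources"] == resources_dict

# With DPDispatcher installed, each section would then feed the package
# API shown above:
#   machine = Machine.load_from_dict(loaded["machine"])
#   resources = Resources.load_from_dict(loaded["resources"])
```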

where `machine.json` is
```json
{
    "machine_type": "Slurm",
    "context_type": "SSHContext",
    "local_root": "/home/user123/workplace/22_new_project/",
    "remote_root": "~/dpdispatcher_work_dir/",
    "remote_profile": {
        "hostname": "39.106.xx.xxx",
        "username": "user1",
        "port": 22,
        "timeout": 10
    }
}
```

`resources.json` is
```json
{
    "number_node": 1,
    "cpu_per_node": 4,
    "gpu_per_node": 1,
    "queue_name": "GPUV100",
    "group_size": 5
}
```

and `task.json` is
```json
{
    "command": "lmp -i input.lammps",
    "task_work_path": "bct-0/",
    "forward_files": [
        "conf.lmp",
        "input.lammps"
    ],
    "backward_files": [
        "log.lammps"
    ],
    "outlog": "log",
    "errlog": "err"
}
```
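The fields in `task.json` mirror the keyword arguments of the `Task` constructor used above. A quick stdlib sketch that reads such a file and flags unexpected fields — a hypothetical helper for illustration, not part of DPDispatcher:

```python
import json

# Field names taken from the Task examples in this guide; the checker
# itself is a hypothetical helper, not part of DPDispatcher.
REQUIRED = {"command", "task_work_path"}
OPTIONAL = {"forward_files", "backward_files", "outlog", "errlog"}

def load_task_dict(path):
    """Parse a task.json file and verify it only uses known fields."""
    with open(path) as f:
        task = json.load(f)
    missing = REQUIRED - task.keys()
    unknown = task.keys() - REQUIRED - OPTIONAL
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)}, unknown={sorted(unknown)}")
    return task
```

Running it on the example above would return the parsed dict, while a file that omits `command` or adds a stray key would raise `ValueError`.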

You may also submit multiple GPU jobs:
```python
resources = Resources(number_node=1,
    cpu_per_node=8,
    gpu_per_node=2,
    queue_name="GPU2080TI",
    group_size=12,
    custom_flags=[
        "#SBATCH --mem=32G",
    ],
    strategy={'if_cuda_multi_devices': True},
    para_deg=3,
    source_list=["~/deepmd.env"],
)
```
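The `group_size` parameter controls how many tasks are packed into each scheduler job. Assuming that reading, the grouping can be pictured with a minimal sketch — an illustration only, not DPDispatcher's actual implementation:

```python
def chunk_tasks(tasks, group_size):
    """Split a task list into per-job groups of at most group_size tasks."""
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]

# Five tasks with group_size=2 would map to three scheduler jobs.
jobs = chunk_tasks(["t0", "t1", "t2", "t3", "t4"], 2)
# jobs == [["t0", "t1"], ["t2", "t3"], ["t4"]]
```

With the `group_size=5` from `resources.json` above, the five tasks in the earlier example would all land in a single job.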

The details of parameters can be found in [Machine Parameters](machine).
7 changes: 7 additions & 0 deletions doc/index.rst
@@ -6,11 +6,18 @@
DPDispatcher's documentation
======================================

DPDispatcher is a Python package that generates input scripts for HPC (High-Performance Computing) scheduler systems (Slurm/PBS/LSF/dpcloudserver), submits these scripts to the HPC system, and pokes the jobs until they finish.

DPDispatcher will monitor (poke) these jobs until they finish and download the result files (if the jobs are running on remote systems connected by SSH).

.. toctree::
:maxdepth: 2
:caption: Contents:


install
getting-started
machine
api


8 changes: 8 additions & 0 deletions doc/install.md
@@ -0,0 +1,8 @@
Install DPDispatcher
====================

DPDispatcher can be installed with `pip`:

```bash
pip install dpdispatcher
```
3 changes: 3 additions & 0 deletions doc/machine.rst
@@ -0,0 +1,3 @@
Machine parameters
======================================
.. include:: machine-auto.rst
