add one sentence "#SBATCH -A TG-DMR160007" #451
tfcao888666 added the labels documentation (Improvements or additions to documentation) and enhancement (New feature or request) on Jul 1, 2021.
Hi Jinze,
Thank you for the response. I have changed it to:
"
{
"deepmd_path": "~/miniconda3/bin/dp",
"train_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"train_resources": {
"numb_node": 1,
"task_per_node":64,
"partition" : "compute",
"exclude_list" : [],
"source_list": [ "~/miniconda3/bin/activate" ],
"module_list": [ ],
* "custom_flags": ["TG-DMR160007"],*
"time_limit": "2:00:0",
"mem_limit": 32,
"_comment": "that's all"
},
"lmp_command": "~/miniconda3/bin/lmp",
"model_devi_group_size": 1,
"_comment": "model_devi on localhost",
"model_devi_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"_comment": " if use GPU, numb_nodes(nn) should always be 1 ",
"_comment": " if numb_nodes(nn) = 1 multi-threading rather than mpi is
assumed",
"model_devi_resources": {
"numb_node": 1,
"task_per_node":64,
"source_list": ["~/miniconda3/bin/activate" ],
"module_list": [ ],
"time_limit": "2:00:0",
"mem_limit": 32,
"partition" : "compute",
* "custom_flags": ["TG-DMR160007"],*
"_comment": "that's all"
},
"_comment": "fp on localhost ",
"fp_command": "mpirun -np 64 /home/tfcao/vasp_bin/regular/vasp",
"fp_group_size": 1,
"fp_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"fp_resources": {
"numb_node": 1,
"task_per_node":64,
"numb_gpu": 0,
"exclude_list" : [],
"source_list": [],
"module_list": [],
"with_mpi" : false,
"time_limit": "2:00:0",
"partition" : "compute",
"_comment": "that's all"
},
"_comment": " that's all "
}
"
It seems that it does not work. Could you have a look?
Thank you!
…On Thu, Jul 1, 2021 at 5:03 PM Jinzhe Zeng ***@***.***> wrote:
See #367 <#367>, and
custom_flags is provided in #368
<#368>.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#451 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AQHBQTANYPMXMKLZ4ZVQXC3TVT63LANCNFSM47VVMDOQ>
.
Hi Jinze,
It still does not work.
This is the final file:
"
#!/bin/bash -l
#SBATCH -N 1
#SBATCH --ntasks-per-node=64
#SBATCH -t 2:00:0
#SBATCH --partition=compute
cd sys-0002-0002-0006
test $? -ne 0 && exit 1
if [ ! -f tag_0_finished ] ;then
mpirun -np 64 /home/tfcao/vasp_bin/regular/vasp 1>> log 2>> err
if test $? -ne 0; then exit 1; else touch tag_0_finished; fi
fi
cd /expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini/8078abd8-d0c7-4e26-9cec-f5e6ea0f4420
test $? -ne 0 && exit 1
wait
touch 8078abd8-d0c7-4e26-9cec-f5e6ea0f4420_tag_finished
"
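For comparison, once custom_flags is picked up for this task, the generated header would be expected to gain the account line. This is a sketch of the expected header, not actual dpgen output:

```bash
#!/bin/bash -l
#SBATCH -N 1
#SBATCH --ntasks-per-node=64
#SBATCH -t 2:00:0
#SBATCH --partition=compute
#SBATCH -A TG-DMR160007
```

The header actually produced above is missing that last line, which is the symptom being debugged here.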
Here is the machine file.
"
"{
"deepmd_path": "~/miniconda3/bin/dp",
"train_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"train_resources": {
"numb_node": 1,
"task_per_node":64,
"partition" : "compute",
"exclude_list" : [],
"source_list": [ "~/miniconda3/bin/activate" ],
"module_list": [ ],
"custom_flags": ["-A TG-DMR160007"],
"time_limit": "2:00:0",
"mem_limit": 32,
"_comment": "that's all"
},
"lmp_command": "~/miniconda3/bin/lmp",
"model_devi_group_size": 1,
"_comment": "model_devi on localhost",
"model_devi_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"_comment": " if use GPU, numb_nodes(nn) should always be 1 ",
"_comment": " if numb_nodes(nn) = 1 multi-threading rather than mpi is
assumed",
"model_devi_resources": {
"numb_node": 1,
"task_per_node":64,
"source_list": ["~/miniconda3/bin/activate" ],
"module_list": [ ],
"time_limit": "2:00:0",
"mem_limit": 32,
"partition" : "compute",
"custom_flags": ["-A TG-DMR160007"],
"_comment": "that's all"
},
"_comment": "fp on localhost ",
"fp_command": "mpirun -np 64 /home/tfcao/vasp_bin/regular/vasp",
"fp_group_size": 1,
"fp_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"fp_resources": {
"numb_node": 1,
"task_per_node":64,
"numb_gpu": 0,
"exclude_list" : [],
"source_list": [],
"module_list": [],
"with_mpi" : false,
"time_limit": "2:00:0",
"partition" : "compute",
"_comment": "that's all"
},
"_comment": " that's all "
}
"
…On Thu, Jul 1, 2021 at 8:05 PM Jinzhe Zeng ***@***.***> wrote:
The correct one should be "-A TG-DMR160007" instead of "TG-DMR160007".
Hi Jinze,
I changed it:
"
{
"deepmd_path": "~/miniconda3/bin/dp",
"train_machine": {
"batch": "slurm",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"train_resources": {
"numb_node": 1,
"task_per_node":64,
"partition" : "compute",
"custom_flags": "-A TG-DMR160007",
"exclude_list" : [],
"source_list": [ "~/miniconda3/bin/activate" ],
"module_list": [ ],
"time_limit": "2:00:0",
"mem_limit": 32,
"_comment": "that's all"
},
"lmp_command": "~/miniconda3/bin/lmp",
"model_devi_group_size": 1,
"_comment": "model_devi on localhost",
"model_devi_machine": {
"batch": "slurm",
"custom_flags": "-A TG-DMR160007",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"_comment": " if use GPU, numb_nodes(nn) should always be 1 ",
"_comment": " if numb_nodes(nn) = 1 multi-threading rather than mpi is
assumed",
"model_devi_resources": {
"numb_node": 1,
"task_per_node":64,
"source_list": ["~/miniconda3/bin/activate" ],
"module_list": [ ],
"time_limit": "2:00:0",
"mem_limit": 32,
"partition" : "compute",
"custom_flags": "-A TG-DMR160007",
"_comment": "that's all"
},
"_comment": "fp on localhost ",
"fp_command": "mpirun -np 64 /home/tfcao/vasp_bin/regular/vasp",
"fp_group_size": 1,
"fp_machine": {
"batch": "slurm",
"custom_flags": "-A TG-DMR160007",
"work_path" :
"/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"fp_resources": {
"numb_node": 1,
"task_per_node":64,
"numb_gpu": 0,
"exclude_list" : [],
"source_list": [],
"module_list": [],
"with_mpi" : false,
"time_limit": "2:00:0",
"partition" : "compute",
"_comment": "that's all"
},
"_comment": " that's all "
}
"
And the code: " temp_exclude = ""
for ii in res['exclude_list'] :
temp_exclude += ii
temp_exclude += ","
temp_exclude = temp_exclude[:-1]
ret += '#SBATCH --exclude=%s \n' % temp_exclude
for flag in res.get('custom_flags', []):
ret += '#SBATCH %s \n' % flag
ret += "\n"
"
Could you help me check again? Thank you!
Best regards,
Tengfei
…On Thu, Jul 1, 2021 at 8:49 PM Jinzhe Zeng ***@***.***> wrote:
You added it to model_devi_resources but you are running a fp task?
Hi Jinzhe,
I have figured it out. Thank you!
…On Thu, Jul 1, 2021 at 10:45 PM Jinzhe Zeng ***@***.***> wrote:
I don't understand your change... In #451 (comment)
<#451 (comment)>,
it seems that you only added it to model_devi_resources (but not
fp_resources). However, you are running a fp task, right? You should add custom_flags:
["-A TG-DMR160007"] to fp_resources.
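A minimal sketch of the suggested fix, based on the fp_resources block already shown in this thread (only the custom_flags line is new):

```json
"fp_resources": {
    "numb_node": 1,
    "task_per_node": 64,
    "numb_gpu": 0,
    "custom_flags": ["-A TG-DMR160007"],
    "with_mpi": false,
    "time_limit": "2:00:0",
    "partition": "compute",
    "_comment": "that's all"
}
```

The key point is that each stage (train, model_devi, fp) reads its own resources block, so the flag must be present in the block for the stage that is actually running.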
Hi All,
I want to add the account line "#SBATCH -A TG-DMR160007" to the .sub file so that I can submit jobs. Could you tell me how to add it in the machine file? Here is my machine file. Thank you!
{
"deepmd_path": "
/miniconda3/bin/dp",/miniconda3/bin/activate" ],"train_machine": {
"batch": "slurm",
"work_path" : "/expanse/lustre/scratch/tfcao/temp_project/batis3-dp/ini",
"_comment" : "that's all"
},
"train_resources": {
"numb_node": 1,
"task_per_node":64,
"partition" : "compute",
"exclude_list" : [],
"source_list": [ "
"module_list": [ ],
"time_limit": "2:00:0",
"mem_limit": 32,
"_comment": "that's all"
},
}