How to run a flow with specific hosts? #157
Comments
Hi, a queue system is not mandatory. For further details see here. In this case, you need to specify the maximum number of cores that can be used by the scheduler. One can use the pre_run:
- ulimit -s unlimited
- ...
Could you provide an example of the script you use to run Abinit in parallel on your cluster?
Thanks for the prompt reply. I have been running successfully on a single node using abipy, but am interested in running on various hosts simultaneously. I normally run ABINIT in the following manner:
where node[0-2] are the hosts over which ABINIT will be parallelized. I could also create a file with the hostnames, and execute with
where hosts.file would contain
Thanks a lot for the help.
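The host list described above (a range such as node[0-2], or a file with one host per line) could be generated with a small helper. The following is only a sketch: the function names are hypothetical, the `slots=N` suffix follows Open MPI's hostfile syntax, and other MPI implementations may expect a different format. The number of slots per host is an assumption and must match the real hardware.

```python
from __future__ import annotations

import re
from pathlib import Path


def expand_hosts(pattern: str) -> list[str]:
    """Expand a pattern such as 'node[0-2]' into ['node0', 'node1', 'node2'].

    A pattern without a bracketed range is returned unchanged as a single host.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, first, last, suffix = match.groups()
    return [f"{prefix}{i}{suffix}" for i in range(int(first), int(last) + 1)]


def write_hostfile(path: Path, hosts: list[str], slots: int) -> None:
    """Write one 'host slots=N' line per host (Open MPI style, assumed here)."""
    lines = [f"{host} slots={slots}" for host in hosts]
    path.write_text("\n".join(lines) + "\n")
```

For example, `write_hostfile(Path("hosts.file"), expand_hosts("node[0-2]"), slots=4)` would write three lines, one per node, each declaring four slots.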
Ok, I see the problem. I didn't consider the hosts.file syntax but I think it's possible to support it.
So AbiPy will select the "optimal" configuration and will write the associated submission script. It's possible to disable this feature and one can also enforce a particular number of CPUs with:
See examples in the abipy/benchmarks directory. This approach, however, assumes pre-generated input files whose parallel variables (npfft, npband, npkpt) are compatible with the number of MPI ranks requested by the user (this is very important
I can add support for hostfiles once I know the total number of CPUs and the number of procs
The challenge is how to optimise the resources when multiple calculations are executed concurrently, i.e. one can have 2 calculations requiring 72 procs each and these calculations
This means that the AbiPy scheduler should keep a record of the nodes that have
Could you give more details about your typical workflows so that I can have a better view
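The bookkeeping problem mentioned above (remembering which nodes are already occupied when several calculations run at once) can be illustrated with a toy allocator. This is a minimal sketch and not AbiPy's implementation: the `NodePool` class, its methods, and the 36 cores per node used in the usage note are all hypothetical; only the two 72-proc calculations come from the discussion.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NodePool:
    """Track free cores per node (hypothetical helper, not part of AbiPy)."""

    free: dict[str, int]
    jobs: dict[str, dict[str, int]] = field(default_factory=dict)

    def allocate(self, job_id: str, ncores: int) -> Optional[dict[str, int]]:
        """Reserve ncores for job_id, filling nodes in order.

        Returns the {node: cores} placement, or None if the job does not fit.
        """
        if ncores > sum(self.free.values()):
            return None
        placement: dict[str, int] = {}
        remaining = ncores
        for node, avail in self.free.items():
            if remaining == 0:
                break
            take = min(avail, remaining)
            if take:
                placement[node] = take
                remaining -= take
        # Commit the reservation only after a full placement has been found.
        for node, take in placement.items():
            self.free[node] -= take
        self.jobs[job_id] = placement
        return placement

    def release(self, job_id: str) -> None:
        """Return the cores held by job_id to the pool."""
        for node, take in self.jobs.pop(job_id).items():
            self.free[node] += take
```

With four nodes of 36 cores each, two calls to `allocate` for 72 cores would place the first calculation on the first two nodes and the second on the remaining two; a third request is refused until `release` is called. Each placement could then be turned into a per-job hostfile.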
Sure thing. We specialize in first-principles calculations of optical properties. Typically, we use ABINIT to calculate the electronic density and energies, which we then use to calculate various susceptibilities for any given material. We typically work with DFT-LDA, but we also have some command of GW and BSE calculations, particularly in conjunction with the DP/EXC code. I am currently trying to audit
Anyway, that's the gist of it. As for this particular issue of using a specific set of nodes: it occurs to me that the user could add the hosts to the
manager.yml already provides an option (
If I use the following manager.yml:

    qadapters:
        # List of qadapters objects
        - priority: 1
          queue:
            qtype: shell
            qname: localhost
          job:
            mpi_runner: mpirun
            mpi_runner_options: "--hostfile ${HOME}/my_hosts"
            # source a script to setup the environment.
            #pre_run: "source ~/env.sh"
          limits:
            timelimit: 1:00:00
            max_cores: 2
          hardware:
            num_nodes: 1
            sockets_per_node: 1
            cores_per_socket: 2
            mem_per_node: 4 Gb

and I run one of the examples in

    #!/bin/bash
    cd /Users/gmatteo/git_repos/abipy/abipy/examples/flows/flow_si_ebands/w0/t0
    # OpenMp Environment
    export OMP_NUM_THREADS=1
    mpirun --hostfile ${HOME}/my_hosts -n 1 abinit < /Users/gmatteo/git_repos/abipy/abipy/examples/flows/flow_si_ebands/w0/t0/run.files > /Users/gmatteo/git_repos/abipy/abipy/examples/flows/flow_si_ebands/w0/t0/run.log 2> /Users/gmatteo/git_repos/abipy/abipy/examples/flows/flow_si_ebands/w0/t0/run.err

This solution should work and does not require any change in the present implementation. Remember to set:

    # Limit on the number of jobs that can be present in the queue. (DEFAULT: 200)
    max_njobs_inqueue: 2
    # Maximum number of cores that can be used by the scheduler.
    max_ncores_used: 4

in your scheduler.yml to avoid oversubscribing nodes.
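To guard against oversubscription, the limits above can be compared with the number of slots declared in the hostfile before launching a flow. The sketch below is an illustration only: `total_slots` and `check_limits` are hypothetical helpers, the hostfile is assumed to use Open MPI style `slots=N` entries, and `default_slots` is a choice made for this sketch that may differ from what a given MPI implementation assumes for hosts listed without a slot count.

```python
from __future__ import annotations


def total_slots(hostfile_text: str, default_slots: int = 1) -> int:
    """Sum the 'slots=N' values found in Open MPI style hostfile text.

    Blank lines and '#' comments are ignored; hosts without an explicit
    'slots=' entry count as default_slots (an assumption of this sketch).
    """
    total = 0
    for raw in hostfile_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        slots = default_slots
        for token in line.split()[1:]:
            if token.startswith("slots="):
                slots = int(token.split("=", 1)[1])
        total += slots
    return total


def check_limits(nslots: int, max_cores: int, max_ncores_used: int) -> list[str]:
    """Return a list of warnings; an empty list means the limits are consistent."""
    problems = []
    if max_ncores_used > nslots:
        problems.append(
            f"max_ncores_used={max_ncores_used} exceeds the {nslots} slots in the hostfile"
        )
    if max_cores > max_ncores_used:
        problems.append(
            f"max_cores={max_cores} exceeds max_ncores_used={max_ncores_used}"
        )
    return problems
```

With a hostfile declaring two hosts of two slots each, `check_limits(total_slots(text), max_cores=2, max_ncores_used=4)` returns an empty list, while raising `max_ncores_used` to 8 produces a warning.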
I think this solution will work great for me, and I will try it out during the week. Thanks!
Hi,
I have just started experimenting with abipy, which I plan to use to automate and optimize production calculations. I normally run ABINIT on a cluster without a queue system, and I would like to use the same method for running with abipy. Is there any way to specify the hosts I want to use with MPI without relying on a queue system?
Thanks,
Sean