How to edit the settings of the job manager to do sequential submission of the job queue? #3

Closed
bradraj opened this issue Jul 14, 2018 · 4 comments

bradraj commented Jul 14, 2018

AbiPy submits the jobs simultaneously, and this crashes the system. When a job's input depends on the previous job, the jobs run sequentially. But when the jobs are independent, they are all submitted at once, and when the system's RAM is overloaded, the system crashes.

Is there any way to have the job scheduler submit the jobs sequentially, one by one?

gmatteo (Member) commented Jul 16, 2018

Add the following options to scheduler.yml:

# Limit on the number of jobs that can be present in the queue. (DEFAULT: 200)
max_njobs_inqueue: 2

# Maximum number of cores that can be used by the scheduler.
max_ncores_used: 4

To get the list of options supported in scheduler.yml and manager.yml, use:

abidoc.py scheduler
abidoc.py manager

bradraj (Author) commented Jul 17, 2018

I have set the following options in scheduler.yml:

max_njobs_inqueue: 1
max_nlaunches: 1
# number of seconds to wait
seconds: 5

One job is submitted at a time. But if a job takes more than 5 seconds, a new job is submitted while the previous one is still running. Regardless of whether the old job is still running, a new job is submitted every 5 seconds. This happens when the jobs are not interdependent.

I don't know how long each job will take, so I can't give a fixed wait time. Is there a way to fix this?

Is there a way to make each job wait until the previous one has completed?

gmatteo (Member) commented Jul 17, 2018

Try setting max_ncores_used to the total number of physical CPUs available on your machine.
This adds an additional constraint to the scheduler.
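
One way to look up the number of physical cores for max_ncores_used is a short Python check. This is a minimal sketch and not part of AbiPy: it assumes the optional third-party psutil package for the physical count and falls back to os.cpu_count(), which reports logical CPUs (hyper-threads included). The helper name physical_core_count is hypothetical.

```python
import os


def physical_core_count():
    """Return the number of physical CPU cores, or logical CPUs as a fallback."""
    try:
        import psutil  # optional third-party package (assumption: may not be installed)

        n = psutil.cpu_count(logical=False)
        if n:  # psutil returns None when the count cannot be determined
            return n
    except ImportError:
        pass
    # os.cpu_count() counts logical CPUs and may itself return None.
    return os.cpu_count() or 1


if __name__ == "__main__":
    # Print a line that can be pasted into scheduler.yml.
    print(f"max_ncores_used: {physical_core_count()}")
```

If the fallback is used on a machine with hyper-threading, the printed value may be larger than the number of physical cores, so it is worth checking it against the hardware specification.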

If the problem persists, send me your manager.yml and the script used to run the calculations.

bradraj (Author) commented Jul 18, 2018

Adding max_ncores_used did the trick. It limited the number of jobs submitted at once. It would be useful if you could add a max_memory_used option to the scheduler in the future. Thanks a ton.
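
A toy model can illustrate why a core cap serializes independent jobs while a wait time alone does not. This is not AbiPy's scheduler code. It only assumes, from the behaviour described in this thread, that the scheduler wakes up every `seconds`, launches at most max_nlaunches pending jobs per wake-up, and, when max_ncores_used is set, skips any launch that would push the cores in use over the cap. The function simulate and its parameters job_durations and ncores_per_job are hypothetical.

```python
def simulate(job_durations, ncores_per_job, seconds, max_nlaunches,
             max_ncores_used=None):
    """Return the peak number of jobs running at once in a toy scheduler."""
    if max_ncores_used is not None and max_ncores_used < ncores_per_job:
        raise ValueError("cap is smaller than one job; nothing could ever run")

    pending = list(job_durations)  # run time of each job not yet launched
    running = []                   # finish times of the jobs in flight
    now = 0
    peak = 0
    while pending or running:
        # Drop jobs that have finished by this wake-up.
        running = [t for t in running if t > now]
        launched = 0
        while pending and launched < max_nlaunches:
            cores_in_use = len(running) * ncores_per_job
            if (max_ncores_used is not None
                    and cores_in_use + ncores_per_job > max_ncores_used):
                break  # cap reached: wait for a running job to finish
            running.append(now + pending.pop(0))
            launched += 1
        peak = max(peak, len(running))
        now += seconds  # sleep until the next wake-up
    return peak


if __name__ == "__main__":
    jobs = [60, 60, 60, 60]  # four independent jobs of 60 s each, 4 cores each
    print(simulate(jobs, ncores_per_job=4, seconds=5, max_nlaunches=1))  # 4
    print(simulate(jobs, ncores_per_job=4, seconds=5, max_nlaunches=1,
                   max_ncores_used=4))                                   # 1
```

In this model the wait time only spaces out the launches, so long-running jobs still pile up. The cap makes each launch depend on the cores freed by finished jobs, which does not require knowing the run time in advance.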

bradraj closed this as completed Jul 18, 2018