Add argument to specify a slurm reservation #1145

Merged
merged 5 commits into from Feb 23, 2021

akremin
Member

@akremin akremin commented Feb 23, 2021

This adds two new flags to desi_run_night: --ignore-existing and --reservation. By default, desi_run_night now skips a night that has already been submitted for the prod, which prevents overwriting files and rerunning slurm jobs. If a night needs to be resubmitted on purpose, setting --ignore-existing bypasses that check.

The main point of the PR is --reservation. This PR also adds the ability to set the reservation in the yaml file for desi_run_prod, and it propagates reservation down through multiple functions to the point where sbatch is actually called.
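As a rough illustration of what propagating the reservation down to the sbatch call can look like, here is a minimal sketch. The function names `build_sbatch_command` and `commands_for_night` are hypothetical and are not taken from this PR; only the `--reservation` and `--parsable` options of sbatch itself are real. The sketch only builds the command lists and does not submit anything.

```python
def build_sbatch_command(script_path, reservation=None):
    """Build the sbatch command for one job script (hypothetical helper).

    The --reservation option is only added when a reservation name is given,
    so the default behavior is unchanged.
    """
    cmd = ["sbatch", "--parsable"]
    if reservation is not None:
        cmd.append(f"--reservation={reservation}")
    cmd.append(script_path)
    return cmd


def commands_for_night(script_paths, reservation=None):
    """Pass the same reservation down to every job of a night (hypothetical)."""
    return [build_sbatch_command(path, reservation=reservation)
            for path in script_paths]


if __name__ == "__main__":
    # Placeholder script names, for illustration only.
    for cmd in commands_for_night(["arc.slurm", "flat.slurm"],
                                  reservation="cascades"):
        print(" ".join(cmd))
```

Keeping `reservation=None` as the default at every level means callers that do not care about reservations need no changes.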

I tested this on two separate nights with:
desi_run_night -n 20210220 --reservation=cascades
desi_run_night -n 20210221 --reservation=cascades

Below is a screenshot of sqs showing that the jobs were submitted to the reservation, including arcs, flats, joint jobs of all varieties, and both types of science jobs.

[image: screenshot of sqs output]

@akremin akremin requested a review from sbailey February 23, 2021 22:25
@sbailey
Contributor

sbailey commented Feb 23, 2021

+1 for --reservation

For --ignore-existing, it wasn't immediately obvious to me what the sign of that was: "ignore" in the sense of "proceed the same whether or not they previously existed (i.e. overwrite them)", or "ignore" in the sense of "skip nights that previously existed and do nothing for them"? I think the former? If so, consider --overwrite-existing. If the latter, consider --skip-existing.

Also for clarification: does --ignore-existing work at the level of a night, or at the level of individual jobs? e.g. in an attempt to keep arcs/flats but redo sciences or a particular tile for a night by editing the processing table and re-launching?

@akremin
Member Author

akremin commented Feb 23, 2021

Good suggestion. I meant that if the flag is explicitly set, you would proceed even if the slurm files were previously generated or the processing table already exists (it checks both, with an informative error message in either case).

I'll change it to be --overwrite-existing. The reason I didn't use overwrite originally is that I didn't want people to think this would allow them to overwrite existing data, which this can't do. But it does overwrite the slurm files and processing table, so it's not totally misleading.

It works at the level of a night. It is a simple sanity check that I wanted in order to avoid re-submitting a night that had already been submitted. I considered the situation you describe, where cals and then sciences are submitted at two different times. In that case, you would need to include the flag. Not a big issue, in my opinion.
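For readers who want a concrete picture of the night-level check described above, here is a minimal sketch. The function name `existing_submission_reason`, its parameters, and the messages are hypothetical and not copied from the PR; the only behavior taken from the discussion is that both the processing table and the previously generated slurm files are checked, and that the flag bypasses the check.

```python
import os


def existing_submission_reason(proctable_path, script_dir,
                               overwrite_existing=False):
    """Return a message if the night looks already submitted, else None.

    Hypothetical sketch: checks for an existing processing table and for
    previously generated slurm scripts, unless the override flag is set.
    """
    if overwrite_existing:
        # Explicit override: proceed regardless of what exists on disk.
        return None
    if os.path.exists(proctable_path):
        return f"processing table already exists: {proctable_path}"
    if os.path.isdir(script_dir) and os.listdir(script_dir):
        return f"slurm scripts already exist in: {script_dir}"
    return None
```

A caller would print the returned message and skip the night when it is not None, and submit otherwise. Because the check is per night, a second submission for the same night (for example sciences after cals) would need the override flag.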

@sbailey sbailey merged commit 8ff82d1 into master Feb 23, 2021
@sbailey sbailey deleted the reservations branch February 23, 2021 22:54

2 participants