Check file outputs before submitting a spectro pipeline job #1217
…ated the low level functions needed for that
Looks good! Features sound great. For now I will trust your testing + any mop up of issues we discover when running Denali. Merging now. Note: this will almost certainly create merge conflicts for the knl branch in PR #1215, so I will merge this first and rebase that branch and resubmit it.
Merged
This pull request adds a new function, `check_for_outputs_on_disk`, that is run within the pipeline before creating or submitting a job to the queue during the reduction of spectroscopic data. The motivation is that we may submit jobs that do nothing more than check whether outputs exist for several sub-tasks and then exit, filling the queue for no reason. We may also submit large 5-node jobs to process 30 cameras when only 2 still need to be processed. This function does that checking prior to submission and only requests resources and processing for the cameras that still need to be processed.

If all output files are present, then by default the job is not submitted. If some files are present, by default a smaller job is submitted to process only the missing data. The expected files are determined using the PROCCAMWORD, which specifies which cameras should be processed, and the type of job being submitted. The file names are generated using `desispec.io.findfile`.

`findfile` was updated slightly here in the hopes of updating the redshift formats. Those have since become obsolete again in PR #1192. That doesn't use `findfile`, however, so I will leave this as-is and let a future PR bring everything back in line. The redshift features included in this code are placeholders for the next PR that will integrate redshift fitting into the nightly pipeline manager `desi_daily_proc_manager`, so the incorrect naming doesn't impact the code.

I have added two command-line arguments
`--dont-check-job-outputs` and `--dont-resubmit-partial-jobs` to both the nightly processing script `desi_daily_proc_manager` and the re-run script `desi_run_night`. The default (without either flag) is to check for files on disk. If all files exist, the job is skipped and not submitted. If some exist, a smaller job is submitted to process only the missing cameras. If `--dont-resubmit-partial-jobs` is set and the other is not, the pipeline will skip jobs whose outputs all exist but will otherwise submit the full job, even if outputs for some cameras already exist. If `--dont-check-job-outputs` is set, no checking is done and the jobs are submitted to the queue even if the outputs exist. This is analogous to what was done prior to this addition.

I tested numerous scenarios to ensure that it does the correct thing in various circumstances. I removed just the cals from a night, removed just
`cframe-*`, removed `cframe-*` and `stdstar-*` files, removed some tiles but not others, and so on. In addition to the calls given explicitly in the script below, I also tested with and without the flags for both command-line scripts.

When it is run, it provides useful context-specific log messages to tell you what action it took (if any):
All the tests were performed using a test prod setup with the following script:
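
As a side illustration of the behavior described in this PR (not the test script referred to above, and not the desispec implementation), the skip / partial / full decision can be sketched as follows. All names here (`Action`, `decide_submission`, the simplified file names) are hypothetical, and the on-disk check is replaced by an in-memory set of paths so the example is self-contained.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the three outcomes described in the PR."""
    SKIP = "skip"        # all outputs exist: do not submit
    PARTIAL = "partial"  # some outputs exist: submit a smaller job
    FULL = "full"        # submit the job for every camera


def decide_submission(expected, existing, check_outputs=True, resubmit_partial=True):
    """Decide what to submit for one job.

    expected: dict mapping camera name -> list of expected output paths
    existing: set of paths assumed to be present (stand-in for a disk check)
    check_outputs: False mimics --dont-check-job-outputs
    resubmit_partial: False mimics --dont-resubmit-partial-jobs
    Returns (Action, list of cameras to process).
    """
    all_cams = sorted(expected)

    # No checking requested: always submit the full job.
    if not check_outputs:
        return Action.FULL, all_cams

    # A camera still needs processing if any of its outputs is missing.
    missing = [cam for cam in all_cams
               if not all(path in existing for path in expected[cam])]

    if not missing:
        return Action.SKIP, []
    if resubmit_partial and len(missing) < len(all_cams):
        return Action.PARTIAL, missing
    return Action.FULL, all_cams


# Simplified, made-up file names for three cameras.
expected = {cam: [f"cframe-{cam}.fits"] for cam in ("b0", "r0", "z0")}
existing = {"cframe-b0.fits", "cframe-r0.fits"}

print(decide_submission(expected, existing))
print(decide_submission(expected, existing, resubmit_partial=False))
print(decide_submission(expected, existing, check_outputs=False))
```

With `check_outputs=False` the sketch always returns the full job, mirroring `--dont-check-job-outputs`; with `resubmit_partial=False` it skips only when every output exists, mirroring `--dont-resubmit-partial-jobs`.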