Improve Pipeline Metadata Handling and Implement QA Cuts #1304
Merged
This PR focuses on how the pipeline reads and interprets the raw data files to determine what to process for each exposure, and how, before it is submitted for processing.
Changes in existing logic

- Changed `load_table` so that tables with older datamodels can now be loaded.
- The filesize of the `desi-*.fit.fz` file is now used instead of the checksum.
- `desi-*.fit.fz` files are used as the primary source of header information, and the `request` file is only used to verify values if available (so dropping a dependency on the `request*.json` file).
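The changes above might look roughly like the following sketch. All names, defaults, and the file-size comparison are assumptions on my part, not desispec's actual implementation; tables and FITS/JSON headers are represented as plain dicts to keep the sketch self-contained:

```python
import os

# Columns expected by the current datamodel, with defaults used when an
# older table lacks them (column names and defaults are made up here).
CURRENT_DATAMODEL = {"OBSTYPE": "unknown", "COMMENTS": ""}

def load_table(columns):
    """Pad a loaded table (dict of column -> list of values) so that
    tables written with older datamodels still load."""
    nrows = len(next(iter(columns.values()), []))
    for col, default in CURRENT_DATAMODEL.items():
        columns.setdefault(col, [default] * nrows)
    return columns

def raw_data_complete(path, expected_size):
    """Completeness check via file size rather than a (slower) checksum."""
    return os.path.exists(path) and os.path.getsize(path) == expected_size

def exposure_header(raw_header, request=None):
    """Treat the raw desi-*.fit.fz header as the primary metadata source;
    request*.json content, when available, is only used for verification
    rather than being a hard dependency."""
    header = dict(raw_header)
    mismatches = {}
    if request is not None:
        for key, value in request.items():
            if key in header and header[key] != value:
                mismatches[key] = (header[key], value)
    return header, mismatches
```

The key design point is that `exposure_header` never fails when the request file is absent; a missing or disagreeing request only produces a record of mismatches for later inspection.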
Additions

- New keywords, etc., are now read from the `*json` file (falling back to the fiberassign file if missing, with defaults set if neither has the appropriate keywords).
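The fallback chain for keywords could be sketched as below (the function name and arguments are hypothetical, not desispec's API):

```python
def get_keyword(key, raw_header, fiberassign_header=None, defaults=None):
    """Look a keyword up in the raw-data header first, then in the
    fiberassign header, then fall back to a default value (None if no
    default is known). Hypothetical sketch of the fallback chain."""
    for source in (raw_header, fiberassign_header or {}):
        if key in source:
            return source[key]
    return (defaults or {}).get(key)
```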
Tests

I tested this by running on all nights from December 14th, 2020 through June 10th, 2021 using the `create_exposure_table` script. I also tested using the daily processing code:
`desi_daily_proc_manager --override-night=${NIGHT} --dry-run-level=1`
Both worked, switching the processing state for the rare exposures that fail the new cuts. Only main-survey tiles were flagged by the cuts (as designed). The metadata is found in the files, and defaults fill in gaps appropriately.

Comparing the old and new exposure tables showed no differences in existing columns, except for flagged data, where the relevant columns were populated with information on why the cut was applied.
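The flag-rather-than-drop behavior described above can be sketched as follows. The column names, thresholds, and the main-survey restriction are illustrative assumptions, not desispec's real cut criteria:

```python
def apply_qa_cuts(row):
    """Flag (rather than delete) an exposure that fails QA cuts, recording
    why each cut fired. Columns and thresholds here are hypothetical."""
    reasons = []
    if row.get("SURVEY") == "main":  # only main-survey tiles are cut
        if row.get("EXPTIME", 0.0) < 60.0:
            reasons.append("EXPTIME below 60s")
        if row.get("TRANSPARENCY", 1.0) < 0.5:
            reasons.append("low transparency")
    if reasons:
        # Populate the comment column with the reason(s) for the cut, so
        # old and new tables differ only for flagged exposures.
        row["COMMENTS"] = "; ".join(reasons)
    return row
```

Keeping the row and writing the reason into a comment column is what makes the old-vs-new table comparison clean: untouched exposures are byte-identical, and flagged ones are self-documenting.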