
Improve Pipeline Metadata Handling and Implement QA Cuts #1304

Merged: 17 commits merged into master on Jun 12, 2021

Conversation

@akremin (Member) commented Jun 12, 2021

This focuses on how the pipeline reads and interprets the raw data files in order to determine whether and how to process each exposure before it is submitted for processing.

Changes in existing logic

  • Fixes bug in load_table so that tables with older datamodels can now be loaded.
  • New exposures are identified by the existence of the raw desi-*.fits.fz file instead of by a checksum comparison.
  • The fiberassign and desi-*.fits.fz files are now the primary source of header information; the request file is only used for verification when available (dropping the dependency on the request*.json file).
  • The EXPTIME cut is now less than 60 s instead of 59 s. (Verified that this change wouldn't impact any previous decisions on an SV or main survey exposure that was already observed.)
  • Reorders the columns of the "exposure table" and adds new columns, but doesn't remove any existing columns.
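The first two changes above can be sketched as follows. This is a minimal illustration, not the actual desispec implementation; the directory layout and helper names are assumptions:

```python
import os.path

# Hypothetical raw-data root; the real pipeline reads $DESI_SPECTRO_DATA.
RAWDATA_DIR = "/path/to/raw"

def is_new_exposure(night, expid, rawdata_dir=RAWDATA_DIR):
    """An exposure counts as new if its raw desi-*.fits.fz file exists
    on disk, rather than by comparing checksums."""
    rawfile = os.path.join(rawdata_dir, str(night),
                           f"{expid:08d}", f"desi-{expid:08d}.fits.fz")
    return os.path.isfile(rawfile)

def passes_exptime_cut(exptime):
    """Exposures shorter than 60 s are cut (the old threshold was 59 s)."""
    return exptime >= 60.0
```

For example, an exposure with EXPTIME of 59.5 s would now be cut, whereas the old 59 s threshold would have kept it.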

Additions

  • Checks whether an "EXTRA" HDU is present in the fiberassign file and flags the exposure as a dither if it is.
  • Uses ETC information from the etc*.json file, falling back to the fiberassign file if it is missing, with defaults set if neither has the appropriate keywords.
  • Adds some useful etc and fiberassign keywords to the exposure table for reference and to understand automated decision making.
  • Adds one derived parameter to the exposure table: SPEED = (EFFTIME_ETC / EXPTIME) * (EBVFAC*AIRFAC)^2
    • I generally try to avoid putting derived quantities in the table, but this seemed important to store for understanding the speed cuts.
  • For main survey tiles, it uses the ETC information to make cuts on survey speed and effective time, flagging exposures where either:
    • EFFTIME_ETC < 0.05*GOALTIME, or
    • SPEED < 0.5*PROGRAM_SPEED_THRESHOLD
    • where PROGRAM_SPEED_THRESHOLD is 1/2.5 for dark and 1/6 for bright.
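The SPEED definition and the two cuts above can be written out as a short sketch. This is illustrative only (the function names are hypothetical, not desispec API):

```python
def survey_speed(efftime_etc, exptime, ebvfac, airfac):
    """Derived quantity stored in the exposure table:
    SPEED = (EFFTIME_ETC / EXPTIME) * (EBVFAC * AIRFAC)**2."""
    return (efftime_etc / exptime) * (ebvfac * airfac) ** 2

# Per-program speed thresholds quoted in the PR description.
PROGRAM_SPEED_THRESHOLD = {"dark": 1 / 2.5, "bright": 1 / 6}

def fails_qa_cuts(efftime_etc, goaltime, speed, program):
    """A main-survey exposure is flagged if either cut trips."""
    too_little_efftime = efftime_etc < 0.05 * goaltime
    too_slow = speed < 0.5 * PROGRAM_SPEED_THRESHOLD[program]
    return too_little_efftime or too_slow
```

For instance, a dark-program exposure with SPEED below 0.5 * (1/2.5) = 0.2 would be flagged even if its EFFTIME_ETC cleared the 5% GOALTIME floor.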

Tests

I tested this by running the create_exposure_table script on all nights from December 14, 2020 through June 10, 2021.

I also tested using the daily processing code desi_daily_proc_manager --override-night=${NIGHT} --dry-run-level=1

Both runs worked, switching the processing state for the rare exposures that fail the new cuts. Only main survey tiles were flagged by the cuts (as designed). The metadata is found in the files, and defaults fill in gaps appropriately.

Comparing the old and new exposure tables showed no differences in existing columns, except for flagged exposures, where the relevant columns were populated with information on why the cut was applied.

@coveralls (Coverage Status)

Coverage decreased (-0.1%) to 27.186% when pulling 41bffa3 on pipe_metadata_and_qacuts into e7da995 on master.

@akremin akremin merged commit 5bf5742 into master Jun 12, 2021
@akremin akremin deleted the pipe_metadata_and_qacuts branch June 12, 2021 00:15