
Improve Pipeline Metadata Handling and Implement QA Cuts #1304

Merged: 17 commits merged into master on Jun 12, 2021

Conversation

@akremin (Member) commented Jun 12, 2021

This focuses on how the pipeline reads and interprets the raw data files in order to determine whether and how to process each exposure before it is submitted for processing.

Changes in existing logic

  • Fixes bug in load_table so that tables with older datamodels can now be loaded.
  • New exposures are identified by the existence of the raw desi-*.fits.fz file instead of by a checksum comparison.
  • The fiberassign and desi-*.fits.fz files are now the primary source of header information; the request file is only used for verification when available (dropping the dependency on the request*.json file).
  • The EXPTIME cut is now less than 60 s instead of 59 s. (Verified that this change wouldn't impact any previous decisions on an SV or main survey exposure that was already observed.)
  • Reorders the columns of the "exposure table" and adds new columns, but doesn't remove any existing columns.
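The first two changes above can be sketched as follows. This is a minimal illustration, not the actual desispec implementation; the directory layout and helper names are assumptions:

```python
import os.path

# Hypothetical raw-data root; the real pipeline reads $DESI_SPECTRO_DATA.
RAWDATA_DIR = "/path/to/raw"

def is_new_exposure(night, expid, rawdata_dir=RAWDATA_DIR):
    """An exposure counts as new if its raw desi-*.fits.fz file exists
    on disk, rather than by comparing checksums."""
    rawfile = os.path.join(rawdata_dir, str(night),
                           f"{expid:08d}", f"desi-{expid:08d}.fits.fz")
    return os.path.isfile(rawfile)

def passes_exptime_cut(exptime):
    """Exposures shorter than 60 s are cut (the old threshold was 59 s)."""
    return exptime >= 60.0
```

For example, an exposure with EXPTIME of 59.5 s would now be cut, whereas the old 59 s threshold would have kept it.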

Additions

  • Checks whether an "EXTRA" HDU is present in the fiberassign file and flags the exposure as a dither if it is.
  • Uses ETC information from the etc*.json file, falling back to the fiberassign file if it is missing, with defaults set if neither has the appropriate keywords.
  • Adds some useful etc and fiberassign keywords to the exposure table for reference and to understand automated decision making.
  • Adds one derived parameter to the exposure table: SPEED = (EFFTIME_ETC / EXPTIME) * (EBVFAC*AIRFAC)^2
    • I generally try to avoid putting derived quantities in the table, but this seemed important to store for understanding the speed cuts.
  • For main survey tiles, it uses the ETC information to make cuts on survey speed and effective time, flagging exposures where either:
    • EFFTIME_ETC < 0.05*GOALTIME, or
    • SPEED < 0.5*PROGRAM_SPEED_THRESHOLD
    • where PROGRAM_SPEED_THRESHOLD is 1/2.5 for dark and 1/6 for bright.
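The SPEED definition and the two cuts above can be written out as a short sketch. This is illustrative only (the function names are hypothetical, not desispec API):

```python
def survey_speed(efftime_etc, exptime, ebvfac, airfac):
    """Derived quantity stored in the exposure table:
    SPEED = (EFFTIME_ETC / EXPTIME) * (EBVFAC * AIRFAC)**2."""
    return (efftime_etc / exptime) * (ebvfac * airfac) ** 2

# Per-program speed thresholds quoted in the PR description.
PROGRAM_SPEED_THRESHOLD = {"dark": 1 / 2.5, "bright": 1 / 6}

def fails_qa_cuts(efftime_etc, goaltime, speed, program):
    """A main-survey exposure is flagged if either cut trips."""
    too_little_efftime = efftime_etc < 0.05 * goaltime
    too_slow = speed < 0.5 * PROGRAM_SPEED_THRESHOLD[program]
    return too_little_efftime or too_slow
```

For instance, a dark-program exposure with SPEED below 0.5 * (1/2.5) = 0.2 would be flagged even if its EFFTIME_ETC cleared the 5% GOALTIME floor.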

Tests

I tested this by running the create_exposure_table script on all nights from December 14, 2020 through June 10, 2021.

I also tested using the daily processing code desi_daily_proc_manager --override-night=${NIGHT} --dry-run-level=1

Both runs worked, switching the processing state for the rare exposures that fail the new cuts. Only main survey tiles were flagged by the cuts (as designed). The metadata is found in the files, and defaults fill in gaps appropriately.

Comparing the old and new exposure tables showed no differences in existing columns, except for flagged exposures, where the relevant columns were populated with information on why the cut was applied.

@coveralls (Coverage Status)

Coverage decreased (-0.1%) to 27.186% when pulling 41bffa3 on pipe_metadata_and_qacuts into e7da995 on master.

@akremin akremin merged commit 5bf5742 into master Jun 12, 2021
@akremin akremin deleted the pipe_metadata_and_qacuts branch June 12, 2021 00:15