New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-26140: Centralize Gen 3 pipeline configuration info for ap_verify datasets #139
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As best I can tell these pipelines work how I'd expect, but I'm a bit lost in some of the details of the instrument-specific fakes pipelines. Please clarify, and make everything two-spaced indents while you're at it (sorry).
pipelines/ApVerifyWithFakes.yaml
Outdated
- isr | ||
- characterizeImage | ||
- calibrate | ||
- visitFakes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this needs to be here, but it is odd to me that it's not in the prepareFakes
subset. Maybe a comment explaining?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure, is "this" just the visitFakes
line? That probably is in the wrong place, based on me misunderstanding what that task is for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont really know what each of these tasks is used for, but as a note, it is possible to include a label in more than one subset, they do not have to be mutually exclusive. In DRP there is processCcd, but also Single frame that includes even more. These are just helpful organizers for people to help run or refer to collections of tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually did mean to create a partition, because that's the organization that makes the most sense to me.
And on third(?) thought, I think visitFakes
belongs in apPipeWithFakes
(because it runs characterizeImage
and calibrate
on the data) but not in prepareFakes
(because it requires partially processed raws whereas the other tasks can be done without them). Not ready to push yet, but here's what I have so far:
apPipe:
subset:
- isr
- characterizeImage
- calibrate
- imageDifferenceNoFakes
description: >
The AP pipeline without fakes. Only includes processing through
image differencing.
prepareFakes:
subset:
- createFakes
- coaddFakes
description: >
Creation of fake sources.
apPipeWithFakes:
subset:
- visitFakes # characterizeImage and calibrate with fakes
- imageDifference
- transformDiaSrcCat
- diaPipe
- matchFakes
description: >
The AP pipeline with fakes. Requires apPipe and prepareFakes subsets.
Like configs, pipelines are copied to the workspace directory for ease of access and to allow reconstruction of exactly what was run.
The pipelines now inherit the task defaults (where technically possible), with deep coadds being requested only by the ap_verify datasets that use them.
This solution avoids the need to alter the ap_verify CLI or the ap_verify dataset framework, making it relatively easy to undo the workaround once DM-31492 is resolved.
7481bba
to
59d1a0e
Compare
This PR adds instrument specializations of the
ApVerify
andApVerifyWithFakes
pipelines, and moves many of the previous pipeline overrides (in particular, the assumption ofdeep
coadds) into individual datasets. It also modifies the behavior of theap_verify.py --pipeline
argument to more naturally support dataset-specific pipelines.