DM-29221: Add ApPipe.yaml pipelines to appropriate repos #74

Merged
mrawls merged 7 commits into master from tickets/DM-29221 on Apr 21, 2021

@mrawls mrawls commented Apr 10, 2021

This PR adds a top-level ApPipe.yaml to ap_pipe. It imports from a new standalone ProcessCcd.yaml in the same location. In addition, there are two new subdirectories for DECam and HSC. These contain camera-specific pipelines and configurations which import from the main ApPipe.yaml.
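
For illustration, the import chain described here could look roughly like the sketch below. It assumes the `imports`/`location` syntax of pipeline YAML files and an `$AP_PIPE_DIR` environment variable pointing at the package root; the descriptions and the exact paths are placeholders, not the contents of the merged files.

```yaml
# pipelines/ApPipe.yaml (illustrative sketch, not the merged file)
description: Top-level AP pipeline
imports:
  # Assumed path: the standalone ProcessCcd.yaml lives in the same directory.
  - location: $AP_PIPE_DIR/pipelines/ProcessCcd.yaml
---
# pipelines/DarkEnergyCamera/ApPipe.yaml (illustrative sketch)
description: DECam-specific AP pipeline
instrument: lsst.obs.decam.DarkEnergyCamera
imports:
  # Camera-specific pipelines import the camera-agnostic one.
  - location: $AP_PIPE_DIR/pipelines/ApPipe.yaml
```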

@mrawls mrawls requested a review from kfindeisen April 10, 2021 01:35
Member

@kfindeisen kfindeisen left a comment

Mostly style and config questions. The one thing that worries me is the division of DECam processing into two different pipelines; if it's impossible to do those in a single pipeline, that's going to be very hard to reconcile with ap_verify's goal of being instrument-agnostic.

pipelines/DarkEnergyCamera/RunIsr.yaml (outdated, resolved)
@@ -0,0 +1,5 @@
description: ProcessCcd - A set of tasks to run when processing raw images.
Member

I realize that this is for forward-compatibility with the Great Vision, but the pipe_tasks pipeline is still being worked on. Might it be better to use the pipe_tasks file for now, and switch to a local file later?

Collaborator Author

I'm slow enough to respond that RFC-775 is out now, so I do think this is the right place for it. Two additional reasons: (1) I can't imagine a version of generic gen3 ProcessCcd that isn't just these three steps written down, so it's not like there's complicated stuff to remember to copy over, and (2) I don't want to import things called "DRP" for use by AP if we can help it.

pipelines/BpsApPipe.yaml (three outdated, resolved threads)
pipelines/ApPipe.yaml (resolved)
diaPipe:
  class: lsst.ap.association.DiaPipelineTask
  config:
    # Remember to run make_apdb.py with the same isolation_level and db_url as here
    apdb.isolation_level: 'READ_UNCOMMITTED'
Member

I know I did this in ap_verify.yaml, but I think that was a mistake, since it caused us to use uncommitted reads even when safer options were available (e.g., PostgreSQL). I now think the entire APDB configuration should be left up to the user, with no defaults.

Collaborator Author

Interesting - I added it here because ap_association fails if you don't specify READ_UNCOMMITTED (I think that's true for both DB types), and I want the pipeline to work. I routinely set this config for both sqlite and postgres.

Member

Ok, that's surprising -- I was definitely under the impression that this setting was only required for SQLite. It's certainly only enforced for SQLite...

Collaborator Author

OK, I think you're right it's only required for sqlite. I guess I can make the isolation_level a comment (like the connection_timeout) instead of an actual config, but I worry about decisions like this leading to more people messaging me when they try to run the AP Pipeline with default settings and it doesn't "just work." I wish we had a "diaPipeUsesWhateverConfigsMakeApdbUsed=True" button I could just turn on.

Member

Yeah, maybe we should revisit making the APDB location part of the task config. Configuring something that is always workspace- or workflow-dependent is not what either CmdLineTask or PipelineTask was designed for. Though I don't know what else we can do, especially in Gen 3 where the task inputs are strictly controlled...
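
As a rough sketch of the "make it a comment" option discussed in this thread, the task entry could document the APDB settings without imposing them. The `class` and `apdb.isolation_level` names come from the snippet under review; the `apdb.db_url` value is a hypothetical placeholder, and whether the merged file is laid out this way is an assumption.

```yaml
tasks:
  diaPipe:
    class: lsst.ap.association.DiaPipelineTask
    # No APDB defaults are set here; supply them at runtime.
    # Use the same isolation_level and db_url that were given to make_apdb.py.
    # Example overrides (values are placeholders):
    #   config:
    #     apdb.isolation_level: 'READ_UNCOMMITTED'  # needed for SQLite, per the discussion above
    #     apdb.db_url: 'sqlite:////path/to/apdb.db'  # hypothetical location
```

Leaving the overrides commented out keeps safer isolation levels available for databases such as PostgreSQL, at the cost of the pipeline not running out of the box with SQLite.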

pipelines/DarkEnergyCamera/ApPipe.yaml (outdated, resolved)
characterizeImage:
  class: lsst.pipe.tasks.characterizeImage.CharacterizeImageTask
  config:
    refObjLoader.ref_dataset_name: 'panstarrs'
Member

Why are these refcat overrides needed for DECam, specifically? It seems to me like it should depend on what repo you're processing.

Member

Both DECam ISR pipelines override outputExposure to "postISRCCD". Does there need to be a similar input override?

Collaborator Author

There isn't anything DECam in /repo/main, so I can't say for sure yet. This is what I had to do to get it to work with my gen3 repo in /project/mrawls/hits2015-3. How does ap_verify handle gen3 refcats with the HiTS datasets?

Good catch re. the outputExposure setting, that appears to be an extraneously specified default, so I'll remove it.

Member

For ap_verify, the overrides are applied as part of dataset-specific configs. They're not included in the pipeline.

Collaborator Author

Hmm, I guess I'm frustrated there is no good place to set sensible defaults to prepare for a reality with no ap_verify-style datasets.

I took a look at ap_verify_ci_hits2015 and I do see the appropriate settings in config/calibrate.py pointing to panstarrs and gaia. It appears to use the obs_decam default for characterizeImage, which is ps1_pv3_3pi_20170110, but only for Gen2 🤯 I don't know how the characterizeImage ref_dataset_name is set in Gen3.

Bottom line, I would like users to be able to run this pipeline out of the box with whatever the refcat names in /repo/main for panstarrs and gaia wind up being, which will likely be panstarrs and gaia, for photometry and astrometry respectively. Plus the mappings to phot_g_mean are basically always correct. Suggestions?

Member

The characterizeImage thing sounds like a hidden bug. Maybe it's not being used? I'm pretty sure that refcat isn't available in our Jenkins environment.

As for what to do, the ps1_pv3_3pi_20170110 defaults were motivated by those datasets being set up on lsst-dev, but I'm guessing we can't change those while Gen 2 is still in use. I think the RFC-775-ish solution is to create a pipeline specialized for /repo/main?

Though now I'm wondering why you didn't have to do this for HSC...

Collaborator Author

I think there are good defaults for HSC refcats somewhere in obs_subaru (assuming everything is panstarrs) that obs_decam lacks. When the initial HSC gen3 repo was set up, this was worked out rather early on via the refcats collection so images could be processed at all. Everything I've done with DECam has been much more ad hoc because, as in many situations, certain refcat defaults were hardcoded into obs_subaru and never thought about for obs_decam.

I guess for now, I will make an ApPipe_hits2015-3 pipeline with these configs that imports everything else, to go with my repo of the same name. And later we can add one for /repo/main, which will be fun to come up with a name for!

Member

All I can find is https://github.com/lsst/obs_subaru/blob/master/config/calibrate.py, which sets ps1_pv3_3pi_20170110 explicitly. 😕

Collaborator Author

Perhaps they just stick with that name instead of calling it a more succinct "panstarrs." We may have to follow suit for DECam in the shared repo, time will tell.

Collaborator Author

mrawls commented Apr 20, 2021

I think I addressed all your comments @kfindeisen, can you please take another look?

Member

@kfindeisen kfindeisen left a comment

Looks good!

description: Run IsrTask for DECam with only intra-chip crosstalk. Inter-chip crosstalk needs pre-prepared crosstalkSources.
instrument: lsst.obs.decam.DarkEnergyCamera
tasks:
  isrOscan:
Member

Why isrOscan and not isr?

Collaborator Author

Originated with Chris W, but I like it because it distinguishes this "do overscan correction to prepare the crosstalk sources, which technically means running a mini IsrTask" ISR from "real" ISR.

Member

I'm confused. Isn't this the old workflow, where you just have "real" ISR?

Collaborator Author

Oh sorry, I got them mixed up, even with the extremely explicit naming. Yeah, I think we can just call it isr.

Collaborator Author

mrawls commented Apr 21, 2021

I moved comments about diaPipe configs to the "main" ApPipe.yaml pipeline and left them un-default-ed everywhere for the time being. I moved the DECam refcat-specific configs to a new pipeline specifically for my /project/mrawls/hits2015-3 repo that imports the DECam ApPipe.yaml pipeline. I think this avoids setting too many situation-specific defaults and guides users toward what they will need to configure at runtime.
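
A minimal sketch of what such a repo-specific pipeline might look like, assuming the `imports`/`location` pipeline syntax and an `$AP_PIPE_DIR` variable. The file name and path are hypothetical; the `characterizeImage` override is the one quoted earlier in this review, and any further refcat overrides are omitted rather than guessed.

```yaml
# pipelines/DarkEnergyCamera/ApPipe_hits2015-3.yaml (hypothetical file name and path)
description: AP pipeline for the /project/mrawls/hits2015-3 repo
instrument: lsst.obs.decam.DarkEnergyCamera
imports:
  # Reuse the generic DECam pipeline; only repo-specific configs live here.
  - location: $AP_PIPE_DIR/pipelines/DarkEnergyCamera/ApPipe.yaml
tasks:
  characterizeImage:
    class: lsst.pipe.tasks.characterizeImage.CharacterizeImageTask
    config:
      # Refcat name as it exists in this particular repo.
      refObjLoader.ref_dataset_name: 'panstarrs'
```

Keeping the refcat names in a repo-specific file leaves the DECam pipeline itself free of situation-specific defaults.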

@mrawls mrawls merged commit 7517e90 into master Apr 21, 2021
@mrawls mrawls deleted the tickets/DM-29221 branch April 21, 2021 22:22