New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-22541 Add initial gen3 support to GetCoaddAsTemplateTask. #143
Conversation
kgabor
commented
Jan 9, 2020
- GetCoaddAsTemplateTask.assembleTemplateExposure() to create the stitched coadd from gen3 butler inputs. Note only one tract is supported as in gen2.
- Change templateExposure terminology to coaddExposure to keep template as an input of ImageDifferenceTask.
- Add TODO comments for missing dcr features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new Gen 3 version has lost a lot of the logging from the previous version, and I think it would be helpful to maintain as much as possible.
Aside from that, I have two main concerns. The first is that there is a lot of code duplication between run
and assembleTemplateExposure
. This might be okay, since we will eventually remove Gen 2 support, but I worry that code changes and bug fixes might make it into only one version of the code. I would like to see as much of the common code factored out into smaller functions, and where that's not practical add a comment warning of the duplication and that any code changes need to be made in both places.
My second concern is that assembleTemplateExposure
changes the way patches are selected. In the Gen 2 case, the patch list is built from all the patches that overlap the science exposure. This has the advantage that if a needed patch is missing it raises a warning. In the Gen 3 version, however, it loops over the patches in each coadd, and selects any that are also in the science image. If a patch that's in the science image isn't found in any of the input coadds you will never get warned that it's missing. I think you need to make sure that all of the patches in the Gen 3 patchList
really overlap with the science image (or, is that guaranteed?), and either switch back to iterating over patchList
or find a way to verify that every patch found data from a coadd.
python/lsst/ip/diffim/getTemplate.py
Outdated
"""Assemble the template exposure from the coadd patches that are received | ||
as inputs. Only one tract is supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The short description should fit on one line, within the line length. Additional information can go in a separate paragraph before Parameters, or in Notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's actually not quite true — https://developer.lsst.io/python/numpydoc.html#py-docstring-short-summary
python/lsst/ip/diffim/getTemplate.py
Outdated
@@ -182,6 +182,119 @@ def run(self, exposure, sensorRef, templateIdList=None): | |||
return pipeBase.Struct(exposure=coaddExposure, | |||
sources=None) | |||
|
|||
def assembleTemplateExposure(self, butlerQC, skyMapRef, coaddExposureRefs, exposure): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest putting the exposure
parameter first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, once the Gen 2 butler is gone, won't this be the run
method? Maybe this should be called runGen3
to anticipate that change. Or, the current run
method could be renamed to runGen2
and this could become run
.
python/lsst/ip/diffim/getTemplate.py
Outdated
patchDict = dict( | ||
(tractInfo.getSequentialPatchIndex(p), p) for p in patchList | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could safely fit on one line.
python/lsst/ip/diffim/getTemplate.py
Outdated
patchDict = dict( | ||
(tractInfo.getSequentialPatchIndex(p), p) for p in patchList | ||
) | ||
self.log.debug("Considering patches: %s" % str(patchList)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change the log level of this message from info to debug? Does it print a lot of extraneous patches?
python/lsst/ip/diffim/getTemplate.py
Outdated
|
||
tractInfo = skyMap.findTract(ctrSkyPos) | ||
skyCorners = [expWcs.pixelToSky(pixPos) for pixPos in expBoxD.getCorners()] | ||
patchList = tractInfo.findPatchList(skyCorners) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There used to be a check whether patchList
was empty. If that's no longer needed, please add a comment explaining why, or add the check back in.
python/lsst/ip/diffim/getTemplate.py
Outdated
self.log.info("Using template input tract=%s, patch=%s" % | ||
(tractInfo.getId(), dataId['patch'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two adjacent coaddRef
s might both contain the same patch in the region where they overlap. This message needs some more information to disambiguate which coadd is being used.
482e5ca
to
4c99a76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting closer, but I'd still like to see some changes. In addition to renaming your run
-type methods to be more consistent with the rest of the stack, I'd like the dictionary of coadds replaced with a dictionary of coadd data references. That means that we don't have to load all of the coadds in memory at the start, but more importantly will work better with DM-22952 to enable DCR-coadds in Gen 3.
python/lsst/ip/diffim/getTemplate.py
Outdated
availableCoadds = dict() | ||
for patchInfo in patchList: | ||
patchNumber = tractInfo.getSequentialPatchIndex(patchInfo) | ||
patchArgDict = dict( | ||
datasetType=self.getCoaddDatasetName() + "_sub", | ||
bbox=patchInfo.getOuterBBox(), | ||
tract=tractInfo.getId(), | ||
patch="%s,%s" % (patchInfo.getIndex()[0], patchInfo.getIndex()[1]), | ||
numSubfilters=self.config.numSubfilters, | ||
) | ||
|
||
if self.config.coaddName != 'dcr' and sensorRef.datasetExists(**patchArgDict): | ||
self.log.info("Reading patch %s" % patchArgDict) | ||
availableCoadds[patchNumber] = sensorRef.get(**patchArgDict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid loading all of the coadd exposures ahead of time. Aside from the extra memory, this won't work with other types of coadds (including DcrCoadds), but I believe it will work if you use DeferredDatasetHandle
in the Gen 3 version. To do that, you will need to add deferLoad=True
to the coaddExposures
connection definition in imageDifferenceTask
. That will give you a butler object that will mostly behave the same way as the Gen 2 butler. You probably still want to test for existence of the coadd in runDataRef
, since you can't use .datasetExists(**patchArgDict)
on a DeferredDatasetHandle
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes here would just be replacing availableCoadds
with availableCoaddRefs
above, then availableCoaddRefs[patchNumber] = sensorRef
here and replace availableCoadds
with availableCoaddRefs
in the call to run
below.
python/lsst/ip/diffim/getTemplate.py
Outdated
(tractInfo.getId(), dataId['patch'])) | ||
availableCoadds[dataId['patch']] = butlerQC.get(coaddRef) | ||
|
||
templateExposure = self.makeTemplateExposure(tractInfo, patchList, skyCorners, availableCoadds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, makeTemplateExposure
should now be run
python/lsst/ip/diffim/getTemplate.py
Outdated
Patches to consider for making the template exposure. | ||
skyCorners : list of `lsst.geom.SpherePoint` | ||
Sky corner coordinates to be covered by the template exposure. | ||
availableCoadds : `dict` of `int` : `lsst.afw.image.exposureF` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the suggested changes above, this should be renamed to availableCoaddRefs
, and the description would become something like:
availableCoaddRefs : `dict` of `lsst.daf.persistence.ButlerDataRef`
4c99a76
to
5f45179
Compare
* GetCoaddAsTemplateTask.assembleTemplateExposure() to create the stitched coadd from gen3 butler inputs. Note only one tract is supported as in gen2. * Change templateExposure terminology to coaddExposure to keep _template_ as an input of ImageDifferenceTask. * Add TODO comments for missing dcr features. * Separate common Gen2 and Gen3 GetCoaddAsTemplateTask code into run(). * Loop over patches in the same order in both cases. * Change for sequential patch numbers for logging - dcr code not changed. * Docstring updates and fixes.
Add deferred loading under gen3, passing patchArgDict to run() under gen2.
d322761
to
d954c0c
Compare