New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-25304: Add task to extract and process bright stars #402
Conversation
3498dd3
to
5c7b5ab
Compare
5c7b5ab
to
cf14459
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, some stuff for you to consider. Once you address these I will take another look
Computes the flux of an object in an annulus. This is required to | ||
normalize each bright star stamp as their central pixels are likely | ||
saturated and/or contain ghosts, and cannot be used. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you probably want to inherit the docstring from PipelineTask, but as it is your task will have weird documentation generated by sphinx. I will help you look into what the best way to have proper documentation is.
return annulusStat.getValue() | ||
|
||
@pipeBase.timeMethod | ||
def run(self, inputExposure, refObjLoader=None, dataId=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataId needs to be documented
RunnerClass = pipeBase.ButlerInitializedTaskRunner | ||
|
||
def __init__(self, butler=None, initInputs=None, *args, **kwargs): | ||
pipeBase.Task.__init__(self, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super() MUST be used here instead of a direct call to pipeBase.Task (im surprised this worked at all)
pipeBase.Task.__init__(self, *args, **kwargs) | ||
# Compute (model) stamp size depending on provided "buffer" value | ||
self.modelStampSize = (int(self.config.stampSize[0]*self.config.modelStampBuffer), | ||
int(self.config.stampSize[1]*self.config.modelStampBuffer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you explicitly choose int here (that will floor)? If so why not ceil? if not, think about which is more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two reasons to favor int
:
- if the resulting stamp size corresponds to an even number of pixels, we will add an extra pixel. So we end up with either the integer number immediately below or immediately above the exact (real number)
stampSize*modelStampBuffer
. Going withceil
would lead to either the integer immediately higher, or the one above that. - we will be saving quite a lot of stamps of that size, so since it hardly matters in terms of the resulting processed stamps (by construction, we are taking a buffer to ensure we don't lose any useful pixel even after warping), I figure it is always best to go with one less pixel in each dimension.
self.refCatLoader.ref_dataset_name = "gaia_dr2_20200414" | ||
|
||
|
||
class ProcessBrightStarsTask(pipeBase.PipelineTask, pipeBase.CmdLineTask): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have sat on this long enough you might be able to drop gen2 support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not just yet, if only because:
a) I am (and will keep on) running this on RC2 a lot;
b) the Gaia refcat is not yet included in any gen3 repos (as far as I know).
Happy to clean things up when the time comes though!
badMasks = self.config.badMaskPlanes | ||
andMask = annulusMask.getPlaneBitMask(badMasks[0]) | ||
for bm in badMasks[1:]: | ||
andMask = andMask | annulusMask.getPlaneBitMask(bm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to avoid special casing indexes like 0 an rest and using temporary variables consider:
from operator import ior
from functools import reduce
...
reduce(ior, (annulusMask.getPlaneBitMask(bm) for bm in badMasks))
Its up to you which you find nicer though, if you want to keep yours that's fine, though I would suggest switching to |=
# Extract stamps around bright stars | ||
extractedStamps = self.extractStamps(inputExposure, refCatLoader=refObjLoader) | ||
# Warp (and shift, and potentially rotate) them | ||
self.log.info("Applying warp to %i star stamps from exposure %s" % (len(extractedStamps.starIms), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same from above
len(warpedStars), dataId)) | ||
fluxes = [] | ||
for wstar in warpedStars: | ||
annularFlux = self.computeAnnularFlux(wstar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above comment on computeAnnularFlux
gaiaGMag=extractedStamps.GMags[j], | ||
gaiaId=extractedStamps.gaiaIds[j], | ||
annularFlux=fluxes[j]) | ||
for j in range(len(warpedStars))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider changing this to for j, warp in enumerate(warpedStars)
you still get j to index by, but I think it is a bit clearer. This would be even more of a win if you have the annularFlux calculated by a method, either in this loop or another (or automatically triggered or something if r1, r2 are specified in constructor).
and cpix[1] >= self.config.stampSize[1]/2 | ||
and cpix[1] < inputExposure.getDimensions()[1] - self.config.stampSize[1]/2): | ||
starIms += [inputExposure.getCutout(sp, geom.Extent2I(self.config.stampSize))] | ||
pixCenters += [cpix] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont use += [] just to append to a list, you are for sure allocating a new list each time to throw it away. Depending on when the size constraints are hit, you are reallocating the main list anyway. It is better to just use append, and let the reallocating happen when needed.
e1d8112
to
523b02f
Compare
dc12df7
to
f1e2692
Compare
ca7f4d5
to
bf1e596
Compare
bf1e596
to
1048955
Compare
dataRef.put(output.brightStarStamps, "brightStarStamps") | ||
return pipeBase.Struct(brightStarStamps=output.brightStarStamps) | ||
|
||
def runQuantum(self, butlerQC, inputRefs, outputRefs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We absolutely can not merge this as is. You cannot assume that 1) a gen2 butler is available 2) this is run on a machine that has this file path. If this would merge, it would seem to work in some cases, and then would fail in ci and integration tests.
At this point I dont think we should be prioritizing getting things done with the mentality of defaulting to gen2 and then addressing gen3 later as a secondary concern. Gen2 is deprecated, so unless there is some technical reason something can not be done in gen3 yet, I feel this should be the bar for considering something "working". The only technical limitation I know of getting Gaia support is simply doing the work.
There are two paths forward I see. 1) work with the current bootstrapping to get Gaia calibrations migrated when the repo is converted to a gen3 repo. 2) Do the work to get Gaia ingested naively.
What you have below is fine for proving that you code could work, but it is not there yet for merging to master.
The image from which bright star stamps should be extracted. | ||
refObjLoader : `LoadIndexedReferenceObjectsTask`, optional | ||
Loader to find objects within a reference catalog. | ||
dataId : `dict` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In gen3 world, this will be a lsst.daf.butler.DataId (which itself is already either a DataCoordinate or dict). It should be included in the documentation, and it is up to you if you want to also include dict (as one might not know DataId could be a dict) or you could document it as "dict
or lsst.daf.butler.DataCoordinate
"
aa94dad
to
8839285
Compare
8839285
to
3242708
Compare
3242708
to
1e39c11
Compare
1e39c11
to
92cfdb7
Compare
Main PR out of the 4 for DM-25304, the others being daf_butler, obs_base and obs_subaru.
Jenkins run #32381.