DM-25304: Add task to extract and process bright stars #402

MorganSchmitz · 2020-07-31T16:36:29Z

Main PR out of the 4 for DM-25304, the others being daf_butler, obs_base and obs_subaru.

Jenkins run #32381.

natelust

Overall looks good, with some things for you to consider. Once you address these I will take another look.

natelust · 2020-08-10T16:14:38Z

python/lsst/pipe/tasks/processBrightStars.py

+        Computes the flux of an object in an annulus. This is required to
+        normalize each bright star stamp as their central pixels are likely
+        saturated and/or contain ghosts, and cannot be used.
+    """


I know you probably want to inherit the docstring from PipelineTask, but as it is your task will have weird documentation generated by sphinx. I will help you look into what the best way to have proper documentation is.

natelust · 2020-08-10T16:58:08Z

python/lsst/pipe/tasks/processBrightStars.py

+        return annulusStat.getValue()
+
+    @pipeBase.timeMethod
+    def run(self, inputExposure, refObjLoader=None, dataId=None):


dataId needs to be documented

natelust · 2020-08-10T20:49:23Z

python/lsst/pipe/tasks/processBrightStars.py

+    RunnerClass = pipeBase.ButlerInitializedTaskRunner
+
+    def __init__(self, butler=None, initInputs=None, *args, **kwargs):
+        pipeBase.Task.__init__(self, *args, **kwargs)


super() MUST be used here instead of a direct call to pipeBase.Task (I'm surprised this worked at all).
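A minimal sketch of why the direct base-class call is fragile under multiple inheritance. The classes below are hypothetical stand-ins, not the real pipeBase classes: calling one base's `__init__` directly bypasses the method resolution order, so the intermediate initializers never run, whereas `super()` walks the whole chain.

```python
class Base:
    def __init__(self, **kwargs):
        # Record which initializers actually ran.
        self.initialized = ["Base"]


class PipelineLike(Base):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.initialized.append("PipelineLike")


class CmdLineLike(Base):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.initialized.append("CmdLineLike")


class DirectCall(PipelineLike, CmdLineLike):
    def __init__(self, **kwargs):
        # Direct call: jumps straight to Base and skips both parents.
        Base.__init__(self, **kwargs)


class Cooperative(PipelineLike, CmdLineLike):
    def __init__(self, **kwargs):
        # Cooperative call: follows the MRO through every parent.
        super().__init__(**kwargs)


print(DirectCall().initialized)
print(Cooperative().initialized)
```

In this toy hierarchy the direct call leaves both parent initializers unexecuted, which is the kind of silent breakage the comment warns about.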

natelust · 2020-08-10T21:11:19Z

python/lsst/pipe/tasks/processBrightStars.py

+        pipeBase.Task.__init__(self, *args, **kwargs)
+        # Compute (model) stamp size depending on provided "buffer" value
+        self.modelStampSize = (int(self.config.stampSize[0]*self.config.modelStampBuffer),
+                               int(self.config.stampSize[1]*self.config.modelStampBuffer))


Did you explicitly choose int here (that will floor)? If so, why not ceil? If not, think about which is more appropriate.

MorganSchmitz · 2020-10-20

Two reasons to favor int:

a) If the resulting stamp size corresponds to an even number of pixels, we add an extra pixel. So we end up with either the integer immediately below or the integer immediately above the exact (real number) stampSize*modelStampBuffer. Going with ceil would lead to either the integer immediately above, or the one above that.

b) We will be saving quite a lot of stamps of that size, and it hardly matters for the resulting processed stamps (by construction, we take a buffer to ensure we don't lose any useful pixels even after warping), so I figure it is always best to go with one less pixel in each dimension.
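The rounding argument can be checked with a small sketch. The helper name `model_stamp_size` and the sample values are assumptions for illustration; the only behavior taken from the discussion is "round, then add one pixel if the result is even".

```python
import math


def model_stamp_size(stamp_size, buffer, rounding=int):
    """Round stamp_size*buffer, then add one pixel if the result is even."""
    size = rounding(stamp_size * buffer)
    return size + 1 if size % 2 == 0 else size


# Values chosen so the products are exact in floating point.
for stamp_size in (250, 251):
    exact = stamp_size * 1.25
    with_int = model_stamp_size(stamp_size, 1.25, int)
    with_ceil = model_stamp_size(stamp_size, 1.25, math.ceil)
    print(exact, with_int, with_ceil)
```

For an exact size of 312.5, both roundings end at 313; for 313.75, int gives 313 while ceil gives 315. This matches the claim that int lands on a neighbouring integer, while ceil can overshoot by more than one pixel.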

natelust · 2020-10-19T15:22:51Z

python/lsst/pipe/tasks/processBrightStars.py

+        self.refCatLoader.ref_dataset_name = "gaia_dr2_20200414"
+
+
+class ProcessBrightStarsTask(pipeBase.PipelineTask, pipeBase.CmdLineTask):


I have sat on this long enough you might be able to drop gen2 support

MorganSchmitz · 2020-10-20

I'd rather not just yet, if only because:

a) I am (and will keep on) running this on RC2 a lot;
b) the Gaia refcat is not yet included in any gen3 repos (as far as I know).

Happy to clean things up when the time comes though!

natelust · 2020-10-19T18:24:07Z

python/lsst/pipe/tasks/processBrightStars.py

+        badMasks = self.config.badMaskPlanes
+        andMask = annulusMask.getPlaneBitMask(badMasks[0])
+        for bm in badMasks[1:]:
+            andMask = andMask | annulusMask.getPlaneBitMask(bm)


To avoid special-casing index 0 versus the rest and using temporary variables, consider:

    from operator import ior
    from functools import reduce
    ...
    reduce(ior, (annulusMask.getPlaneBitMask(bm) for bm in badMasks))

It's up to you which you find nicer; if you want to keep yours that's fine, though I would suggest switching to |=.
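A self-contained sketch comparing the two styles. The `get_plane_bit_mask` function and the bit assignments are hypothetical stand-ins for annulusMask.getPlaneBitMask, so that the example runs without the LSST stack.

```python
from functools import reduce
from operator import ior

# Hypothetical plane-name -> bit-index table, for illustration only.
PLANE_BITS = {"BAD": 0, "SAT": 1, "CR": 3}


def get_plane_bit_mask(name):
    """Stand-in for annulusMask.getPlaneBitMask(name)."""
    return 1 << PLANE_BITS[name]


bad_masks = ["BAD", "CR", "SAT"]

# Loop form with |=, starting from 0 so no index needs special-casing.
and_mask = 0
for bm in bad_masks:
    and_mask |= get_plane_bit_mask(bm)

# reduce form; the initial value 0 also covers an empty bad_masks list.
reduced_mask = reduce(ior, (get_plane_bit_mask(bm) for bm in bad_masks), 0)

print(and_mask, reduced_mask)
```

Without the initial value, `reduce` raises a `TypeError` on an empty iterable, which may or may not be the desired behavior when no bad mask planes are configured.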

natelust · 2020-10-19T18:32:52Z

python/lsst/pipe/tasks/processBrightStars.py

+        # Extract stamps around bright stars
+        extractedStamps = self.extractStamps(inputExposure, refCatLoader=refObjLoader)
+        # Warp (and shift, and potentially rotate) them
+        self.log.info("Applying warp to %i star stamps from exposure %s" % (len(extractedStamps.starIms),


Same as above.

natelust · 2020-10-19T18:33:12Z

python/lsst/pipe/tasks/processBrightStars.py

+            len(warpedStars), dataId))
+        fluxes = []
+        for wstar in warpedStars:
+            annularFlux = self.computeAnnularFlux(wstar)


see above comment on computeAnnularFlux

natelust · 2020-10-19T18:37:39Z

python/lsst/pipe/tasks/processBrightStars.py

+                                              gaiaGMag=extractedStamps.GMags[j],
+                                              gaiaId=extractedStamps.gaiaIds[j],
+                                              annularFlux=fluxes[j])
+                          for j in range(len(warpedStars))]


Consider changing this to: for j, warp in enumerate(warpedStars). You still get j to index by, but I think it is a bit clearer. This would be even more of a win if you have the annularFlux calculated by a method, either in this loop or another (or automatically triggered or something if r1, r2 are specified in the constructor).
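A small sketch of the suggested loop shape, using made-up placeholder data rather than real stamps. The `zip` variant is an additional alternative that was not raised in the review.

```python
# Placeholder data standing in for warped stamps and their metadata.
warped_stars = ["stampA", "stampB", "stampC"]
g_mags = [10.2, 11.5, 9.8]
fluxes = [1.0e5, 4.0e4, 2.0e5]

# Original style: index everything through range(len(...)).
by_index = [(warped_stars[j], g_mags[j], fluxes[j])
            for j in range(len(warped_stars))]

# Suggested style: enumerate yields the element and keeps j for the rest.
by_enumerate = [(warp, g_mags[j], fluxes[j])
                for j, warp in enumerate(warped_stars)]

# Alternative when all sequences are parallel: no index at all.
by_zip = list(zip(warped_stars, g_mags, fluxes))

print(by_index == by_enumerate == by_zip)
```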

natelust · 2020-10-19T18:57:53Z

python/lsst/pipe/tasks/processBrightStars.py

+                    and cpix[1] >= self.config.stampSize[1]/2
+                    and cpix[1] < inputExposure.getDimensions()[1] - self.config.stampSize[1]/2):
+                starIms += [inputExposure.getCutout(sp, geom.Extent2I(self.config.stampSize))]
+                pixCenters += [cpix]


Don't use += [] just to append to a list: you are guaranteed to allocate a new single-element list each time, only to throw it away. Depending on when the size limits are hit, the main list is being reallocated anyway. It is better to just use append and let the reallocation happen when needed.
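A minimal illustration with placeholder strings instead of image cutouts. Both forms mutate the same list object and give the same result; the difference is that `+= [x]` first builds a temporary one-element list, which `append` avoids.

```python
star_ims = []
alias = star_ims  # second reference, to show the list is mutated in place

# Builds the temporary list ["cutout0"], extends star_ims, then discards it.
star_ims += ["cutout0"]

# Adds the element directly, with no temporary list.
star_ims.append("cutout1")

print(star_ims, alias is star_ims)
```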

natelust · 2021-01-06T16:17:11Z

python/lsst/pipe/tasks/processBrightStars.py

+        dataRef.put(output.brightStarStamps, "brightStarStamps")
+        return pipeBase.Struct(brightStarStamps=output.brightStarStamps)
+
+    def runQuantum(self, butlerQC, inputRefs, outputRefs):


We absolutely cannot merge this as is. You cannot assume that 1) a gen2 butler is available, or 2) this is run on a machine that has this file path. If this were merged, it would seem to work in some cases and then fail in CI and integration tests.

At this point I don't think we should be prioritizing getting things done with the mentality of defaulting to gen2 and then addressing gen3 later as a secondary concern. Gen2 is deprecated, so unless there is some technical reason something cannot be done in gen3 yet, I feel working in gen3 should be the bar for considering something "working". The only technical limitation I know of to getting Gaia support is simply doing the work.

There are two paths forward that I see: 1) work with the current bootstrapping to get Gaia calibrations migrated when the repo is converted to a gen3 repo, or 2) do the work to get Gaia ingested natively.

What you have below is fine for proving that your code could work, but it is not there yet for merging to master.

natelust · 2021-01-06T16:38:54Z

python/lsst/pipe/tasks/processBrightStars.py

+            The image from which bright star stamps should be extracted.
+        refObjLoader : `LoadIndexedReferenceObjectsTask`, optional
+            Loader to find objects within a reference catalog.
+        dataId : `dict`


In the gen3 world, this will be an lsst.daf.butler.DataId (which itself is already either a DataCoordinate or a dict). That should be included in the documentation. It is up to you whether to also mention dict (as one might not know a DataId could be a dict), or you could document it as "dict or lsst.daf.butler.DataCoordinate".

MorganSchmitz requested a review from natelust July 31, 2020 16:40

MorganSchmitz force-pushed the tickets/DM-25304 branch from 3498dd3 to 5c7b5ab Compare August 17, 2020 15:39

MorganSchmitz force-pushed the tickets/DM-25304 branch from 5c7b5ab to cf14459 Compare September 11, 2020 21:06

natelust reviewed Oct 19, 2020


MorganSchmitz force-pushed the tickets/DM-25304 branch 2 times, most recently from e1d8112 to 523b02f Compare October 21, 2020 11:12

natelust approved these changes Oct 29, 2020


MorganSchmitz force-pushed the tickets/DM-25304 branch from dc12df7 to f1e2692 Compare November 24, 2020 09:04

MorganSchmitz force-pushed the tickets/DM-25304 branch from ca7f4d5 to bf1e596 Compare December 1, 2020 11:54

MorganSchmitz force-pushed the tickets/DM-25304 branch from bf1e596 to 1048955 Compare January 4, 2021 18:39

natelust requested changes Jan 6, 2021


MorganSchmitz force-pushed the tickets/DM-25304 branch 2 times, most recently from aa94dad to 8839285 Compare January 19, 2021 16:12

MorganSchmitz force-pushed the tickets/DM-25304 branch from 8839285 to 3242708 Compare January 25, 2021 09:58

natelust approved these changes Jan 25, 2021


MorganSchmitz force-pushed the tickets/DM-25304 branch from 3242708 to 1e39c11 Compare January 26, 2021 16:20

MorganSchmitz added 2 commits January 27, 2021 08:13

Add task to extract and process bright stars

89fbb17

Make measureAndNormalize a BrightStarStamp method

92cfdb7

MorganSchmitz force-pushed the tickets/DM-25304 branch from 1e39c11 to 92cfdb7 Compare January 27, 2021 14:13

MorganSchmitz merged commit cbd0928 into master Jan 27, 2021

MorganSchmitz deleted the tickets/DM-25304 branch January 27, 2021 14:15


