DM-9147: global sky subtraction #23

PaulPrice · 2017-03-01T18:55:50Z

No description provided.

hsinfang · 2017-03-02T22:57:33Z

python/lsst/pipe/drivers/constructCalibs.py

@@ -699,6 +713,43 @@ def write(self, butler, exposure, dataId):
        self.log.info("Writing %s on %s" % (dataId, NODE))
        butler.put(exposure, self.calibName, dataId)

+    def makeCameraImage(self, butler, dataId, calibs):


Would it be better to leave butler in the run method? For example, take camera as input and the calib_camera as output?

Hmm... after reading a bit more, I realized butler is being passed around in multiple places in constructCalibs... I thought we were discouraged from doing so inside Tasks?

hsinfang · 2017-03-02T23:08:37Z

python/lsst/pipe/drivers/background.py

+
+
+def robustMean(array, rej=3.0):
+    """Measure a robust mean of an array


Does afw provide something like this?
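
For reference while deciding whether an existing routine could replace it, here is a minimal sketch of what a function like this might compute, written with plain numpy. It is not the code from this PR: the name `robust_mean_sketch`, the quartile-based width estimate and the 0.741 factor (Gaussian sigma from the interquartile range) are all assumptions for illustration.

```python
import numpy as np


def robust_mean_sketch(array, rej=3.0):
    """Mean of the values lying within `rej` robust sigmas of the median.

    Hypothetical stand-in for robustMean; the real implementation may differ.
    """
    array = np.asarray(array, dtype=float)
    q1, median, q3 = np.percentile(array, [25.0, 50.0, 75.0])
    # For a Gaussian, sigma is approximately 0.741 times the interquartile range
    sigma = 0.741 * (q3 - q1)
    good = np.abs(array - median) <= rej * sigma
    return array[good].mean()


# A flat distribution around 10 plus one wild outlier
values = np.concatenate([np.linspace(9.0, 11.0, 101), [1000.0]])
result = robust_mean_sketch(values)
plain = values.mean()
```

The outlier drags the plain mean far from 10, while the clipped mean ignores it.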

hsinfang · 2017-03-02T23:14:13Z

python/lsst/pipe/drivers/background.py

+
+    # def putSkyData(self, butler, calibId, bgExp, pistons=None):
+    #     self.addPistonHeaders(bgExp, pistons)
+    #     butler.put(bgExp, "sky", calibId)


Remove commented-out code?

hsinfang · 2017-03-02T23:43:51Z

python/lsst/pipe/drivers/background.py

+
+    @staticmethod
+    def exposureToBackground(bgExp):
+        """Convert an exposure to background model


The conversion between an Exposure and a background model (exposureToBackground and backgroundToExposure) and some background operations (averageBackgrounds and so on) look rather generic; would they be better placed somewhere higher up in the package chain? For example afw, pipe_tasks or one of the meas packages?

hsinfang · 2017-03-03T00:08:46Z

python/lsst/pipe/drivers/skyCorrection.py

+        WARNING: We clobber the calexp in the data repository! This may not
+        be desirable, but nor do we want to introduce multiple datasets that
+        the user has to select down the road.  The user should write to a
+        different rerun or output data repository.


It's true the user can write to a different output repo, but I would rather introduce another butler dataset type and let the user select which one to use later. Overwriting data files makes provenance and workflow management very difficult, to name just two problems. Maybe butler+supertask will come up with some brilliant solution later, but until that happens I do not like the idea of sharing dataset type names.

hsinfang · 2017-03-03T00:21:45Z

python/lsst/pipe/drivers/constructCalibs.py

@@ -515,7 +552,7 @@ def process(self, cache, ccdId, outputName="postISRCCD"):
        else:
            self.log.info(
                "Using previously persisted processed exposure for %s" % (sensorRef.dataId,))
-            exposure = sensorRef.get(outputName, immediate=True)
+            exposure = sensorRef.get(outputName, immediate=False)


RFC-286 gave me the impression that there were no real use cases of immediate=False; seems we may have one now?

Ah, and it may not actually work. We don't have this enabled with pybind11 now. We deferred the decision on which types to proxy to DM-9563.

pschella · 2017-11-03T15:23:31Z

python/lsst/pipe/drivers/background.py

+                                     "MEDIAN": "median",})
+    clip = Field(doc="Clipping threshold for background", dtype=float, default=3.0)
+    nIter = Field(doc="Clipping iterations for background", dtype=int, default=3)
+    mask = ListField(doc="Mask planes to reject", dtype=str, default=["SAT", "DETECTED", "BAD", "NO_DATA",])


I assume "EDGE" is not relevant here because it is for the full focal plane. But what about "DETECTED_NEGATIVE"?

pschella · 2017-11-03T15:46:00Z

python/lsst/pipe/drivers/background.py

+        assert all(len(bg) == 1 for bg in bgList), "Mixed bgList: %s" % ([len(bg) for bg in bgList],)
+        images = [bg[0][0].getStatsImage() for bg in bgList]
+        boxes = [bg[0][0].getImageBBox() for bg in bgList]
+        assert len(set((box.getMinX(), box.getMinY(), box.getMaxX(), box.getMaxY()) for box in boxes)) == 1


Should an overlap also work?

Perhaps add a message (e.g. "bounding boxes not all equal") instead of having a user parse this on failure.

I think having some overlap would work so long as it's built in consistently.
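
A minimal sketch of the suggested message, assuming each box has been reduced to a `(minX, minY, maxX, maxY)` tuple as in the assert above. The helper name `check_boxes_equal` is hypothetical, and raising `ValueError` instead of asserting is a choice made here, not something in the PR.

```python
def check_boxes_equal(boxes):
    """Return the common box, or raise with a readable message.

    `boxes` is an iterable of (minX, minY, maxX, maxY) tuples, a plain-Python
    stand-in for the afw bounding boxes used in the PR.
    """
    unique = sorted(set(boxes))
    if len(unique) != 1:
        raise ValueError("Bounding boxes not all equal: %s" % (unique,))
    return unique[0]


common = check_boxes_equal([(0, 0, 63, 63), (0, 0, 63, 63)])

try:
    check_boxes_equal([(0, 0, 63, 63), (0, 0, 31, 63)])
    message = None
except ValueError as err:
    message = str(err)
```

On failure the user sees the distinct boxes directly instead of having to decode the expression inside the assert.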

pschella · 2017-11-03T15:49:30Z

python/lsst/pipe/drivers/background.py

+        maskVal = afwImage.Mask.getPlaneBitMask("BAD")
+        for img in images:
+            bad = numpy.isnan(img.getImage().getArray())
+            img.getMask().getArray()[bad] = maskVal


Perhaps add an option to skip this for efficiency? Maybe this has already been done on input.

The background images are small, so it's not a big hit on efficiency, and I don't think it hurts to be thorough.
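
To make the cost concrete, the same operation can be sketched on plain numpy arrays. The bit value `BAD` and the helper name `mask_nans` are assumptions for illustration; unlike the hunk above, which assigns the mask value, this sketch ORs the bit in so that any bits already set are preserved.

```python
import numpy as np

BAD = 0x1  # hypothetical bit value standing in for the "BAD" mask plane


def mask_nans(image, mask, bad_bit=BAD):
    """Set `bad_bit` in `mask` wherever `image` is NaN; return the NaN count."""
    bad = np.isnan(image)
    mask[bad] |= bad_bit
    return int(bad.sum())


image = np.array([[1.0, np.nan], [3.0, 4.0]])
mask = np.array([[0, 0x4], [0, 0]], dtype=np.uint16)
num_masked = mask_nans(image, mask)
```

This is a single pass over the array, so for a small background image the overhead should be negligible.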

pschella · 2017-11-03T15:52:30Z

python/lsst/pipe/drivers/background.py

+
+        stats = afwMath.StatisticsControl()
+        stats.setAndMask(maskVal)
+        stats.setNanSafe(True)


Is this still needed if you masked? Or equivalently, do you need to mask?

Probably not needed (maybe cargo-culted from somewhere), but doesn't hurt.

pschella · 2017-11-03T15:54:06Z

python/lsst/pipe/drivers/background.py

+        array = combined.getImage().getArray()
+        bad = numpy.isnan(array)
+        mean = robustMean(array[~bad], self.config.skyRej)
+        array[bad] = mean


Is this safe in case of a gradient? Wouldn't an interpolation be better?
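
A small one-dimensional illustration of the concern, using synthetic data and a plain mean in place of robustMean (both are assumptions made to keep the example short):

```python
import numpy as np

x = np.arange(10, dtype=float)
truth = 100.0 + 5.0 * x  # background with a linear gradient
data = truth.copy()
data[[3, 4]] = np.nan    # two missing superpixels

bad = np.isnan(data)

# Option 1: replace missing values with the mean of the good ones
mean_filled = data.copy()
mean_filled[bad] = data[~bad].mean()

# Option 2: linear interpolation across the gap
interp_filled = data.copy()
interp_filled[bad] = np.interp(x[bad], x[~bad], data[~bad])

mean_error = np.abs(mean_filled - truth).max()
interp_error = np.abs(interp_filled - truth).max()
```

With this gradient the mean fill is off by 8.75 at the first missing point, while linear interpolation recovers the truth exactly. Real backgrounds are not exactly linear, so the difference would be smaller in practice.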

pschella · 2017-11-03T21:17:35Z

python/lsst/pipe/drivers/background.py

+        for x in range(width):
+            if numpy.any(isBad[:, x]) and numpy.any(isGood[:, x]):
+                array[:, x][isBad[:, x]] = interpolate1D(method, yIndices[isGood[:, x]],
+                                                         array[:, x][isGood[:, x]], yIndices[isBad[:, x]])


What if all were bad in one of these cases?
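
As quoted, the loop only touches a column that has both bad and good pixels, so a column that is entirely bad is silently left as it is. A sketch of the same loop that also reports such columns, with `numpy.interp` swapped in for `interpolate1D` and the name `interpolate_columns` invented for illustration:

```python
import numpy as np


def interpolate_columns(array, is_bad):
    """Fill bad pixels column by column, interpolating linearly along y.

    Returns the x indices of columns with no good pixels, which are left
    untouched so the caller can decide how to handle them.
    """
    height, width = array.shape
    y = np.arange(height)
    skipped = []
    for x in range(width):
        bad = is_bad[:, x]
        if not bad.any():
            continue  # nothing to fix in this column
        good = ~bad
        if not good.any():
            skipped.append(x)  # nothing to interpolate from
            continue
        array[bad, x] = np.interp(y[bad], y[good], array[good, x])
    return skipped


array = np.tile(np.arange(4.0)[:, None], (1, 3))
is_bad = np.zeros(array.shape, dtype=bool)
is_bad[1, 0] = True   # one bad pixel in column 0
is_bad[:, 1] = True   # column 1 entirely bad
array[is_bad] = np.nan
skipped = interpolate_columns(array, is_bad)
```

The skipped columns could then be filled by a second pass along rows, or by a global value as a last resort.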

pschella · 2017-11-03T21:19:36Z

python/lsst/pipe/drivers/constructCalibs.py

@@ -228,9 +231,41 @@ def getCcdIdListFromExposures(expRefList, level="sensor", ccdKeys=["ccd"]):
                ccdLists[name] = []
            ccdLists[name].append(ccdId)

+    for ccd in ccdLists:
+        ccdLists[ccd] = sorted(ccdLists[ccd], key=lambda dd: dictToTuple(dd, sorted(dd.keys())))


Each item is a dict, and you are first ordering each item's keys, then sorting the list lexicographically by the corresponding values? A comment would be useful here.
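
A sketch of what this sort does, assuming `dictToTuple(dd, keys)` returns the values of `dd` in the order given by `keys` (its definition is not in the quoted hunk, so `dict_to_tuple` below is a guess at its behaviour):

```python
def dict_to_tuple(dd, keys):
    """Values of `dd` in the order given by `keys` (assumed behaviour)."""
    return tuple(dd[k] for k in keys)


ccd_ids = [
    {"visit": 2, "ccd": 5},
    {"ccd": 5, "visit": 1},
    {"visit": 1, "ccd": 4},
]

# Sorting the keys first makes the tuple independent of dict insertion order;
# the list is then ordered lexicographically by those value tuples.
ordered = sorted(ccd_ids, key=lambda dd: dict_to_tuple(dd, sorted(dd.keys())))
```

Here the sort keys are `(ccd, visit)` tuples, so the result is ordered by ccd first and visit second regardless of how each dict was built.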

pschella · 2017-11-03T21:28:28Z

python/lsst/pipe/drivers/constructCalibs.py

@@ -515,7 +552,7 @@ def process(self, cache, ccdId, outputName="postISRCCD"):
        else:
            self.log.info(
                "Using previously persisted processed exposure for %s" % (sensorRef.dataId,))
-            exposure = sensorRef.get(outputName, immediate=True)
+            exposure = sensorRef.get(outputName, immediate=False)


Ah, and it may not actually work. We don't have this enabled with pybind11 now. We deferred the decision on which types to proxy to DM-9563.

pschella · 2017-11-03T21:30:53Z

python/lsst/pipe/drivers/constructCalibs.py

+        fullOutputId = {k: ccdName[i] for i, k in enumerate(self.config.ccdKeys)}
+        fullOutputId.update(outputId)
+        self.addMissingKeys(fullOutputId, butler)
+        fullOutputId.update(outputId)  # must be after the call to queryMetadata


pschella · 2017-11-03T21:44:51Z

python/lsst/pipe/drivers/constructCalibs.py

+        # Set detected/bad pixels to background to ensure they don't corrupt the background
+        maskVal = image.getMask().getPlaneBitMask(self.config.mask)
+        isBad = image.getMask().getArray() & maskVal > 0
+        bgLevel = np.median(image.getImage().getArray()[~isBad])


Assuming there are no gradients. And why median here and (clipped) mean in other cases?

This is a regular image (as opposed to a background model) that we know has a lot of junk (like stars and galaxies) in it, so the median is a bit more robust. It shouldn't matter, because those pixels are masked, but I'm being careful.
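
The robustness argument can be illustrated with plain numpy arrays. The bit values and the synthetic image are assumptions for illustration; one bright source is masked and one fainter source has escaped detection.

```python
import numpy as np

BAD = 0x1        # hypothetical bit values standing in for mask planes
DETECTED = 0x20
mask_val = BAD | DETECTED

image = np.full((4, 4), 10.0)
mask = np.zeros((4, 4), dtype=np.uint16)

image[0, 0] = 5000.0   # bright star, flagged as DETECTED
mask[0, 0] = DETECTED
image[1, 1] = 800.0    # faint source that was not detected, so not masked

is_bad = (mask & mask_val) > 0
good = image[~is_bad]
bg_median = np.median(good)
bg_mean = good.mean()

# Replace the masked pixels with the background level
image[is_bad] = bg_median
```

The unmasked source pulls the mean of the good pixels well above 10, while the median stays at the background level.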

pschella · 2017-11-06T13:47:56Z

python/lsst/pipe/drivers/background.py

+            num = result.getValue(afwMath.NPOINT)
+            if not numpy.isfinite(mean) or not numpy.isfinite(num):
+                continue
+            warped.set(xx, yy, mean*num)


I was referring to this in the previous comment. If num << npixels, you get a significantly smaller value here. I thought that you wanted instead to pretend that all pixels in the box had the mean value, but apparently that is not the case?

I'm pretending that num pixels have the same value.
Pixels from a different CCD may overlap this superpixel, so those are going to get included as well, weighted by the number of pixels.
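
A sketch of that weighting on plain numpy arrays, assuming a parallel array of pixel counts is accumulated alongside the sums and divided out at the end. The quoted hunk does not show that step, and the `accumulate` helper is hypothetical.

```python
import numpy as np


def accumulate(sums, counts, xx, yy, mean, num):
    """Add one CCD's contribution to superpixel (xx, yy), weighted by num."""
    if not np.isfinite(mean) or not np.isfinite(num):
        return
    sums[yy, xx] += mean * num
    counts[yy, xx] += num


sums = np.zeros((1, 2))
counts = np.zeros((1, 2))

accumulate(sums, counts, 0, 0, mean=10.0, num=300)    # CCD A covers most of the superpixel
accumulate(sums, counts, 0, 0, mean=20.0, num=100)    # CCD B overlaps the same superpixel
accumulate(sums, counts, 1, 0, mean=np.nan, num=0)    # no good pixels: skipped

with np.errstate(invalid="ignore", divide="ignore"):
    combined = sums / counts  # superpixels with no data become NaN
```

The shared superpixel ends up at (10*300 + 20*100)/400 = 12.5, so a CCD contributing few pixels has correspondingly little weight.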

This will make it easier to subclass later.
* Split process pool into two parts, so subclass can access the one they want.
* Have CalibTask.run return result, so the subclass can use it.
* Move the bulk of scatterProcess into a free function that can be used by anything that wants a matrix of results.
* Allow passing of additional arguments to processSingle.
* Allow the butler to use the read proxy when processing, since subclasses may not use the result.
* Move code to get fully-qualified outputId into its own method. This makes it easier to override the 'combine' method.

A sky frame is the dominant response of the camera to the sky. It is constructed by first subtracting a large-scale focal plane background model (e.g., 1024x1024) from all frames to remove large-scale gradients, binning pixels in each CCD (e.g., 64x64) and then averaging those.
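
The construction described above can be sketched for a single CCD with plain numpy. This is a simplification under stated assumptions: the background models are supplied as full-resolution arrays, plain means replace the clipped statistics and mask handling that the PR configures, and the names `bin_image` and `make_sky_frame` are hypothetical.

```python
import numpy as np


def bin_image(image, bin_size):
    """Average `image` over bin_size x bin_size blocks (edges are trimmed)."""
    height, width = image.shape
    ny, nx = height // bin_size, width // bin_size
    trimmed = image[:ny * bin_size, :nx * bin_size]
    blocks = trimmed.reshape(ny, bin_size, nx, bin_size)
    return blocks.mean(axis=(1, 3))


def make_sky_frame(frames, models, bin_size):
    """Subtract each frame's large-scale model, bin, then average over frames."""
    binned = [bin_image(frame - model, bin_size)
              for frame, model in zip(frames, models)]
    return np.mean(binned, axis=0)


# Two exposures sharing the same camera pattern on different sky levels
pattern = np.arange(16.0).reshape(4, 4)
frames = [pattern + 100.0, pattern + 200.0]
models = [np.full((4, 4), 100.0), np.full((4, 4), 200.0)]
sky = make_sky_frame(frames, models, bin_size=2)
```

After the per-exposure levels are removed, what survives the average is the binned pattern common to all exposures.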

We combine a large-scale background model and a scaled sky frame and write a background model consisting of these (and un-doing the original) that can be used downstream to correct the calexp. This does a decent job over most of the field of view of HSC.

hsinfang reviewed Mar 3, 2017

PaulPrice force-pushed the tickets/DM-9147 branch from 15d38e0 to 1a9c0ab Compare September 25, 2017 19:55

PaulPrice force-pushed the tickets/DM-9147 branch 4 times, most recently from e035a77 to a7da064 Compare November 3, 2017 09:50

pschella reviewed Nov 3, 2017

pschella reviewed Nov 6, 2017

PaulPrice added 7 commits November 15, 2017 13:13

constructCalibs: ensure consistent order of inputs

00de863

Since we are iterating over exposures and CCDs, it's important that the order be consistent. Otherwise, we might accidentally mix CCDs from different exposures.

clean up lines butchered by autopep8

870752c

Why, oh why would you put a line break there???

constructCalib: add construction of focal-plane image

ce29ba9

We stitch the resultant calib together to form a single (binned) image with all the CCDs in it. This allows the user to quickly view the result.

CalibStatsConfig: ignore NO_DATA pixels

9d0889a

There's no good data there, so only trouble can result from including them.

PaulPrice force-pushed the tickets/DM-9147 branch from a309d19 to be221da Compare November 15, 2017 18:17

PaulPrice merged commit be221da into master Nov 15, 2017

ktlim deleted the tickets/DM-9147 branch August 25, 2018 06:50
