DM-31345: Fix the underlying bug resulting in PSF-matching failures in ci_hsc and ci_imsim #19

arunkannawadi · 2021-08-16T18:56:23Z

The PSF shape needs to be computed at the centroid of the source of interest. There was a subtler issue with the scaling being done on trace radius in some places and on determinant radius in some other places, although this inconsistency had little role to play. Both of these are fixed in this ticket, and the messages about unable to find PSF-matching kernel do not appear any more.

laurenam

I find the wording around the need to compute the PSF at center confusing. In the commit message you imply wanting it to be the center of the exposure's bounding box, but I think you mean that you want to evaluate the PSF at the centroid of the source. Also, the statement in the PR:

The PSF shape is be computed at the centroid of the source being measured, even if the exposure is already clipped to the bounding box of the corresponding footprint

is confusing as what I think you mean to imply is that clipping to the source's footprint bounding box does not guarantee (or even necessarily suggest) that this will be centered on the source's centroid. Either a clarification of that, or just a simple statement that the PSF needs to be evaluated at the source of interest's centroid (as opposed to whatever default you get without explicitly specifying a position) would be more clear.

Also (could I be more nit-picky!), I find the statement:

the messages about unable to find PSF-matching kernel do not appear any more

a bit misleading as such a message still should appear in the case of a failure! (i.e. the point here is that you fixed the issue that was causing so many failures, not that you just got rid of a superfluous error message, right?)

laurenam · 2021-08-17T19:30:44Z

python/lsst/meas/extensions/gaap/_gaap.py

@@ -549,7 +549,7 @@ def _gaussianizeAndMeasure(self, measRecord: lsst.afw.table.SourceRecord,
            raise measBase.FatalAlgorithmError("No PSF in exposure")
        wcs = exposure.getWcs()

-        seeing = psf.computeShape(center).getTraceRadius()
+        seeing = psf.computeShape(center).getDeterminantRadius()


As a casual reader of this code, I have no idea what "seeing" is explicitly supposed to represent, so have no way of assessing if this change is correct (even though I'm sure you know that it is!) Perhaps some words defining what "seeing" means or, even better, a variable name that is more explicit about what the variable represents. I'll note that you can find the stack definitions of these here: https://github.com/lsst/afw/blob/master/include/lsst/afw/geom/ellipses/BaseCore.h#L99-L112

It is somewhat intentional that seeing is vague. It can either be trace radius or determinant radius, as long as it is used consistently. I have renamed it to psfSigma, as the factor in _gaussianizePsf task that converts to a FWHM is called sigma2fwhm.

laurenam · 2021-08-17T20:20:00Z

python/lsst/meas/extensions/gaap/_gaussianizePsf.py

        fwhmModel = targetPsfModel.computeShape().getDeterminantRadius()*sigma2fwhm
+        self.log.debug(f"Ratio of model to science PSF = {fwhmModel/fwhmScience}")


I can't keep up with the proper use of format strings in logging, but I will note that the use of f here does not match the other self.log.debug statements I see in this file.

For logging, we're not supposed to use f strings. It should be self.log.debug("Ratio of model to science PSF = %f", fwhmModel/fwhmScience)

I am also happy to remove this logging from here. It will appear something like: psfMatch DEBUG: Ratio of model to science PSF = 1.15, which may still not be super helpful.

Totally your call given that it's at debug level 😄

laurenam · 2021-08-17T20:36:09Z

python/lsst/meas/extensions/gaap/_gaussianizePsf.py

+        except Exception as e:
+            self.log.error("Failed to solve for PSF matching kernel in GAaP for (%f, %f): %s",
+                           (center.getX(), center.getY(), e))
+            raise e


You need to add this new raise to the docs. Also, the previous behavior did not include a raise statement (even though the docs in _solve() say that it does)...please give a thought to what the intended behavior is (i.e. should this really be a raise or just a logging of the error and move on?)

This is not exactly a new raise, but just bubbling up of an exception raised in the _solve method. The only new change is that some useful information is logged before the exception reaches the plugin framework, which then calls the fail method of the plugin. The plugin framework doesn't have the contextual information about the location (center) where the failure occurred, so it is logged here before propagating the exception up. And since this is not the origin of the exception, I don't think this needs to mentioned in the docs.

Well, I moved the logging from here and the plugin's fail method does the logging now.

The modified logging helped uncover a bug in meas_base about how undeblended plugins were initialiized. DM-31442 has to be merged before merging this branch to master.

Oh, right...somehow I missed the raise e in the _solve() try block...sorry for the noise! Can you kick off a fresh Jenkins with the new implementation (which I'll have a look at now).

The ``center`` variable needs to be specified explicitly to the ``computeShape`` method.

arunkannawadi · 2021-08-17T22:04:46Z

I modified the description in the PR and in the first commit message. Of course, the comment about "Unable to find PSF-matching kernel" is not a generic one but that it does not appear when run on ci_hsc and ci_imsim.

laurenam · 2021-08-17T23:05:38Z

python/lsst/meas/extensions/gaap/_gaap.py

@@ -277,6 +278,7 @@ class BaseGaapFluxMixin:
    """

    ConfigClass = BaseGaapFluxConfig
+    hasLogName = True


The plugin logging framework needs this.
https://pipelines.lsst.io/modules/lsst.meas.base/tasks_and_algorithms.html#how-a-plugin-can-get-its-log-name

Yeah...saw that when the meas_base PR popped up!

That documentation is out of date. It shouldn't be referring to lsst.log in the "how to use" section.

laurenam · 2021-08-18T01:41:07Z

python/lsst/meas/extensions/gaap/_gaussianizePsf.py

+        except Exception as e:
+            self.log.error("Failed to solve for PSF matching kernel in GAaP for (%f, %f): %s",
+                           (center.getX(), center.getY(), e))
+            raise e


Oh, right...somehow I missed the raise e in the _solve() try block...sorry for the noise! Can you kick off a fresh Jenkins with the new implementation (which I'll have a look at now).

laurenam · 2021-08-18T02:04:33Z

tests/test_gaap.py

-                                                                         'ext_gaap_GaapFlux',
-                                                                         schema, None)
+                                                                         "ext_gaap_GaapFlux",
+                                                                         schema, None, logName="testGaap.log")


Might be a long-shot...but would it be possible to force a failure to _solve() in the test code to make sure it "does the right thing"?

So, the testFail method does that, especially at L266 where flag_gaussianization is set to True by the fail method after logging. But running pytest or scons doesn't create the log file. Running the test script as python tests/test_gaap.py (not recommended and is going to be phased out) prints it on the command line. It looks as follows:

.ERROR: Unable to calculate psf matching kernel measurement.ext_gaap_GaapFlux ERROR: Failed to solve for PSF matching kernel in GAaP for (100.000000, 670.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates') ERROR: Unable to calculate psf matching kernel measurement.ext_gaap_GaapFlux ERROR: Failed to solve for PSF matching kernel in GAaP for (100.000000, 870.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates') ERROR: Unable to calculate psf matching kernel measurement.ext_gaap_GaapFlux ERROR: Failed to solve for PSF matching kernel in GAaP for (-10.000000, -20.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates')

Any inputs on how to run the test_gaap.py in a manner to produce a log file? I was going to run forcedPhotCoadd with just the logging changes on the ci_imsim and ci_hsc branches and see how the error messages look like, but it is of course not principled.

Hhhmmm...no, I have no idea how to do that. It seems there are now no GAaP failures in the limited ci datasets (which is fundamentally a very good thing if the numbers are all sensible!) But what you have shown here does sufficiently scratch the itch I had to see an example of the log output on error, so thanks!

.log seems odd for the name of a logger -- it's not a file name.

I've removed specifying any value for logName, since the Python logging can handle None. The logging aspects themselves are checked in the testFail method explicitly.

timj

This change makes clear the meas_base documentation should have been updated to use python logging.

timj · 2021-08-18T03:49:49Z

python/lsst/meas/extensions/gaap/_gaap.py

@@ -331,6 +333,9 @@ def __init__(self, config: BaseGaapFluxConfig, name, schema, logName=None) -> No
            flagName = self.ConfigClass._getGaapResultName(scalingFactor, "flag_gaussianization")
            flagDefs.add(flagName, "PSF Gaussianization failed when trying to scale by this factor.")

+        self.logName = logName
+        self.log = lsst.log.getLogger(self.getLogName())  # getLogName is inherited from the plugin classes.


No new python code should be using lsst.log. meas_base does not. Use logging.getLogger().

What log name does this end up as in practice? Should the logName default to something that has gaap in it?

The BaseMeasurementTask seems to assign the logName, with the plugin name included. https://github.com/lsst/meas_base/blob/c63f6ee5de4496055ae9f4a2d7b9beb5fae67e73/python/lsst/meas/base/baseMeasurement.py#L272

The log name will end up as measurement.ext_gaap_GaapFlux in practice. Now, unlike lsst.log, Python's default logging method doesn't seem to being the message with the logger name. So running the testFail method from the command line produces

ERROR: Unable to calculate psf matching kernel Failed to solve for PSF matching kernel in GAaP for (100.000000, 670.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates') ERROR: Unable to calculate psf matching kernel Failed to solve for PSF matching kernel in GAaP for (100.000000, 870.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates') ERROR: Unable to calculate psf matching kernel Failed to solve for PSF matching kernel in GAaP for (-10.000000, -20.000000): Problematic scaling factors = 100.0 Errors: Exception('Unable to determine kernel sum; 0 candidates')

unlike the previous comment.

I think that’s actually ok. See discussion on this PR: lsst/meas_extensions_piff#4

Yes, this is default python logging and we configure it when pipetask or command line tasks run. If you want to test that a log message is actually being issued you should use self.assertLogs and with that you can check that an ERROR is issued and also what the message contains. Testing that an ERROR log message is issued might well be a good idea for this ticket.

timj · 2021-08-18T03:50:10Z

python/lsst/meas/extensions/gaap/_gaap.py

@@ -331,6 +333,9 @@ def __init__(self, config: BaseGaapFluxConfig, name, schema, logName=None) -> No
            flagName = self.ConfigClass._getGaapResultName(scalingFactor, "flag_gaussianization")
            flagDefs.add(flagName, "PSF Gaussianization failed when trying to scale by this factor.")

+        self.logName = logName


You default this to None so are you happy for this to be None?

The job of providing the logName is taken care of by the plugin framework. It actually can never handle None case within the plugin, which is why I had to give some string in the unit tests, where I run the plugin outside of the measurement framework. I thought about getting the default logger if logName is None. Should I go that way?

Hmm. Okay, it's not clear from the parameter defaulting that this is impossible in general but it sounds like it's common knowledge for all plugin writers. I suppose a comment wouldn't hurt to remind people.

This is now mentioned in the docstrings, where logName is defined.

timj · 2021-08-18T03:52:52Z

python/lsst/meas/extensions/gaap/_gaap.py

@@ -277,6 +278,7 @@ class BaseGaapFluxMixin:
    """

    ConfigClass = BaseGaapFluxConfig
+    hasLogName = True


That documentation is out of date. It shouldn't be referring to lsst.log in the "how to use" section.

Log the exact error message if the PSF Gaussianization fails along with the location in the exposure.

Use determinant radius consistently and remove any use of trace radius.

laurenam reviewed Aug 17, 2021

View reviewed changes

arunkannawadi force-pushed the tickets/DM-31345 branch from 728f542 to 731f848 Compare August 17, 2021 21:05

Calculate the PSF shape at source's centroid

7e647a4

The ``center`` variable needs to be specified explicitly to the ``computeShape`` method.

arunkannawadi force-pushed the tickets/DM-31345 branch from 731f848 to d6cdb49 Compare August 17, 2021 22:01

arunkannawadi force-pushed the tickets/DM-31345 branch from d6cdb49 to 762aff6 Compare August 18, 2021 00:50

laurenam reviewed Aug 18, 2021

View reviewed changes

laurenam approved these changes Aug 18, 2021

View reviewed changes

timj approved these changes Aug 18, 2021

View reviewed changes

arunkannawadi added 3 commits August 18, 2021 13:28

Add some log messages

9a0b3c8

Log the exact error message if the PSF Gaussianization fails along with the location in the exposure.

Calculate seeing using determinant

996985b

Use determinant radius consistently and remove any use of trace radius.

Rename seeing to psfSigma

1d602ff

arunkannawadi force-pushed the tickets/DM-31345 branch from 762aff6 to 1d602ff Compare August 18, 2021 18:28

arunkannawadi merged commit 6b6d59c into master Aug 19, 2021

		fwhmModel = targetPsfModel.computeShape().getDeterminantRadius()*sigma2fwhm
		self.log.debug(f"Ratio of model to science PSF = {fwhmModel/fwhmScience}")

DM-31345: Fix the underlying bug resulting in PSF-matching failures in ci_hsc and ci_imsim #19

DM-31345: Fix the underlying bug resulting in PSF-matching failures in ci_hsc and ci_imsim #19

Conversation

arunkannawadi commented Aug 16, 2021 • edited

laurenam left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arunkannawadi commented Aug 17, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timj left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arunkannawadi Aug 18, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arunkannawadi commented Aug 16, 2021 •

edited

arunkannawadi Aug 18, 2021 •

edited