
DM-21380: Add a galaxy photometric repeatability metric to validate_drp #107

Merged
merged 11 commits into master on Dec 2, 2019

Conversation

Contributor

@taranu taranu commented Oct 24, 2019

Add/modify code to compute metrics defined in verify_metrics

Contributor

@wmwv wmwv left a comment

Some scattered comments.

The major one is asking whether some of the functions can be generalized instead of writing near-copies of them for the galaxy work.
The secondary one is a request to clearly explain what's going on with the different bins.

# these commissioning data do not have the correct header info to apply the stray light correction
config.processCcd.isr.doStrayLight = False
# Run CModel
config.processCcd.calibrate.measurement.plugins.names |= [
Contributor

I think it's fine to just add these to the overall configs. HSC, DECam, and CFHT.
But also do so in the lsst_ci repo.

Contributor Author

Running CModel significantly increases the processing time (by an extra ~50% for the HSC example). Are we okay with that?

Contributor

Oh, right. I remember the discussion now.

Yes, I think that's fine. But check with @SimonKrughoff .

For the CI cases, it still seems like it's the build time that dominates. But @SimonKrughoff is responsible for running these things.

@@ -190,6 +194,17 @@ def _loadAndMatchCatalogs(repo, dataIds, matchRadius,
'PSF magnitude'))
mapper.addOutputField(Field[float]('base_PsfFlux_magErr',
'PSF magnitude uncertainty'))
if not skipGalaxies:
Contributor

Can you re-phrase this as a runtime check on whether the necessary columns are present in the data?

safeSnr : `float`, optional
Minimum median SNR for a match to be considered "safe".
goodSnr : `float`, optional
Minimum median SNR for a match to be considered "good"; default 3.
Contributor

I think goodSnr should default to at least 5. There are a lot of spurious 3-sigma detections in ~a billion pixels.
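The reviewer's point can be made concrete with a rough Gaussian-tail estimate (a sketch; the 10^9 pixel count is illustrative and not taken from this PR):

```python
import math


def expected_false_detections(n_pixels, n_sigma):
    """Expected number of pure-noise samples exceeding an n-sigma threshold,
    assuming independent Gaussian noise (one-sided tail probability)."""
    tail_prob = 0.5 * math.erfc(n_sigma / math.sqrt(2))
    return n_pixels * tail_prob


n_pix = 1e9  # illustrative pixel count for a large image set
print(f"3-sigma: ~{expected_false_detections(n_pix, 3):,.0f} spurious detections")
print(f"5-sigma: ~{expected_false_detections(n_pix, 5):,.0f} spurious detections")
```

At 3 sigma, noise alone yields on the order of a million spurious excursions over a billion pixels; at 5 sigma, only a few hundred.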

safeMatches = goodMatches.where(safeFilter)
def snrFilter(cat):
# Note that this also implicitly checks for psfSnr being non-nan.
snr = np.median(cat.get(snrKey))
Contributor

If you're worried about SNR potentially being NaN, then you should calculate the median with np.nanmedian.
np.median does what you expect with Inf, because Inf is sortable, but if there is a NaN in the array, np.median will return NaN.
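The difference can be demonstrated in a few lines (a minimal sketch):

```python
import numpy as np

arr_inf = np.array([1.0, 2.0, np.inf])
print(np.median(arr_inf))     # 2.0 -- inf sorts to the end, the median is well defined

arr_nan = np.array([1.0, 2.0, np.nan])
print(np.median(arr_nan))     # nan -- a single NaN poisons np.median
print(np.nanmedian(arr_nan))  # 1.5 -- np.nanmedian ignores the NaN
```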

Contributor

If

a = np.array([1])
b = np.array([0])

then a/b is np.array([inf]) and b/b is np.array([nan]).

Contributor Author

This comment was already in the code but I accidentally shifted it up one line; it was referring to the comparison to snrMin/Max. So the existing behavior is to get a NaN median if any of the SNRs are NaN and hence return False. Do we want to tolerate some fraction of the SNRs being NaN?

Contributor

No, I think it's fine to not expect NaNs and to not tolerate any.

The executed code is fine. The comment likely just needs to be moved.

@@ -289,6 +290,8 @@ def runOneFilter(repo, visitDataIds, metrics, brightSnr=100,
PsfShape measurements).
verbose : bool, optional
Output additional information on the analysis steps.
skipGalaxies : bool, optional
Whether to skip processing galaxies even if model measurements are available; default False.
Contributor

I don't think it's necessary to support this option. If we can calculate them we should.

Contributor Author

I think we can since there should always be a slot_modelFlux, but if CModel hasn't been run then that defaults to base_GaussianFlux, which isn't especially useful.

np.random.shuffle(copy)
return copy[0] - copy[1]
a, b = random.sample(range(len(array)), 2)
return array[a] - array[b]
Contributor

Good improvement.

Perhaps even simpler:

Suggested change
return array[a] - array[b]
a, b = random.sample(array, 2)
return a - b

Contributor Author

Sadly, that only works on sets or sequences, not numpy arrays.
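The limitation is easy to verify (a small sketch; the array values are illustrative):

```python
import random

import numpy as np

array = np.array([10.0, 20.0, 30.0, 40.0])

# random.sample requires a Sequence (or, historically, a set);
# a numpy ndarray is neither, so this raises TypeError.
try:
    random.sample(array, 2)
except TypeError as exc:
    print(f"random.sample rejected the ndarray: {exc}")

# Sampling two distinct indices, as in the version kept in the PR, works.
a, b = random.sample(range(len(array)), 2)
print(array[a] - array[b])
```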

Contributor

Oh, right. I hadn't fully thought that through.

@@ -270,8 +273,7 @@ def getRandomDiffRmsInMmags(array):
>>> print(rms)
212.132034
"""
# For scalars, math.sqrt is several times faster than numpy.sqrt.
return (1000/math.sqrt(2)) * getRandomDiff(array)
return thousandDivSqrtTwo * getRandomDiff(array)
Contributor

How much does this optimization matter?
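One way to answer this question empirically (a rough sketch; absolute timings vary by machine, and the loop count is arbitrary):

```python
import math
import timeit

import numpy as np

# Recomputing the constant on every call, with math.sqrt vs numpy.sqrt
t_math = timeit.timeit("1000 / math.sqrt(2)", globals=globals(), number=100_000)
t_np = timeit.timeit("1000 / np.sqrt(2)", globals=globals(), number=100_000)

# Hoisting the constant out, as the patch does
thousandDivSqrtTwo = 1000 / math.sqrt(2)
t_const = timeit.timeit("thousandDivSqrtTwo * 0.5", globals=globals(), number=100_000)

print(f"1000/math.sqrt(2) per call: {t_math:.4f} s")
print(f"1000/np.sqrt(2) per call:   {t_np:.4f} s")
print(f"precomputed constant:       {t_const:.4f} s")
```

On scalars, math.sqrt is typically several times faster than numpy.sqrt (no array dispatch overhead), and the hoisted constant skips the call entirely; whether that matters depends on how hot getRandomDiff's caller is.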

@@ -73,6 +73,8 @@
help='Skip making plots of performance.')
parser.add_argument('--level', type=str, default='design',
help='Level of SRD requirement to meet: "minimum", "design", "stretch"')
parser.add_argument('--skipGalaxies', dest='skipGalaxies', default=False, action='store_true',
Contributor

--skipGalaxies is likely ambiguous if you're just thinking about the KPMs. You might be like "Well, duh, of course I don't want galaxies in my stellar photometry KPMs".

Is it actually necessary to have this option? When computing things in {{calcnonsrc}}, can you just log a simple message that these metrics aren't available and then skip them?

_reduceSources(matchedDataset, matchedDataset._matchedCatalog, extended=source == 'Gal',
nameFluxKey=name_flux, goodSnr=snr, goodSnrMax=snr*2,
safeSnr=snr*2, safeSnrMax=snr*4)
for bin_offset in range(2):
Contributor

I think

for bin_offset in [0, 1]:

would be more clear.

name_flux_all.append('slot_ModelFlux')
for name_flux in name_flux_all:
key_model_mag = matchedDataset._matchedCatalog.schema.find(f"{name_flux}_mag").key
bin_base = 1
Contributor

Insert a paragraph explaining the binning. This is pretty different from the way other metrics are calculated, so it deserves a special call-out.

Contributor

@wmwv wmwv left a comment

LGTM.
Two minor comments.
I haven't run this latest version. But I trust that you have.

@@ -2,3 +2,7 @@
config.processCcd.isr.doAttachTransmissionCurve = False
# these commissioning data do not have the correct header info to apply the stray light correction
config.processCcd.isr.doStrayLight = False
# Run meas_modelfit to compute CModel fluxes
Contributor

Should you explicitly

import lsst.meas.modelfit

here as well?

Contributor Author

I will just in case; obs_subaru just happens to be configured to load it by default (in config/*.py).
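For reference, a sketch of what the explicit import in the config override might look like; the plugin name modelfit_CModel is an assumption about the meas_modelfit package, not taken from this PR, and the snippet requires the LSST stack to run:

```python
# Hypothetical config-override sketch; requires the LSST stack.
# Importing lsst.meas.modelfit registers the CModel measurement plugins
# explicitly, rather than relying on obs_subaru loading it by default.
import lsst.meas.modelfit  # noqa: F401

# Run meas_modelfit to compute CModel fluxes
config.processCcd.calibrate.measurement.plugins.names |= ["modelfit_CModel"]  # plugin name assumed
```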

@@ -190,6 +194,17 @@ def _loadAndMatchCatalogs(repo, dataIds, matchRadius,
'PSF magnitude'))
mapper.addOutputField(Field[float]('base_PsfFlux_magErr',
'PSF magnitude uncertainty'))
if not skipNonSrd:
# Needed because addOutputField(... 'slot_ModelFlux_mag') will add a field with that literal name
Contributor

" Need to check for column existence through an alias because addOutputField(... ..."

@taranu taranu merged commit a28823f into master Dec 2, 2019