DM-5120: Add intelligence to `validate_drp` so it does "A Reasonable Thing" on an unknown output repo #10

wmwv · 2016-02-20T15:23:05Z

validateDrp.py Cfht/output

will now do "something reasonable" with no further configuration information needed.

Return only dataIds with 'src' and 'calexp' on disk. Use queryMetadata() to retrieve the filter information in case the dataId from Butler.Subset doesn't include it.
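A minimal sketch of the filtering described above, not the actual `util.discoverDataIds` implementation. It assumes a Butler-like object exposing `datasetExists(datasetType, dataId)` and a `queryMetadata` call that returns a list of values; the real `queryMetadata` signature has varied between stack versions. `StubButler`, `discover_data_ids`, and the `visit`/`ccd` keys are hypothetical stand-ins so the example runs without the LSST stack.

```python
class StubButler:
    """Hypothetical stand-in for a Butler; not the LSST API."""

    def __init__(self, on_disk, filters):
        self._on_disk = on_disk    # set of (datasetType, visit, ccd) tuples
        self._filters = filters    # maps visit -> filter name

    def datasetExists(self, datasetType, dataId):
        return (datasetType, dataId["visit"], dataId["ccd"]) in self._on_disk

    def queryMetadata(self, datasetType, format, dataId):
        # Assumed return shape: a list of values for the requested key.
        return [self._filters[dataId["visit"]]]


def discover_data_ids(butler, candidates, required=("src", "calexp")):
    """Keep candidates whose required datasets all exist; fill in 'filter'."""
    found = []
    for data_id in candidates:
        if not all(butler.datasetExists(ds, data_id) for ds in required):
            continue
        data_id = dict(data_id)  # copy so the caller's dict is left untouched
        if "filter" not in data_id:
            data_id["filter"] = butler.queryMetadata("src", "filter", data_id)[0]
        found.append(data_id)
    return found


butler = StubButler(
    on_disk={("src", 1, 0), ("calexp", 1, 0), ("src", 2, 0)},
    filters={1: "r", 2: "i"},
)
candidates = [{"visit": 1, "ccd": 0}, {"visit": 2, "ccd": 0}]
data_ids = discover_data_ids(butler, candidates)
print(data_ids)
```

Visit 2 is dropped because it has a `src` but no `calexp`; visit 1 is kept and gains its filter from the metadata query.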

Update README.md and examples for new calling format for `validateDrp.py`

jdswinbank · 2016-03-01T15:38:07Z

python/lsst/validate/drp/validate.py

+            calib = afwImage.Calib(butler.get("calexp_md", vId, immediate=True))
+        except TypeError as te:
+            print(te)
+            continue


Why would this throw a TypeError? (I'm not claiming it won't or can't, I'm just a little surprised.) If it happens, can we do something more informative than just printing the exception? (Even adding "Skipping to next data ID" might be useful, for example.)

Also, this code is added in 8fc82e0, which is fundamentally about a different issue (adding the no-throw decorator, below). Can we split it into its own commit with a message that explains why it's necessary?

Good question. Now documented in code comment below.

    # DECam images that haven't been properly reformatted
    # can trigger a TypeError because of a residual FITS header
    # LTV2 which is a float instead of the expected integer.
    # This generates an error of the form:
    #
    # lsst::pex::exceptions::TypeError: 'LTV2 has mismatched type'
    #
    # See, e.g., DM-2957 for details.

Also added catching FitsError exception when reading FITS files.
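One way to act on the reviewer's request for a more informative message, as a minimal sketch rather than the code merged in this PR. The `FitsError` class, `load_calib_metadata`, and `load_all` are hypothetical stand-ins for the stack's FITS exception and for the `butler.get("calexp_md", ...)` call, so the example runs on its own.

```python
class FitsError(Exception):
    """Hypothetical stand-in for the stack's FITS read exception."""


def load_calib_metadata(data_id):
    """Hypothetical loader that fails for some data IDs, for illustration."""
    if data_id["visit"] == 2:
        raise TypeError("'LTV2 has mismatched type'")
    if data_id["visit"] == 3:
        raise FitsError("could not read FITS file")
    return {"FLUXMAG0": 1.0e12}


def load_all(data_ids, loader=load_calib_metadata):
    """Load metadata for each data ID, skipping the ones that cannot be read."""
    loaded, skipped = [], []
    for data_id in data_ids:
        try:
            metadata = loader(data_id)
        except (TypeError, FitsError) as err:
            # Say what went wrong and what happens next, then move on.
            print("Skipping to next data ID; could not read %s: %s" % (data_id, err))
            skipped.append(data_id)
            continue
        loaded.append((data_id, metadata))
    return loaded, skipped


loaded, skipped = load_all([{"visit": v} for v in (1, 2, 3)])
```

Returning the skipped data IDs alongside the loaded ones lets the caller report how much of the repository was actually validated.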

Didn't DM-4133 solve the LTV1/2 mismatched type issue? Or is this from something else?

Yes, this is from the same problem solved by DM-4133. Unfortunately, problems in reduced data persist even after the code has been updated. I think we'll be seeing such issues, and accumulating a bit of cruft like this, as we deal with more and more actual data.

When I ran validateDrp.py on the DECam COSMOS data as processed at NCSA, there were still a number of files with this issue. This catch is designed to skip those and proceed onward.

MWV

On Mar 3, 2016, at 10:29, hchiang2 notifications@github.com wrote:

In python/lsst/validate/drp/validate.py:

    @@ -88,17 +92,23 @@ def loadAndMatchData(repo, visitDataIds,
         srcVis = SourceCatalog(newSchema)

         for vId in visitDataIds:
             calib = afwImage.Calib(butler.get("calexp_md", vId, immediate=True))
             calib.setThrowOnNegativeFlux(False)
             try:
                 calib = afwImage.Calib(butler.get("calexp_md", vId, immediate=True))
             except TypeError as te:
                 print(te)
                 continue

Didn't DM-4133 solve the LTV1/2 mismatched type issue? Or is this from something else?



Use afwImageUtils.CalibNoThrow context manager instead of explicitly setting the state in the Calib object.

Add catching FitsError + explain TypeError reading calexp.

Clarify error messages from missing calibration. Explain why the TypeError needs to be caught when reading LTV2 keywords in DECam images that haven't been fully sanitized.
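The context-manager pattern mentioned above can be sketched as follows. This is an illustration of the idea, not the `afwImageUtils.CalibNoThrow` source: `FakeCalib` and `calib_no_throw` are hypothetical stand-ins, and the assumption is only that the real `Calib` keeps a class-wide throw-on-negative-flux flag with matching get and set calls.

```python
import math
from contextlib import contextmanager


class FakeCalib:
    """Hypothetical stand-in with a class-wide throw-on-negative-flux flag."""

    _throw = True

    @classmethod
    def setThrowOnNegativeFlux(cls, flag):
        cls._throw = flag

    @classmethod
    def getThrowOnNegativeFlux(cls):
        return cls._throw

    def getMagnitude(self, flux):
        if flux <= 0:
            if self._throw:
                raise ValueError("negative flux")
            return float("nan")
        return -2.5 * math.log10(flux)


@contextmanager
def calib_no_throw(calib_cls=FakeCalib):
    """Disable throwing inside the block, then restore the previous setting."""
    previous = calib_cls.getThrowOnNegativeFlux()
    calib_cls.setThrowOnNegativeFlux(False)
    try:
        yield
    finally:
        # Runs even if the block raises, so the global flag never leaks.
        calib_cls.setThrowOnNegativeFlux(previous)


calib = FakeCalib()
with calib_no_throw():
    mag = calib.getMagnitude(-1.0)  # NaN instead of an exception
flag_after = FakeCalib.getThrowOnNegativeFlux()
```

Compared with setting the flag explicitly, the context manager restores the earlier state automatically, which matters because the flag is shared by every instance.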


wmwv added 2 commits February 20, 2016 08:14

Add util.discoverDataIds to return dataIds from a repo

e3731df

Return only dataIds with 'src' and 'calexp' on disk. Use queryMetadata() to retrieve the filter information in case the dataId from Butler.Subset doesn't include it.

Use argparse. --configFile must be passed as a keyword.

3317269

Update README.md and examples for new calling format for `validateDrp.py`
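The new calling format could look roughly like this with `argparse`: a positional repository path plus an optional `--configFile` keyword. This is a sketch under that assumption, not the actual `validateDrp.py` parser; the help strings and the `Cfht.yaml` path are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with a positional repo and an optional --configFile."""
    parser = argparse.ArgumentParser(
        description="Sketch of a validateDrp.py-style command line.")
    parser.add_argument("repo",
                        help="path to an output repository, e.g. Cfht/output")
    parser.add_argument("--configFile", default=None,
                        help="optional configuration file (hypothetical help text)")
    return parser


# Equivalent to: validateDrp.py Cfht/output
args = build_parser().parse_args(["Cfht/output"])

# Equivalent to: validateDrp.py Cfht/output --configFile Cfht.yaml
args_cfg = build_parser().parse_args(["Cfht/output", "--configFile", "Cfht.yaml"])
```

With no `--configFile`, `args.configFile` is `None`, which gives the caller a simple signal to fall back to default behavior.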

jdswinbank reviewed Mar 1, 2016

Discover dataIds if config file doesn't specify.

d2bb733

If the configFile has only validation parameters specified (e.g., good_mag_limit) then use util.discoverDataIds to identify the dataIds to analyze. Update naming of visitDataIds->dataIds.
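The fallback described in this commit can be sketched as below. The `dataIds` config key, `resolve_data_ids`, and `fake_discover` are hypothetical names chosen for illustration; the real config layout and the `util.discoverDataIds` signature may differ.

```python
def resolve_data_ids(config, repo, discover):
    """Use dataIds from the config when present; otherwise discover them."""
    data_ids = config.get("dataIds")
    if data_ids:
        return data_ids
    # Config holds only validation parameters, so inspect the repo instead.
    return discover(repo)


def fake_discover(repo):
    """Hypothetical stand-in for a discovery function such as discoverDataIds."""
    return [{"visit": 1, "ccd": 0, "filter": "r"}]


only_params = {"good_mag_limit": 21.0}
explicit = {"good_mag_limit": 21.0,
            "dataIds": [{"visit": 7, "ccd": 3, "filter": "i"}]}

from_discovery = resolve_data_ids(only_params, "Cfht/output", fake_discover)
from_config = resolve_data_ids(explicit, "Cfht/output", fake_discover)
```

Passing the discovery function in as an argument keeps the decision logic testable without a repository on disk.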

wmwv force-pushed the tickets/DM-5120 branch from 25f7144 to f840b52 Compare March 1, 2016 21:18

wmwv added 5 commits March 2, 2016 22:52

Close figures after plotting to free the memory.

5a6431a

Update README to describe new behavior and caveat.

23684a6

The caveat is that this is single-node, load-everything-in-memory, and then match. This is infeasible at scale. Specifically, this currently fails on lsst-dev at more than ~500 catalogs in a band.

Clarify names of expected output plots and JSON.

5903bf0

Fix whitespace and line length.

7e65310

wmwv force-pushed the tickets/DM-5120 branch from 476e184 to 7e65310 Compare March 3, 2016 03:58

wmwv merged commit 7e65310 into master Mar 3, 2016

ktlim deleted the tickets/DM-5120 branch August 25, 2018 06:50


wmwv commented Feb 20, 2016

jdswinbank Mar 1, 2016

wmwv Mar 3, 2016

hsinfang Mar 3, 2016

wmwv Mar 3, 2016
