DM-26987: Update filtering of matched catalog to operate before creating GroupView #118

jeffcarlin · 2022-01-26T05:30:27Z

Implemented filtering of catalogs before running multiMatch to create the matched catalogs. This involved:

a) writing a new "prefilter" function, which is called from matcher.py
b) adding config params to control the filtering to the base matched catalog class
c) re-writing some pipelines to allow for different cuts to be applied for different metrics (e.g., the "default" SNR cut for photometric repeatability, but magnitude ranges for astrometric repeatability)

Part (c) has the effect of writing additional matched catalogs rather than using the same one for all metrics. However, I thought it was worth the extra steps to make the entire pipeline more efficient. We could reconsider this if needed.

Some simple tests suggest that for small amounts of data, pipelines with prefiltering of catalogs before matching run in ~half the time, and use ~half the memory. It's possible that the effect will be more dramatic when processing large chunks of data.
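For readers unfamiliar with the approach, the idea of cutting each input catalog before matching can be sketched as below. This is a minimal illustration and not the code in `prefilter.py`: the function name `prefilter_sketch`, its keyword arguments, and the `snr`, `mag`, and `extended` fields are all hypothetical, and a real catalog would be a columnar table, not a list of dicts.

```python
def prefilter_sketch(sources, snr_min=None, bright_mag=None, faint_mag=None,
                     select_extended=None):
    """Return only the sources that pass every requested cut.

    Each source is a dict with hypothetical keys "snr", "mag", "extended".
    A cut whose argument is None is skipped.
    """
    kept = []
    for src in sources:
        if snr_min is not None and src["snr"] < snr_min:
            continue  # too noisy
        if bright_mag is not None and src["mag"] < bright_mag:
            continue  # brighter than the bright limit
        if faint_mag is not None and src["mag"] > faint_mag:
            continue  # fainter than the faint limit
        if select_extended is not None and src["extended"] != select_extended:
            continue  # wrong morphology class
        kept.append(src)
    return kept


catalog = [
    {"snr": 5.0, "mag": 23.0, "extended": False},
    {"snr": 80.0, "mag": 20.0, "extended": False},
    {"snr": 120.0, "mag": 18.5, "extended": True},
    {"snr": 300.0, "mag": 16.0, "extended": False},
]

# Magnitude-range cut on point sources, as one might use for an astrometric metric.
astrometry_input = prefilter_sketch(catalog, bright_mag=17.0, faint_mag=21.5,
                                    select_extended=False)
# SNR-only cut, as one might use for a photometric metric.
photometry_input = prefilter_sketch(catalog, snr_min=50.0)
```

Because the cuts run on each input catalog first, the matching step only ever sees the rows a given metric needs, which is the likely source of the time and memory savings.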

ctslater

Added a few small code comments; overall, this looks like a good fix.

python/lsst/faro/utils/prefilter.py

python/lsst/faro/base/MatchedCatalogBase.py

pipelines/preparation/preparation_matched_multi_jointcal_fgcm.yaml

ctslater · 2022-01-27T17:09:53Z

pipelines/preparation/preparation_matched.yaml

+      faintMagCut: 21.5
+      selectExtended: False
+      python: |
+        config.connections.outputCatalog = 'matchedCatalogTractMag17to21p5'


I couldn't deduce what the p5 means.

jeffcarlin · 2022-01-28

That's my shorthand for 21.5, since I don't want to put the '.' in there...
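The naming shorthand can be made explicit with a small helper. This is only a sketch of the convention: `mag_token` and `catalog_name` are hypothetical names and not part of faro, and the bright limit of 17 is read off the dataset name in the diff above.

```python
def mag_token(mag):
    """Format a magnitude for use in a dataset name, writing '.' as 'p'."""
    return f"{mag:g}".replace(".", "p")


def catalog_name(bright_mag, faint_mag, prefix="matchedCatalogTractMag"):
    """Build a dataset name that encodes the magnitude range."""
    return f"{prefix}{mag_token(bright_mag)}to{mag_token(faint_mag)}"


name = catalog_name(17.0, 21.5)
print(name)
```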

jeffcarlin · 2022-01-29T00:08:08Z

Jenkins job kicked off here

leannep

Colin's comments look to be addressed. I have no additional comments.

jeffcarlin · 2022-02-02T18:14:12Z

Successful Jenkins run here

ctslater reviewed Jan 27, 2022

leannep approved these changes Jan 29, 2022

jeffcarlin added 6 commits February 2, 2022 12:12

Add filtering before catalog matching

462a445

Working version of prefilter

d24d80e

Update pipelines to use prefiltered catalogs

9d20cdc

Update expected values in unit tests

3dd83f2

Clarify docstrings, remove deep copy step

8c2d456

Remove default from pipeline spec

90e9d96

jeffcarlin force-pushed the tickets/DM-26987 branch from 37ce503 to 90e9d96 on February 2, 2022 18:13

jeffcarlin merged commit a97a9ee into main Feb 2, 2022

jeffcarlin deleted the tickets/DM-26987 branch February 2, 2022 18:15
