New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-23971: Undo TE1 "regression" #118
Conversation
@@ -163,7 +161,7 @@ def build_matched_dataset(repo, dataIds, matchRadius=None, safeSnr=50., | |||
blob.magKey = blob._matchedCatalog.schema.find("base_PsfFlux_mag").key | |||
# Reduce catalogs into summary statistics. | |||
# These are the serialiable attributes of this class. | |||
reduceSources(blob, blob._matchedCatalog, safeSnr) | |||
summarizeSources(blob, filterSources(blob._matchedCatalog, brightSnr=safeSnr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you were trying to avoid the confusing name safeSnr
, why not replace it here (all 3 instances in "build_matched_dataset") with brightSnrMin
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I was going to leave it to avoid having to change any configs until I remembered that it doesn't actually get used anywhere in the code (until DM-22138 gets fixed).
nameFluxKey=name_flux, goodSnr=snr_one_min, goodSnrMax=snr_one_max, | ||
safeSnr=snr_two_min, safeSnrMax=snr_two_max) | ||
matches = filterSources( | ||
matchedDataset._matchedCatalog, keys=keys, extended=source == 'Gal', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity, maybe extended=source == 'Gal' should be rewritten as extended=(source == 'Gal')
source_type = f'{"extended" if extended else "point"} sources"' | ||
blob['snr'] = Datum(quantity=goodMatches.aggregate(np.median, field=snrKey) * u.Unit(''), | ||
source_type = f'{"extended" if filterResult.extended else "point"} sources"' | ||
matches = filterResult.matchesFaint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear why you are selecting the faint matches here. Does summarizeSources only aggregate the faint sources? (If so, perhaps it should be renamed?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the astrometry metrics use 'snr' and filter on a different brightSnr parameter, whereas the photometry ones just use what used to be called safeMatches. Yes, it's confusing; I didn't want to tackle all of that in this ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. We can clarify that later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of “brightSnr, brightSnrMax” (and the same for faint), wouldn’t it make more sense to use “brightSnrMin, brightSnrMax”?
I like explicitly separating "faint" and "bright" subsets. However, it appears that we're assuming the bright objects are always a subset of the faint, even though that doesn't necessarily have to be true (if faintSnrMax < brightSnrMax). Is the "faint" set really meant to be "all" objects? Or should there be bright/faint/"all" subsets (where "all" is in quotes because it might still have some basic filtering applied)?
Also see comments on specific lines of code.
From what I recall, the code didn't have any SnrMax - it just excluded saturated sources - and it did assume that safe (bright) was a subset of good (faint) without explicitly checking; that happened to work because you couldn't actually set both limits anyway. Given that the SRD does specify that some metrics are for bright stars/point sources and others for faint ones, I think we need another ticket to separate them so that they're not assumed to be overlapping and so each metric uses the correct set of objects without redundant filtering. |
709c306
to
adc573e
Compare
556cc9c
to
6cbd910
Compare
Split function into more atomic operations. Also rename good/safeMatches to matchesFaint/Bright.
6cbd910
to
9ed2003
Compare
No description provided.