Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-34174: Replace PropagateVisitFlagsTask with PropagateSourceFlagsTask #659

Merged
merged 3 commits into from Apr 14, 2022

Conversation

erykoff
Copy link
Contributor

@erykoff erykoff commented Apr 5, 2022

No description provided.

@erykoff erykoff force-pushed the tickets/DM-34174 branch 2 times, most recently from bef49f8 to b9c72ea Compare April 11, 2022 17:30
Copy link
Contributor

@fred3m fred3m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good with one exception: without having looked at the ticket I wouldn't understand the difference between the source table and the finalized source table, so I think that throughout the PR the docs need to be updated to explain the difference between the two.

I'm also still confused about whether or not it is expected for there to be a scenario where both the source table flags and finalized source table flags are propagated. At least the comments in the default config make it appear as if they won't be regularly used. My recommendation would be to write the entire task to use a single input table and if there is a case where the user will need to propagate both sets of flags, they could run the task twice using different sets of inputs (sources vs finalized sources). Or am I missing something?

selected_src['calib_psf_candidate'] = np.zeros(len(selected_src), dtype=bool)
selected_src['calib_psf_used'] = np.zeros(len(selected_src), dtype=bool)
selected_src['calib_psf_reserved'] = np.zeros(len(selected_src), dtype=bool)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused. calib_psf_candidate and calib_psf_reserved are both set (not updated) below, so the above code seems unnecessary except for calib_psf_used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've update the comment that I do this just to be extra sure that information from the input tables doesn't leak through.

type='Flag',
doc=('set if source was used in the PSF determination by '
'FinalizeCharacterizationTask.'),
)
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code block commented out? It looks like it should be necessary based on compute_psf_and_ap_corr_map and if it isn't, it should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a mistake. It is not necessary since the names are being carried through from the input catalog.

python/lsst/pipe/tasks/multiBand.py Show resolved Hide resolved
python/lsst/pipe/tasks/multiBand.py Outdated Show resolved Hide resolved
python/lsst/pipe/tasks/multiBand.py Show resolved Hide resolved


class PropagateSourceFlagsConfig(pexConfig.Config):
"""Configuration for propagating source flags to coadd objects."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change coadd objects to coadd sources here and everywhere else in the file. "Coadd objects" sounds more like some sort of class associated with a coadd as opposed to a source catalog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we decided to leave this as coadd objects vs single-frame sources, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

python/lsst/pipe/tasks/propagateSourceFlags.py Outdated Show resolved Hide resolved
Comment on lines +35 to +64
source_flags = pexConfig.DictField(
keytype=str,
itemtype=float,
default={
# TODO: DM-34391: when doApplyFinalizedPsf is the default, these flags
# should be set below and not here.
"calib_psf_candidate": 0.2,
"calib_psf_used": 0.2,
"calib_psf_reserved": 0.2,
"calib_astrometry_used": 0.2,
"calib_photometry_used": 0.2,
"calib_photometry_reserved": 0.2
},
doc=("Source flags to propagate, with the threshold of relative occurrence "
"(valid range: [0-1]). Coadd object will have flag set if fraction "
"of input visits in which it is flagged is greater than the threshold."),
)
finalized_source_flags = pexConfig.DictField(
keytype=str,
itemtype=float,
default={
# TODO: DM-34391: when doApplyFinalizedPsf is the default, these flags
# should be set here and not above.
# "calib_psf_candidate": 0.2,
# "calib_psf_used": 0.2,
# "calib_psf_reserved": 0.2
},
doc=("Finalized source flags to propagate, with the threshold of relative "
"occurrence (valid range: [0-1]). Coadd object will have flag set if "
"fraction of input visits in which it is flagged is greater than the "
"threshold."),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. From the looks of it, only one of source_flags and finalized_source_flags will be used by default, right? So why is this entire task setup to use both instead of having a single set of parameters and a config option useFinalized or something similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added more documentation to make it clear that there will be a mix of flags from different tables.

Comment on lines +217 to +232
for flag in source_flag_counts:
thresh = num_overlaps*self.config.source_flags[flag]
object_flag = (source_flag_counts[flag] > thresh)
coadd_object_cat[flag] = object_flag
self.log.info("Propagated %d sources with flag %s", object_flag.sum(), flag)

for flag in finalized_source_flag_counts:
thresh = num_overlaps*self.config.finalized_source_flags[flag]
object_flag = (finalized_source_flag_counts[flag] > thresh)
coadd_object_cat[flag] = object_flag
self.log.info("Propagated %d finalized sources with flag %s", object_flag.sum(), flag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should be a separate function or class method instead of using duplicate code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this, but there are so many variables to pass I went with the rule-of-3:

is a code refactoring rule of thumb to decide when a replicated piece of code should be replaced by a new procedure. It states that you are allowed to copy and paste the code once, but that when the same code is replicated three times, it should be extracted into a new procedure. The rule was introduced by Martin Fowler in his text "Refactoring" and attributed to Don Roberts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

counts_list,
x_column_list,
y_column_list):
if handle_dict is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also check that len(columns) > 0, that way if you are only using one of source_columns and finalized_source_columns it doesn't match up against a catalog with no flags to propagate.

@erykoff erykoff merged commit e1bc22d into main Apr 14, 2022
@erykoff erykoff deleted the tickets/DM-34174 branch April 14, 2022 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants