DM-34174: Replace PropagateVisitFlagsTask with PropagateSourceFlagsTask #659

Merged

erykoff merged 3 commits into main from tickets/DM-34174

Apr 14, 2022

Contributor

erykoff commented Apr 5, 2022

No description provided.

erykoff force-pushed the tickets/DM-34174 branch 2 times, most recently from bef49f8 to b9c72ea Compare

April 11, 2022 17:30

fred3m requested changes

Contributor

fred3m left a comment

Everything looks good with one exception: without having looked at the ticket I wouldn't understand the difference between the source table and the finalized source table, so I think that throughout the PR the docs need to be updated to explain the difference between the two.

I'm also still unsure whether there is an expected scenario in which both the source table flags and the finalized source table flags are propagated. At least the comments in the default config make it appear as if they won't be regularly used. My recommendation would be to write the entire task around a single input table; if a user needs to propagate both sets of flags, they could run the task twice with different inputs (sources vs. finalized sources). Or am I missing something?

python/lsst/pipe/tasks/finalizeCharacterization.py

+                      selected_src['calib_psf_candidate'] = np.zeros(len(selected_src), dtype=bool)
+                      selected_src['calib_psf_used'] = np.zeros(len(selected_src), dtype=bool)
+                      selected_src['calib_psf_reserved'] = np.zeros(len(selected_src), dtype=bool)

Contributor

fred3m Apr 12, 2022

I'm a bit confused. calib_psf_candidate and calib_psf_reserved are both set (not updated) below, so the above code seems unnecessary except for calib_psf_used.

Contributor Author

erykoff Apr 13, 2022

I've updated the comment to say that I do this just to be extra sure that information from the input tables doesn't leak through.
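
The defensive reset described in this reply can be sketched as follows. This is a minimal illustration, not the code from the PR: it assumes `selected_src` behaves like a NumPy structured array (the actual container type is not shown in this thread), and the input values are made up.

```python
import numpy as np

# Hypothetical input catalog whose flag columns already carry values
# from an earlier processing step.
dtype = [("id", "i8"), ("calib_psf_candidate", "?"),
         ("calib_psf_used", "?"), ("calib_psf_reserved", "?")]
selected_src = np.array([(1, True, True, False), (2, True, False, True)], dtype=dtype)

# Reset every flag up front so that no stale input value survives,
# even for flags that are assigned again later on.
for name in ("calib_psf_candidate", "calib_psf_used", "calib_psf_reserved"):
    selected_src[name] = np.zeros(len(selected_src), dtype=bool)
```

The reset is redundant for any column that is fully overwritten afterwards, which is the point fred3m raises; the trade-off is a few extra assignments in exchange for a guarantee that holds even if the later code changes.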

python/lsst/pipe/tasks/finalizeCharacterization.py Outdated

                           type='Flag',
                           doc=('set if source was used in the PSF determination by '
                                'FinalizeCharacterizationTask.'),
                       )
+                      """

Contributor

fred3m Apr 12, 2022

Why is this code block commented out? It looks like it should be necessary based on compute_psf_and_ap_corr_map and if it isn't, it should be removed.

Contributor Author

erykoff Apr 13, 2022

That was a mistake. It is not necessary since the names are being carried through from the input catalog.

python/lsst/pipe/tasks/propagateSourceFlags.py



    class PropagateSourceFlagsConfig(pexConfig.Config):
        """Configuration for propagating source flags to coadd objects."""

Contributor

fred3m Apr 12, 2022

Maybe change coadd objects to coadd sources here and everywhere else in the file. "Coadd objects" sounds more like some sort of class associated with a coadd as opposed to a source catalog.

Contributor Author

erykoff Apr 13, 2022

I think that we decided to leave this as coadd objects vs single-frame sources, right?

Contributor

fred3m Apr 13, 2022

👍

python/lsst/pipe/tasks/propagateSourceFlags.py

Comment on lines +35 to +64

+                  source_flags = pexConfig.DictField(
+                      keytype=str,
+                      itemtype=float,
+                      default={
+                          # TODO: DM-34391: when doApplyFinalizedPsf is the default, these flags
+                          # should be set below and not here.
+                          "calib_psf_candidate": 0.2,
+                          "calib_psf_used": 0.2,
+                          "calib_psf_reserved": 0.2,
+                          "calib_astrometry_used": 0.2,
+                          "calib_photometry_used": 0.2,
+                          "calib_photometry_reserved": 0.2
+                      },
+                      doc=("Source flags to propagate, with the threshold of relative occurrence "
+                           "(valid range: [0-1]). Coadd object will have flag set if fraction "
+                           "of input visits in which it is flagged is greater than the threshold."),
+                  )
+                  finalized_source_flags = pexConfig.DictField(
+                      keytype=str,
+                      itemtype=float,
+                      default={
+                          # TODO: DM-34391: when doApplyFinalizedPsf is the default, these flags
+                          # should be set here and not above.
+                          # "calib_psf_candidate": 0.2,
+                          # "calib_psf_used": 0.2,
+                          # "calib_psf_reserved": 0.2
+                      },
+                      doc=("Finalized source flags to propagate, with the threshold of relative "
+                           "occurrence (valid range: [0-1]). Coadd object will have flag set if "
+                           "fraction of input visits in which it is flagged is greater than the "
+                           "threshold."),
+                  )

Contributor

fred3m Apr 12, 2022

I'm confused. From the looks of it, only one of source_flags and finalized_source_flags will be used by default, right? So why is this entire task set up to use both instead of having a single set of parameters and a config option useFinalized or something similar?

Contributor Author

erykoff Apr 13, 2022

I've added more documentation to make it clear that there will be a mix of flags from different tables.

python/lsst/pipe/tasks/propagateSourceFlags.py

Comment on lines +217 to +232

+                      for flag in source_flag_counts:
+                          thresh = num_overlaps*self.config.source_flags[flag]
+                          object_flag = (source_flag_counts[flag] > thresh)
+                          coadd_object_cat[flag] = object_flag
+                          self.log.info("Propagated %d sources with flag %s", object_flag.sum(), flag)
+                      for flag in finalized_source_flag_counts:
+                          thresh = num_overlaps*self.config.finalized_source_flags[flag]
+                          object_flag = (finalized_source_flag_counts[flag] > thresh)
+                          coadd_object_cat[flag] = object_flag
+                          self.log.info("Propagated %d finalized sources with flag %s", object_flag.sum(), flag)

Contributor

fred3m Apr 12, 2022

I feel like this should be a separate function or class method instead of using duplicate code.

Contributor Author

erykoff Apr 13, 2022

I considered this, but there are so many variables to pass I went with the rule-of-3:

The rule of three is a code refactoring rule of thumb to decide when a replicated piece of code should be replaced by a new procedure. It states that you are allowed to copy and paste the code once, but that when the same code is replicated three times, it should be extracted into a new procedure. The rule was introduced by Martin Fowler in his text "Refactoring" and attributed to Don Roberts.

Contributor

fred3m Apr 13, 2022

Fair enough
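
For reference, the extraction fred3m suggested might look roughly like the sketch below. The helper name `apply_flag_thresholds` is hypothetical, plain dicts and NumPy arrays stand in for the task config and the coadd object catalog, and logging is omitted; it only illustrates the thresholding rule from the diff (`count > num_overlaps * threshold`), not the merged implementation.

```python
import numpy as np


def apply_flag_thresholds(flag_counts, thresholds, num_overlaps):
    """Return {flag: boolean array}, True where count > threshold * overlaps."""
    result = {}
    for flag, counts in flag_counts.items():
        result[flag] = counts > num_overlaps * thresholds[flag]
    return result


# Hypothetical data: three coadd objects overlapping 10, 5, and 4 visits.
num_overlaps = np.array([10, 5, 4])
source_flag_counts = {"calib_astrometry_used": np.array([3, 1, 0])}
finalized_source_flag_counts = {"calib_psf_used": np.array([2, 2, 1])}

# One call per input table replaces the two copied loops.
coadd_object_flags = {}
for counts, thresholds in (
    (source_flag_counts, {"calib_astrometry_used": 0.2}),
    (finalized_source_flag_counts, {"calib_psf_used": 0.2}),
):
    coadd_object_flags.update(apply_flag_thresholds(counts, thresholds, num_overlaps))
```

Because the comparison is strict, an object flagged in exactly 20% of its overlapping visits (2 of 10 here) does not get the flag, which matches the "greater than the threshold" wording in the config docs.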

python/lsst/pipe/tasks/propagateSourceFlags.py Outdated

+                                                                                 counts_list,
+                                                                                 x_column_list,
+                                                                                 y_column_list):
+                          if handle_dict is not None:

Contributor

fred3m Apr 12, 2022

I would also check that len(columns) > 0, that way if you are only using one of source_columns and finalized_source_columns it doesn't match up against a catalog with no flags to propagate.
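
A minimal sketch of the suggested guard, assuming each input table comes with its own `handle_dict` and list of flag `columns` as in the diff fragment above. The generator name `tables_to_process` and the placeholder handle values are hypothetical.

```python
def tables_to_process(handle_dicts, columns_list):
    """Yield (handle_dict, columns) pairs that actually have work to do."""
    for handle_dict, columns in zip(handle_dicts, columns_list):
        # Skip a table that was not supplied or has no flags configured.
        if handle_dict is not None and len(columns) > 0:
            yield handle_dict, columns


# Hypothetical inputs: only the source table has flags to propagate.
source_handles = {1234: "source-table-handle"}
finalized_handles = {1234: "finalized-table-handle"}
pairs = list(tables_to_process(
    [source_handles, finalized_handles],
    [["calib_psf_used"], []],
))
```

With the extra length check, the finalized table is never read or matched when its flag dictionary is empty, even though a handle for it was provided.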

erykoff force-pushed the tickets/DM-34174 branch from b9c72ea to 1746cf4 Compare

April 13, 2022 17:54

fred3m approved these changes

erykoff added 3 commits

April 14, 2022 08:22


          Add PropagateSourceFlagsTask.

ba3f065


          Deprecate PropagateVisitFlagsTask.

1ae3480


          Add task documentation file.

0edda7f

erykoff force-pushed the tickets/DM-34174 branch from 1746cf4 to 0edda7f Compare

April 14, 2022 15:22

erykoff merged commit e1bc22d into main

erykoff deleted the tickets/DM-34174 branch

April 14, 2022 20:23
