Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-36726: Add FgcmBuildFromIsolatedStarsTask to use isolated stars as input. #98

Merged
merged 6 commits into from Mar 20, 2023

Conversation

erykoff
Copy link
Contributor

@erykoff erykoff commented Jan 5, 2023

This PR also removes the need to have background files as an input (because these values are recorded in the visit summary tables already).

Therefore, the only processing inputs needed to run fgcmcal are the isolated star catalogs, the isolated star sources, and the visit summary tables.

@erykoff erykoff force-pushed the tickets/DM-36726 branch 2 times, most recently from 973a106 to afb5a5b Compare January 7, 2023 00:26
Copy link

@cmsaunders cmsaunders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good--I found it easy to follow, and learned a few things along the way. I have two big suggestions, and you can tell me why they're impossible:

  • The way I would have imagined structuring this part of FGCM is to have a "build" step with the original association task or the new build-from-isolated-stars task as a configurable subtask. This would avoid the current situation where all the pipelines that include fgcmcal have to be changed. Is this just not practical?
  • A big benefit of using IsolatedStarAssociationTask is to have consistency across the stack, but I think this is somewhat undermined by then doing the reference catalog association within fgcmcal. Is it possible move the meat of _fgcm_associate_reference_stars to live with IsolatedStarAssociationTask so that it can be reusable by all the tasks using the isolated stars?

More minor comments are inline.

self.reserve_selection.reserve_fraction = 0.1

# The names here correspond to the post-transformed
# sourceTable_visit catalogs, which differ from the raw src

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the source tracing things back is still the sourceTable_visit catalogs, but the direct source is the isolated_star_sources catalogs, so that would be more clear.

raise RuntimeError(f"tract {tract} in isolated_star_cats but not isolated_star_sources")

if self.config.doReferenceMatches:
lut_handle = input_ref_dict["fgcm_lookup_table"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how many other long variable names you have, you might as well call this lookup_table_handle. I had to keep reminding myself what lut meant.

if goodBackground.size > 2:
skyBackground = np.median(summary['skyBg'][goodBackground])
elif goodBackground.size > 0:
skyBackground = np.mean(summary['skyBg'][goodBackground])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point of taking the mean of one number here?

# This is not used in the isolated stars builder.
pass

def _fgcm_make_all_star_obs_from_isolated_stars(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere, what is the point of starting the method name with fgcm? This seems redundant in fgcmcal.

Returns
-------
fgcm_obj : `astropy.table.Table`
Catalog of per-star ids and positions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catalog of ids and positions for each star?

flagFlag=flagFlag,
computeNobs=True)
if self.config.useParquetCatalogFormat:
fgcmStars.loadStars(fgcmPars,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you comment on the ra/dec unite conversion below, can you clarify that the units are already degrees for the parquet format?

@@ -87,8 +87,32 @@ class FgcmFitCycleConnections(pipeBase.PipelineTaskConnections,
deferLoad=True,
)

fgcm_star_observations = connectionTypes.Input(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how you now have fgcm_star_observations and fgcmStarObservations, fgcm_star_ids and fgcmStarIds, etc., and that this designates that the former is parquet format. It gives you repeated variable names and a mixture of snake case and camel case in one! How about changing the name format from fgcm_star_observations to fgcmStarIdsParquet? I think that would be more transparent. You could even point the user to useParquetCatalogFormat in the documentation line.

import matplotlib
matplotlib.use("Agg")
# import matplotlib
# matplotlib.use("Agg")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops; actually these should not be commented since they prevent matplotlib from trying to open a display. I had commented them for more interactive debugging.

fgcmStarObservations = connectionTypes.Input(
doc="Catalog of star observations for fgcm",
doc="Catalog of star observations for fgcm; old format.",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to designate the connections below as "old format" also?

Parameters
----------
instName : `str`
Short name of the instrument

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a mixture of periods at the end and no periods in these docs.

Copy link

@cmsaunders cmsaunders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. Looks good to me!

Array columns and metadata are now supported through the parquet
formatter, and this maintains compatibility with the previous version.
@erykoff erykoff merged commit 1b14044 into main Mar 20, 2023
@erykoff erykoff deleted the tickets/DM-36726 branch March 20, 2023 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants