Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-35207: add task to produce updated visit summaries #6

Merged
merged 5 commits into from Jan 11, 2023

Conversation

TallJimbo
Copy link
Member

No description provided.

@TallJimbo TallJimbo force-pushed the tickets/DM-35207 branch 7 times, most recently from 5012f53 to 830a147 Compare January 7, 2023 04:24
Copy link
Contributor

@laurenam laurenam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only PR I've really looked at so far, and most of my comments are just optional suggestions for clarifications. I'm submitting this review now, but am going to pause before approving because I'm worried enough about the conjecture:

ComputeExposureSummaryStatsTask (which was actually written to work with afw.table, but Astropy is similar enough that it works, too)

that I'm going to get all the branches set up and run a local ci_hsc to have a look at the outputs before moving on to the rest.

@laurenam
Copy link
Contributor

laurenam commented Jan 9, 2023

Ok, yep, we have a problem. In my local ci_hsc run, almost all of the psf metrics have NaN entries in the finalVisitSummary catalog. I've tracked it down to this clause in ComputeExposureSummaryStatsTask:

        if (
            self.config.starSelection not in sources_columns
            or self.config.starShape + '_flag' not in sources_columns
        ):
            # The source catalog does not have the necessary fields (as in some
            # tests).
            return

noting that self.config.starShape = "base_SdssShape".

It seems that the base_SdssShape_ columns do not exist in the finalized_src_table, so this task silently returns here (my subsequent check on psfTraceRadiusDelta passes as these values did get updated before this return happens). I haven't figured out why they don't appear yet (maybe a slot default change?), but I'm not sure if the best way to fix this is to get those fields into the finalized_src_table or to somehow make sure we are only relying on the slot columns being present? @erykoff may also have an opinion here?

[And we'll also need to change the getX() and getY() usage in favor of explicit column names in maximum_nearest_psf_distance(). They didn't trigger an error because it never made it that far!]

@erykoff
Copy link
Contributor

erykoff commented Jan 9, 2023

The slot nomenclature does not get propagated to parquet. You get either the slot name or the base name and now I don't know what comes through in the way I configured it. But all that alias stuff, as well as .getX() and .getY() won't work with the finalized src tables.

@laurenam
Copy link
Contributor

laurenam commented Jan 9, 2023

It looks like all of the slot_ columns do get propagated to finalized_src_table (but there's no way to know which algorithm the slot refers to...)

@laurenam
Copy link
Contributor

laurenam commented Jan 9, 2023

I was just wondering how any warps were getting made (i.e. selected) with all the psf metrics being NaN...seems it's because:

float("nan") > 0.009
False

so they pass the threshold test! I guess we should be checking against ~np.isfinite too?

@erykoff
Copy link
Contributor

erykoff commented Jan 9, 2023

Right, you can copy over the slot names or the base names, but if you want both names you have to store twice the data. I guess I decided to make it use the slot names downstream.

Copy link
Member Author

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've addressed all review comments except the "well, this is all silently broken" doozy at the end.

Thanks for the investigation on that, and I'll start my own where you left off. I am inclined to set the default PSF shape to the slot one down in ComputeSummaryStatisticsTask, and then obliterating whatever try/catch block made that table incompatibility a silent failure, but we'll see if there any knock-on effects.

python/lsst/drp/tasks/update_visit_summary.py Show resolved Hide resolved
python/lsst/drp/tasks/update_visit_summary.py Show resolved Hide resolved
python/lsst/drp/tasks/update_visit_summary.py Outdated Show resolved Hide resolved
python/lsst/drp/tasks/update_visit_summary.py Show resolved Hide resolved
python/lsst/drp/tasks/update_visit_summary.py Show resolved Hide resolved
# Convert the psf_star_catalog datasets from DataFrame to Astropy so
# they can be handled by ComputeExposureSummaryStatsTask (which was
# actually written to work with afw.table, but Astropy is similar
# enough that it works, too). Ideally this would be handled by just
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, back when I wrote this and looked at the usage your ticket hadn't merged yet. Smells like something is catching waaaay more exceptions than it should, and that's a real problem. I'll investigate.

python/lsst/drp/tasks/update_visit_summary.py Show resolved Hide resolved
We should follow the same approach as pipe_tasks here to minimize
imports of code that's not needed, since this package is also going
to be full of unrelated things.
This seems to work as far as the built docs go, but there are still
a ton of warnings in the doc build ranging from plausibly-real ("can't
find upstream thing that probably just isn't documented") to
probably-bogus ("somethings wrong with the quoting on this line that
is blank and has no quotes of any kind anywhere near it").
@TallJimbo TallJimbo force-pushed the tickets/DM-35207 branch 2 times, most recently from 4486b0c to 33ecf66 Compare January 9, 2023 18:33
@TallJimbo
Copy link
Member Author

TallJimbo commented Jan 9, 2023

Ok, I think I've fixed the serious problem here as well, though my ci_hsc run to check that is ongoing. Pretty much all of those changes were over in pipe_tasks; here the only change is sources_columns=... -> sources_is_astropy=True to reflect changes there.

@@ -0,0 +1,762 @@
# This file is part of pipe_tasks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pipe_tasks -> drp_tasks

@laurenam
Copy link
Contributor

laurenam commented Jan 9, 2023

Garhhh...looks like ups/drp_tasks.table needs some dependencies added. Maybe just:

setupRequired(pipe_tasks)
setupRequired(skymap)

Tests here qre about all we can do without either some really intense
mocking or cooking up a lot of fairly realistic data.
@TallJimbo TallJimbo merged commit 03b55d6 into main Jan 11, 2023
@TallJimbo TallJimbo deleted the tickets/DM-35207 branch January 11, 2023 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants