Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-39081: Port stellar locus plot updates to analysis_tools #184

Merged
merged 16 commits into from Jan 25, 2024

Conversation

laurenam
Copy link
Contributor

No description provided.

@laurenam laurenam force-pushed the tickets/DM-39081 branch 4 times, most recently from 29725ed to a8c00f1 Compare January 18, 2024 21:52
@laurenam laurenam force-pushed the tickets/DM-39081 branch 4 times, most recently from 0b72091 to 9baf54c Compare January 21, 2024 23:11
Copy link
Contributor

@erykoff erykoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Some minor comments here.

# Loop twice over the fit and include sigma clipping of points
# perpendicular to the fit line (stricter on first iteration).
for nIter in range(2):
nSigmaToClipPerIteration = [3.5, 5.0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be fixed or should it be configurable? (It may be this is just fine, but wanted to check.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this and wasn't sure if it would just add config bloat. What if I added them to the stellarLocusFitDict/paramDict with these defaults such that they could be updated if desired in the pipeline yaml if desired?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, if that's possible that makes sense. Or if you want to leave them frozen that's fine. But adding a config variable is too heavy, I agree.

fitDists = np.array(perpDistance(p1, p2, zip(xs[fitPoints], ys[fitPoints]))) * 1000
clippedStats = calcQuartileClippedStats(fitDists, nSigmaToClip=nSigmaToClipPerIteration[nIter])
allDists = np.array(perpDistance(p1, p2, zip(xs, ys))) * 1000
keep = np.logical_not(np.abs(allDists) > clippedStats.clipValue)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be keep = ~(np.abs(allDists) > clippedStats.clipValue) or even clearer keep = (np.abs(allDists) <= clippedStats.clipValue).


Returns
-------
dists : `list`
dists : `list` [`float`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is old code, but I am completely confused as to why this function (a) loops over things and (b) returns a list instead of a numpy array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, good point. Updated to:

    points = list(points)
    dists = np.cross(p2 - p1, points - p1) /np.linalg.norm(p2 - p1)
                                                                                       
    return dists

median = quartiles[1]
interQuartileDistance = quartiles[2] - quartiles[0]
clipValue = nSigmaToClip * 0.74 * interQuartileDistance
good = np.logical_not(np.abs(dataArray - median) > clipValue)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be good = (np.abs(dataArray - median) <= clipValue)

"xMin": 1.05,
"xMax": 1.55,
"yMin": 0.78,
"yMax": 1.62,
"mHW": 13.35,
"bHW": -15.54,
"mFixed": 60.0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

"xMin": 0.82,
"xMax": 2.01,
"yMin": 0.37,
"yMax": 0.90,
"mHW": 0.40,
"bHW": 0.03,
"mFixed": 0.385,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again!

"xMin": 0.82,
"xMax": 2.01,
"yMin": 0.37,
"yMax": 0.90,
"mHW": 0.40,
"bHW": 0.03,
"mFixed": 0.385,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too.


from ...interfaces import KeyedData, KeyedDataAction, KeyedDataSchema, Scalar, Vector
from ...math import sigmaMad


def stellarLocusFit(xs, ys, paramDict):
def stellarLocusFit(xs, ys, mags, paramDict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a problem that the API is changing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Another one I pondered and decided that (despite not having a leading _ in the name) this is effectively a private-ish function (it's not - and in fact can't be...I tried but got errors - included in __all__). I also did a full GitHub search and couldn't find any other uses of it...but I suppose it's within the realm of possibility that someone out there somewhere could be using it (e.g. in a notebook). LMK if you think I need to do some kind of deprecation (or just add it as a None defaulted kwarg and bail/warn if it doesn't get passed?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it's fine. Though if it's not used anywhere outside this Action file, maybe rename to _stellarLocusFit?

This will include a list of any bands subject to the S/N threshold.
@laurenam laurenam force-pushed the tickets/DM-39081 branch 9 times, most recently from eca5157 to 2433e88 Compare January 24, 2024 17:10
This is largely a cherry-pick from commit lsst/analysis_drp@887aec7
on analysis_drp.

It includes fixing issues where the plot was misleading (i.e. not
reflecting reality in terms of the points being used in the fit and
the density color-mapping), updating labels to add info as well as
removes some irrelevant entries.  The module and parameter
documentation is also updated throughout.
This is to provide another look into any color dependence of the
stellar locus fit.  Set to False by default.
It was deemed that a distance from fit vs. color scatter plot (showing
a running median) was much more informative (i.e. would highlight any
potential color dependency/bias in the fit) than the magnitude vs.
distance to fit contour plot, so this is added here are set as the
default plot for the lower right panel. The original contours plot
can be selected instead by setting the doPlotDistVsColor config to
False.
The use case here is for the stellar locus data selection. While we
still rely on a S/N threshold, we also want the analysis to always
include the same magnitude range, regardless of the depth of any
given image, to minimize differences in the stellar populations
included in the stellar locus analysis. (This would break down for
abnormally shallow or low S/N images, so this should be kept in mind
when setting thresholds and assessing results.)
Several updates to our pipelines, particularly in PSF characterization,
have been made since thise were calibrated on DM-13519.  These are still
based on the RC2 dataset and may benefit from further fine-tuning based
on a larger dataset (e.g. the full PDR2).

For reference, the fit values as of this commit date for the three
HSC RC2 tracts of the w_2023_50 processing (DM-42194) are:

wPerp_psfFlux:  tract: 9813  mODR: 0.483  bODR: -0.042
wPerp_psfFlux:  tract: 9697  mODR: 0.481  bODR: -0.040
wPerp_psfFlux:  tract: 9615  mODR: 0.484  bODR: -0.039

xPerp_psfFlux:  tract: 9813  mODR: 120.070  bODR: -153.434
xPerp_psfFlux:  tract: 9697  mODR:  39.187  bODR:  -50.816
xPerp_psfFlux:  tract: 9615  mODR:  59.600  bODR:  -74.839

yPerp_psfFlux:  tract: 9813  mODR: 0.388  bODR: 0.062
yPerp_psfFlux:  tract: 9697  mODR: 0.385  bODR: 0.064
yPerp_psfFlux:  tract: 9615  mODR: 0.383  bODR: 0.044

The values for each metric were roughly set as the "happy medium"
among the three tracts.  They are used as a rough guideline for
differences in the stellar locus fits for different tracts (which
could reflect real astrophysics if probing very different stellar
populations, or an "issue" with the code).  They are also used as
an initial guess for the linear stellar locus fit.
We set a minimum number of objects surviving all cuts to be deemed
suitable for including in the fit (defaults to 3).
Prepend the stellarLocusFit function with "_" to indicate that it is
a semi-private function that is unlikely to be used outside of this
module's StellarLocusFitAction.  If, sometime down the road, this is
deemed generally useful, it should be moved to a more generic "utils"
location.
@laurenam laurenam merged commit bd79068 into main Jan 25, 2024
8 checks passed
@laurenam laurenam deleted the tickets/DM-39081 branch January 25, 2024 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants