New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-39081: Port stellar locus plot updates to analysis_tools #184
Conversation
29725ed
to
a8c00f1
Compare
0b72091
to
9baf54c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Some minor comments here.
# Loop twice over the fit and include sigma clipping of points | ||
# perpendicular to the fit line (stricter on first iteration). | ||
for nIter in range(2): | ||
nSigmaToClipPerIteration = [3.5, 5.0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be fixed or should it be configurable? (It may be this is just fine, but wanted to check.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this and wasn't sure if it would just add config bloat. What if I added them to the stellarLocusFitDict/paramDict with these defaults such that they could be updated if desired in the pipeline yaml if desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, if that's possible that makes sense. Or if you want to leave them frozen that's fine. But adding a config variable is too heavy, I agree.
fitDists = np.array(perpDistance(p1, p2, zip(xs[fitPoints], ys[fitPoints]))) * 1000 | ||
clippedStats = calcQuartileClippedStats(fitDists, nSigmaToClip=nSigmaToClipPerIteration[nIter]) | ||
allDists = np.array(perpDistance(p1, p2, zip(xs, ys))) * 1000 | ||
keep = np.logical_not(np.abs(allDists) > clippedStats.clipValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be keep = ~(np.abs(allDists) > clippedStats.clipValue)
or even clearer keep = (np.abs(allDists) <= clippedStats.clipValue)
.
|
||
Returns | ||
------- | ||
dists : `list` | ||
dists : `list` [`float`] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is old code, but I am completely confused as to why this function (a) loops over things and (b) returns a list instead of a numpy array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, good point. Updated to:
points = list(points)
dists = np.cross(p2 - p1, points - p1) /np.linalg.norm(p2 - p1)
return dists
median = quartiles[1] | ||
interQuartileDistance = quartiles[2] - quartiles[0] | ||
clipValue = nSigmaToClip * 0.74 * interQuartileDistance | ||
good = np.logical_not(np.abs(dataArray - median) > clipValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be good = (np.abs(dataArray - median) <= clipValue)
"xMin": 1.05, | ||
"xMax": 1.55, | ||
"yMin": 0.78, | ||
"yMax": 1.62, | ||
"mHW": 13.35, | ||
"bHW": -15.54, | ||
"mFixed": 60.0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
"xMin": 0.82, | ||
"xMax": 2.01, | ||
"yMin": 0.37, | ||
"yMax": 0.90, | ||
"mHW": 0.40, | ||
"bHW": 0.03, | ||
"mFixed": 0.385, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again!
"xMin": 0.82, | ||
"xMax": 2.01, | ||
"yMin": 0.37, | ||
"yMax": 0.90, | ||
"mHW": 0.40, | ||
"bHW": 0.03, | ||
"mFixed": 0.385, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This too.
|
||
from ...interfaces import KeyedData, KeyedDataAction, KeyedDataSchema, Scalar, Vector | ||
from ...math import sigmaMad | ||
|
||
|
||
def stellarLocusFit(xs, ys, paramDict): | ||
def stellarLocusFit(xs, ys, mags, paramDict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a problem that the API is changing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Another one I pondered and decided that (despite not having a leading _
in the name) this is effectively a private-ish function (it's not - and in fact can't be...I tried but got errors - included in __all__
). I also did a full GitHub search and couldn't find any other uses of it...but I suppose it's within the realm of possibility that someone out there somewhere could be using it (e.g. in a notebook). LMK if you think I need to do some kind of deprecation (or just add it as a None
defaulted kwarg
and bail/warn if it doesn't get passed?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure it's fine. Though if it's not used anywhere outside this Action file, maybe rename to _stellarLocusFit
?
This will include a list of any bands subject to the S/N threshold.
eca5157
to
2433e88
Compare
This is largely a cherry-pick from commit lsst/analysis_drp@887aec7 on analysis_drp. It includes fixing issues where the plot was misleading (i.e. not reflecting reality in terms of the points being used in the fit and the density color-mapping), updating labels to add info as well as removes some irrelevant entries. The module and parameter documentation is also updated throughout.
This is to provide another look into any color dependence of the stellar locus fit. Set to False by default.
It was deemed that a distance from fit vs. color scatter plot (showing a running median) was much more informative (i.e. would highlight any potential color dependency/bias in the fit) than the magnitude vs. distance to fit contour plot, so this is added here are set as the default plot for the lower right panel. The original contours plot can be selected instead by setting the doPlotDistVsColor config to False.
2433e88
to
165fecc
Compare
The use case here is for the stellar locus data selection. While we still rely on a S/N threshold, we also want the analysis to always include the same magnitude range, regardless of the depth of any given image, to minimize differences in the stellar populations included in the stellar locus analysis. (This would break down for abnormally shallow or low S/N images, so this should be kept in mind when setting thresholds and assessing results.)
Several updates to our pipelines, particularly in PSF characterization, have been made since thise were calibrated on DM-13519. These are still based on the RC2 dataset and may benefit from further fine-tuning based on a larger dataset (e.g. the full PDR2). For reference, the fit values as of this commit date for the three HSC RC2 tracts of the w_2023_50 processing (DM-42194) are: wPerp_psfFlux: tract: 9813 mODR: 0.483 bODR: -0.042 wPerp_psfFlux: tract: 9697 mODR: 0.481 bODR: -0.040 wPerp_psfFlux: tract: 9615 mODR: 0.484 bODR: -0.039 xPerp_psfFlux: tract: 9813 mODR: 120.070 bODR: -153.434 xPerp_psfFlux: tract: 9697 mODR: 39.187 bODR: -50.816 xPerp_psfFlux: tract: 9615 mODR: 59.600 bODR: -74.839 yPerp_psfFlux: tract: 9813 mODR: 0.388 bODR: 0.062 yPerp_psfFlux: tract: 9697 mODR: 0.385 bODR: 0.064 yPerp_psfFlux: tract: 9615 mODR: 0.383 bODR: 0.044 The values for each metric were roughly set as the "happy medium" among the three tracts. They are used as a rough guideline for differences in the stellar locus fits for different tracts (which could reflect real astrophysics if probing very different stellar populations, or an "issue" with the code). They are also used as an initial guess for the linear stellar locus fit.
We set a minimum number of objects surviving all cuts to be deemed suitable for including in the fit (defaults to 3).
Prepend the stellarLocusFit function with "_" to indicate that it is a semi-private function that is unlikely to be used outside of this module's StellarLocusFitAction. If, sometime down the road, this is deemed generally useful, it should be moved to a more generic "utils" location.
165fecc
to
e7cd440
Compare
No description provided.