New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-8951: Add TE1 and TE2 KPMs to validate_drp #31
Conversation
87b4289
to
b2095a7
Compare
0d8020a
to
d73699e
Compare
769dea2
to
3b1c927
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments. Mostly missing or unclear docstrings, and some tests that should be added.
One of the questions relates to execution speed, and you can feel free to tell me that it's irrelevant in the context of validate_drp.
etc/metrics.yaml
Outdated
description: Separation distance | ||
bin_range_operator: | ||
value: "<=" | ||
description: Are we looking at correlations less than D or greater than D. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be phrased as a question?
etc/metrics.yaml
Outdated
description: Separation distance | ||
bin_range_operator: | ||
value: ">=" | ||
description: Are we looking at correlations less than D or greater than D. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same phrasing question as above.
@@ -8,3 +8,4 @@ | |||
from .amx import AMxMeasurement # NOQA | |||
from .afx import AFxMeasurement # NOQA | |||
from .adx import ADxMeasurement # NOQA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need # NOQA
here? I thought we just ignored the flake8 comments on __init__.py
since it's a weird file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#NOQA for from .tex
is in keeping with previous practice for these SRD metrics. Happy to consider overall #NOQA use in future ticket.
---------- | ||
metric : `lsst.validate.base.Metric` | ||
An TE1 or TE2 `~lsst.validate.base.Metric` instance. | ||
matchedDataset : lsst.validate.drp.matchreduce.MatchedMultiVisitDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matchedDataset of what?
filter_name : `str` | ||
filter_name (filter name) used in this measurement (e.g., ``'r'``). | ||
width : `float` or `astropy.units.Quantity`, optional | ||
Width around fiducial distance to include. [arcmin] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
width
doesn't appear in the argument list.
@@ -140,6 +148,32 @@ def sphDist(ra1, dec1, ra2, dec2): | |||
return dist | |||
|
|||
|
|||
def averageRaFromCat(cat): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though validate_drp isn't a particularly time-critical operation, I do worry that your use of these "only RA" , "only Dec" functions is slow, given you're throwing away the other average. I see you're using it in the aggregator for TEx: how many values are typically input?
return meanRa | ||
|
||
|
||
def averageDecFromCat(cat): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least a one-line docstring for all of these new functions.
python/lsst/validate/drp/validate.py
Outdated
outputPrefix=outputPrefix) | ||
except RuntimeError as e: | ||
print(e) | ||
print('\tSkipped plot{}'.format(texName)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be using the lsst.log
system to log these errors?
self.assertFloatsAlmostEqual(exp_all_avg_xip, obs_all_avg_xip, rtol=1e-7) | ||
self.assertFloatsAlmostEqual(exp_all_avg_xip_err, obs_all_avg_xip_err, rtol=1e-7) | ||
|
||
exp_rlt2_avg_xip, exp_rlt2_avg_xip_err = 2.8171429e-05, 1.2857143e-06 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a note about how these numbers were arrived at.
select_bin_from_corr(r, xip, xip_err, radius=1, operator=operator.ge) | ||
self.assertFloatsAlmostEqual(exp_rge1_avg_xip, obs_rge1_avg_xip, rtol=1e-7) | ||
self.assertFloatsAlmostEqual(exp_rge1_avg_xip_err, obs_rge1_avg_xip_err, rtol=1e-7) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you also want some tests for your correlation_function_ellipticity*
functions, and probably for the added utils
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Tests added for correlation_function_ellipticity
.
The utility functions that require a MatchedMultiVisitDataset
object, a catalog, or ashape
object are much more complicated to mock up than seems justified given the simplicity of the functions.
tests/test_tex.py
Outdated
print("EXP_xip_err: ", repr(exp_xip_err)) | ||
print("OBS_xip_err: ", repr(obs_xip_err)) | ||
self.assertFloatsAlmostEqual(exp_r, obs_r, atol=1e-4 * u.arcmin) | ||
self.assertFloatsAlmostEqual(exp_xip, obs_xip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading this correctly, you're expecting the correlation function to be identically 0 everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, sorry. The tests are a work in progress that I started after submitting the ticket for review and realizing that I needed some tests. I'm hoping to have them fully implemented by Monday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry! Carry on then. I won't look at things until you tell me to.
403218b
to
281eb19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments on the new code. I like your approach to the ellipticity tests.
self.assertFloatsAlmostEqual(exp_rlt2_avg_xip, obs_rlt2_avg_xip, rtol=1e-7) | ||
self.assertFloatsAlmostEqual(exp_rlt2_avg_xip_err, obs_rlt2_avg_xip_err, rtol=1e-7) | ||
|
||
exp_rge1_avg_xip, exp_rge1_avg_xip_err = 4.3400000e-05, 1.8000000e-06 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another note about where these numbers come from.
tests/test_tex.py
Outdated
Which in turn references | ||
http://adsabs.harvard.edu/abs/2002A%26A...389..729S | ||
""" | ||
rand_seed = 1238625876 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this choice for random seed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arbitrary.
random.seed(rand_seed) | ||
# Yes, a million. N this high and L this large | ||
# gets xip within 1e-7 absolute of the analytic limit. | ||
# Takes 25 seconds to run on an early-2015 MacBook Air 2.2 GHz Intel Core i7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useful comment.
""" | ||
# Translate to 'verbose_level' here to refer to the integer levels in TreeCorr | ||
# While 'verbose' is more generically what is being passed around | ||
# for verbosity within 'validate_drp' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the extra spaces at the start here?
xip = gg.xip * u.Unit('') | ||
xip_err = np.sqrt(gg.varxi) * u.Unit('') | ||
|
||
return (r, xip, xip_err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to return a namedtuple
, so the user doesn't have to keep track of which index is what thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree, but I've been generically discouraged from using namedtuple
in favor of pipe_base struct. But I don't want to invoke pipe_base struct here.
@@ -552,3 +552,62 @@ def plotAMx(amx, afx, filterName, amxSpecName='design', outputPrefix=""): | |||
plt.tight_layout() # fix padding | |||
plt.savefig(plotPath, dpi=300) | |||
plt.close(fig) | |||
|
|||
|
|||
def plotTEx(tex, filterName, texSpecName='design', outputPrefix=''): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, shouldn't this be plot_TEx
?
|
||
ax1.legend(loc='upper right', fontsize=16) | ||
|
||
pathFormat = '{prefix}{metric}_D_{D:d}_{Dunits}.{ext}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. A comment to that effect would be good (prefix suggests directory to me).
This is based on Bob Armstrong's ellipticity calculation. done in DM-3040
b369249
to
c3b0db1
Compare
Update TEx page reference to 31. Specify D.
Remove unused magRange option from TEx. Also remove docstring description of non-existent 'width' parameter.
Specifically motivated by ellipticity correlation metrics which ~1e-5.
Adds a 'bin_operator' that specifies whether the metric is for <= given radius value (TE1) or >= given radius value(TE2) Add descriptions to TE1, TE2 parameters.
Remove PLOTting hardcoded stuff from tex.py Comment TE plot lines. Minor whitespace edits.
This matches the log radius binning that was used when computing the correlation function in the first place.
Refactoring initially motivated by desire to make it easier to test the key algorithm without having to construct a MatchedMultiVisitDataset object. Also may allow for future external use from more flexible/direct interface.
Use example test case from TreeCorr to verify wrapper function. TreeCorr itself has much more extensive tests. The test here in validate_drp is to make sure that we're calling TreeCorr correctly in correlate_function_ellipticity and, e.g., getting units correct. Also adds a simple test of select_bin_from_corr.
Make nbins, min_sep, max_sep, verbose keyword arguments passed through to treecorr.GGCorrelation
There's a significant performance penalty in using str keys when looking things up in AFW tables.
Translate verbose=True to Treecorr verbose=2.
Factor of 50 (yes, fifty) increase in performance.
Previously was 'for i, s in unmerate(cat)' Now use extracts entire column at a time.
'calculate_' wasn't a particularly informative prefix. Most functions that return something calculate something.
Had had these reversed from the start of this branch.
Quantities within machine precision of 0 can trigger NaN in numpy.sqrt. Use numpy.lib.scimath.sqrt in complex ellipticity calculation to handle this cleanly.
Also clarify that a random seed in the tests was arbitrarily chosen.
bb2e233
to
6e851d0
Compare
No description provided.