New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-33123: Calculate RhoStatistics as part of regular QA #35
Conversation
7d923b3
to
5dee524
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just some minor comments.
config: | ||
rhoStatisticsAction.treecorr.nbins: 21 | ||
rhoStatisticsAction.treecorr.min_sep: 0.01 | ||
rhoStatisticsAction.treecorr.max_sep: 80 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this one 80
, but the coadd 100
? I guess 100 arcsec exceeds the HSC field diameter, but doesn't for DECam or Rubin. Is there a way to make these instrument specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These will probably be made instrument specific later on when they make their way into obs packages I believe. At the visit level, these are bound to be too noisy anyway that I don't think they'd be used in practice.
dtype=str, | ||
optional=True, | ||
allowed={ | ||
units: units for units in ["arcmin", "arc", "degrees", "hours", "radians"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
units: ["arcmin", "arc", "degrees", "hours", "radians"]
would be simpler, does it not work here for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see more examples down below, so maybe I'm just out-of-the-loop here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that first, but allowed
had to be a dictionary where the key are the allowed options and the value being it's documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use standard unit strings. "hours" is not understood by Astropy and should be "h" or "hour" (and in fact the plural forms are not allowed). What unit is arc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was supposed to be arcsec
. Good catch Tim. I intended this to be some sort of a thin wrapper around treecorr config (only because our DictField doesn't accept values of different types). I could implement a mapper that converts standard units to treecorr units, but I'd rather fix it (let you fix it) in treecorr or deal with a mapper later on. At the moment, this config is of limited use beyond making rho statistics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like treecorr
should accept standard units (despite the TreeCorr.Catalog
docstring). See: https://github.com/rmjarvis/TreeCorr/blob/86afbdca38aac6325bc95cdc101b771c8ebb02f3/treecorr/config.py#L88-L104. Maybe the cleanest resolution is to make a map so either the docstring units or the standard units will work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, we can stick with standard units since it only looks for the beginning of the string.
"You must specify exactly three of ``nbins``, ``bin_size``, ``min_sep`` and ``max_sep``" | ||
f" in treecorr_config. {num_req_params} parameters were set instead." | ||
) | ||
raise FieldValidationError(self.bin_size, self, msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you can add some checks here for things like min_sep < max_sep
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. treecorr
will throw an error if that's the case, so we should catch it earlier on.
"(Ixx + Iyy + 2*sqrt(Ixx*Iyy - Ixy**2))" | ||
), | ||
}, | ||
default="chi", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I see that Jarvis++15 (where rho3-5 are defined) use epsilon
. I guess it doesn't matter as long as one is making consistent comparisons. It may be good to document that there's a factor of ~4 difference depending on the definition here.
"determinant": "determinant radius", | ||
}, | ||
) | ||
treecorr = ConfigField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space
function. | ||
""" | ||
xy = treecorr.GGCorrelation(**treecorrKwargs) | ||
catA = treecorr.Catalog(ra=ra, dec=dec, g1=g1a, g2=g2a, ra_units=raUnits, dec_units=decUnits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you're making more Catalog
s than necessary. Here's my rho-stats function that I got from Mike, which computes rho1-5 with just 3 Catalog
s:
def getRho(data):
cat_g = treecorr.Catalog(
ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
g1=data['star_e1'], g2=data['star_e2']
)
cat_dg = treecorr.Catalog(
ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
g1=data['de1'], g2=data['de2']
)
cat_gdTT = treecorr.Catalog(
ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
g1=data['star_e1']*data['dT']/data['star_T'], g2=data['star_e2']*data['dT']/data['star_T']
)
kwargs = dict(min_sep=0.5, max_sep=300.0, bin_size=0.2, sep_units='arcmin', metric='Arc')
rho1 = treecorr.GGCorrelation(**kwargs)
rho1.process(cat_dg)
rho2 = treecorr.GGCorrelation(**kwargs)
rho2.process(cat_dg, cat_g)
rho3 = treecorr.GGCorrelation(**kwargs)
rho3.process(cat_gdTT)
rho4 = treecorr.GGCorrelation(**kwargs)
rho4.process(cat_dg, cat_gdTT)
rho5 = treecorr.GGCorrelation(**kwargs)
rho5.process(cat_g, cat_gdTT)
return rho1, rho2, rho3, rho4, rho5
Not worth changing unless there are efficiency problems possibly, but thought I'd let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For coaddQAEllipPlots.yaml
it doesn't seem to take all that much more time than it takes to make other plots. I've been thinking given ra
and dec
columns are going to be the same, it shouldn't even have to build a kdtree thrice but only once, but that's another issue :-)
8961aeb
to
d138de3
Compare
"Which metric to use for distance measurements. For details, see " | ||
"https://rmjarvis.github.io/TreeCorr/_build/html/metric.html" | ||
), | ||
default="Euclidean", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmeyers314 I remember you had used Arc
metric in your notebook. Given that we are expecting to run this on tract-level scales, I doubt it makes much difference. If you'd rather keep it as Arc
for consistency, we could make that the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd go with Arc
for correctness, even though you're right it likely doesn't make a difference at the tract scale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made Arc
as the default in rhoStatistics
calculation. I've left the defaults for the config as it is, as I wanted it to correspond to the defaults of treecorr
itself.
@@ -616,15 +955,18 @@ def validate(self): | |||
color1_errors = True | |||
elif ((self.color1_flux1_err and not self.color1_flux2_err) | |||
or (not self.color1_flux1_err and self.color1_flux2_err)): | |||
raise ValueError("Must set both color1_flux1_err and color1_flux2_err if either is set.") | |||
msg = "Must set both color1_flux1_err and color1_flux2_err if either is set." | |||
raise FieldValidationError(self.__class__.color1_flux1_err, self, msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hijacked a few other commits into this ticket since these are probably not worth making a ticket on its own. Especially since @timj is looking at this PR :-)
ff0b367
to
b6dbfcf
Compare
The config contains fields that correspond to various keyword arguments that can be passed on to treecorr.BinnedCorr2 and its derived classes such as GGCorrelation, KKCorrelation etc. used in Rho statistics calculations. The necessity to define a new classs instead of using DictField is to i) validate the config up front ii) allow mixed types and complete control over treecorr configuration.
The reason to not add this to visitQAEllipPlots.yaml is that, running coadds on visit-levels is much more time consuming than all other visit-level plots combined. A stand-alone pipeline is provided so plots can be generated on a need-to-see basis.
The documentation of the ``validate`` method in the parent class asserts that it raises ``FieldValidationError`` when misconfigured. Thus, instances of ``ValueError`` raises must be replaced.
b6dbfcf
to
8f16051
Compare
No description provided.