Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-33123: Calculate RhoStatistics as part of regular QA #35

Merged
merged 11 commits into from May 19, 2022

Conversation

arunkannawadi
Copy link
Member

No description provided.

@arunkannawadi arunkannawadi force-pushed the tickets/DM-33123 branch 4 times, most recently from 7d923b3 to 5dee524 Compare May 13, 2022 18:22
Copy link

@jmeyers314 jmeyers314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just some minor comments.

config:
rhoStatisticsAction.treecorr.nbins: 21
rhoStatisticsAction.treecorr.min_sep: 0.01
rhoStatisticsAction.treecorr.max_sep: 80

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this one 80, but the coadd 100? I guess 100 arcsec exceeds the HSC field diameter, but doesn't for DECam or Rubin. Is there a way to make these instrument specific?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will probably be made instrument specific later on when they make their way into obs packages I believe. At the visit level, these are bound to be too noisy anyway that I don't think they'd be used in practice.

dtype=str,
optional=True,
allowed={
units: units for units in ["arcmin", "arc", "degrees", "hours", "radians"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

units: ["arcmin", "arc", "degrees", "hours", "radians"] would be simpler, does it not work here for some reason?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see more examples down below, so maybe I'm just out-of-the-loop here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that first, but allowed had to be a dictionary where the key are the allowed options and the value being it's documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use standard unit strings. "hours" is not understood by Astropy and should be "h" or "hour" (and in fact the plural forms are not allowed). What unit is arc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was supposed to be arcsec. Good catch Tim. I intended this to be some sort of a thin wrapper around treecorr config (only because our DictField doesn't accept values of different types). I could implement a mapper that converts standard units to treecorr units, but I'd rather fix it (let you fix it) in treecorr or deal with a mapper later on. At the moment, this config is of limited use beyond making rho statistics.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like treecorr should accept standard units (despite the TreeCorr.Catalog docstring). See: https://github.com/rmjarvis/TreeCorr/blob/86afbdca38aac6325bc95cdc101b771c8ebb02f3/treecorr/config.py#L88-L104. Maybe the cleanest resolution is to make a map so either the docstring units or the standard units will work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we can stick with standard units since it only looks for the beginning of the string.

"You must specify exactly three of ``nbins``, ``bin_size``, ``min_sep`` and ``max_sep``"
f" in treecorr_config. {num_req_params} parameters were set instead."
)
raise FieldValidationError(self.bin_size, self, msg)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you can add some checks here for things like min_sep < max_sep etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. treecorr will throw an error if that's the case, so we should catch it earlier on.

"(Ixx + Iyy + 2*sqrt(Ixx*Iyy - Ixy**2))"
),
},
default="chi",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I see that Jarvis++15 (where rho3-5 are defined) use epsilon. I guess it doesn't matter as long as one is making consistent comparisons. It may be good to document that there's a factor of ~4 difference depending on the definition here.

"determinant": "determinant radius",
},
)
treecorr = ConfigField(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

function.
"""
xy = treecorr.GGCorrelation(**treecorrKwargs)
catA = treecorr.Catalog(ra=ra, dec=dec, g1=g1a, g2=g2a, ra_units=raUnits, dec_units=decUnits)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're making more Catalogs than necessary. Here's my rho-stats function that I got from Mike, which computes rho1-5 with just 3 Catalogs:

def getRho(data):
    cat_g = treecorr.Catalog(
        ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
        g1=data['star_e1'], g2=data['star_e2']
    )
    cat_dg = treecorr.Catalog(
        ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
        g1=data['de1'], g2=data['de2']
    )
    cat_gdTT = treecorr.Catalog(
        ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad',
        g1=data['star_e1']*data['dT']/data['star_T'], g2=data['star_e2']*data['dT']/data['star_T']
    )
    kwargs = dict(min_sep=0.5, max_sep=300.0, bin_size=0.2, sep_units='arcmin', metric='Arc')
    rho1 = treecorr.GGCorrelation(**kwargs)
    rho1.process(cat_dg)
    rho2 = treecorr.GGCorrelation(**kwargs)
    rho2.process(cat_dg, cat_g)
    rho3 = treecorr.GGCorrelation(**kwargs)
    rho3.process(cat_gdTT)
    rho4 = treecorr.GGCorrelation(**kwargs)
    rho4.process(cat_dg, cat_gdTT)
    rho5 = treecorr.GGCorrelation(**kwargs)
    rho5.process(cat_g, cat_gdTT)
    return rho1, rho2, rho3, rho4, rho5

Not worth changing unless there are efficiency problems possibly, but thought I'd let you know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For coaddQAEllipPlots.yaml it doesn't seem to take all that much more time than it takes to make other plots. I've been thinking given ra and dec columns are going to be the same, it shouldn't even have to build a kdtree thrice but only once, but that's another issue :-)

@arunkannawadi arunkannawadi force-pushed the tickets/DM-33123 branch 3 times, most recently from 8961aeb to d138de3 Compare May 16, 2022 23:55
"Which metric to use for distance measurements. For details, see "
"https://rmjarvis.github.io/TreeCorr/_build/html/metric.html"
),
default="Euclidean",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmeyers314 I remember you had used Arc metric in your notebook. Given that we are expecting to run this on tract-level scales, I doubt it makes much difference. If you'd rather keep it as Arc for consistency, we could make that the default.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd go with Arc for correctness, even though you're right it likely doesn't make a difference at the tract scale.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made Arc as the default in rhoStatistics calculation. I've left the defaults for the config as it is, as I wanted it to correspond to the defaults of treecorr itself.

@@ -616,15 +955,18 @@ def validate(self):
color1_errors = True
elif ((self.color1_flux1_err and not self.color1_flux2_err)
or (not self.color1_flux1_err and self.color1_flux2_err)):
raise ValueError("Must set both color1_flux1_err and color1_flux2_err if either is set.")
msg = "Must set both color1_flux1_err and color1_flux2_err if either is set."
raise FieldValidationError(self.__class__.color1_flux1_err, self, msg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hijacked a few other commits into this ticket since these are probably not worth making a ticket on its own. Especially since @timj is looking at this PR :-)

@arunkannawadi arunkannawadi force-pushed the tickets/DM-33123 branch 3 times, most recently from ff0b367 to b6dbfcf Compare May 17, 2022 16:19
The config contains fields that correspond to various keyword arguments
that can be passed on to treecorr.BinnedCorr2 and its derived classes
such as GGCorrelation, KKCorrelation etc. used in Rho statistics
calculations. The necessity to define a new classs instead of using
DictField is to i) validate the config up front ii) allow mixed types
and complete control over treecorr configuration.
The reason to not add this to visitQAEllipPlots.yaml is that, running
coadds on visit-levels is much more time consuming than all other
visit-level plots combined. A stand-alone pipeline is provided so plots
can be generated on a need-to-see basis.
The documentation of the ``validate`` method in the parent class asserts
that it raises ``FieldValidationError`` when misconfigured. Thus,
instances of ``ValueError`` raises must be replaced.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants