DM-33123: Calculate RhoStatistics as part of regular QA #35

arunkannawadi · 2022-04-23T03:42:45Z

No description provided.

jmeyers314

LGTM. Just some minor comments.

jmeyers314 · 2022-05-13T22:36:44Z

pipelines/visitQARhoPlots.yaml

+    config:
+      rhoStatisticsAction.treecorr.nbins: 21
+      rhoStatisticsAction.treecorr.min_sep: 0.01
+      rhoStatisticsAction.treecorr.max_sep: 80


Why is this one 80, but the coadd 100? I guess 100 arcsec exceeds the HSC field diameter, but doesn't for DECam or Rubin. Is there a way to make these instrument specific?

These will probably be made instrument specific later on when they make their way into obs packages I believe. At the visit level, these are bound to be too noisy anyway that I don't think they'd be used in practice.

jmeyers314 · 2022-05-13T22:40:06Z

python/lsst/analysis/drp/_treecorrConfig.py

+        dtype=str,
+        optional=True,
+        allowed={
+            units: units for units in ["arcmin", "arc", "degrees", "hours", "radians"]


units: ["arcmin", "arc", "degrees", "hours", "radians"] would be simpler, does it not work here for some reason?

I see more examples down below, so maybe I'm just out-of-the-loop here.

I tried that first, but allowed had to be a dictionary where the key are the allowed options and the value being it's documentation.

Please use standard unit strings. "hours" is not understood by Astropy and should be "h" or "hour" (and in fact the plural forms are not allowed). What unit is arc?

That was supposed to be arcsec. Good catch Tim. I intended this to be some sort of a thin wrapper around treecorr config (only because our DictField doesn't accept values of different types). I could implement a mapper that converts standard units to treecorr units, but I'd rather fix it (let you fix it) in treecorr or deal with a mapper later on. At the moment, this config is of limited use beyond making rho statistics.

Looks like treecorr should accept standard units (despite the TreeCorr.Catalog docstring). See: https://github.com/rmjarvis/TreeCorr/blob/86afbdca38aac6325bc95cdc101b771c8ebb02f3/treecorr/config.py#L88-L104. Maybe the cleanest resolution is to make a map so either the docstring units or the standard units will work.

Ah, we can stick with standard units since it only looks for the beginning of the string.

jmeyers314 · 2022-05-13T22:44:08Z

python/lsst/analysis/drp/_treecorrConfig.py

+                "You must specify exactly three of ``nbins``, ``bin_size``, ``min_sep`` and ``max_sep``"
+                f" in treecorr_config. {num_req_params} parameters were set instead."
+            )
+            raise FieldValidationError(self.bin_size, self, msg)


I suppose you can add some checks here for things like min_sep < max_sep etc.

Good point. treecorr will throw an error if that's the case, so we should catch it earlier on.

jmeyers314 · 2022-05-13T22:57:32Z

python/lsst/analysis/drp/calcFunctors.py

+                             "(Ixx + Iyy + 2*sqrt(Ixx*Iyy - Ixy**2))"
+                             ),
+                 },
+        default="chi",


Interesting. I see that Jarvis++15 (where rho3-5 are defined) use epsilon. I guess it doesn't matter as long as one is making consistent comparisons. It may be good to document that there's a factor of ~4 difference depending on the definition here.

jmeyers314 · 2022-05-13T22:58:02Z

python/lsst/analysis/drp/calcFunctors.py

+                 "determinant": "determinant radius",
+                 },
+    )
+    treecorr = ConfigField(


jmeyers314 · 2022-05-13T23:04:35Z

python/lsst/analysis/drp/calcFunctors.py

+            function.
+        """
+        xy = treecorr.GGCorrelation(**treecorrKwargs)
+        catA = treecorr.Catalog(ra=ra, dec=dec, g1=g1a, g2=g2a, ra_units=raUnits, dec_units=decUnits)


Looks like you're making more Catalogs than necessary. Here's my rho-stats function that I got from Mike, which computes rho1-5 with just 3 Catalogs:

def getRho(data): cat_g = treecorr.Catalog( ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad', g1=data['star_e1'], g2=data['star_e2'] ) cat_dg = treecorr.Catalog( ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad', g1=data['de1'], g2=data['de2'] ) cat_gdTT = treecorr.Catalog( ra=data['ra'], dec=data['dec'], ra_units='rad', dec_units='rad', g1=data['star_e1']*data['dT']/data['star_T'], g2=data['star_e2']*data['dT']/data['star_T'] ) kwargs = dict(min_sep=0.5, max_sep=300.0, bin_size=0.2, sep_units='arcmin', metric='Arc') rho1 = treecorr.GGCorrelation(**kwargs) rho1.process(cat_dg) rho2 = treecorr.GGCorrelation(**kwargs) rho2.process(cat_dg, cat_g) rho3 = treecorr.GGCorrelation(**kwargs) rho3.process(cat_gdTT) rho4 = treecorr.GGCorrelation(**kwargs) rho4.process(cat_dg, cat_gdTT) rho5 = treecorr.GGCorrelation(**kwargs) rho5.process(cat_g, cat_gdTT) return rho1, rho2, rho3, rho4, rho5

Not worth changing unless there are efficiency problems possibly, but thought I'd let you know.

For coaddQAEllipPlots.yaml it doesn't seem to take all that much more time than it takes to make other plots. I've been thinking given ra and dec columns are going to be the same, it shouldn't even have to build a kdtree thrice but only once, but that's another issue :-)

arunkannawadi · 2022-05-16T23:58:42Z

python/lsst/analysis/drp/_treecorrConfig.py

+            "Which metric to use for distance measurements. For details, see "
+            "https://rmjarvis.github.io/TreeCorr/_build/html/metric.html"
+        ),
+        default="Euclidean",


@jmeyers314 I remember you had used Arc metric in your notebook. Given that we are expecting to run this on tract-level scales, I doubt it makes much difference. If you'd rather keep it as Arc for consistency, we could make that the default.

Yeah, I'd go with Arc for correctness, even though you're right it likely doesn't make a difference at the tract scale.

I made Arc as the default in rhoStatistics calculation. I've left the defaults for the config as it is, as I wanted it to correspond to the defaults of treecorr itself.

arunkannawadi · 2022-05-17T00:00:26Z

python/lsst/analysis/drp/calcFunctors.py

@@ -616,15 +955,18 @@ def validate(self):
            color1_errors = True
        elif ((self.color1_flux1_err and not self.color1_flux2_err)
              or (not self.color1_flux1_err and self.color1_flux2_err)):
-            raise ValueError("Must set both color1_flux1_err and color1_flux2_err if either is set.")
+            msg = "Must set both color1_flux1_err and color1_flux2_err if either is set."
+            raise FieldValidationError(self.__class__.color1_flux1_err, self, msg)


I hijacked a few other commits into this ticket since these are probably not worth making a ticket on its own. Especially since @timj is looking at this PR :-)

The config contains fields that correspond to various keyword arguments that can be passed on to treecorr.BinnedCorr2 and its derived classes such as GGCorrelation, KKCorrelation etc. used in Rho statistics calculations. The necessity to define a new classs instead of using DictField is to i) validate the config up front ii) allow mixed types and complete control over treecorr configuration.

The reason to not add this to visitQAEllipPlots.yaml is that, running coadds on visit-levels is much more time consuming than all other visit-level plots combined. A stand-alone pipeline is provided so plots can be generated on a need-to-see basis.

The documentation of the ``validate`` method in the parent class asserts that it raises ``FieldValidationError`` when misconfigured. Thus, instances of ``ValueError`` raises must be replaced.

arunkannawadi requested a review from jmeyers314 April 23, 2022 03:42

arunkannawadi force-pushed the tickets/DM-33123 branch 4 times, most recently from 7d923b3 to 5dee524 Compare May 13, 2022 18:22

jmeyers314 approved these changes May 16, 2022

View reviewed changes

arunkannawadi force-pushed the tickets/DM-33123 branch 3 times, most recently from 8961aeb to d138de3 Compare May 16, 2022 23:55

arunkannawadi commented May 17, 2022

View reviewed changes

arunkannawadi force-pushed the tickets/DM-33123 branch 3 times, most recently from ff0b367 to b6dbfcf Compare May 17, 2022 16:19

arunkannawadi added 11 commits May 17, 2022 12:10

Add a functor to calculate Rho statistics

bbdcb2f

Add a task to compute Rho statistics on tracts

b5a7dba

Add a task to compute Rho statistics on visits

97bdeeb

Add tract-level Rho statistics to coadd pipeline

c825bd9

Raise FieldValidationError in ColorDiffPull

83ac1e0

The documentation of the ``validate`` method in the parent class asserts that it raises ``FieldValidationError`` when misconfigured. Thus, instances of ``ValueError`` raises must be replaced.

Raise FieldValidationError in CalcE1 and CalcShapeSize

2298480

Add a validate method for CalcEDiff

f3630d6

Change Reference to References in CalcE

d409642

Update conf.py in doc folder

8f16051

arunkannawadi force-pushed the tickets/DM-33123 branch from b6dbfcf to 8f16051 Compare May 17, 2022 17:11

arunkannawadi merged commit 9c2632b into main May 19, 2022

This was referenced May 19, 2022

Revert "DM-33123: Calculate RhoStatistics as part of regular QA" #42

Merged

DM-33123: Calculate RhoStatistics as part of regular QA #44

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-33123: Calculate RhoStatistics as part of regular QA #35

DM-33123: Calculate RhoStatistics as part of regular QA #35

arunkannawadi commented Apr 23, 2022

jmeyers314 left a comment

jmeyers314 May 13, 2022

arunkannawadi May 16, 2022

jmeyers314 May 13, 2022

jmeyers314 May 13, 2022

arunkannawadi May 16, 2022

timj May 16, 2022

arunkannawadi May 16, 2022 •

edited

jmeyers314 May 17, 2022

arunkannawadi May 17, 2022

jmeyers314 May 13, 2022

arunkannawadi May 16, 2022

jmeyers314 May 13, 2022

jmeyers314 May 13, 2022

jmeyers314 May 13, 2022

arunkannawadi May 16, 2022

arunkannawadi May 16, 2022

jmeyers314 May 17, 2022

arunkannawadi May 17, 2022

arunkannawadi May 17, 2022

DM-33123: Calculate RhoStatistics as part of regular QA #35

DM-33123: Calculate RhoStatistics as part of regular QA #35

Conversation

arunkannawadi commented Apr 23, 2022

jmeyers314 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arunkannawadi May 16, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arunkannawadi May 16, 2022 •

edited