add new validation datasets to galaxy-shear correlation test (#128) #131

chihway · 2018-08-03T20:18:06Z

This PR is for developing tests described in issue #128 . The new test DeltaSigma effectively contains and expands the original DeltaSigmaTest (https://github.com/LSSTDESC/descqa/blob/master/descqa/DeltaSigmaTest.py#L51).

Initial results can be found below:

- the original test, SDSS lowz sample from Singh et al. (2015): https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-03_77&test=delta_sigma_sdss_lowz
- the CFHTLenS sample from Velander et al. (2013): https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-03_42&test=delta_sigma_cfhtlens
  --> note that we do not yet have the data points from the paper
- the SDSS main sample from Mandelbaum et al. (2016): https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-03_74&test=delta_sigma_sdss_main
  --> the color cut probably still needs tuning: the paper asked for colors based on magnitudes k-corrected to z=0.1, whereas I was using the absolute magnitudes...

[edited by @yymao: fix #128]

aphearin · 2018-08-03T21:23:22Z

@chihway - very nice validation test! This is great to see, I think these will be highly constraining data for any model, so thanks for implementing this.

For the SDSS and CFHTLenS plots with multiple lines, it might be clearer to chop up the information a little differently. One of the first things I always look at in plots like this is the red & blue samples of the same r-band (or stellar mass), over-plotted in the same panel. That makes the color-dependence of the signal visually jump out. It's also nice to see the r-dependence though, so here is an idea for how to chop things up. For CFHTLenS, there are 8 bins spanning -24.5 < r < -20; showing all 8 may be a bit too granular, so maybe one four-panel plot showing a single red & blue curve in each panel, plotted for every other r-band bin. Then a separate single-panel plot showing four curves for red-samples-only for the same every-other-r-band-bin, and a mirror single-panel plot for blue-samples-only. I think the same structure could work for SDSS, only using M* instead of r-band.

What do you think? Any other opinions on the clearest way to present the trends?
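
To make the proposed layout concrete, here is a minimal sketch of how the eight bins could be thinned and grouped into the three figures. The bin labels `L1`..`L8` and the function name `plan_panels` are hypothetical placeholders, not names from the test code, and no plotting is done here.

```python
# Hypothetical labels for the 8 r-band bins; the real bin edges come from
# the validation data and are not reproduced here.
r_bins = [f"L{i}" for i in range(1, 9)]


def plan_panels(bins, step=2):
    """Group bins into the three suggested figures (as plain data)."""
    selected = bins[::step]  # every other bin: 8 bins -> 4 bins
    return {
        # four-panel figure: one panel per bin, red and blue over-plotted
        "red_vs_blue": [(b, ("red", "blue")) for b in selected],
        # single-panel figure: one red curve per selected bin
        "red_only": [(b, "red") for b in selected],
        # mirror single-panel figure for the blue samples
        "blue_only": [(b, "blue") for b in selected],
    }


layout = plan_panels(r_bins)
```

Each entry of `layout["red_vs_blue"]` would map to one subplot, while the other two lists each map to a single set of axes.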

evevkovacs · 2018-08-04T09:18:25Z

@chihway Thanks very much. This looks very good. I think it would also be helpful to add a legend to the plots to make it immediately obvious which lines pertain to catalog and which to data, and I second Andrew's suggestion to reduce the busyness. I also like to add the selection cuts on the catalog and to the text files. See, for example, the ellipticity_distribution test. You could probably use the code from the latter test to make this addition. I'm happy to help with the implementation of these features if you have any questions.

chihway · 2018-08-04T11:45:28Z

Hi @aphearin @evevkovacs, thanks for the suggestions! Yep, I'll work on making the plots more readable next.

This may be a silly question, but do you guys know how I could implement this color cut using "g and r magnitudes k-corrected to z=0.1"? Also, for CFHTLenS, if we can't get hold of the validation data (the author has left the field and the co-authors don't know where the data is), is it useful to get approximate values by reading them off the plots in the paper or something like that?

evevkovacs · 2018-08-05T09:17:11Z

@chihway There is a K-correct code by Blanton (http://adsabs.harvard.edu/abs/2007AJ....133..734B), but I believe it is only valid for redshifts up to 0.5. I'm not quite sure what you want to achieve. Is the issue that the validation data being reported is for an observer at z=0.1? In that case, k-correcting the validation data may be the best option. (BTW, cosmoDC2 has magnitudes for SDSS filters available... you are using those, right?)
Re the missing data: I think it is worthwhile to try and get some points off the published plots (and also to try and locate the person who left the field)

aphearin · 2018-08-05T13:10:10Z

@chihway - in these situations I have used WebPlotDigitizer, which seems totally fine for this kind of application.

As for k-corrections, the ones we use are pre-computed and come from Galacticus; since the proper calculation requires the SED which we only save in a coarse form, it would not be so easy to provide colors k-corrected to a specific redshift (tagging @abensonca to confirm whether I am correct on this point). Is this mission-critical for your comparison? Or can you think of a workaround/approximation?

yymao · 2018-08-05T13:25:54Z

@chihway

K-correction is kind of a headache. In principle if there's a good k-correction package, we can just install it in the DESC environment and you can just call it from DESCQA. Unfortunately, the existing k-correction packages that I am aware of are all pretty awkward when it comes to installation and maintenance. This is still something we probably need to figure out anyway, because other people in DESC will probably need k-correction too.

However, I don't think this issue should be a showstopper for this test. I agree with @aphearin that (1) maybe just try ignoring k-correction and see how things go. We won't be comparing apples to apples but we ain't comparing apples to cars, and (2) we can do some approximation too. Cc @rongpu as he may have some insight about this.

abensonca · 2018-08-05T16:19:53Z

As @aphearin says you could do a crude approximation using the SED for each galaxy. The SED is only crudely tabulated though so it wouldn't be very accurate.

We could compute correctly k-corrected magnitudes in future runs of Galacticus (to one or more predefined redshifts), but that doesn't help you right now.

It also might not be the best thing to do - if you want to reproduce what would be done with real data then it's definitely better to apply some k-correction code (since this would presumably then result in the same errors and biases as you'd get computing k-corrections for real data).

rongpu · 2018-08-05T20:24:30Z

@chihway To obtain "g and r magnitudes k-corrected to z=0.1", you can try using this Python script, which applies kcorrect to create absolute magnitudes (k-corrected to a specific redshift): https://github.com/rongpu/descqa-color-test/blob/master/sdss_kcorrect.py

This script was used in the DC1 version of the DESCQA color distribution test, where we applied k-correction on SDSS data to the redshift (snapshot) of the mock galaxies. (Part of the code is on the DESCQA repo here but the SED-fitting part was not included.) It uses a Python wrapper (https://github.com/nirinA/kcorrect_python) and is kind of awkward as it operates on ASCII tables for input and output. The script applies to real SDSS data, but it should be rather straightforward to adapt it to e.g. protoDC2.

@yymao should know how to get kcorrect (and the Python wrapper) working on NERSC since he got it working for DESCQA1. In DESCQA1 the magnitude reconstruction was done on the fly (SED fitting was done offline), but that might not be necessary for protoDC2 since we might just need to run it once to z=0.1.

For future runs of Galacticus, I think it would be very useful to produce magnitudes that are k-corrected to z=0.1, since this choice is used in the SDSS Value-Added Galaxy Catalog and adopted in many papers. As @abensonca has mentioned, it is probably most accurate if we apply k-correction on the observed magnitudes of mock galaxies so that we match the procedures in real data. But as an initial test, the simple filter-convolved magnitudes at z=0.1 would still be very helpful.

rmandelb · 2018-08-05T21:37:23Z

Just a few additional comments:

- The CFHTLenS person left the field ~3 years ago, so if we want to validate against that data I'd rather try to use the plot digitizer than try to track them down (it would be a different story if they left within the past few months).
- I suspect the SDSS test will be more interesting on cosmoDC2 since the volume at low redshift is small and when combined with a small area, as in protoDC2, the results are fairly noisy.
- If we use kcorrect on the sims then we could actually reproduce something like the selection in the data, but for what it's worth I do not think it's crazy at all to simply use the true rest-frame (z=0) colors to define the red sequence vs. blue cloud and use that to define the red vs. blue samples. (That was how our split in the real data was defined, so it's conceptually the same thing but using different quantities based on what's most easily available in data vs. sims.) This is not apples-to-apples, but for what we're trying to get out of this test, I think it's fine.

Thanks for getting this set up, @chihway !

chihway · 2018-08-05T21:49:57Z

Thanks everyone, this is super useful!

So the main reason I thought the k-corrected colors might be relevant was that when I used res['Mag_true_g_sdss_z0'] - res['Mag_true_r_sdss_z0'] (as @rmandelb suggested), I saw the bulk of the red galaxies having slightly lower DeltaSigma than the data points and the blue ones slightly higher (https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-03_74&test=delta_sigma_sdss_main), which made me think that it might matter to have exactly the right colors. But then there may be a dozen other things going on there too.

Let me do the following and report back:

1. clean up plots
2. get CFHTLenS data points from plot digitizer
3. take a look at the g-r distribution and what z=0 colors threshold might be a good match to the z=0.1 threshold, maybe simply shifting that threshold will be enough... and if that doesn't look promising I'll investigate the k-correct script
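
For reference, the red/blue split on rest-frame color discussed above can be sketched as follows. The quantity names `Mag_true_g_sdss_z0` and `Mag_true_r_sdss_z0` are the ones quoted in this thread, and the default threshold of 0.7 is taken from the later commit "change SDSS main color cut to g-r>0.7". The function name `split_red_blue` and the toy galaxies are illustrative assumptions, not the actual test code.

```python
G_R_CUT = 0.7  # g-r threshold; tune this to mimic the z=0.1 cut from the paper


def split_red_blue(galaxies, cut=G_R_CUT):
    """Split galaxies into (red, blue) lists using rest-frame g-r color."""
    red, blue = [], []
    for gal in galaxies:
        g_minus_r = gal["Mag_true_g_sdss_z0"] - gal["Mag_true_r_sdss_z0"]
        # redder galaxies have larger g-r
        (red if g_minus_r > cut else blue).append(gal)
    return red, blue


# Toy catalog with made-up magnitudes, only to exercise the function.
toy = [
    {"Mag_true_g_sdss_z0": -20.0, "Mag_true_r_sdss_z0": -21.0},  # g-r = 1.0
    {"Mag_true_g_sdss_z0": -20.0, "Mag_true_r_sdss_z0": -20.25},  # g-r = 0.25
    {"Mag_true_g_sdss_z0": -19.0, "Mag_true_r_sdss_z0": -19.5},  # g-r = 0.5
]
red, blue = split_red_blue(toy)
```

Passing a different `cut` is one simple way to try the "shift the threshold" idea from item 3 without touching the k-correction machinery.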

chihway · 2018-08-06T01:49:19Z

updates...

CFHTLenS: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-05_15&test=delta_sigma_cfhtlens

SDSS main: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-08-05_18&test=delta_sigma_sdss_main

For CFHTLenS, we could probably run longer and get better statistics, but for SDSS, I agree with Rachel we'd probably want to go to cosmoDC2.

Comments?

yymao · 2018-08-16T20:53:17Z

@chihway want to check in with you on the status of this test, is it ready to be reviewed? If you need any help from me, please do let me know.

Also, I noticed you added a new file descqa/DeltaSigma.py instead of directly modifying existing descqa/DeltaSigmaTest.py. Is there a particular reason for that? Unless there's a strong reason, I think it's better to have only one class for the Delta Sigma test.

chihway · 2018-08-19T17:22:21Z

Hi @yymao , sorry for the slowness. I was wondering if @aphearin and @rmandelb have other comments, otherwise yes I think this is as far as the test can go now without being too quantitative.

In terms of the two tests, effectively I've merged descqa/DeltaSigmaTest.py into descqa/DeltaSigma.py, so if @EiffL is ok I think we can remove descqa/DeltaSigmaTest.py and the corresponding config file.

EiffL · 2018-08-20T17:04:49Z

@chihway It's certainly fine with me.

rmandelb · 2018-08-21T02:20:14Z

I think having the one test that includes SDSS Main, LOWZ, and CFHTLenS makes sense. And I'd be very curious to see this one run on cosmoDC2!

yymao · 2018-08-21T02:21:42Z

speaking of which, is shear now available in cosmoDC2, @patricialarsen?

aphearin · 2018-08-21T02:29:14Z

@yymao - shears are coming soon to a ~50 deg^2 mock that is currently in production. Should be ready by Wednesday.

chihway · 2018-09-22T16:35:32Z

Updates with cosmoDC2

cosmoDC2_v1.0_9556

2 is quite noisy, but using cosmoDC2_v1.0_image seems to be running into memory issues.

evevkovacs · 2018-09-25T23:27:24Z

@chihway Thanks, these look very nice. There are some sub-volumes available on subsets of the healpixels in cosmoDC2_v1.0_image. For example, try cosmoDC2_v1.0_9431_9812 which is a strip of healpixels across the middle of the image-sim area. This will run faster and use less memory than cosmoDC2_v1.0_image.

chihway · 2018-09-26T22:02:57Z

Thanks @evevkovacs .

Tried the cosmoDC2_v1.0_9431_9812 catalog and the results look pretty cool:
https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-09-26_4&test=delta_sigma_sdss_main
https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-09-26_7&test=delta_sigma_cfhtlens
The massive red galaxies match pretty well but less so on the low-mass end.

Any other suggestions from @aphearin @rmandelb ? I can clean up the plots further (and maybe run a much longer test) if you think it's ready to merge.

rmandelb · 2018-09-28T04:26:54Z

The low-mass end doesn't look great, but I think at this point a longer test is warranted; some bins are too noisy to say. The plot format etc. seems really good to me so just getting some better statistics seems like the last step before merging. Thanks!

yymao · 2018-09-28T16:10:15Z

@chihway note that you may need to submit a batch job if you want to run the test on the full catalog.

yymao · 2018-12-04T22:28:52Z

@chihway one thing that we found useful when diagnosing the catalog is the surface number density of lenses, which was printed out on the plot in Francois' original implementation. See:
https://github.com/LSSTDESC/descqa/blob/master/descqa/DeltaSigmaTest.py#L63
https://github.com/LSSTDESC/descqa/blob/master/descqa/DeltaSigmaTest.py#L133

Can you implement this in the current version too? Thanks!
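
As a rough illustration of the quantity being requested, the surface number density is just the lens count divided by the sky area. The sketch below assumes the area is known in square degrees and reports the density per square arcminute; the units used in the original implementation are not checked here, and the function name is hypothetical.

```python
def lens_surface_density(n_lenses, sky_area_deg2):
    """Return the number of lenses per square arcminute."""
    if sky_area_deg2 <= 0:
        raise ValueError("sky area must be positive")
    area_arcmin2 = sky_area_deg2 * 3600.0  # 1 deg^2 = 60 x 60 arcmin^2
    return n_lenses / area_arcmin2


# Example with made-up numbers: 180000 lenses over a 50 deg^2 footprint.
density = lens_surface_density(180_000, 50.0)
```

The resulting number can then be formatted into the plot title or legend next to the sample label.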

chihway · 2018-12-04T23:50:02Z

@yymao I added that on the plot. I only have time to make a quick run just to show the numbers now.

https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-12-04_12
https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-12-04_11
https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-12-04_5

yymao

Ready to go! Yay!

evevkovacs · 2020-12-02T01:01:14Z

I ran this test by cloning Chihway's repo and it crashed because of the following line:
`z = np.linspace(0, self.zmax, self.zmax*100)`
The fix was to replace it with:
`z = np.linspace(0, self.zmax, int(self.zmax)*100)`
Then I retested and now it crashes with:
`astropy.units.core.UnitConversionError: 'kg / (m Mpc)' and '' (dimensionless) are not convertible`
I have not tracked this latter error down yet.
I will ping Chihway about this.
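
The first crash is consistent with recent NumPy versions rejecting a non-integer sample count in `np.linspace`. One detail worth noting is that `int(self.zmax)*100` truncates `zmax` before scaling, so a fractional value such as 0.5 would give zero samples, whereas converting after the multiplication keeps roughly 100 samples per unit redshift. Whether `zmax` is ever fractional in the test configs is not stated in this thread. The pure-Python sketch below uses hypothetical helper names and only mirrors the endpoint-inclusive behaviour of `np.linspace(0, zmax, n)`.

```python
def n_grid_points(zmax, per_unit_z=100):
    """Integer sample count: scale first, then convert to int."""
    n = int(round(zmax * per_unit_z))
    return max(n, 2)  # keep at least the two endpoints


def redshift_grid(zmax, per_unit_z=100):
    """Evenly spaced grid from 0 to zmax inclusive."""
    n = n_grid_points(zmax, per_unit_z)
    step = zmax / (n - 1)
    return [i * step for i in range(n)]


grid = redshift_grid(0.5)
```

In the test itself the equivalent change would be to pass an integer such as `n_grid_points(self.zmax)` as the third argument of `np.linspace`.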

rmandelb · 2020-12-02T01:04:54Z

Sounds like something is expecting a surface density DeltaSigma (units of mass per unit area) and getting a weak lensing shear (dimensionless), or vice versa.
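
For reference, the standard relation between the two quantities is written out below. The excess surface density carries units of mass per area, while the tangential shear is dimensionless, and the critical surface density converts between them. Conventions (comoving versus physical distances) vary between papers and are not specified in this thread.

```latex
\Delta\Sigma(R) \;=\; \bar{\Sigma}(<R) - \Sigma(R) \;=\; \Sigma_{\rm crit}\,\gamma_t(R),
\qquad
\Sigma_{\rm crit} \;=\; \frac{c^2}{4\pi G}\,\frac{D_s}{D_l\,D_{ls}}
```

Here $D_l$, $D_s$ and $D_{ls}$ are the angular diameter distances to the lens, to the source, and between lens and source. Since $c^2/G$ has units of kg/m and the distance ratio has units of 1/Mpc, an unsimplified $\Sigma_{\rm crit}$ would carry exactly the `kg / (m Mpc)` unit seen in the traceback. That is only a plausible reading of the error, not a diagnosis.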

chihway · 2020-12-03T14:00:14Z

@evevkovacs I realized we might be talking about slightly different things. The plot I was thinking to go into the cosmoDC2 paper is this one: https://github.com/LSSTDESC/DC2-analysis/blob/master/contributed/TestGalaxyGalaxyLensingSDSSMain.ipynb. So not exactly a descqa test. I'll take a look at this test anyway, but I think the notebook linked above should be sufficient for the paper? Sorry for the confusion...

evevkovacs · 2020-12-03T16:30:03Z

@chihway I cannot get this notebook to render in my browser. Could you please post the plot here? Our goal is to have all the tests in the paper be a part of the DESCQA suite. Do you have someone who could turn this notebook into a test in the DESCQA framework? (We can help with this... it would be a good student project.) Thanks

yymao added validation: extragalactic validation data labels Aug 3, 2018

yymao requested review from yymao and rmandelb August 3, 2018 20:23

yymao changed the title ~~Issue/#128~~ add new validation datasets to galaxy-shear correlation test (#128) Dec 4, 2018

adding new DeltaSigma test that runs for sdss lowz and cfhtlens

278e1af

chihway and others added 6 commits December 5, 2018 08:28

include now the SDSS main config file and validation data.

4b0a9e4

prettify plots and change SDSS main color cut to g-r>0.7

c879db5

add cfhtlens data which was read off from the Velander et al. (2013) paper Fig 5

8e2353e

fix small bug in DeltaSigma.py, remove old DeltaSigmaTest.py

d9a4d9a

add number density on plot

233d83c

rename DeltaSigma.py -> DeltaSigmaTest.py

a2e48d9

yymao force-pushed the issue/#128 branch from 2cfac02 to a2e48d9 Compare December 5, 2018 13:32

Merge branch 'master' into issue/LSSTDESC#128

cb8a88a

yymao added the paper label Dec 1, 2020

Remove unused variables

bca0405

yymao approved these changes Dec 2, 2020

yymao merged commit 4e42d57 into LSSTDESC:master Dec 2, 2020

yymao mentioned this pull request Dec 2, 2020

Bug fix in DeltaSigmaTest #207

Merged
