size-luminosity relation test #13

Closed
3 of 4 tasks
yymao opened this issue Nov 6, 2017 · 103 comments

Comments

@yymao
Member

yymao commented Nov 6, 2017

  • code to reduce mock data
  • code that works within DESCQA framework
  • validation data
  • validation criteria
@yymao
Member Author

yymao commented Dec 14, 2017

@yymao attempted to start to write this test during the Sprint Week but failed due to the lack of a cloned Yao. As a result, no progress has been made on this test so far.

@rmandelb

rmandelb commented Dec 14, 2017 via email

@vvinuv
Contributor

vvinuv commented Dec 15, 2017

Does this test compare the luminosity-size scaling relation of observed objects with that of simulated objects?

@yymao
Member Author

yymao commented Dec 15, 2017

@vvinuv Yes!

@evevkovacs
Contributor

@vvinuv If you can point us to a good data set for comparison, this would be very helpful. Thanks!

@yymao
Member Author

yymao commented Dec 15, 2017

@vvinuv Or, even better, implement this test :)

@vvinuv
Contributor

vvinuv commented Dec 15, 2017

@evevkovacs We have found the luminosities and sizes of SDSS galaxies in this paper: http://adsabs.harvard.edu/abs/2015MNRAS.446.3943M . A catalog associated with that paper contains the luminosities and sizes of galaxies in the g, r and i filters.

@rmandelb

I would imagine there are CANDELS or other HST-based datasets that go to higher redshift and fainter magnitudes than SDSS. This would be useful in my opinion rather than only having a z<0.2 validation test.

@vvinuv
Contributor

vvinuv commented Jan 4, 2018

I found a few issues in validating this test. For the protoDC2 catalog, the sizes are given for the bulge and disk components. Therefore, to have a meaningful comparison we need similar observational data. As far as I know, two-dimensional decompositions are available mostly for SDSS galaxies and a few high-redshift HST-based BCGs (correct me if I am wrong). We cannot validate protoDC2 unless the catalog has a single-Sersic half-light radius. Another issue is that the Buzzard catalog gives only the FLUX_RADIUS parameter. I am not sure whether the Buzzard folks (@j-dr et al) plan to introduce the half-light radius of a Sersic component.

There are observational data available for galaxies with a single Sersic component at higher redshift (van der Wel et al 2014). Since there has been no progress on this issue, I have started writing a validation script for this purpose, at least for the Buzzard catalog for now.

@rmandelb

rmandelb commented Jan 5, 2018

There are 2-component Sersic fits in COSMOS, for galaxy samples going down to i~25.2 (they are getting noisy down there, so I wouldn't go to the flux limit). See http://adsabs.harvard.edu/abs/2014ApJS..212....5M appendix E. It just so happens that the server where the data can normally be downloaded from is down until mid-next week, but I can put it elsewhere for you if this sounds interesting/useful.

@yymao
Member Author

yymao commented Jan 5, 2018

@vvinuv there's also one-component size size_true in protoDC2 catalog version >= 2.1.2 (which is a luminosity weighted size given the two components, not exactly the same as single Sersic half light radius but should be a reasonable comparison).
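
For illustration, a luminosity-weighted combination of two component sizes could look like the sketch below. It assumes that "luminosity weighted" means a linear flux-weighted mean of the bulge and disk half-light radii; the function and argument names are hypothetical, and the actual definition of size_true should be checked against the catalog documentation.

```python
def luminosity_weighted_size(r_bulge, r_disk, lum_bulge, lum_disk):
    """Flux-weighted mean of the bulge and disk half-light radii.

    Assumption: "luminosity weighted" is a linear weighted mean. The
    catalog's size_true may be defined differently.
    """
    total = lum_bulge + lum_disk
    if total <= 0:
        raise ValueError("total luminosity must be positive")
    return (lum_bulge * r_bulge + lum_disk * r_disk) / total


# A disk-dominated galaxy (B/T = 0.2): the result sits close to the disk size.
print(luminosity_weighted_size(r_bulge=1.0, r_disk=4.0, lum_bulge=0.2, lum_disk=0.8))
```

Such a weighted mean is not a fitted quantity, so it need not coincide with the half-light radius from a single-Sersic fit to the same galaxy.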

@vvinuv
Contributor

vvinuv commented Jan 5, 2018

@rmandelb Thanks! I think it could be useful. However, I can wait until mid-next week. @yymao I think the luminosity-weighted radius is different from the radius obtained by fitting a single Sersic profile. I checked this elsewhere but don't have a figure at hand.

@j-dr
Contributor

j-dr commented Jan 5, 2018

@vvinuv, re: buzzard, FLUX_RADIUS is an estimate of the half-light radius. FLUX_RADIUS is just the name of the source extractor parameter that the size distributions in buzzard are based on.

@vvinuv
Contributor

vvinuv commented Jan 5, 2018 via email

@yymao
Member Author

yymao commented Jan 5, 2018

(It seems that @vvinuv is now working on this.)

@j-dr
Contributor

j-dr commented Jan 20, 2018

This test is also going to be quite important for CL in order to test the impact of blending on red sequence colors. I expect that we will need to use something at higher redshift as well for validation data as has already been brought up. I might be able to dig up some HSC data that could be useful for this.

@rmandelb

@vvinuv - following up on my comment on this thread from about 2 weeks ago (validation data), the server hosting the dataset that I mentioned is now online:
http://great3.jb.man.ac.uk/leaderboard/data
(scroll to the bottom, http://great3.jb.man.ac.uk/leaderboard/data/public/COSMOS_25.2_training_sample.tar.gz )

@j-dr - the dataset that I just linked to goes to i<25.2. The FITS are good for galaxies out to z~1. Do you think this is sufficient? I see you mentioned HSC, but my concern is that I'm not sure the galaxy radii from cmodel are that useful at high redshift. Have you seen section 6.4 in https://arxiv.org/pdf/1705.01599.pdf ? (It shows plots for axis ratio, but mentions that a similar problem exists for the galaxy sizes.)

Other thoughts on a validation dataset are welcome.

@vvinuv
Contributor

vvinuv commented Jan 25, 2018

@rmandelb Thanks! Could you tell me how you calculate the intensity at the half-light radius for the bulge and disk? Is it I = 10**((surface brightness * 2 * pi * half light radius - 5 log10(4 * np.pi * DL))/-2.5)?

@rmandelb

@vvinuv - The numbers in these files are all observed quantities, so luminosity distance is not relevant. The intensity comes directly from the files.

Have you looked at the README that is packaged in the tarball? It gives equations for various quantities related to the intensities and radii. If you are uncertain after reading the README I'm happy to discuss, but since equations and parameters are in there I wanted to make sure you knew that reference existed.

@vvinuv
Contributor

vvinuv commented Jan 25, 2018

@rmandelb I have read the README file and it says that SERSICFIT[0]: I, defined as the intensity at the half-light radius. However, I am not quite sure whether intensity means flux or the luminosity.

@rmandelb

Everything in those files is flux.
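
As a rough illustration of how an intensity at the half-light radius relates to a total flux, the sketch below integrates a standard Sersic profile analytically. It is not taken from the README in the tarball, which remains the reference for the exact conventions used in the files; the function names, the axis-ratio convention, and the choice of the Ciotti & Bertin (1999) series approximation for b_n are assumptions made here.

```python
import math


def sersic_b(n):
    """Approximate b_n from the Ciotti & Bertin (1999) series expansion."""
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n) + 46.0 / (25515.0 * n**2)


def sersic_total_flux(i_half, r_half, n, q=1.0):
    """Total flux of I(r) = i_half * exp(-b_n * ((r / r_half)**(1/n) - 1)).

    i_half: intensity (flux per unit area) at the half-light radius
    r_half: half-light radius, in the length unit matching i_half
    q:      axis ratio b/a (assumed convention; q=1 is circular)
    """
    b = sersic_b(n)
    return (2.0 * math.pi * n * q * i_half * r_half**2
            * math.exp(b) * math.gamma(2.0 * n) / b ** (2.0 * n))


# Exponential disk (n=1) and de Vaucouleurs bulge (n=4) with unit inputs.
print(sersic_total_flux(1.0, 1.0, n=1.0))
print(sersic_total_flux(1.0, 1.0, n=4.0))
```

Because all inputs are observed quantities, no luminosity distance enters; converting the resulting flux to a luminosity would be a separate step.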

@evevkovacs
Contributor

@vvinuv Would you be able to run this test on the latest versions of protoDC2? The catalog names are
proto-dc2_v4.4, proto-dc2_v4.5, proto-dc2_v4.5_rescale. To run your test (I'm assuming you have a private version of the code) you need to do this in your local descqa directory:
./run_master.sh -c proto-dc2_v4.4 proto-dc2_v4.5 proto-dc2_v4.5_rescale size_Mandelbaum2014_BD -p ~kovacs/gcr-catalogs_v4/

The last -p points to my catalog reader at nersc, otherwise you won't be able to read these new catalogs. If there is any issue with doing this, could I clone your code from github? I see there is a version in the master branch of DESCQA. Is that the version that you are running? Thanks Eve

@vvinuv
Contributor

vvinuv commented May 21, 2018 via email

@vvinuv
Contributor

vvinuv commented May 21, 2018 via email

@evevkovacs
Contributor

evevkovacs commented May 21, 2018 via email

@evevkovacs
Contributor

@vvinuv I just cloned your repo and switched to the size-luminosity-test branch. When I ran the test, I got an error:
UnboundLocalError: local variable 'ylim' referenced before assignment
[ERROR][2018-05-21 14:52:51,857] Exception occurred when running validation size_Mandelbaum2014_BD on catalog proto-dc2_v4.5. Below are stdout/stderr and traceback:
Traceback (most recent call last):
File "/global/u1/k/kovacs/descqa2-local/descqa_vv/descqarun/master.py", line 348, in run_tests
test_result = validation_instance.run_on_single_catalog(catalog_instance, catalog, output_dir_this)
File "/global/u1/k/kovacs/descqa2-local/descqa_vv/descqa/SizeStellarMassLuminosity.py", line 201, in run_on_single_catalog
ax.set_ylim(ylim)

Could you please let me know when the test runs again, and I will try again. Thanks, Eve
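
An UnboundLocalError of this kind usually means a variable is assigned only inside some branches and then used unconditionally. The sketch below shows one defensive pattern; it is a hypothetical helper, not the code in SizeStellarMassLuminosity.py, and the observable names and limit values are made up for illustration.

```python
def choose_ylim(observable, default=(0.1, 100.0)):
    """Return y-axis limits for a given observable.

    Hypothetical helper: the keys and limits below are illustrative and
    are not taken from SizeStellarMassLuminosity.py.
    """
    known_limits = {
        "size_luminosity": (0.1, 50.0),
        "size_stellar_mass": (0.1, 100.0),
    }
    # dict.get with a default yields a value on every code path, so a later
    # ax.set_ylim(ylim) call never sees an unassigned name.
    return known_limits.get(observable, default)


ylim = choose_ylim("size_bulge_disk")  # unknown key falls back to the default
print(ylim)
```

Assigning a default before any if/elif chain would achieve the same effect; the lookup table simply keeps the per-test limits in one place.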

@vvinuv
Contributor

vvinuv commented May 21, 2018 via email

@vvinuv
Contributor

vvinuv commented May 21, 2018 via email

@evevkovacs
Contributor

evevkovacs commented May 21, 2018

@rmandelb @msimet @aphearin Could you please take a look at the size-luminosity relation in https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-05-21_38&test=size_Mandelbaum2014_BD&catalog=proto-dc2_v4.4 and let us know if the agreement is acceptable?

@msimet
Contributor

msimet commented May 22, 2018

I'm not sure, actually. A factor of 2 average size difference might be enough for us to care about, and these look within that limit for the bulge sizes with B/T>0.5 and within the disk sizes for B/T<0.5. However, the other component looks more off (but this matters proportionally less because of the bulge vs disk dominance). Will the total size-luminosity test (vs van der Wel) be run on these catalogs as well?

If we have only this to go on, I'd say it's probably okay, but I'd like the backup of the total size test as well, if possible.

@aphearin

@rmandelb and @msimet - I was surprised when I saw this level of discrepancy. The DC2 model for the size-luminosity relation comes directly from the fitting function provided in Zhang & Yang 2017, which is based on SDSS galaxies. So I dug back into the paper, and I found that Youcai and Xiaohu actually provide fitting functions for the size-luminosity relation of two distinct morphological classifications: one based on Galaxy Zoo, and another based on B/T in the Simard+11 catalog. You can see the differences in the plot below:

[Figure: size_luminosity_sdss_morphology_choice]

In protoDC2 v4.4 & v4.5, we based our size modeling on ZY17 fitting function parameters of disks/bulges defined by the Galaxy Zoo classification; according to the DESCQA test under discussion here, our bulge-dominated galaxies are too large. If we switched to a model based on the Simard+11 B/T classification, the figure shows that sizes in our disk-dominated systems would hardly change, but bulge sizes will be brought down ~30-40%, depending on luminosity.

Note that these changes will not alter the sizes of very luminous disks, which appear to be significantly larger as reported in Mandelbaum+14 relative to ZY17. Inter-publication variance is particularly large when measurements of size and morphological classification are concerned, even when different profile-fitting codes are run on the exact same dataset; I suspect it will be difficult to bring the model into better agreement with the data beyond the many-tens-of-percent level. I'm bringing this up since @msimet mentioned using an entirely different measurement (van der Wel+14) as an independent consistency test. According to this DESCQA test, our size-luminosity relation appears to be in reasonably tight agreement with van der Wel+14.

CC @katrinheitmann @dkorytov @evevkovacs

@evevkovacs
Contributor

evevkovacs commented May 22, 2018

@aphearin @msimet The biggest discrepancy between data and the catalog in the DESCQA test is for the bulge component for galaxies with B/T<0.5 (late-type). The catalog and the data agree quite well for B/T>0.5 galaxies. Overall, the disk component sizes are in pretty good agreement with the data for all galaxy types. I don't think switching models as described above will help, since the biggest difference between those models is for early-type galaxies and we already have pretty good agreement there. As Andrew points out, since the total size-luminosity relation agrees well, further iteration on the size model does not seem warranted at this time, but the final word should come from the WL WG.

@aphearin

aphearin commented May 22, 2018

@rmandelb and @msimet - When evaluating whether the current size-luminosity model in protoDC2 is sufficient for purposes of cosmoDC2, I think we should be using this earlier DESCQA test and not the much more recently completed DESCQA test currently under discussion. My reasons are:

  1. The first DESCQA test was completed prior to the deadline for WGs to contribute the validation tests that they considered an important priority. In general, it is perfectly fine for new tests to be added to DESCQA, and the extragalactic catalog can and will continue to improve long past completion of cosmoDC2. However, for purposes of cosmoDC2 evaluation, the deadline for a test to be considered for cosmoDC2 evaluation has passed.

  2. Deadlines aside, I would like to better understand the scientific motivation for test 2 over test 1. In the first test, the size-luminosity relation is evaluated separately for disks and bulges, which is one of the more basic/common summary statistics studied in the literature. In the second test, the galaxy sample is first split on B/T, and then within each split subsample, the size-luminosity relation is evaluated separately for disk and bulge components. So the main thing that this second test offers is a separate validation for the sizes of bulges in disk-dominated systems, and also the size of disks in bulge-dominated systems. In my experience studying a few different morphology catalogs (primarily Mendel+13 and Meert+15), this latter measurement strikes me as a highly uncertain measurement to make, and so if we are going to prize this measurement highly enough to put into DESCQA, I would like to understand what the downstream scientific motivation is, so that we can effectively modify the model.

CC @vvinuv @evevkovacs @katrinheitmann @yymao

@rmandelb

rmandelb commented May 22, 2018

Hi all - apologies for missing some of this traffic; was offline yesterday for personal reasons, then traveling today. A few comments, but since this is long I will say the brief summary is that I think that the results of the size-luminosity relation tests for v4_5 are acceptable:

  • @aphearin I hope you will please correct me if there is a misunderstanding, but based on your comments, here's how I see the purpose of this validation test: You are defining the size-luminosity relation in the sims using a very well-measured relation that is valid for galaxies at z<0.2. The WL group for DESC cares about galaxies at z~0.7-1. Hence the primary purpose of the validation test is to provide some input into your method of extrapolating the parameters of that relation to higher redshift, and that's why having a CANDELS (van der Wel+) or COSMOS (Mandelbaum+) comparison is important. Do you agree with this statement?

  • Based on my experience, if you have bulge-dominated galaxies in real data with low to moderate S/N (anything less than ~200), you should not try to use the disk sizes for validation tests. If the bulge dominates, then the disk parameters tend to be quite noisy and sensitive to assumptions about both the disk and bulge models. So in a validation test that uses bulge+disk fits, I only tend to use the bulge parameters for bulge-dominated galaxies, and disk parameters for disk-dominated galaxies. If others agree, we could even go so far as to remove the other curves (or de-emphasize them visually in some way). This is not necessary in all cases, but for the COSMOS and CANDELS data that go into both of our test datasets, this is the case. This is another way of saying that I agree with your tendency to trust this test over two of the curves shown on this test.

  • I do want to make sure that the curves I do trust on the second version of the test are telling the same story as the first version of the test, and I think that is indeed the case if I compare the plots. In the second version, comparing the dark red/brown points and curves (bulge size for bulge-dominated galaxies) it looks like the points are a bit above the curves for 0.5<z<1, by ~20-30%? And in the first version, at the same redshift range, the red points (bulge size) are similarly above the red curves. The match is better at high redshift in both cases. And the disk curves/points in both versions of the test mostly agree within that ~20% tolerance from what I can see, with some more tension for the highest luminosities. I believe this is the tension you remarked on for high-luminosity disks -- but weak lensers don't care that much about the rare high-luminosity disks, they care more about the more numerous ones at intermediate luminosities. So I think these plots are telling us that the dominant population that we care about is in good shape. That could be the reason the van der Wel+ test shows agreement, since presumably it's also reflecting the fact that at typical luminosities disks dominate (and their sizes are fairly well-matched). Does this interpretation sound reasonable to you?

  • I think we may agree on the challenges inherent in a comparison with these published results: the sample selections differ and the methodology for estimating and sometimes even defining sizes differs. In that sense, there will always be a question mark with any DESCQA test of these relations, which means our tolerance for disagreement should allow for tens of % effects.

  • In my mind a much less uncertain test would be to produce images from the catalogs, and for a given seeing, compare the observed galaxy sizes for a fixed apparent magnitude and PSF size against those in HSC, using the same apparent size measurement algorithm (which we can do since HSC uses the LSST stack). By imposing the same selection in observed quantities and measurement algorithms, I believe we have a much less ambiguous test that can be interpreted a bit more strictly. The reason I haven't advocated for simply using that as our sole test is that (a) the DESCQA tests provide some input at an earlier stage of the catalog production process and (b) if we find a serious mismatch in the observed size vs. magnitude relation, we still don't know if it's in the intrinsic size vs. luminosity relation at fixed redshift or some redshift evolution effect, so we'd eventually have to dig into these quantities that we're testing in DESCQA anyway. Given the results you have here, I am optimistic that this DM catalog-level validation test on apparent size vs. magnitude will not reveal any surprises.

Again, apologies for the length of this message but I saw a few points that I wanted to address. Comments from others are welcome. Many thanks to @vvinuv @msimet @evevkovacs @yymao and others who have been working on the size tests.
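
The idea of allowing for disagreement at the tens-of-percent level could be turned into a simple pass/fail check along the lines of the sketch below. The function names, the per-bin median inputs, and the default tolerance of 0.3 are illustrative assumptions only; no specific validation criterion has been agreed in this thread.

```python
def fractional_offsets(catalog_sizes, data_sizes):
    """Per-bin fractional difference (catalog - data) / data."""
    return [(c - d) / d for c, d in zip(catalog_sizes, data_sizes)]


def within_tolerance(catalog_sizes, data_sizes, tolerance=0.3):
    """True if every luminosity bin agrees within the fractional tolerance.

    tolerance=0.3 is an illustrative placeholder for "tens of percent".
    """
    offsets = fractional_offsets(catalog_sizes, data_sizes)
    return all(abs(f) <= tolerance for f in offsets)


# Made-up median sizes in three luminosity bins (same units for both lists).
catalog_median = [2.4, 3.1, 4.6]
data_median = [2.0, 3.0, 4.0]
print(within_tolerance(catalog_median, data_median))
```

A real criterion would probably also need to restrict the bins to the luminosity range and components that the comparison data constrain well.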

@aphearin

@rmandelb - thanks for getting back to us with a thoughtful answer.

> Hence the primary purpose of the validation test is to provide some input into your method of extrapolating the parameters of that relation to higher redshift, and that's why having a CANDELS (van der Wel+) or COSMOS (Mandelbaum+) comparison is important. Do you agree with this statement?

Oh yes, I very much agree we need higher redshift validation data in addition to low-redshift. In particular, and to be clear, in the protoDC2 model, we do evolve the size-luminosity relation at higher redshift: as z increases, we decrease the normalization of this scaling relation (higher redshift galaxies are more compact than their lower redshift counterparts). In previous versions of the catalog, there was no evolution, and the van der Wel+14 validation test looked considerably worse.

> Based on my experience, if you have bulge-dominated galaxies in real data with low to moderate S/N (anything less than ~200), you should not try to use the disk sizes for validation tests...So in a validation test that uses bulge+disk fits, I only tend to use the bulge parameters for bulge-dominated galaxies, and disk parameters for disk-dominated galaxies.

Yes, exactly, exactly. I similarly advocate for entirely ignoring the sizes of bulges in disk-dominated systems, and also the sizes of disks in bulge-dominated systems.

> ... So I think these plots are telling us that the dominant population that we care about is in good shape... Does this interpretation sound reasonable to you?

Yes, I agree that the two tests give a consistent story, and that this story is indeed telling us that the bulk of the population has reasonable sizes. I also certainly acknowledge that there is room for improvement: based on this back-and-forth, I have already tweaked the bulge normalization to be ~30% smaller. But even without this change, I think we are on the same page that the current version is sufficient for cosmoDC2 production purposes.

Thanks again to @rmandelb for weighing in clearly and carefully, and echoing thanks to @vvinuv @msimet @evevkovacs @yymao for contributing to these important validation criteria.

@yymao
Member Author

yymao commented May 23, 2018

Sorry for coming to the party late --- I just got a chance to clean up and run #95 (output here).

But given the discussion above, it seems to me that we should revise the test so that it only tests bulge size for bulge-dominated galaxies and only tests disk size for disk-dominated galaxies. Is that right, @vvinuv @rmandelb @aphearin?

And if so, @vvinuv, can you update the test accordingly?
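
A minimal sketch of the proposed selection, assuming each galaxy carries a bulge-to-total ratio and separate bulge and disk sizes. The dictionary keys and the helper name are hypothetical; the real catalog quantities and the test code have different names.

```python
def trusted_component_sizes(galaxies, bt_split=0.5):
    """Keep bulge sizes only for bulge-dominated galaxies and disk sizes
    only for disk-dominated ones.

    Each galaxy is a dict with hypothetical keys 'bt' (bulge-to-total
    ratio), 'r_bulge' and 'r_disk'.
    """
    bulge_sizes = [g["r_bulge"] for g in galaxies if g["bt"] > bt_split]
    disk_sizes = [g["r_disk"] for g in galaxies if g["bt"] < bt_split]
    return bulge_sizes, disk_sizes


# Made-up example: one bulge-dominated and two disk-dominated galaxies.
sample = [
    {"bt": 0.8, "r_bulge": 1.2, "r_disk": 5.0},
    {"bt": 0.1, "r_bulge": 0.4, "r_disk": 3.5},
    {"bt": 0.3, "r_bulge": 0.6, "r_disk": 2.9},
]
print(trusted_component_sizes(sample))
```

The subdominant components (disks in bulge-dominated systems and bulges in disk-dominated systems) are simply dropped, which matches the suggestion to remove or de-emphasize those curves.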

@evevkovacs
Contributor

@yymao The most recent run of this test from the master branch here appears to have some issues. The bottom set of plots is missing for one thing. Thanks

@yymao
Member Author

yymao commented Jun 6, 2018

@evevkovacs I believe that's a feature... the code always creates 6 panels but only 3 are used for size_Mandelbaum2014_BD (whereas all 6 are used in size_vanderWel2014_SM_Lum). See these configs:

https://github.com/LSSTDESC/descqa/blob/master/descqa/configs/size_Mandelbaum2014_BD.yaml#L20
https://github.com/LSSTDESC/descqa/blob/master/descqa/configs/size_vanderWel2014_SM_Lum.yaml#L25

But yes, the code should create the number of panels according to z_bins. (cc @vvinuv)
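
One way to size the figure from the number of redshift bins is sketched below. The helper name, the three-column maximum, and the representation of z_bins as a list of (z_min, z_max) pairs are assumptions for illustration; the actual layout of z_bins in the YAML configs may differ.

```python
import math


def panel_grid(n_bins, max_cols=3):
    """Return (nrows, ncols) just large enough for one panel per redshift bin."""
    if n_bins < 1:
        raise ValueError("need at least one redshift bin")
    ncols = min(n_bins, max_cols)
    nrows = math.ceil(n_bins / ncols)
    return nrows, ncols


# Hypothetical z_bins layout: a list of (z_min, z_max) pairs.
z_bins = [(0.0, 0.5), (0.5, 0.75), (0.75, 1.0)]
nrows, ncols = panel_grid(len(z_bins))
print(nrows, ncols)
```

The resulting pair can be passed to matplotlib's plt.subplots(nrows, ncols); when the number of bins does not fill the last row, the leftover axes can be hidden rather than left blank.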

@evevkovacs
Contributor

Actually, they are also blank in size_vanderWel2014_SM_Lum; see here.

@vvinuv
Contributor

vvinuv commented Jun 6, 2018 via email

@yymao
Member Author

yymao commented Jun 6, 2018

@evevkovacs have you run them on buzzard_test? Since protoDC2 only goes to z=1, it is not surprising that the lower panels are empty...

@vvinuv
Contributor

vvinuv commented Jun 6, 2018 via email

@vvinuv
Contributor

vvinuv commented Jun 6, 2018 via email

@vvinuv
Contributor

vvinuv commented Jun 6, 2018 via email

@yymao
Member Author

yymao commented Jun 6, 2018

@vvinuv your updates in #95 have not been merged because of the discussion above (about testing only bulge size for bulge-dominated galaxies and only disk size for disk-dominated galaxies). Note that there's now some conflict with the master branch that needs to be resolved.

@yymao
Member Author

yymao commented Sep 18, 2019

closed by #182 (which superseded #95)

@yymao yymao closed this as completed Sep 18, 2019
patricialarsen added a commit that referenced this issue Feb 21, 2023

7 participants