testing luminosity function of cluster galaxies (satellites, centrals) #9

Closed
2 of 4 tasks
yymao opened this issue Nov 6, 2017 · 32 comments · Fixed by #102

Comments

@yymao
Member

yymao commented Nov 6, 2017

@rmandelb is the idea to do a conditional luminosity function using true halos?

  • code to reduce mock data
  • code that works within DESCQA framework
  • validation data
  • validation criteria
@rmandelb

rmandelb commented Nov 6, 2017

@yymao - perhaps we could ask the cluster conveners @anjavdl and @erozo to weigh in on designing a validation test for the cluster central galaxy LF, since the CL group is the one that has concerns about this. They can of course bring in anybody else who they think should be in on the discussion. But yes, I was assuming some kind of validation test would use the true halos above a mass threshold, along with the true centrals and satellites.
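
As a rough illustration of the kind of measurement being proposed here (true halos above a mass threshold, split into true centrals and satellites), a minimal Python sketch might look like the following. The function name, argument names, and the one-central-per-halo assumption are all hypothetical; none of this is taken from the actual DESCQA test.

```python
import numpy as np


def conditional_luminosity_function(halo_mass, mag, is_central, mass_min, mag_bins):
    """Mean number of galaxies per halo per unit magnitude, split by type.

    halo_mass  : true host-halo mass of each galaxy (hypothetical column)
    mag        : absolute magnitude of each galaxy
    is_central : True for centrals, False for satellites
    mass_min   : only halos with mass >= mass_min are used
    mag_bins   : increasing magnitude bin edges
    """
    halo_mass = np.asarray(halo_mass, dtype=float)
    mag = np.asarray(mag, dtype=float)
    is_central = np.asarray(is_central, dtype=bool)
    mag_bins = np.asarray(mag_bins, dtype=float)

    in_sample = halo_mass >= mass_min
    # Assumption: exactly one central per halo, so counting centrals counts halos.
    n_halos = np.count_nonzero(in_sample & is_central)
    if n_halos == 0:
        raise ValueError("no halos above the mass threshold")

    widths = np.diff(mag_bins)
    cen_counts, _ = np.histogram(mag[in_sample & is_central], bins=mag_bins)
    sat_counts, _ = np.histogram(mag[in_sample & ~is_central], bins=mag_bins)
    return cen_counts / n_halos / widths, sat_counts / n_halos / widths
```

Calling this once per halo mass bin (by also applying an upper mass limit beforehand) would give one central and one satellite curve per bin.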

@katrinheitmann

Dan Korytov has implemented a test for this (not yet in DESCQA though). @evevkovacs : can you check if Dan has signed up for the github repo? I did send him instructions a while ago but I think he forgot. Thanks!

@yymao yymao assigned j-dr and unassigned j-dr Nov 6, 2017
@yymao yymao removed the help wanted label Nov 6, 2017
@yymao
Member Author

yymao commented Nov 6, 2017

Dan is on GitHub but GitHub doesn't let me assign this to @dkorytov... I am also assigning @j-dr since he has worked on something similar.

@dkorytov
Contributor

dkorytov commented Nov 7, 2017 via email

yymao added a commit that referenced this issue Nov 12, 2017
@yymao yymao mentioned this issue Nov 12, 2017
yymao added a commit that referenced this issue Nov 12, 2017
@yymao
Member Author

yymao commented Nov 12, 2017

@rmandelb @erozo @anjavdl @dkorytov

I modified @j-dr's CLF test and put it into DESCQA2. You can see the results here.

Buzzard high-res seems a bit strange, doesn't it, @j-dr?

@yymao
Member Author

yymao commented Nov 12, 2017

@rmandelb @erozo @anjavdl @dkorytov @j-dr

Sorry, I clicked "comment" too soon --- I also want to ask if there's validation data/criteria that we should use for the CLF test? Right now this is not really a "test" as it just plots the CLF.

@j-dr
Contributor

j-dr commented Nov 12, 2017

Yeah, I think there must be some mishandling/mislabeling of central/satellite info in buzzard highres. I'll look into it. As for comparison to data, I think we need to add some color cuts to this test in order to get something that we can actually compare to, say, a redmapper CLF. I told @erozo and @anjavdl that I would do this a while ago, so I'd be happy to try my hand at it.

@erozo

erozo commented Nov 14, 2017

That would be super @j-dr !
@yymao has also started talking with @erykoff about running redmapper. I don't think this can be trivially implemented as a QA test, but it will be extremely useful.

@yymao
Member Author

yymao commented Nov 14, 2017

@erozo @erykoff One possibility is that, instead of using redmapper in QA tests, we run redmapper on the catalogs once and then put the redmapper info back into the catalogs so that other QA tests can use it.

@j-dr
Contributor

j-dr commented Nov 14, 2017

Yeah, I think that's definitely the way to go. For instance, Chun-Hao To @chto might be able to implement his redmapper CLF code as a QA test.

@evevkovacs
Contributor

@erozo @erykoff @yymao @j-dr @dkorytov A first step would be to try running redmapper on the existing catalogs to see what the problems are, how long it takes, etc. Could we get a volunteer from the CL group to try this? I can propose this as a sprint for Hack Week, although it would be good to get started on it sooner than that.

@erykoff

erykoff commented Nov 14, 2017

I'm working on running redMaPPer on Proto-DC2 as we speak. (well, wrestling with python at nersc, but it's a start).

@j-dr
Contributor

j-dr commented Nov 14, 2017

Awesome! Let me know if you have any questions about nersc. I've done plenty of wrestling with it myself.

@yymao
Member Author

yymao commented Nov 14, 2017

Just for your information about the Python stuff: Heather has set up a Python environment on NERSC which is very easy to use. You can find instructions here on how to enable the environment.

@evevkovacs
Contributor

I also have the GCR running at ANL (following the instructions that Yao pointed you to), if you are interested in having a local version to play with. (There's also a 10M test catalog available at nersc).
Please keep us posted on your progress.

@yymao
Member Author

yymao commented Dec 5, 2017

The CLF test is done but we still need validation datasets. Pinging @erozo @anjavdl to see if they have any further thoughts on this.

@yymao
Member Author

yymao commented Dec 11, 2017

Posting some CLF plots here for easy access --- (taken from this DESCQA run)

image

image

@j-dr
Contributor

j-dr commented Jan 19, 2018

@erykoff This probably also needs to be either made required or at least strongly prioritized for CL. We can provide a CLF fit to SDSS data to validate against, but some work will probably need to be done in order to make a similar selection on the sims without running redmapper.

Matching both the central and satellite LFs will be important for miscentering, and a realistic scatter in richness at fixed mass will be important for understanding purity and completeness.

@rmandelb

Joe - can you comment on whether the test plots that Yao shared from Dan's work on this (#9 (comment)) might be providing essentially what you want? It's based specifically on a selection in bins in halo mass, which gets around the "needing to run redmapper" problem. As you can see, as implemented it does include both the central and satellite LF.

Might we consider scatter in richness at fixed mass as a separate test? Perhaps a test of the mean richness and its scatter as a function of halo mass? (I realize the mean richness vs. mass is essentially integrating over these LFs, but the scatter incorporates additional per-cluster information.)
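
For concreteness, a separate "mean richness and its scatter as a function of halo mass" test could be sketched roughly as below. This is only an illustration: `richness` is a hypothetical per-halo quantity (for example a count of member galaxies passing whatever cuts are chosen), and the binning choices are assumptions rather than anything agreed in this thread.

```python
import numpy as np


def richness_vs_mass(halo_mass, richness, mass_bins):
    """Mean and sample scatter of richness in bins of true halo mass.

    halo_mass : true mass of each halo
    richness  : hypothetical per-halo richness estimate
    mass_bins : increasing halo-mass bin edges
    Bins with no halos get NaN for both outputs; bins with one halo get
    NaN for the scatter.
    """
    halo_mass = np.asarray(halo_mass, dtype=float)
    richness = np.asarray(richness, dtype=float)
    mass_bins = np.asarray(mass_bins, dtype=float)

    n_bins = len(mass_bins) - 1
    mean = np.full(n_bins, np.nan)
    scatter = np.full(n_bins, np.nan)

    # np.digitize returns 1..n_bins for masses inside the binned range.
    idx = np.digitize(halo_mass, mass_bins) - 1
    for i in range(n_bins):
        lam = richness[idx == i]
        if lam.size > 0:
            mean[i] = lam.mean()
        if lam.size > 1:
            scatter[i] = lam.std(ddof=1)
    return mean, scatter
```

Logarithmically spaced `mass_bins` (e.g. from `np.logspace`) would probably be the natural choice for cluster-scale halos.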

@yymao
Member Author

yymao commented Jan 21, 2018

Just to clarify: the plots above are taken from a DESCQA test that Joe and I wrote, so @j-dr should be very familiar with that test :)

@aphearin

@yymao and @j-dr - Could you clarify the meaning of the halo mass variable in these plots? Is this RedMapper-determined mass? Or some other observational estimator for halo mass? Or is this the CLF as a function of true halo mass according to some model fit to a cluster catalog? I do not yet understand how we can get around the "needing to run redmapper" problem, though I agree with @rmandelb that this is indeed a problem.

@yymao
Member Author

yymao commented Jan 22, 2018

@aphearin currently what is plotted is true halo mass. But yes, we might need to run redmapper for a more realistic test

@j-dr
Contributor

j-dr commented Jan 22, 2018

The problem with this test as currently implemented is that it's hard to come up with something to validate against that would be comparing apples to apples. I think at the very least we need to come up with more realistic color cuts, since as of now I was just cutting on the 25% reddest galaxies using restframe g-r.
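
For reference, a cut on the 25% reddest galaxies in rest-frame g-r can be written in a few lines. This is a minimal sketch, not the code used in the test; the function and argument names are made up for illustration.

```python
import numpy as np


def reddest_fraction_mask(g_minus_r, fraction=0.25):
    """Boolean mask selecting the reddest `fraction` of galaxies.

    g_minus_r : rest-frame g-r color of each galaxy (larger means redder)
    fraction  : fraction of the sample to keep, 0.25 for the reddest quartile
    """
    g_minus_r = np.asarray(g_minus_r, dtype=float)
    # Color above which the reddest `fraction` of the sample lies.
    threshold = np.quantile(g_minus_r, 1.0 - fraction)
    return g_minus_r >= threshold
```

With many tied colors the selected fraction can come out slightly above `fraction`, since every galaxy at the threshold color is kept.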

One way we could tune the cuts is using buzzard since we have a full redmapper run that we can measure the CLF from. We can then tune the color/radial cuts for this test so that we get out something close to what is measured using the real redmapper run.

@j-dr
Contributor

j-dr commented Jan 22, 2018

Or we could just run redmapper on protodc2 and include the cluster catalog as an auxiliary catalog.

@aphearin

It would be ideal to run RedMapper on protoDC2 if that is feasible. If so, we should hold off on this until after I am done with the protoDC2 overhaul.

@j-dr
Contributor

j-dr commented Jan 22, 2018

@erykoff already fit the red sequence in protodc2, so that part is definitely possible, but I remember him saying that he thought there might be problems running the full cluster finder, possibly related to the limited area? I'll let him elaborate.

@erykoff

erykoff commented Jan 22, 2018

I was worried at first that the protoDC2 area would be insufficient for training redMaPPer, but it seems fine. Certainly there is noise in the measurements due to the small volume, but it did seem to work okay on the earlier protoDC2. Also, the volume is small enough that it runs pretty fast!
Once a new version of protoDC2 is available (and I think this is what @aphearin is referring to?) I'll be able to turn something around in 1-2 days.

@rmandelb

rmandelb commented Feb 1, 2018

@erykoff @j-dr - I wanted to bring your attention back to this validation test of the conditional luminosity functions of satellites and centrals. We had established that Eli will rerun redmapper on protoDC2 once @aphearin says it's ready to go (not yet!) but while we are waiting for the updated protoDC2 catalogs, I'd like to address a few remaining questions:

  • Should this be considered a required validation test? Joe had commented that this may be the case, but can you please confirm? (If so, I will update it to indicate this.)
  • Can you clarify what changes are needed in how the test is implemented? (e.g., right now it uses true halo mass but this probably has to change; anything else?)
  • Can you please suggest a pass/fail criterion?

@j-dr
Contributor

j-dr commented Feb 1, 2018

@rmandelb, to address your questions:

  • I think this should be required, but we can make the criteria relatively lax because there are a lot of things to worry about here.
  • I'm not sure what the alternative to using true halo mass is. For instance, if we use richness bins, then the amplitude of the satellite LF is guaranteed to match the data. @erykoff and @erozo probably have thoughts about possible alternatives to using true halo mass, but nothing comes to my mind immediately. The other thing to worry about is the color and radial cuts to make, but I think we can tune these using the method I outlined above.
  • @chto has a CLF measurement from SDSS redmapper that we can use to validate against.
  • There are two aspects of the CLF that we need to validate:
  1. We want to make sure that centrals aren't excessively bright, as they appear to be in the current iteration of protodc2. This will likely make centering unrealistically good. I'm not sure what the best way to state this quantitatively is, but a chi squared requirement on the bright end might fit the bill.

  2. The other thing that we care about is the amplitude of the satellite LF as a function of mass, or really the mass-richness relation. I think CL mostly cares about testing recovery of the true mass richness relation in the simulation, so perfect agreement with data isn't totally necessary. I think the most important thing is making sure that the amplitude and slope are such that lambda>20 corresponds to approximately the same mass in DC2 as it does in the data. I think this aspect needs to be discussed more though.
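
One possible reading of the "chi squared requirement on the bright end" in point 1 is sketched below. It assumes the simulated and observed central CLFs are tabulated on the same magnitude bins and that the data come with per-bin uncertainties. The bright-end limit `mag_bright_cut` and any pass threshold applied to the result are hypothetical parameters, not values proposed in this thread.

```python
import numpy as np


def bright_end_chi2(mag_centers, clf_model, clf_data, clf_data_err, mag_bright_cut):
    """Chi-squared between model and data CLFs over the bright-end bins only.

    mag_centers    : magnitude bin centers (more negative means brighter)
    clf_model      : central CLF measured in the simulation
    clf_data       : central CLF measured in the data, on the same bins
    clf_data_err   : 1-sigma uncertainty on clf_data
    mag_bright_cut : bins with mag_centers <= this value count as bright end
    Returns (chi2, number of bins used).
    """
    mag_centers = np.asarray(mag_centers, dtype=float)
    clf_model = np.asarray(clf_model, dtype=float)
    clf_data = np.asarray(clf_data, dtype=float)
    clf_data_err = np.asarray(clf_data_err, dtype=float)

    bright = mag_centers <= mag_bright_cut
    n_used = int(np.count_nonzero(bright))
    if n_used == 0:
        raise ValueError("no bins brighter than mag_bright_cut")

    resid = (clf_model[bright] - clf_data[bright]) / clf_data_err[bright]
    return float(np.sum(resid ** 2)), n_used
```

A lax criterion could then be a bound on `chi2 / n_used`, with the bound itself still to be agreed on.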

@erykoff

erykoff commented Feb 1, 2018

So @j-dr and I just chatted about this. The first thing is about Joe's point 1. Looking at the protodc2 plots above, the centrals are too bright compared to the satellites (making centering too easy/optimistic), and for buzzard they are too faint (making centering too difficult/pessimistic). However, the absolute luminosity doesn't matter as much as the relative luminosity. We think this would be captured by the ratio of the integral of the satellite luminosity function (from mean central luminosity - 1 sigma to mean + 1 sigma) to the same integral of the satellite + central luminosity function. We can measure the same from @chto's CLF measurements on SDSS, and we have to choose what the exact criterion is (how well do we need to match this?).
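
The ratio described above could be computed along these lines. This is a sketch only: it assumes both CLFs are tabulated on a common magnitude grid, takes the mean and scatter of the central luminosity as inputs rather than deriving them, and uses a simple trapezoid rule restricted to grid points inside the window. All names are hypothetical.

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoid-rule integral of y(x) on a tabulated grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def satellite_fraction_near_central(mag, clf_cen, clf_sat, mean_cen, sigma_cen):
    """Ratio of the satellite CLF integral to the satellite + central integral,
    both taken over [mean_cen - sigma_cen, mean_cen + sigma_cen].

    mag       : increasing magnitude grid on which both CLFs are tabulated
    clf_cen   : central CLF on that grid
    clf_sat   : satellite CLF on that grid
    mean_cen  : mean central magnitude
    sigma_cen : 1-sigma scatter of the central magnitude
    """
    mag = np.asarray(mag, dtype=float)
    clf_cen = np.asarray(clf_cen, dtype=float)
    clf_sat = np.asarray(clf_sat, dtype=float)

    window = (mag >= mean_cen - sigma_cen) & (mag <= mean_cen + sigma_cen)
    if np.count_nonzero(window) < 2:
        raise ValueError("magnitude grid too coarse for the requested window")

    sat = _trapezoid(clf_sat[window], mag[window])
    total = _trapezoid((clf_sat + clf_cen)[window], mag[window])
    if total == 0.0:
        raise ValueError("CLF integrates to zero inside the window")
    return sat / total
```

A value near 0 means centrals dominate their own luminosity range (easy centering), while a value near 1 means satellites of similar luminosity are common (hard centering). The same function applied to the SDSS CLF would give the reference value to compare against.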
The next question is halos vs clusters. Bins of mtrue would be best, but we need a way to compute radial/color cuts. If @chto runs just once on a redMaPPer run we can choose fiducial radial/color cuts such that the measurements on halos with mtrue match what comes out of the cluster measurements. (For some reference mass/richness bin). This would certainly get us some reasonable cuts on the galaxies in the halos to make this comparison meaningful.

Second, about the amplitude as a function of mass. CL just needs to be able to recover the mass-richness relation, and needs to have approximately correct statistics. We think that if the number density of lambda>20 clusters in DC2 is within a factor of 2 of the data that would indicate that the normalizations are good enough for our use.
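
The factor-of-2 abundance check could be as simple as the sketch below. The names are hypothetical, and `volume` and `n_data` just have to be in mutually consistent units (a comoving volume and a volume density, or a sky area and a surface density).

```python
import numpy as np


def abundance_within_factor(richness, volume, n_data, lambda_min=20.0, factor=2.0):
    """Compare the abundance of rich clusters in the simulation with the data.

    richness   : richness (lambda) of each simulated cluster
    volume     : volume (or area) covered by the simulated catalog
    n_data     : observed number density of lambda > lambda_min clusters,
                 in units consistent with `volume`
    lambda_min : richness threshold, 20 in the discussion above
    factor     : allowed multiplicative mismatch, 2 in the discussion above
    Returns (simulated number density, True if within the allowed factor).
    """
    richness = np.asarray(richness, dtype=float)
    n_sim = np.count_nonzero(richness > lambda_min) / volume
    ratio = n_sim / n_data
    return n_sim, bool(1.0 / factor <= ratio <= factor)
```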

@rmandelb

rmandelb commented Feb 2, 2018

Hi @j-dr and @erykoff -

Your reasoning for why this is important to have as a required test (but then not being super stringent) makes sense to me.

One thing I want to note is that @aphearin is updating protoDC2, but there will be cosmoDC2 in the not-too-distant future, so these tests that involve running redmapper imply two redmapper runs (not just one). Hopefully not a problem, but I wanted to mention it.

It sounds like you are on the way to two distinct, quantitative validation criteria:

  • satellite vs. central luminosity (after running redMaPPer and some post-processing)
  • amplitude as a function of mass - so here you are talking about taking the total lambda estimate for each cluster and calculating the number density of those with lambda>20, in comparison with that in the data? That sounds totally reasonable, but it doesn't seem to have much (anything?) to do with the central vs. satellite CLF... should we make it a separate test? I'm just thinking we don't want to flag the central vs. satellite CLF as having failed when maybe it is really an overall amplitude normalization problem. The impact of failing on these two things is rather different.

@yymao
Member Author

yymao commented Feb 14, 2018

@j-dr @erykoff can you confirm that we need to run redMaPPer for this test to be useful? If so, I think we should wait for the updated DC2 catalog and then revisit.
