-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
quickcat2 #85
Conversation
@sbailey Thanks for this reorganization. The numbers look okay other than for QSO. 255 km/s is for the distribution excluding catastrophic failures that still have zwarn=0. So for zwarn fraction of 1-4094/5000, the redshift error should be 423 km/s. We may also somehow need to address these catastrophic failures. |
The hardcoded TYPE mapping is not formally defined; it was taken from current fiberassign usage. After discussions with @rncahn, we're working on yet another update to the formats to clean this up. More updates after that settles down. |
this quickcat2 branch and current fiberassign use the truth file "TYPE" as an integer index into the fiberassign features file type categories such as:
This is convenient for fiberassign_surveysim but not for anything else. I propose to go back to file formats that most closely match the final data flow, and add a translation script to convert from the MTL target+truth files into whatever Bob needs for his "secret" file with this integer TYPE index. The point of that is to maintain a working fiberassign_surveysim while also providing a reasonable dataflow for getting the more complete quicksurvey sim going. Everytime we try to implement this data flow some new issue comes up, but I think we are getting closer to what we want. Here is my draft of what I think should be in each file:
This impacts fiberassign, desitarget, and desisim for them all to coordinate. I'm drafting code to try this out now to see if it is viable. Please comment if you think anything is missing/wrong/ill-advised. |
This still has the BGS,MWS != DESI bias.... (I'm thinking specifically of DESI_TARGET.) Should this be DARK_TARGET or is this choice already too entrenched...? |
|
@moustakas the choice isn't entrenched, but it also wasn't random. After a bit of discussion about DESI_TARGET vs. DARK_TARGET, we originally went with DESI_TARGET as containing a combination of:
@forero, thanks for the comments.
The dark time survey will likely use this as a random number — its existence was based upon Daniel's suggestion to put the random choice in the target file itself rather than the code. However, the Milky Way Survey would like to rank order their targets, so for them it is a real priority. The overall idea is that you could add the integer PRIORITY with the [0-1] floating point SUBPRIORITY and get the order in which you should chose targets.
The challenge that we previously didn't anticipate is that the targeting bits don't cover all the possibilities for true types. e.g. DESI_TARGET.QSO means something that was targeted as a QSO, but doesn't include things that actually are QSOs but have colors that fail the target selection. Similarly for genuine stars that pass QSO target selection cuts (DESI_TARGET.QSO) but not the star cuts (DESI_TARGET.STD_*). So TRUETYPE was intended to be a map of what a redshift fitter would report if everything was working perfectly. Currently our redshift fitters are outputting strings, which is sometimes convenient (for humans) and sometimes a pain (slow for code).
These don't necessarily have to be a string, but within python it's not that bad to work with. See the mtl branch of desitarget for my current draft of working with these files. I'm willing to reconsider if it turns out to be a huge pain, but let's try the string version for a bit longer.
Hehe. This one is from you! Yes, it is the number of targets that are covered by this fiber, and is used in interpreting HDU2 of the tile file that lists exactly which targets those are.
The original idea was that this was the priority and numobs before taking any observations, and the priority and numobs after you have taken the first observation and determined what type it actually is. However, I'm having second (third?) thoughts about that, and am thinking about going back to having a simple integer CATEGORY in the truth file which is used to look up the pre/post values in the features files. That CATEGORY is redundant with a combination of (Z, ZWARN, TRUETYPE, and *_TARGET), but it might be the least hassle while we are transitioning from the internal fiberassign_surveysim to the full quicksurvey = mtl+fiberassign+quickcat simulation loop. I'll work on this this afternoon. |
I updated this branch to use the new formats described in fiberassign PR #24 and desitarget PR #32. @forero please review these 3 pull requests as a coordinated set, and let me know if you have questions! NOTE: I have not yet updated the bin/quickcat script to match the desisim.quickcat module. I have to run right now, but please wait for the bin/quickcat script to also be updated before doing the final merge. But you can use the python module as:
@gdhungana example files for using this version of quickcat are at NERSC in /project/projectdirs/desi/mocks/preliminary/mtl/v2 |
I tested it as shown in this python script. It worked but it took a long time to run on the whole survey. Something like 25 min on Edison (for comparison running fiberassign takes ~20min). This |
I'll merge this now, even if it is slow. I used astropy Tables for their database-like join syntax convenience for combining the truth catalog with the zcatalog and that may be the cause of the slowness. The other possible reason is the counter of how many times has each target been observed (I used collections.Counter from the python standard library, but there might be a faster way). |
The original quickcat branch/PR had gotten old and messy to merge, so I started fresh with a new quickcat2 branch, borrowing pieces from the previous (which will be abandoned).
This PR adds:
The redshift performance is based upon Redmonster applied to zdc1 data by @gdhungana as documented in DESI-1657 (his DESI data telecon presentation from last week). I hope we'll be able to improve the zwarn efficiency numbers, but let's start realistically pessimistic for now.
quickcat includes a perfect=True option to maintain input z -> output z.
@gdhungana please review the redshift errors and zwarn fractions.
@forero please check the module interface — is this usable for quicksurvey?