Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missing imaging photometry in metadata table #121

Closed
moustakas opened this issue Apr 28, 2023 · 8 comments
Closed

missing imaging photometry in metadata table #121

moustakas opened this issue Apr 28, 2023 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@moustakas
Copy link
Member

It looks like in some cases the output metadata table is not populated with the input (broadband) imaging photometry.

I thought this issue had to do with not being able to measure the aperture correction, but it appears to be something else.

Unfortunately, the bug affects the Iron/v1.0, Fuji/v2.0, and Guadalupe/v2.0 catalogs; fortunately, it affects only a relatively small fraction of objects.

  • Iron/v1.0: 43,607 (0.242%)
  • Fuji/v2.0: 12,942 (0.926%)
  • Guadalupe/v2.0: 580 (0.026%)

E.g., for Iron:

cd $DESI_ROOT/spectro/fastspecfit
prod='iron' 
ver='v1.0' 
iim = Table(fitsio.read(f'{prod}/{ver}/catalogs/fastspec-{prod}.fits', 'METADATA', \
  columns=['TARGETID', 'SURVEY', 'PROGRAM', 'HEALPIX', 'FLUX_G', 'FLUX_R', 'FLUX_Z', 'FLUX_W1'])) 
iif = Table(fitsio.read(f'{prod}/{ver}/catalogs/fastspec-{prod}.fits', 'FASTSPEC', columns=['LOGMSTAR'])) 
I = ((iim['FLUX_G'] == 0)*iim['FLUX_R'] == 0)*(iim['FLUX_Z'] == 0)*(iim['FLUX_W1'] == 0)*(iif['LOGMSTAR'] > 0) 
print(np.sum(I), np.sum(I)/len(I))

<Table length=43607>
      TARGETID      SURVEY PROGRAM HEALPIX  FLUX_G  FLUX_R  FLUX_Z FLUX_W1
       int64         str7    str6   int32  float32 float32 float32 float32
------------------- ------ ------- ------- ------- ------- ------- -------
      6448025174016    sv1    dark   28478     0.0     0.0     0.0     0.0
      6515536691200    sv1    dark    4958     0.0     0.0     0.0     0.0
      6521555517440    sv1    dark   10436     0.0     0.0     0.0     0.0
      6536638234624    sv1    dark   28534     0.0     0.0     0.0     0.0
      6546033475584    sv1    dark    5066     0.0     0.0     0.0     0.0
     28684312379396    sv1   other    2707     0.0     0.0     0.0     0.0
     28684316573697    sv1   other    2708     0.0     0.0     0.0     0.0
     28688884170754    sv1   other    2710     0.0     0.0     0.0     0.0
     28697990004751    sv1   other    2709     0.0     0.0     0.0     0.0
     28702498881536    sv1   other    2710     0.0     0.0     0.0     0.0
     28702503075844    sv1   other    2711     0.0     0.0     0.0     0.0
     28702503075846    sv1   other    2711     0.0     0.0     0.0     0.0
     28707011952640    sv1   other    2711     0.0     0.0     0.0     0.0
                ...    ...     ...     ...     ...     ...     ...     ...
2305843055619504455   main  backup   44257     0.0     0.0     0.0     0.0
2305843055619508651   main  backup   44257     0.0     0.0     0.0     0.0
2305843055619511632   main  backup   44256     0.0     0.0     0.0     0.0
2305843055619511838   main  backup   44256     0.0     0.0     0.0     0.0
2305843055619516110   main  backup   44256     0.0     0.0     0.0     0.0
2305843055619522825   main  backup   44257     0.0     0.0     0.0     0.0
2305843055623676124   main  backup   44262     0.0     0.0     0.0     0.0
2305843055623690778   main  backup   44260     0.0     0.0     0.0     0.0
2305843055623691405   main  backup   44260     0.0     0.0     0.0     0.0
2305843055623693312   main  backup   44260     0.0     0.0     0.0     0.0
2305843055623694644   main  backup   44260     0.0     0.0     0.0     0.0
2305843055623695539   main  backup   44260     0.0     0.0     0.0     0.0
2305843055623695956   main  backup   44260     0.0     0.0     0.0     0.0
2305843056445765854   main  backup   45047     0.0     0.0     0.0     0.0

I'm not sure what the resolution is, but need to make sure to create a unit test to prevent this issue from future productions.

@moustakas moustakas added the bug Something isn't working label Apr 28, 2023
@moustakas moustakas self-assigned this Apr 28, 2023
@moustakas
Copy link
Member Author

I began looking into this issue and determined that it's not a bug. The vast majority of these sources genuinely lack DR9 photometry and the stellar mass estimate comes from the spectrum alone.

However, there are a handful of cases where desispec.io.photo.gather_tractorphot reports no photometry which actually exists. The issue is in this bit of code, which I am reproducing at the very bottom of this note:
https://github.com/desihub/desispec/blob/main/py/desispec/io/photo.py#L1029-L1042

In essence, the code checks if there exists a north Tractor catalog, a south catalog, or both. If both catalogs exist then if the median declination of the sources in the input catalog is less than the north-south boundary, +32.375 deg, then the south catalog is chosen.

However, consider this (secondary) target:

<Table length=1>
   TARGETID    SURVEY PROGRAM HEALPIX        RA              DEC        FLUX_G  FLUX_R  FLUX_Z FLUX_W1 RELEASE BRICKID BRICKNAME BRICK_OBJID
    int64       str7    str6   int32      float64          float64        str1  float32 float32 float32  int16   int32     str8      int32
-------------- ------ ------- ------- ---------------- ---------------- ------- ------- ------- ------- ------- ------- --------- -----------
50532307697666    sv1    dark   28535 186.497709668916 33.6029271154256     0.0     0.0     0.0     0.0       0       0  1864p335           0

There is a source in the south Tractor catalog within ~0.4 arcsec:

https://www.legacysurvey.org/viewer-desi/?ra=186.497709668916&dec=33.6029271154256&zoom=16&mark=186.497709668916,33.6029271154256&layer=ls-dr9-south&sources-dr9s

But in the north Tractor catalog the nearest source is ~1.5 arcsec away:

https://www.legacysurvey.org/viewer-desi/?ra=186.497709668916&dec=33.6029271154256&zoom=16&mark=186.497709668916,33.6029271154256&layer=ls-dr9-north&sources-dr9n

(Recall that the matching radius is 1 arcsec.) However, since the declination (+33.603 deg) is above our north-south declination cut, +32.375 deg, we end up with no matching source and therefore no photometry.

So the options are to:

  • Do nothing; or
  • Be smarter about the case where both north and south Tractor catalogs exist, namely:
    • Check both catalogs and select the set of photometry (on an object-by-object basis) which has the most amount of information.
    • And if there are fluxes from both the north and the south, choose south.

Below is the code currently in main:
https://github.com/desihub/desispec/blob/main/py/desispec/io/photo.py#L1029-L1042

tractorfile_north = os.path.join(legacysurveydir, 'north', 'tractor', brick[:3], 'tractor-{}.fits'.format(brick))
tractorfile_south = os.path.join(legacysurveydir, 'south', 'tractor', brick[:3], 'tractor-{}.fits'.format(brick))
if os.path.isfile(tractorfile_north) and not os.path.isfile(tractorfile_south):
    tractorfile = tractorfile_north
elif not os.path.isfile(tractorfile_north) and os.path.isfile(tractorfile_south):
    tractorfile = tractorfile_south
elif os.path.isfile(tractorfile_north) and os.path.isfile(tractorfile_south):
    if np.median(input_cat[deccolumn][ipos]) < desitarget_resolve_dec():
        tractorfile = tractorfile_south
    else:
        tractorfile = tractorfile_north
elif not os.path.isfile(tractorfile_north) and not os.path.isfile(tractorfile_south):
    return out

@moustakas
Copy link
Member Author

@weaverba137 @stephjuneau @sbailey this issue would need be addressed in desispec.io.photo and would also impact the https://github.com/moustakas/desi-photometry LS/DR9 VACs, so please let me know if you have any thoughts or opinions. Any changes would allow us to extract photometry for a tiny number of secondary targets in the overlap region between DECaLS and BASS+MzLS, so it would be perfectly acceptable to do nothing.

And note that whatever we decide I do not think we should regenerate the Fuji LS/DR9 VAC, so this would only impact the DR1 (Guadalupe + Iron) VACs and database loads.

@stephjuneau
Copy link

Hi @moustakas et al., the example source indeed lacks a Tractor entry within 1" when querying against the combined tractor table. Not sure how it was selected as a secondary but it seems to be a small HII region in a spiral or tidal arm (and in good company given all the sources outside the SGA ellipse there - wow!). I agree with "do nothing" for Fuji. For others tables, I think what you proposed in terms of searching in both N and S and then picking the best one could be fine. The caveats are (1) we create inconsistencies with the formal "resolve" for joining the S+N footprints; (2) we might risk creating a cross-match to the wrong targets depending on which secondary program it is (on this topic, we might be able to make decisions on a program per program basis rather than an object per object basis -- for example the PV secondary program should be handled with care as cross-matching positions meant to be along the major or minor axis of a large galaxy to assign them their own photometry is questionable).

image

@sbailey
Copy link
Contributor

sbailey commented Aug 1, 2023

Expanding on Stephanie's caveat "we create inconsistencies with the formal "resolve" for joining the S+N footprints":

IIUC, elsewhere we treat the +32.375 deg boundary as the strict and only definition of whether sources should be taken from the N or the S footprints, i.e. as if there was no overlap, with the actual overlap being a expert-level debugging feature for data cross checks. In that spirit, I think you should "do nothing" and not do anything clever with checking both footprints in the overlap region. Strictly following the boundary as if the overlap didn't exist would be more consistent with how final target selection resolve catalogs are generated and the benefits to consistency outweigh the recovery of photometry for a small number of targets, especially given that a match in one but not the other is already a sign of something special/problematic about that target.

Mentioning @geordie666 in case there are any caveats to how strict the +32.375 boundary is.

@moustakas
Copy link
Member Author

To clarify: primary targets will always get the correct photometry because TARGETID encodes the appropriate north/south resolve, brickid, objid, etc.

I'm only talking about secondary targets here for which desispec.io.photo.gather_tractorphot uses positional matching to get the photometry. For some classes of targets this is absolutely the wrong thing to do (e.g., an off-center position of a large galaxy targeted as part of the PV program); but for other programs (e.g., a faint dwarf galaxy targeted using non-DR9 imaging), what the code is doing is defensible.

I think it's not crazy to try to check both north and south catalogs in the very rare case of an object like 50532307697666. But maybe I'm splitting hairs.

@sbailey
Copy link
Contributor

sbailey commented Aug 1, 2023

Even for secondary targets, I advocate that you should use the same resolve algorithm as primary target selection, i.e. the +32.375 N/S boundary, without further refinements/complications.

i.e. we do not do anything tricky like "if a dec>32.375 target fails target selection in the N but passes in the S, keep it", so by extension I think you also should not do anything like "if an ra,dec>32.375 location doesn't have a positional match in the N but does in the S, keep it".

Perhaps the tie breaker for @geordie666 to comment on: target selection has the option for secondary targets to be position-matched to primary targets and inherit the primary TARGETID if there is a match. If a dec>32.375 secondary does not have a N match, does the code check if there is a match in the S, or is the S data ignored and this is treated as a non-match full stop?

i.e. I think desispec.io.photo should resolve to the same N/S positional matching algorithm that target selection would have, if matching had been requested at the time of target selection. But perhaps that ship has already sailed and you are already "trying harder" to find a match in other ways...

@geordie666
Copy link

Regarding the tiebreaker:

The target referenced above by @moustakas is also in the Main Survey, during which desitarget was matching (with a 1" radius) to recover fluxes from imaging:

a = fitsio.read("/global/cfs/cdirs/desi/public/ets/target/catalogs/dr9/1.1.1/targets/main/secondary/dark/targets-dark-secondary.fits")
a[80084][["RA", "DEC", "OVERRIDE", "FLUX_G", "FLUX_R", "FLUX_Z", "TARGETID", "SCND_TARGET"]]

Out[]: (186.49770967, 33.60292712, False, 0., 0., 0., 2249555563249666, 2048)

Note that this is an OVERRIDE==False target, so would have been allowed to be merged with a primary. But, no fluxes were recovered and the TARGETID corresponds to a secondary; so, this target never found a primary with which to merge.

Philosophically, I'm not opposed to breaching the N/S boundary when matching secondaries to primaries. But, if the question is asked purely on the basis of "what would desitarget do" then we shouldn't cross the N/S boundary, I think.

Note that I haven't checked all possible corner cases in the code, so I can't promise desitarget never crosses the boundary when matching primaries and secondaries. But, it's fairly convincing that desitarget didn't cross the boundary for the quoted example (TARGETID==50532307697666 in sv1 which is TARGETID==2249555563249666 in main).

One note on the N/S resolve: The boundary isn't purely at dec=32.375o. Sources are resolved based on Galactic latitude, too. Sources have to be both north of the Galactic plane (in Galactic coordinates) and north of Dec ≥ 32.375o
(in equatorial coordinates) to be considered an "N" source.

@moustakas
Copy link
Member Author

From @sbailey:

Even for secondary targets, I advocate that you should use the same resolve algorithm as primary target selection, i.e. the +32.375 N/S boundary, without further refinements/complications.

i.e. we do not do anything tricky like "if a dec>32.375 target fails target selection in the N but passes in the S, keep it", so by extension I think you also should not do anything like "if an ra,dec>32.375 location doesn't have a positional match in the N but does in the S, keep it".

I strongly disagree and I'll note that this is not what the code currently does. This algorithm also doesn't extend to other imaging datasets like DR10 where there is no north-south split and you'd be throwing away data if you did a resolve.

From @geordie666:

Philosophically, I'm not opposed to breaching the N/S boundary when matching secondaries to primaries. But, if the question is asked purely on the basis of "what would desitarget do" then we shouldn't cross the N/S boundary, I think.

For secondary targets, desispec.io.photo.gather_tractorphot intentionally doesn't do what desitarget does. In fact, it was developed in large part because there were secondary targets with perfectly reasonable DR9 photometry that was (intentionally) not propagated into the pipeline. For non-DR9-selected secondary targets, I think it's totally defensible to (a) use positional matching; and (b) cross the north-south resolve boundary in order to prefer DECam imaging (which is what the code does).

In the end, after looking at many other of these targets, I don't think we should do anything different than what we're already doing. I'm going to close this ticket since it's on the critical path for the next set of VACs but if anyone else has thoughts or comments please feel free to add them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

No branches or pull requests

4 participants