# DP0.2 Object Catalog tutorial -- Part VII: SRV releases

Owners: **PatriciaLarsen [@plarsen](https://github.com/LSSTDESC/DC2-analysis/issues/new?body=@patricialarsen)**  
Last Verified to Run: **2024-06-04** (by @plarsen)

This notebook will show you how to access the DP0.2 object catalog through the Generic Catalog Reader (GCR, https://github.com/yymao/generic-catalog-reader), or directly through the Parquet files at NERSC. We will talk through how to look at the metadata and discuss how to read the data efficiently.


__Learning objectives__:

After going through this notebook, you should be able to:
  1. Understand and apply the cuts recommended by the SRV group for various functionalities
  2. See the main validation tests that inform these cuts

__Sections__:

__Note__:
 - This notebook is still an early draft

__Logistics__: This notebook is intended to be run through the JupyterHub NERSC interface available here: https://jupyter.nersc.gov. To set up your NERSC environment, please follow the instructions available here: https://confluence.slac.stanford.edu/display/LSSTDESC/Using+Jupyter+at+NERSC


In [6]:
%%html
<style>
  table {margin-left: 0 !important;}
</style>

### Sets of releases:
 - better photoz around clusters (Markus' work)
 - shear catalog (HSM) 
 - etc. 
 
 References: 
 
 https://academic.oup.com/pasj/article/70/SP1/S25/4774314#431908315 (Mandelbaum et al)
 
 https://iopscience.iop.org/article/10.3847/1538-4365/aab4f5 (DES photometric data set)
 
 Things to look into: Systematics Tests In LEnsing

In [4]:
| Cut | Meaning |
|:---|:-----------:|
| i_detect_isprimary == True | Identify unique detections only |
| i_deblend_skipped == False | Deblender skipped this group of objects |
| i_pixelflags_interpolatedcenter == False | A pixel flagged as interpolated is close to object center |
| i_pixelflags_saturatedcenter == False | A pixel flagged as saturated is close to object center |
| i_pixelflags_crcenter == False | A pixel flagged as a cosmic ray hit is close to object center |
| i_pixelflags_suspectcenter == False | A pixel flagged as near saturation is close to object center |
| i_pixelflags_clipped == False | Source footprint includes clipped pixels |
| i_pixelflags_edge == False | Object too close to image boundary for reliable measurements |
| i_pixelflags_bad == False | A pixel flagged as otherwise bad is close to object center |
| i_extendedness_value != 0 | Extended object |
| i_sdsscentroid_flag == False | Centroid measurement failed |
| i_hsmshaperegauss_flag == False | Error code returned by shape measurement code |
| i_hsmshaperegauss_sigma != NaN | Shape measurement uncertainty should not be NaN |

Galaxy property cuts

| Cut | Meaning |
|:---|:-----------:|
| i_cmodel_flux / i_cmodel_fluxerr ≥ 10 | Galaxy has high enough S/N in i-band |
| i_hsmshaperegauss_resolution ≥ 0.3 | Galaxy is sufficiently resolved |
| (i_hsmshaperegauss_e1^2 + i_hsmshaperegauss_e2^2)^(1/2) < 2 | Cut on the amplitude of galaxy ellipticity |
| 0 ≤ i_hsmshaperegauss_sigma ≤ 0.4 | Estimated shape measurement error is reasonable |
| i_cmodel_mag − a_i ≤ 24.5 | CModel Magnitude cut |
| i_apertureflux_10_mag ≤ 25.5 | Aperture (1 arcsec diameter) magnitude cut |
| i_blendedness_abs < 10^(-0.38) | Avoid spurious detections and those contaminated by blend |

## Base LSST flags

| Flag name | True/False  | Description   |
|:---|:-------------:|:-----------:|
| detect_isPrimary | True  | Removes duplicate detections (primarily from coadd edge regions) as well as skyobjects   |
| refExtendedness  | 1 (open question: should this be != 0?)    | Source is classified as a galaxy | 
| deblend_skipped  | False | Removes objects skipped by deblender | 
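
The base flags above can be combined into a single selection. The following is a minimal sketch, assuming each catalog row is available as a plain Python dict keyed by the column names in the table. The sample `rows` are made up for illustration, and on real data one would more likely build boolean masks over whole columns.

```python
def passes_base_flags(row):
    """Return True if a row satisfies the base LSST flag cuts listed in the table."""
    return (
        bool(row["detect_isPrimary"])        # unique detection, not a sky object
        and row["refExtendedness"] == 1      # classified as extended (galaxy)
        and not row["deblend_skipped"]       # deblender was not skipped
    )


# Made-up rows for illustration only; they do not come from the DP0.2 catalog.
rows = [
    {"detect_isPrimary": True, "refExtendedness": 1, "deblend_skipped": False},
    {"detect_isPrimary": False, "refExtendedness": 1, "deblend_skipped": False},
    {"detect_isPrimary": True, "refExtendedness": 0, "deblend_skipped": False},
]
selected = [passes_base_flags(r) for r in rows]
```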


### Pixel flags (center)

| Flag name | True/False  | Description   |
|:---|:-------------:|:-----------:|
| g_pixelFlags_clippedCenter | False  | Center is close to clipped pixel  |
| g_pixelFlags_crCenter | False  | Cosmic ray in source center   |
| g_pixelFlags_edge | False  | Source outside usable exposure region   |
| g_pixelFlags_inexact_psfCenter | False  | Source center close to inexact psf pixels   |
| g_pixelFlags_interpolatedCenter | False  | Interpolated pixel in source center   |
| g_pixelFlags_offimage | False  | Source center off image   |
| g_pixelFlags_saturatedCenter | False  | Saturated pixel in source center   |
| g_pixelFlags_sensor_edgeCenter | False  | Source center close to sensor edge pixel   |
| g_pixelFlags_suspectCenter | False  | Source center close to suspect pixel   |



### Pixel flags (footprint)

| Flag name | True/False  | Description   |
|:---|:-------------:|:-----------:|
| g_pixelFlags_bad | False  | Shows a bad pixel in the source footprint   |
| g_pixelFlags_clipped | False  | Footprint includes clipped pixels   |
| g_pixelFlags_cr | False  | Cosmic ray in footprint   |
| g_pixelFlags_inexact_psf | False  | Source includes inexact psf pixels   |
| g_pixelFlags_interpolated | False  | Interpolated pixel in source footprint   |
| g_pixelFlags_offimage | False  | Source center off image   |
| g_pixelFlags_saturated | False  | Saturated pixel in source footprint   |
| g_pixelFlags_sensor_edge | False  | Source footprint includes sensor edge pixel   |
| g_pixelFlags_suspect | False  | Source footprint includes suspect pixel   |
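
Since every pixel flag in the two tables must be False, the column names can be generated rather than typed out. The sketch below assumes that other bands follow the same band-prefixed `pixelFlags` naming as the g-band columns listed here, and that a row is a plain dict; the sample row is made up.

```python
# Suffixes copied from the two pixel-flag tables above.
CENTER_FLAGS = [
    "clippedCenter", "crCenter", "edge", "inexact_psfCenter",
    "interpolatedCenter", "offimage", "saturatedCenter",
    "sensor_edgeCenter", "suspectCenter",
]
FOOTPRINT_FLAGS = [
    "bad", "clipped", "cr", "inexact_psf", "interpolated",
    "offimage", "saturated", "sensor_edge", "suspect",
]


def pixel_flag_columns(band, suffixes):
    """Build column names such as 'g_pixelFlags_crCenter' for one band."""
    return [f"{band}_pixelFlags_{suffix}" for suffix in suffixes]


def is_clean(row, band, suffixes):
    """True when none of the listed pixel flags is set for this band."""
    return not any(row[name] for name in pixel_flag_columns(band, suffixes))


# Made-up row: start with every flag unset, then set one center flag.
row = {name: False for name in pixel_flag_columns("g", CENTER_FLAGS + FOOTPRINT_FLAGS)}
clean_before = is_clean(row, "g", CENTER_FLAGS)
row["g_pixelFlags_crCenter"] = True
clean_after = is_clean(row, "g", CENTER_FLAGS)
```

Keeping the center and footprint lists separate makes it easy to apply only the stricter or looser set, depending on the analysis.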


### Photometry cuts

| Flag | True/False  | Description   |
|:---|:-------------:|:-----------:|
| i_cModel_flag | False  | Failure of cModel fit   |


| Flag | Exact Cut  | Description   |
|:---|:-------------:|:-----------:|
| cmodel_magnitude | mag(i_cModel_flux)<24.5 *  | Limiting to match spectroscopic training and calibration data   |
| cmodel S/N       | cModel_S/N > 10            | restricting unforced signal to noise                            |
| blendedness      | i_blendedness < 10^(-0.375) | removing objects with strong neighbour contamination (not mildly blended objects) | 


*Note: this cut should be applied after extinction correction, but that correction is unavailable for DP0.2.
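
The three exact cuts can be written out as follows. This is a minimal sketch under stated assumptions: the cModel fluxes are taken to be in nanojansky (for which the AB zero point is 31.4 mag), the column names `i_cModelFlux` and `i_cModelFluxErr` are assumed from the band-prefixed naming used in this notebook, and no extinction correction is applied.

```python
import math

AB_ZEROPOINT_NJY = 31.4           # AB magnitude of a 1 nJy source (assumes nJy fluxes)
BLENDEDNESS_MAX = 10 ** (-0.375)  # roughly 0.42


def njy_to_abmag(flux_njy):
    """Convert a positive flux in nanojansky to an AB magnitude."""
    return -2.5 * math.log10(flux_njy) + AB_ZEROPOINT_NJY


def passes_photometry_cuts(row):
    """Apply the cModel flag, magnitude, S/N and blendedness cuts to one row."""
    flux = row["i_cModelFlux"]       # assumed column name
    err = row["i_cModelFluxErr"]     # assumed column name
    if row["i_cModel_flag"] or flux <= 0 or err <= 0:
        return False                 # failed fit or unusable measurement
    return (
        njy_to_abmag(flux) < 24.5
        and flux / err > 10
        and row["i_blendedness"] < BLENDEDNESS_MAX
    )


# Made-up example row, for illustration only.
example = {"i_cModelFlux": 1000.0, "i_cModelFluxErr": 50.0,
           "i_cModel_flag": False, "i_blendedness": 0.1}
keep = passes_photometry_cuts(example)
```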

## Extra flags

| Flag name | True/False  | Description   |
|:---|:-------------:|:-----------:|
| detect_isIsolated | True  | Removes blended objects (test for deblender) |


https://dm.lsst.org/sdm_schemas/browser/dp02.html


In [None]:


Resolution factor [equation (4)] R2 ≥ 0.3. A completely unresolved object will have R2 = 0, while a
fully resolved one will have R2 = 1.

Total magnitude of the distortion (after PSF correction) defined in equation (2) should satisfy the constraint
|e| < 2. Due to noise, the distribution of distortion values extends into the non-physical |e| > 1 regime. 
Truncating the distribution too aggressively at 1 leads to a negative shear bias; however, some truncation
is needed to enable mean shear statistics to converge.
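
The resolution and distortion cuts can be sketched in a few lines. Equation (4) is not reproduced in these notes, so the sketch assumes the usual regaussianization definition R2 = 1 - T_PSF / T_image, where T is the trace of the second-moment matrix; the function and argument names are hypothetical. The shape-uncertainty range [0, 0.4] described in the next paragraph is included for completeness.

```python
import math


def resolution_factor(trace_psf, trace_image):
    """R2 under the assumed definition: 0 when unresolved, approaching 1 when well resolved."""
    return 1.0 - trace_psf / trace_image


def passes_shape_cuts(e1, e2, sigma_e, resolution):
    """Apply R2 >= 0.3, |e| < 2 and 0 <= sigma_e <= 0.4 to one object."""
    if math.isnan(sigma_e):
        return False                      # uncertainty must be a number
    distortion = math.hypot(e1, e2)       # |e| = sqrt(e1^2 + e2^2)
    return resolution >= 0.3 and distortion < 2.0 and 0.0 <= sigma_e <= 0.4


# Made-up values: an object whose image trace is four times the PSF trace.
r2 = resolution_factor(0.5, 2.0)
keep = passes_shape_cuts(0.3, 0.4, 0.1, r2)
```

Note that the distortion cut deliberately keeps objects with 1 < |e| < 2, in line with the remark above about not truncating at 1.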

The catalog estimate of the shape measurement uncertainty due to pixel noise, σe, 
should lie in the range [0, 0.4]. This cut removes only a tiny fraction of highly anomalous objects, 
<1% of those that pass the other cuts.

Multi-band detection cut, defined by requiring at least two other bands (out of grzy) 
to have a cmodel detection significance exceeding 5. This cut removes a very small fraction of objects,
<1%, that pass our other cuts. In addition to ensuring enough color information to compute a photometric redshift, 
this cut also helps remove junk detections, asteroids (Hildebrandt et al. 2017), and so on.
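
The multi-band requirement amounts to counting bands. A minimal sketch, assuming detection significance is approximated by cModel flux divided by its error and that the columns follow the band-prefixed `cModelFlux` / `cModelFluxErr` naming (an assumption, as is the dict-based row); the sample values are made up.

```python
def n_significant_bands(row, bands=("g", "r", "z", "y"), threshold=5.0):
    """Count the bands whose cModel signal-to-noise exceeds the threshold."""
    count = 0
    for band in bands:
        flux = row[f"{band}_cModelFlux"]      # assumed column name
        err = row[f"{band}_cModelFluxErr"]    # assumed column name
        if err > 0 and flux / err > threshold:
            count += 1
    return count


def passes_multiband_cut(row):
    """Require at least two of the g, r, z, y bands to be significant."""
    return n_significant_bands(row) >= 2


# Made-up fluxes: g and z exceed S/N = 5, r and y do not.
example = {
    "g_cModelFlux": 100.0, "g_cModelFluxErr": 10.0,
    "r_cModelFlux": 30.0, "r_cModelFluxErr": 10.0,
    "z_cModelFlux": 60.0, "z_cModelFluxErr": 10.0,
    "y_cModelFlux": 10.0, "y_cModelFluxErr": 10.0,
}
n_bands = n_significant_bands(example)
```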


As noted in sub-subsection 2.3.3, the object must lie in a region where all 
overlapping exposures contributed to the coadd, so the coadded PSF model 
(which does not account for missing pixels within sensors) is correct. 
Due to a bug in hscPipe, this filtering was not complete; a small number of
objects lying on CCD boundaries, sensor defects, or cosmic rays were not flagged 
by the pipeline and could not be removed in this cut. The internal PSF quality tests 
in subsection 4.2 are sensitive to this problem, however, and demonstrate that its effects 
do not cause the PSF model errors to exceed our requirements.



In [None]:
Principal Columns: For convenience, Rubin Observatory staff have identified the principal columns which are most likely to be useful. These principal columns will be pre-selected in the Table View of the RSP’s Portal Aspect.

Recommended Search Parameter “detect_isPrimary = True”: A good default search query parameter for the Object, Source, and ForcedSource catalogs is to set detect_isPrimary = True. The detect_isPrimary parameter is True if a source has no children, is in the inner region of a coadd patch, is in the inner region of a coadd tract, and is not “detected” in a pseudo-filter. Setting detect_isPrimary to True will remove any duplicates, sky objects, etc. See this documentation on filtering for unique, deblended sources with the detect_isPrimary flag for more information.
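
The definition of `detect_isPrimary` given above is a conjunction of four conditions, which can be spelled out as below. This is only an illustration of the logic: the argument names are hypothetical and are not claimed to match catalog columns, and in practice the precomputed `detect_isPrimary` flag should be used directly.

```python
def is_primary(n_children, in_patch_inner, in_tract_inner, in_pseudo_filter):
    """Combine the four conditions listed in the text into one boolean."""
    return bool(
        n_children == 0           # source has no children
        and in_patch_inner        # inner region of a coadd patch
        and in_tract_inner        # inner region of a coadd tract
        and not in_pseudo_filter  # not "detected" in a pseudo-filter (e.g. sky object)
    )


# Made-up cases: an isolated inner source, and a blended parent with two children.
isolated = is_primary(0, True, True, False)
parent = is_primary(2, True, True, False)
```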

For photometry of point sources: PSF model fluxes are generally recommended, but there could be issues for objects near the edges of CCDs. For single-visit (source) photometry, it is recommended to use psfFlux for the flux, psfFluxErr for the flux error, and psfFlux_flag for culling sources with poorly determined PSF model fluxes. For coadd (object) photometry, it is recommended to use <band>_psfFlux for the flux, <band>_psfFluxErr for the flux error, and <band>_pixelFlags_inexact_psfCenter to identify objects which may contain sources with poorly determined PSF photometry. (Note: the object <band>_inputCount value can help indicate how strong this effect may be; the larger <band>_inputCount, the smaller the effect.)
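
For coadd (object) photometry, the recommended columns can be gathered per band with a small helper. A minimal sketch, assuming a row is a plain dict; the helper names are hypothetical and the sample values are made up.

```python
def coadd_psf_columns(band):
    """Column names recommended in the text for coadd PSF photometry in one band."""
    return {
        "flux": f"{band}_psfFlux",
        "flux_err": f"{band}_psfFluxErr",
        "psf_flag": f"{band}_pixelFlags_inexact_psfCenter",
        "n_input": f"{band}_inputCount",
    }


def psf_snr_if_reliable(row, band):
    """Return the PSF-flux signal-to-noise, or None if the PSF model is flagged inexact."""
    cols = coadd_psf_columns(band)
    if row[cols["psf_flag"]]:
        return None
    return row[cols["flux"]] / row[cols["flux_err"]]


# Made-up row for illustration only.
example = {"i_psfFlux": 500.0, "i_psfFluxErr": 25.0,
           "i_pixelFlags_inexact_psfCenter": False, "i_inputCount": 40}
snr = psf_snr_if_reliable(example, "i")
```

Returning `None` for flagged objects is one possible choice; a softer alternative would be to keep them and weight by the input count, since the text notes the effect shrinks as the input count grows.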

For photometry of extended sources: <band>_cModelFlux is a reasonable choice for galaxy fluxes, but the Gaussian aperture fluxes are generally preferred for galaxy colors. Of the many Gaussian aperture fluxes, the <band>_gaap1p0Flux (the sigma=1.0-arcsec Gaussian aperture) seems to be a reasonable choice. Currently, the Gaussian optimal aperture (<band>_gaapOptimalFlux) tends to fail often and is not generally recommended. For further information on Gaussian aperture photometry, please consult Kuijken (2008), Kuijken et al. (2015), and/or Konrad Kuijken’s talk at the March 2020 Rubin Observatory Algorithms Workshop.
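
A galaxy color from the 1.0-arcsec Gaussian aperture fluxes is just a flux ratio expressed in magnitudes, so the zero point cancels as long as both fluxes share the same units. A minimal sketch, assuming a dict-based row; the function name is hypothetical and the sample fluxes are made up.

```python
import math


def gaap_color(row, band_blue, band_red):
    """Color (blue minus red, in magnitudes) from the gaap1p0Flux columns.

    Returns None when either flux is non-positive, since the magnitude
    is undefined in that case.
    """
    f_blue = row[f"{band_blue}_gaap1p0Flux"]
    f_red = row[f"{band_red}_gaap1p0Flux"]
    if f_blue <= 0 or f_red <= 0:
        return None
    return -2.5 * math.log10(f_blue / f_red)


# Made-up fluxes: the r-band flux is ten times the g-band flux.
example = {"g_gaap1p0Flux": 100.0, "r_gaap1p0Flux": 1000.0}
g_minus_r = gaap_color(example, "g", "r")
```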

Truth catalog data: The TruthSummary and MatchesTruth tables are accessible via TAP (and not the Butler) as demonstrated in DP0.2 tutorial notebook “DP02_08_Truth_Tables.ipynb”. Additional truth data has been made available by DESC as parquet files in the shared disk space in the RSP at data.lsst.cloud, with access demonstrated in this DP0.2 contributed notebook. Find more information about the matching algorithm in Matching the Object and Truth Tables.




