### This notebook uses a current PR of gcr-catalogs. Make sure you clone the PR and add it to `sys.path`:
```
git clone -b coadd_reader git@github.com:djperrefort/gcr-catalogs.git
```

This notebook also requires `FoFCatalogMatching`: https://github.com/yymao/FoFCatalogMatching

### This notebook tests the FoF Catalog Matching algorithm for truth and coadd catalogs, run 1.1
### Owners: Anže Slosar, Bhairav Valera

In [None]:
import sys
sys.path.insert(0, '/global/common/software/lsst/common/miniconda/current/lib/python3.6/site-packages')
sys.path.insert(0, '/global/homes/b/bhairav/gcr-catalogs')
import numpy as np
import FoFCatalogMatching
import GCRCatalogs
import operator
from astropy.coordinates import SkyCoord
from GCR import GCRQuery
from scipy.optimize import curve_fit
from collections import defaultdict
from matplotlib.colors import LogNorm

In [None]:
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline

Getting the complete truth and coadd catalogs without tract restrictions.

In [None]:
coaddCat = GCRCatalogs.load_catalog('dc2_coadd_run1.1p')
refCat = GCRCatalogs.load_catalog('dc2_truth_run1.1', {'md5': None})

Adding quantity modifiers so that the HSC lensing cuts can be applied to the coadd catalog.
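
As a rough, self-contained illustration of what these modifiers compute, the sketch below applies the same flux-to-magnitude and PSF-size formulas to toy NumPy values. The function names `flux_to_mag` and `psf_fwhm_arcsec` are hypothetical. The zero point of 27.0 and the factors 0.168 and 2.355 are copied from the cell below; reading them as a pixel scale in arcsec/pixel and the Gaussian sigma-to-FWHM factor is an assumption.

```python
import numpy as np

def flux_to_mag(flux, zero_point=27.0):
    """Convert a flux to a magnitude with the zero point used in the notebook."""
    return -2.5 * np.log10(flux) + zero_point

def psf_fwhm_arcsec(xx, yy, xy, pixel_scale=0.168, sigma_to_fwhm=2.355):
    """PSF size from second moments: determinant radius scaled by the two factors.

    Interpreting the factors as pixel scale and sigma-to-FWHM is an assumption.
    """
    sigma_pix = (xx * yy - xy * xy) ** 0.25
    return pixel_scale * sigma_to_fwhm * sigma_pix

# Toy inputs (hypothetical values).
toy_flux = np.array([1.0, 100.0, 10000.0])
toy_mag = flux_to_mag(toy_flux)
toy_psf = psf_fwhm_arcsec(4.0, 4.0, 0.0)
```

A factor of 100 in flux corresponds to 5 magnitudes, so the toy fluxes map to magnitudes 27, 22 and 17.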

In [None]:
refCat.add_quantity_modifier('i_mag_cmodel', 
                              (lambda x: -2.5 * np.log10(x) + 27.0, 'i_modelfit_CModel_flux'), 
                              overwrite=True)
refCat.add_quantity_modifier('i_SN_cmodel', 
                              (np.divide, 'i_modelfit_CModel_flux', 'i_modelfit_CModel_fluxSigma'), 
                              overwrite=True)
refCat.add_quantity_modifier('HSM_res', 
                              'ext_shapeHSM_HsmShapeRegauss_resolution', 
                              overwrite=True)
refCat.add_quantity_modifier('HSM_ell', 
                              (np.hypot, 'ext_shapeHSM_HsmShapeRegauss_e1', 'ext_shapeHSM_HsmShapeRegauss_e2'), 
                              overwrite=True)
refCat.add_quantity_modifier('psf_size', 
                              (lambda xx, yy, xy: 0.168*2.355*(xx*yy - xy*xy)**0.25, 'i_base_SdssShape_psf_xx', 'i_base_SdssShape_psf_yy', 'i_base_SdssShape_psf_xy'),
                              overwrite=True)

coaddCat.add_quantity_modifier('i_mag_cmodel', 
                              (lambda x: -2.5 * np.log10(x) + 27.0, 'i_modelfit_CModel_flux'), 
                              overwrite=True)
coaddCat.add_quantity_modifier('i_SN_cmodel', 
                              (np.divide, 'i_modelfit_CModel_flux', 'i_modelfit_CModel_fluxSigma'), 
                              overwrite=True)
coaddCat.add_quantity_modifier('HSM_res', 
                              'ext_shapeHSM_HsmShapeRegauss_resolution', 
                              overwrite=True)
coaddCat.add_quantity_modifier('HSM_ell', 
                              (np.hypot, 'ext_shapeHSM_HsmShapeRegauss_e1', 'ext_shapeHSM_HsmShapeRegauss_e2'), 
                              overwrite=True)
coaddCat.add_quantity_modifier('psf_size', 
                              (lambda xx, yy, xy: 0.168*2.355*(xx*yy - xy*xy)**0.25, 'i_base_SdssShape_psf_xx', 'i_base_SdssShape_psf_yy', 'i_base_SdssShape_psf_xy'),
                              overwrite=True)

Restricting the sample in right ascension/declination, as well as in u- and i-band magnitude, decreases the runtime of the matching.
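
The selections these filters express can be sketched with plain NumPy boolean masks. This is only an illustration on a toy catalog with hypothetical values, and it assumes that the conditions passed to one `GCRQuery` are combined with a logical AND.

```python
import numpy as np

# Toy catalog (hypothetical values) using column names from the queries below.
toy_cat = {
    'ra': np.array([55.4, 55.6, 55.9, 56.0]),
    'dec': np.array([-28.7, -28.9, -28.6, -28.7]),
    'mag_i_lsst': np.array([22.0, 24.0, np.nan, 23.0]),
}

ra_min, ra_max = 55.5, 56.0
dec_min, dec_max = -29.0, -28.5

# Each condition is a boolean mask; conditions are combined with logical AND.
coord_mask = (
    (toy_cat['ra'] >= ra_min) & (toy_cat['ra'] < ra_max)
    & (toy_cat['dec'] >= dec_min) & (toy_cat['dec'] < dec_max)
)
mag_mask = np.isfinite(toy_cat['mag_i_lsst']) & (toy_cat['mag_i_lsst'] < 24.5)

# Row indices of the objects that satisfy both sets of cuts.
selected = np.flatnonzero(coord_mask & mag_mask)
```

The upper bounds are exclusive, so the object at `ra = 56.0` is dropped, and the NaN magnitude fails the finiteness check.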

In [None]:
ra_min, ra_max = 55.5, 56.0
dec_min, dec_max = -29.0, -28.5

coord_filter = GCRQuery(
    'ra >= {}'.format(ra_min),
    'ra < {}'.format(ra_max),
    'dec >= {}'.format(dec_min),
    'dec < {}'.format(dec_max)
)

mag_cutIn = GCRQuery(
    (np.isfinite, 'lsst_u'),
    'lsst_u < 45',
#    (np.isfinite, 'i_magLSST'),
#    'i_magLSST < 24.5',
)
mag_cutOut = GCRQuery(
    (np.isfinite, 'mag_i_lsst'),
    'mag_i_lsst < 24.5'
)

This cell lists the HSC lensing cuts that can be applied to the coadd catalog to investigate objects that deviate from what is expected at a given position on the sky. The cuts are commented out and are included here for documentation purposes only. `lensing_cuts` is a list of filters that can be applied to the coadd catalog; as written, it contains only the coordinate cut.

In [None]:
#GCRQuery((np.isnan, 'i_modelfit_CModel_flux')), # (from this and below) remove nan entries
#GCRQuery((np.isnan, 'ext_shapeHSM_HsmShapeRegauss_resolution')),
#GCRQuery((np.isnan, 'ext_shapeHSM_HsmShapeRegauss_e1')),
#GCRQuery((np.isnan, 'ext_shapeHSM_HsmShapeRegauss_e2')),
#GCRQuery('detect_isPrimary'), # (from this and below) basic flag cuts 
#GCRQuery('deblend_skipped'),
#GCRQuery('base_PixelFlags_flag_edge'),
#GCRQuery('base_PixelFlags_flag_interpolatedCenter'),
#GCRQuery('base_PixelFlags_flag_saturatedCenter'),
#GCRQuery('base_PixelFlags_flag_crCenter'),
#GCRQuery('base_PixelFlags_flag_bad'),
#GCRQuery('base_PixelFlags_flag_suspectCenter'),
#GCRQuery('base_PixelFlags_flag_clipped'),
#GCRQuery('ext_shapeHSM_HsmShapeRegauss_flag'),
#GCRQuery('i_SN_cmodel >= 10'), # (from this and below) cut on object properties
#GCRQuery('HSM_res >= 0.3'),
#GCRQuery('HSM_ell < 2.0'),
#GCRQuery('ext_shapeHSM_HsmShapeRegauss_sigma <= 0.4'),
#GCRQuery('i_mag_cmodel < 24.5'), # FIXME: Doesn't have extinction correction
#GCRQuery('base_Blendedness_abs_flux < 10**(-0.375)'),
lensing_cuts = [
    GCRQuery(
    'ra >= {}'.format(ra_min),
    'ra < {}'.format(ra_max),
    'dec >= {}'.format(dec_min),
    'dec < {}'.format(dec_max))
    #GCRQuery('mag_i < 24.5')
]

Getting all relevant quantities. Note that no filters are applied here, so that the entire data set can be examined. Restrictions can be applied either as a list of filters or through a single `GCRQuery()`.
To apply filters:
```
coordinatesCoaddCat = coaddCat.get_quantities([.....], filters=lensing_cuts)
```
as a set (list) of different filters or
```
coordinatesRefCat = refCat.get_quantities([.....], filters=[coord_filter]) 
```
as just one set of filters.

In [None]:
coordinatesCoaddCat = coaddCat.get_quantities(['ra', 'dec',
                                               'i_modelfit_CModel_flux',
                                               'ext_shapeHSM_HsmShapeRegauss_resolution',
                                               'ext_shapeHSM_HsmShapeRegauss_e1',
                                               'ext_shapeHSM_HsmShapeRegauss_e2',
                                               'detect_isPrimary',
                                               'deblend_skipped',
                                               'base_PixelFlags_flag_edge',
                                               'base_PixelFlags_flag_interpolatedCenter',
                                               'base_PixelFlags_flag_saturatedCenter',
                                               'base_PixelFlags_flag_crCenter',
                                               'base_PixelFlags_flag_bad',
                                               'base_PixelFlags_flag_suspectCenter',
                                               'base_PixelFlags_flag_clipped',
                                               'ext_shapeHSM_HsmShapeRegauss_flag',
                                               'i_SN_cmodel',
                                               'HSM_res',
                                               'HSM_ell',
                                               'ext_shapeHSM_HsmShapeRegauss_sigma',
                                               'i_mag_cmodel',
                                               'base_Blendedness_abs_flux',
                                               'mag_u',
                                               'mag_g',
                                               'mag_r',
                                               'mag_i',
                                               'mag_z',
                                               'mag_y'                                               
                                              ])#, filters=lensing_cuts)

coordinatesRefCat = refCat.get_quantities(['ra', 'dec',
                                           'mag_true_u',
                                           'mag_true_g',
                                           'mag_true_r',
                                           'mag_true_i',
                                           'mag_true_z',
                                           'mag_true_y'])#, filters=[coord_filter])

Running the catalog matching takes ~20 minutes for a linking length (maximum separation) of 1 arcsecond and ~1 hour for 2 arcseconds. Larger linking lengths have not been tested but are expected to take considerably longer.
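
To make the role of the linking length concrete, here is a minimal friends-of-friends sketch on a flat patch of sky. It is a brute-force illustration with a hypothetical function name and toy positions, not the implementation used by `FoFCatalogMatching`.

```python
import math

def friends_of_friends(points, linking_length):
    """Group 2D points: two points within linking_length are linked, and
    groups are the connected components of those links (brute force, O(N^2))."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the group root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive group ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]

# Toy positions in arcseconds (hypothetical values).
toy_points = [(0.0, 0.0), (0.8, 0.0), (1.6, 0.0), (5.0, 5.0)]
group_ids = friends_of_friends(toy_points, linking_length=1.0)
```

The first and third points are 1.6 arcsec apart, yet they end up in the same group because both are linked to the second point. A larger linking length chains more objects together in this way.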

In [None]:
results = FoFCatalogMatching.match(
    catalog_dict=dict(ref=coordinatesRefCat, coadd=coordinatesCoaddCat),
    linking_lengths=1.0,
    catalog_len_getter=lambda x: len(x['ra']))

The next two cells sort the match results into bins based on the quality of the matches. Bins are labeled [number of truth objects][number of coadd objects], and the color scale shows the number of groups in each bin. The [1][1] bin holds perfect matches (1 truth object matched with 1 coadd object). [1][0] holds truth objects that were not detected in the coadd. [0][1] holds "fake" detections, i.e. coadd objects without a truth counterpart. Higher bins point to blending issues.
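
The counting trick used in the next cell can be checked on a toy example. The arrays below are hypothetical stand-ins for the `group_id` and `catalog_key` columns of `results`. Note that the resulting array is indexed as `hist_2d[n_coadd, n_ref]`, so the row index is the coadd count.

```python
import numpy as np

# Toy match output (hypothetical): one row per object, with its group and source catalog.
group_id = np.array([0, 0, 1, 2, 2, 2, 3])
catalog_key = np.array(['ref', 'coadd', 'ref', 'ref', 'ref', 'coadd', 'coadd'])

ref_mask = catalog_key == 'ref'
n_groups = group_id.max() + 1

# Number of ref and coadd members in each group.
n_ref = np.bincount(group_id[ref_mask], minlength=n_groups)
n_coadd = np.bincount(group_id[~ref_mask], minlength=n_groups)

# Flatten each (n_coadd, n_ref) pair into one integer, count, and reshape to 2D.
n_max = max(n_ref.max(), n_coadd.max()) + 1
hist_2d = np.bincount(n_coadd * n_max + n_ref, minlength=n_max * n_max).reshape(n_max, n_max)
```

Here group 0 is a one-to-one match, group 1 is an undetected truth object, group 2 is a blend of two truth objects into one coadd object, and group 3 is a coadd object without a truth counterpart.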

In [None]:
ref_mask = results['catalog_key'] == 'ref'
coadd_mask = ~ref_mask
n_groups = results['group_id'].max() + 1
n_ref = np.bincount(results['group_id'][ref_mask], minlength=n_groups)
n_coadd = np.bincount(results['group_id'][coadd_mask], minlength=n_groups)
n_max = max(n_ref.max(), n_coadd.max()) + 1
hist_2d = np.bincount(n_coadd * n_max + n_ref, minlength=n_max*n_max).reshape(n_max, n_max)

In [None]:
plt.figure(figsize=(10, 10))
e = (-0.5, n_max - 0.5, -0.5, n_max - 0.5)  # bin centers fall on integer counts
plt.imshow(hist_2d, extent=e, origin='lower', norm=LogNorm())
plt.xlabel('Number of ref objects')
plt.ylabel('Number of coadd objects')
plt.axis([-0.5, 5, -0.5, 5])
cbar = plt.colorbar()
plt.savefig('matching_histogram.png', dpi = 300)

Getting the relevant masks (all objects in a particular bin) and then extracting the row indices of those objects in the input catalogs.
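
A small worked example of this selection, on hypothetical arrays laid out like the columns of `results`. The ref and coadd indices pair up element by element only if the rows are ordered by group, which is assumed here.

```python
import numpy as np

# Toy match output (hypothetical values), rows ordered by group.
group_id = np.array([0, 0, 1, 2, 2, 3])
catalog_key = np.array(['ref', 'coadd', 'coadd', 'ref', 'coadd', 'ref'])
row_index = np.array([10, 7, 3, 11, 4, 12])

ref_mask = catalog_key == 'ref'
n_groups = group_id.max() + 1
n_ref = np.bincount(group_id[ref_mask], minlength=n_groups)
n_coadd = np.bincount(group_id[~ref_mask], minlength=n_groups)

# Groups with exactly one ref and one coadd member, then the rows in those groups.
one_to_one_groups = np.flatnonzero((n_ref == 1) & (n_coadd == 1))
in_one_to_one = np.isin(group_id, one_to_one_groups)

# Row indices into the original ref and coadd catalogs.
ref_idx11 = row_index[in_one_to_one & ref_mask]
coadd_idx11 = row_index[in_one_to_one & ~ref_mask]
```

Groups 0 and 2 are one-to-one, so ref rows 10 and 11 pair with coadd rows 7 and 4.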

In [None]:
one_to_one_group_mask = np.isin(results['group_id'], np.flatnonzero((n_ref == 1) & (n_coadd == 1)))
zero_to_one_group_mask = np.isin(results['group_id'], np.flatnonzero((n_ref == 0) & (n_coadd == 1)))
ref_idx11 = results['row_index'][one_to_one_group_mask & ref_mask]
coadd_idx11 = results['row_index'][one_to_one_group_mask & coadd_mask]
ref_idx01 = results['row_index'][zero_to_one_group_mask & ref_mask]
coadd_idx01 = results['row_index'][zero_to_one_group_mask & coadd_mask]

The Astropy package allows us to easily calculate the angular separation between objects using the `SkyCoord` class. The indices `ref_idx11` and `coadd_idx11` are used to verify that the perfect matches indeed reference the same objects. The first plot shows how accurate the matching is, with most objects having almost no offset in right ascension/declination.
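
For intuition about what an on-sky separation means, the sketch below computes a great-circle separation with the haversine formula using only the standard library. The function name is hypothetical, and this is not necessarily the formula Astropy uses internally.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees, output in arcsec."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0

# Offsets of one arcsecond in each coordinate, near the field used in this notebook.
sep_dec = angular_separation_arcsec(55.7, -28.8, 55.7, -28.8 + 1.0 / 3600.0)
sep_ra = angular_separation_arcsec(55.7, -28.8, 55.7 + 1.0 / 3600.0, -28.8)
```

A one-arcsecond offset in declination is one arcsecond on the sky, whereas a one-arcsecond offset in right ascension at this declination is only about 0.88 arcsec, since it scales with cos(dec). The RA difference computed in the cell below is a plain coordinate difference without this scaling.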

In [None]:
ref_sc = SkyCoord(coordinatesRefCat['ra'][ref_idx11], coordinatesRefCat['dec'][ref_idx11], unit="deg")
coadd_sc = SkyCoord(coordinatesCoaddCat['ra'][coadd_idx11], coordinatesCoaddCat['dec'][coadd_idx11], unit="deg")
delta_ra = (coadd_sc.ra.arcsec - ref_sc.ra.arcsec)
delta_dec = (coadd_sc.dec.arcsec - ref_sc.dec.arcsec)
delta_arcsec = ref_sc.separation(coadd_sc).arcsec

In [None]:
plt.figure(figsize=(10, 10))
plt.xlabel(r'$\Delta$ RA[arcseconds]')
plt.ylabel(r'$\Delta$ Dec[arcseconds]')
plt.hist2d(delta_ra, delta_dec, bins=40, norm=LogNorm());
cbar = plt.colorbar()
plt.savefig('deltaRA-DEC.png', dpi = 300)

The angular separation plot shows that most matches occurred at small separations. A fit of the form a * x * exp(-b * x ** 0.38), which is an exponential-type function rather than a pure exponential, describes how many objects are expected at a given angular separation.
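
One property of this functional form is easy to work out by hand: setting the derivative of a * x * exp(-b * x ** c) to zero gives b * c * x ** c = 1, so the curve peaks at x = (1 / (b * c)) ** (1 / c), independent of a. The sketch below checks this numerically. The function names and parameter values are hypothetical and are not the fitted values.

```python
import math

def separation_model(x, a, b, c=0.38):
    """Model used for the fit in this notebook: a * x * exp(-b * x**c)."""
    return a * x * math.exp(-b * x ** c)

def model_peak(b, c=0.38):
    """Location of the maximum, from the condition b * c * x**c = 1."""
    return (1.0 / (b * c)) ** (1.0 / c)

# Hypothetical parameter values, chosen only to exercise the functions.
a_toy, b_toy = 1000.0, 8.0
x_peak = model_peak(b_toy)

# The model should be lower on either side of the predicted peak.
peak_value = separation_model(x_peak, a_toy, b_toy)
left_value = separation_model(0.9 * x_peak, a_toy, b_toy)
right_value = separation_model(1.1 * x_peak, a_toy, b_toy)
```

Once `curve_fit` has returned a value for `b`, the same expression gives the separation at which the fitted curve peaks.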

In [None]:
#Plotting Delta angle for the outputs
plt.figure(figsize=(10, 10))
values, edges, _ = plt.hist(delta_arcsec, bins=1000)
plt.xlim(0, 1.0)
plt.title('Delta angle')
plt.xlabel(r'$\Delta$ angle [arcsec]')
plt.ylabel('Frequency')
plt.savefig('DeltaAngle.png', dpi = 300)

In [None]:
plt.figure(figsize = (10, 10))
plt.plot(edges[:-1], values, label='True')
x_data = edges[:-1]
y_data = values
plt.plot(edges[:-1], values*0., 'k:')
def fitfunction1(x, a, b):
    y = a * x * np.exp(-b * x ** 0.38)
    return y
(A, B), _ = curve_fit(fitfunction1, x_data, y_data)
plt.plot(x_data, fitfunction1(x_data, A, B), label='fitted')
plt.title('Fitted with the form a * x * exp(-b * x ** 0.38)')
plt.xlabel(r'$\Delta$ Angle [arcsec]')
plt.ylabel('Frequency')
ax = plt.gca()
ax.legend(loc='lower left')
plt.savefig('functionFit2.png', dpi = 300)

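The `lensing_filters` list contains all of the HSC lensing cuts to be applied to the coadd catalog. In each entry, a leading `@` selects values that are not NaN and a leading `~` selects objects for which the flag is not set; `None` and `All` stand for no cut and for all cuts combined. `listParser` evaluates each cut individually on the (0,1) and (1,1) bins and returns the number of objects passing each cut, together with the indices of the objects that pass all cuts (`AF_*`) or fail at least one (`NAF_*`). The counts are plotted in the next cell.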
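
The cut syntax can be illustrated with a compact evaluator that turns one cut string into a boolean mask. This is a simplified sketch on toy columns with hypothetical names and values; it is not the `listParser` function defined below and does not handle the `None` and `All` entries.

```python
import operator
import numpy as np

COMPARISONS = {'<': operator.lt, '<=': operator.le, '>': operator.gt, '>=': operator.ge}

def evaluate_cut(cut, columns):
    """Return a boolean mask for one cut string.

    '@name' keeps non-NaN values, '~name' keeps rows where a flag is False,
    'name op value' compares a column to a number, and a bare 'name' is a flag.
    """
    parts = cut.split(' ')
    if len(parts) == 3:
        name, op, value = parts
        return COMPARISONS[op](columns[name], float(value))
    if cut.startswith('@'):
        return ~np.isnan(columns[cut[1:]])
    if cut.startswith('~'):
        return ~columns[cut[1:]]
    return columns[cut]

# Toy columns (hypothetical names and values).
toy_columns = {
    'flux': np.array([1.0, np.nan, 3.0, 4.0]),
    'flag_bad': np.array([False, False, True, False]),
    'snr': np.array([12.0, 15.0, 20.0, 5.0]),
}
toy_cuts = ['@flux', '~flag_bad', 'snr >= 10']

masks = [evaluate_cut(cut, toy_columns) for cut in toy_cuts]
n_passing = [int(m.sum()) for m in masks]       # objects passing each cut on its own
passes_all = np.logical_and.reduce(masks)       # objects passing every cut
```

Each cut removes a different object here, so three objects pass each cut individually but only the first passes all of them.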

In [None]:
lensing_filters = ['None',
              '@i_modelfit_CModel_flux',
              '@ext_shapeHSM_HsmShapeRegauss_resolution',
              '@ext_shapeHSM_HsmShapeRegauss_e1', 
              '@ext_shapeHSM_HsmShapeRegauss_e2',
              'detect_isPrimary',
              '~deblend_skipped',
              '~base_PixelFlags_flag_edge',
              '~base_PixelFlags_flag_interpolatedCenter',
              '~base_PixelFlags_flag_saturatedCenter',
              '~base_PixelFlags_flag_crCenter',
              '~base_PixelFlags_flag_bad',
              '~base_PixelFlags_flag_suspectCenter',
              '~base_PixelFlags_flag_clipped',
              '~ext_shapeHSM_HsmShapeRegauss_flag',
              'i_SN_cmodel >= 10',
              'HSM_res >= 0.3',
              'HSM_ell < 2.0',
              'ext_shapeHSM_HsmShapeRegauss_sigma <= 0.4',
              'i_mag_cmodel < 24.5',
              'base_Blendedness_abs_flux < 0.42169650342',
              'All'
             ]

allFiltersList = []

def listParser(listOfFilters):
    rel_ops = set(['<', '<=', '>', '>='])
    operatorsTable = {'<' : operator.lt,
                      '>' : operator.gt,
                      '<=' : operator.le,
                      '>=' : operator.ge}
    values_list01 = []
    values_list11 = []
    AllA, AllB = None, None
    
    for x in listOfFilters:
        if x=='None':
            a=np.ones(len(coadd_idx01),dtype=bool)
            b=np.ones(len(coadd_idx11),dtype=bool)
        elif x=="All":
            a=AllA
            b=AllB
            AF_coadd_idx01 = coadd_idx01[np.where(a)]
            AF_coadd_idx11 = coadd_idx11[np.where(b)]
            NAF_coadd_idx01 = coadd_idx01[np.where(~a)]
            NAF_coadd_idx11 = coadd_idx11[np.where(~b)]
            AF_ref_idx11 = ref_idx11[np.where(b)]
            NAF_ref_idx11 = ref_idx11[np.where(~b)]
        elif not any(r in x for r in rel_ops):
            notNan=False
            logicalNot=False
            if x[0]=='@':
                notNan=True
                x=x[1:]
            elif x[0]=='~':
                logicalNot=True
                x=x[1:]
            a = coordinatesCoaddCat[x][coadd_idx01]
            b = coordinatesCoaddCat[x][coadd_idx11]
            if notNan:
                a=~np.isnan(a)
                b=~np.isnan(b)
            elif logicalNot:
                a=~a
                b=~b
        else:
            filterName,op,value=x.split(' ')
            value=float(value)
            a = operatorsTable[op](coordinatesCoaddCat[filterName][coadd_idx01], value)
            b = operatorsTable[op](coordinatesCoaddCat[filterName][coadd_idx11], value)
        values_list01.append(np.sum(a))
        values_list11.append(np.sum(b))
        if AllA is None:
            AllA=a
            AllB=b
        else:
            AllA = AllA & a
            AllB = AllB & b
    return np.array(values_list01), np.array(values_list11), AF_coadd_idx01, AF_coadd_idx11, NAF_coadd_idx01, NAF_coadd_idx11, AF_ref_idx11, NAF_ref_idx11

values_list01, values_list11, AF_coadd_idx01, AF_coadd_idx11, NAF_coadd_idx01, NAF_coadd_idx11, AF_ref_idx11, NAF_ref_idx11 = listParser(lensing_filters)

In [None]:
xaxis = list(range(len(lensing_filters)))
normedList01 = values_list01/values_list01.max()
normedList11 = values_list11/values_list11.max()
plt.figure(figsize=(10, 10))
plt.title('Effect of HSC lensing cuts')
plt.scatter(xaxis, normedList11, label='(1,1) bin', edgecolors='black', s=100)
plt.scatter(xaxis, normedList01, label='(0,1) bin', edgecolors='black', s=100)
plt.xticks(np.arange(len(values_list11)), lensing_filters, rotation=90)
plt.xlabel('Filters')
plt.ylabel('Number of members [log]')
ax = plt.gca()
ax.legend()
ax.set_yscale('log')
ax.grid(color='black', linestyle='-', linewidth=0.3)
plt.ylim(2e-4, 1.2)
plt.savefig('HSCLensingCuts', dpi = 300)

Getting the actual values for the objects that passed or failed the cuts in order to examine them further. Indexing the passed/failed objects allows quick access to the values of interest.

In [None]:
accepted_i_mag_cmodel_01 = coordinatesCoaddCat['i_mag_cmodel'][AF_coadd_idx01]
naccepted_i_mag_cmodel_01 = coordinatesCoaddCat['i_mag_cmodel'][NAF_coadd_idx01]
accepted_i_mag_cmodel_11 = coordinatesCoaddCat['i_mag_cmodel'][AF_coadd_idx11]
naccepted_i_mag_cmodel_11 = coordinatesCoaddCat['i_mag_cmodel'][NAF_coadd_idx11]

In [None]:
accepted_i_ref11 = coordinatesRefCat['mag_true_i'][AF_ref_idx11]
naccepted_i_ref11 = coordinatesRefCat['mag_true_i'][NAF_ref_idx11]
accepted_i_coadd11 = coordinatesCoaddCat['mag_i'][AF_coadd_idx11]
naccepted_i_coadd11 = coordinatesCoaddCat['mag_i'][NAF_coadd_idx11]

accepted_u_ref11 = coordinatesRefCat['mag_true_u'][AF_ref_idx11]
naccepted_u_ref11 = coordinatesRefCat['mag_true_u'][NAF_ref_idx11]
accepted_u_coadd11 = coordinatesCoaddCat['mag_u'][AF_coadd_idx11]
naccepted_u_coadd11 = coordinatesCoaddCat['mag_u'][NAF_coadd_idx11]

Comparing i-band magnitudes in the coadd catalog and the truth catalog after the lensing cuts. The same comparison can be made for any other quantity, since the same index arrays apply to every column of a catalog.

In [None]:
plt.figure(figsize=(10,10))
plt.scatter(accepted_i_ref11, accepted_i_mag_cmodel_11, label='accepted', edgecolors='black')
plt.title('i_mag_cmodel vs mag_i_truth after filters [1][1] bin')
plt.xlabel('Truth')
plt.ylabel('Coadd')

In [None]:
plt.figure(figsize=(10, 10))
plt.hist2d(accepted_i_ref11, accepted_i_mag_cmodel_11, bins=100, norm=LogNorm())
ax = plt.gca()
x = np.linspace(*ax.get_xlim())
ax.plot(x, x, 'k-')
plt.colorbar()
plt.title('i_mag_cmodel vs mag_i_truth after filters [1][1] bin: histogram, logarithmic scale')
plt.xlabel('ref')
plt.ylabel('coadd')

Any anomalously bright objects can be found using their index. The `for` loops pick out the objects that match a given condition. To examine objects in the coadd catalog that have passed the filters (here `##` stands for the bin label, `01` or `11`):
```
for index in AF_coadd_idx##:
    if some condition:
        listOfObjects.append(index)
```
`%store` is an IPython magic command available in Jupyter notebooks. It persists a variable (value, list, array, etc.) so that it can be accessed from another notebook, where it is restored with `%store -r` (see `postage_stamp_generation.ipynb`).
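
The selection loop shown above can also be written as a single NumPy mask, which yields the same indices without an explicit loop. The arrays below are hypothetical stand-ins for a catalog column and for an index array such as `AF_coadd_idx01`.

```python
import numpy as np

# Toy stand-ins (hypothetical values) for a catalog column and an index array.
toy_i_mag = np.array([18.2, 21.5, 17.9, 23.0, 18.9])
toy_AF_idx = np.array([0, 1, 3, 4])  # rows that passed all filters

# Boolean mask over the selected rows, then keep the indices where it is True.
is_bright = toy_i_mag[toy_AF_idx] < 19
bright_idx = toy_AF_idx[is_bright]
```

Row 2 is bright but is not in the index array, so it is not selected. The result is a NumPy array rather than a Python list and can be used directly to index other columns.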

In [None]:
bright_objects_filtered_idx01 = []
bright_objects_filtered_idx11 = []

for index in AF_coadd_idx01:
    if coordinatesCoaddCat['i_mag_cmodel'][index] < 19:
        bright_objects_filtered_idx01.append(index)
for index in AF_coadd_idx11:
    if coordinatesCoaddCat['i_mag_cmodel'][index] < 19:
        bright_objects_filtered_idx11.append(index)

bright_objects_filtered_ra01 = coordinatesCoaddCat['ra'][bright_objects_filtered_idx01]
bright_objects_filtered_dec01 = coordinatesCoaddCat['dec'][bright_objects_filtered_idx01]
%store bright_objects_filtered_idx01
%store bright_objects_filtered_ra01
%store bright_objects_filtered_dec01

bright_objects_filtered_ra11 = coordinatesCoaddCat['ra'][bright_objects_filtered_idx11]
bright_objects_filtered_dec11 = coordinatesCoaddCat['dec'][bright_objects_filtered_idx11]
%store bright_objects_filtered_idx11
%store bright_objects_filtered_ra11
%store bright_objects_filtered_dec11