# DC2 Refcat Loader Demo

<br>Developer(s): **Keith Bechtol** ([@bechtol](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@bechtol))
<br>Maintainer(s): **Peter Ferguson** ([@psferguson](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@psferguson))
<br>Level: **Intermediate**
<br>Last Verified to Run: **2022-02-25**
<br>Verified Stack Release: **w_2021_49**

Contact authors: Peter Ferguson <br>
Target audience: All DP0 delegates. <br>
Container Size: medium <br>
Questions welcome at <a href="https://community.lsst.org/c/support/dp0">community.lsst.org/c/support/dp0</a> <br>
Find DP0 documentation and resources at <a href="https://dp0-1.lsst.io">dp0-1.lsst.io</a> <br>

**Credit:** This tutorial was originally developed by Keith Bechtol.

### Learning Objectives

This notebook demonstrates how to: <br>
1. Determine the reference catalog associated with a dataset 
2. Load this reference catalog 
3. Load a source catalog
4. Load the portion of the reference catalog that overlaps the source catalog

### Set Up 
You can find the Stack version by using `eups list -s` on the terminal command line.

In [None]:
# Site, host, and stack version
! echo $EXTERNAL_INSTANCE_URL
! echo $HOSTNAME
! eups list -s | grep lsst_distrib

In [None]:
import os, os.path
import numpy as np
from astropy.time import Time

import lsst.geom
from lsst.pipe.tasks.loadReferenceCatalog import LoadReferenceCatalogConfig, LoadReferenceCatalogTask
from lsst.meas.algorithms import ReferenceObjectLoader
import lsst.daf.butler as dafButler
from lsst.utils import getPackageDir

from astropy.table import vstack
import astropy.units as u
import astropy.coordinates as coord

import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Set up some plotting defaults:

params = {
   'axes.labelsize': 28,
   'font.size': 24,
   'legend.fontsize': 14,
   'xtick.major.width': 3,
   'xtick.minor.width': 2,
   'xtick.major.size': 12,
   'xtick.minor.size': 6,
   'xtick.direction': 'in',
   'xtick.top': True,
   'lines.linewidth':3,
   'axes.linewidth':3,
   'axes.labelweight':3,
   'axes.titleweight':3,
   'ytick.major.width':3,
   'ytick.minor.width':2,
   'ytick.major.size': 12,
   'ytick.minor.size': 6,
   'ytick.direction': 'in',
   'ytick.right': True,
   'figure.figsize': [9, 8]
   }

plt.rcParams.update(params)

In [None]:
# Location of the DC2 Gen3 repository on this site
URL = os.getenv('EXTERNAL_INSTANCE_URL', '')  # empty string if the variable is unset
if URL.endswith('data.lsst.cloud'):  # IDF
    repo = "s3://butler-us-central1-dp01"
elif URL.endswith('ncsa.illinois.edu'):  # NCSA
    repo = "/repo/dc2"
else:
    raise RuntimeError(f"Unrecognized URL: {URL!r}")

collections = ['2.2i/runs/DP0.1']

config = os.path.join(repo, 'butler.yaml')
butler = dafButler.Butler(config=config)
registry = butler.registry
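
The site-dependent repository choice above can be wrapped in a small helper so that it is easy to check outside the Science Platform. The following is a minimal sketch, not part of the original notebook: `resolve_repo` is a hypothetical name, and the two URL suffixes and repository paths are simply copied from the cell above.

```python
def resolve_repo(url):
    """Return the DC2 Butler repository for a given instance URL.

    Hypothetical helper; the suffix-to-repo mapping is copied from the
    notebook cell above and may not cover other sites.
    """
    known_sites = {
        'data.lsst.cloud': 's3://butler-us-central1-dp01',  # IDF
        'ncsa.illinois.edu': '/repo/dc2',                   # NCSA
    }
    # Treat an unset environment variable (None) like an unknown site
    url = url or ''
    for suffix, repo in known_sites.items():
        if url.endswith(suffix):
            return repo
    raise ValueError(f"Unrecognized URL: {url!r}")


# Example with a made-up URL that ends in the IDF suffix
print(resolve_repo('https://data.lsst.cloud'))
```

Raising an error for an unknown site, as the original cell does, is arguably safer than silently falling back to a default repository.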

Given this collection we can list the associated reference catalogs.

For DP0.1 there is just one: `cal_ref_cat_2_2`

In [None]:
registry.getCollectionSummary('refcats').datasetTypes.names

In [None]:
refDataset='cal_ref_cat_2_2'

For a given `dataId` we can see which reference catalog datasets are available.

In [None]:
dataId = {'visit': 192350, 'detector': 175, 'band': 'i', 'instrument':'LSSTCam-imSim'}

In [None]:
refcatRefs = list(registry.queryDatasets(datasetType=refDataset,
                                         collections=["refcats"],
                                         instrument=dataId['instrument'],
                                         where=f"visit={dataId['visit']} AND detector={dataId['detector']}").expanded())
# Data IDs of the matching refcat datasets, and deferred handles that load each one on demand
refDataIds = [ref.dataId for ref in refcatRefs]
refCatsDef = [butler.getDeferred(refDataset, refDataId, collections=['refcats']) for refDataId in refDataIds]
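
The `where` argument in the query above is just a string built from the data ID. As a small illustration in plain Python (no Butler needed), the snippet below shows the expression that the f-string produces. `build_where` is a hypothetical helper introduced here, not a Butler function.

```python
def build_where(data_id, keys=('visit', 'detector')):
    """Join selected data ID entries into an 'AND'-separated constraint string."""
    return ' AND '.join(f"{key}={data_id[key]}" for key in keys)


dataId = {'visit': 192350, 'detector': 175, 'band': 'i',
          'instrument': 'LSSTCam-imSim'}
where = build_where(dataId)
print(where)  # visit=192350 AND detector=175
```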

Then we can load the source catalog data as well as the refcat data.

In [None]:
# Get the source catalog for this visit and detector
datasetRefs = list(registry.queryDatasets(datasetType='src',
                                          collections=collections,
                                          **dataId))
sourceCat = butler.getDirect(datasetRefs[0])

In [None]:
# Load the associated refcats explicitly
refCats = [butler.getDirect(ref) for ref in refcatRefs]

In [None]:
# Next we plot the two loaded datasets (coordinates are in radians)
fig, ax = plt.subplots()
for refCat in refCats:
    ax.scatter(refCat["coord_ra"], refCat["coord_dec"], label="refcat", s=1)
ax.scatter(sourceCat["coord_ra"], sourceCat["coord_dec"], label="sourcecat", s=1)
ax.legend()
ax.set_xlabel("RA [rad]")
ax.set_ylabel("Dec [rad]")

Notice that two refCats have been returned (blue and orange). This occurs because the refCat has been "sharded" into hierarchical triangular mesh (HTM) regions. The source catalog for this specific detector (green) overlaps two different HTM regions. We can get more details about the refCats from the `refCatsDef` objects.
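
To make the sharding idea concrete, here is a toy sketch in plain Python. It does not use HTM: it assumes a deliberately simplified scheme in which the sky is cut into fixed-width strips in RA, and all names (`toy_shard_id`, `STRIP_WIDTH_DEG`) and coordinates are invented for illustration. The point is only that a small footprint straddling a shard boundary touches more than one shard.

```python
import math

STRIP_WIDTH_DEG = 1.0  # assumed width of each toy shard (not an HTM parameter)


def toy_shard_id(ra_deg):
    """Map an RA in degrees to the index of the toy strip that contains it."""
    return math.floor((ra_deg % 360.0) / STRIP_WIDTH_DEG)


# Made-up RA values for sources on one detector, straddling RA = 56.0 deg
source_ra_deg = [55.90, 55.95, 56.00, 56.05, 56.10]

# The set of distinct shards that must be read to cover every source
shards_needed = sorted({toy_shard_id(ra) for ra in source_ra_deg})
print(shards_needed)  # [55, 56]
```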

In [None]:
refCatsDef

In [None]:
# We can also load the refcat with a spatial query
config = LoadReferenceCatalogConfig()
config.refObjLoader.ref_dataset_name = refDataset

config.refObjLoader.load(os.path.join(getPackageDir('obs_lsst'),
                                          'config',
                                          'filterMap.py'))
config.doApplyColorTerms = False

In [None]:
loaderTask = LoadReferenceCatalogTask(config=config,
                                      dataIds=refDataIds,
                                      refCats=refCatsDef)

# Define center relative to DC2 catalog
center = lsst.geom.SpherePoint(np.median(sourceCat['coord_ra']),
                               np.median(sourceCat['coord_dec']),
                               lsst.geom.radians)
# Alternatively, define center relative to reference catalog
# center = lsst.geom.SpherePoint(refCats[0]['coord_ra'][0],
#                                refCats[0]['coord_dec'][0],
#                                lsst.geom.radians)
print('Using center (RA, DEC) =', center)

refCatSpatial = loaderTask.getSkyCircleCatalog(center,
                                         1.0*lsst.geom.degrees,
                                         ['i'])
print('Found %i reference catalog objects'%(len(refCatSpatial)))
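
Conceptually, a sky-circle query keeps the objects whose angular separation from the center is no larger than the radius. The sketch below illustrates that selection in plain Python with the haversine formula. It is an illustration of the idea only, not how `LoadReferenceCatalogTask` is implemented, and the helper names and coordinates are made up.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations
    hav = (math.sin((dec2 - dec1) / 2) ** 2
           + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, hav))))


def select_in_circle(objects, center, radius_deg):
    """Keep the (ra, dec) tuples that lie within radius_deg of center."""
    return [obj for obj in objects
            if angular_separation_deg(*obj, *center) <= radius_deg]


center = (56.0, -36.0)                 # made-up field center in degrees
objects = [(56.0, -36.5),              # 0.5 deg south of the center
           (56.0, -38.0),              # 2.0 deg south of the center
           (57.0, -36.0)]              # 1 deg in RA, but less than 1 deg on the sky
inside = select_in_circle(objects, center, radius_deg=1.0)
print(len(inside))
```

Because the circle is defined on the sky, a fixed offset in RA corresponds to a smaller separation away from the equator, which is why the third object falls inside the 1 degree radius.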

In [None]:
fig, ax = plt.subplots()

# The spatial-query refcat is in degrees; convert the source catalog from radians to match
ax.scatter(refCatSpatial["ra"], refCatSpatial["dec"], label="refcat", s=1)
ax.scatter(np.degrees(sourceCat["coord_ra"]), np.degrees(sourceCat["coord_dec"]), label="sourcecat", s=1)
ax.legend()
ax.set_xlabel("RA [deg]")
ax.set_ylabel("Dec [deg]")

In this case a single refCat object is returned. ***Why?***