# Ingest and load local refcat demo using DELVE_DR1

<br>Owner: **Peter Ferguson** ([@psferguson](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@psferguson))
<br>Last Verified to Run: **2021-12-10**
<br>Verified Stack Release: **w_2021_40**

### Learning Objectives

This notebook demonstrates how to: <br>
1. Ingest a FITS catalog into a gen2 refcat repo
2. Convert it to a gen3 repo
3. Load a source catalog
4. Load the local reference catalog that overlaps

### Set Up 
You can find the Stack version by using `eups list -s` on the terminal command line.

In [None]:
# Site, host, and stack version
! echo $EXTERNAL_INSTANCE_URL
! echo $HOSTNAME
! eups list -s | grep lsst_distrib

In [None]:
import subprocess
import numpy as np
import matplotlib.pyplot as plt
import lsst.geom
from lsst.daf.butler import Butler

### Create a gen2 reference catalog

Currently we must first create a gen2 refcat and then convert it to gen3.

This is done by setting up a directory and a config file, then calling `ingestIndexReferenceTask`.

For this example we will create a refcat from a single healpixel of DELVE DR1 (DECam Local Volume Exploration survey, [website](https://delve-survey.github.io/)) stored on disk at NCSA.

In [None]:
# Names of the refcat directory, the config override file, and the input catalog
refcatDir = "custom_refcat_demo"
configFile = "ingestConfigOverride.cfg"
inputFile = "/project/shared/data/delve_dr1/cat/cat_hpx_07798.fits"

In [None]:
# -p avoids an error if the directory already exists from a previous run
! mkdir -p {refcatDir}
! echo "lsst.obs.lsst.LsstCamMapper" > {refcatDir}/_mapper

Below is the set of config overrides used in creating this refcat:
 1. Since the input catalog is in FITS format, we retarget the file reader.
 2. The refcat must be given a name, in this case `delve_dr1`.
 3. We also need to specify the RA, Dec, magnitude, and magnitude error columns.
 4. Finally, we can give the config a list of extra columns to include in the refcat (e.g. a star/galaxy classifier).
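
Since the ingest depends on every configured column name being present in the input file, it can be useful to check the names first. The following is a minimal sketch rather than part of the LSST tooling: `missing_columns` is a hypothetical helper, and the `available` list is made up for illustration. In practice the list could come from something like `astropy.table.Table.read(inputFile).colnames`, which is an assumption about how the file is inspected.

```python
def missing_columns(available, ra, dec, mags, mag_errs, extras=(), id_name=None):
    """Return configured column names that are absent from `available`.

    Hypothetical helper: it only compares names and does not read any file.
    """
    wanted = [ra, dec, *mags, *mag_errs.values(), *extras]
    if id_name is not None:
        wanted.append(id_name)
    have = set(available)
    # Preserve the order in which the columns were requested
    return [name for name in wanted if name not in have]


# Made-up column list for illustration (not the real DELVE DR1 schema)
available = ["QUICK_OBJECT_ID", "RA", "DEC", "MAG_PSF_G", "MAGERR_PSF_G", "MAG_PSF_R"]

missing = missing_columns(
    available,
    ra="RA",
    dec="DEC",
    mags=["MAG_PSF_G", "MAG_PSF_R"],
    mag_errs={"MAG_PSF_G": "MAGERR_PSF_G", "MAG_PSF_R": "MAGERR_PSF_R"},
    id_name="QUICK_OBJECT_ID",
)
print(missing)
```

An empty list means every configured column was found; any name that is returned should be fixed in the config before starting the ingest.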

In [None]:
%%writefile {configFile}
from lsst.meas.algorithms.readFitsCatalogTask import ReadFitsCatalogTask

# Default is ReadTextCatalogTask
config.file_reader.retarget(ReadFitsCatalogTask)

# String to pass to the butler to retrieve persisted files.
config.dataset_config.ref_dataset_name='delve_dr1'


# Name of the spatial indexer used to shard the catalog.
config.dataset_config.indexer.name='HTM'

# Depth of the HTM tree to make.  Default is depth=7 which gives ~ 0.3 sq. deg. per trixel.
config.dataset_config.indexer['HTM'].depth=7

# Number of python processes to use when ingesting.
config.n_processes=5

# Name of RA column
config.ra_name='RA'

# Name of Dec column
config.dec_name='DEC'

# Name of column to use as an identifier (optional).
config.id_name='QUICK_OBJECT_ID'

# The values in the reference catalog are assumed to be in AB magnitudes. List of column names to use for
# photometric information.  At least one entry is required.
config.mag_column_list=['MAG_PSF_G', 'MAG_PSF_R','MAG_PSF_I', 'MAG_PSF_Z']

# A map of magnitude column name (key) to magnitude error column (value).
config.mag_err_column_map={'MAG_PSF_G':'MAGERR_PSF_G', 'MAG_PSF_R':'MAGERR_PSF_R','MAG_PSF_I':'MAGERR_PSF_I', 'MAG_PSF_Z':'MAGERR_PSF_Z'}

# Names of extra columns to include 
config.extra_col_names=['SPREAD_MODEL_G','SPREAD_MODEL_R','SPREAD_MODEL_I','SPREAD_MODEL_Z',
                        'SPREADERR_MODEL_G', 'SPREADERR_MODEL_R', 'SPREADERR_MODEL_I', 'SPREADERR_MODEL_Z',
                        'EXTINCTION_G', 'EXTINCTION_R', 'EXTINCTION_I', 'EXTINCTION_Z']


We then use the `ingestReferenceCatalog.py` command line tool to ingest the catalog. This takes a bit of time to run.

In [None]:
! ingestReferenceCatalog.py {refcatDir} {inputFile}  --configfile {configFile} 
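
The `depth` setting in the config controls how finely the sky is divided into shards. As a rough check of the "~0.3 sq. deg. per trixel" figure quoted in the config comment, the sketch below assumes the standard HTM construction (8 root triangles, each split into 4 at every level) and computes the mean trixel area; the function names are illustrative only.

```python
import math

# Area of the full sky in square degrees: 4*pi steradians
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def htm_trixel_count(depth):
    """Number of trixels at a given HTM depth (8 roots, 4 children per level)."""
    return 8 * 4 ** depth


def mean_trixel_area_sq_deg(depth):
    """Mean trixel area in square degrees; individual trixels vary in size."""
    return FULL_SKY_SQ_DEG / htm_trixel_count(depth)


for depth in (6, 7, 8):
    print(depth, htm_trixel_count(depth), round(mean_trixel_area_sq_deg(depth), 3))
```

Each extra level of depth divides the mean shard area by four, so a deeper tree gives smaller files but many more of them.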

### Run gen2 -> gen3 conversion
We now have a gen2 refcat that needs to be converted to gen3.

Start by setting up the config file.

In [None]:
# Create conversion configuration file
! echo 'config.datasetIncludePatterns = ["ref_cat", ]' > convertRefCat.cfg
! echo "config.refCats = ['delve_dr1']" >> convertRefCat.cfg

In [None]:
# To include the baseline (Gaia, PS1, SDSS) refcats in the same collection, uncomment and run this cell
#!ln -sf /datasets/refcats/htm/htm_baseline/* {refcatDir}/ref_cats/

Now we can run the `butler convert` command line task. This will create a new repo if one does not already exist.

In [None]:
newRepo="custom_refcat_demo/gen3repo"

In [None]:
# Note this now also creates the curated calib files
! butler convert --gen2root {refcatDir} --config-file convertRefCat.cfg  {newRepo}

### Loading the new refcat
Then we can load this new repo, and check the "refcats/gen2" collection to see what it contains. 

In [None]:
butler = Butler(newRepo)
registry = butler.registry

In [None]:
list(registry.queryCollections())

In [None]:
registry.getCollectionSummary('refcats/gen2').datasetTypes.names

In [None]:
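refDataset = "delve_dr1"
# Find every shard of the new refcat in the converted collection
refcatRefs = list(registry.queryDatasets(datasetType=refDataset,
                                          collections=["refcats/gen2"]).expanded())
refDataIds = [ref.dataId for ref in refcatRefs]
# Deferred handles: nothing is read from disk until .get() is called on a handle
refCatsDef = [butler.getDeferred(refDataset, dataId, collections=['refcats']) for dataId in refDataIds]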

In [None]:
# Load each refcat shard directly from its resolved dataset reference
refCats = [butler.getDirect(ref) for ref in refcatRefs]

Finally we can plot the loaded refcat 

In [None]:
fig, ax = plt.subplots()
for refCat in refCats:
    # One scatter call per refcat shard
    ax.scatter(refCat["coord_ra"], refCat["coord_dec"], label="refcat", s=0.01)
ax.set_xlabel("RA")
ax.set_ylabel("DEC")
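
To keep only the reference objects that overlap a region of interest, such as the footprint of a source catalog, one option is a simple cone cut on the coordinates. The sketch below uses plain NumPy and synthetic coordinates. It assumes that `coord_ra` and `coord_dec` are in radians, and the helper names `angular_separation` and `within_radius` are hypothetical, not part of the LSST stack.

```python
import numpy as np


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine formula); inputs in radians."""
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + np.cos(dec1) * np.cos(dec2) * sin_dra ** 2
    # Clip guards against rounding slightly outside [0, 1]
    return 2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def within_radius(ra, dec, center_ra, center_dec, radius):
    """Boolean mask of points within `radius` (radians) of the center."""
    return angular_separation(ra, dec, center_ra, center_dec) <= radius


# Synthetic coordinates standing in for refCat["coord_ra"], refCat["coord_dec"]
ra = np.radians([10.0, 10.05, 12.0])
dec = np.radians([-30.0, -30.02, -30.0])

mask = within_radius(ra, dec, np.radians(10.0), np.radians(-30.0), np.radians(0.1))
print(mask)
```

The resulting mask could then be applied to the coordinate arrays before plotting, for example `ra[mask]` and `dec[mask]`.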