# LVV-T959: Inter-Band Astrometric Performance

**Written By: Bryce Kalmbach**

**Last updated: 07-18-2019**

**Tested on Stack Version: w_2019_28**

## Requirements:

[OSS-REQ-0388](https://docushare.lsst.org/docushare/dsweb/Get/LSE-030#page=68)

1. RMS difference between separations measured in the r-band and those measured in any other filter is less than or equal to 10 milliarcsec.

2. Fraction of separations measured relative to the r-band that can exceed the color difference outlier limit (20 milliarcsec) is less than or equal to 10 percent.

## Proposed Test Case:

1. Image an average field in all six bands.  Repeat at different airmasses.

2. Perform source detection and astrometric measurements on the images from step 1.

3. Find separations between all pairs of sources in the catalogs from step 2.

4. For each band, compute the RMS difference in source separations relative to the r-band.  Verify that this value is less than or equal to 10 milliarcseconds.

5. Verify that no more than 10 percent of source separation measurements in any band differ by more than 20 milliarcseconds from the r-band measurements.
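
As a minimal sketch of steps 3 to 5, the two quantities being tested can be computed from a pair of separation arrays. The function name `interband_metrics` and the four synthetic separations are hypothetical and only illustrate the arithmetic; the notebook cells further down compute the same quantities from real catalogs.

```python
import numpy as np

def interband_metrics(r_seps, test_seps, outlier_limit=0.020):
    """Return (RMS difference in mas, outlier fraction) for pair separations in arcsec."""
    diffs = np.asarray(r_seps) - np.asarray(test_seps)
    # RMS of the separation differences, converted from arcsec to milliarcsec
    rms_mas = np.sqrt(np.mean(np.square(diffs))) * 1000.0
    # Fraction of pairs whose difference exceeds the outlier limit (20 mas)
    outlier_frac = np.mean(np.abs(diffs) > outlier_limit)
    return rms_mas, outlier_frac

# Synthetic separations (arcsec) for four pairs; values are illustrative only
r_seps = np.array([10.000, 25.000, 40.000, 55.000])
test_seps = np.array([10.003, 24.996, 40.000, 55.025])
rms_mas, outlier_frac = interband_metrics(r_seps, test_seps)
print('RMS = %.2f mas, outlier fraction = %.2f' % (rms_mas, outlier_frac))
```

In this toy case a single pair that differs by 25 mas is enough to push both quantities past their limits, which shows how sensitive the RMS is to a few large differences when the number of pairs is small.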

### Import necessary tools

In [None]:
import os
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd

In [None]:
from lsst.daf.persistence import Butler
import lsst.daf.persistence as daf_persistence
from lsst.afw.table import MultiMatch

from astropy.coordinates import SkyCoord
from astropy import units as u

from itertools import combinations

In [None]:
# Make our plots nice and readable
plt.rcParams.update({'font.size': 18})

### Set parameters for testing

* `test_bandpass`: The bandpass whose astrometry the notebook compares against the r-band.

* `faint_r_lim`: If set to `None`, the notebook calculates separations for every pair of objects present in all visits, which can take a long time. If set to a value, only sources with an r-band magnitude brighter than this limit are kept. This shortens the run and also lets us see how astrometry changes as a function of magnitude.

In [None]:
test_bandpass = 'g'

faint_r_lim = 21. #None

### Set up Butler to get Twinkles data

In [None]:
# Set up a butler
datadir = '/project/shared/data/Twinkles_subset/output_data_v2/'
butler = Butler(datadir)

In [None]:
# Get dataIds
subset = butler.subset('src')

Below we define functions that list the visits available in a bandpass and that build a matched catalog, in which the individual sources detected in each visit are grouped into `objects`.

In [None]:
def get_filter_visits(subset, bandpass):
    """
    Get a list of the available visit numbers in a subset for a given bandpass.
    """
    visit_list = []
    for data_ref in subset:
        data_id = data_ref.dataId
        if data_id['filter'] == bandpass:
            visit_list.append(data_id['visit'])
    return visit_list

In [None]:
def get_matched_catalog(subset, bandpass_list):
    """
    Create a matched catalog from a subset with observations in the bandpasses listed.
    """
    
    visit_list = []
    for bandpass in bandpass_list:
        bp_visits = get_filter_visits(subset, bandpass)
        for v in bp_visits:
            visit_list.append(v)

    matched_cat = None
    calexps = {}
    visit_filter_dict = {}            

    for data_ref in subset:
        data_id = data_ref.dataId
        if data_id['visit'] not in visit_list:
            continue
        print(data_id['visit'], ',', data_id['filter'])
        src_cat = data_ref.get('src')
        calexps[data_id['visit']] = data_ref.get('calexp')
        if matched_cat is None:
            id_fmt = {'visit':np.int64}
            matched_cat = MultiMatch(src_cat.schema, id_fmt)
        visit_filter_dict[str(data_id['visit'])] = data_id['filter']
        matched_cat.add(src_cat, data_id)
        
    final_catalog = matched_cat.finish()
    
    return final_catalog, calexps, visit_filter_dict

To start, we create a pandas dataframe to hold information on each available visit.

In [None]:
r_visits = get_filter_visits(subset, 'r')
temp_df_r = pd.DataFrame(r_visits, columns=['visit'])
temp_df_r['filter'] = 'r'

test_visits = get_filter_visits(subset, test_bandpass)
temp_df_test = pd.DataFrame(test_visits, columns=['visit'])
temp_df_test['filter'] = test_bandpass

visit_df = pd.concat([temp_df_r, temp_df_test]).reset_index(drop=True)

Here we create the final matched catalog for all the visits in the r-band and the test bandpass. We keep the `calexps` to calculate magnitudes, and `visit_filter_dict` records which filter goes with each visit.

In [None]:
final_catalog, calexps, visit_filter_dict = get_matched_catalog(subset, ['r', test_bandpass])

In [None]:
# Only keep the columns we need going forward and convert to pandas dataframe
final_catalog = final_catalog.asAstropy()
final_catalog = final_catalog[['id', 'coord_ra', 'coord_dec', 'base_PsfFlux_instFlux', 'object', 'visit']]
final_catalog = final_catalog.to_pandas()

In [None]:
# Add filter information into each line of the catalog
filter_list = []
for vis_num in final_catalog['visit'].values:
    filter_list.append(visit_filter_dict[str(vis_num)])
final_catalog['filter'] = filter_list

In [None]:
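# Add in magnitude information for cuts
# Use the named columns rather than positional indices into each row
mag = []
for visit, inst_flux in zip(final_catalog['visit'].values,
                            final_catalog['base_PsfFlux_instFlux'].values):
    calib = calexps[visit].getPhotoCalib()
    mag.append(calib.instFluxToMagnitude(inst_flux))
final_catalog['mag'] = mag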

In [None]:
# Add in image quality info based upon PSF to visit dataframe
# Code based upon https://github.com/lsst-com/notebooks/blob/master/image_quality_demo.ipynb
sigma_to_fwhm = 2 * np.sqrt(2. * np.log(2))
psf_fwhm = []
for visit in visit_df['visit'].values:
    calexp = calexps[visit]
    # PSF width in pixels, treated as a Gaussian sigma
    sigma_pix = calexp.getPsf().computeShape().getTraceRadius()
    pixel_scale = calexp.getWcs().getPixelScale().asArcseconds()
    psf_fwhm.append(sigma_to_fwhm * sigma_pix * pixel_scale)
visit_df['psf_fwhm'] = psf_fwhm
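
The conversion used above treats the PSF trace radius as the sigma of a Gaussian, for which FWHM = 2 * sqrt(2 ln 2) * sigma. A minimal standalone sketch of that arithmetic follows; the function name and the input values (a 1.7 pixel trace radius and a 0.2 arcsec/pixel scale) are hypothetical and chosen only for illustration.

```python
import math

def gaussian_fwhm_arcsec(trace_radius_pix, pixel_scale_arcsec):
    """Convert a Gaussian sigma in pixels to a FWHM in arcseconds."""
    # For a Gaussian profile, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.355 * sigma)
    sigma_to_fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0))
    return sigma_to_fwhm * trace_radius_pix * pixel_scale_arcsec

# Illustrative inputs only, not measured values
fwhm = gaussian_fwhm_arcsec(trace_radius_pix=1.7, pixel_scale_arcsec=0.2)
print('PSF FWHM = %.3f arcsec' % fwhm)
```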

### Find separations in all pairs of sources

First, we keep only the objects that appear in every visit, which gives us the highest confidence that we are matching the same sources in each filter.

In [None]:
# Select the objects with a detection in every visit of both filters
# Faster to use numpy array than loop over pandas df
# An object is kept only if its number of detections equals the number of visits
unique, counts = np.unique(final_catalog['object'].values, return_counts=True)
in_all = unique[np.where(counts == len(visit_filter_dict.keys()))[0]]
num_unique_objects = len(in_all)
print("Number of Objects present in all visits: %i" % num_unique_objects)

In [None]:
keep_catalog = final_catalog[final_catalog['object'].isin(in_all)]
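
The selection above can be illustrated on a toy `object` column. This sketch assumes three visits and that each object is matched at most once per visit, so a detection count equal to the number of visits means the object was seen in every visit; the IDs are made up.

```python
import numpy as np

# Toy 'object' column: one entry per detection, three visits in total (assumed)
object_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3])
n_visits = 3

# Count the detections of each object and keep those seen in every visit
unique, counts = np.unique(object_ids, return_counts=True)
in_all = unique[counts == n_visits]

# Boolean mask over the detections, analogous to DataFrame.isin in the cell above
keep_mask = np.isin(object_ids, in_all)
print(in_all, keep_mask.sum())
```

Objects 1 and 3 are dropped because they are missing from at least one visit, leaving six detections of objects 0 and 2.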

If an r-band magnitude cut was set, the next cell trims the catalog to the objects brighter than that limit.

In [None]:
if faint_r_lim is not None:
    bright_objects = np.unique(keep_catalog.query('filter == "r" and mag < %f' % faint_r_lim)['object'])
    keep_catalog = keep_catalog[keep_catalog['object'].isin(bright_objects)]
    num_unique_objects = len(bright_objects)
    print("Number of Objects with r < %.2f present in all visits: %i" % (faint_r_lim, num_unique_objects))
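
Note that the cut is applied per object, based on the r-band rows only: every detection of a bright object is kept, whatever its magnitude in the test bandpass. A small sketch with made-up values shows the effect; the arrays below stand in for the `object`, `filter` and `mag` columns.

```python
import numpy as np

# Made-up detections of three objects in two filters
object_ids = np.array([0, 0, 1, 1, 2, 2])
filters = np.array(['r', 'g', 'r', 'g', 'r', 'g'])
mags = np.array([19.5, 20.1, 21.4, 20.8, 20.9, 21.6])
faint_r_lim = 21.0

# Objects with at least one r-band detection brighter than the limit
bright_objects = np.unique(object_ids[(filters == 'r') & (mags < faint_r_lim)])

# Keep every detection of those objects, in any filter
keep = np.isin(object_ids, bright_objects)
print(bright_objects, keep)
```

Here object 1 is removed even though its g-band magnitude is brighter than the limit, and object 2 keeps its g-band detection even though that detection is fainter than the limit.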

In [None]:
r_cat = keep_catalog.query('filter == "r"').reset_index(drop=True)
test_cat = keep_catalog.query('filter == "%s"' % test_bandpass).reset_index(drop=True)

Next, we compile a list of all possible pairs of objects. The number of pairs grows quadratically with the number of objects, so we recommend setting an r-band magnitude cut to keep this list a reasonable size. Otherwise, finding separations for *all* pairs in a visit may take a long time.

In [None]:
pairs_list = list(combinations(np.arange(num_unique_objects), 2))
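
To see why the magnitude cut matters, the number of pairs for N objects is N(N-1)/2. A quick sketch using only the standard library (the helper name `n_pairs` and the object counts are illustrative):

```python
from itertools import combinations
from math import comb

def n_pairs(n_objects):
    """Number of unordered pairs among n_objects, i.e. N * (N - 1) / 2."""
    return comb(n_objects, 2)

# combinations() yields index pairs in lexicographic order
pairs = list(combinations(range(4), 2))
print(pairs)

# Pair counts for a few catalog sizes
counts = {n: n_pairs(n) for n in (100, 1000, 10000)}
print(counts)
```

Going from 1,000 to 10,000 objects raises the pair count from roughly half a million to roughly fifty million.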

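We randomly choose one visit from each filter to compare, using a fixed seed so that the choice is reproducible. This could be changed to pick visits based upon available properties, such as the PSF FWHM stored in `visit_df`.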

In [None]:
r_visit_df = visit_df.query('filter == "%s"' % 'r')
test_visit_df = visit_df.query('filter == "%s"' % test_bandpass)

In [None]:
rand_state = np.random.RandomState(32)
r_visit = rand_state.choice(r_visit_df['visit'].values)
test_visit = rand_state.choice(test_visit_df['visit'].values)

Finally, we calculate the separations for all the object pairs in a single visit for each filter and then compare them.

In [None]:
def calc_separations(catalog, pairs_list, visit):
    """
    Calculate the separation in arcseconds for each pair of objects in a single visit.
    """
    print('Visit %i' % visit)
    # Sort by object ID so that a positional index refers to the same object in every visit
    visit_cat = catalog.query('visit == %i' % visit).sort_values('object')
    coords = SkyCoord(visit_cat['coord_ra'].values*u.rad, visit_cat['coord_dec'].values*u.rad)
    visit_seps = []
    j = 0
    for pair_1, pair_2 in pairs_list:
        # Report progress every 50 objects
        if pair_1 >= j:
            print('Calculating Separations For Object %i out of %i' % (pair_1, num_unique_objects))
            j += 50
        visit_seps.append(coords[pair_1].separation(coords[pair_2]).arcsec)

    return np.array(visit_seps)

In [None]:
r_seps = calc_separations(r_cat, pairs_list, r_visit)

In [None]:
test_seps = calc_separations(test_cat, pairs_list, test_visit)
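
The pair-by-pair loop above is simple but slow for large catalogs. As an alternative sketch (not what this notebook runs), all pair separations can be computed at once with numpy. This version uses the haversine formula rather than astropy's `separation`, and the function name is hypothetical. `np.triu_indices` with `k=1` returns index pairs in the same order as `itertools.combinations`.

```python
import numpy as np

def pairwise_separations_arcsec(ra_rad, dec_rad):
    """Angular separation of every unordered pair, in arcsec (haversine formula)."""
    ra = np.asarray(ra_rad, dtype=float)
    dec = np.asarray(dec_rad, dtype=float)
    # Index pairs (i, j) with i < j, in the same order as combinations(range(n), 2)
    i, j = np.triu_indices(len(ra), k=1)
    sin_ddec = np.sin((dec[j] - dec[i]) / 2.0)
    sin_dra = np.sin((ra[j] - ra[i]) / 2.0)
    a = sin_ddec**2 + np.cos(dec[i]) * np.cos(dec[j]) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0

# Three synthetic sources on the same meridian, offset by 0, 1 and 3 arcsec in Dec
arcsec = np.radians(1.0 / 3600.0)
ra = np.array([0.0, 0.0, 0.0])
dec = np.array([0.0, 1.0, 3.0]) * arcsec
seps = pairwise_separations_arcsec(ra, dec)
print(seps)
```

For the three sources above the pairs (0, 1), (0, 2) and (1, 2) are separated by 1, 3 and 2 arcsec respectively.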

In [None]:
# Element-wise difference between the r-band and test-band separations (arcsec)
sep_differences = r_seps - test_seps

### Plot results against requirements

1. RMS difference between separations measured in the r-band and those measured in any other filter is less than or equal to 10 milliarcsec.

In [None]:
rms_diff = np.sqrt(np.mean(np.square(sep_differences)))

In [None]:
fig = plt.figure(figsize=(10, 8))
plt.hist(np.abs(sep_differences), bins=20)
plt.axvline(rms_diff, 0, 1, c='k', label='RMS difference to r-band separation = %.2f mas' % (rms_diff*1000.), lw=4)
plt.axvline(0.010, 0, 1, c='r', label='Requirement = 10 milliarcsec', lw=4)
plt.xlabel('Difference in Measured Separation to r-band Separation (arcsec)')
plt.ylabel('Number of Pairs')
plt.legend()

2. Fraction of separations measured relative to the r-band that can exceed the color difference outlier limit (20 milliarcsec) is less than or equal to 10 percent.

In [None]:
fig = plt.figure(figsize=(10, 8))
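abs_diffs = np.abs(sep_differences)
# Extend the bins past the maximum so that every pair falls inside the histogram
n, bins, _ = plt.hist(abs_diffs, bins=np.arange(0., np.max(abs_diffs) + 0.01, 0.01), cumulative=True, density=True)
# Fraction of pairs within the 20 mas limit; the outlier fraction is 1 minus this value
current_outlier_frac = np.mean(abs_diffs <= 0.02)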
plt.axhline(current_outlier_frac, 0, 1, c='k', label='Outlier Percentage = %.2f%s' % ((1.-current_outlier_frac)*100, '%'), lw=4)
plt.axhline(0.9, 0, 1, c='r', ls='--', label='90th percentile', lw=4)
plt.axvline(0.020, 0, 1, c='r', label='Requirement: Outlier Fraction (> 20mas) <= 10%', lw=4)
plt.xlabel('Difference in Measured Separation to r-band Separation (arcsec)')
plt.ylabel('Cumulative Fraction of Pairs')
plt.legend(loc=4)

### Test against requirements

If these tests fail with a new version of the stack, our CI testing of notebooks will also fail and alert us.

In [None]:
class RequirementFailure(ValueError):
    "Requirement not met."

In [None]:
# Set up for potential error messages
error_msg = ""
error_present = False
error_val = 0

In [None]:
# Test RMS of separation differences
if rms_diff*1000. > 10.:
    error_present = True
    error_val += 1
    error_msg += str('Error #%i: \n' % error_val +
                     'Failure RMS of differences in separations compared to r-band greater than 10 milliarcsec for bandpass: %s. ' % test_bandpass + 
                     'Test Value = %.2f mas. \n' % (rms_diff*1000.))

In [None]:
# Test Outlier Fraction of separation differences
outlier_pct = (1.-current_outlier_frac)*100
if outlier_pct > 10.:
    error_present = True
    error_val += 1
    error_msg += ('Error #%i: \n' % error_val +
                  'Separation Difference Outlier Fraction (pair separations > 20 mas ' +
                  'different compared to r-band) is greater than 10%% for bandpass: %s. ' % test_bandpass +
                  'Test Value = %.2f%% \n' % outlier_pct)

In [None]:
if error_present:
    error_msg = str('%i Total Errors: \n' % error_val + error_msg)
    raise RequirementFailure(error_msg)
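
The cells above accumulate every failure before raising, so a single CI run reports all unmet requirements rather than only the first. A compact sketch of the same pattern is shown below. The helper `check_requirements` and the numbers passed to it are hypothetical, and the messages are shortened relative to the ones used in this notebook.

```python
class RequirementFailure(ValueError):
    """Requirement not met."""

def check_requirements(rms_mas, outlier_pct, bandpass):
    """Collect every failed requirement, then raise once with a combined report."""
    failures = []
    if rms_mas > 10.0:
        failures.append('RMS of separation differences exceeds 10 mas for '
                        'bandpass %s: %.2f mas' % (bandpass, rms_mas))
    if outlier_pct > 10.0:
        failures.append('Outlier fraction (> 20 mas) exceeds 10%% for '
                        'bandpass %s: %.2f%%' % (bandpass, outlier_pct))
    if failures:
        lines = ['%i Total Errors:' % len(failures)]
        lines += ['Error #%i: %s' % (n, msg) for n, msg in enumerate(failures, start=1)]
        raise RequirementFailure('\n'.join(lines))

# Passing values raise nothing
check_requirements(8.2, 3.5, 'g')

# Failing values produce one exception that lists both problems
report = None
try:
    check_requirements(12.75, 25.0, 'g')
except RequirementFailure as err:
    report = str(err)
print(report)
```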