## Test case LVV-T389 - Single Visit Photometric Repeatability

- Stolen from : https://github.com/lsst-sitcom/sctr-116/tree/jcarlin_LVV-T389
- Sylvie Dagoret-Campagne
- Tested 2025/02/12 on w_2025_04 (not working on w_2024_50)

Verify that the RMS of magnitudes in all filters and the outlier rate of magnitudes are within specification.

- Hi all -- Seeking feedback on a verification artifact.
- For [this Test Case](https://rubinobs.atlassian.net/projects/LVV?selectedItem=com.atlassian.plugins.atlassian-connect-plugin:com.kanoah.test-manager__main-project-page#!/v2/testCase/LVV-T389), would [https://github.com/lsst-sitcom/sctr-116/blob/jcarlin_LVV-T389/notebooks/test_LVV-T389.ipynb](https://github.com/lsst-sitcom/sctr-116/blob/jcarlin_LVV-T389/notebooks/test_LVV-T389.ipynb) be an acceptable/sufficient artifact? (I started with an easy one that's already in `analysis_tools`!)

- See for Learning Purposes : https://github.com/lsst-dm/DMTR-401/tree/tickets/DM-40311/notebooks

#### Discussion:

OSS-REQ-0387 (from [LSE-030](https://ls.st/oss)) and LSR-REQ-0093 (from [LSE-029](https://ls.st/lsr)) state that the following requirements must be met (among others):
- The RMS photometric repeatability of bright non-saturated unresolved point sources in the g, r, and i filters, `PA1gri`, shall be less than 5 mmag.
- The RMS photometric repeatability of bright non-saturated unresolved point sources in the u, z, and y filters, `PA1uzy`, shall be less than 7.5 mmag.
- The maximum fraction of isolated nonsaturated point source measurements exceeding the outlier limit, `PF1`, shall be less than 10%. Here, the outlier limits are defined as `PA2gri`=15 mmag and `PA2uzy`=22.5 mmag.
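The thresholds listed above can be collected in a small lookup table so that later comparisons do not rely on hard-coded numbers. This is a minimal sketch; the names `REQUIREMENTS`, `band_group`, and `meets_requirement` are hypothetical and are not part of `analysis_tools`.

```python
# Requirement thresholds quoted above (OSS-REQ-0387 / LSR-REQ-0093).
# The dictionary layout and the function names are illustrative only.
REQUIREMENTS = {
    "gri": {"PA1_mmag": 5.0, "PA2_mmag": 15.0, "PF1_percent": 10.0},
    "uzy": {"PA1_mmag": 7.5, "PA2_mmag": 22.5, "PF1_percent": 10.0},
}


def band_group(band):
    """Return 'gri' or 'uzy' for a single-letter band name."""
    for group in REQUIREMENTS:
        if len(band) == 1 and band in group:
            return group
    raise ValueError(f"unknown band: {band!r}")


def meets_requirement(band, metric, value):
    """Return True if `value` is strictly below the threshold for `metric`."""
    return value < REQUIREMENTS[band_group(band)][metric]
```

For example, `meets_requirement("g", "PA1_mmag", 4.9)` is true, while a value equal to the threshold does not pass because the requirements are stated as "less than".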

This test can be verified using data products produced during Data Release Processing campaigns executed by the Data Management pipelines team. In particular, we will use the LSST ComCam data as reprocessed with weekly pipelines version w_2025_04, in Butler collection `LSSTComCam/runs/DRP/DP1/w_2025_04/DM-48556`.

The `PA1` and `PF1` metrics and related plots are created by tasks in the `analysis_tools` package. Thus verification of this requirement can be accomplished by simply retrieving the datasets produced by those tasks and confirming that they meet the required accuracy.

#### analysis_tools calculation

The [StellarPhotometricRepeatability task](https://github.com/lsst/analysis_tools/blob/d7c7025cbdcf02a9f8440e7a8cf441586eeecb3d/python/lsst/analysis/tools/atools/photometricRepeatability.py#L53) in `analysis_tools` handles the calculation of these metrics and plots. Its docstring describes the calculation as follows:

"Compute photometric repeatability from multiple measurements of a set of stars. First, a set of per-source quality criteria are applied. Second, the individual source measurements are grouped together by object index and per-group quantities are computed (e.g., a representative S/N for the group based on the median of associated per-source measurements). Third, additional per-group criteria are applied. Fourth, summary statistics are computed for the filtered groups."

The metrics are calculated by first measuring the RMS variation of the magnitudes measured over all visits for each star in a given tract. Then `PA1` and `PF1` are derived from the distribution of these per-star RMS values, with `PA1` representing the median value, and `PF1` the percentage of values exceeding the `PA2` outlier limit.
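To make the description above concrete, here is a minimal sketch on made-up magnitudes. It follows the summary in the preceding paragraph only (per-star RMS scatter, then the median and the fraction above `PA2`); it is not the `analysis_tools` implementation, which also applies the per-source and per-group quality criteria described in the docstring. The function names and input values are hypothetical.

```python
import statistics


def per_star_rms_mmag(mags):
    """RMS scatter (in mmag) of one star's repeated magnitudes about their mean."""
    return 1000.0 * statistics.pstdev(mags)


def pa1_pf1(star_mags, pa2_mmag):
    """Return (PA1 in mmag, PF1 in percent) for a list of per-star magnitude lists."""
    # Stars with fewer than two measurements have no defined scatter.
    rms = [per_star_rms_mmag(m) for m in star_mags if len(m) >= 2]
    pa1 = statistics.median(rms)
    pf1 = 100.0 * sum(r > pa2_mmag for r in rms) / len(rms)
    return pa1, pf1


# Made-up magnitudes for four stars, three visits each (illustration only).
stars = [
    [18.000, 18.004, 17.996],
    [19.000, 19.002, 18.998],
    [17.500, 17.506, 17.494],
    [20.000, 20.030, 19.970],  # deliberately noisy star
]
pa1, pf1 = pa1_pf1(stars, pa2_mmag=15.0)
```

With these inputs, three stars scatter by a few mmag and one scatters by roughly 24 mmag, so one of the four per-star values exceeds the 15 mmag `PA2gri` limit.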

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import astropy.units as u
from astropy.table import Table, hstack

from lsst.daf.butler import Butler
from IPython.display import Image

In [None]:
# Initialize the butler repo pointing to the DM-48556 (w_2025_04) collection
repo = '/repo/main'
collection = 'LSSTComCam/runs/DRP/DP1/w_2025_04/DM-48556'
#collection = 'LSSTComCam/runs/DRP/DP1/w_2025_05/DM-48666'

butler = Butler(repo, collections=collection)

#### Retrieve the metrics from the butler

The photometric repeatability metrics are created by `analysis_tools`, and reside in datasets of type `matchedVisitCore_metrics`. Use a butler query to identify all of the existing LSSTComCam datasets of this type.

In [None]:
metrics_all = butler.query_datasets('matchedVisitCore_metrics', where="skymap='lsst_cells_v1'")

The query returns references to `matchedVisitCore_metrics` datasets, which are `MetricBundle`s. The following cell extracts each `MetricBundle` from the butler, then extracts the relevant metrics (`{band}_stellarPhotRepeatStdev` for PA1, and `{band}_stellarPhotRepeatOutlierFraction` for PF1) to a table.

In [None]:
tracts = []
bands = []
PA1_all = []
PF1_all = []

for met_ref in metrics_all:
    metrics = butler.get(met_ref)
    # Index this tract's metrics by name so each band can be looked up directly.
    mets = {m.metric_name.metric: m.quantity.value
            for m in metrics['stellarPhotometricRepeatability']}
    for band in ['u', 'g', 'r', 'i', 'z', 'y']:
        tracts.append(met_ref.dataId['tract'])
        bands.append(band)
        # Use NaN when a band has no metric for this tract, so that all four
        # lists stay the same length and the nan-aware medians below skip it.
        PA1_all.append(mets.get(band+'_stellarPhotRepeatStdev', np.nan))
        PF1_all.append(mets.get(band+'_stellarPhotRepeatOutlierFraction', np.nan))

In [None]:
tab_all = Table([tracts, bands, PA1_all, PF1_all], names=['tract', 'band', 'PA1', 'PF1'],
                units=[None, None, u.mmag, u.percent])

In [None]:
tab_all

The table includes metrics from all bands. Separately select the metrics corresponding to each band. Furthermore, because `PA1` and `PF1` are defined differently for "gri" and "uzy" bands, create selections for "gri" and "uzy" subsets.

In [None]:
sel_u = (tab_all['band']=='u')
sel_g = (tab_all['band']=='g')
sel_r = (tab_all['band']=='r')
sel_i = (tab_all['band']=='i')
sel_z = (tab_all['band']=='z')
sel_y = (tab_all['band']=='y')

sel_gri = sel_g | sel_r | sel_i
sel_uzy = sel_u | sel_z | sel_y


#### Distribution of metric values

`PA1` and `PF1` are measured _per tract_. Plot histograms of all tract measurements for these metrics, and compare their median values against the requirement thresholds.

In [None]:
params = {'axes.labelsize': 20,
          'font.size': 20,
          'legend.fontsize': 14,
          'xtick.major.width': 3,
          'xtick.minor.width': 2,
          'xtick.major.size': 12,
          'xtick.minor.size': 6,
          'xtick.direction': 'in',
          'xtick.top': True,
          'lines.linewidth': 3,
          'axes.linewidth': 3,
          'axes.labelweight': 3,
          'axes.titleweight': 3,
          'ytick.major.width': 3,
          'ytick.minor.width': 2,
          'ytick.major.size': 12,
          'ytick.minor.size': 6,
          'ytick.direction': 'in',
          'ytick.right': True,
          'figure.figsize': [7, 5],
          'figure.facecolor': 'White'}
plt.rcParams.update(params)

fig, ax = plt.subplots(2, 1, figsize=(7, 8), sharex=True)
plt.subplots_adjust(hspace=0)

plt.sca(ax[0])
plt.hist(tab_all[sel_gri]['PA1'], bins=np.arange(0, 25, 0.5), color='black', histtype='step', linewidth=2, label='gri')
pa1_gri_median = np.nanmedian(tab_all[sel_gri]['PA1'])
plt.vlines(pa1_gri_median, 0, 8.5, linestyle='--', color='Gray',
           label=f'median: {pa1_gri_median:.2f} mmag')
plt.vlines(5.0, 0, 8.5, linestyle=':', color='red',
           label='requirement: <5.0 mmag')
plt.xlim(0, 26.5)
plt.ylim(0, 8.5)
plt.legend()
plt.ylabel('number of tracts')
plt.minorticks_on()
plt.sca(ax[1])
plt.hist(tab_all[sel_uzy]['PA1'], bins=np.arange(0, 25, 0.5), color='black', histtype='step', linewidth=2, label='uzy')
pa1_uzy_median = np.nanmedian(tab_all[sel_uzy]['PA1'])
plt.vlines(pa1_uzy_median, 0, 8.5, linestyle='--', color='Gray',
           label=f'median: {pa1_uzy_median:.2f} mmag')
plt.vlines(7.5, 0, 8.5, linestyle=':', color='red',
           label='requirement: <7.5 mmag')
plt.xlim(0, 26.5)
plt.ylim(0, 8.5)
plt.legend()
plt.xlabel('PA1 (mmag)')
plt.ylabel('number of tracts')
plt.minorticks_on()
plt.show()

We see that the median value of `PA1_gri`=7.76 mmag exceeds the requirement threshold of 5.0 mmag. The median value of `PA1_uzy`=7.05 mmag meets the requirement.

In [None]:
fig, ax = plt.subplots(2, 1, figsize=(7, 8), sharex=True)
plt.subplots_adjust(hspace=0)

plt.sca(ax[0])
plt.hist(tab_all[sel_gri]['PF1'], bins=np.arange(0, 7.5, 0.5), color='black', histtype='step', linewidth=2, label='gri')
pf1_gri_median = np.nanmedian(tab_all[sel_gri]['PF1'])
plt.vlines(pf1_gri_median, 0, 22, linestyle='--', color='Gray',
           label=rf'median: {pf1_gri_median:.2f} $\%$')
plt.vlines(10.0, 0, 22, linestyle=':', color='red',
           label='requirement: <10%')
plt.xlim(0, 11.5)
# plt.ylim(0, 18.5)
plt.legend()
plt.ylabel('number of tracts')
plt.minorticks_on()
plt.sca(ax[1])
plt.hist(tab_all[sel_uzy]['PF1'], bins=np.arange(0, 7.5, 0.5), color='black', histtype='step', linewidth=2, label='uzy')
pf1_uzy_median = np.nanmedian(tab_all[sel_uzy]['PF1'])
plt.vlines(pf1_uzy_median, 0, 19.5, linestyle='--', color='Gray',
           label=rf'median: {pf1_uzy_median:.2f} $\%$')
plt.vlines(10.0, 0, 22, linestyle=':', color='red',
           label='requirement: <10%')
plt.xlim(0, 11.5)
plt.ylim(0, 19.5)
plt.legend()
plt.xlabel(r'PF1 $(\%)$')
plt.ylabel('number of tracts')
plt.minorticks_on()
plt.show()

We see that the median values of both `PF1_gri` and `PF1_uzy` are well below the requirement threshold of 10%.
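The comparison of medians against thresholds made in the plots above can also be done numerically. The sketch below groups per-tract `PA1` values into "gri" and "uzy", takes a NaN-ignoring median, and checks it against the requirement. It uses only the standard library and hypothetical input values rather than `tab_all`; the names `PA1_LIMIT_MMAG`, `nan_median`, and `pa1_summary` are illustrative.

```python
import math
import statistics

# PA1 thresholds in mmag, from the requirements quoted in the Discussion.
PA1_LIMIT_MMAG = {"gri": 5.0, "uzy": 7.5}


def nan_median(values):
    """Median of the finite entries, or NaN if there are none."""
    finite = [v for v in values if math.isfinite(v)]
    return statistics.median(finite) if finite else math.nan


def pa1_summary(rows):
    """Map each band group to (median PA1 in mmag, passes requirement).

    `rows` is a list of (band, PA1 in mmag) pairs, one per tract and band,
    where `band` is a single-letter band name.
    """
    summary = {}
    for group, limit in PA1_LIMIT_MMAG.items():
        med = nan_median([pa1 for band, pa1 in rows if band in group])
        # A NaN median compares as False, so an empty group never passes.
        summary[group] = (med, med < limit)
    return summary


# Hypothetical per-tract values, not taken from the ComCam run.
rows = [("g", 6.0), ("r", 8.0), ("i", math.nan),
        ("z", 7.0), ("y", 6.5), ("u", math.nan)]
summary = pa1_summary(rows)
```

The same pattern applies to `PF1` with a single 10% limit for both groups.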

### Plots associated with the metrics

Retrieve the plots that are created alongside the metrics by `analysis_tools`, and display them in the notebook.

The following cell extracts a list of dataset references for all histograms of dataset type `matchedVisitCore_stellarPhotometricRepeatability_HistPlot`. These are per-tract, per-band histograms of the photometric repeatability over all visits.

In [None]:
plots_all = butler.query_datasets('matchedVisitCore_stellarPhotometricRepeatability_HistPlot',
                                  where="skymap='lsst_cells_v1'")

Display one of the plots. This is a histogram of the RMS repeatability values for all stars in a given tract/band.

In [None]:
uri = butler.getURI(plots_all[12])
image_bytes = uri.read()
Image(image_bytes, width=600)


## Results
We have demonstrated that there is software within the Rubin Science Pipelines to calculate photometric repeatability (`PA1`) and the percentage of outliers (`PF1`). Additionally, we have shown the metrics and plots that are produced by `analysis_tools` each time the DRP pipeline is executed.

The metrics measured on LSST ComCam data exceed the requirement threshold for `PA1_gri`, but meet the requirements for `PA1_uzy` and `PF1`. It is unclear whether `PA1_gri` reflects poor data quality (and/or effects of data processing), or whether it is more of a reflection of the limited datasets gathered during the ComCam on-sky campaign.

The result of this test is a "**Fail**", but it could likely pass with careful exploration of the outliers causing large repeatability values in the "gri" bands.