# Pre-bootcamp exercises: accessing data products via the Butler

**Description:** Introduction to data access with the Butler using a small test dataset from HSC, [rc2_subset](https://github.com/lsst/rc2_subset).

**Contact authors:** Keith Bechtol

**Last verified to run:** 2023-04-18

**LSST Science Pipelines version:** w_2023_15

This notebook is intended to be run after executing the data processing steps in `process_rc2_subset.sh` to demonstrate how to access the reduced data products, e.g., object tables and source tables. Alternatively, one can use an existing sandbox repo to bypass processing steps.

## Preliminaries

In [None]:
import os  # needed below to read the USER environment variable

import lsst.daf.butler as dafButler

In [None]:
# User instance of the repo
collections = ['u/%s'%os.environ['USER']]
repo = '/sdf/group/rubin/user/%s/bootcamp_2023/rc2_subset/SMALL_HSC/'%(os.environ['USER'])

# Existing sandbox repo if you prefer to skip processing steps
#collections = ['u/bechtol/step3']
#repo = '/sdf/group/rubin/user/bechtol/bootcamp_2023/rc2_subset_16Apr2023/SMALL_HSC/'
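The repo path and collection name above are both derived from the username. A minimal sketch of wrapping that in a helper is shown here; the function name `user_repo_settings` is hypothetical, and the fallback to `getpass.getuser()` when `USER` is unset is an assumption about what is convenient rather than part of the original exercise.

```python
import getpass
import os


def user_repo_settings(user=None):
    """Return (repo, collections) for a user's bootcamp repo.

    Hypothetical helper: the path layout is copied from the cell above.
    """
    if user is None:
        # Prefer $USER, as in the cell above; fall back to getpass if unset.
        user = os.environ.get("USER") or getpass.getuser()
    repo = f"/sdf/group/rubin/user/{user}/bootcamp_2023/rc2_subset/SMALL_HSC/"
    collections = [f"u/{user}"]
    return repo, collections
```

Calling `repo, collections = user_repo_settings()` should give the same values as the cell above whenever `USER` is set.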

In [None]:
butler = dafButler.Butler(repo, collections=collections)
registry = butler.registry

Check which dataset types are present in the collection:

In [None]:
for datasetType in registry.queryDatasetTypes():
    if registry.queryDatasets(datasetType, collections=collections).any(execute=False, exact=False):
        print(datasetType)
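The same check can be collected into a sorted list of names instead of printed output. This is a sketch only: the function name `dataset_types_present` is hypothetical, and it uses just the registry calls from the cell above plus the `name` attribute of each dataset type.

```python
def dataset_types_present(registry, collections):
    """Return sorted names of dataset types that may have datasets.

    Hypothetical helper built from the same calls as the cell above.
    """
    names = []
    for dataset_type in registry.queryDatasetTypes():
        results = registry.queryDatasets(dataset_type, collections=collections)
        # execute=False, exact=False is a cheap check: it can report True
        # for a query that would turn out to be empty if fully executed.
        if results.any(execute=False, exact=False):
            names.append(dataset_type.name)
    return sorted(names)
```

For example, `dataset_types_present(registry, collections)` can be used to confirm that `objectTable_tract` and `sourceTable_visit` are available before running the cells below. Because the check is inexact, treat the list as "possibly present" rather than guaranteed.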

## Object tables

In [None]:
refs = sorted(registry.queryDatasets("objectTable_tract"))
print(len(refs))

In [None]:
refs[0].dataId

In [None]:
objectTable = butler.get(refs[0])
objectTable

## Source tables

In [None]:
refs = sorted(registry.queryDatasets("sourceTable_visit"))

In [None]:
for ref in refs:
    print(ref.dataId.full)
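When there are many visits, a tally can be easier to read than the full list of data IDs. The sketch below counts dataset references by the value of one dimension. The helper name `count_by_dimension` is hypothetical, and it assumes each `ref.dataId` supports key lookup for the requested dimension (for an implied dimension such as `band`, this requires the data ID to carry its full set of values).

```python
from collections import Counter


def count_by_dimension(refs, dimension):
    """Count dataset references by the value of one data ID dimension.

    Hypothetical helper; assumes ref.dataId[dimension] is available.
    """
    return Counter(ref.dataId[dimension] for ref in refs)
```

For example, `count_by_dimension(refs, "band")` would give the number of source tables per band, provided `band` is present in the data IDs.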

In [None]:
butler.get(refs[-1])

## Run analysis_tools interactively

This section demonstrates running analysis_tools interactively in a notebook, passing in-memory data as input to create metrics and diagnostic plots.

In [None]:
from lsst.analysis.tools.analysisMetrics import ShapeSizeFractionalMetric
from lsst.analysis.tools.tasks.base import _StandinPlotInfo

In [None]:
metric = ShapeSizeFractionalMetric()

In [None]:
results = metric(objectTable, band='i')

In [None]:
results

In [None]:
from lsst.analysis.tools.analysisPlots import ShapeSizeFractionalDiffScatterPlot

In [None]:
plot = ShapeSizeFractionalDiffScatterPlot()
# Set some config options; we will go into these later
plot.produce.addSummaryPlot = False

In [None]:
# The later keyword arguments (skymap, plotInfo) will not be required going forward
results = plot(objectTable, band='i', skymap=None, plotInfo=_StandinPlotInfo())