# OR3 Photo-z Sandbox

Author: Melissa Graham

Last verified to run: Fri Apr 5 2024

LSST Science Pipelines version: Weekly 2024_04

This notebook draws on the
<a href="https://github.com/lsst-sitcom/ops_rehearsal_commissioning_2024/blob/main/notebooks/ops_rehearsal_comcam_analysis.ipynb">ops_rehearsal_comcam_analysis notebook</a>.

## Set up

Import packages

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from lsst.daf.butler import Butler
import gc

## Access OR3 DRP data

### Find the object catalog

According to this Confluence page, https://confluence.lsstcorp.org/display/DM/Campaigns,
the Data Release Processing (DRP) of the simulated ComCam data at the USDF was completed in March 2024.

It is DRP that creates the deepCoadds and the Object catalog, and the Object catalog is the starting point for photo-z estimates.

In [None]:
repo = '/repo/ops-rehearsal-3-prep'
collection = 'u/homer/w_2024_12/DM-43439'
butler = Butler(repo, collections=collection)
registry = butler.registry

Determine which `DatasetTypes` exist in the collection.

Limit the search to the data products, and do not list configurations, logs, etc.

In [None]:
# for datasetType in registry.queryDatasetTypes():
#     if registry.queryDatasets(datasetType, 
#                               collections=collection).any(execute=False,
#                                                           exact=False):
#         if ('_config' not in datasetType.name) and \
#         ('_log' not in datasetType.name) and \
#         ('_metadata' not in datasetType.name) and \
#         ('_resource_usage' not in datasetType.name):
#             print(datasetType)
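
The name filter in the cell above can be tried without a butler. The following is a minimal sketch that applies the same four substring tests to a plain list of names. The helper `is_data_product` and the names `isr_config` and `calibrate_log` are hypothetical, chosen only for illustration.

```python
# Substrings that mark bookkeeping datasets rather than data products
# (the same four substrings tested in the cell above).
BOOKKEEPING_TAGS = ('_config', '_log', '_metadata', '_resource_usage')


def is_data_product(name):
    """Return True if a dataset type name contains none of the bookkeeping tags."""
    return not any(tag in name for tag in BOOKKEEPING_TAGS)


# Hypothetical list of names, for illustration only.
example_names = ['objectTable_tract', 'isr_config', 'calibrate_log', 'deepCoadd']
print([name for name in example_names if is_data_product(name)])
```

With a live butler, the same helper could be applied to `datasetType.name` inside the loop in place of the chained `and` conditions.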

Only look for the `object`-related data products.

We have `objectTable` and `objectTable_tract`, plus many related datasets.

In [None]:
# for datasetType in registry.queryDatasetTypes():
#     if registry.queryDatasets(datasetType, 
#                               collections=collection).any(execute=False,
#                                                           exact=False):
#         if ('_config' not in datasetType.name) and \
#         ('_log' not in datasetType.name) and \
#         ('_metadata' not in datasetType.name) and \
#         ('_resource_usage' not in datasetType.name):
#             temp = str(datasetType.name)
#             if temp.find('object') > -1:
#                 print(temp)

Alternatively, pass a glob expression to `queryDatasetTypes`, which reaches the same conclusion.

In [None]:
# for dtype in sorted(registry.queryDatasetTypes(expression="*object*")):
#     print(dtype.name)

Get all the butler references for the `objectTable_tract`.

In [None]:
oTt_refs = list(butler.registry.queryDatasets('objectTable_tract'))

What is the `dataId` composed of for the object table?

The keys are the same for every reference, so just check the first.

In [None]:
print(oTt_refs[0].dataId)

### Characterize object catalog

#### Number of tracts, and number of visits per tract

How many unique tracts are covered by `objectTable_tract`?

In [None]:
tracts = np.unique([ref.dataId['tract'] for ref in oTt_refs])
print(tracts)
print(len(tracts))

How many visits were available for the deepCoadd in each tract?

The numbers range from <10 to >1000. Some tracts do not have enough visits to even coadd (yet an `objectTable` was made for them...). Depth variation over the full region is therefore to be expected.

In [None]:
temp = []
for tract in tracts:
    visits = list(butler.registry.queryDatasets('visitSummary', tract=tract, 
                                                skymap='DC2', findFirst=True))
    temp.append(len(visits))
    # print(tract, len(visits))
nvisits = np.asarray(temp, dtype='int')
del temp

Plot histogram of the number of tracts (y) vs. number of visits/tract (x).

In [None]:
fig = plt.figure(figsize=(3, 2))
plt.hist(nvisits, bins=20)
plt.xlabel('number of visits per tract')
plt.ylabel('number of tracts')
plt.show()
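
Because the visit counts span more than two orders of magnitude, linear bins put most tracts into the first bar. A minimal sketch of log-spaced binning follows. It uses the five visit counts quoted later in this notebook as a stand-in for the full `nvisits` array, and the choice of three bins per decade is an arbitrary assumption.

```python
import numpy as np

# Visit counts quoted later in this notebook for five tracts; they stand in for
# the full `nvisits` array so that the sketch runs without a butler.
nvisits_example = np.array([8, 60, 299, 602, 1280])

# Thirteen log-spaced bin edges from 1 to 10,000 visits: three bins per decade.
bin_edges = np.logspace(0, 4, 13)
counts, _ = np.histogram(nvisits_example, bins=bin_edges)
print(counts)
```

The same edges can be passed to the plot as `plt.hist(nvisits, bins=bin_edges)`, followed by `plt.xscale('log')`.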

#### Show r-band mag distribution for representative tracts

Access the `objectTable_tract` for a given tract.

In [None]:
dataId = {'skymap': 'ops_rehersal_prep_2k_v1', 'tract': 3384}
objects = butler.get('objectTable_tract', dataId=dataId)

Show table.

The schema should be very similar to that of the DP0.2 Object table.

https://dm.lsst.org/sdm_schemas/browser/dp02.html#Object

In [None]:
# objects

Number of columns, number of rows.

In [None]:
print('# cols: ', len(objects.columns))
print('# rows: ', len(objects))

Extract data into numpy arrays for analysis.

In [None]:
r_cModelFlux = np.asarray(objects.get('r_cModelFlux'))
detect_isPrimary = np.asarray(objects.get('detect_isPrimary'))

Calculate AB magnitudes for primary objects with positive flux. The zero point of 31.4 applies to fluxes in nJy.

In [None]:
tx = np.where((r_cModelFlux > 0.0) & (detect_isPrimary == 1))[0]
print(len(tx))
r_cModelMag = -2.50 * np.log10(r_cModelFlux[tx]) + 31.4
del tx
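
The conversion above can be wrapped in a small function. This is a sketch that assumes fluxes are in nanojansky, which is what the 31.4 zero point implies, since 2.5 log10(3631 x 10^9) is approximately 31.4. The function name `njy_to_abmag` is hypothetical and not part of the Science Pipelines. Unlike the cell above, it keeps the array length and marks non-positive fluxes as NaN instead of dropping them.

```python
import numpy as np


def njy_to_abmag(flux_njy):
    """Convert fluxes in nanojansky to AB magnitudes; non-positive fluxes give NaN."""
    flux = np.asarray(flux_njy, dtype=float)
    mag = np.full(flux.shape, np.nan)
    positive = flux > 0.0
    # Same zero point as in the notebook cell: m_AB = -2.5 log10(f / nJy) + 31.4
    mag[positive] = -2.5 * np.log10(flux[positive]) + 31.4
    return mag


print(njy_to_abmag([1.0, 100.0, -5.0]))
```

A flux of 1 nJy maps to 31.4 mag and 100 nJy to 26.4 mag. Keeping the array length makes it easier to combine the result with other per-object columns such as `detect_isPrimary`.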

Plot the magnitude distribution.

In [None]:
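# Exclude extreme outliers fainter than 50 mag before plotting.
tx = np.where(r_cModelMag < 50)[0]
fig = plt.figure(figsize=(3, 2))
plt.hist(r_cModelMag[tx], bins=20, log=True)
plt.xlabel('r-band cModel magnitude')
plt.ylabel('number of objects')
plt.show()
del tx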

Plot distribution of r-band magnitudes for the following tracts (# visits):

```
9881 8
7684 60
9638 299
7149 602
9880 1280
```

In [None]:
del dataId, objects
del r_cModelFlux, detect_isPrimary, r_cModelMag
gc.collect()

In [None]:
use_tracts = [9881, 7684, 9638, 7149, 9880]
use_nvisits = [8, 60, 299, 602, 1280]

fig = plt.figure(figsize=(6, 4))

for i, tract in enumerate(use_tracts):
    dataId = {'skymap': 'ops_rehersal_prep_2k_v1', 'tract': tract}
    objects = butler.get('objectTable_tract', dataId=dataId)
    
    r_cModelFlux = np.asarray(objects.get('r_cModelFlux'))
    detect_isPrimary = np.asarray(objects.get('detect_isPrimary'))
    tx = np.where((r_cModelFlux > 0.0) & (detect_isPrimary == 1))[0]
    r_cModelMag = -2.50 * np.log10(r_cModelFlux[tx]) + 31.4
    del tx
    
    tx = np.where(r_cModelMag < 50)[0]
    plt.hist(r_cModelMag[tx], bins=20, log=True, histtype='step',
             label=str(use_nvisits[i]))
    
    del dataId, objects
    del r_cModelFlux, detect_isPrimary, r_cModelMag
    gc.collect()

plt.xlabel('r-band cModel magnitude')
plt.ylabel('number of objects')
plt.legend(loc='upper left', title='# visits')
plt.show()

The distributions extend to magnitudes that look too faint to be real detections; the cause is not yet understood.
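
One rough way to judge how much fainter a well-covered tract could plausibly reach is the idealized coadd scaling. Assuming every visit is equally deep and that the coadd is background-limited, the signal-to-noise ratio grows as the square root of the number of visits N, so the limiting magnitude deepens by 1.25 log10(N). The sketch below applies this to the five visit counts listed above. Those counts may include visits in all bands, so the gain in r alone would be smaller than this estimate.

```python
import numpy as np

use_nvisits = np.array([8, 60, 299, 602, 1280])

# Idealized assumption: every visit is equally deep and in the same band, and the
# coadd is background-limited, so S/N grows as sqrt(N) and the limiting magnitude
# deepens by 2.5 * log10(sqrt(N)) = 1.25 * log10(N).
depth_gain = 1.25 * np.log10(use_nvisits / use_nvisits[0])

for n, gain in zip(use_nvisits, depth_gain):
    print(f'{n:5d} visits: {gain:4.2f} mag deeper than the {use_nvisits[0]}-visit tract')
```

Under these assumptions the tract with the most visits would reach a little under 3 mag deeper than the tract with the fewest, which gives a scale against which to compare the faint tails in the plot.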