<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250 style="padding: 10px"> 
<b>DRAFT: Coadd Recreation</b> <br>
Contact author: Melissa Graham <br>
Last verified to run: <i>yyyy-mm-dd</i> <br>
LSST Science Pipelines version: Weekly 2022_22 <br>
Container Size: large <br>
Targeted learning level: intermediate <br>

In [None]:
# %load_ext pycodestyle_magic
# %flake8_on
# import logging
# logging.getLogger("flake8").setLevel(logging.FATAL)

**Description:** Recreate a deepCoadd using only a subset of the input visits.

**Skills:** Use of pipetasks for image coaddition. Creating and writing to Butler collections. Properties of deepCoadds.

**LSST Data Products:** DP0.2 images: deepCoadd. DP0.2 catalogs: visitTable.

**Packages:** lsst.daf.butler, lsst.ctrl.mpexec, lsst.pipe.base

**Credit:** Originally developed by Melissa Graham and Clare Saunders.

**Get Support:**
Find DP0-related documentation and resources at <a href="https://dp0-2.lsst.io">dp0-2.lsst.io</a>. Questions are welcome as new topics in the <a href="https://community.lsst.org/c/support/dp0">Support - Data Preview 0 Category</a> of the Rubin Community Forum. Rubin staff will respond to all questions posted there.

## 1. Introduction

This notebook shows how to retrieve information about the individual images that contributed to a deepCoadd, and how to make a new coadded image using only a subset of the inputs.

In the past you might have used IRAF's imcombine or Astromatic's SWarp (for example) to coadd images.
This notebook demonstrates the appropriate methods for coadding LSST images with the LSST Science Pipelines.

Science applications of coadding a subset of LSST images include searching for faint, slowly-evolving transients or variables (e.g., coadding images by season), using the effects of differential chromatic refraction (e.g., coadding images in bins of airmass), or perhaps searching for low surface brightness features (e.g., coadding only dark-time images with the faintest sky backgrounds).

### 1.1. Package imports

In [None]:
# standard python packages for numerical processing, plotting, and databases
import time
import getpass
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import pandas

# astropy package for time unit conversions
from astropy.time import Time

# lsst packages for data access and display
import lsst.geom
import lsst.afw.display as afwDisplay

# import lsst.daf.butler as dafButler
from lsst.daf.butler import Butler, DatasetType, CollectionType

# lsst packages for executing pipeline tasks
from lsst.ctrl.mpexec import SimplePipelineExecutor
from lsst.pipe.base import Pipeline, Instrument

### 1.2. Define functions and parameters

Set a few parameters related to plotting and display.

In [None]:
font = {'size': 14}
matplotlib.rc('font', **font)

pandas.set_option('display.max_rows', 1000)

afwDisplay.setDefaultBackend('matplotlib')

Set the DP0.2 config and collection for the butler.

In [None]:
config = 'dp02'
collection = '2.2i/runs/DP0.2'

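Option to display all the deepCoadd datasetTypes available via the butler. This cell uses `butler`, which is not instantiated until Section 2, so uncomment and run it only after that point.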

In [None]:
# for x in sorted(butler.registry.queryDatasetTypes()):
#     temp = str(x)
#     if temp.find('deepCoadd') > -1:
#         print(x)
#     del temp

## 2. Identify the visits to combine

This example starts with a given sky coordinate -- in this case, the right ascension and declination of a known galaxy cluster in the DC2 data set are used.

The DC2 skyMap is used to identify the deepCoadd which contains that coordinate, and then the butler is used to retrieve the deepCoadd and the list of visit ids which were combined to create it.

The visitTable is then used to obtain the acquisition dates of the input visits.
For this example, the visits in a short time range (with Modified Julian Dates, MJDs, between 60925 and 60955) have been arbitrarily selected as the visits to be coadded.
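
To relate the MJD range to calendar dates, here is a minimal sketch using only the Python standard library. The helper name `mjd_to_datetime` is made up for this illustration, and the conversion ignores leap seconds and time-scale differences, so treat the result as approximate to within about a minute. `astropy.time.Time(60925, format='mjd')`, using the `Time` class imported in Section 1.1, would be the more rigorous route.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a calendar datetime (leap seconds ignored)."""
    return MJD_EPOCH + timedelta(days=float(mjd))


# The two boundaries of the date range used in this notebook.
for mjd in (60925, 60955):
    print(mjd, mjd_to_datetime(mjd).date().isoformat())
```

Under these assumptions the range runs from 2025-09-07 to 2025-10-07.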

First, instantiate a butler to use in Section 2.

> **Notice:** It is OK to ignore a pink-window message saying "WARNING: version mismatch between CFITSIO header (v4.000999999999999) and linked library (v4.01)."

In [None]:
butler = Butler(config, collections=collection)

### 2.1. Identify and retrieve the deepCoadd

This takes 4-5 seconds.

In [None]:
%%time

my_ra_deg = 55.745834
my_dec_deg = -32.269167

my_spherePoint = lsst.geom.SpherePoint(my_ra_deg*lsst.geom.degrees,
                                       my_dec_deg*lsst.geom.degrees)

skymap = butler.get('skyMap')
tract = skymap.findTract(my_spherePoint)
my_tract = tract.tract_id
my_patch = tract.findPatch(my_spherePoint).getSequentialIndex()
print('My tract and patch: ', my_tract, my_patch)

my_dataId = {'band': 'i', 'tract': my_tract, 
             'patch': my_patch}
my_deepCoadd = butler.get('deepCoadd', dataId=my_dataId)

# clean up
del my_ra_deg, my_dec_deg, my_spherePoint, tract

Option to display the deepCoadd image.

In [None]:
# fig = plt.figure(figsize=(6, 4))
# afw_display = afwDisplay.Display(1)
# afw_display.scale('asinh', 'zscale')
# afw_display.mtv(my_deepCoadd.image)
# plt.gca().axis('off')

Option to learn more about the deepCoadd metadata, such as bounding box, corners, and the World Coordinate System (WCS).
It is not necessary to know the bounding box for a deepCoadd in order to find all of the calexps that were used to assemble it; this is simply a demonstration for the learner.

In [None]:
# my_deepCoadd_bbox = butler.get('deepCoadd.bbox', dataId=my_dataId)
# print('bbox')
# print(my_deepCoadd_bbox.beginX, my_deepCoadd_bbox.beginY, 
#       my_deepCoadd_bbox.endX, my_deepCoadd_bbox.endY)

# print('')
# print('corners')
# print(my_deepCoadd_bbox.getCorners())

# print('')
# print('wcs')
# my_deepCoadd_wcs = butler.get('deepCoadd.wcs', dataId=my_dataId)
# print(my_deepCoadd_wcs)

# # clean up
# del my_deepCoadd_bbox, my_deepCoadd_wcs

### 2.2. Retrieve the deepCoadd's input visits

This takes 2-3 seconds.

In [None]:
%%time

my_coadd_inputs = butler.get("deepCoadd_calexp.coaddInputs", my_dataId)

Option to display the coadd inputs as an astropy table.

In [None]:
# my_coadd_inputs.visits.asAstropy()

The length of this table, 161, indicates that 161 separate visits contributed to this deepCoadd.

In [None]:
len(my_coadd_inputs.visits)

Option to list all of the deepCoadd input visit ids.

In [None]:
# my_coadd_visits = my_coadd_inputs.visits['id']
# my_coadd_visits

### 2.3. Identify the acquisition dates for the input visits

First, get the entire visit table.

In [None]:
visitTableRef = list(butler.registry.queryDatasets('visitTable'))

In [None]:
visitTable = butler.get(visitTableRef[0])

Option to display the contents of the visitTable.

In [None]:
# visitTable

Because the id column of both the my_coadd_inputs.visits table and the visitTable is the visit number (visit id), it is simple to retrieve the MJDs of the coadd input visits.

In [None]:
my_coadd_visits_mjds = visitTable.loc[my_coadd_inputs.visits['id']]['expMidptMJD']

This list of MJDs has 161 elements, one for each of the 161 separate visits that contributed to this deepCoadd.

In [None]:
len(my_coadd_visits_mjds)

### 2.4. Identify input visits to combine into a new Coadd

Identify input visits with MJD between 60925 and 60955.

In [None]:
range_start = 60925
range_end = 60955

fig, ax = plt.subplots(2, figsize=(10, 10))

ax[0].hist(my_coadd_visits_mjds, bins=150, color='dodgerblue')
ax[0].set_xlabel('MJD')
ax[0].set_ylabel('Number of Visits')
ax[0].axvline(range_start, ls='dashed', color='darkorange')
ax[0].axvline(range_end, ls='dashed', color='darkorange')

ax[1].hist(my_coadd_visits_mjds, bins=150, color='dodgerblue')
ax[1].set_xlabel('MJD')
ax[1].set_ylabel('Number of Visits')
ax[1].set_xlim([60880, 60985])
ax[1].axvline(range_start, ls='dashed', color='darkorange')
ax[1].axvline(range_end, ls='dashed', color='darkorange')
ax[1].text(range_start+1, 7.5, 'date range', color='darkorange')
ax[1].text(range_start+1, 7.0, 'of interest', color='darkorange')

plt.show()

There are 6 visits in the date range of interest.

Put this list of visits into a string, formatted as a tuple, for use in a query later on.
("Formatted as a tuple" means within round brackets and separated by commas).

In [None]:
my_range = np.array((my_coadd_visits_mjds > range_start)
                    & (my_coadd_visits_mjds < range_end))

my_visits = my_coadd_inputs.visits[my_range]

my_visits_tupleString = "("+",".join(my_visits['id'].astype(str))+")"
print(my_visits_tupleString)
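
The selection and formatting logic can also be followed without the Butler. The sketch below repeats it on plain Python lists; the visit ids and MJDs are hypothetical values chosen only to show how the boundaries behave.

```python
# Hypothetical visit ids and mid-exposure MJDs, for illustration only.
visit_ids = [1001, 1002, 1003, 1004]
visit_mjds = [60920.3, 60925.0, 60940.7, 60954.9]

range_start = 60925
range_end = 60955

# Strict inequalities, as in the cell above:
# a visit exactly at a boundary (here 1002) is excluded.
selected = [vid for vid, mjd in zip(visit_ids, visit_mjds)
            if range_start < mjd < range_end]

# Round brackets, comma-separated, no spaces.
tuple_string = "(" + ",".join(str(vid) for vid in selected) + ")"
print(tuple_string)
```

With these made-up inputs the printed string is `(1003,1004)`.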

## 3. Create a coadd with the subset of visits 

### 3.1. Name a new butler collection for output

> **Important:** Use the convention `u/[Your User Name]/coadd_recreation_nb` to set up a new output collection for this tutorial.
Recall that RSP user names are the same as GitHub user names, because GitHub accounts are used to authorize access to the RSP.

In [None]:
my_username = getpass.getuser()
my_outputCollection = 'u/'+my_username+'/coadd_recreation_nb'
print(my_outputCollection)

Check whether this output collection already exists. If nothing is printed below this cell, the output collection does not yet exist.

In [None]:
for c in sorted(butler.registry.queryCollections()):
    if c.find(my_outputCollection) > -1:
        print(c)

#### 3.1.1. To delete a collection you made

In the course of experimenting with the LSST Science Pipelines, if you create output collections that you want to then delete (e.g., they contain mistakes that you wouldn't want to accidentally include in your science analysis), this is how to remove a collection and its contents.

The cell below instantiates a temporary butler with write permissions, and then removes the collection by name.

> **Help Question:** At first this worked but now does not; need to fix.

In [None]:
# for c in sorted(butler.registry.queryCollections()):
#     if c.find(my_outputCollection) > -1:
#         print('Found: ', c)
#         try:
#             tmpButler = Butler(config, writeable=True)
#             tmpButler.registry.removeCollection(c)
#             print('Removed: ', c)
#         except:
#             print('Could not remove: ', c)
#         del tmpButler

<br>
Delete the current butler. In the next section, a simple butler is created and used thereafter.

In [None]:
del butler

### 3.2. Set up a simple butler with the new output collection

> **Notice:** In future updates to the LSST Science Pipelines, it will be possible to use:

> `simpleButler = SimplePipelineExecutor.prep_butler(config, inputs=[collection], output=my_outputCollection)`

For the LSST Science Pipelines version Weekly_2022_22, the following workaround is needed to set up a simple butler and create a new output collection with the name selected above.

In [None]:
outputRun = f"{my_outputCollection}/{Instrument.makeCollectionTimestamp()}"

tmpButler = Butler(config, writeable=True)
tmpButler.registry.registerCollection(outputRun, CollectionType.RUN)
tmpButler.registry.registerCollection(my_outputCollection, CollectionType.CHAINED)

collections = [collection]
collections.insert(0, outputRun)

tmpButler.registry.setCollectionChain(my_outputCollection, collections)

simpleButler = Butler(butler=tmpButler, collections=[my_outputCollection], run=outputRun)

Check that the subdirectory of the newly created output collection is first in the list, and note that the collection has had a timestamp added.
The output will go into that specific collection with that timestamp.
Notice that timestamps are year month day T hour minute second Z, where the time is UTC.

In [None]:
simpleButler.registry.getCollectionChain(my_outputCollection)
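
The order of the chain matters because a chained collection is searched from first to last. The following is a toy model of that ordered lookup, with plain dictionaries standing in for the registry. It is not how the Butler is implemented, and the user name, dataset keys, and descriptions are all hypothetical.

```python
# Toy stand-in for a registry: collection name -> {dataset key: description}.
# All names and contents here are hypothetical.
toy_collections = {
    "u/someuser/coadd_recreation_nb/20220623T024126Z": {
        ("deepCoadd", "i"): "new coadd from 6 visits",
    },
    "2.2i/runs/DP0.2": {
        ("deepCoadd", "i"): "original DP0.2 coadd",
        ("calexp", "i"): "processed visit image",
    },
}

# The new run is listed first, so it takes precedence.
chain = [
    "u/someuser/coadd_recreation_nb/20220623T024126Z",
    "2.2i/runs/DP0.2",
]


def find_first(chain, key):
    """Return (collection, value) from the first collection in the chain holding key."""
    for name in chain:
        if key in toy_collections[name]:
            return name, toy_collections[name][key]
    raise LookupError(f"{key} not found in any collection of the chain")


print(find_first(chain, ("deepCoadd", "i")))  # found in the new run
print(find_first(chain, ("calexp", "i")))     # falls through to DP0.2
```

In this toy model the new run shadows the original deepCoadd, while inputs that exist only in the DP0.2 collection are still found.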

Option: check the output collection's timestamps that currently exist.

In [None]:
# for c in sorted(simpleButler.registry.queryCollections()):
#     if c.find(my_outputCollection) > -1:
#         print('Found: ', c)
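
If several timestamped runs accumulate, it can help to turn the timestamp suffix into a real datetime, for example to sort runs or find the most recent one. This is a minimal sketch, assuming the suffix always follows the year-month-day `T` hour-minute-second `Z` pattern described above. The helper name and the example collection name are hypothetical.

```python
from datetime import datetime, timezone


def parse_collection_timestamp(collection_name):
    """Parse the trailing 'YYYYmmddTHHMMSSZ' component of a run collection name."""
    stamp = collection_name.rsplit("/", 1)[-1]
    parsed = datetime.strptime(stamp, "%Y%m%dT%H%M%SZ")
    # The trailing Z marks UTC, so attach that time zone explicitly.
    return parsed.replace(tzinfo=timezone.utc)


# Hypothetical run collection name following the pattern described above.
example = "u/someuser/coadd_recreation_nb/20220623T024126Z"
print(parse_collection_timestamp(example).isoformat())
```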

### 3.3. Select the makeWarp and assembleCoadd tasks

In order to combine the identified visits into a new Coadd, two steps of the larger Data Release pipeline must be included: makeWarp and assembleCoadd. 
Although warped images are created during processing, they are not stored long-term because they take up a lot of space and because they can be easily recreated when needed.

Find relevant documentation for more information about <a href="https://pipelines.lsst.io/v/weekly/modules/lsst.pipe.base/creating-a-pipeline.html">creating a pipeline</a> or the <a href="https://pipelines.lsst.io/modules/lsst.pipe.tasks/tasks/lsst.pipe.tasks.assembleCoadd.AssembleCoaddTask.html">assembleCoadd</a> task.

The following method for creating the `assembleCoaddPipeline` task in a notebook -- using the `from_uri` function and passing a file path -- is not intuitive, but is preferred because it gives the user all of the configuration settings for the instrument automatically, and because it most closely replicates the command-line syntax.

In [None]:
assembleCoaddPipeline = Pipeline.from_uri('${PIPE_TASKS_DIR}/pipelines/DRP.yaml#makeWarp,assembleCoadd')

The other tasks available are listed in the yaml file. 
To see the other tasks, first open a new terminal (click the blue + button at upper left and then select terminal).
Then create a Rubin Observatory environment, navigate to the DRP.yaml file, and view its contents with:
> `setup lsst_distrib` <br>
> `cd ${PIPE_TASKS_DIR}/pipelines/`<br>
> `more DRP.yaml`

When attempting to use `more` on the DRP.yaml file, a redirect to its true location might be returned. If so, follow the path given.

### 3.4. Configure the pipeline

Configurations can be set using `addConfigOverride(<taskName>, <configName>, <configValue>)`.

There is only one configuration that must be set, and it is to clarify to the pipeline that it _does not_ need to redo a final image characterization step.
Currently, this configuration is needed due to a version mismatch: the DP0.2 data sets were processed with Version 23 of the LSST Science Pipelines, whereas this notebook uses the version "Weekly 2022_22".
This configuration might not be needed in the future.

In [None]:
assembleCoaddPipeline.addConfigOverride('makeWarp', 'doApplyFinalizedPsf', False) 

### 3.5. Create the full query string

Above, the visits to be recombined were stored in "my_visits_tupleString".

Below, the full query string is built to include the patch being recreated.

In [None]:
queryString = (f"tract = {my_tract} AND patch = {my_patch} AND "
               f"visit in {my_visits_tupleString} AND skymap = 'DC2'")

print(queryString)
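
When building several such queries, it may be convenient to wrap the string assembly in a small helper that refuses an empty visit list. The sketch below is one way to do that. The function name `build_where` and the tract, patch, and visit values are hypothetical, and the helper only formats a string; it does not validate the expression against the Butler.

```python
def build_where(tract, patch, visit_ids, skymap="DC2"):
    """Assemble a where-expression restricting processing to one patch and a set of visits."""
    visit_ids = [int(v) for v in visit_ids]
    if not visit_ids:
        # An empty "visit in ()" clause would not select anything useful.
        raise ValueError("at least one visit id is required")
    visits = "(" + ",".join(str(v) for v in visit_ids) + ")"
    return (f"tract = {int(tract)} AND patch = {int(patch)} AND "
            f"visit in {visits} AND skymap = '{skymap}'")


# Hypothetical tract, patch, and visit ids, for illustration only.
print(build_where(1234, 17, [1003, 1004]))
```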

### 3.6. Use the Simple Pipeline Executor to run the pipeline

Set up the Simple Pipeline Executor.

https://pipelines.lsst.io/v/weekly/py-api/lsst.ctrl.mpexec.SimplePipelineExecutor.html

In [None]:
# SimplePipelineExecutor.from_pipeline?

This takes about 1.5 minutes.

In [None]:
%%time
spe = SimplePipelineExecutor.from_pipeline(assembleCoaddPipeline, where=queryString, butler=simpleButler)

Run the pipeline. There will be a lot of standard output. Right-click to the left of the cell and choose "Enable Scrolling for Outputs" to condense all of the output into a scrollable inset window.

This takes about 16 minutes for 6 visits.

In [None]:
%%time
quanta = spe.run()

## 4. Display and analyse the results

### 4.1. The quanta

> **Help Question:** What exactly is quanta? I'm calling it an object but is that appropriate?

The object `quanta` is ...

In [None]:
quanta

In [None]:
quanta[0].outputs

### 4.2. The new Coadd

> **Help Question:** Is it unnecessary to create a new butler to access the results, as is done below? Can we just get the results directly from quanta?

Create a new butler which only looks at the output collection in which the Coadd just created was stored.

In [None]:
# outputRun (defined in Section 3.2) is the timestamped RUN collection
# that holds the Coadd created above.
my_butler = Butler(config, collections=outputRun)

The dataId for the deepCoadd of interest was already defined:
> `my_dataId = {'band': 'i', 'tract': my_tract, 'patch': my_patch}`

Use it to retrieve the newly made Coadd (which is named deepCoadd by default) from the newly made butler.

In [None]:
my_new_deepCoadd = my_butler.get('deepCoadd', dataId=my_dataId)

Check the inputs of my_new_deepCoadd and confirm that they match the visits listed in "queryString".

In [None]:
my_new_deepCoadd_inputs = my_butler.get("deepCoadd.coaddInputs", my_dataId)

In [None]:
my_new_deepCoadd_inputs.visits.asAstropy()

In [None]:
print(queryString)

Display the new Coadd.

In [None]:
fig = plt.figure(figsize=(10, 6))
afw_display = afwDisplay.Display(1)
afw_display.scale('asinh', 'zscale')
afw_display.mtv(my_new_deepCoadd.image)
plt.gca().axis('off')

### 4.3. Run source measurement on the new Coadd

To be added: run source detection on new coadded image, and compare with the original deepCoadd, show how the new one is shallower.

## 5. Exercises for the learner

Use airmass constraints instead of MJD to identify the subset of visits to coadd.

> Hint: use 
> `my_coadd_visits_airmass = visitTable.loc[my_coadd_inputs.visits['id']]['airmass']`

## 6. Known Limitations

1. In order to make multiple different deepCoadds, e.g., one per week over a multi-week period, the user needs to repeat the process starting with making the simpleButler. Each new deepCoadd would be stored with the same name, "deepCoadd", but with a different Butler collection timestamp. The user would be able to tell which new deepCoadd was composed of which visits using the ".coaddInputs" as above.
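
For the one-coadd-per-week case mentioned above, the visit lists for each week could be prepared up front and then passed one at a time through the steps of Section 3. The sketch below only does the grouping. The function name, visit ids, and MJDs are hypothetical, and the choice of fixed 7-day bins counted from a start MJD is an assumption rather than anything prescribed by the LSST Science Pipelines.

```python
from collections import defaultdict


def group_visits_by_week(visit_ids, visit_mjds, start_mjd, n_weeks):
    """Map each 7-day bin start (MJD) to the visit ids with MJD in [start, start + 7)."""
    groups = defaultdict(list)
    for vid, mjd in zip(visit_ids, visit_mjds):
        week = int((mjd - start_mjd) // 7)
        # Visits before start_mjd or after the last bin are dropped.
        if 0 <= week < n_weeks:
            groups[start_mjd + 7 * week].append(vid)
    return dict(groups)


# Hypothetical visit ids and MJDs, for illustration only.
ids = [1001, 1002, 1003, 1004, 1005]
mjds = [60925.2, 60926.9, 60933.1, 60946.5, 60960.0]

weekly = group_visits_by_week(ids, mjds, 60925, 4)
for week_start, week_ids in sorted(weekly.items()):
    print(week_start, "(" + ",".join(map(str, week_ids)) + ")")
```

Weeks with no visits simply do not appear in the result, so no empty query string is generated for them.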