DM-18000: Implement PipelineTask and yaml file to convert DiaSources for SDM system inside ap_association #104

Merged
merged 4 commits into from
Feb 25, 2021
150 changes: 150 additions & 0 deletions data/DiaSource.yaml
@@ -0,0 +1,150 @@
funcs:
    diaSourceId: # the index of deepCoadd_disSource IS the diaSourceId
        functor: Column
        args: id
    ccdVisitId:
        functor: Column
        args: ccdVisitId
    diaObjectId:
        functor: Column
        args: diaObjectId
    # ssObjectId not implemented
    parentDiaSourceId:
        functor: Column
        args: parent
    midPointTai:
        functor: Column
        args: midPointTai
    pixelId:
        functor: Column
        args: pixelId
    bboxSize:
        functor: Column
        args: bboxSize
    ra:
        functor: RAColumn
    # raErr: not available yet DM-15180
    decl:
        functor: DecColumn
    # declErr: not available yet DM-15180
    # ra_decl_Cov: not available yet
    x:
        functor: Column
        args: slot_Centroid_x
    y:
        functor: Column
        args: slot_Centroid_y
    xErr:
        functor: Column
        args: slot_Centroid_xErr
    yErr:
        functor: Column
        args: slot_Centroid_yErr
    # x_y_Cov: not available
    apFlux:
        functor: LocalNanojansky
        args:
            - slot_ApFlux_instFlux
            - slot_ApFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    apFluxErr:
        functor: LocalNanojanskyErr
        args:
            - slot_ApFlux_instFlux
            - slot_ApFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    # SNR need to make functor. DM-
    psFlux:
        functor: LocalNanojansky
        args:
            - slot_PsfFlux_instFlux
            - slot_PsfFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    psFluxErr:
        functor: LocalNanojanskyErr
        args:
            - slot_PsfFlux_instFlux
            - slot_PsfFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    # ps_ra_cov not implemented
    # ps_dec_cov not implemented
    # psLnl not implemented
    # psChi2 not implemented
    # psNdata not implemented
    # trailFlux not implemented
    # trailRa not implemented
    # trailDec not implemented
    # trailAngle not implemented
    # trailCov[15] not implemented
    # trailLnL not implemented
    # trailChi2 not implemented
    # trailNdata not implemented
    # dipMeanFlux needs functor DM-
    # dipFluxDiff needs functor DM-
    # dipRa not implemented
    # dipDec not implemented
    # (this may be redundant with RA/DEC as the default centroid is the
    # dip model, defaulting to SdssCentroid on Dip-Fit failure.)
    dipLength:
        functor: ConvertPixelToArcseconds
        args:
            - ip_diffim_DipoleFit_separation
            - base_LocalWcs_CDMatrix_1_1
            - base_LocalWcs_CDMatrix_1_2
            - base_LocalWcs_CDMatrix_2_1
            - base_LocalWcs_CDMatrix_2_2
    dipAngle:
        functor: Column
        args: ip_diffim_DipoleFit_orientation
    # dipCov not implemented
    # dipLnl not implemented
    dipChi2:
        functor: Column
        args: ip_diffim_DipoleFit_chi2dof
    # dipNdata not implemented
    totFlux:
        functor: LocalNanojansky
        args:
            - ip_diffim_forced_PsfFlux_instFlux
            - ip_diffim_forced_PsfFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    totFluxErr:
        functor: LocalNanojanskyErr
        args:
            - ip_diffim_forced_PsfFlux_instFlux
            - ip_diffim_forced_PsfFlux_instFluxErr
            - base_LocalPhotoCalib
            - base_LocalPhotoCalibErr
    # diffFlux not implemented and likely dropped due to no snaps.
    # diffFluxErr not implemented and likely dropped due to no snaps.
    # fpBkgd not measured yet. DM-
    # fpBkgdErr not measured yet. DM-

    # These values below work but need a new functor for converting pixel^2
    # units to arcsec^2 DM-
    Ixx:
        functor: Column
        args: slot_Shape_xx
    Iyy:
        functor: Column
        args: slot_Shape_yy
    Ixy:
        functor: Column
        args: slot_Shape_xy
    # Icov not implemented
    IxxPsf:
        functor: Column
        args: slot_PsfShape_xx
    IyyPsf:
        functor: Column
        args: slot_PsfShape_yy
    IxyPsf:
        functor: Column
        args: slot_PsfShape_xy
    # extendedness not implemented
    # spuriousness not implemented
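
As a rough illustration of what the `LocalNanojansky`, `LocalNanojanskyErr`, and `ConvertPixelToArcseconds` entries above are meant to compute, the sketch below makes three assumptions: the local photometric calibration is a plain multiplicative factor from instrumental flux to nanojansky, its uncertainty combines with the flux uncertainty in quadrature, and the local CD matrix elements are in radians per pixel. The function names are hypothetical and this is not the `lsst.pipe.tasks` implementation.

```python
import math


def inst_flux_to_nanojansky(inst_flux, local_calib):
    """Scale an instrumental flux by a multiplicative local calibration.

    Assumption: the calibration converts counts directly to nanojansky.
    """
    return inst_flux * local_calib


def inst_flux_err_to_nanojansky_err(inst_flux, inst_flux_err,
                                    local_calib, local_calib_err):
    """Propagate flux and calibration uncertainties in quadrature.

    Assumption: the two uncertainties are uncorrelated.
    """
    return math.hypot(inst_flux_err * local_calib,
                      inst_flux * local_calib_err)


def pixels_to_arcseconds(n_pixels, cd11, cd12, cd21, cd22):
    """Convert a length in pixels to arcseconds via the local pixel scale.

    Assumption: CD matrix elements are in radians per pixel, so the
    pixel scale is the square root of the absolute determinant.
    """
    scale_rad = math.sqrt(abs(cd11 * cd22 - cd12 * cd21))
    return n_pixels * math.degrees(scale_rad) * 3600.0


flux = inst_flux_to_nanojansky(100.0, 2.0)
flux_err = inst_flux_err_to_nanojansky_err(100.0, 3.0, 2.0, 0.04)

# A 0.2 arcsec/pixel scale expressed in radians per pixel.
scale = math.radians(0.2 / 3600.0)
length = pixels_to_arcseconds(10.0, scale, 0.0, 0.0, scale)
```

Each flux entry in the YAML lists the same four input columns for both the value and its error, which matches the shape of the two flux helpers here: the error needs all four inputs, while the value needs only the flux and the calibration.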
181 changes: 181 additions & 0 deletions python/lsst/ap/association/transformDiaSourceCatalog.py
@@ -0,0 +1,181 @@
# This file is part of ap_association
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (https://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.

__all__ = ("TransformDiaSourceCatalogConnections",
           "TransformDiaSourceCatalogConfig",
           "TransformDiaSourceCatalogTask")

import numpy as np
import os

from lsst.daf.base import DateTime
import lsst.pex.config as pexConfig
import lsst.pipe.base as pipeBase
import lsst.pipe.base.connectionTypes as connTypes
from lsst.pipe.tasks.postprocess import TransformCatalogBaseTask
from lsst.pipe.tasks.parquetTable import ParquetTable
from lsst.utils import getPackageDir


class TransformDiaSourceCatalogConnections(pipeBase.PipelineTaskConnections,
                                           dimensions=("instrument", "visit", "detector"),
                                           defaultTemplates={"coaddName": "deep", "fakesType": ""}):
    """Butler connections for TransformDiaSourceCatalogTask.
    """
    diaSourceCat = connTypes.Input(
        doc="Catalog of DiaSources produced during image differencing.",
        name="{fakesType}{coaddName}Diff_diaSrc",
        storageClass="SourceCatalog",
        dimensions=("instrument", "visit", "detector"),
    )
    diffIm = connTypes.Input(
        doc="Difference image on which the DiaSources were detected.",
        name="{fakesType}{coaddName}Diff_differenceExp",
        storageClass="ExposureF",
        dimensions=("instrument", "visit", "detector"),
    )
    diaSourceTable = connTypes.Output(
        doc="Catalog of DiaSources with calibrated values and renamed columns.",
        name="{fakesType}{coaddName}Diff_diaSrcTable",
        storageClass="DataFrame",
        dimensions=("instrument", "visit", "detector"),
    )


class TransformDiaSourceCatalogConfig(pipeBase.PipelineTaskConfig,
                                      pipelineConnections=TransformDiaSourceCatalogConnections):
    """Configuration for TransformDiaSourceCatalogTask.
    """
    functorFile = pexConfig.Field(
        dtype=str,
        doc='Path to YAML file specifying Science DataModel functors to use '
            'when copying columns and computing calibrated values.',
        default=os.path.join(getPackageDir("ap_association"),
                             "data",
                             "DiaSource.yaml")
    )


class TransformDiaSourceCatalogTask(TransformCatalogBaseTask):
    """Apply Science DataModel-ification on the DiaSource afw table.

    This task calibrates and renames columns in the DiaSource catalog
    to ready the catalog for insertion into the Apdb.

    This is a Gen3 Butler only task. It will not run in Gen2.
    """

    ConfigClass = TransformDiaSourceCatalogConfig
    _DefaultName = "transformDiaSourceCatalog"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.funcs = self.getFunctors()

    def runQuantum(self, butlerQC, inputRefs, outputRefs):
        inputs = butlerQC.get(inputRefs)
        expId, expBits = butlerQC.quantum.dataId.pack("visit_detector",
                                                      returnMaxBits=True)
        inputs["ccdVisitId"] = expId
        inputs["band"] = butlerQC.quantum.dataId["band"]

        outputs = self.run(**inputs)

        butlerQC.put(outputs, outputRefs)

    def run(self,
            diaSourceCat,
            diffIm,
            band,
            ccdVisitId,
            funcs=None):
        """Convert input catalog to ParquetTable/Pandas and run functors.

        Additionally, add new columns that copy information from the
        exposure into the DiaSource catalog.

        Parameters
        ----------
        diaSourceCat : `lsst.afw.table.SourceCatalog`
            Catalog of DiaSources produced during image differencing.
        diffIm : `lsst.afw.image.ExposureF`
            Difference image on which the DiaSources were detected.
        band : `str`
            Band of the exposure, taken from the quantum data ID.
        ccdVisitId : `int`
            Packed visit and detector identifier for this exposure.
        funcs : optional
            Currently ignored: the functors loaded in the constructor
            (``self.funcs``) are always applied.

        Returns
        -------
        results : `lsst.pipe.base.Struct`
            Results struct with components.

            - ``diaSourceTable`` : Catalog of DiaSources with calibrated values
              and renamed columns.
              (`lsst.pipe.tasks.ParquetTable` or `pandas.DataFrame`)
        """
        self.log.info(
            "Transforming/standardizing the DiaSource table ccdVisitId: %i",
            ccdVisitId)

        diaSourceDf = diaSourceCat.asAstropy().to_pandas()
        diaSourceDf["bboxSize"] = self.computeBBoxSizes(diaSourceCat)
        diaSourceDf["ccdVisitId"] = ccdVisitId
        diaSourceDf["filterName"] = band
        diaSourceDf["midPointTai"] = diffIm.getInfo().getVisitInfo().getDate().get(system=DateTime.MJD)
        diaSourceDf["diaObjectId"] = 0
        diaSourceDf["pixelId"] = 0

        df = self.transform(band,

self.transform() returns the struct containing both df and analysis. Is the analysis part of it used anywhere?

Contributor Author

Analysis in this context is a PostprocessAnalysis (https://github.com/lsst/pipe_tasks/blob/0a6ae7924aec43e43d9a5ca4990e8100581c9634/python/lsst/pipe/tasks/postprocess.py#L341) object that TransformCatalogBaseTask and its subclasses use to run the functors. I don't see a reason to return it, since it is only a class that runs the functors and cleans up if requested.

                            ParquetTable(dataFrame=diaSourceDf),
                            self.funcs,
                            dataId=None).df
        return pipeBase.Struct(
            diaSourceTable=df
        )

    def computeBBoxSizes(self, inputCatalog):
        """Compute the size of a square bbox that fully contains the detection
        footprint.

        Parameters
        ----------
        inputCatalog : `lsst.afw.table.SourceCatalog`
            Catalog containing detected footprints.

        Returns
        -------
        outputBBoxSizes : `numpy.ndarray`, (N,)
            Array of bbox sizes.
        """
        outputBBoxSizes = np.empty(len(inputCatalog), dtype=int)
        for idx, record in enumerate(inputCatalog):
            footprintBBox = record.getFootprint().getBBox()
            # Compute twice the size of the largest dimension of the footprint
            # bounding box. This is the largest footprint we should need to cover
            # the complete DiaSource assuming the centroid is within the bounding
            # box.
            maxSize = 2 * np.max([footprintBBox.getWidth(),
                                  footprintBBox.getHeight()])
            recX = record.getCentroid().x
            recY = record.getCentroid().y
            bboxSize = int(
                np.ceil(2 * np.max(np.fabs([footprintBBox.maxX - recX,
                                            footprintBBox.minX - recX,
                                            footprintBBox.maxY - recY,
                                            footprintBBox.minY - recY]))))
            if bboxSize > maxSize:
                bboxSize = maxSize
            outputBBoxSizes[idx] = bboxSize

        return outputBBoxSizes
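
The bounding-box logic in `computeBBoxSizes` can be exercised without the LSST stack. The sketch below restates it with plain numbers; the function name and its arguments are hypothetical, and it assumes an integer pixel box whose width and height include both end pixels.

```python
import math


def compute_bbox_size(min_x, max_x, min_y, max_y, cen_x, cen_y):
    """Side of a square, centered on the centroid, that covers the bbox.

    Assumption: the box is an inclusive integer pixel box, so its width
    is ``max_x - min_x + 1`` (and likewise for the height).
    """
    width = max_x - min_x + 1
    height = max_y - min_y + 1
    # Cap at twice the largest box dimension, which is enough whenever
    # the centroid lies inside the box.
    max_size = 2 * max(width, height)
    # Twice the largest centroid-to-edge distance, rounded up.
    size = math.ceil(2 * max(abs(max_x - cen_x), abs(min_x - cen_x),
                             abs(max_y - cen_y), abs(min_y - cen_y)))
    return min(size, max_size)


centered = compute_bbox_size(10, 20, 30, 34, 15.0, 32.0)
off_center = compute_bbox_size(10, 20, 30, 34, 12.3, 32.0)
outside = compute_bbox_size(10, 20, 30, 34, 100.0, 32.0)
```

The cap only takes effect in the third call, where the centroid lies far outside the footprint box and the uncapped size would otherwise grow with the centroid offset.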
22 changes: 22 additions & 0 deletions tests/data/testDiaSource.yaml
@@ -0,0 +1,22 @@
# Small set of columns to use for testing of the SDM transformer.
funcs:
    diaSourceId: # the index of deepCoadd_disSource IS the diaSourceId
        functor: Index
    ccdVisitId:
        functor: Column
        args: ccdVisitId
    diaObjectId:
        functor: Column
        args: diaObjectId
    midPointTai:
        functor: Column
        args: midPointTai
    pixelId:
        functor: Column
        args: pixelId
    bboxSize:
        functor: Column
        args: bboxSize
    filterName:
        functor: Column
        args: filterName
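
To make the shape of this functor file concrete, here is a minimal sketch of how a `funcs` mapping could drive column selection and renaming. It uses plain dictionaries instead of the real functor classes, handles only `Column` and `Index`, and treats the index as the row position, which is an assumption of this sketch (in the real table the index would carry the source id). The helper name `apply_column_spec` is hypothetical.

```python
def apply_column_spec(rows, spec):
    """Build output rows from input rows using a functor-style spec.

    Only two functor kinds are handled in this sketch: "Column" copies
    the input column named by "args", and "Index" uses the row position.
    """
    out = []
    for index, row in enumerate(rows):
        new_row = {}
        for out_name, entry in spec["funcs"].items():
            if entry["functor"] == "Column":
                new_row[out_name] = row[entry["args"]]
            elif entry["functor"] == "Index":
                new_row[out_name] = index
            else:
                raise ValueError(f"Unsupported functor: {entry['functor']}")
        out.append(new_row)
    return out


# The same structure the YAML above would load into.
spec = {"funcs": {
    "diaSourceId": {"functor": "Index"},
    "ccdVisitId": {"functor": "Column", "args": "ccdVisitId"},
    "filterName": {"functor": "Column", "args": "filterName"},
}}
rows = [{"ccdVisitId": 42, "filterName": "g", "unused": 1.5},
        {"ccdVisitId": 42, "filterName": "g", "unused": 2.5}]
table = apply_column_spec(rows, spec)
```

Input columns that the spec does not name, such as `unused` here, are dropped from the output, so the spec doubles as the list of columns in the standardized table.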