Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-22663: Reimplement make_apdb.py for Gen 3 #62

Merged
merged 5 commits into from
Nov 3, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
33 changes: 22 additions & 11 deletions doc/lsst.ap.pipe/apdb.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,14 +28,13 @@ Configuring the database
========================

The database is configured using `~lsst.dax.apdb.ApdbConfig`.
|ap_pipe| command line users can pass configuration information to the script through the :option:`--config <ap_pipe.py --config>` and :option:`--configfile <ap_pipe.py --configfile>` command-line options.
|make_apdb| also uses `ApPipeConfig` and the :option:`--config` and :option:`--configfile` options, so users can pass exactly the same arguments to |make_apdb| and |ap_pipe|.
Supporting identical command line arguments for both scripts makes it easy to keep the database settings in sync.
|ap_pipe| command line users can pass configuration information to the script through the :option:`--config <ap_pipe.py --config>` and :option:`--configfile <ap_pipe.py --configfile>` command-line options, using the prefix ``diaPipe.apdb.`` to distinguish APDB information from other pipeline configuration.
mrawls marked this conversation as resolved.
Show resolved Hide resolved
|make_apdb| configures the database directly through :option:`--config` and :option:`--config-file` (different spelling, for consistency with |pipetask|), with no prefix.

For |pipetask| users the process is almost the same, except that |pipetask|'s config syntax is not exactly the same.
The :option:`--config <pipetask --config>` and :option:`--configfile <pipetask --configfile>` options for |pipetask| use colons as separators between each task and its config; replace these with periods for |make_apdb|.
For |pipetask| users, the APDB is configured with the :option:`--config <pipetask run --config>` and :option:`--config-file <pipetask run --config-file>` options.
APDB configuration info uses the prefix ``diaPipe:apdb.``, with a colon, but is otherwise the same.

Note that ``apdb.db_url`` has no default; a value *must* be provided by the user.
Note that the `~lsst.dax.apdb.ApdbConfig.db_url` field has no default; a value *must* be provided by the user.

.. _section-ap-pipe-apdb-examples:

Expand All @@ -46,23 +45,35 @@ Databases can be configured using direct config overrides (see :ref:`ap-pipe-pip

.. prompt:: bash

make_apdb.py -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED diaPipe.apdb.db_url="sqlite:///databases/apdb.db" differencer.coaddName=dcr
make_apdb.py -c isolation_level=READ_UNCOMMITTED db_url="sqlite:///databases/apdb.db"
kfindeisen marked this conversation as resolved.
Show resolved Hide resolved
ap_pipe.py -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED diaPipe.apdb.db_url="sqlite:///databases/apdb.db" differencer.coaddName=dcr repo --calib repo/calibs --rerun myrun --id [optional IDs to process]

|make_apdb| ignores any `ApPipeConfig` fields not related to the APDB (in the example, ``differencer.coaddName``), so there is no need to filter them out.
The user is responsible for making sure the two APDB configurations are consistent.

In Gen 3, this becomes (see :ref:`ap-pipe-pipeline-tutorial` for an explanation of |pipetask|):

.. prompt:: bash

make_apdb.py -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED diaPipe.apdb.db_url="sqlite:///databases/apdb.db" differencer.coaddName=dcr
make_apdb.py -c isolation_level=READ_UNCOMMITTED db_url="sqlite:///databases/apdb.db"
pipetask run -p ApPipe.yaml -c diaPipe:apdb.isolation_level=READ_UNCOMMITTED diaPipe:apdb.db_url="sqlite:///databases/apdb.db" differencer:coaddName=dcr -b repo -o myrun

Databases can also be set up using config files:
.. warning::

As in Gen 2, make sure the APDB is created with a configuration consistent with the one used by the pipeline.
Note that the pipeline file given by ``-p`` may include APDB config overrides of its own.
You can double-check what configuration is being run by calling :command:`pipetask run` with the ``--show config="apdb*"`` argument, though this lists *all* configuration options, including those left at their defaults.

Databases can also be set up using :ref:`config files <command-line-task-config-howto-configfile>`:

.. code-block:: py
:caption: myApdbConfig.py

config.db_url = "sqlite:///databases/apdb.db"
config.isolation_level = "READ_UNCOMMITTED"

.. prompt:: bash

make_apdb.py -C myApPipeConfig.py
make_apdb.py -C myApdbConfig.py
kfindeisen marked this conversation as resolved.
Show resolved Hide resolved
ap_pipe.py repo --calib repo/calibs --rerun myrun -C myApPipeConfig.py --id [optional ID to process]

.. _section-ap-pipe-apdb-seealso:
Expand Down
2 changes: 1 addition & 1 deletion doc/lsst.ap.pipe/pipeline-tutorial-gen2.rst
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ To process your ingested data, run
.. prompt:: bash

mkdir apdb/
make_apdb.py -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED -c diaPipe.apdb.db_url="sqlite:///apdb/association.db"
make_apdb.py -c isolation_level=READ_UNCOMMITTED -c db_url="sqlite:///apdb/association.db"
ap_pipe.py repo --calib repo/calibs --rerun processed -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED -c diaPipe.apdb.db_url="sqlite:///apdb/association.db" --id visit=123456^123457 ccdnum=42 filter=g --template templates

In this case, a ``processed`` directory will be created within ``repo/rerun`` and the results will be written there.
Expand Down
2 changes: 1 addition & 1 deletion doc/lsst.ap.pipe/pipeline-tutorial.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ To process your ingested data, run
.. prompt:: bash

mkdir apdb/
make_apdb.py -c diaPipe.apdb.isolation_level=READ_UNCOMMITTED -c diaPipe.apdb.db_url="sqlite:///apdb/association.db"
make_apdb.py -c isolation_level=READ_UNCOMMITTED -c db_url="sqlite:///apdb/association.db"
pipetask run -p ${AP_PIPE_DIR}/pipelines/ApPipe.yaml --instrument lsst.obs.decam.DarkEnergyCamera --register-dataset-types -c diaPipe:apdb.isolation_level=READ_UNCOMMITTED -c diaPipe:apdb.db_url="sqlite:///apdb.db" -b repo/ -i "templates/deep,skymaps,DECam/raw/all,DECam/calib,refcats" -o processed -d "visit in (123456, 123457) and detector=42"

In this case, a ``processed/<timestamp>`` collection will be created within ``repo`` and the results will be written there.
Expand Down
83 changes: 63 additions & 20 deletions python/lsst/ap/pipe/make_apdb.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,10 +23,9 @@

import argparse

from lsst.pipe.base import ConfigFileAction, ConfigValueAction
from lsst.ap.association import make_dia_object_schema, make_dia_source_schema

from .ap_pipe import ApPipeConfig
from lsst.dax.apdb import Apdb
from lsst.pipe.base.configOverrides import ConfigOverrides
from lsst.ap.association import DiaPipelineConfig, make_dia_object_schema, make_dia_source_schema


class ConfigOnlyParser(argparse.ArgumentParser):
Expand All @@ -38,26 +37,25 @@ def __init__(self, description=None, **kwargs):
# Description must be readable in both Sphinx and make_apdb.py -h
description = """\
Create a Alert Production Database using config overrides for
`lsst.ap.pipe.ApPipeConfig`.
`lsst.dax.apdb.ApdbConfig`.

This script takes the same ``--config`` and ``--configfile`` arguments as
command-line tasks. Calling ``ap_pipe.py`` with the same arguments uses the
newly created database.
This script takes the same ``--config`` and ``--config-file`` arguments as
pipeline runs. However, the configs are at a lower level than the AP pipeline.

The config overrides must define ``apdb.db_url`` to create a valid config.
The config overrides must define ``db_url`` to create a valid config.
"""

super().__init__(description=description, **kwargs)

self.add_argument("-c", "--config", nargs="*", action=ConfigValueAction,
help="config override(s), e.g. "
"``-c apdb.prefix=fancy diaPipe.apdb.db_url=\"sqlite://\"``",
"``-c prefix=fancy db_url=\"sqlite://\"``",
metavar="NAME=VALUE")
self.add_argument("-C", "--configfile", dest="configfile", nargs="*", action=ConfigFileAction,
help="config override file(s) for ApPipeConfig")
self.add_argument("-C", "--config-file", dest="configfile", nargs="*", action=ConfigFileAction,
help="config override file(s) for ApdbConfig")

def parse_args(self, args=None, namespace=None):
"""Parse arguments for an `ApPipeConfig`.
"""Parse arguments for an `ApdbConfig`.

Parameters
----------
Expand All @@ -76,14 +74,24 @@ def parse_args(self, args=None, namespace=None):
"""
if not namespace:
namespace = argparse.Namespace()
namespace.config = ApPipeConfig()
namespace.overrides = ConfigOverrides()

# ConfigFileAction and ConfigValueAction require namespace.config to exist
# ConfigFileAction and ConfigValueAction require namespace.overrides to exist
namespace = super().parse_args(args, namespace)
del namespace.configfile
# Make ApdbConfig as a subconfig of DiaPipelineConfig to ensure correct defaults get set
namespace.config = DiaPipelineConfig().apdb.value
try:
namespace.overrides.applyTo(namespace.config)
except Exception as e: # yes, configs really can raise anything
message = str(e)
if "diaPipe" in message:
message = "it looks like one of the config files uses ApPipeConfig; " \
"try dropping 'diaPipe.apdb'\n" + message
self.error(f"cannot apply config: {message}")

namespace.config.validate()
# Do not freeze the config, as a workaround for DM-24435.
namespace.config.freeze()

return namespace

Expand All @@ -92,7 +100,7 @@ def makeApdb(args=None):
"""Create an APDB according to a config.

The command-line arguments should provide config values or a config file
for `ApPipeConfig`.
for `ApdbConfig`.

Parameters
----------
Expand All @@ -108,8 +116,43 @@ def makeApdb(args=None):
parser = ConfigOnlyParser()
parsedCmd = parser.parse_args(args=args)

apdb = parsedCmd.config.diaPipe.apdb.apply(
afw_schemas=dict(DiaObject=make_dia_object_schema(),
DiaSource=make_dia_source_schema()))
apdb = Apdb(config=parsedCmd.config,
afw_schemas=dict(DiaObject=make_dia_object_schema(),
DiaSource=make_dia_source_schema()))
apdb.makeSchema()
return apdb


# --------------------------------------------------------------------
# argparse.Actions for use with ConfigOverrides
# ConfigOverrides is normally used with Click; there is no built-in
# argparse support.


class ConfigValueAction(argparse.Action):
"""argparse action to override config parameters using
name=value pairs from the command-line.
"""

def __call__(self, parser, namespace, values, option_string):
if namespace.overrides is None:
return
for nameValue in values:
name, sep, valueStr = nameValue.partition("=")
if not valueStr:
parser.error(f"{option_string} value {nameValue} must be in form name=value")
if "diaPipe.apdb" in name:
parser.error(f"{nameValue} looks like it uses ApPipeConfig; try dropping 'diaPipe.apdb'")
kfindeisen marked this conversation as resolved.
Show resolved Hide resolved

namespace.overrides.addValueOverride(name, valueStr)


class ConfigFileAction(argparse.Action):
"""argparse action to load config overrides from one or more files.
"""

def __call__(self, parser, namespace, values, option_string=None):
if namespace.overrides is None:
return
for configfile in values:
namespace.overrides.addFileOverride(configfile)
68 changes: 64 additions & 4 deletions tests/test_makeapdb.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,10 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.

import contextlib
import io
import shlex
import sys
import unittest

import lsst.utils.tests
Expand Down Expand Up @@ -50,17 +53,74 @@ def setUp(self):
def testExtras(self):
"""Verify that a command line containing extra arguments is rejected.
"""
args = '-c diaPipe.apdb.db_url="dummy" --id visit=42'
args = '-c db_url="dummy" --id visit=42'
with self.assertRaises(SystemExit):
self._parseString(args)

def testSetValue(self):
"""Verify that command-line arguments get propagated.
"""
args = '-c diaPipe.apdb.db_url="dummy" -c diaPipe.apdb.dia_object_index=pix_id_iov'
args = '-c db_url="dummy" -c dia_object_index=pix_id_iov'
parsed = self._parseString(args)
self.assertEqual(parsed.config.diaPipe.apdb.db_url, 'dummy')
self.assertEqual(parsed.config.diaPipe.apdb.dia_object_index, 'pix_id_iov')
self.assertEqual(parsed.config.db_url, 'dummy')
self.assertEqual(parsed.config.dia_object_index, 'pix_id_iov')

def testSetValueFile(self):
"""Verify that config files are handled correctly.
"""
with lsst.utils.tests.getTempFilePath(ext=".py") as configFile:
with open(configFile, mode='wt') as config:
config.write('config.db_url = "dummy"\n')
config.write('config.dia_object_index = "pix_id_iov"\n')

args = f"-C {configFile}"
parsed = self._parseString(args)

self.assertEqual(parsed.config.db_url, 'dummy')
self.assertEqual(parsed.config.dia_object_index, 'pix_id_iov')

def testDiaPipeDefaults(self):
"""Verify that DiaPipelineTask's custom APDB settings are included.
"""
args = '-c db_url="dummy"'
parsed = self._parseString(args)
self.assertIn("ap_association", parsed.config.extra_schema_file)

@contextlib.contextmanager
def _temporaryBuffer(self):
tempStdErr = io.StringIO()
savedStdErr = sys.stderr
sys.stderr = tempStdErr
try:
yield tempStdErr
finally:
sys.stderr = savedStdErr

def testOldConfig(self):
"""Verify that old-style config options are caught.
"""
args = '-c diaPipe.apdb.db_url="dummy"'
with self._temporaryBuffer() as buffer:
with self.assertRaises(SystemExit):
self._parseString(args)

output = buffer.getvalue()
self.assertIn("try dropping 'diaPipe.apdb'", output)

def testOldConfigFile(self):
"""Verify that old-style config file entries are caught.
"""
with lsst.utils.tests.getTempFilePath(ext=".py") as configFile:
with open(configFile, mode='wt') as config:
config.write('config.diaPipe.apdb.db_url = "dummy"\n')

args = f"-C {configFile}"
with self._temporaryBuffer() as buffer:
with self.assertRaises(SystemExit):
self._parseString(args)

output = buffer.getvalue()
self.assertIn("try dropping 'diaPipe.apdb'", output)


class MemoryTester(lsst.utils.tests.MemoryTestCase):
Expand Down