DM-16536: Migrate all metrics from ap.verify.measurements #63

Merged · 7 commits · Mar 18, 2019
1 change: 0 additions & 1 deletion bin/run_ci_dataset.sh
@@ -85,5 +85,4 @@ ap_verify.py --dataset "${DATASET}" \
--output "${WORKSPACE}" \
--processes "${NUMPROC}" \
--metrics-file "${WORKSPACE}/ap_verify.{dataId}.verify.json" \
--silent \
&>> "${WORKSPACE}"/apVerify.log
11 changes: 11 additions & 0 deletions config/default_dataset_metrics.py
@@ -0,0 +1,11 @@
from lsst.verify.tasks import ConfigPpdbLoader
# Import these modules to ensure the metrics are registered
import lsst.ap.association.metrics # noqa: F401

ppdbConfigs = ["totalUnassociatedDiaObjects"]
config.measurers = ppdbConfigs

# List comprehension would be cleaner, but can't refer to config inside one
for subConfig in ppdbConfigs:
config.measurers[subConfig].dbLoader.retarget(ConfigPpdbLoader)
config.measurers[subConfig].dbInfo.name = "apPipe_config"
13 changes: 10 additions & 3 deletions config/default_metrics.py → config/default_image_metrics.py
@@ -1,8 +1,12 @@
from lsst.ap.verify.measurements.profiling import TimingMetricConfig
# Import these modules to ensure the metrics are registered
import lsst.ip.diffim.metrics # noqa: F401
import lsst.ap.association.metrics # noqa: F401

config.jobFileTemplate = "ap_verify.metricTask{id}.{dataId}.verify.json"

config.measurers = ["timing"]
metadataConfigs = ["numNewDiaObjects",
"numUnassociatedDiaObjects",
"fracUpdatedDiaObjects"]
config.measurers = ["timing", "numSciSources", "fracDiaSourcesToSciSources"] + metadataConfigs

timingConfigs = {
"apPipe.runDataRef": "ap_pipe.ApPipeTime",
@@ -25,3 +29,6 @@
config.measurers["timing"].configs[target] = subConfig
for subConfig in config.measurers["timing"].configs.values():
subConfig.metadata.name = "apPipe_metadata"
# List comprehension would be cleaner, but can't refer to config inside one
for subConfig in metadataConfigs:
config.measurers[subConfig].metadata.name = "apPipe_metadata"
47 changes: 16 additions & 31 deletions doc/lsst.ap.verify/command-line-reference.rst
@@ -58,6 +58,15 @@ Required arguments are :option:`--dataset` and :option:`--output`.

Allowed names can be queried using the :option:`--help` argument.

.. option:: --dataset-metrics-config <filename>

**Input dataset-level metrics config.**

A config file containing a `~lsst.verify.gen2tasks.MetricsControllerConfig`, which specifies which metrics are measured and sets any options.
If this argument is omitted, :file:`config/default_dataset_metrics.py` will be used.

Use :option:`--image-metrics-config` to configure image-level metrics instead.
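
For illustration (not part of this change), a dataset-level config passed through this option might look like the following sketch, modeled directly on the :file:`config/default_dataset_metrics.py` added in this PR; the measurer name, loader, and dataset name are the ones used by that default.

.. code-block:: py

   from lsst.verify.tasks import ConfigPpdbLoader
   # Importing this module registers the association metric tasks
   import lsst.ap.association.metrics  # noqa: F401

   # Measure only the number of unassociated DIAObjects left in the PPDB
   config.measurers = ["totalUnassociatedDiaObjects"]
   config.measurers["totalUnassociatedDiaObjects"].dbLoader.retarget(ConfigPpdbLoader)
   config.measurers["totalUnassociatedDiaObjects"].dbInfo.name = "apPipe_config"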

.. option:: -h, --help

**Print help.**
@@ -70,12 +79,14 @@

When ``processes`` is larger than 1, the pipeline may use the Python `multiprocessing` module to parallelize processing of multiple datasets across multiple processors.

.. option:: --metrics-config <filename>
.. option:: --image-metrics-config <filename>

**Input metrics config.**
**Input image-level metrics config.**

A config file containing a `~lsst.verify.gen2tasks.MetricsControllerConfig`, which specifies which metrics are measured and sets any options.
If this argument is omitted, :file:`config/default_metrics.py` will be used.
If this argument is omitted, :file:`config/default_image_metrics.py` will be used.

Use :option:`--dataset-metrics-config` to configure dataset-level metrics instead.
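
As a sketch (again, not part of this change), an image-level override supplied through this option follows the same pattern as :file:`config/default_image_metrics.py`; here it is reduced to a single metadata-based metric, using the measurer and dataset names from that default.

.. code-block:: py

   # Importing this module registers the metadata-based association metrics
   import lsst.ap.association.metrics  # noqa: F401

   # One verify job file per metric task and data ID
   config.jobFileTemplate = "ap_verify.metricTask{id}.{dataId}.verify.json"

   # Measure only the number of new DIAObjects created by association
   config.measurers = ["numNewDiaObjects"]
   # Read the measurement from the top-level ApPipeTask metadata dataset
   config.measurers["numNewDiaObjects"].metadata.name = "apPipe_metadata"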

.. option:: --metrics-file <filename>

@@ -98,31 +109,5 @@ Required arguments are :option:`--dataset` and :option:`--output`.

**Do not report measurements to SQuaSH.**

Disables upload of measurements, so that ``ap_verify`` can be run for testing purposes by developers.

.. note::

Ingestion of :doc:`lsst.verify</modules/lsst.verify/index>` metrics is not yet supported by SQuaSH, so this flag should always be provided for now.


.. _ap-verify-cmd-envvar:

Environment variables
=====================

The :envvar:`SQUASH_USER`, :envvar:`SQUASH_PASSWORD`, and :envvar:`SQUASH_URL` environment variables are used by :doc:`the verify framework</modules/lsst.verify/index>` to configure SQuaSH upload.
:envvar:`SQUASH_USER` and :envvar:`SQUASH_PASSWORD` must be defined in any environment where :command:`ap_verify.py` is run unless the :option:`--silent` flag is used.

.. TODO: remove this once `lsst.verify` documents them, and update the link (DM-12849)

.. envvar:: SQUASH_USER

User name to use for SQuaSH submissions.

.. envvar:: SQUASH_PASSWORD

Unencrypted password for :envvar:`SQUASH_USER`.

.. envvar:: SQUASH_URL

The location for a SQuaSH REST API. Defaults to the SQuaSH server at ``lsst.codes``.
This flag previously disabled upload of measurements to SQuaSH.
SQuaSH support has been removed from ap_verify, so this flag has no effect and is deprecated.
1 change: 0 additions & 1 deletion doc/lsst.ap.verify/failsafe.rst
@@ -29,7 +29,6 @@ Recovering Metrics From Partial Runs
``ap_verify`` produces some measurements even if the pipeline cannot run to completion.
Specifically, if a task fails, any previously completed tasks that store measurements to disk will have done so.
In addition, if a metric cannot be computed, ``ap_verify`` may attempt to store the values of the remaining metrics.
Measurements from failed runs will never be submitted to SQuaSH.

If the pipeline fails, ``ap_verify`` may not preserve measurements computed from the dataset.
Once the framework for handling metrics is finalized, ``ap_verify`` may be able to offer a broader guarantee that does not depend on how or where any individual metric is implemented.
31 changes: 30 additions & 1 deletion doc/lsst.ap.verify/index.rst
@@ -7,7 +7,7 @@ lsst.ap.verify
##############

The ``lsst.ap.verify`` package provides an executable python program for pipeline verification.
It runs the alert production pipeline (encapsulated in the :doc:`lsst.ap.pipe </modules/lsst.ap.pipe/index>` package), computes :doc:`lsst.verify </modules/lsst.verify/index>` metrics on both the pipeline's state and its output, and works with the `SQuaSH <https://squash.lsst.codes/>`_ system to allow their monitoring and analysis.
It runs the alert production pipeline (encapsulated in the :doc:`lsst.ap.pipe </modules/lsst.ap.pipe/index>` package) and computes :doc:`lsst.verify </modules/lsst.verify/index>` metrics on both the pipeline's state and its output.

``ap_verify`` is designed to work with small, standardized :doc:`datasets<datasets>` that can be interchanged to test the Stack's performance under different conditions.
To ensure consistent results, it :doc:`runs the entire AP pipeline<running>` as a single unit, from data ingestion to source association.
@@ -40,6 +40,35 @@ You can find Jira issues for this module under the `ap_verify <https://jira.lsst

new-metrics

Task reference
==============

.. _lsst.ap.verify-command-line-tasks:

Command-line tasks
------------------

.. lsst-cmdlinetasks::
:root: lsst.ap.verify

.. _lsst.ap.verify-tasks:

Tasks
-----

.. lsst-tasks::
:root: lsst.ap.verify
:toctree: tasks

.. _lsst.ap.verify-configs:

Configurations
--------------

.. lsst-configs::
:root: lsst.ap.verify
:toctree: configs

.. _lsst.ap.verify-pyapi:

Python API reference
9 changes: 2 additions & 7 deletions doc/lsst.ap.verify/running.rst
@@ -31,7 +31,7 @@ Using the `HiTS 2015 <https://github.com/lsst/ap_verify_hits2015/>`_ dataset as

.. prompt:: bash

ap_verify.py --dataset HiTS2015 --id "visit=412518 filter=g" --output workspaces/hits/ --silent
ap_verify.py --dataset HiTS2015 --id "visit=412518 filter=g" --output workspaces/hits/

Here the inputs are:

@@ -42,8 +42,6 @@ while the output is:

* :command:`workspaces/hits/` is the location where the pipeline will create any :ref:`Butler repositories<command-line-task-data-repo-using-uris>` necessary,

* :command:`--silent` disables SQuaSH metrics reporting.

This call will create a new directory at :file:`workspaces/hits`, ingest the HiTS data into a new repository based on :file:`<hits-data>/repo/`, then run visit 412518 through the entire AP pipeline.

.. note::
@@ -79,10 +77,7 @@ After ``ap_verify`` has run, it will produce files named, by default, :file:`ap_
The file name may be customized using the :option:`--metrics-file <ap_verify.py --metrics-file>` command-line argument.
These files contain metric measurements in ``lsst.verify`` format, and can be loaded and read as described in the :doc:`lsst.verify documentation</modules/lsst.verify/index>` or in `SQR-019 <https://sqr-019.lsst.io>`_.

Unless the :option:`--silent <ap_verify.py --silent>` argument is provided, ``ap_verify`` will also upload measurements to the `SQuaSH service <https://squash.lsst.codes/>`_ on completion.
See the SQuaSH documentation for details.

If the pipeline is interrupted by a fatal error, completed measurements will be saved to metrics files for debugging purposes, but nothing will get sent to SQuaSH.
If the pipeline is interrupted by a fatal error, completed measurements will be saved to metrics files for debugging purposes.
See the :ref:`error-handling policy <ap-verify-failsafe-partialmetric>` for details.

Further reading
@@ -0,0 +1,69 @@
.. lsst-task-topic:: lsst.ap.verify.measurements.profiling.TimingMetricTask

################
TimingMetricTask
################

``TimingMetricTask`` creates a `~lsst.verify.Measurement` based on data collected by @\ `~lsst.pipe.base.timeMethod`.
It reads the raw timing data from the top-level `~lsst.pipe.base.CmdLineTask`'s metadata, which is identified by the task configuration.

.. _lsst.ap.verify.measurements.TimingMetricTask-summary:

Processing summary
==================

``TimingMetricTask`` searches the metadata for @\ `~lsst.pipe.base.timeMethod`-generated keys corresponding to the method of interest.
If it finds matching keys, it stores the elapsed time as a `~lsst.verify.Measurement`.

.. _lsst.ap.verify.measurements.TimingMetricTask-api:

Python API summary
==================

.. lsst-task-api-summary:: lsst.ap.verify.measurements.profiling.TimingMetricTask

.. _lsst.ap.verify.measurements.TimingMetricTask-butler:

Butler datasets
===============

Input datasets
--------------

:lsst-config-field:`~lsst.ap.verify.measurements.profiling.TimingMetricConfig.metadata`
The metadata of the top-level command-line task (e.g., ``ProcessCcdTask``, ``ApPipeTask``) being instrumented.
Because the metadata produced by each top-level task is a different Butler dataset type, this dataset **must** be explicitly configured when running ``TimingMetricTask`` or a :lsst-task:`~lsst.verify.gen2tasks.MetricsControllerTask` that contains it.
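
For example, a config applied to a :lsst-task:`~lsst.verify.gen2tasks.MetricsControllerTask` containing timing measurers can set this dataset the same way :file:`config/default_image_metrics.py` does in this change, with ``"apPipe_metadata"`` being the metadata dataset written by ``ApPipeTask``:

.. code-block:: py

   # Point every timing sub-config at the top-level task's metadata dataset
   for subConfig in config.measurers["timing"].configs.values():
       subConfig.metadata.name = "apPipe_metadata"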

.. _lsst.ap.verify.measurements.TimingMetricTask-subtasks:

Retargetable subtasks
=====================

.. lsst-task-config-subtasks:: lsst.ap.verify.measurements.profiling.TimingMetricTask

.. _lsst.ap.verify.measurements.TimingMetricTask-configs:

Configuration fields
====================

.. lsst-task-config-fields:: lsst.ap.verify.measurements.profiling.TimingMetricTask

.. _lsst.ap.verify.measurements.TimingMetricTask-examples:

Examples
========

.. code-block:: py

from lsst.ap.verify.measurements import TimingMetricTask

config = TimingMetricTask.ConfigClass()
config.metadata.name = "apPipe_metadata"
config.target = "apPipe:ccdProcessor.runDataRef"
config.metric = "pipe_tasks.ProcessCcdTime"
task = TimingMetricTask(config)

# config.metadata provided for benefit of MetricsControllerTask/Pipeline
# but since we've defined it we might as well use it
metadata = butler.get(config.metadata.name)
processCcdTime = task.run(metadata).measurement
38 changes: 3 additions & 35 deletions python/lsst/ap/verify/ap_verify.py
@@ -30,16 +30,14 @@
__all__ = ["runApVerify", "runIngestion"]

import argparse
import os
import re

import lsst.log
import lsst.utils

from .dataset import Dataset
from .ingestion import ingestDataset
from .metrics import MetricsParser, checkSquashReady, AutoJob
from .metrics import MetricsParser, computeMetrics
from .pipeline_driver import ApPipeParser, runApPipe
from .measurements import measureFromButlerRepo
from .workspace import Workspace


@@ -57,9 +55,6 @@ def __init__(self):
required=True, help='The source of data to pass through the pipeline.')
self.add_argument('--output', required=True,
help='The location of the workspace to use for pipeline repositories.')
self.add_argument('--metrics-config',
help='The config file specifying the metrics to measure. '
'Defaults to config/default_metrics.py.')


class _ApVerifyParser(argparse.ArgumentParser):
@@ -134,32 +129,6 @@ def __call__(self, _parser, namespace, values, _option_string=None):
setattr(namespace, self.dest, Dataset(values))


def _measureFinalProperties(workspace, dataIds, args):
"""Measure any metrics that apply to the final result of the AP pipeline,
rather than to a particular processing stage.

Parameters
----------
workspace : `lsst.ap.verify.workspace.Workspace`
The abstract location containing input and output repositories.
dataIds : `lsst.pipe.base.DataIdContainer`
The data IDs ap_pipe was run on. Each data ID must be complete.
args : `argparse.Namespace`
Command-line arguments, including arguments controlling output.
"""
if args.metrics_config is not None:
metricFile = args.metrics_config
else:
metricFile = os.path.join(lsst.utils.getPackageDir("ap_verify"),
"config", "default_metrics.py")

for dataRef in dataIds.refList:
with AutoJob(workspace.workButler, dataRef.dataId, args) as metricsJob:
measurements = measureFromButlerRepo(metricFile, workspace.analysisButler, dataRef.dataId)
for measurement in measurements:
metricsJob.measurements.insert(measurement)


def runApVerify(cmdLine=None):
"""Execute the AP pipeline while handling metrics.

@@ -177,15 +146,14 @@ def runApVerify(cmdLine=None):
log = lsst.log.Log.getLogger('ap.verify.ap_verify.main')
# TODO: what is LSST's policy on exceptions escaping into main()?
args = _ApVerifyParser().parse_args(args=cmdLine)
checkSquashReady(args)
log.debug('Command-line arguments: %s', args)

workspace = Workspace(args.output)
ingestDataset(args.dataset, workspace)

log.info('Running pipeline...')
expandedDataIds = runApPipe(workspace, args)
_measureFinalProperties(workspace, expandedDataIds, args)
computeMetrics(workspace, expandedDataIds, args)


def runIngestion(cmdLine=None):
1 change: 0 additions & 1 deletion python/lsst/ap/verify/measurements/__init__.py
@@ -1,2 +1 @@
from .compute_metrics import *
from .profiling import TimingMetricConfig, TimingMetricTask