DM-11592: Better documentation for ap_verify #12
Merged
12 commits
- baa0390 Conform to LSST naming style. (kfindeisen)
- 01f0355 Skeleton for Sphinx package documentation. (kfindeisen)
- 2684204 Write ap_verify package doc. (kfindeisen)
- c24be65 Add dataset documentation. (kfindeisen)
- 7c8efb6 Transfer how-to guide. (kfindeisen)
- b20d5e5 Add command-line reference. (kfindeisen)
- 85e5a99 Add config documentation. (kfindeisen)
- d93739c Discuss error-handling policy. (kfindeisen)
- 48e53a7 Add overview. (kfindeisen)
- c9fd80c Add ap.verify.measurements documentation. (kfindeisen)
- 6ea59d6 Clean up API documentation. (kfindeisen)
- 9040f71 Fix API visibility bug. (kfindeisen)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
.. _ap_verify-package: | ||
|
||
######### | ||
ap_verify | ||
######### | ||
|
||
The ``ap_verify`` package wraps `lsst.ap.pipe` with support for managing `lsst.verify` metrics. | ||
It allows `lsst.ap.pipe` to be run on standardized data repositories provided by the :ref:`dataset framework<ap-verify-datasets>`. | ||
|
||
Project info | ||
============ | ||
|
||
Repository | ||
https://github.com/lsst-dm/ap_verify | ||
|
||
JIRA component | ||
`ap_verify <https://jira.lsstcorp.org/browse/DM/component/14167>`_ | ||
|
||
Modules | ||
======= | ||
|
||
- :ref:`lsst.ap.verify <lsst.ap.verify>` | ||
- :ref:`lsst.ap.verify.measurements <lsst.ap.verify.measurements>` | ||
|
||
.. NOTE: Need pid and issuetype | ||
.. _`Create a ticket`: https://jira.lsstcorp.org/secure/CreateIssueDetails!init.jspa?pid=&issuetype=&components=14167 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
############################### | ||
ap_verify documentation preview | ||
############################### | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
|
||
ap_verify/index.rst | ||
lsst.ap.verify/index.rst | ||
lsst.ap.verify.measurements/index.rst | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
.. currentmodule:: lsst.ap.verify.measurements | ||
|
||
.. _lsst.ap.verify.measurements: | ||
|
||
########################### | ||
lsst.ap.verify.measurements | ||
########################### | ||
|
||
The ``lsst.ap.verify.measurements`` package provides implementation code for metrics defined for the AP pipeline. | ||
It exposes functions that measure all applicable metrics from task metadata or processed Butler repositories. | ||
The set of metrics measured is deliberately kept opaque, so that ``ap_verify`` itself need not be modified every time a new metric is implemented. | ||
|
||
.. _lsst-ap-verify-measurements-overview: | ||
|
||
Python API reference | ||
==================== | ||
|
||
.. automodapi:: lsst.ap.verify.measurements | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,141 @@ | ||
.. _ap-verify-cmd: | ||
|
||
.. program:: ap_verify.py | ||
|
||
###################### | ||
Command-Line Reference | ||
###################### | ||
|
||
This page describes the command-line arguments and environment variables used by ``ap_verify``. | ||
|
||
Signature and syntax | ||
==================== | ||
|
||
The basic call signature of ``ap_verify`` is: | ||
|
||
.. code-block:: sh | ||
|
||
python ap_verify.py --dataset DATASET --output OUTPUTREPO --dataIdString DATAID | ||
|
||
These three arguments are mandatory, although ``--rerun`` may be given in place of ``--output``; all others are optional. | ||
|
||
Status code | ||
=========== | ||
|
||
.. TODO: should we require that ap_verify and ap_pipe follow the CmdLineTask convention? (DM-12853) | ||
|
||
``ap_verify`` returns a status code of ``0`` if the pipeline ran to completion. | ||
If the pipeline fails, the status code will be an interpreter-dependent nonzero value. | ||
|
||
Named arguments | ||
=============== | ||
|
||
Required arguments are :option:`--dataset`, :option:`--dataIdString`, and exactly one of :option:`--output` or :option:`--rerun`. | ||
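The requirement described above (both `--dataset` and `--dataIdString`, plus exactly one of `--output` or `--rerun`) can be sketched with the standard `argparse` module. This is a minimal illustration only, not the actual `ap_verify` parser; the function name `build_parser`, the help strings, and the placeholder values `EXAMPLE_DATASET` and `out_repo` are assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a hypothetical parser that mirrors the documented arguments."""
    parser = argparse.ArgumentParser(prog="ap_verify.py")
    parser.add_argument("--dataset", required=True,
                        help="Input dataset designation.")
    parser.add_argument("--dataIdString", required=True,
                        help="Butler data ID.")
    # Exactly one of --output or --rerun must be supplied.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--output", help="Output data repository URI or path.")
    group.add_argument("--rerun", help='Output "rerun".')
    parser.add_argument("-j", "--processes", type=int, default=1,
                        help="Number of processes to use.")
    parser.add_argument("--silent", action="store_true",
                        help="Do not report measurements to SQuaSH.")
    return parser


# Placeholder values; EXAMPLE_DATASET is not a real dataset name.
args = build_parser().parse_args([
    "--dataset", "EXAMPLE_DATASET",
    "--output", "out_repo",
    "--dataIdString", "visit=12345 ccd=1 filter=g",
])
```

With this layout, `argparse` itself rejects a command line that gives both `--output` and `--rerun`, or neither, so no extra validation code is needed for that rule.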
|
||
.. option:: --dataIdString <dataId> | ||
|
||
**Butler data ID.** | ||
|
||
The input data ID is required for all ``ap_verify`` runs except when using :option:`--help` or :option:`--version`. | ||
|
||
Specify the data ID to process using data ID syntax. | ||
For example, ``--dataIdString "visit=12345 ccd=1 filter=g"``. | ||
|
||
Currently this argument is heavily restricted compared to its :ref:`command line task counterpart<command-line-task-dataid-howto>`. | ||
In particular, the dataId must specify exactly one visit and exactly one CCD, and may not be left blank to mean "all data". | ||
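The restriction above can be illustrated with a small validation helper. This is a sketch under stated assumptions, not the code `ap_verify` uses: the function name `parse_data_id` is hypothetical, values are kept as strings, and the check for `^` and `..` assumes those are the multi-value and range markers of the command-line task data ID syntax.

```python
def parse_data_id(data_id_string):
    """Parse 'key=value' pairs and require exactly one visit and one CCD.

    Hypothetical helper for illustration only.
    """
    data_id = {}
    for token in data_id_string.split():
        key, sep, value = token.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"Malformed data ID component: {token!r}")
        if key in data_id:
            raise ValueError(f"Data ID key given more than once: {key!r}")
        # Assumed markers for multiple values ('^') and ranges ('..').
        if "^" in value or ".." in value:
            raise ValueError(f"Only a single value is allowed for {key!r}")
        data_id[key] = value
    for required in ("visit", "ccd"):
        if required not in data_id:
            raise ValueError(f"Data ID must specify exactly one {required}")
    return data_id


example = parse_data_id("visit=12345 ccd=1 filter=g")
```

An empty string fails the `visit` check, which matches the rule that the data ID may not be left blank to mean "all data".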
|
||
.. option:: --dataset <dataset_name> | ||
|
||
**Input dataset designation.** | ||
|
||
The input dataset is required for all ``ap_verify`` runs except when using :option:`--help` or :option:`--version`. | ||
|
||
The argument is a unique name for the dataset, which can be associated with a repository in the :ref:`configuration file<ap-verify-configuration-dataset>`. | ||
See :ref:`ap-verify-dataset-name` for more information on dataset names. | ||
|
||
Allowed names can be queried using the :option:`--help` argument. | ||
|
||
.. option:: -h, --help | ||
|
||
**Print help.** | ||
|
||
The help is equivalent to this documentation page, describing command-line arguments. | ||
|
||
.. option:: -j <processes>, --processes <processes> | ||
|
||
**Number of processes to use.** | ||
|
||
When ``processes`` is larger than 1, the pipeline may use the Python `multiprocessing` module to parallelize processing of multiple datasets across multiple processors. | ||
|
||
.. note:: | ||
|
||
This option is provided for forward-compatibility, but is not yet supported by ``ap_pipe``. | ||
|
||
.. option:: --output <output_repo> | ||
|
||
**Output data repository URI or path.** | ||
|
||
Either the output repository or :option:`--rerun` is required for all ``ap_verify`` runs except when using :option:`--help` or :option:`--version`. | ||
|
||
The output data repository will be created if it does not exist. | ||
The path may be absolute or relative to the current working directory. | ||
|
||
``--output`` may not be used with the :option:`--rerun` argument. | ||
|
||
See :doc:`command-line-task-data-repo-howto` for background. | ||
|
||
.. TODO: I think the --rerun argument may have been a mistake -- it's almost entirely not quite unlike its command line task equivalent (DM-12853) | ||
|
||
.. option:: --rerun <output> | ||
|
||
**Specify output "rerun".** | ||
|
||
The rerun or :option:`--output` is required for all ``ap_verify`` runs except when using :option:`--help` or :option:`--version`. | ||
|
||
For ``ap_verify``, a rerun is an output repository relative to the dataset directory (as determined by :option:`--dataset`). | ||
This is different from command-line task reruns, which have an input repository and chain the rerun to it. | ||
An input rerun cannot be specified. | ||
|
||
``--rerun`` may not be used with the :option:`--output` argument. | ||
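The difference between `--output` and `--rerun` can be sketched as a path-resolution rule. This is an assumption-laden illustration: the function name `resolve_output_repo` is hypothetical, and joining the rerun name directly onto the dataset directory is only one reading of "relative to the dataset directory"; the real layout may differ.

```python
import os


def resolve_output_repo(dataset_dir, output=None, rerun=None):
    """Return the output repository path for one of --output or --rerun.

    Hypothetical helper; the rerun layout is an assumption.
    """
    if (output is None) == (rerun is None):
        raise ValueError("Specify exactly one of output or rerun")
    if rerun is not None:
        # A rerun is interpreted relative to the dataset directory.
        return os.path.normpath(os.path.join(dataset_dir, rerun))
    # An output path is absolute or relative to the working directory.
    return os.path.abspath(output)


rerun_path = resolve_output_repo(os.path.join("datasets", "example"), rerun="run1")
```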
|
||
.. option:: --silent | ||
|
||
**Do not report measurements to SQuaSH.** | ||
|
||
Disables upload of measurements, so that ``ap_verify`` can be run for testing purposes by developers. | ||
|
||
.. note:: | ||
|
||
Ingestion of `lsst.verify` metrics is not yet supported by SQuaSH, so this flag should always be provided for now. | ||
|
||
.. option:: --version | ||
|
||
**Print version number.** | ||
|
||
Since ``ap_verify`` is not yet officially part of the Stack, the version number is arbitrary. | ||
|
||
|
||
.. _command-line-task-envvar: | ||
|
||
Environment variables | ||
===================== | ||
|
||
The :envvar:`SQUASH_USER`, :envvar:`SQUASH_PASSWORD`, and :envvar:`SQUASH_URL` environment variables are used by :ref:`the verify framework<lsst.verify>` to configure SQuaSH upload. | ||
:envvar:`SQUASH_USER` and :envvar:`SQUASH_PASSWORD` must be defined in any environment where ``ap_verify`` is run unless the :option:`--silent` flag is used. | ||
|
||
.. TODO: remove this once `lsst.verify` documents them, and update the link (DM-12849) | ||
|
||
.. envvar:: SQUASH_USER | ||
|
||
User name to use for SQuaSH submissions. | ||
|
||
.. envvar:: SQUASH_PASSWORD | ||
|
||
Unencrypted password for :envvar:`SQUASH_USER`. | ||
|
||
.. envvar:: SQUASH_URL | ||
|
||
The location for a SQuaSH REST API. Defaults to the SQuaSH server at ``lsst.codes``. | ||
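A pre-flight check for the variables described above might look like the following. It is a minimal sketch, not part of `ap_verify` or `lsst.verify`: the function name `missing_squash_variables` is hypothetical, and the `environ` parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def missing_squash_variables(silent, environ=None):
    """List required SQuaSH variables that are unset or empty.

    Hypothetical helper. Nothing is required when ``silent`` is true,
    and SQUASH_URL is never required because it has a default.
    """
    if silent:
        return []
    environ = os.environ if environ is None else environ
    required = ("SQUASH_USER", "SQUASH_PASSWORD")
    return [name for name in required if not environ.get(name)]


# Placeholder credentials for illustration only.
missing = missing_squash_variables(False, {"SQUASH_USER": "someone"})
```

A caller could report the returned names and exit before any processing starts, rather than failing at upload time.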
|
||
.. _command-line-task-envvar-examples: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
.. _ap-verify-configuration: | ||
|
||
############################ | ||
Configuration File Reference | ||
############################ | ||
|
||
This page describes the file-based configuration options used by ``ap_verify``. | ||
Most users should not need to adjust these settings, but they allow capabilities such as registering new :ref:`datasets<ap-verify-datasets>`. | ||
|
||
.. TODO: more generic name? or split up file? (DM-12850) | ||
|
||
The ``ap_verify`` configuration file is located at :file:`config/dataset_config.yaml`. | ||
It consists of a list of dictionaries, each representing specific aspects of the program. | ||
|
||
.. _ap-verify-configuration-dataset: | ||
|
||
datasets | ||
======== | ||
|
||
The ``datasets`` dictionary maps dataset names (which must be provided on the ``ap_verify`` command line) to GitHub repository names. | ||
Adding a dataset to the config is necessary for ``ap_verify`` to recognize it; in practice, the entry will be made once by the dataset author and then committed. | ||
A dataset must still be :ref:`installed<ap-verify-datasets-install>` on the machine before it can be used. | ||
|
||
.. _ap-verify-configuration-measurements: | ||
|
||
measurements | ||
============ | ||
|
||
.. warning:: | ||
|
||
The metrics being used by ``ap_verify`` are still being defined. | ||
The syntax used to register them will likely change, and may be moved to a dedicated package entirely. | ||
This section of the configuration file should be treated as preliminary and subject to change. | ||
|
||
The ``measurements`` dictionary contains sub-dictionaries for each kind of metric. | ||
Currently there is only one: | ||
|
||
``timing`` | ||
A dictionary from tasks to the metrics that time them. | ||
Subtasks must be identified by the name the parent task assigns them, and should be prefixed by the parent task name (as in ``imageDifference:detection``) to avoid ambiguity. | ||
Metrics must use the full name following the convention of `lsst.verify.metrics`, as in ``meas_algorithms.SourceDetectionTime``. | ||
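The two mappings described in this file can be pictured as nested Python dictionaries. This is a sketch of the described structure only, not the contents or exact layout of `config/dataset_config.yaml`: the dataset name `ExampleDataset`, the repository name `example_dataset_repo`, and the helper `split_task_name` are hypothetical, while the task and metric names come from the text above.

```python
# Illustrative structure; dataset and repository names are placeholders.
config = {
    "datasets": {
        "ExampleDataset": "example_dataset_repo",
    },
    "measurements": {
        "timing": {
            "imageDifference:detection": "meas_algorithms.SourceDetectionTime",
        },
    },
}


def split_task_name(task_key):
    """Split 'parent:subtask' into its parts; parent is None if absent."""
    parent, sep, subtask = task_key.partition(":")
    return (parent, subtask) if sep else (None, parent)


# Pair each timed task with the package and metric that measure it.
timing = {
    split_task_name(task): tuple(metric.split(".", 1))
    for task, metric in config["measurements"]["timing"].items()
}
```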
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
.. _ap-verify-datasets-butler: | ||
|
||
################################ | ||
Datasets vs. Butler Repositories | ||
################################ | ||
|
||
Datasets are organized using a :ref:`specific directory structure<ap-verify-datasets-structure>` instead of an :ref:`LSST Butler repository<butler>`. | ||
This is by design: | ||
:ref:`ingestion of observatory files into a repository<ingest>` is considered part of the pipeline system being tested by ``ap_verify``, so ``ap_verify`` must be fed uningested data as its input. | ||
The ingestion step creates a valid repository that is then used by the rest of the pipeline. | ||
|
||
.. TODO: depends on whether subdirectories need a particular structure (DM-12851) | ||
|
||
A secondary benefit of this approach is that dataset maintainers do not need to manually ensure that the Git repository associated with a dataset remains a valid Butler repository despite changes to the dataset. | ||
The dataset format merely requires that files be segregated into appropriate directories, a much looser integrity constraint. | ||
|
||
While datasets are not Butler repositories themselves, the dataset format includes a directory, :file:`repo`, that serves as a template for the post-ingestion repository. | ||
This template helps ensure that all repositories based on the dataset will be properly set up, in particular that any observatory-specific settings will be applied. | ||
:file:`repo` is never modified by ``ap_verify``; all repositories created by the pipeline must be located elsewhere, whether or not they are backed by the file system. | ||
|
I think it'd be useful to add
here.
This way the options documented below can be uniquely referenced as
or
for short
(http://www.sphinx-doc.org/en/stable/domains.html?highlight=option#directive-program)
Ok. I was using https://raw.githubusercontent.com/lsst/pipe_base/master/doc/lsst.pipe.base/command-line-task-argument-reference.rst as a template, which doesn't have that.
Yeah, maybe I could add `program`, but in that case there was no specific script being referenced, so it seemed safe to have the flags default to pointing to that command-line task page rather than being namespaced.