DM-15872: Incorporate AP documentation into pipelines.lsst.io #48

Merged 6 commits on Oct 9, 2018
3 changes: 1 addition & 2 deletions README.md
@@ -8,5 +8,4 @@ Metrics are tested against both project- and lower-level requirements, and will
`ap_verify` is designed to work on downloadable Git LFS datasets, which must be installed separately.
Unlike the Alert Production pipeline itself, it cannot be run on generic data repositories.

For more details, including user instructions and information about supported datasets, consult the [package documentation](https://github.com/lsst-dm/ap_verify/tree/master/doc/lsst.ap.verify).
When `ap_verify` is formally added to the LSST Stack this documentation will be available through the [Science Pipelines](https://pipelines.lsst.io/) documentation.
For more details, including user instructions and information about supported datasets, consult the [package documentation](https://pipelines.lsst.io/v/daily/modules/lsst.ap.verify/).
23 changes: 0 additions & 23 deletions doc/ap_verify/index.rst

This file was deleted.

2 changes: 1 addition & 1 deletion doc/conf.py
@@ -1,9 +1,9 @@
"""Sphinx configuration file for an LSST stack package.

This configuration only affects single-package Sphinx documentation builds.
"""

from documenteer.sphinxconfig.stackconf import build_package_configs

import lsst.ap.verify

_g = globals()
4 changes: 1 addition & 3 deletions doc/index.rst
@@ -9,6 +9,4 @@ ap_verify documentation preview
.. toctree::
:maxdepth: 1

ap_verify/index.rst
lsst.ap.verify/index.rst
lsst.ap.verify.measurements/index.rst
lsst.ap.verify/index
18 changes: 0 additions & 18 deletions doc/lsst.ap.verify.measurements/index.rst

This file was deleted.

37 changes: 21 additions & 16 deletions doc/lsst.ap.verify/command-line-reference.rst
@@ -1,30 +1,38 @@
.. _ap-verify-cmd:
.. py:currentmodule:: lsst.ap.verify

.. program:: ap_verify.py

######################
Command-Line Reference
######################
.. _ap-verify-cmd:

################################
ap_verify command-line reference
################################

This page describes the command-line arguments and environment variables used by :command:`ap_verify.py`.

This page describes the command-line arguments and environment variables used by ``ap_verify``.
.. _ap-verify-cmd-basic:

Signature and syntax
====================

The basic call signature of ``ap_verify`` is:
The basic call signature of :command:`ap_verify.py` is:

.. prompt:: bash

ap_verify.py --dataset DATASET --output WORKSPACE --id DATAID

These three arguments are mandatory; all others are optional.
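
For concreteness, a hypothetical invocation might look like the following (the dataset name and data ID values are illustrative placeholders, not taken from this page):

.. prompt:: bash

   ap_verify.py --dataset HiTS2015 --output workspaces/hits/ --id "visit=412518 ccd=25 filter=g"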

.. _ap-verify-cmd-return:

Status code
===========

``ap_verify`` returns a status code of ``0`` if the pipeline ran to completion.
:command:`ap_verify.py` returns a status code of ``0`` if the pipeline ran to completion.
If the pipeline fails, the status code will be an interpreter-dependent nonzero value.
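
For example, the status code can be inspected from the shell immediately after a run (reusing the placeholder arguments from the signature above):

.. prompt:: bash

   ap_verify.py --dataset DATASET --output WORKSPACE --id DATAID
   echo $?   # prints 0 if the pipeline ran to completion, nonzero otherwise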

.. _ap-verify-cmd-args:

Named arguments
===============

@@ -39,7 +47,7 @@ Required arguments are :option:`--dataset`, :option:`--id`, and :option:`--outpu
Specify data ID to process using data ID syntax.
For example, ``--id "visit=12345 ccd=1 filter=g"``.

Currently this argument is heavily restricted compared to its :ref:`command line task counterpart<command-line-task-dataid-howto>`.
Currently this argument is heavily restricted compared to its :doc:`command line task counterpart</modules/lsst.pipe.base/command-line-task-dataid-howto>`.
In particular, the dataId must specify exactly one visit and exactly one CCD, and may not be left blank to mean "all data".

.. option:: --dataset <dataset_name>
@@ -73,7 +81,7 @@ Required arguments are :option:`--dataset`, :option:`--id`, and :option:`--outpu

**Output metrics file.**

The name of a file to contain the metrics measured by ``ap_verify``, in a format readable by the `lsst.verify` framework.
The name of a file to contain the metrics measured by ``ap_verify``, in a format readable by the :doc:`lsst.verify</modules/lsst.verify/index>` framework.
If omitted, the output will go to a file named :file:`ap_verify.verify.json` in the user's working directory.

This argument can be used to run multiple instances of ``ap_verify`` concurrently, with each instance producing output to a different metrics file.
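
As a sketch (the dataset names, data IDs, and paths below are placeholders), two concurrent runs could be kept separate by giving each its own workspace and metrics file:

.. prompt:: bash

   ap_verify.py --dataset DATASET_A --output workspace/a --id "visit=1111 ccd=1 filter=g" --metrics-file a.verify.json &
   ap_verify.py --dataset DATASET_B --output workspace/b --id "visit=2222 ccd=2 filter=g" --metrics-file b.verify.json &
   wait
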
@@ -95,16 +103,16 @@ Required arguments are :option:`--dataset`, :option:`--id`, and :option:`--outpu

.. note::

Ingestion of `lsst.verify` metrics is not yet supported by SQuaSH, so this flag should always be provided for now.
Ingestion of :doc:`lsst.verify</modules/lsst.verify/index>` metrics is not yet supported by SQuaSH, so this flag should always be provided for now.


.. _command-line-task-envvar:
.. _ap-verify-cmd-envvar:

Environment variables
=====================

The :envvar:`SQUASH_USER`, :envvar:`SQUASH_PASSWORD`, and :envvar:`SQUASH_URL` environment variables are used by :ref:`the verify framework<lsst.verify>` to configure SQuaSH upload.
:envvar:`SQUASH_USER` and :envvar:`SQUASH_PASSWORD` must be defined in any environment where ``ap_verify`` is run unless the :option:`--silent` flag is used.
The :envvar:`SQUASH_USER`, :envvar:`SQUASH_PASSWORD`, and :envvar:`SQUASH_URL` environment variables are used by :doc:`the verify framework</modules/lsst.verify/index>` to configure SQuaSH upload.
:envvar:`SQUASH_USER` and :envvar:`SQUASH_PASSWORD` must be defined in any environment where :command:`ap_verify.py` is run unless the :option:`--silent` flag is used.

.. TODO: remove this once `lsst.verify` documents them, and update the link (DM-12849)

@@ -119,6 +127,3 @@ The :envvar:`SQUASH_USER`, :envvar:`SQUASH_PASSWORD`, and :envvar:`SQUASH_URL` e
.. envvar:: SQUASH_URL

The location for a SQuaSH REST API. Defaults to the SQuaSH server at ``lsst.codes``.
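
For example, these variables might be set in the shell before running :command:`ap_verify.py` (the values shown are placeholders):

.. prompt:: bash

   export SQUASH_USER="my-squash-username"
   export SQUASH_PASSWORD="my-squash-password"
   # SQUASH_URL is optional; if unset, the default SQuaSH server at lsst.codes is used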

.. _command-line-task-envvar-examples:

15 changes: 8 additions & 7 deletions doc/lsst.ap.verify/configuration.rst
@@ -1,11 +1,13 @@
.. py:currentmodule:: lsst.ap.verify

.. _ap-verify-configuration:

############################
Configuration File Reference
############################
######################################
ap_verify configuration file reference
######################################

This page describes the file-based configuration options used by ``ap_verify``.
Most users should not need to adjust these settings, but they allow capabilities such as registering new :ref:`datasets<ap-verify-datasets>`.
Most users should not need to adjust these settings, but they allow capabilities such as registering new :doc:`datasets<datasets>`.

.. TODO: more generic name? or split up file? (DM-12850)

@@ -17,9 +19,9 @@ It consists of a list of dictionaries, each representing specific aspects of the
datasets
========

The ``datasets`` dictionary maps dataset names (which must be provided on the ``ap_verify`` command line) to GitHub repository names.
The ``datasets`` dictionary maps dataset names (which must be provided on the :command:`ap_verify.py` command line) to GitHub repository names.
Adding a dataset to the config is necessary for ``ap_verify`` to recognize it; in practice, the entry will be made once by the dataset author and then committed.
A dataset must still be :ref:`installed<ap-verify-datasets-install>` on the machine before it can be used.
A dataset must still be :doc:`installed<datasets-install>` on the machine before it can be used.

.. _ap-verify-configuration-measurements:

@@ -39,4 +41,3 @@ Currently there is only one:
A dictionary from tasks to the metrics that time them.
Subtasks must be identified by the name the parent task assigns them, and should be prefixed by the parent task name (as in "imageDifference:detection") to avoid ambiguity.
Metrics must use the full name following the convention of `lsst.verify.metrics`, as in "meas_algorithms.SourceDetectionTime".

5 changes: 3 additions & 2 deletions doc/lsst.ap.verify/datasets-butler.rst
@@ -1,7 +1,9 @@
.. py:currentmodule:: lsst.ap.verify

.. _ap-verify-datasets-butler:

################################
Datasets vs. Butler Repositories
Datasets vs. Butler repositories
################################

Datasets are organized using a :ref:`specific directory structure<ap-verify-datasets-structure>` instead of an :ref:`LSST Butler repository<butler>`.
@@ -15,4 +17,3 @@ The dataset format merely requires that files be segregated into science and cal
While datasets are not Butler repositories themselves, the dataset format includes a directory, :file:`repo`, that serves as a template for the post-ingestion repository.
This template helps ensure that all repositories based on the dataset will be properly set up, in particular that any observatory-specific settings will be applied.
:file:`repo` is never modified by ``ap_verify``; all repositories created by the pipeline must be located elsewhere, whether or not they are backed by the file system.

37 changes: 20 additions & 17 deletions doc/lsst.ap.verify/datasets-creation.rst
@@ -1,19 +1,21 @@
.. py:currentmodule:: lsst.ap.verify

.. _ap-verify-datasets-creation:

.. _ap-verify-datasets-structure:

###########################
Packaging Data as a Dataset
Packaging data as a dataset
###########################

:ref:`ap-verify-datasets` is designed to be as generic as possible, and should be able to accommodate any collection of observations so long as the source observatory has an :ref:`observatory interface (obs) package<obs-framework>` in the LSST software stack.
:doc:`datasets` is designed to be as generic as possible, and should be able to accommodate any collection of observations so long as the source observatory has an :ref:`observatory interface (obs) package<obs-framework>` in the LSST software stack.
This page describes how to create and maintain a dataset.
It does not include :ref:`configuring ap_verify to use the dataset<ap-verify-configuration>`.
It does not include :ref:`configuring ap_verify to use the dataset<ap-verify-configuration-dataset>`.

.. _ap-verify-datasets-creation-gitlfs:

Creating a Dataset Repository
-----------------------------
Creating a dataset repository
=============================

Datasets are Git LFS repositories with a particular directory and file structure.
The easiest way to create a new dataset is to `create an LFS repository <https://developer.lsst.io/git/git-lfs.html#git-lfs-create>`_, and add a copy of the `dataset template repository`_ as the initial commit.
@@ -23,8 +25,8 @@ This will create empty directories for all data and will add placeholder files f

.. _ap-verify-datasets-creation-layout:

Organizing the Data
-------------------
Organizing the data
===================

* The :file:`raw` and :file:`calib` directories contain science and calibration data, respectively.
The directories may have any internal structure.
@@ -36,8 +38,8 @@ The templates and reference catalogs need not be all-sky, but should cover the c

.. _ap-verify-datasets-creation-docs:

Documenting Datasets
--------------------
Documenting datasets
====================

Datasets provide package-level documentation in their :file:`doc` directory.
An example is provided in the `dataset template repository`_.
@@ -50,10 +52,10 @@ The dataset's package-level documentation should include:

.. _ap-verify-datasets-creation-config:

Configuring Dataset Ingestion
-----------------------------
Configuring dataset ingestion
=============================

Each dataset's :file:`config` directory should contain a :ref:`task config file<command-line-task-config-howto-configfile>` named :file:`datasetIngest.py`, which specifies an `lsst.ap.verify.DatasetIngestConfig`.
Each dataset's :file:`config` directory should contain a :ref:`task config file<command-line-task-config-howto-configfile>` named :file:`datasetIngest.py`, which specifies a `DatasetIngestConfig`.
The file typically contains filenames or file patterns specific to the dataset.
In particular, defect files and reference catalogs are ignored by default and need to be explicitly named.

@@ -64,8 +66,8 @@ Configuration settings specific to an instrument rather than a dataset should be

.. _ap-verify-datasets-creation-obs:

Registering an Observatory Package
----------------------------------
Registering an observatory package
==================================

The observatory package must be named in two files:

Expand All @@ -77,8 +79,9 @@ The observatory package must be named in two files:

.. _ap-verify-datasets-creation-name:

Registering a Dataset Name
--------------------------
Registering a dataset name
==========================

In order to be supported by ``ap_verify``, datasets must be registered in ``ap_verify``'s :ref:`configuration file<ap-verify-configuration-dataset>` and registered as an *optional* EUPS dependency of ``ap_verify``.
In order to be supported by ``ap_verify``, datasets must be registered in ``ap_verify``'s :ref:`configuration file<ap-verify-configuration-dataset>`.
Comment from a reviewer (Member):
Just FYI, possessives of inline syntax often need to be escaped, like this:

``ap_verify``\ ’s

I haven't compiled this yet (long story) so I can't say if the rendering is mangled in this case or not.

Reply from the author (Member Author):
It compiles fine. I think it's the single backticks that you need to watch out for.

The line for the new dataset should be committed to the ``ap_verify`` Git repository.
To avoid accidental downloads, datasets **should not** be registered as an EUPS dependency of ``ap_verify``, even an optional one.
15 changes: 8 additions & 7 deletions doc/lsst.ap.verify/datasets-install.rst
@@ -1,14 +1,16 @@
.. py:currentmodule:: lsst.ap.verify

.. _ap-verify-datasets-install:

###################
Installing Datasets
Installing datasets
###################

:ref:`ap-verify-datasets` packages data in self-contained units that are intended to be easy to install for LSST Stack users.
:doc:`datasets` packages data in self-contained units that are intended to be easy to install for LSST Stack users.
It is not necessary to install all datasets supported by ``ap_verify``, only those you intend to use.

Prerequisites
-------------
=============

The Dataset framework requires that the computer have version 13.0 or later of the LSST Stack (specifically, the ``obs`` packages and their dependencies) installed.
:ref:`Installing lsst_distrib <part-installation>` is the simplest way to ensure all dependencies are satisfied.
@@ -19,18 +21,17 @@ EUPS is included in the Stack installation.
.. _Git LFS: https://developer.lsst.io/tools/git_lfs.html
.. _EUPS: https://developer.lsst.io/build-ci/eups_tutorial.html
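
As an optional sanity check (a suggestion, not a step listed on this page), you can confirm that Git LFS and the Stack are available before requesting any data:

.. prompt:: bash

   git lfs version            # Git LFS is required for dataset downloads
   eups list lsst_distrib     # verifies that an LSST Stack installation is visible to EUPS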

Installation Procedure
----------------------
Installation procedure
======================

Use the `LSST Software Build Tool <https://developer.lsst.io/stack/lsstsw.html>`_ to request the dataset by its package name.
A :ref:`list of existing datasets <ap-verify-datasets-index>` is maintained as part of this documentation.
Because of their large size (typically hundreds of GB), datasets are *never* installed as a dependency of another package; they must be requested explicitly.

For example, to install the :ref:`HiTS 2015 <ap_verify_hits2015-package>` dataset,
For example, to install the `HiTS 2015 <https://github.com/lsst/ap_verify_hits2015/>`_ dataset,

.. prompt:: bash

rebuild -u ap_verify_hits2015

Once this is done, ``ap_verify`` will be able to find the HiTS data upon request.
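
Assuming the dataset package is managed by EUPS once the rebuild completes (an assumption; the exact EUPS tag to use is not specified here), it can then be set up in the current shell:

.. prompt:: bash

   eups list ap_verify_hits2015   # confirm the newly built dataset is visible to EUPS
   setup ap_verify_hits2015       # add -t <eups_tag> if the package is not tagged current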