
DM-11592: Better documentation for ap_verify #12

Merged
kfindeisen merged 12 commits into master from tickets/DM-11592 on Dec 14, 2017

Conversation

kfindeisen (Member)

This PR adds new-style documentation to the ap_verify package. Since the package is not a library and has a minimal public-facing API, most of the documentation is instructions for ap_verify users and guides to key ap_verify concepts (in particular, datasets).

There are a number of cross-references to the other ap packages, the verify documentation, the pipe_base documentation, and the butler and obs framework topic documentation. Naturally, these references will need to be updated once these documents actually exist.

Since new-style documentation is not yet built by Travis, I don't have an HTML version of the documentation to link to.

jonathansick (Member) left a comment:

@kfindeisen, looks really good. I've added some suggested changes as line comments.

ap_verify
#########

The ``ap_verify`` package wraps :py:mod:`ap_pipe <lsst.ap.pipe>` with support for managing :py:mod:`verify <lsst.verify>` metrics.
jonathansick (Member):

For "verify" use the actual module namespace for clarity:

for managing `lsst.verify` metrics.

Note that single back-ticks can be used for any type of Python API reference (being the default role).
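For illustration, a minimal sketch of the two equivalent markups, assuming the build sets a Python-aware default role as described above:

.. Explicit module role:

support for managing :py:mod:`lsst.verify` metrics

.. Single back-ticks, relying on the default role:

support for managing `lsst.verify` metrics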

kfindeisen (Member Author):

So :mod: is also redundant?

-------------

The Dataset framework requires that the computer have version 13.0 or later of the LSST Stack (specifically, the ``obs`` packages and their dependencies) installed.
`Installing lsst_distrib <https://pipelines.lsst.io/#installation>`_ is the simplest way to ensure all dependencies are satisfied.
jonathansick (Member):

:ref:`Installing lsst_distrib <part-installation>` [...]
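For context, a sketch of how such a label-based reference is wired up in Sphinx (only the ``part-installation`` label comes from the suggestion above; the heading text is illustrative):

.. _part-installation:

Installation
============

and, wherever the link is needed:

:ref:`Installing lsst_distrib <part-installation>`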


The framework also requires `Git LFS`_ and the `EUPS`_ package management system. The latter is usually included by the Stack installation process.
jonathansick (Member):

Second sentence:

EUPS is included in the Stack installation.


.. TODO: should we have a proper versioning system for datasets? (DM-12853)

Once the dataset has been installed, use ``eups declare`` to register the downloaded directory.
jonathansick (Member):

Try to wrap inline shell commands in the command role, like

:command:`eups declare`
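Applied to the sentence above, that might look like the following (the :command: markup is the point; the exact eups invocation is a hypothetical example, not taken from this PR):

Once the dataset has been installed, use :command:`eups declare` to register the downloaded directory:

.. code-block:: sh

   # product name and version here are hypothetical
   eups declare -r mydata ap_verify_hits2015 v1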


.. code-block:: sh

   $ git clone https://github.com/lsst/ap_verify_hits2015 mydata
jonathansick (Member):

I've been leaving out any $ or > characters to reduce copy-paste errors. There are special directives for making uncopyable prompts but I've had some trouble with them.
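For comparison, a sketch of both styles; the prompt directive is from the sphinx-prompt extension, which this PR itself uses further down:

.. Plain code block, no prompt character, safe to copy-paste:

.. code-block:: sh

   git clone https://github.com/lsst/ap_verify_hits2015 mydata

.. Rendered with an uncopyable prompt:

.. prompt:: bash

   git clone https://github.com/lsst/ap_verify_hits2015 mydata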


``ap_verify`` is a Python script designed to be run on both developer machines and verification servers.
This page describes the minimum options needed to run ``ap_verify``.
For more details, see the :ref:`ap-verify-cmd` or run ``ap_verify`` with the ``-h`` argument.
jonathansick (Member):

[...] or run  :option:`ap_verify.py -h`.

(links to the reference docs).
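For the :option: link to resolve, the command-line reference page would declare the program and its option with the standard Sphinx directives, roughly (a sketch; the actual reference page in this PR may differ):

.. program:: ap_verify.py

.. option:: -h, --help

   Print usage information and exit.

after which :option:`ap_verify.py -h` cross-references the entry above.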

"""Measure any metrics that apply to the final result of the AP pipeline,
rather than to a particular processing stage.

Parameters
----------
metadata: `lsst.daf.base.PropertySet`
    The metadata produced by the AP pipeline.
-metrics_job: `verify.Job`
+metricsJob: `verify.Job`
jonathansick (Member):

lsst.verify.Job

— needs the actual namespace to make the link.

kfindeisen (Member Author):

Huh. It looks like the dev guide contradicts itself here:

Does the link generation actually work differently in these two contexts?

jonathansick (Member):

Ah, I may need to update those docs. I guess we'll see if the link works when it's integrated into pipelines.lsst.io.

kfindeisen (Member Author) commented Dec 6, 2017:

Related question: should exceptions also be fully qualified? There are a few that are not in __builtins__, and the dev guide doesn't say.

jonathansick (Member):

Yes, fully-qualified.
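So a non-builtin exception gets its full namespace in the Raises section, for example (the exception class and description here are illustrative, not taken from ap_verify):

Raises
------
lsst.pex.exceptions.NotFoundError
    Raised if the requested data ID is not present in the repository.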

-def measure_from_metadata(metadata):
-    """Attempts to compute all known metrics on Task metadata.
+def measureFromMetadata(metadata):
+    """Attempt to compute all known metrics on Task metadata.
jonathansick (Member):

Leave out the "Attempt to" part.


Raises
------
-`RuntimeError`:
-    the config file exists, but does not contain the expected data
+RuntimeError:
jonathansick (Member):

No colon after the exception in the Raises section

Raises
------
RuntimeError
    The ``ap_verify`` configuration file exists, but does not contain the
    expected data under ``measurements``.

(check the Raises sections of other docstrings)

The fully qualified name of the metric being measured, e.g.,
"pipe_tasks.ProcessCcdTime"

Returns
-------
-an `lsst.verify.Measurement` for `metric_name`, or `None` if the timing
-information for `task_name` is not present in `metadata`
+an `lsst.verify.Measurement` for `metricName`, or `None` if the timing
jonathansick (Member):

Returns is formatted like Parameters, even if you need to make up the name of the return variable. For example, I'd write this one as:

Returns
-------
measurement : `lsst.verify.Measurement`
    The metric measurement, or `None` if timing information for ``taskName`` is not present in the ``metadata``.

kfindeisen force-pushed the tickets/DM-11592 branch 2 times, most recently from 4f8ce9a to 32fd6aa on December 6, 2017.
ebellm (Contributor) left a comment:

Looks good overall--just need to make a clarification about parallelization.


.. prompt:: bash

   python ap_verify/bin/ap_verify.py --dataset HiTS2015 --output workspace/hits/ --dataIdString "visit=54123 ccd=25 filter=g" --silent
ebellm (Contributor):

I would note here that running ap_verify in parallel causes the metric output to be clobbered, and reference DM-13042 (which is the ticket to fix that).
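Such a note might read roughly as follows (a sketch of possible wording; the warning text actually adopted is quoted later in this thread):

.. warning::

   Running multiple instances of ``ap_verify.py`` concurrently will clobber
   the metric output; see DM-13042.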

How to Use Measurements of Metrics
----------------------------------

After ``ap_verify`` has run, it will produce a file named :file:`ap_verify.verify.json` in the working directory.
ebellm (Contributor):

I would make the same comment here about parallelization also.

kfindeisen (Member Author):

A compiled version [zip] of the latest docs.


.. warning::

   ``ap_verify.py`` is not designed to have multiple instances running concurrently.
ebellm (Contributor):

how about "...does not currently support running multiple instances concurrently." instead.

kfindeisen (Member Author):

Sure.

kfindeisen merged commit 9040f71 into master on Dec 14, 2017.
kfindeisen deleted the tickets/DM-11592 branch on November 30, 2018.