Running ap_verify From the Command Line¶
ap_verify is a Python script designed to be run on both developer machines and verification servers. While ap_verify is not a command-line task, the command-line interface is designed to resemble that of command-line tasks where practical.
This page describes the minimum options needed to run ap_verify. For more details, see the Command-Line Reference or run ap_verify.py -h.
Datasets as Input Arguments¶
Since ap_verify begins with an uningested dataset, the input argument is a dataset name rather than a repository.
Datasets are identified by a name that is mapped to an eups-registered directory containing the data. The mapping is configurable.
The dataset names are a placeholder for a future data repository versioning system, and may be replaced in a later version of ap_verify.
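As an illustration, the following shell commands sketch how a dataset directory might be registered with eups before running ap_verify. The repository URL and product name here are assumptions for this sketch, not something this page prescribes:

# Hypothetical setup: clone the HiTS 2015 dataset package and register it
# with eups so its name can be resolved to this directory.
git clone https://github.com/lsst/ap_verify_hits2015.git
setup -j -r ap_verify_hits2015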
How to Run ap_verify in a New Workspace¶
Using the HiTS 2015 dataset as an example, one can run ap_verify as follows:
python ap_verify/bin/ap_verify.py --dataset HiTS2015 --output workspace/hits/ --id "visit=54123 ccd=25 filter=g" --silent
Here:
- HiTS2015 is the dataset name,
- workspace/hits/ is the location of the Butler repository in which the pipeline will work,
- visit=54123 ccd=25 filter=g is the dataId to process, and
- --silent disables SQuaSH metrics reporting.
This will create a workspace (a Butler repository) in workspace/hits/ based on <hits-data>/data/, ingest the HiTS data into it, then run visit 54123 through the entire AP pipeline.
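Once the run completes, the workspace can be inspected like any other Butler repository. The following is a minimal sketch assuming the Gen 2 Butler API and a deepDiff_diaSrc dataset type; both are assumptions about the pipeline's outputs rather than something this page guarantees:

# A minimal sketch for inspecting the workspace after a run.
# Assumes the Gen 2 Butler (lsst.daf.persistence) and that the pipeline
# wrote a difference-image source catalog under "deepDiff_diaSrc".
from lsst.daf.persistence import Butler

butler = Butler("workspace/hits/")  # the --output workspace from above
diaSources = butler.get("deepDiff_diaSrc", visit=54123, ccd=25)
print("Found {} DIA sources".format(len(diaSources)))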
Note
The command-line interface for ap_verify is at present much more limited than those of command-line tasks.
In particular, only file-based repositories are supported, and compound dataIds cannot be provided.
See the Command-Line Reference for details.
Warning
ap_verify.py does not support running multiple instances concurrently.
Attempting to run two or more instances, particularly from the same working directory, may cause them to compete for access to the workspace or to overwrite each other's metrics.
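If multiple dataIds need processing, a safe pattern is to run them one at a time, each with its own workspace. The second visit number below is purely illustrative:

# Run sequentially, one workspace per run, to avoid the contention
# described above (the second visit number is illustrative only).
python ap_verify/bin/ap_verify.py --dataset HiTS2015 --output workspace/hits-a/ --id "visit=54123 ccd=25 filter=g" --silent
python ap_verify/bin/ap_verify.py --dataset HiTS2015 --output workspace/hits-b/ --id "visit=54124 ccd=25 filter=g" --silent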
How to Run ap_verify in the Dataset Directory¶
It is also possible to place a workspace in a subdirectory of a dataset directory. The syntax for this mode is:
python python/lsst/ap/verify/ap_verify.py --dataset HiTS2015 --rerun run1 --id "visit=54123 ccd=25 filter=g" --silent
The --rerun run1 argument will create a workspace in <hits-data>/rerun/run1/.
Since datasets are not, in general, repositories, the --rerun parameter only superficially resembles the analogous argument for command-line tasks.
In particular, ap_verify's --rerun does not support repository chaining (as in --rerun input:output); the input for ap_verify will always be determined by the --dataset.
How to Use Measurements of Metrics¶
After ap_verify has run, it will produce a file named ap_verify.verify.json in the working directory.
This file contains metric measurements in lsst.verify format, and can be loaded and read as described in the lsst.verify documentation or in SQR-019.
The file name is currently hard-coded, but may be customizable in a future version.
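For example, the measurements can be read back with the lsst.verify Job API. This sketch follows the deserialization pattern shown in the lsst.verify documentation; the exact method names should be checked against the installed version:

# A minimal sketch for reading ap_verify's measurements, following the
# pattern in the lsst.verify documentation (APIs may differ by version).
import json
import lsst.verify

with open("ap_verify.verify.json") as f:
    job = lsst.verify.Job.deserialize(**json.load(f))

# Print each metric measurement recorded by the run.
for metric_name, measurement in job.measurements.items():
    print(metric_name, measurement.quantity)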
Unless the --silent argument is provided, ap_verify will also upload measurements to the SQuaSH service on completion.
See the SQuaSH documentation for details.
If the pipeline is interrupted by a fatal error, completed measurements will be saved to ap_verify.verify.json for debugging purposes, but nothing will be sent to SQuaSH.
See the error-handling policy for details.