Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-13887: Let ap_verify process multiple images per instance #57

Merged
merged 9 commits into from Jan 15, 2019

Conversation

kfindeisen
Copy link
Member

This PR adds multiprocessing support to ap_verify by making the pipeline_driver module call ApPipeTask.parseAndRun instead of ApPipeTask.runDataRef. This change requires a complete reorganization of how ap_verify passes information from ap_pipe to the metrics code, but as a side benefit ap_verify no longer needs to parse data IDs itself.

The original intent was that ap_verify would create a single Job for
all metrics computed during execution. This is now neither necessary
nor possible, so it's better to leave Job files created by the
pipeline untouched.
This change eliminates the need to parse the data ID twice, and will
make it easy to switch to parseAndRun later.
bin/run_ci_dataset.sh Show resolved Hide resolved
doc/lsst.ap.verify/command-line-reference.rst Show resolved Hide resolved
This argument can be used to run multiple instances of ``ap_verify`` concurrently, with each instance producing output to a different metrics file.
The template for a file to contain metrics measured by ``ap_verify``, in a format readable by the :doc:`lsst.verify</modules/lsst.verify/index>` framework.
The string ``{dataId}`` shall be replaced with the data ID associated with the job, and its use is strongly recommended.
If omitted, the output will go to files named after ``ap_verify.{dataId}.verify.json`` in the user's working directory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... haha... still, I think a putting these in the "output" directory might be a better default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, but I'd rather do this on a new ticket, since it's a change from the pre-existing behavior and I'd want to double-check that custom filenames don't break things (including CI!) in surprising ways.

python/lsst/ap/verify/metrics.py Show resolved Hide resolved
python/lsst/ap/verify/metrics.py Outdated Show resolved Hide resolved
python/lsst/ap/verify/pipeline_driver.py Show resolved Hide resolved
tests/test_driver.py Show resolved Hide resolved
This change involves several coupled decisions, including moving
responsibility for Job-level metadata to AutoJob. The main reason this
commit can't be broken up is because Job metadata requires a data ID,
which is only available during or after pipeline processing.
Therefore, removing runApPipe's dependency on Job and moving Job
creation inside the data ID loop needs to happen simultaneously.

I have not added tests for AutoJob, because the class is being
deprecated in favor of MetricsControllerTask.
@kfindeisen kfindeisen merged commit eb1924f into master Jan 15, 2019
@kfindeisen kfindeisen deleted the tickets/DM-13887 branch January 15, 2019 21:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants