DM-13887: Let ap_verify process multiple images per instance #57

kfindeisen · 2018-12-13T21:55:08Z

This PR adds multiprocessing support to ap_verify by making the pipeline_driver module call ApPipeTask.parseAndRun instead of ApPipeTask.runDataRef. This change requires a complete reorganization of how ap_verify passes information from ap_pipe to the metrics code, but as a side benefit ap_verify no longer needs to parse data IDs itself.
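As a rough illustration of the switch described above, the driver now has to hand the task a command-line style argument list rather than a single data reference. The sketch below is not the code in this PR: `build_pipeline_args` is a hypothetical helper, and it assumes the task accepts the usual command-line task options (`--output`, `--id`, `--processes`).

```python
def build_pipeline_args(repo, output, data_id_tokens, processes=1):
    """Assemble a command-line style argument list (hypothetical helper).

    data_id_tokens is a list of "key=value" strings; an empty list is
    passed through as a bare --id, assumed here to mean "all data".
    """
    args = [repo, "--output", output]
    args.extend(["--id", *data_id_tokens])
    # Assumption: the number of worker processes is given via --processes.
    args.extend(["--processes", str(processes)])
    return args


args = build_pipeline_args("workspace/ingested", "workspace/output",
                           ["visit=411420", "ccdnum=5"], processes=4)
# The driver would then call something like ApPipeTask.parseAndRun(args=args)
# instead of looping over data references itself.
```

Building the list in one place keeps the empty data ID case and the process count out of the calling code.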

bin/run_ci_dataset.sh

mrawls · 2019-01-15T01:16:17Z

doc/lsst.ap.verify/command-line-reference.rst

-   This argument can be used to run multiple instances of ``ap_verify`` concurrently, with each instance producing output to a different metrics file.
+   The template for a file to contain metrics measured by ``ap_verify``, in a format readable by the :doc:`lsst.verify</modules/lsst.verify/index>` framework.
+   The string ``{dataId}`` shall be replaced with the data ID associated with the job, and its use is strongly recommended.
+   If omitted, the output will go to files named after ``ap_verify.{dataId}.verify.json`` in the user's working directory.


Oh... haha... still, I think putting these in the "output" directory might be a better default.

Good idea, but I'd rather do this on a new ticket, since it's a change from the pre-existing behavior and I'd want to double-check that custom filenames don't break things (including CI!) in surprising ways.
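To make the ``{dataId}`` template discussed in this thread concrete, here is a minimal sketch of how such a template might be expanded into one filename per data ID. The `job_filename` helper and the way the data ID is rendered as a string are assumptions for illustration only; the actual formatting used by ``ap_verify`` may differ.

```python
def job_filename(template, data_id):
    """Expand a metrics-file template for one data ID (illustrative only)."""
    # Hypothetical rendering: sorted key/value pairs joined by underscores.
    id_string = "_".join(f"{key}{data_id[key]}" for key in sorted(data_id))
    # str.replace leaves any other braces in the template untouched.
    return template.replace("{dataId}", id_string)


name = job_filename("ap_verify.{dataId}.verify.json",
                    {"visit": 411420, "ccdnum": 5})
```

A template without ``{dataId}`` yields the same filename for every data ID, which is presumably why the documentation strongly recommends including it.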

python/lsst/ap/verify/metrics.py

python/lsst/ap/verify/pipeline_driver.py

tests/test_driver.py

python/lsst/ap/verify/pipeline_driver.py

This change involves several coupled decisions, including moving responsibility for Job-level metadata to AutoJob. The main reason this commit can't be broken up is that Job metadata requires a data ID, which is only available during or after pipeline processing. Therefore, removing runApPipe's dependency on Job and moving Job creation inside the data ID loop need to happen simultaneously. I have not added tests for AutoJob, because the class is being deprecated in favor of MetricsControllerTask.
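The coupling described above can be sketched as follows: because the metadata depends on the data ID, each job has to be created inside the loop over data IDs rather than once up front. This is a simplified stand-in that uses plain dictionaries instead of the real Job class, and `make_jobs` is a hypothetical name.

```python
def make_jobs(results):
    """Create one job record per data ID (plain-dict stand-in for a Job).

    results is a list of (data_id, measurements) pairs, as might be
    reported after pipeline processing.
    """
    jobs = []
    for data_id, measurements in results:
        # The data ID is only known here, so metadata is attached per job.
        job = {"meta": dict(data_id), "measurements": dict(measurements)}
        jobs.append(job)
    return jobs


jobs = make_jobs([
    ({"visit": 411420, "ccdnum": 5}, {"example_metric": 1.5}),
    ({"visit": 411420, "ccdnum": 10}, {"example_metric": 2.0}),
])
```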

kfindeisen requested a review from mrawls December 13, 2018 21:55

kfindeisen mentioned this pull request Dec 13, 2018

DM-13887: Let ap_verify process multiple images per instance lsst/ap_pipe#35

Merged

kfindeisen force-pushed the tickets/DM-13887 branch from d938616 to d28303b December 13, 2018 23:17

kfindeisen added 3 commits January 9, 2019 15:28

Remove pipeline_driver's handling of lower-level metrics.

477cef0

The original intent was that ap_verify would create a single Job for all metrics computed during execution. This is now neither necessary nor possible, so it's better to leave Job files created by the pipeline untouched.

Add data ID reporting to runApPipe.

928ca91

Use runApPipe's data IDs in measurements.

6883e90

This change eliminates the need to parse the data ID twice, and will make it easy to switch to parseAndRun later.

kfindeisen force-pushed the tickets/DM-13887 branch from d28303b to 00016e5 January 9, 2019 23:28

mrawls approved these changes Jan 15, 2019

mrawls reviewed Jan 15, 2019

kfindeisen added 6 commits January 15, 2019 11:26

Allow Job filenames to be data ID specific.

3d736b3

Call ApPipeTask.parseAndRun.

47f852d

Support empty data IDs.

c781cfb

Remove unused or redundant code.

936a366

Update CI script to use multiprocessing.

eb1924f

kfindeisen force-pushed the tickets/DM-13887 branch from 00016e5 to eb1924f January 15, 2019 19:51

kfindeisen merged commit eb1924f into master Jan 15, 2019

kfindeisen deleted the tickets/DM-13887 branch January 15, 2019 21:43
