New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-13887: Let ap_verify process multiple images per instance #57
Conversation
d938616
to
d28303b
Compare
The original intent was that ap_verify would create a single Job for all metrics computed during execution. This is now neither necessary nor possible, so it's better to leave Job files created by the pipeline untouched.
This change eliminates the need to parse the data ID twice, and will make it easy to switch to parseAndRun later.
d28303b
to
00016e5
Compare
This argument can be used to run multiple instances of ``ap_verify`` concurrently, with each instance producing output to a different metrics file. | ||
The template for a file to contain metrics measured by ``ap_verify``, in a format readable by the :doc:`lsst.verify</modules/lsst.verify/index>` framework. | ||
The string ``{dataId}`` shall be replaced with the data ID associated with the job, and its use is strongly recommended. | ||
If omitted, the output will go to files named after ``ap_verify.{dataId}.verify.json`` in the user's working directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh... haha... still, I think a putting these in the "output" directory might be a better default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but I'd rather do this on a new ticket, since it's a change from the pre-existing behavior and I'd want to double-check that custom filenames don't break things (including CI!) in surprising ways.
This change involves several coupled decisions, including moving responsibility for Job-level metadata to AutoJob. The main reason this commit can't be broken up is because Job metadata requires a data ID, which is only available during or after pipeline processing. Therefore, removing runApPipe's dependency on Job and moving Job creation inside the data ID loop needs to happen simultaneously. I have not added tests for AutoJob, because the class is being deprecated in favor of MetricsControllerTask.
00016e5
to
eb1924f
Compare
This PR adds multiprocessing support to
ap_verify
by making thepipeline_driver
module callApPipeTask.parseAndRun
instead ofApPipeTask.runDataRef
. This change requires a complete reorganization of howap_verify
passes information fromap_pipe
to the metrics code, but as a side benefitap_verify
no longer needs to parse data IDs itself.