New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-10729: support jointcal/meas_mosaic outputs and add a new CmdLineTask driver #49
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test to ensure that the correct version of photometry (jointcall vs meas_mosaic/single_frame) is being read? The current code looks clear enough, but it would be really good to have a test to make sure that any future refactorings don't get this wrong.
# performance; support for other cameras is DM-6927. | ||
oldSrc = butler.get('src', vId, flags=SOURCE_IO_NO_FOOTPRINTS) | ||
except: | ||
oldSrc = butler.get('src', vId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. That's why I didn't implement SOURCE_IO_NO_FOOTPRINTS in DM-5819.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the performance gain in using SOURCE_IO_NO_FOOTPRINTS? In memory footprint? In I/O time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big improvement in I/O time, mostly. It's a big relative change in memory consumption too, but I don't notice that as much.
calib = afwImage.Calib(calexpMetadata) | ||
|
||
# We don't want to put this above the first "if useJointCal block" | ||
# because we need to use it to quickly catch data IDs with no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"because we need to use the first butler.get above to quickly catch data IDs with no"
Resolve ambiguous "it".
3c29ed3
to
ca6100a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Definitely worth merging. Two main questions in addition to the small comments:
- Do you imagine eventually wanting to support several different calibration schemes? If so, would be it worth it to generalize from
useJointCal=True
tocalMethod='jointcal'
,calMethod='meas_mosaic',
calMethod='cool_thing_dominique_figures_out_in_2020'` This would preserve the ability to read historical datasets by allowing more flexibility in the config instead of in the code. - If you can make it not require --output when it doesn't even use it, that would be a nice usability improvment, particularly to the degree that
matchedVisitMetrics.py
could serve as a model for makingvalidateDrp.py
aTask
.
(config.outputPrefix). Because the CmdLineTask machinery always creates an | ||
output Butler repository, however, it is necessary to run this task with | ||
both an output directory and an output prefix, with the former essentially | ||
unused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to specify an unused --output
seemed annoying.
Can you override this requirement with a default set in the default config for MatchedVisitMetricsTask ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per comment in JIRA issue, no - this is unfortunately baked into pipe_base
, and is not controllable via config.
|
||
MatchedVisitMetricsTask is very much an incomplete CmdLineTask - it uses | ||
the usual metchanisms to define its inputs and read them using a Butler, | ||
but writes outputs manually to files with a configuration-defined prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: metchanisms -> mechanisms
This matches schema naming conventions and usage in PhotoCalib.
ca6100a
to
cf02a2a
Compare
This should yield a significant speedup in I/O.
cf02a2a
to
9afb4a5
Compare
This PR adds support for using jointcal/meas_mosaic outputs to the lower-level metric generation code (
runOneFilter
and below). I wasn't able to use the existing higher-level drivers for the tests I wanted to do, so I haven't tried to update those.Instead, I've added a new
CmdLineTask
driver that delegates most of its work torunOneFilter
, which makes it much easier to control the configuration and inputs from the command-line. It's not really a completeCmdLineTask
, because it doesn't useButler
for its outputs (since all of the code it calls just write JSON directly). But IMO it's already a more convenient driver for manual execution of jobs, and I think it could be a useful starting point for making more of this code use theTask
(and ultimatelySuperTask
) framework.I've also found a number of small things that appear to be minor bugs (unused arguments, mostly).