New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix slow jointcal output writing Tickets/DM-8978 #25
Conversation
1eff92f
to
3eeb94c
Compare
break | ||
visit = int(visit) | ||
ccd = int(ccd) | ||
visit_ccd_to_dataRef[(visit, ccd)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
for ref in dataRefs: | ||
result = self._build_ccdImage(ref, associations, jointcalControl) | ||
oldWcsList.append(result.wcs) | ||
visit_ccd_to_dataRef[result.key] = ref |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked at the code in detail, but is there a reason not to use result.Name()
here and avoid having to parse it below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused? result is a namedtuple, with "wcs" and "key", there's no "Name". I don't know what you're referring to with this comment.
ccd = int(ccd) | ||
visit_ccd_to_dataRef[(visit, ccd)] | ||
dataRef = visit_ccd_to_dataRef[(visit, ccd)] | ||
calexp = dataRef.get("calexp") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could get "calexp_md"
and use that to create the Calib
for more time savings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much rather remove the building of the tanwcs
from the calexp_md
and just get it from ExposureInfo (the tanWcs code there is original to meas_simastrom, and I haven't taken the time to replace it yet). Why should I touch calexp_md
, if we've got pre-built objects in ExposureInfo?
3eeb94c
to
e107938
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I am getting a bad taste in my mouth that this change is even necessary. But given that it is, I have no problems with the way it was done, just one comment.
wcs : lsst.afw.image.TanWcs | ||
the TAN WCS of this image, constructed from the calexp_md | ||
key : (int, int) | ||
a key to identify this dataRef by its visit and ccd ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to make the key a namedtuple with 'visitId' and 'ccdId' names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to argue that the key should be an opaque object, but I can see it being quite useful to have the parts named (may need to add "tract" or "filter" in the future, anyway). Done.
Build a dict of (visit, ccd): dataRef when I build the ccdImageList, and use it to match ccdImages to their dataRef when writing output.
e107938
to
0645a8c
Compare
New code is much faster: 37s vs. 47s to run the lsstSim 2 and 10 visit tests together. The speed bump will be even higher when working with more visits and ccds, since the original code was O(N^2), while this is O(N)+O(1), in addition to fewer butler accesses.