Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speedup jointcal and separate astrometry/photometry tickets/DM-8552 #27

Merged
merged 13 commits into from Feb 16, 2017

Conversation

parejkoj
Copy link
Collaborator

@parejkoj parejkoj commented Jan 20, 2017

Added the ability to run photometry and astrometry separately, with separate catalogs.

Though it wasn't part of the original work, speeding up jointcal made testing faster, and it's a good end goal anyway.

@parejkoj parejkoj changed the title Big speed improvement and cleanup dead ccdImage code tickets/DM-8552 Speedup jointcal and separate astrometry/photometry tickets/DM-8552 Jan 24, 2017
@parejkoj parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from 3d1afc3 to 85ab402 Compare January 24, 2017 03:16
Only read the calexps when building the ccdImages (and not when
building the output to write) and save the Calib to its respective
ccdImage to use it when preparing output. Use that calib in ccdImage
to convert flux to magnitude, instead of custom code.

No need for the calexp_md now: just extract the values from the calexp
(eventually this should come from some metadata datasetType, e.g.
`get('info')`).

Add SOURCE_IO_NO_FOOTPRINTS flag to get('src') for a big speedup,
since I don't need the heavy footprints (thanks @TallJimbo for the
suggestion).

Add individual profiling for each slow step, activated via
profile_jointcal option.

Remove a bunch of variables and getters from ccdImage that are totally
unused.
With plot_jointcal_results, I can now extract statistics from any
jointcal run, so I don't need those tests sitting around here.
@parejkoj parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from 9d1f50c to 47cefd2 Compare January 24, 2017 21:43
@parejkoj parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from a0fbd2a to a4ba7cf Compare February 9, 2017 08:57
const bool UseFittedList = false,
const bool EnlargeFittedList = true);

//! Collect stars from an external reference catalog
void collectRefStars(lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord > &Ref,
std::string fluxField);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const std::string &?



//! Set the color field of FittedStar 's from a colored catalog.
/* If Color is "g-i", then the color is assigned from columns "g" and "i" of the colored catalog. */
#ifdef TODO
void SetFittedStarColors(std::string DicStarListName,
void setFittedStarColors(std::string DicStarListName,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const std::string &? More cases below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually std::string const &, please.

* @return The catalog for fitting.
*/
const MeasuredStarList & getCatalogForFit() const { return catalogForFit;}
MeasuredStarList & getCatalogForFit() { return catalogForFit;}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not _ catalogForFit?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of seemingly inconsistent private class variable naming.
Will ignore for now I suppose?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I hope to clean up those names en-mass in DM-9135.

size++;
}
return size;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::count_if?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! (once I figured out the syntax)

//!
RefStar(const BaseStar &, const Point &RaDec);
RefStar(const BaseStar &baseStar);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bring in the Galactica!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That thought crosses my mind every time I see it.

{
if (refStar != nullptr && (R)) // TODO: should we raise an Exception in this case?
if ((_refStar != nullptr) && (refStar != nullptr)) // TODO: should we raise an Exception in this case?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say yes to this question. Just printing to stderr doesn't seem like a great idea. Don't we have a logging framework for non-fatal errors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this should probably be log.warn level. It's not clear that it's actually fatal, and I haven't triggered it in any of my tests, so I'm not entirely sure what can cause it. Added a note about log level for when I convert to using log.

@@ -40,11 +40,11 @@ SimplePolyModel::SimplePolyModel(const CcdImageList &L,
{
/* first check that there are enough measurements for the
requested polynomial degree */
unsigned nObj = im.CatalogForFit().size();
unsigned nObj = im.getCatalogForFit().size();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't return size_t?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it does!

src/AstromFit.cc Outdated
@@ -405,7 +405,7 @@ void AstromFit::LSDerivatives2(const FittedStarList &fsl, TripletList &tList, Ei
for (auto i = fsl.cbegin(); i != fsl.end(); ++i)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cend

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made ranged-based.

src/AstromFit.cc Outdated
@@ -1094,7 +1094,7 @@ void AstromFit::makeMeasResTuple(const std::string &tupleName) const
for (auto i = L.cbegin(); i != L.end() ; ++i)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cend also in some other places.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turned most of those (in other files, too) into range-based loops, in the process making many more of them const &. Made other const loops cbegin to cend where I couldn't obviously make the loop range-based.

@@ -78,7 +78,7 @@ SimplePolyModel::SimplePolyModel(const CcdImageList &L,
const Mapping* SimplePolyModel::getMapping(const CcdImage &C) const
{
mapType::const_iterator i = _myMap.find(&C);
if (i==_myMap.end()) throw LSST_EXCEPT(pex::exceptions::InvalidParameterError,"SimplePolyModel::GetMapping, never heard of CcdImage "+C.Name());
if (i==_myMap.end()) throw LSST_EXCEPT(pex::exceptions::InvalidParameterError,"SimplePolyModel::GetMapping, never heard of CcdImage "+C.getName());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically only guaranteed to compare equal in C++14 (const vs non-const iterator). But we ar nitpicking here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to _myMap.cend()

Copy link
Member

@mtpatter mtpatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In jointcal.py in _build_ccdImage on line 194 the docstring refers to calexp_md but that looks like it no longer applies with these changes. Also, do you want to add filter to the docstring for the returned named tuple?

Also, in joincal.py _write_results and _fit_astrometry, the docstrings make reference to lsst.jointcal.AstromModel. Is AstromModel a real thing? I only see PhotomModel.

Also, in utils.py line 338 etc, the for loop is using 'filter' but that's a builtin class, right? Do you want to use 'filt' or something instead, like is done elsewhere in jointcal.py

@@ -102,7 +102,7 @@ def makeDataRefList(self, namespace):
self._addDataRef(namespace, ref.dataId, tract)

# Ensure all components of a visit are kept together by putting them all in the same set of tracts
for visit, tractSet in visitTract.items():
for visit, tractSet in sorted(visitTract.items()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this for py2 vs py3, and if you want the dataRefs to be consistently ordered I think you need to also have sorted(tractSet) below. Otherwise I think this is only sorting visits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! tractSet is a set(), so would have the same problem. I don't have any test data that crosses sets, so I never encountered this.

That said, maintaining order here is only to keep tests passing between py2 and py3 while I sort out why there's order dependence. I've added a note to that effect above the nested loops.


class JointcalRunner(pipeBase.TaskRunner):

class JointcalRunner(pipeBase.ButlerInitializedTaskRunner):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for this is also copied verbatim from HSC MosaicRunner, so it's making reference to things here that don't exist (MosaicTask) so I think this is confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, pipeBase says TaskRunner must be picklable for multi-processing and if not then do you want to set canMultiprocess = False in JointcalTask? That is what is done in MosaicTask, according to the docstring, which also adds confusion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cleaned up that docstring and removed the comment about multiprocessing. I don't know about pickleability, so I'll just remove that whole part of the note. I've got plans to try some multiprocessing work in jointcal for I/O.

butler : lsst.daf.persistence.Butler
The butler is passed to the refObjLoader constructor in case it is
needed. Ignored if the refObjLoader argument provides a loader directly.
Used to initialized the astrometry and photometry refObjLoaders.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo initialized -> initialize

@parejkoj
Copy link
Collaborator Author

I fixed all the points in your separate comment. Thanks for those.

Split photo/astro code with Config parameters, and a test for each option in the
lsstSim test class. The tests check for empty Wcs/Calib, if they shouldn't have
been written.

Pulled apart _testJointcalTask so I could use the run/plot bits in the "only
one" tests, since those have quite different assert conditions.

Also support the split in JointcalStatistics: the nested ifs are a bit
excessive, but they get the job done.

Fixed inconsistent capitalization of jointcal in tests.
Add astrometry/photometryRefObjLoaders as configurables and make
jointcal.__init__() take a butler argument to configure them. JointcalRunner
has to be derived from ButlerInitializedTaskRunner to get said butler.

Clean up "default" filter handling somewhat (see DM-9093 for what else needs to
be done), to use the most common filter in the dataRefs. CollectLSSTRefStars
now takes the actual fluxField, instead of building it on the fly.
Add jointcal._do_load_refcat_and_fit() to do both photometry and astrometry
via kwargs.

Add metrics dictionary in jointcal.py, for basic values to test, that may
turn into validation/verification SQUASH tests later.

associateCatalogs now properly cleans up when run a second time. This would
be best handled via better memory management, but that's DM-4043.

Fix some range-based for loops and cleaned up pointers/references.

Remove bit-rotted Jointcal class and other unused code.

More renaming of badly-named methods and variables. Filed DM-9135 to deal
with this in bulk.

Print some interesting values in when running JointcalStatistics verbosely.

Fixed a bug I had introduced in RefStar re:setFittedStar behavior with nullptr.
@parejkoj parejkoj force-pushed the tickets/DM-8552 branch 2 times, most recently from 6fde126 to 6c1e4b9 Compare February 16, 2017 06:19
Add hsc config for filterMap and tweak test to work with it.

Add every star number and chi2 metric for every test dataset.

Metrics tests always occur in JointcalTestBase._runJointcalTask(). Default
to checking every metric that gets returned, so they'll fail if that metric
isn't defined by the test method. A metric can be skipped by setting it to None.
Tweak metrics after solving common tangent plane problem.
tweak metrics for sorted dataRefs
@parejkoj parejkoj merged commit 6ec66b3 into master Feb 16, 2017
@ktlim ktlim deleted the tickets/DM-8552 branch August 25, 2018 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants