Speedup jointcal and separate astrometry/photometry tickets/DM-8552 #27

parejkoj · 2017-01-20T02:19:34Z

Added the ability to run photometry and astrometry separately, with separate catalogs.

Though it wasn't part of the original work, speeding up jointcal made testing faster, and it's a good end goal anyway.

@TallJimbo

Only read the calexps when building the ccdImages (and not when building the output to write) and save the Calib to its respective ccdImage to use it when preparing output. Use that calib in ccdImage to convert flux to magnitude, instead of custom code. No need for the calexp_md now: just extract the values from the calexp (eventually this should come from some metadata datasetType, e.g. `get('info')`). Add SOURCE_IO_NO_FOOTPRINTS flag to get('src') for a big speedup, since I don't need the heavy footprints (thanks @TallJimbo for the suggestion). Add individual profiling for each slow step, activated via profile_jointcal option. Remove a bunch of variables and getters from ccdImage that are totally unused.

With plot_jointcal_results, I can now extract statistics from any jointcal run, so I don't need those tests sitting around here.

pschella · 2017-02-09T17:05:48Z

include/lsst/jointcal/Associations.h

                           const bool UseFittedList = false,
                           const bool EnlargeFittedList = true);

+    //! Collect stars from an external reference catalog
+    void collectRefStars(lsst::afw::table::SortedCatalogT< lsst::afw::table::SimpleRecord > &Ref,
+                         std::string fluxField);


const std::string &?

pschella · 2017-02-09T17:06:25Z

include/lsst/jointcal/Associations.h



    //! Set the color field of FittedStar 's from a colored catalog.
    /* If Color is "g-i", then the color is assigned from columns "g" and "i" of the colored catalog. */
 #ifdef TODO
-    void SetFittedStarColors(std::string DicStarListName,
+    void setFittedStarColors(std::string DicStarListName,


const std::string &? More cases below.

It's actually std::string const &, please.
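For reference, a minimal sketch of the signature style requested above. `describeFluxField` is a hypothetical stand-in, not a jointcal function; the point is only that a read-only string argument is taken as `std::string const &`, so the caller's string is not copied.

```cpp
#include <string>

// Hypothetical helper (not part of jointcal): takes the field name by
// const reference, so no copy of the caller's string is made.
std::string describeFluxField(std::string const &fluxField) {
    // fluxField is read-only inside the function.
    return "using flux field: " + fluxField;
}
```

The same pattern would apply to `collectRefStars` and `setFittedStarColors`, assuming they only read the string.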

pschella · 2017-02-09T17:08:28Z

include/lsst/jointcal/CcdImage.h

+     * @return     The catalog for fitting.
+     */
+    const MeasuredStarList & getCatalogForFit() const { return catalogForFit;}
+    MeasuredStarList & getCatalogForFit()  { return catalogForFit;}


why not _catalogForFit?

Lots of seemingly inconsistent private class variable naming.
Will ignore for now I suppose?

Yes, I hope to clean up those names en masse in DM-9135.

pschella · 2017-02-09T17:09:42Z

include/lsst/jointcal/CcdImage.h

+                size++;
+        }
+        return size;
+    }


std::count_if?

Neat! (once I figured out the syntax)
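A minimal sketch of the `std::count_if` form suggested here. The condition in the original loop is not visible in the hunk, so the `Star` struct and its `valid` flag are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>

// Hypothetical stand-in for MeasuredStar; only the `valid` flag matters here.
struct Star {
    bool valid;
};

// Count the stars that satisfy a predicate, replacing a hand-written
// loop of the form `if (...) size++;`.
std::size_t countValid(std::list<Star> const &stars) {
    return static_cast<std::size_t>(
        std::count_if(stars.begin(), stars.end(),
                      [](Star const &star) { return star.valid; }));
}
```

`std::count_if` returns the iterator's signed `difference_type`, hence the explicit cast to `std::size_t`.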

pschella · 2017-02-09T17:28:37Z

include/lsst/jointcal/RefStar.h

  //!
-  RefStar(const BaseStar &, const Point &RaDec);
+  RefStar(const BaseStar &baseStar);


Bring in the Galactica!

That thought crosses my mind every time I see it.

pschella · 2017-02-09T17:32:36Z

src/FittedStar.cc

 {
-  if (refStar != nullptr && (R)) // TODO: should we raise an Exception in this case?
+  if ((_refStar != nullptr) && (refStar != nullptr)) // TODO: should we raise an Exception in this case?


I would say yes to this question. Just printing to stderr doesn't seem like a great idea. Don't we have a logging framework for non-fatal errors?

Yeah, this should probably be log.warn level. It's not clear that it's actually fatal, and I haven't triggered it in any of my tests, so I'm not entirely sure what can cause it. Added a note about log level for when I convert to using log.

pschella · 2017-02-09T17:39:52Z

src/SimplePolyModel.cc

@@ -40,11 +40,11 @@ SimplePolyModel::SimplePolyModel(const CcdImageList &L,
 	{
 		/* first check that there are enough measurements for the
 	  requested polynomial degree */
-	  unsigned nObj = im.CatalogForFit().size();
+	  unsigned nObj = im.getCatalogForFit().size();


This doesn't return size_t?

I'm sure it does!

pschella · 2017-02-09T17:43:03Z

src/AstromFit.cc

@@ -405,7 +405,7 @@ void AstromFit::LSDerivatives2(const FittedStarList &fsl, TripletList &tList, Ei
    for (auto i = fsl.cbegin(); i != fsl.end(); ++i)


Made range-based.

pschella · 2017-02-09T17:43:29Z

src/AstromFit.cc

@@ -1094,7 +1094,7 @@ void AstromFit::makeMeasResTuple(const std::string &tupleName) const
    for (auto i = L.cbegin(); i != L.end() ; ++i)


cend also in some other places.

Turned most of those (in other files, too) into range-based loops, in the process making many more of them const &. Made other const loops cbegin to cend where I couldn't obviously make the loop range-based.
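To illustrate the two loop styles described above, here is a sketch with hypothetical helper functions over a plain `std::vector<double>` (not the jointcal star lists). The first uses a range-based loop with `const &`; the second keeps explicit `cbegin()`/`cend()` because it needs the iterator itself to look at the next element.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Range-based loop: each element is visited by const reference.
double sumFluxes(std::vector<double> const &fluxes) {
    double total = 0.0;
    for (auto const &flux : fluxes) {
        total += flux;
    }
    return total;
}

// Iterator loop: the body needs the neighbouring element, so the loop
// keeps iterators and compares against cend() rather than end().
std::size_t countAscendingPairs(std::vector<double> const &values) {
    std::size_t count = 0;
    if (values.empty()) return count;
    for (auto i = values.cbegin(); std::next(i) != values.cend(); ++i) {
        if (*i < *std::next(i)) ++count;
    }
    return count;
}
```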

pschella · 2017-02-09T18:02:51Z

src/SimplePolyModel.cc

@@ -78,7 +78,7 @@ SimplePolyModel::SimplePolyModel(const CcdImageList &L,
 const Mapping* SimplePolyModel::getMapping(const CcdImage &C) const
 {
  mapType::const_iterator i = _myMap.find(&C);
-  if  (i==_myMap.end()) throw LSST_EXCEPT(pex::exceptions::InvalidParameterError,"SimplePolyModel::GetMapping, never heard of CcdImage "+C.Name());
+  if  (i==_myMap.end()) throw LSST_EXCEPT(pex::exceptions::InvalidParameterError,"SimplePolyModel::GetMapping, never heard of CcdImage "+C.getName());


Technically only guaranteed to compare equal in C++14 (const vs non-const iterator). But we are nitpicking here.

Changed to _myMap.cend()
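A sketch of the resulting lookup pattern, using a hypothetical `std::map<std::string, int>` and a standard exception in place of `_myMap` and `LSST_EXCEPT`. Both sides of the comparison are `const_iterator`, which sidesteps the mixed-iterator question raised above.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical lookup (not the jointcal code): find() on a const map
// returns a const_iterator, and it is compared against cend().
int lookupMapping(std::map<std::string, int> const &mappings,
                  std::string const &name) {
    auto found = mappings.find(name);
    if (found == mappings.cend()) {
        throw std::invalid_argument("never heard of CcdImage " + name);
    }
    return found->second;
}
```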

mtpatter

In jointcal.py in _build_ccdImage on line 194 the docstring refers to calexp_md but that looks like it no longer applies with these changes. Also, do you want to add filter to the docstring for the returned named tuple?

Also, in jointcal.py _write_results and _fit_astrometry, the docstrings make reference to lsst.jointcal.AstromModel. Is AstromModel a real thing? I only see PhotomModel.

Also, in utils.py line 338 etc., the for loop is using 'filter', but that's a builtin, right? Do you want to use 'filt' or something instead, as is done elsewhere in jointcal.py?

mtpatter · 2017-02-15T23:35:09Z

python/lsst/jointcal/dataIds.py

@@ -102,7 +102,7 @@ def makeDataRefList(self, namespace):
                    self._addDataRef(namespace, ref.dataId, tract)

        # Ensure all components of a visit are kept together by putting them all in the same set of tracts
-        for visit, tractSet in visitTract.items():
+        for visit, tractSet in sorted(visitTract.items()):


I checked this for py2 vs py3, and if you want the dataRefs to be consistently ordered I think you need to also have sorted(tractSet) below. Otherwise I think this is only sorting visits.

Good catch! tractSet is a set(), so would have the same problem. I don't have any test data that crosses sets, so I never encountered this.

That said, maintaining order here is only to keep tests passing between py2 and py3 while I sort out why there's order dependence. I've added a note to that effect above the nested loops.

mtpatter · 2017-02-15T23:40:07Z

python/lsst/jointcal/jointcal.py


-class JointcalRunner(pipeBase.TaskRunner):
+
+class JointcalRunner(pipeBase.ButlerInitializedTaskRunner):


The docstring for this is also copied verbatim from HSC MosaicRunner, so it refers to things that don't exist here (MosaicTask), which I think is confusing.

Also, pipeBase says TaskRunner must be picklable for multi-processing and if not then do you want to set canMultiprocess = False in JointcalTask? That is what is done in MosaicTask, according to the docstring, which also adds confusion.

I cleaned up that docstring and removed the comment about multiprocessing. I don't know about pickleability, so I'll just remove that whole part of the note. I've got plans to try some multiprocessing work in jointcal for I/O.

mtpatter · 2017-02-15T23:42:58Z

python/lsst/jointcal/jointcal.py

+        butler : lsst.daf.persistence.Butler
+            The butler is passed to the refObjLoader constructor in case it is
+            needed. Ignored if the refObjLoader argument provides a loader directly.
+            Used to initialized the astrometry and photometry refObjLoaders.


typo initialized -> initialize

parejkoj · 2017-02-16T01:40:52Z

I fixed all the points in your separate comment. Thanks for those.

Split photo/astro code with Config parameters, and a test for each option in the lsstSim test class. The tests check for empty Wcs/Calib, if they shouldn't have been written. Pulled apart _testJointcalTask so I could use the run/plot bits in the "only one" tests, since those have quite different assert conditions. Also support the split in JointcalStatistics: the nested ifs are a bit excessive, but they get the job done. Fixed inconsistent capitalization of jointcal in tests.

Add astrometry/photometryRefObjLoaders as configurables and make jointcal.__init__() take a butler argument to configure them. JointcalRunner has to be derived from ButlerInitializedTaskRunner to get said butler. Clean up "default" filter handling somewhat (see DM-9093 for what else needs to be done), to use the most common filter in the dataRefs. CollectLSSTRefStars now takes the actual fluxField, instead of building it on the fly.

Add jointcal._do_load_refcat_and_fit() to do both photometry and astrometry via kwargs. Add metrics dictionary in jointcal.py, for basic values to test, that may turn into validation/verification SQUASH tests later. associateCatalogs now properly cleans up when run a second time. This would be best handled via better memory management, but that's DM-4043. Fix some range-based for loops and clean up pointers/references. Remove bit-rotted Jointcal class and other unused code. More renaming of badly-named methods and variables. Filed DM-9135 to deal with this in bulk. Print some interesting values when running JointcalStatistics verbosely. Fixed a bug I had introduced in RefStar re: setFittedStar behavior with nullptr.

Add hsc config for filterMap and tweak test to work with it. Add every star number and chi2 metric for every test dataset. Metrics tests always occur in JointcalTestBase._runJointcalTask(). Default to checking every metric that gets returned, so they'll fail if that metric isn't defined by the test method. A metric can be skipped by setting it to None.

Tweak metrics after solving common tangent plane problem.

tweak metrics for sorted dataRefs

parejkoj force-pushed the tickets/DM-8552 branch from 701b946 to a9e1342 Compare January 24, 2017 02:13

parejkoj changed the title ~~Big speed improvement and cleanup dead ccdImage code tickets/DM-8552~~ Speedup jointcal and separate astrometry/photometry tickets/DM-8552 Jan 24, 2017

parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from 3d1afc3 to 85ab402 Compare January 24, 2017 03:16

parejkoj added 2 commits January 23, 2017 19:26

Remove skipped tests for fewer visits.

cf60c48

With plot_jointcal_results, I can now extract statistics from any jointcal run, so I don't need those tests sitting around here.

parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from 9d1f50c to 47cefd2 Compare January 24, 2017 21:43

parejkoj force-pushed the tickets/DM-8552 branch 3 times, most recently from a0fbd2a to a4ba7cf Compare February 9, 2017 08:57

pschella reviewed Feb 9, 2017

parejkoj force-pushed the tickets/DM-8552 branch from 9112be2 to b0ecb53 Compare February 10, 2017 08:17

mtpatter reviewed Feb 16, 2017

parejkoj force-pushed the tickets/DM-8552 branch from fe386da to 01d371b Compare February 16, 2017 04:01

parejkoj added 3 commits February 15, 2017 22:18

Remove unused MeasuredStar-related code

f5ea857

parejkoj force-pushed the tickets/DM-8552 branch 2 times, most recently from 6fde126 to 6c1e4b9 Compare February 16, 2017 06:19

parejkoj added 7 commits February 15, 2017 22:24

Added example script to run validation data with profiling.

cf00394

Add .test directory to .gitignore

55c7fb3

C++ review cleanup: lots of new range-based for, etc.

5f70e89

make assert failures more verbose.

2473da4

Compute common tangent plane after all ccdImages are loaded

16c7c0d

Tweak metrics after solving common tangent plane problem.

sort dataRefs for consistent py2/3 ordering

6ec66b3

tweak metrics for sorted dataRefs

parejkoj force-pushed the tickets/DM-8552 branch from 6c1e4b9 to 6ec66b3 Compare February 16, 2017 06:24

parejkoj merged commit 6ec66b3 into master Feb 16, 2017

ktlim deleted the tickets/DM-8552 branch August 25, 2018 06:44
