DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

parejkoj · 2019-04-20T06:49:16Z

Also vectorize catalog array operations for a big speedup on AST-backed BoundedFields.

AST-backed BoundedFields (e.g. TransformBoundedField) perform much better on arrays of points, than single operations. This change gives about a 17x speedup for `photoCalib.instFluxToNanojansky(src, 'slot_PsfFlux')` for a TransformBoundedField-based photoCalib. It is about 25% slower for spatially- constant PhotoCalibs but still below a millisecond, so that difference should not matter.

TallJimbo

Logic looks great; thanks in particular for taking the time to vectorize the AST calls in a way that doesn't require contiguous catalogs.

I've made a bunch of minor stylistic comments that I hope will be helpful.

TallJimbo · 2019-04-23T15:17:31Z

include/lsst/afw/image/PhotoCalib.h

+     *
+     * @return The calibrated catalog, with new flux and magnitude (and their errors) fields.
+     *
+     * @throws lsst::pex::exceptions::NotFoundError if any item in `instfluxField` does not have a


instfluxField -> instFluxFields.

TallJimbo · 2019-04-23T15:28:12Z

include/lsst/afw/image/PhotoCalib.h

+    /**
+     * Return the calibration evaluated at the centroids of a SourceCatalog.
+     */
+    ndarray::Array<double, 1> evaluateCatalog(afw::table::SourceCatalog const &sourceCatalog) const;


Please add blank lines between methods.

TallJimbo · 2019-04-23T15:32:03Z

src/image/PhotoCalib.cc

+    std::vector<table::Key<double>> magKeys;
+    magKeys.reserve(instFluxFields.size());
+    std::vector<table::Key<double>> magErrKeys;
+    magErrKeys.reserve(instFluxFields.size());


This isn't bad by any means, but it strikes me as a place where a little local struct to hold those keys might be a bit better - i.e. you'd just have a single vector of that struct, and then the inner for loop could just be a range based for, and there'd be a lot more simple attribute assignments instead of paren-filled push_back() and back() calls.

Thanks for that suggestion. I think that does improve clarity.

TallJimbo · 2019-04-23T15:33:52Z

src/image/PhotoCalib.cc

+        instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key);
+        try {
+            instFluxErrKeys.push_back(inSchema.find<double>(field + "_instFluxErr").key);
+        } catch (pex::exceptions::NotFoundError) {


Even though you're not doing anything with this exception, I think you still want to catch it by reference, i.e. catch (pex::exceptions::NotFoundError&). I'm not positive there would be any ill effects from this as is, and I don't even remember in detail what the ill effects of catching by value are, but it's safest to just always catch by reference.

TallJimbo · 2019-04-23T15:40:05Z

src/image/PhotoCalib.cc

+    std::vector<table::Key<double>> magErrKeys;
+    magErrKeys.reserve(instFluxFields.size());
+    for (auto const &field : instFluxFields) {
+        instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key);


Once this is switched to struct field assigment (per above comment), you could actually use an even nicer syntax for this:

s.instFlux = inSchema[field + "_instFlux"]; // assuming s is the struct, and instFlux is a field of type Key<double>.

That is, with the assignment form, it can infer the fact that it's a double from what you're assigning it to (which is why this kind of syntax isn't usable in Python as often).

I'd also say it'd be slightly better to use inSchema.join(field, "instFlux") instead of field + "_instFlux" to avoid referring directly to the delimiter. We don't do that across the board, but I'm trying to encourage it in new code.

TallJimbo · 2019-04-23T15:41:59Z

src/image/PhotoCalib.cc

+    for (auto const &name : catalog.getSchema().getNames()) {
+        // Pick every field ending in "_instFlux", grabbing everything before that prefix.
+        if (name.size() > 10 && name.compare(name.size() - 9, 9, "_instFlux") == 0) {
+            instFluxFields.emplace_back(name.substr(0, name.size() - 9));


You could do something like

constexpr std::string SUFFIX = "_instFlux";

above the for loop, and then refer to SUFFIX.size() instead of using 9 and 10 explicitly.

constexpr std::string is not allowed. Is it better to make it a non-constexpr string, or use a constexpr char[]? The latter makes for less nice syntax for computing the size.

I think I'd use static std::string const SUFFIX = "_instFlux";, then.

TallJimbo · 2019-04-23T15:44:17Z

src/image/PhotoCalib.cc

@@ -465,38 +544,68 @@ double PhotoCalib::evaluate(lsst::geom::Point<double, 2> const &point) const {
        return _calibration->evaluate(point);
 }

+ndarray::Array<double, 1> PhotoCalib::evaluateArray(ndarray::Array<double, 1> xx,
+                                                    ndarray::Array<double, 1> yy) const {


Convention is to pass arrays by const reference: ndarray::Array<double, 1> const & xx.

It's not a big deal either way (neither is our opposite convention of passing shared_ptr by value), but best to be consistent.

TallJimbo · 2019-04-23T15:46:56Z

src/image/PhotoCalib.cc

 void PhotoCalib::instFluxToNanojanskyArray(afw::table::SourceCatalog const &sourceCatalog,
                                           std::string const &instFluxField,
                                           ndarray::Array<double, 2, 2> result) const {
-    double instFlux, instFluxErr, nanojansky, calibration;
+    double instFlux, instFluxErr, nanojansky;  //, calibration;


It's fine (i.e. just as efficient) and probably slightly better form to declare these on first use, even though that puts them inside the loop - it's always good to avoid uninitialized variables, and for stack variables of trivial types like double you don't need to worry about the cost of construction being different from the cost of assignment.

TallJimbo · 2019-04-23T15:48:32Z

src/image/PhotoCalib.cc

    }
 }

 void PhotoCalib::instFluxToMagnitudeArray(afw::table::SourceCatalog const &sourceCatalog,
                                          std::string const &instFluxField,
                                          ndarray::Array<double, 2, 2> result) const {
-    double instFlux, instFluxErr, calibration;
+    double instFlux, instFluxErr;  //, calibration;


same comment as above about declaring these on first use inside the loop.

This code block had already turned up twice, and I'll need it again for `calibrateCatalog`, so pull it out into a private method.

Refactor catalog tests to make them safer.

parejkoj added 4 commits April 22, 2019 16:23

Cleanup docstrings and remove redundant comment

b6fa80f

Remove incorrect docstring warnings

b17f127

Add second instFlux field and cleanup other instFlux fields in test

0faae3a

parejkoj force-pushed the tickets/DM-10155 branch from 3cc133a to 2807ca8 Compare April 22, 2019 23:37

parejkoj changed the title ~~DM-10155 Implement in-place SourceCatalog operators in PhotoCalib~~ DM-10155: Implement in-place SourceCatalog operators in PhotoCalib Apr 22, 2019

TallJimbo approved these changes Apr 23, 2019

View reviewed changes

parejkoj added 2 commits April 23, 2019 17:29

Add evaluateCatalog

bea9032

This code block had already turned up twice, and I'll need it again for `calibrateCatalog`, so pull it out into a private method.

Add calibrateCatalog with tests

5abe592

Refactor catalog tests to make them safer.

parejkoj force-pushed the tickets/DM-10155 branch from 2807ca8 to 5abe592 Compare April 24, 2019 00:31

parejkoj merged commit 23220f5 into master Apr 24, 2019

timj deleted the tickets/DM-10155 branch February 18, 2021 00:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

parejkoj commented Apr 20, 2019

TallJimbo left a comment

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019 •

edited

parejkoj Apr 23, 2019

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019

parejkoj Apr 23, 2019 •

edited

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019 •

edited

TallJimbo Apr 23, 2019

TallJimbo Apr 23, 2019

DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

Conversation

parejkoj commented Apr 20, 2019

TallJimbo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TallJimbo Apr 23, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

parejkoj Apr 23, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TallJimbo Apr 23, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TallJimbo Apr 23, 2019 •

edited

parejkoj Apr 23, 2019 •

edited

TallJimbo Apr 23, 2019 •

edited