New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457
Conversation
AST-backed BoundedFields (e.g. TransformBoundedField) perform much better on arrays of points, than single operations. This change gives about a 17x speedup for `photoCalib.instFluxToNanojansky(src, 'slot_PsfFlux')` for a TransformBoundedField-based photoCalib. It is about 25% slower for spatially- constant PhotoCalibs but still below a millisecond, so that difference should not matter.
3cc133a
to
2807ca8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic looks great; thanks in particular for taking the time to vectorize the AST calls in a way that doesn't require contiguous catalogs.
I've made a bunch of minor stylistic comments that I hope will be helpful.
include/lsst/afw/image/PhotoCalib.h
Outdated
* | ||
* @return The calibrated catalog, with new flux and magnitude (and their errors) fields. | ||
* | ||
* @throws lsst::pex::exceptions::NotFoundError if any item in `instfluxField` does not have a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instfluxField
-> instFluxFields
.
/** | ||
* Return the calibration evaluated at the centroids of a SourceCatalog. | ||
*/ | ||
ndarray::Array<double, 1> evaluateCatalog(afw::table::SourceCatalog const &sourceCatalog) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add blank lines between methods.
src/image/PhotoCalib.cc
Outdated
std::vector<table::Key<double>> magKeys; | ||
magKeys.reserve(instFluxFields.size()); | ||
std::vector<table::Key<double>> magErrKeys; | ||
magErrKeys.reserve(instFluxFields.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't bad by any means, but it strikes me as a place where a little local struct
to hold those keys might be a bit better - i.e. you'd just have a single vector
of that struct, and then the inner for
loop could just be a range based for
, and there'd be a lot more simple attribute assignments instead of paren-filled push_back()
and back()
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that suggestion. I think that does improve clarity.
src/image/PhotoCalib.cc
Outdated
instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key); | ||
try { | ||
instFluxErrKeys.push_back(inSchema.find<double>(field + "_instFluxErr").key); | ||
} catch (pex::exceptions::NotFoundError) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though you're not doing anything with this exception, I think you still want to catch it by reference, i.e. catch (pex::exceptions::NotFoundError&)
. I'm not positive there would be any ill effects from this as is, and I don't even remember in detail what the ill effects of catching by value are, but it's safest to just always catch by reference.
src/image/PhotoCalib.cc
Outdated
std::vector<table::Key<double>> magErrKeys; | ||
magErrKeys.reserve(instFluxFields.size()); | ||
for (auto const &field : instFluxFields) { | ||
instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once this is switched to struct field assigment (per above comment), you could actually use an even nicer syntax for this:
s.instFlux = inSchema[field + "_instFlux"]; // assuming s is the struct, and instFlux is a field of type Key<double>.
That is, with the assignment form, it can infer the fact that it's a double
from what you're assigning it to (which is why this kind of syntax isn't usable in Python as often).
I'd also say it'd be slightly better to use inSchema.join(field, "instFlux")
instead of field + "_instFlux"
to avoid referring directly to the delimiter. We don't do that across the board, but I'm trying to encourage it in new code.
for (auto const &name : catalog.getSchema().getNames()) { | ||
// Pick every field ending in "_instFlux", grabbing everything before that prefix. | ||
if (name.size() > 10 && name.compare(name.size() - 9, 9, "_instFlux") == 0) { | ||
instFluxFields.emplace_back(name.substr(0, name.size() - 9)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could do something like
constexpr std::string SUFFIX = "_instFlux";
above the for loop, and then refer to SUFFIX.size()
instead of using 9
and 10
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constexpr std::string
is not allowed. Is it better to make it a non-constexpr string, or use a constexpr char[]
? The latter makes for less nice syntax for computing the size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd use static std::string const SUFFIX = "_instFlux";
, then.
src/image/PhotoCalib.cc
Outdated
@@ -465,38 +544,68 @@ double PhotoCalib::evaluate(lsst::geom::Point<double, 2> const &point) const { | |||
return _calibration->evaluate(point); | |||
} | |||
|
|||
ndarray::Array<double, 1> PhotoCalib::evaluateArray(ndarray::Array<double, 1> xx, | |||
ndarray::Array<double, 1> yy) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention is to pass arrays by const reference: ndarray::Array<double, 1> const & xx
.
It's not a big deal either way (neither is our opposite convention of passing shared_ptr
by value), but best to be consistent.
src/image/PhotoCalib.cc
Outdated
void PhotoCalib::instFluxToNanojanskyArray(afw::table::SourceCatalog const &sourceCatalog, | ||
std::string const &instFluxField, | ||
ndarray::Array<double, 2, 2> result) const { | ||
double instFlux, instFluxErr, nanojansky, calibration; | ||
double instFlux, instFluxErr, nanojansky; //, calibration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine (i.e. just as efficient) and probably slightly better form to declare these on first use, even though that puts them inside the loop - it's always good to avoid uninitialized variables, and for stack variables of trivial types like double
you don't need to worry about the cost of construction being different from the cost of assignment.
src/image/PhotoCalib.cc
Outdated
} | ||
} | ||
|
||
void PhotoCalib::instFluxToMagnitudeArray(afw::table::SourceCatalog const &sourceCatalog, | ||
std::string const &instFluxField, | ||
ndarray::Array<double, 2, 2> result) const { | ||
double instFlux, instFluxErr, calibration; | ||
double instFlux, instFluxErr; //, calibration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above about declaring these on first use inside the loop.
This code block had already turned up twice, and I'll need it again for `calibrateCatalog`, so pull it out into a private method.
Refactor catalog tests to make them safer.
2807ca8
to
5abe592
Compare
Also vectorize catalog array operations for a big speedup on AST-backed BoundedFields.