Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-10155: Implement in-place SourceCatalog operators in PhotoCalib #457

Merged
merged 6 commits into from Apr 24, 2019

Conversation

parejkoj
Copy link
Contributor

Also vectorize catalog array operations for a big speedup on AST-backed BoundedFields.

AST-backed BoundedFields (e.g. TransformBoundedField) perform much better on
arrays of points, than single operations. This change gives about a 17x
speedup for `photoCalib.instFluxToNanojansky(src, 'slot_PsfFlux')` for a
TransformBoundedField-based photoCalib. It is about 25% slower for spatially-
constant PhotoCalibs but still below a millisecond, so that difference
should not matter.
@parejkoj parejkoj changed the title DM-10155 Implement in-place SourceCatalog operators in PhotoCalib DM-10155: Implement in-place SourceCatalog operators in PhotoCalib Apr 22, 2019
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks great; thanks in particular for taking the time to vectorize the AST calls in a way that doesn't require contiguous catalogs.

I've made a bunch of minor stylistic comments that I hope will be helpful.

*
* @return The calibrated catalog, with new flux and magnitude (and their errors) fields.
*
* @throws lsst::pex::exceptions::NotFoundError if any item in `instfluxField` does not have a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instfluxField -> instFluxFields.

/**
* Return the calibration evaluated at the centroids of a SourceCatalog.
*/
ndarray::Array<double, 1> evaluateCatalog(afw::table::SourceCatalog const &sourceCatalog) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add blank lines between methods.

std::vector<table::Key<double>> magKeys;
magKeys.reserve(instFluxFields.size());
std::vector<table::Key<double>> magErrKeys;
magErrKeys.reserve(instFluxFields.size());
Copy link
Member

@TallJimbo TallJimbo Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't bad by any means, but it strikes me as a place where a little local struct to hold those keys might be a bit better - i.e. you'd just have a single vector of that struct, and then the inner for loop could just be a range based for, and there'd be a lot more simple attribute assignments instead of paren-filled push_back() and back() calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that suggestion. I think that does improve clarity.

instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key);
try {
instFluxErrKeys.push_back(inSchema.find<double>(field + "_instFluxErr").key);
} catch (pex::exceptions::NotFoundError) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though you're not doing anything with this exception, I think you still want to catch it by reference, i.e. catch (pex::exceptions::NotFoundError&). I'm not positive there would be any ill effects from this as is, and I don't even remember in detail what the ill effects of catching by value are, but it's safest to just always catch by reference.

std::vector<table::Key<double>> magErrKeys;
magErrKeys.reserve(instFluxFields.size());
for (auto const &field : instFluxFields) {
instFluxKeys.push_back(inSchema.find<double>(field + "_instFlux").key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this is switched to struct field assigment (per above comment), you could actually use an even nicer syntax for this:

s.instFlux = inSchema[field + "_instFlux"];  // assuming s is the struct, and instFlux is a field of type Key<double>.

That is, with the assignment form, it can infer the fact that it's a double from what you're assigning it to (which is why this kind of syntax isn't usable in Python as often).

I'd also say it'd be slightly better to use inSchema.join(field, "instFlux") instead of field + "_instFlux" to avoid referring directly to the delimiter. We don't do that across the board, but I'm trying to encourage it in new code.

for (auto const &name : catalog.getSchema().getNames()) {
// Pick every field ending in "_instFlux", grabbing everything before that prefix.
if (name.size() > 10 && name.compare(name.size() - 9, 9, "_instFlux") == 0) {
instFluxFields.emplace_back(name.substr(0, name.size() - 9));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do something like

constexpr std::string SUFFIX = "_instFlux";

above the for loop, and then refer to SUFFIX.size() instead of using 9 and 10 explicitly.

Copy link
Contributor Author

@parejkoj parejkoj Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constexpr std::string is not allowed. Is it better to make it a non-constexpr string, or use a constexpr char[]? The latter makes for less nice syntax for computing the size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd use static std::string const SUFFIX = "_instFlux";, then.

@@ -465,38 +544,68 @@ double PhotoCalib::evaluate(lsst::geom::Point<double, 2> const &point) const {
return _calibration->evaluate(point);
}

ndarray::Array<double, 1> PhotoCalib::evaluateArray(ndarray::Array<double, 1> xx,
ndarray::Array<double, 1> yy) const {
Copy link
Member

@TallJimbo TallJimbo Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention is to pass arrays by const reference: ndarray::Array<double, 1> const & xx.

It's not a big deal either way (neither is our opposite convention of passing shared_ptr by value), but best to be consistent.

void PhotoCalib::instFluxToNanojanskyArray(afw::table::SourceCatalog const &sourceCatalog,
std::string const &instFluxField,
ndarray::Array<double, 2, 2> result) const {
double instFlux, instFluxErr, nanojansky, calibration;
double instFlux, instFluxErr, nanojansky; //, calibration;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine (i.e. just as efficient) and probably slightly better form to declare these on first use, even though that puts them inside the loop - it's always good to avoid uninitialized variables, and for stack variables of trivial types like double you don't need to worry about the cost of construction being different from the cost of assignment.

}
}

void PhotoCalib::instFluxToMagnitudeArray(afw::table::SourceCatalog const &sourceCatalog,
std::string const &instFluxField,
ndarray::Array<double, 2, 2> result) const {
double instFlux, instFluxErr, calibration;
double instFlux, instFluxErr; //, calibration;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above about declaring these on first use inside the loop.

This code block had already turned up twice, and I'll need it again for
`calibrateCatalog`, so pull it out into a private method.
Refactor catalog tests to make them safer.
@parejkoj parejkoj merged commit 23220f5 into master Apr 24, 2019
@timj timj deleted the tickets/DM-10155 branch February 18, 2021 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants