Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-12207: Add color selection for photocal #286

Merged
merged 3 commits into from Nov 7, 2017
Merged

Conversation

PaulPrice
Copy link
Contributor

No description provided.

Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several comments, though I've kept my grumbling about adding functionality to Calib to myself (until now).

ndarray::Array<T, 1> abMagErrFromFluxErr(ndarray::Array<T, 1> const& fluxErr,
ndarray::Array<T, 1> const& flux) {
if (flux.getNumElements() != fluxErr.getNumElements()) {
throw LSST_EXCEPT(pex::exceptions::LengthError, "Length mismatch");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more explicit "Length of flux and fluxErr arrays are mismatched: XX vs. YY" might be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was in a header file, and I was both being lazy and wanting to avoid the extra includes that stringifying the lengths requires. I've now moved the definitions out of the header and expanded the error string to include the lengths.

@@ -127,8 +127,18 @@ def __getitem__(self, key):
stop = len(self)
return self.subset(start, stop, step)
elif isinstance(key, np.ndarray):
if not self.isContiguous():
cat = self.__class__(self.table)
for rr, ok in zip(self, key):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block feels somewhat magical: can we have a comment explaining how it's supposed to work?

if not self.isContiguous():
if isinstance(key, basestring):
key = self.schema[key].asKey()
return np.array([rr.get(key) for rr in self])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is returning a numpy array instead of an afw table. That seems bad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't the whole idea that when you index a Catalog with a Key you do in fact get a numpy array? Or are you objecting that mutating the returned array would not mutate the Catalog?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would appreciate @TallJimbo's view on these changes to afw table too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're worried about the mutation problem I mentioned, what if we made the returned array read only?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I didn't know that these returned a numpy array instead of an afw.Table. That's somewhat surprising to me: I would expect Thing.__getitem__[slice] to return another Thing.

(thinking out loud here) But I guess Thing[Key] gets a column, which is like an array. That's not obvious from the alternative return self.subset(key), which I had trouble finding because it's defined in C++. Maybe we should make the docstring for __getitem__ more clear about the types of the possible return values? (says the python person who's clearly been staring at too much C++ lately...)

You are correct though that the mutation question is important: subset is explicitly a shallow copy, so this would cause a different behavior.

@@ -26,16 +26,18 @@
import unittest
import math
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need math anymore, now that you've imported numpy?

def testVector(self):
flux = np.array([1.0, 210.0, 3210.0, 43210.0, 543210.0])
abMag = afwImage.abMagFromFlux(flux)
np.testing.assert_array_almost_equal(abMag, refABMagFromFlux(flux))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest using self.assertFloatsAlmostEqual for your comparisons here: it's more strict that np.testing.

https://developer.lsst.io/coding/python_testing.html#special-asserts

@PaulPrice
Copy link
Contributor Author

Revised according to comments.

@PaulPrice PaulPrice force-pushed the tickets/DM-12207 branch 2 times, most recently from 8d39c50 to ab2fab3 Compare November 3, 2017 05:29

/// Compute AB magnitude from flux in Janskys
template <typename T>
ndarray::Array<T, 1> abMagFromFlux(ndarray::Array<T const, 1> const& flux);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think about it, I think these functions should be abMagFromJansky, to make the units very explicit (especially given RFC-356).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an API change that would require an RFC. I would support such an RFC, but do not want to lead one, at least not now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fair. I guess we can just leave it until we throw out all of Calib for PhotoCalib.

@TallJimbo
Copy link
Member

There two reasons why single-column array access hasn't been supported for noncontiguous catalogs thus far:

  • I was worried about sometimes returning a view and sometimes not returning a view being surprising behavior to users. The actual usage patterns have convinced me this fear was unfounded, and returning const arrays (as you have done here) mitigates that further.
  • I didn't want users to think they were doing something efficient when they weren't. I was originally worried about having a lot of unexpected memory allocations (because people were getting deep copies instead of views), but I think that fear was also unfounded. However, this is definitely a problem with the Python implementation here - people will think they're doing NumPy vectorization, but they're actually writing a lot of pure-Python for loops.

So, I'm hesitant but willing to sign off on this as is, with the provision that we open an issue to move the implementation to C++. If you have the time, I'd prefer to do that on this ticket, even if it just goes into a templated free function in the pybind11 wrappers (there isn't really a need for this functionality at all in C++, because for loops over records there are just as fast).

@PaulPrice
Copy link
Contributor Author

Moved the iteration into the pybind layer.

Without it, we can't index a SimpleCatalog by a boolean numpy array.
We can extract what we need by iterating over the catalog, giving the
user what he wants without him having to worry about contiguity.
@PaulPrice PaulPrice merged commit 03e6c45 into master Nov 7, 2017
@ktlim ktlim deleted the tickets/DM-12207 branch August 25, 2018 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants