DM-13055 Reject NaN centroid sigmas #98

parejkoj · 2017-12-19T02:22:42Z

No description provided.

TallJimbo

A couple minor comments; nothing critical.

My only big-picture critique is that these SourceSelectors seem quite rigid, and I think the proximate cause is that not enough of these tests are configurable - perhaps this new test should be, at least, so code that doesn't care about centroid uncertainty will see no change in behavior?

I said "proximate" because I think ultimate cause of this rigidity is that Python polymorphism and pex_config are just not an ideal way to express a parameterized filter expression on tabular data (compared to, say, a user-specified SQL where clause). Of course, fixing that would require a lot more work on our table infrastructure, even if it's "just" finding a way to move to Pandas.

TallJimbo · 2017-12-19T14:58:48Z

python/lsst/meas/algorithms/astrometrySourceSelector.py

@@ -122,12 +124,16 @@ def _hasCentroid_vector(self, sourceCat):
        """Return True for each source that has a valid centroid"""
        return np.isfinite(sourceCat.get(self.centroidXKey)) \
            & np.isfinite(sourceCat.get(self.centroidYKey)) \
+            & np.isfinite(sourceCat.get(self.centroidXSigmaKey)) \
+            & np.isfinite(sourceCat.get(self.centroidYSigmaKey)) \
            & ~sourceCat.get(self.centroidFlagKey)


I've always been a little uncomfortable with using binary & as a replacement for numpy.logical_and (I'd probably use logical_and.reduce here); we're sure nothing in this expression could cause one of the arguments to be promoted from bool to int and cause all the logic to go awry?

I've never really had an issue with the binary & for np expressions, but I can convert to using np.logical_and

I actually don't know of anyone having trouble either, so maybe I'm just paranoid. Just thought I'd bring it up.

I just tried some tests, and it looks like np.logical_and.reduce() and np.bitwise_and.reduce() is significantly slower than chained &:

... In [27]: aa=np.random.random(1000000) > .5 In [28]: %timeit np.logical_and.reduce((xx,yy,zz,aa)) 669 µs ± 47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) In [29]: %timeit xx & yy & zz & aa 182 µs ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each) In [35]: %timeit np.bitwise_and.reduce((xx,yy,zz,aa)) 651 µs ± 9.46 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So, I'd rather go with &, even though your concern isn't crazy.

TallJimbo · 2017-12-19T15:00:19Z

python/lsst/meas/algorithms/astrometrySourceSelector.py

-        return np.all(np.isfinite(centroid)) and not source.getCentroidFlag()
+        centroidErr = source.getCentroidErr()
+        return np.all(np.isfinite(centroid)) and np.all(np.isfinite(centroidErr)) \
+            and not source.getCentroidFlag()


np.all(np.isfinite(centroid)) should always be unnecessary. If you really want to check, you could assert on it - it should be finite even when source.getCentroidFlag() is true.

Ok, I guess I can turn the centroid finiteness checks into asserts, but I'm going to argue in an RFC shortly that the errors should also be finite for unflagged sources.

parejkoj · 2017-12-19T19:19:17Z

My only big-picture critique is that these SourceSelectors seem quite rigid, and I think the proximate cause is that not enough of these tests are configurable - perhaps this new test should be, at least, so code that doesn't care about centroid uncertainty will see no change in behavior?

astrometrySourceSelector is rigid by design: its job is to select sources that are useful for astrometric measurements, which has a pretty clear definition that includes "reliable centroids". I don't see how "reliable centroid" doesn't include "reliable centroid error".

TallJimbo · 2017-12-20T15:59:37Z

python/lsst/meas/algorithms/astrometrySourceSelector.py

+                    ~np.isfinite(sourceCat.get(self.centroidYKey))) &
+                    ~sourceCat.get(self.centroidFlagKey))
+        assert ~centroid_nonfinite_but_not_flagged().any(), \
+            "Centroids not finite for %d unflagged sources." % (centroid_nonfinite_but_not_flagged().sum())


You can actually just do

assert np.isfinite(sourceCat.get(self.centroidXKey)).all() and np.isfinite(sourceCat.get(self.centroidYKey)).all()

In other words, centroid values should never be NaN, even when the flag is set. And that's why I wanted to make that an assert: if a centroid algorithm disobeys that rule, it's a serious bug.

Also, I'm obligated to remind you that snake case isn't allowed :(

TallJimbo · 2017-12-20T16:01:15Z

python/lsst/meas/algorithms/astrometrySourceSelector.py

-        centroid = source.getCentroid()
-        return np.all(np.isfinite(centroid)) and not source.getCentroidFlag()
+        assert np.all(np.isfinite(source.getCentroid())) and not source.getCentroidFlag(), \
+            'Centroid not finite for unflagged source: %s'%source


Same as above: no need to check for centroid flag in this assert.

TallJimbo · 2017-12-20T16:03:33Z

tests/test_astrometrySourceSelector.py

+        Currently, there is no guarantee that a valid centroid_flag implies
+        a finite variance."""
+        add_good_source(self.src, 1)
+        add_good_source(self.src, 2)


More snake case. Looks like a lot of it predates these changes; maybe file a ticket to fix that later?

assert on nonfinite centroids, flagged or not.

TallJimbo approved these changes Dec 19, 2017

View reviewed changes

TallJimbo reviewed Dec 20, 2017

View reviewed changes

Reject NaN centroid sigmas, with test

9d36d91

assert on nonfinite centroids, flagged or not.

parejkoj force-pushed the tickets/DM-13055 branch from fff379f to 9d36d91 Compare December 20, 2017 22:29

parejkoj merged commit 9d36d91 into master Dec 21, 2017

ktlim deleted the tickets/DM-13055 branch August 25, 2018 06:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-13055 Reject NaN centroid sigmas #98

DM-13055 Reject NaN centroid sigmas #98

parejkoj commented Dec 19, 2017

TallJimbo left a comment

TallJimbo Dec 19, 2017

parejkoj Dec 19, 2017

TallJimbo Dec 19, 2017

parejkoj Dec 19, 2017

TallJimbo Dec 19, 2017

parejkoj Dec 19, 2017

parejkoj commented Dec 19, 2017

TallJimbo Dec 20, 2017

TallJimbo Dec 20, 2017

TallJimbo Dec 20, 2017

TallJimbo Dec 20, 2017

DM-13055 Reject NaN centroid sigmas #98

DM-13055 Reject NaN centroid sigmas #98

Conversation

parejkoj commented Dec 19, 2017

TallJimbo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

parejkoj commented Dec 19, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment