DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22

czwa · 2019-09-18T21:20:48Z

Add Interval classes
Add vectorized contains() and transforms to Boxes and Intervals
Add Interval APIs to Boxes()

kfindeisen

I've only gotten through 42c5464 so far, but as this one commit is essentially an entire ticket I thought it best to post what I have for now.

The biggest issues:

- Consider breaking up this commit into smaller pieces, if possible. One natural division would be core functionality versus bells and whistles.
- The IntervalI and IntervalD classes have a very long list of methods, most of which seem nonessential to the concept of an "interval". Please consider cleaning up the API by separating these methods from the class definition; it will make the classes easier to learn and maintain. Already approved on RFC-593.
- There are a number of edge cases (zero-length intervals, infinite intervals, half-infinite intervals, integer overflow) that don't appear to be picked up by current tests. This includes the converting constructors between IntervalI and IntervalD, which don't seem to be tested at all. I've found a few bugs by algorithm analysis; please expand test coverage to include these cases.
- Both intervals' implementations rely on some class invariants. Please document these invariants explicitly and verify that each method preserves them:
  - IntervalI assumes that if _size == 0, then _min == 0.
  - IntervalD assumes that _min and _max are either both NaN or both not NaN.
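
The two invariants listed above can be written down as executable checks. This is only a sketch: `ToyIntervalI`, `ToyIntervalD`, and `invariant_holds` are hypothetical stand-ins that model nothing but the private state named in the review, not the real lsst.geom classes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIntervalI:
    """Hypothetical stand-in for IntervalI; only _min and _size are modeled."""

    _min: int
    _size: int

    def invariant_holds(self) -> bool:
        # If _size == 0, then _min must be 0.
        return self._size != 0 or self._min == 0


@dataclass(frozen=True)
class ToyIntervalD:
    """Hypothetical stand-in for IntervalD; only _min and _max are modeled."""

    _min: float
    _max: float

    def invariant_holds(self) -> bool:
        # _min and _max are either both NaN or both not NaN.
        return math.isnan(self._min) == math.isnan(self._max)
```

A test suite could call a check like this after every factory or mutating method, which is one way to "verify that each method preserves them".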

include/lsst/geom/Interval.h

kfindeisen · 2019-09-25T17:30:43Z

include/lsst/geom/Interval.h

+ *
+ *  All IntervalI methods that mutate self or return a new instance (and are
+ *  not marked `noexcept`) throw OverflowError if either bound or the size
+ *  would be too large to fit in `int`.


Missing backticks for bound and size? I'm not clear on what "bound" is.

I was trying to use "bound" as shorthand for "upper or lower bound", but apparently that doesn't work; I'll write out the longer version. I don't think backticks are appropriate because there isn't actually a C++ symbol named "size".
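
To make the "either bound or the size" wording concrete, here is a rough sketch of the kind of check the doc comment describes. It assumes a 32-bit `int` and inclusive bounds, and it uses Python's built-in `OverflowError` in place of the LSST exception; `checked_from_min_max` is a hypothetical name, not part of the API.

```python
from typing import Tuple

# Assumption: C++ `int` is 32 bits wide.
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def checked_from_min_max(lo: int, hi: int) -> Tuple[int, int]:
    """Return (min, size) for the inclusive interval [lo, hi].

    The sketch assumes lo <= hi; empty intervals are out of scope here.
    """
    size = hi - lo + 1
    # The lower bound, the upper bound, and the size must each fit in int.
    for name, value in (("lower bound", lo), ("upper bound", hi), ("size", size)):
        if not INT_MIN <= value <= INT_MAX:
            raise OverflowError(f"{name} {value} does not fit in int")
    return lo, size
```

Note that the size can overflow even when both bounds fit: `[0, INT_MAX]` has `INT_MAX + 1` elements.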

include/lsst/geom/Interval.h

tests/test_interval.py

python/lsst/geom/_Interval.py

tests/test_interval.py

include/lsst/geom/Box.h

python/lsst/geom/_Box.cc

python/lsst/geom/_coordinates.cc

src/Box.cc

include/lsst/geom/AffineTransform.h

kfindeisen · 2019-09-26T20:07:43Z

python/lsst/geom/_AffineTransform.cc

@@ -64,6 +64,12 @@ void wrapAffineTransform(utils::python::WrapperCollection & wrappers) {
                    py::overload_cast<Point2D const &>(&AffineTransform::operator(), py::const_));
            cls.def("__call__",
                    py::overload_cast<Extent2D const &>(&AffineTransform::operator(), py::const_));
+            cls.def("__call__",
+                    [](py::object self, py::object x, py::object y) mutable {


Redundant mutable.

This seems like a really roundabout way of supporting two array arguments. Why not just have a lambda that takes two py::array_t and calls vectorize internally?

~~This approach lets numpy do broadcasting when the shapes are compatible but not identical. I'll add a comment to that effect.~~

Never mind. Was confusing this with something else. But why is a lambda that calls vectorize better? What you describe seems a lot easier to get wrong, because this delegates more to existing APIs - it only looks awkward at all because it's written in C++ rather than pure Python (and if there was already a .py file for this class, I'd have added it there - but I don't think it's enough reason on its own to add one).

The code looks like it jumps into Python, then goes back to C++, then back to Python, before returning in C++. 😵. vectorize is an existing function that basically does what you're trying to do.

I think you can at least have the parameter types be py::array_t instead of py::object.

I don't think "jumps into Python" is actually a meaningful thing - it's just calling the CPython runtime, and I don't think that's a big deal in a pybind11 file. But I think the right resolution is to just create _AffineTransform.py and put it there, and prevent the poor readability of this from causing any more trouble.

Ugh, scratch that. The fact that this is overloaded changes things - it makes it much more complicated to move just one overload to Python. But using vectorize doesn't actually simplify things (I didn't realize this until now), because it doesn't actually call the function you give it - it creates a Python callable that calls the function you give it, so the lambda would actually look something like this:

    [](py::object self, py::object x, py::object y) {
        auto applyX = py::vectorize(&AffineTransform::applyX);
        auto applyY = py::vectorize(&AffineTransform::applyY);
        return py::make_tuple(applyX(x, y), applyY(x, y));
    }

That of course involves creating the vectorized callables in every call, which is bad, so a better variant would be to define applyX and applyY as above outside the lambda, and then both capture them in the lambda and use them in the definition of the Python-side applyX and applyY methods (if we decide to keep those).

But I don't think getting those callables via lambda captures is any better than looking them up as attributes of self - if anything, I think it makes the reader think about whether the capture is safe from a Python reference-counting perspective (I'm quite confident it is safe, but I don't think it's at all obvious from looking at the code).

I'll add some comments explaining why things are the way they are both here and in LinearTransform (and, of course, remove the errant mutable). I would rather not switch to array_t instead of object as that will just make pybind11 do the same type-checking on those arguments multiple times.
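
For readers following the thread, this is roughly what the two-argument `__call__` overload would look like if it lived in a pure-Python file, as discussed above. `ToyAffine` and its coefficients are hypothetical; the real `applyX` and `applyY` are vectorized pybind11 callables, whereas the ones here are plain arithmetic that would only broadcast if handed array-like inputs.

```python
class ToyAffine:
    """Hypothetical 2-D affine map: (x, y) -> (a*x + b*y + tx, c*x + d*y + ty)."""

    def __init__(self, a, b, c, d, tx, ty):
        self.a, self.b, self.c, self.d = a, b, c, d
        self.tx, self.ty = tx, ty

    def applyX(self, x, y):
        # x component of the transformed point(s).
        return self.a * x + self.b * y + self.tx

    def applyY(self, x, y):
        # y component of the transformed point(s).
        return self.c * x + self.d * y + self.ty

    def __call__(self, x, y):
        # Same shape as the lambda under review: look both methods up on
        # self and return their results as a tuple.
        return self.applyX(x, y), self.applyY(x, y)
```

Written this way the delegation is a single line, which is the readability argument for moving it to Python; the overloading of `__call__` on the C++ side is what makes that move awkward.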

tests/test_transforms.py

TallJimbo · 2019-10-03T15:55:27Z

@kfindeisen, thanks for the careful review, and apologies for the state of the code when you started. I think I've now addressed most concerns. In the hopes of making the next stage of this review easier on you, I've put all changes on new commits, most of which follow and will eventually be squashed into the original commits you commented on. That means even the originals have been rebased, but their diffs should be the same. I'm also happy to squash now if you'd prefer. The improvements to remove in-place transformation operations (for RFC-593) and to fix the handling of overflow and non-finite values were pretty far-reaching, and I could imagine those changes in particular being easier to look at after squashing.

Here are what I think are the remaining open questions:

Consider breaking up this commit into smaller pieces, if possible. One natural division would be core functionality versus bells and whistles.

While this no doubt would have made the original review easier had I done it before, I don't really see the gain now (and I don't really see a clear dividing line for "core functionality"). It's all new, so it's still an atomic commit (or will be, after squashing fixes the explicit issues you caught with some changes being on the wrong commits). That said, the interval classes are already a bit smaller (because the in-place transformers have been removed) and I've got another proposal below to make them a bit smaller still.
Several in-line comment threads are about whether APIs should exist in C++ or Python just for consistency between languages or between Interval and Box, including:
- Box.contains(x, y) in C++ (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))
- IntervalI.getSlices in C++ (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))
- Transform.applyX, Transform.applyY in C++ (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))
- Interval.contains (not just __contains__) in Python (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))
I would like to resolve those the same way. I don't have a super strong opinion, but I do tend to think that consistency is more important than reducing clutter.
Should IntervalI define __iter__ and __len__ like the range built-in, or have a method to obtain a range (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))? I'm now leaning towards the latter, but have not made the change.
Should IntervalI.slice (and Box2I.slices) exist as properties (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment))? Happy to go with your recommendation on this, but wanted to make you aware of my original reasoning first.
Should the reflect operations be removed? Your comment on the incorrectness of the previous implementation (DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment)) made me realize that these don't satisfy the use case I originally had in mind for them - an eventual replacement of old operations whose behavior I don't understand at all (DM-21487).
I've fixed the incorrect implementation of IntervalD -> IntervalI you noted at DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22 (comment). I've also added a new commit (28e7006) to change the corresponding Box conversion to delegate to the Interval one. That changes Box behavior in an unanticipated way that I think is both more correct and a bit inconvenient (in that some conversions no longer round-trip). My current plan is to leave it in if it doesn't break anything in Jenkins, but to move it to a different ticket if it does.
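
The `__iter__`/`__len__` question in the list above comes down to two API shapes, sketched here side by side. Both classes and the `to_range` name are hypothetical, and the sketch assumes an inclusive maximum.

```python
class RangeLikeInterval:
    """Option 1: the interval itself behaves like the built-in range."""

    def __init__(self, min_: int, max_: int) -> None:
        self._min, self._max = min_, max_

    def __iter__(self):
        return iter(range(self._min, self._max + 1))

    def __len__(self) -> int:
        return self._max - self._min + 1


class IntervalWithRange:
    """Option 2: the interval hands out a range on request."""

    def __init__(self, min_: int, max_: int) -> None:
        self._min, self._max = min_, max_

    def to_range(self) -> range:
        # Inclusive [min, max] becomes the half-open range(min, max + 1).
        return range(self._min, self._max + 1)
```

Option 2 keeps the class itself smaller, which fits the broader goal in this review of trimming the interval API.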

kfindeisen

Thanks for your patience, the new code looks much better. The biggest outstanding issue is that the IntervalD to IntervalI conversion still doesn't work in a lot of edge cases.

kfindeisen · 2019-10-07T19:52:48Z

include/lsst/geom/Interval.h

+ *  IntervalI sets the minimum point to the origin for an empty interval, and
+ *  returns -1 for both elements of the maximum point in that case.


Given your claim in the review discussion that you don't want to guarantee a particular position for IntervalI, consider removing this statement.

👍 I'll move the whole @internal block to a code comment, as it's for maintainers and not for users, and we haven't established any conventions for using @internal for that (and this is just a relic of me haphazardly using it for that once upon a time).

I'm confused now. Is empty IntervalI any state with _size <= 0, or specifically _min = _size = 0?

An empty IntervalI is guaranteed to have getSize() == 0, but the results of calls to getMin or getMax are unspecified for an empty interval. APIs that construct an IntervalI given a size can be expected to produce an empty interval if the given size is not positive.

Internally, IntervalI also maintains the invariant that empty implies _min = _size = 0, but this is not part of its documented interface.
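
A small sketch of the distinction drawn here between the documented contract and the internal invariant. `EmptyAwareIntervalI` and `is_empty` are hypothetical names; only `getSize` appears in the discussion.

```python
class EmptyAwareIntervalI:
    """Hypothetical integer interval built from a minimum and a size."""

    def __init__(self, min_: int, size: int) -> None:
        if size <= 0:
            # Internal normalization (empty implies _min = _size = 0).
            # Callers should not rely on this; it is not documented behavior.
            min_, size = 0, 0
        self._min, self._size = min_, size

    def getSize(self) -> int:
        # Documented contract: an empty interval always reports size 0.
        return self._size

    def is_empty(self) -> bool:
        return self._size == 0
```

Because only `getSize() == 0` is promised, the sketch deliberately omits getters for the minimum and maximum of an empty interval.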

kfindeisen · 2019-10-07T19:59:00Z

include/lsst/geom/Interval.h

+     *  auto array = ndarray::copy(ndarray::arange(5));
+     *  auto interval = IntervalI::fromMinMax(2, 4);
+     *  auto subarray = array[interval.getSlice()];
+     *  @endcode


The example helps a lot, thanks. Note that the style guide recommends using Markdown (blank line and 4-space indent for code blocks); it's more readable than Doxygen tags. The guide includes an example examples section.

👍 I always forget that Doxygen can do full Markdown now.
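
The same idea in plain Python, for comparison with the C++ doc example. This assumes `fromMinMax` takes an inclusive maximum; `get_slice` is a hypothetical free function standing in for `IntervalI.getSlice`.

```python
def get_slice(min_: int, max_: int) -> slice:
    """Convert the inclusive interval [min_, max_] to a half-open slice."""
    return slice(min_, max_ + 1)


array = list(range(5))             # [0, 1, 2, 3, 4]
subarray = array[get_slice(2, 4)]  # elements at indices 2, 3 and 4
```

Under that assumption `subarray` is `[2, 3, 4]`: three elements, matching the size of the interval.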

kfindeisen · 2019-10-07T20:01:45Z

include/lsst/geom/Interval.h


-    //@{
+    //@{x


kfindeisen · 2019-10-07T20:38:24Z

include/lsst/geom/Interval.h

@@ -444,43 +451,53 @@ class IntervalD final {
     *  @param[in] max       Maximum coordinate (inclusive).
     *
     *  If `min > max` or either is NaN, an empty interval is returned.
+     *
+     *  @throws lsst::pex::exceptions::InvalidParameterError  Thrown if `min`
+     *      and `max` are both +infinity or both -infinity.


Given your statement that you specifically want half-bounded intervals when they can be defined unambiguously, I'm surprised you're disallowing the unbounded interval.

I'm only disallowing min == max == -inf and min == max == +inf; min == -inf, max > -inf and min < inf, max == inf yield valid intervals with infinite size, and min > max -> empty covers the remaining cases. Will clarify.
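
The case analysis in that reply can be sketched as follows. `from_min_max` is a hypothetical function that returns `None` for an empty interval, and `ValueError` stands in for `lsst::pex::exceptions::InvalidParameterError`.

```python
import math
from typing import Optional, Tuple


def from_min_max(lo: float, hi: float) -> Optional[Tuple[float, float]]:
    """Return (min, max), or None for an empty interval."""
    # NaN in either bound, or min > max, yields an empty interval.
    if math.isnan(lo) or math.isnan(hi) or lo > hi:
        return None
    # The only rejected inputs: both bounds +inf, or both bounds -inf.
    if lo == hi and math.isinf(lo):
        raise ValueError("min and max are both +infinity or both -infinity")
    # Everything else is valid, including half-infinite intervals and
    # the fully unbounded interval (-inf, +inf).
    return lo, hi
```

So `from_min_max(-math.inf, math.inf)` is accepted, which answers the original question about the unbounded interval.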

kfindeisen · 2019-10-07T20:41:53Z

include/lsst/geom/Interval.h

-     * @param center The desired center of the interval.
-     * @param size   Number of pixels in interval.
+     *  @param center The desired center of the interval.  May not be infinite.
+     *  @param size   Number of pixels in interval.  May not be infinite.


Why the double spacing in this comment? Editing artifact?

It might be a good idea to run clang-format on these files before merging. I know you normally avoid it, but for such a far-reaching change not running it will cause a lot of surprise changes for later developers.

Why the double spacing in this comment? Editing artifact?

I was actually making these lines consistent with all of the other comments in the file (and at least most of the package).

It might be a good idea to run clang-format on these files before merging. I know you normally avoid it, but for such a far-reaching change not running it will cause a lot of surprise changes for later developers.

Will do, provided it's not too much of a pain for me to get it set up with the right configurations and the results aren't too awful.

kfindeisen · 2019-10-07T22:39:43Z

include/lsst/geom/Box.h

    Box2I dilatedBy(Extent const & buffer) const;
-    Box2I & dilateBy(Extent const & buffer) {
-        return (*this = dilatedBy(buffer)); // delegate mutator to factory for exception safety.
-    }


Don't forget to update the RFC implementation tickets registered in Jira, since I'm pretty sure this is out of the original scope.

👍 I synced the only ticket with branches prior to starting work on addressing review comments here, so I'll just blow away its (older) versions of these commits when rebasing. That will pick up these changes automatically, and leave only the couple of commits that weren't ever on this branch.

kfindeisen · 2019-10-07T22:43:59Z

include/lsst/geom/Box.h

+     *
+     *  Expanding an empty box with a second box is equivalent to assignment.
+     */
+    Box2D expandedTo(Box2D const & other) const noexcept;


Is it possible for a Box2D to be half-infinite, the way an IntervalD can? Thinking of the exception thrown when expanding by a non-finite point.

Good point. Box2D isn't as careful as IntervalD (now) is about which kinds of infinities are valid, so it's entirely possible that trying to extract an x or y interval from the box will throw. I'll drop the noexcept.

src/Box.cc

kfindeisen · 2019-10-07T22:50:21Z

src/Box.cc

    }
+    throw pex::exceptions::LogicError("Invalid enum value.");


I think a default case would be more idiomatic.

tests/test_box.py

kfindeisen · 2019-10-07T23:06:13Z

Consider breaking up this commit into smaller pieces, if possible. One natural division would be core functionality versus bells and whistles.

While this no doubt would have made the original review easier had I done it before, I don't really see the gain now (and I don't really see a clear dividing line for "core functionality").

It will make it easier for anybody reviewing the changes in the future, or who wants to roll back a specific change (which is the original motivation for small commits in the first place, ease of review is just a bonus).

Should the reflect operations be removed? Your comment on the incorrectness of the previous implementation (#22 (comment)) made me realize that these don't satisfy the use case I originally had in mind for them - an eventual replacement of old operations whose behavior I don't understand at all (DM-21487).

YAGNI seems like a good principle for these classes, but I would wait for the resolution of DM-21487 before deciding what to do. It wouldn't surprise me if the old operations were supposed to mean what you think they do, and the test cases were wrong. (And yes, I realize that we'd break a lot of code if we tried to change the behavior of flip* now, I'm not quite that cleanup-happy.)

TallJimbo · 2019-10-09T15:22:12Z

Ok, I think I'm through the second phase of addressing review comments. I still need to squash commits before merge, and as per discussion above we're waiting to see if there's any action on DM-21487 before deciding what to do in one case.

@czwa , once the rest of the branches for this ticket are ready (or nearly ready) to merge, let me know and I'll both squash and take that pending decision even if we haven't gotten any input from the other ticket yet.

For now, I'll see you all over on the afw PR.

These are intended to be used as replacement implementations for the flipLR and flipTB methods, but the behavior of those is different for reasons I don't understand (they just seem wrong, but the documentation is unclear); this is DM-21487. While the behavior of these methods makes sense (and is better documented), it's not clear we have a use case for them if they can't be used to back the flip methods.

This notably does not include properties for Box getters that return Points or Extents, or SpherePoint getters that return Angle. That's because Point, Extent, and Angle are mutable from Python, so e.g. box.min.x = 3 would be a confusing, silent no-op.
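
The pitfall described here is easy to reproduce in a toy model. `ToyPoint` and `ToyBox` are hypothetical; the sketch assumes the getter returns a fresh object on every access, as a by-value C++ getter wrapped as a property would.

```python
from dataclasses import dataclass


@dataclass
class ToyPoint:
    """Hypothetical mutable point."""

    x: int
    y: int


class ToyBox:
    """Hypothetical box exposing its minimum corner as a property."""

    def __init__(self, min_x: int, min_y: int) -> None:
        self._min_x, self._min_y = min_x, min_y

    @property
    def min(self) -> ToyPoint:
        # A new ToyPoint is built on every access (a copy, not a view).
        return ToyPoint(self._min_x, self._min_y)


box = ToyBox(1, 2)
box.min.x = 3            # mutates a temporary copy; no error is raised
unchanged = box.min.x    # the box itself still reports its original minimum
```

The assignment succeeds without complaint but leaves `unchanged` equal to 1, which is the "confusing, silent no-op" that motivates leaving these getters as methods.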

These mostly expose functionality the Boxes already expose some other way, but consistent interfaces are important. Despite the fact that Boxes are already mutable in Python, I have not exposed the interval/sphgeom-style mutation APIs, in case we decide we want to make Box immutable in Python in the future.

This now delegates to Interval, which fixes some serious bugs but changes the behavior such that Box2I -> Box2D -> Box2I no longer roundtrips if EXPAND is used twice.

Use overload_cast whenever possible. Remove invalid Box2D::overlaps overload.

Box2D doesn't maintain the same guarantees w.r.t. infinite values, so it's possible that accessing an x or y interval from a box will throw. Ideally we'll fix that some day, but these are not edge cases we hit often (if ever), so it may be a low priority.

TallJimbo · 2019-10-15T18:07:58Z

I've rebased and squashed, run clang-format on the new files, and split the big Interval commit into five commits. One of those is just the reflectedAbout methods whose future is in doubt due to DM-21487; I've also split the addition of those methods on Box off onto a new commit, so if the resolution of DM-21487 is that we should remove them, it'll just be a matter of reverting those two commits.

I have no other changes planned for this branch; @czwa , you're welcome to merge this when other branches and testing are complete.

czwa requested a review from kfindeisen September 24, 2019 20:12

kfindeisen requested changes Sep 26, 2019

kfindeisen reviewed Sep 26, 2019

Bump Travis Python version for __future__ annotations.

d022d60

TallJimbo force-pushed the tickets/DM-18610 branch 2 times, most recently from 00726d4 to f394748 Compare October 3, 2019 15:54

TallJimbo force-pushed the tickets/DM-18610 branch from f394748 to 28e7006 Compare October 3, 2019 15:59

kfindeisen approved these changes Oct 7, 2019

TallJimbo force-pushed the tickets/DM-18610 branch 3 times, most recently from 5eee54b to 189ff5f Compare October 9, 2019 15:13

TallJimbo force-pushed the tickets/DM-18610 branch 2 times, most recently from 1bcbf12 to 6d11650 Compare October 15, 2019 15:48

TallJimbo added 14 commits October 15, 2019 12:31

Initial versions of Interval classes.

abb6ffc

Add slice, range, and arange methods to IntervalI.

fb7ef52

Add converting constructors between IntervalI and IntervalD.

8fae924

Add factory operations to Interval classes.

326213b

Add Interval constructors and accessors to Boxes.

0028d61

Add vectorized contains() to Boxes and Intervals.

48a93bc

Use subTest and fix silly line breaks in box test.

9bfb905

Rewrite conversion from Box2D to Box2I.

168ef1c

This now delegates to Interval, which fixes some serious bugs but changes the behavior such that Box2I -> Box2D -> Box2I no longer roundtrips if EXPAND is used twice.

Clean up Box pybind11 wrappers.

3bb64f3

Use overload_cast whenever possible. Remove invalid Box2D::overlaps overload.

Add vectorized evaluation to transforms.

70c47c5

TallJimbo force-pushed the tickets/DM-18610 branch from 6d11650 to 68a739d Compare October 15, 2019 18:03

czwa merged commit 68a739d into master Nov 4, 2019
