New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-18610: Add fields, limited mutability, and trim/assembly-state tracking to cameraGeom #22
Conversation
czwa
commented
Sep 18, 2019
- Add Interval classes
- Add vectorized contains() and transforms to Boxes and Intervals
- Add Interval APIs to Boxes()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only gotten through 42c5464 so far, but as this one commit is essentially an entire ticket I thought it best to post what I have for now.
The biggest issues:
- Consider breaking up this commit into smaller pieces, if possible. One natural division would be core functionality versus bells and whistles.
TheAlready approved on RFC-593.IntervalI
andIntervalD
classes have a very long list of methods, most of which seem nonessential to the concept of an "interval". Please consider cleaning up the API by separating these methods from the class definition; it will make the classes easier to learn and maintain.- There are a number of edge cases (zero-length intervals, infinite intervals, half-infinite intervals, integer overflow) that don't appear to be picked up by current tests. This includes the converting constructors between
IntervalI
andIntervalD
, which don't seem to be tested at all. I've found a few bugs by algorithm analysis; please expand test coverage to include these cases. - Both intervals' implementations rely on some class invariants. Please document these invariants explicitly and verify that each method preserves them:
IntervalI
assumes that if_size == 0
, then_min == 0
.IntervalD
assumes that_min
and_max
are either both NaN or both not NaN.
include/lsst/geom/Interval.h
Outdated
* | ||
* All IntervalI methods that mutate self or return a new instance (and are | ||
* not marked `noexcept`) throw OverflowError if either bound or the size | ||
* would be too large to fit in `int`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing backticks for bound
and size
? I'm not clear on what "bound" is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to use "bound" as shorthand for "upper or lower bound", but apparently that doesn't work; I'll write out the longer version. I don't think backticks are appropriate because there isn't actually a C++ symbol named "size".
python/lsst/geom/_AffineTransform.cc
Outdated
@@ -64,6 +64,12 @@ void wrapAffineTransform(utils::python::WrapperCollection & wrappers) { | |||
py::overload_cast<Point2D const &>(&AffineTransform::operator(), py::const_)); | |||
cls.def("__call__", | |||
py::overload_cast<Extent2D const &>(&AffineTransform::operator(), py::const_)); | |||
cls.def("__call__", | |||
[](py::object self, py::object x, py::object y) mutable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant mutable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a really roundabout way of supporting two array arguments. Why not just have a lambda that takes two py::array_t
and calls vectorize
internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach lets numpy do broadcasting when the shapes are compatible but not identical. I'll add a comment to that effect.
Nevermind. Was confusing this with something else. But why is a lambda that calls vectorize better? What you describe seems a lot easier to get wrong, because this delegates more to existing APIs - it only looks awkward at all because it's written in C++ rather than pure Python (and if there was already a .py file for this class, I'd have added it there - but I don't think it's enough reason on its own to add one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks like it jumps into Python, then goes back to C++, then back to Python, before returning in C++. 😵. vectorize
is an existing function that basically does what you're trying to do.
I think you can at least have the parameter types be py::array_t
instead of py::object
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think "jumps into Python" is actually a meaningful thing - it's just calling the CPython runtime, and I don't think that's a big deal in a pybind11 file. But I think the right resolution is to just create _AffineTransform.py
and put it there, and prevent the poor readability of this from causing any more trouble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, scratch that. The fact that this is overloaded changes things - it makes it much more complicated to move just one overload to Python. But using vectorize
doesn't actually simplify things (I didn't realize this until now), because it doesn't actually call the function you give it - it creates a Python callable that calls the function you give it, so the lambda would actually look something like this:
[](py::object self, py::object x, py::object y) {
auto applyX = py::vectorize(&AffineTransform::applyX);
auto applyY = py::vectorize(&AffineTransform::applyY);
return py::make_tuple(applyX(x, y), applyY(x, y));
}
That of course involves creating the vectorized callables in every call, which is bad, so a better variant would be to define applyX
and applyY
as above outside the lambda, and then both capture them in the lambda and use them in the definition of the Python-side applyX
and applyY
methods (if we decide to keep those).
But I don't think getting those callables via lambda captures is any better than looking them up as attributes of self
- if anything, it think it makes the reader think about whether the capture is safe from a Python reference-counting perspective (I'm quite confident it is safe, but I don't think it's at all obvious from looking at the code).
I'll add some comments explaining why things are the way they are both here and in LinearTransform
(and, of course, remove the errant mutable
). I would rather not switch to array_t
instead of object
as that will just make pybind11 do the same type-checking on those arguments multiple times.
00726d4
to
f394748
Compare
@kfindeisen, thanks for the careful review, and apologies for the state of the code when you started. I think I've now addressed most concerns. In the hopes of making the next stage of this review easier on you, I've put all changes on new commits, most of which follow and will eventually be squashed into the original commits you commented on. That means even the originals have been rebased, but their diffs should be the same. I'm also happy to squash now if you'd prefer. The improvements to remove in-place transformation operations (for RFC-593) and to fix the handling of overflow and non-finite values were pretty far-reaching, and I could imagine those changes in particular being easier to look at after squashing. Here are what I think are the remaining open questions:
|
f394748
to
28e7006
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience, the new code looks much better. The biggest outstanding issue is that the IntervalD
to IntervalI
conversion still doesn't work in a lot of edge cases.
include/lsst/geom/Interval.h
Outdated
* IntervalI sets the minimum point to the origin for an empty interval, and | ||
* returns -1 for both elements of the maximum point in that case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given your claim in the review discussion that you don't want to guarantee a particular position for IntervalI
, consider removing this statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'll move the whole @internal
block to a code comment, as it's for maintainers and not for users, and we haven't established any conventions for using @internal
for that (and this is just a relic of me haphazardly using it for that once upon a time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused now. Is empty IntervalI
any state with _size <= 0
, or specifically _min = _size = 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An empty IntervalI
is guaranteed to have getSize() == 0
, but the results of calls to getMin
or getMax
are unspecified for an empty interval. APIs that construct an IntervalI
given a size can be expected to produce an empty interval if the given size is not positive.
Internally, IntervalI
also maintains the invariant that empty implies _min = _size = 0
, but this is not part of its documented interface.
include/lsst/geom/Interval.h
Outdated
* auto array = ndarray::copy(ndarray::arange(5); | ||
* auto interval = IntervalI::fromMinMax(2, 4); | ||
* auto subarray = array[interval.getSlice()]; | ||
* @endcode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example helps a lot, thanks. Note that the style guide recommends using Markdown (blank line and 4-space indent for code blocks); it's more readable than Doxygen tags. The guide includes an example examples section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I always forget that Doxygen can do full Markdown now.
include/lsst/geom/Interval.h
Outdated
|
||
//@{ | ||
//@{x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo?
include/lsst/geom/Interval.h
Outdated
@@ -444,43 +451,53 @@ class IntervalD final { | |||
* @param[in] max Maximum coordinate (inclusive). | |||
* | |||
* If `min > max` or either is NaN, an empty interval is returned. | |||
* | |||
* @throws lsst::pex::exceptions::InvalidParameterError Thrown if `min` | |||
* and `max` are both +infinity or both -infinity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given your statement that you specifically want half-bounded intervals when they can be defined unambiguously, I'm surprised you're disallowing the unbounded interval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm only disallowing min == max == -inf
and min == max == +inf
; min == -inf, max > -inf
and min < inf, max == inf
yield valid intervals with infinite size, and min > max
-> empty covers the remaining cases. Will clarify.
* @param center The desired center of the interval. | ||
* @param size Number of pixels in interval. | ||
* @param center The desired center of the interval. May not be infinite. | ||
* @param size Number of pixels in interval. May not be infinite. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the double spacing in this comment? Editing artifact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a good idea to run clang-format
on these files before merging. I know you normally avoid it, but for such a far-reaching change not running it will cause a lot of surprise changes for later developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the double spacing in this comment? Editing artifact?
I was actually making these lines consistent with all of the other comments in the file (and at least most of the package).
It might be a good idea to run clang-format on these files before merging. I know you normally avoid it, but for such a far-reaching change not running it will cause a lot of surprise changes for later developers.
Will do, provided it's not too much of a pain for me to get it set up with the right configurations and the results aren't too awful.
include/lsst/geom/Box.h
Outdated
Box2I dilatedBy(Extent const & buffer) const; | ||
Box2I & dilateBy(Extent const & buffer) { | ||
return (*this = dilatedBy(buffer)); // delegate mutator to factory for exception safety. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to update the RFC implementation tickets registered in Jira, since I'm pretty sure this is out of the original scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I synced the only ticket with branches prior to starting work on addressing review comments here, so I'll just blow away its (older) versions of these commits when rebasing. That will pick up these changes automatically, and leave only the couple of commits that weren't ever on this branch.
include/lsst/geom/Box.h
Outdated
* | ||
* Expanding an empty box with a second box is equivalent to assignment. | ||
*/ | ||
Box2D expandedTo(Box2D const & other) const noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for a Box2D
to be half-infinite, the way an IntervalD
can? Thinking of the exception thrown when expanding by a non-finite point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Box2D
doesn't have the same kind of carefulness IntervalD
(now) has about what kinds of infinites are valid, so it's entirely possible trying to extract an x or y interval from the box will throw. I'll drop the noexcept
.
src/Box.cc
Outdated
} | ||
throw pex::exceptions::LogicError("Invalid enum value."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a default
case would be more idiomatic.
It will make it easier for anybody reviewing the changes in the future, or who wants to roll back a specific change (which is the original motivation for small commits in the first place, ease of review is just a bonus).
YAGNI seems like a good principle for these classes, but I would wait for the resolution of DM-21487 before deciding what to do. It wouldn't surprise me if the the old operations were supposed to mean what you think they do, and the test cases were wrong. (And yes, I realize that we'd break a lot of code if we tried to change the behavior of |
5eee54b
to
189ff5f
Compare
Ok, I think I'm through the second phase of addressing review comments. I still need to squash commits before merge, and as per discussion above we're waiting to see if there's any action on DM-21487 before deciding what to do in one case. @czwa , once the rest of the branches for this ticket are ready (or nearly ready) to merge, let me know and I'll both squash and take that pending decision even if we haven't gotten any input from the other ticket yet. For now, I'll see you all over on the afw PR. |
1bcbf12
to
6d11650
Compare
These are intended to be used as replacement implementations for the flipLR and flipTB methods, but the behavior of those is different for reasons I don't understand (they just seem wrong, but the documentation is unclear); this is DM-21487. While the behavior of these methods makes sense (and is better documentated), it's not clear we have a use case for them if they can't be used to back the flip methods.
This notably does not include properties for Box getters that return Points or Extents, or SpherePoint getters that return Angle. That's because Point, Extent, and Angle are mutable from Python, so e.g. box.min.x = 3 would be a confusing, silent no-op.
These mostly expose functionality the Boxes already expose some other way, but consistent interfaces are important. Despite the fact that Boxes are already mutable in Python, I have not exposed the interval/sphgeom-style mutation APIs, in case we decide we want to make Box immutable in Python in the future.
This now delegates to Interval, which fixes some serious bugs but changes the behavior such that Box2I -> Box2D -> Box2I no longer roundtrips if EXPAND is used twice.
Use overload_cast whenever possible. Remove invalid Box2D::overlaps overload.
Box2D doesn't maintain the same guarantees w.r.t. infinite values, so it's possible that accessing an x or y interval from a box will throw. Ideally we'll fix that some day, but these are not edge cases we hit often (if ever), so it may be a low priority.
These are intended to be used as replacement implementations for the flipLR and flipTB methods, but the behavior of those is different for reasons I don't understand (they just seem wrong, but the documentation is unclear); this is DM-21487. While the behavior of these methods makes sense (and is better documentated), it's not clear we have a use case for them if they can't be used to back the flip methods.
6d11650
to
68a739d
Compare
I've rebased and squashed, run clang-format on the new files, and split the big I have no other changes planned for this branch; @czwa , you're welcome to merge this when other branches and testing are complete. |