Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-19575: Add Storable mixin to ExposureInfo components #465

Merged
merged 20 commits into from Jun 7, 2019

Conversation

kfindeisen
Copy link
Member

This PR adds the Storable mixin to all afw classes in ExposureInfo (note that it does not modify daf::base::PropertySet, which as noted on Jira will need special treatment for now). This amounts to adding the following four methods, though some keep their default implementation of throwing an UnsupportedOperationException:

  • cloneStorable: creates a shared_ptr<Storable>; implemented in terms of a more specific clone if applicable. Using a virtual cloneStorable instead of shadowing (as was done with e.g. BaseTable) is less prone to developer error, because only one implementation is required (e.g., in Psf) rather than one for every subclass.
  • equals: compares the object to any Storable, usually returning false for mixed-class comparisons. Implemented in terms of operator== if applicable and appropriate.
  • toString: returns a string representation of an object. I found it easier to reimplement operator<< in terms of toString rather than vice versa (the latter would have involved a method template in Storable.h that would introduce extra dependencies).
  • hash_value: conforms to the convention I introduced in DM-9938: Make some afw types hashable #410, so no delegation required.

A couple of design issues for which I would like feedback:

  • I'm really unhappy about making cloneStorable return shared_ptr rather than unique_ptr (which is more flexible for the client; see Effective Modern C++), especially since doing so imposes awkward implementation constraints on PolymorphicValue. However, it's not possible to implement a unique_ptr clone in terms of a shared_ptr clone if you don't already have a unique_ptr clone, and trying instead to change Psf::clone to use unique_ptr triggers a pybind11 bug that, judging from the discussion in related PRs, will never be fixed. A third option would be appreciated.
  • the singleClassEquals helper is a static method template so that subclasses of Storable can cast to the correct type, and to duck-type around the fact that Storable itself does not implement operator==. The code works and is clean, but I feel like there should be a more natural solution.
  • Storable imposes a lot of API clutter on its subclasses, especially since the "unsupported" versions of methods still appear in the documentation. Currently cloneStorable is absolutely essential, and equals is needed for C++ equality comparisons of GenericMap. We do not use toString (no GenericMap printing yet) or hash_value. Can we afford to remove these methods? If we do, they will be hard to add later, scaling to the number of subclasses Storable has.
  • Adding Storable to Filter required implementing Persistable in Filter. I cargo-culted the code from ApCorrMap, so a careful review would be appreciated.

@kfindeisen kfindeisen requested a review from TallJimbo May 15, 2019 18:51
@kfindeisen
Copy link
Member Author

kfindeisen commented May 18, 2019

Update on the API clutter question: perhaps it's best to support the methods, and find some automated way to hide the unimplemented ones. While working on GenericMap iteration I realized that it's a lot of work (casting) to perform operations that aren't supported by Storable. On the other hand, needing to handle UnsupportedOperationException for everything is also a lot of work...

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I took so long to get to this.

Looks great overall; I think the only thing that probably needs to be changed is just backing out the attempt to make Filter Storable; it's just too flawed for that to work well now (more on that in-line).

I think I've addressed all of your big questions (not necessarily satisfactorily) in-line as well, except for the API clutter question. On that subject, I think reducing Python API clutter is probably more important than C++ API clutter - because Python APIs are actually introspectable via things like tab-completion (though perhaps that attitude really just reflects my lack of C++ editor/IDE sophistication). We could consider either not wrapping the Storable base API for Python (relying on derived classes that implement those methods to do so where appropriate), or wrapping it as free functions in some utility module if that turns out to be necessary. We could also do something similar in C++ if you think the API clutter there is a big problem, by making the entire Storable API protected virtual with friend-function public access. In a team with more C++ experience (where I think free functions as part of class interfaces are more accepted, due in large part to argument-dependent lookup making them work well), I think that'd actually be a pretty nice idea, but it might be more confusing than API clutter with the more Python-focused team that we have.

python/lsst/afw/typehandling/_Storable.cc Outdated Show resolved Hide resolved
src/typehandling/PolymorphicValue.cc Show resolved Hide resolved
@@ -137,7 +137,8 @@ class PolymorphicValue final {
std::size_t hash_value() const;

private:
std::unique_ptr<Storable> _value;
// unique_ptr would be more appropriate, but Storable::cloneStorable must return shared_ptr
std::shared_ptr<Storable> _value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Commenting here on the general shared_ptr vs. unique_ptr question, just to create a separate thread for that.)

I think there are a few ways to deal with this, and I don't think any of them are great, and I don't have much preference between them. So I'll just list some related thoughts, in case something there either helps convince you which approach is best (or least bad, I guess) or spurs some further conversation that convinces me:

  • Just going with shared_ptr has the advantage that it permits concrete Storables to use make_shared, which can be an important optimization in some contexts (see e.g. DM-19461). I doubt the Exposure refactor is such a context, but perhaps other container of some concrete Storable that also wants to use clone might be.
  • If Psf::clone() is the primary blocker for using unique_ptr here, I'm not personally opposed to using a manual lambda wrapper in Psf's and/or Storable's pybind11 to translate the return type to shared_ptr and avoid the pybind11 bug. Unfortunately, that would probably require doing the same in some derived class wrappers or (better) just removing their unnecessary wrappers for clone[Storable] (as an aside, this really does make me wish I'd pushed harder against our rules advising the opposite, as this case makes it clear that duplication has a real cost).
  • We could also consider just removing Psf::clone: Psf's documentation states that all derived classes should be immutable, and that means there's no reason to want to copy one (aside from perhaps trying to manipulate the MRU cache in the base class, but that'd really be an abuse of the API). That in turn raises the question of what cloneStorable should do for immutable concrete classes. I think the most important point on that is that such classes are unlikely to be held by PolymorphicValue at all, because we'll put them in GenericMap via shared_ptr, and it wouldn't be bad to enforce that by just not implementing cloneStorable for them.
  • While the general C++ case for preferring unique_ptr to shared_ptr is quite strong, I think it weakens quite a bit in a hybrid C++/Python codebase, because there is simply no way to express transferring ownership from Python and that greatly limits the utility of unique_ptr and move semantics in general. Aside from small test programs (especially for running in valgrind) I don't see our C++ libraries used outside of calling from Python, so I think I've resigned myself to adopting "best hybrid codebase practices" in preference to "best C++ practices" when there's tension between them.

Copy link
Member Author

@kfindeisen kfindeisen Jun 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your first point. How does making a factory method return shared_ptr "permit" make_shared? Can't you call make_shared for any class that has a public constructor?

Re: points 2 and 3, I don't think Psf is the problem. It's simply the first time we've tried to retrofit Storable onto an existing class hierarchy. I already rejected the manual lambda idea on the grounds that it would make it hard to use Storable correctly -- anybody trying to make another cloneable class storable would need to know to repeat the trick.

I don't think the fourth point is relevant because we don't want to transfer ownership from Python (a C++ function that takes unique_ptr input would have unintuitive side effects anyway). Having a unique_ptr return value corresponds to transferring ownership to Python, which we do often.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On point (1), I just mean that if you define clone to return unique_ptr, but always in practice convert that unique_ptr into a shared_ptr, you've missed out on the optimization that make_shared provides, while if you define clone to return shared_ptr, you can implement it via make_shared.

include/lsst/afw/typehandling/Storable.h Outdated Show resolved Hide resolved
include/lsst/afw/geom/polygon/Polygon.h Show resolved Hide resolved
src/image/Filter.cc Show resolved Hide resolved
include/lsst/afw/typehandling/Storable.h Show resolved Hide resolved
@kfindeisen
Copy link
Member Author

kfindeisen commented Jun 3, 2019

On that subject, I think reducing Python API clutter is probably more important than C++ API clutter - because Python APIs are actually introspectable via things like tab-completion (though perhaps that attitude really just reflects my lack of C++ editor/IDE sophistication).

I wasn't thinking in terms of tab completion at all; I was motivated by looking at the Doxygen docs for the Storable classes.

We could consider either not wrapping the Storable base API for Python (relying on derived classes that implement those methods to do so where appropriate)

I'm a little worried that this would interfere with adding pure-Python subclasses of Storable, but it might be best to do that now and revert if DM-19582 proves it necessary.

We could also do something similar in C++ if you think the API clutter there is a big problem, by making the entire Storable API protected virtual with friend-function public access.

I don't think doing that would reduce API clutter at all -- the number of methods/functions a user would need to be aware of, including almost-overloads like clone vs. cloneStorable, would be the same.

@kfindeisen
Copy link
Member Author

I've gone ahead and removed the pybind11 wrappers for the optional Storable methods, as well as for UnsupportedOperationException.

@TallJimbo
Copy link
Member

TallJimbo commented Jun 7, 2019

I don't think doing that would reduce API clutter at all -- the number of methods/functions a user would need to be aware of, including almost-overloads like clone vs. cloneStorable, would be the same.

I think the relevant question is which user(s); to me, at least, regular methods say, "this is an integral part of this class", which means the clutter affects all users of the class. Depending on how and where you put them, friends can be used to say "this is a specialized corner of the API targeted at a specific audience". Depending on how much Storable is there really just for GenericMap (and I don't think we've really decided strongly either way what our philosophy is on that), that may be more appropriate.

@TallJimbo
Copy link
Member

Sorry for the delay in responding. If I've failed to comment on something blocking you, please do ping me again.

Wrapping Storable's optional methods leads to a confusing Python API.
In practice, any subclass of Storable that is, for example, printable
or hashable will provide its own wrappers for those operations anyway.
Non-cloneable Storables cannot be used (practically) with
PolymorphicValue, and therefore with Key<T extends Storable>.
The requirement to return a smart pointer to Storable made it
difficult to use Storable as a mixin for classes that already have a
clone method. Renaming the method minimizes the number of times that
cloneStorable must be implemented (typically only in the direct
subclass of Storable).
While unique_ptr is a better return type for a factory method, a
unique_ptr return cannot be implemented in terms of a shared_ptr
return (though the reverse is true), and our existing classes cannot
be modified to clone by unique_ptr without running into a pybind11 bug.
At present it looks like the pybind11 team is leaning towards
forbidding such code rather than supporting unique_ptr to
shared_ptr conversion.
Equals was returning `false` for comparisons to self.
The template handles the most common behavior, where an object of a
class implementing Storable is comparable to other objects of the same
class, but should always be unequal to other Storables.
@kfindeisen kfindeisen merged commit f6b608f into master Jun 7, 2019
@kfindeisen kfindeisen deleted the tickets/DM-19575 branch June 7, 2019 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants