Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-19467: Add C++ iteration to GenericMap #466

Merged
merged 10 commits into from Jun 7, 2019
Merged

Conversation

kfindeisen
Copy link
Member

This PR introduces the visitor/"iteration" mechanism originally planned for GenericMap, adds some iteration-related features to GenericMap and SimpleGenericMap, and performs some refactoring of the code added in #461 that should make GenericMap easier to use and maintain. Because there are a lot of changes, I recommend reading the PR commit by commit (and possibly skipping the two "refactor" commits entirely, as they involve pushing a lot of TMP into private code).

One design decision that I'd like feedback on involves changing GenericMap::keys from a snapshot to a view. There needs to be some API that lets GenericMap know how to iterate over its implementation class, and it's best if that API is non-throwing. I see three solutions that satisfy this requirement:

  1. Make keys a view, as I've done. In C++, this forces the implementation class to maintain a vector of keys, and (if a MutableGenericMap) to keep it in sync with the "main" storage in an exception-safe way. The need for exception safety makes modifying methods more complicated than they would otherwise need to be, but this approach has the advantage of being conceptually simple.
  2. Return a polymorphic iterator, whose subclass is specific to a particular GenericMap implementation. This avoids the exception-safety problems of a view and is straightforward to implement. However, to support polymorphism the iterator would need to be returned by smart pointer and passed around GenericMap code by pointer or reference, which is unC++ic (for example, the standard for loop expects iterators to be passed by value) and error-prone.
  3. Create a non-polymorphic, non-template "iterator" class (though it behaves more like a Python generator) that will be created by the implementation class and used to implement GenericMap methods that need it. The "iterator" contains some storage (a boost::any or std::any memento) for the state of the iteration, as well as a callable that defines how to update the state and return the current key. The implementation class is responsible for defining the "iterator"'s initial state and the callable. This approach doesn't have the exception-safety issues of the key view and doesn't pretend to be compatible with the built-in iterator framework, but it's both unpythonic and unC++ic.

Which of these three approaches will make it easier to modify and create GenericMap subclasses in the long run?

@kfindeisen kfindeisen requested a review from parejkoj May 20, 2019 20:32
Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per instructions, I didn't really look at the two "refactor" commits, so hopefully my other comments will still make sense (done per-commit).

Is the choice of using _keyView here instead of directly using a map is to keep the faster lookup of unordered_map? Is the performance difference really worth the added complexity of having to manage two containers, instead of just using map? I think that's my biggest concern of this approach: two containers are rarely better than one.

The template magic in the new visitation is certainly non-obvious to this non-expert, but it was not entirely impenetrable. I do worry that there are only a few people in LSST who would be ok with modifying that code (I certainly wouldn't have been able to come up with this approach myself).

Given your alternate suggestions above, I'm not sure there is a perfect approach, and this way I think hides most of the scary stuff in private, so it is unlikely that others will have to worry about it. I think, but am not certain, that this approach might actually be the most understandable and provide the least confusion to those adding subclasses of GenericMap. Unfortunately, I'm not sure I can properly envision what your option 2 would look like, and option 3 just feels excessively complicated to me.

include/lsst/afw/typehandling/GenericMap.h Show resolved Hide resolved
include/lsst/afw/typehandling/GenericMap.h Show resolved Hide resolved
include/lsst/afw/typehandling/GenericMap.h Outdated Show resolved Hide resolved
include/lsst/afw/typehandling/GenericMap.h Outdated Show resolved Hide resolved
include/lsst/afw/typehandling/GenericMap.h Show resolved Hide resolved
include/lsst/afw/typehandling/SimpleGenericMap.h Outdated Show resolved Hide resolved
tests/test_simpleGenericMap.cc Show resolved Hide resolved
python/lsst/afw/typehandling/_GenericMap.cc Show resolved Hide resolved
python/lsst/afw/typehandling/_GenericMap.cc Outdated Show resolved Hide resolved
@kfindeisen
Copy link
Member Author

Is the choice of using _keyView here instead of directly using a map is to keep the faster lookup of unordered_map?

I'm not sure what you mean by directly using a map (C++ maps do not expose any kind of key view, as useful as it would be). My motivation was that the original keys() function required constructing a new vector, which could throw an exception from memory allocations. Those exceptions can still be thrown in the new approach, of course, but now they'd happen as part of an insertion, which is potentially throwing anyway.

@kfindeisen
Copy link
Member Author

kfindeisen commented Jun 6, 2019

Unfortunately, I'm not sure I can properly envision what your option 2 would look like

Something like:

protected:
    virtual unique_ptr<GenericIterator> begin() const = 0;

    void _apply(Visitor&& visitor) const {
        …
        for (auto itPtr = begin(); *itPtr != *end(); ++(*itPtr)) {
            K const& key = **itPtr;
            ...
        }
    }

There's a few ways to simplify the code (e.g., assign *itPtr to a GenericIterator&, then use that in the loop), but either way it's very unidiomatic.

Edit: just realized that option 2 can't be non-throwing, because you need to dynamically allocate the GenericIterator to get polymorphism and safe memory usage simultaneously. So it doesn't actually solve the problem I was trying to solve.

@parejkoj
Copy link
Contributor

parejkoj commented Jun 6, 2019

C++ maps do not expose any kind of key view, as useful as it would be

Oh, ugh. I wasn't aware of that. The rest of your rationale makes sense. I wonder if it is worthwhile writing down some of this rationale in the class docstring for GenericMap, explaining why there is an extra vector? Although not useful to users of the class, it might help those trying to understand the reason for the implementation.

The private method was only needed to implement GenericMap equality,
but there are more direct ways to do so.
The interface makes heavy use of templating for flexibility;
fortunately, almost all the template metaprogramming necessary is in
private code, and none of it appears in the API.
The GenericMap protected API was originally centered on a const
reference variant, which is the only way it needed to identify its
values. Now that GenericMap also has value and non-const-reference
variants, the code is more readable if the value variant is the
"official" list of supported types and the two reference variants are
derived from it.
Delegating to private method templates reduces contains() to a single
public template and at() to three (kept distinct because they have
fundamentally different return types).
Making PolymorphicValue nothrow-swappable makes it much easier to swap
values inside GenericMap.
This change has the benefit of making GenericMap::operator==
non-throwing (it no longer needs dynamic memory allocation), but at
the cost of requiring every implementation of GenericMap to provide a
long-lived view object. The only alternative non-throwing API I could
think of involved passing around callable objects that specified how
each subclass could be iterated over; this API seems easier to
develop against, even if the view does make exception safety trickier.

As a side benefit, this change gives SimpleGenericMap a FIFO iteration
order, a feature that was requested by Python users.
Having GenericMap::keys() be a view makes it easy to iterate from C++,
which the style guide recommends.
Newer(?) versions of clang complain about importing a member type
without using typename.
@kfindeisen kfindeisen merged commit 4541b6e into master Jun 7, 2019
@kfindeisen kfindeisen deleted the tickets/DM-19467 branch June 7, 2019 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants