Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

framework: Remove some allocations from event collections #15081

Merged

Conversation

rpoyner-tri
Copy link
Contributor

@rpoyner-tri rpoyner-tri commented May 24, 2021

Relevant to: #14543

Rewrite the storage of event collections to avoid most allocations,
while maintaining most of the pre-existing public API. To support this
change, allow fully-derived-type events to be copied and assigned. The
Clone() mechanism is still supported, primarily for interfacing with
Python.

This patch also deprecates EventCollection::add_event() and any
overrides.


This change is Reviewable

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@SeanCurtis-TRI for feature review.
FYI @edrumwri

Reviewable status: LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @SeanCurtis-TRI)

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass done. Lots of big, possibly controversial opinions.

Reviewed 8 of 8 files at r1.
Reviewable status: 22 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri)

a discussion (no related file):
I'm struggling a bit with this PR. Here's the reasoning as I infer it.

The problem

EventCollection has historically owned a bunch of pointers to EventType that individual leaf systems would actually own. That ownership was manifest as unique_ptr<EventType> objects. Every new event would lead to a new heap allocation in the leaves.

The Solution

We'll have the EventCollection actually own its events. And its event storage will be a blob of memory that gets pre-allocated to a "good" default size and perpeptually re-used. It is possible that during simulation, the total number of simultaneously declared events may exceed the pre-allocated block. At that point a heap allocation (by the EventCollection) will happen. But over the lifespan of the collection, its storage capacity won't shrink. That amortizes the cost of heap allocations to functionally zero.

The constraints/challenges

We have a legacy API that directly reflected the old un-owned relationship between collection and events. The events would be accessed via a vector<EventType*>. We are preserving that API and therefore need to be able to serve it. This means whatever EventCollection-local storage we use as part of the solution must either a) provide stable addresses for all of the owned events, or b) implement maintenance to confirm pointers are always correlated with owned entities. This is what has led to using std::list and inspired the Block to both increase memory coherency in the owned events and limit the frequency in which we might have the list allocate new list elements.

My personal observations/concerns

Now, assuming that I haven't missed anything of import in my description above, I'll enumerate my concerns. I fully recognize that if my previous description is incorrect, that these observations might be rendered moot. I'll rely on your judgment to decide that.

  1. The lists, blocks, strides, etc., seems to be introducing a great deal of complexity in code and documentation.
  2. The use of a vector<EventType*> is proof that, generally, the heap allocation patterns of vector are compatible with what we're attempting to achieve in this PR for EventCollection. Between its strategy for initial storage allocation and its strategy for growing it, we aren't suffering badly from those allocations. So, it'd be nice if we could simply rely on vector<EventType>for owned storage.
  3. The fly in the ointment is the legacy API. This vector of pointers. I would hope that returning vector<EventType> would be equally efficacious and eliminate all of the overhead and complexity.
    • I do note that this would require related API changes. Specifically, the DoEventType() families of methods. They are declared to take a vector<EventType*> and not vector<EventType>. These are protected virtual methods of LeafSystem and attempting to change this signature in a non-breaking/disciplined way would probably be arduous. (Although the actual Drake code is pretty light in this regard). We are deprecating the purely virtual add_event() method, so further deprecation would not go amiss. We could deprecate get_events() in favor of events().
    • I believe the resulting python API would be largely untouched (although I can imagine the bindings would be tweaked a bit).
    • I'm also in favor of returning a range iterator.

Moral of the story: supporting an API that no longer makes sense is introducing a surplus of madness to the code. I can endure temporary madness, and I'd be resigned if the code documents why it must be considered inevitable madness. But I'd like to be convinced that we're not de facto accepting of the madness.


a discussion (no related file):

One last thought on std::list vs std::vector

If we definitely are keeping the vector<EventType*> get_events() API, here's a proposal (and implied argument) that maintenance might be better than blocks of code.

The maintenance would be conceptually very simple.

  • In AddEvent we add the new Eventto a vector<EventType> storage_.
  • If &storage_[0] == *events_.begin(), storage_ hasn't moved and we can simply append the address to events_.
  • Else, we simply need to clear events_ and re-write all pointer addresses.

The actual test that they are still aligned is dirt cheap. The work to correct misalignment would happen only on those occasions when we'd be doing a heap allocation. So, does it really matter?

That feels more direct and obvious than the code in AddEvent regarding tracking blocks and the supporting documentation.



systems/framework/event.h, line 547 at r1 (raw file):

 private:
  TriggerType trigger_type_;
  drake::copyable_unique_ptr<EventData> event_data_{nullptr};

nit: Surely the drake:: qualifier is redundant?


systems/framework/event_collection.h, line 202 at r1 (raw file):

   * Throws if called, because no events should be added at the Diagram level.
   */
  void AddEvent(const EventType&) override {

btw: Declaring this override is not strictly a defect. However, the class is declared final which creates some cognitive dissonance here. Perhaps calling this final as well would be sanest.

That said, I recognize that the final-override oxymoron is all over this file.


systems/framework/event_collection.h, line 202 at r1 (raw file):

   * Throws if called, because no events should be added at the Diagram level.
   */
  void AddEvent(const EventType&) override {

I think this would be better as void AddEvent(EventType) override;. Generally, this is getting invoked with r-value instances of events. So, passing by value in the signature and doing a std::move in the body would eliminate unnecessary copies. Particularly valueable where the vent has non-null data (currently owned by a copyable_unique_ptr which is very amenable to std::move.


systems/framework/event_collection.h, line 363 at r1 (raw file):

   * Add `event` to the existing collection.
   */
  void AddEvent(const EventType& event) override {

BTW See previous note on override vs final here.


systems/framework/event_collection.h, line 367 at r1 (raw file):

    auto it = std::next(events_storage_.begin(), last_block_);
    if (it->size == kStride) {
      last_block_++;

nit Always odd to see post-increment (both here and it++). I don't know how clever the compiler is about recognizing it doesn't actually use the pre-increment value, but why tax it needlessly?


systems/framework/event_collection.h, line 375 at r1 (raw file):

    auto& block = *it;
    int index = block.size++;
    block.entries[index] = event;

nit: Related to the other note: this would become:

block.entries[index] = std:move(event);


systems/framework/event_collection.h, line 440 at r1 (raw file):

  // Clear(). Hence, there is no need to support arbitrary out-of-order
  // insertion and deletion. We only need to choose an underlying storage
  // container that guarantees not to invalidate iterators, as long as we

nit: It was difficult for me to follow this reasoning. And I think the phrase "not to invalidate iterators" is ultimately the culprit that caused me the most confusion.

After reasoning about it, it's not the iterators that get invalidated. It's the pointers stored in the corresponding vector. We'd like to write a pointer to that vector and then forget about it, knowing that it points to the right address no matter how many subsequent events were to be added. If we used a vector<EventType> as storage, we can't guarantee that, upon automatic resizing, that the entire contents of the vector wouldn't be relocated to accommodate the larger size (and this resizing would largely be under the hood, so we'd have to check all the time to confirm that the two data structures didn't fall out of sync.


systems/framework/event_collection.h, line 447 at r1 (raw file):

  // allocate when the first one overflows.

  // Assert if the representation invariant is violated.

nit: "Assert if" is an odd grammatical structure. How about, simply, "Assert the representation..."


systems/framework/event_collection.h, line 459 at r1 (raw file):

  // TODO(rpoyner-tri): consider some empirical study to tune the size of
  // kStride.
  static constexpr int kStride = 32;

BTW "stride" is a pretty high-level abstraction to apply here. You could pick names that more directly reflect the role it serves: kBlockCapacity. The formal decomposition into "blocks" seems to undercut the context in which "stride" would typically be used.

It looks particularly better where it's used:

if (it->size == kStride) {

vs

if (it->size == kBlockCapacity) {


systems/framework/event_collection.h, line 459 at r1 (raw file):

  // TODO(rpoyner-tri): consider some empirical study to tune the size of
  // kStride.
  static constexpr int kStride = 32;

BTW I can imagine a world where a user has a Diagram that they know will exceed 32 events and a strong need to not heap allocate after initialization. Some mechanism where they could easily "reserve" size at initialization time would be good. Even if they don't know the magic number 32, knowing what their event counts would be, calling event_collection.reserve(my_minimum_count)would be nice sugar.


systems/framework/event_collection.cc, line 9 at r1 (raw file):

void LeafEventCollection<EventType>::CheckInvariant() const {
  // Always retain at least one block.
  DRAKE_ASSERT(!events_storage_.empty());

nit: Rather than asserting in the body of this function, perhaps it would be better to demand. And then you can rely on whether any testing is done by the caller (as you have done).


systems/framework/event_collection.cc, line 32 at r1 (raw file):

    event_count += block.size;
    block_index++;

nit: Just tagging other post-increments.


systems/framework/event_collection.cc, line 36 at r1 (raw file):

  // Check event pointer vector.
  int event_ptr_count = static_cast<int>(events_.size());

nit const


systems/framework/event_collection.cc, line 42 at r1 (raw file):

  for (auto ptr : events_) {
    DRAKE_ASSERT(ptr == &block_iterator->entries[event_index]);
    event_index++;

nit: Just tagging other post-increments.


systems/framework/leaf_system.h, line 951 at r1 (raw file):

    // Instantiate the event.
    auto forced = PublishEvent<T>(

nit:

PublishEvent<T> force(

Why introduce auto where it provides no value?


systems/framework/leaf_system.h, line 959 at r1 (raw file):

    // Add the event to the collection of forced publish events.
    this->get_mutable_forced_publish_events().AddEvent(forced);

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).


systems/framework/leaf_system.h, line 989 at r1 (raw file):

    // Instantiate the event.
    auto forced = DiscreteUpdateEvent<T>(

nit:

DiscreteUpdateEvent<T> force(

Why introduce auto where it provides no value?


systems/framework/leaf_system.h, line 1000 at r1 (raw file):

    // Add the event to the collection of forced discrete update events.
    this->get_mutable_forced_discrete_update_events().AddEvent(forced);

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).


systems/framework/leaf_system.h, line 1030 at r1 (raw file):

    // Instantiate the event.
    auto forced = UnrestrictedUpdateEvent<T>(

nit:

UnrestrictedUpdateEvent<T> force(

Why introduce auto where it provides no value?


systems/framework/leaf_system.h, line 1039 at r1 (raw file):

    // Add the event to the collection of forced unrestricted update events.
    this->get_mutable_forced_unrestricted_update_events().AddEvent(forced);

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).

@sherm1
Copy link
Member

sherm1 commented May 25, 2021


systems/framework/event_collection.h, line 367 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit Always odd to see post-increment (both here and it++). I don't know how clever the compiler is about recognizing it doesn't actually use the pre-increment value, but why tax it needlessly?

FYI That's actually a style guide violation these days. See this section.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 22 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri and @sherm1)


systems/framework/event_collection.h, line 367 at r1 (raw file):

Previously, sherm1 (Michael Sherman) wrote…

FYI That's actually a style guide violation these days. See this section.

Yeah. That. :)

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 22 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @jwnimmer-tri, @rpoyner-tri, and @SeanCurtis-TRI)

a discussion (no related file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

I'm struggling a bit with this PR. Here's the reasoning as I infer it.

The problem

EventCollection has historically owned a bunch of pointers to EventType that individual leaf systems would actually own. That ownership was manifest as unique_ptr<EventType> objects. Every new event would lead to a new heap allocation in the leaves.

The Solution

We'll have the EventCollection actually own its events. And its event storage will be a blob of memory that gets pre-allocated to a "good" default size and perpeptually re-used. It is possible that during simulation, the total number of simultaneously declared events may exceed the pre-allocated block. At that point a heap allocation (by the EventCollection) will happen. But over the lifespan of the collection, its storage capacity won't shrink. That amortizes the cost of heap allocations to functionally zero.

The constraints/challenges

We have a legacy API that directly reflected the old un-owned relationship between collection and events. The events would be accessed via a vector<EventType*>. We are preserving that API and therefore need to be able to serve it. This means whatever EventCollection-local storage we use as part of the solution must either a) provide stable addresses for all of the owned events, or b) implement maintenance to confirm pointers are always correlated with owned entities. This is what has led to using std::list and inspired the Block to both increase memory coherency in the owned events and limit the frequency in which we might have the list allocate new list elements.

My personal observations/concerns

Now, assuming that I haven't missed anything of import in my description above, I'll enumerate my concerns. I fully recognize that if my previous description is incorrect, that these observations might be rendered moot. I'll rely on your judgment to decide that.

  1. The lists, blocks, strides, etc., seems to be introducing a great deal of complexity in code and documentation.
  2. The use of a vector<EventType*> is proof that, generally, the heap allocation patterns of vector are compatible with what we're attempting to achieve in this PR for EventCollection. Between its strategy for initial storage allocation and its strategy for growing it, we aren't suffering badly from those allocations. So, it'd be nice if we could simply rely on vector<EventType>for owned storage.
  3. The fly in the ointment is the legacy API. This vector of pointers. I would hope that returning vector<EventType> would be equally efficacious and eliminate all of the overhead and complexity.
    • I do note that this would require related API changes. Specifically, the DoEventType() families of methods. They are declared to take a vector<EventType*> and not vector<EventType>. These are protected virtual methods of LeafSystem and attempting to change this signature in a non-breaking/disciplined way would probably be arduous. (Although the actual Drake code is pretty light in this regard). We are deprecating the purely virtual add_event() method, so further deprecation would not go amiss. We could deprecate get_events() in favor of events().
    • I believe the resulting python API would be largely untouched (although I can imagine the bindings would be tweaked a bit).
    • I'm also in favor of returning a range iterator.

Moral of the story: supporting an API that no longer makes sense is introducing a surplus of madness to the code. I can endure temporary madness, and I'd be resigned if the code documents why it must be considered inevitable madness. But I'd like to be convinced that we're not de facto accepting of the madness.

  1. Complexity? Sure, but consider the paths not taken (below) before drawing conclusions.
  2. The use of a vector<EventType*> as an iteration API is the fundamental flaw of the entire mess. Your observation does raise an issue not addressed in the patch; to be completely heap-averse, the construction should reserve() kStride worth of entries in the pointer vector. But I digress.
  3. Returning vector as an iteration API is just as bad; it still leaks implementation detail. This is the route Evan took in his draft PR [DRAFT] Removes most heap allocations from the Event system #14707. It led to a giant swath of API breakage, because it's not just the event collection class that is affected; the iteration scheme bleeds through to a bunch of callbacks and virtual methods for which there is no reasonable deprecation strategy.

As you note, the correct interface is C++20 ranges. We are not there yet, and $DEITY knows when we will be. Also possible is a traditional begin()/end() iterator interface, but @jwnimmer-tri has suggested (in some discussion thread I'll never find again) that introducing begin()/end() style interfaces at this late date is just churn, and probably wouldn't survive review.

So, paths not taken:
(1) vector iteration -- Evan's #14707 demonstrates this path, without any effort spent on deprecation possibilities.
(2) C++20 ranges -- desired, but not available
(3) begin()/end() iterators -- probably too late to be worth it

Given the choice between introducing new broken iteration APIs, punting until C++20, or repairing the implementation behind the current API, I chose the latter.

Re: deprecating add_event: the affected derived classes are all present in the same file and final, so the damage is contained. It is trivial to implement add_event() on top of AddEvent(). None of these mitigations apply to the APIs broken by replacing get_events().


a discussion (no related file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

One last thought on std::list vs std::vector

If we definitely are keeping the vector<EventType*> get_events() API, here's a proposal (and implied argument) that maintenance might be better than blocks of code.

The maintenance would be conceptually very simple.

  • In AddEvent we add the new Eventto a vector<EventType> storage_.
  • If &storage_[0] == *events_.begin(), storage_ hasn't moved and we can simply append the address to events_.
  • Else, we simply need to clear events_ and re-write all pointer addresses.

The actual test that they are still aligned is dirt cheap. The work to correct misalignment would happen only on those occasions when we'd be doing a heap allocation. So, does it really matter?

That feels more direct and obvious than the code in AddEvent regarding tracking blocks and the supporting documentation.

It could work, but it does raise more questions.

To avoid allocation in most cases, we would need to reserve() storage ahead of time, so we still have something like kStride lurking around.

Is it cheaper to rewrite the pointers proactively, in AddEvent, or lazily, in get_events?

We still have issues of iterator invalidation, and a fairly non-obvious representation invariant, so I wouldn't be surprised if the patch weighed in at similar size to the present one.



systems/framework/event_collection.h, line 440 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit: It was difficult for me to follow this reasoning. And I think the phrase "not to invalidate iterators" is ultimately the culprit that caused me the most confusion.

After reasoning about it, it's not the iterators that get invalidated. It's the pointers stored in the corresponding vector. We'd like to write a pointer to that vector and then forget about it, knowing that it points to the right address no matter how many subsequent events were to be added. If we used a vector<EventType> as storage, we can't guarantee that, upon automatic resizing, that the entire contents of the vector wouldn't be relocated to accommodate the larger size (and this resizing would largely be under the hood, so we'd have to check all the time to confirm that the two data structures didn't fall out of sync.

In theory, there is no difference between pointers and iterators; in practice, there is. :)

I guess I thought that parallel would be clear. The reason the C++ standard has to rattle on about iterator invalidation at all is exactly because storage moves, and most iterators are just pointers in fancy clothes.


systems/framework/event_collection.h, line 459 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

BTW I can imagine a world where a user has a Diagram that they know will exceed 32 events and a strong need to not heap allocate after initialization. Some mechanism where they could easily "reserve" size at initialization time would be good. Even if they don't know the magic number 32, knowing what their event counts would be, calling event_collection.reserve(my_minimum_count)would be nice sugar.

I thought about that, but decided I couldn't justify it. Maybe worth a revisit.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 22 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @jwnimmer-tri and @rpoyner-tri)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
  1. Complexity? Sure, but consider the paths not taken (below) before drawing conclusions.
  2. The use of a vector<EventType*> as an iteration API is the fundamental flaw of the entire mess. Your observation does raise an issue not addressed in the patch; to be completely heap-averse, the construction should reserve() kStride worth of entries in the pointer vector. But I digress.
  3. Returning vector as an iteration API is just as bad; it still leaks implementation detail. This is the route Evan took in his draft PR [DRAFT] Removes most heap allocations from the Event system #14707. It led to a giant swath of API breakage, because it's not just the event collection class that is affected; the iteration scheme bleeds through to a bunch of callbacks and virtual methods for which there is no reasonable deprecation strategy.

As you note, the correct interface is C++20 ranges. We are not there yet, and $DEITY knows when we will be. Also possible is a traditional begin()/end() iterator interface, but @jwnimmer-tri has suggested (in some discussion thread I'll never find again) that introducing begin()/end() style interfaces at this late date is just churn, and probably wouldn't survive review.

So, paths not taken:
(1) vector iteration -- Evan's #14707 demonstrates this path, without any effort spent on deprecation possibilities.
(2) C++20 ranges -- desired, but not available
(3) begin()/end() iterators -- probably too late to be worth it

Given the choice between introducing new broken iteration APIs, punting until C++20, or repairing the implementation behind the current API, I chose the latter.

Re: deprecating add_event: the affected derived classes are all present in the same file and final, so the damage is contained. It is trivial to implement add_event() on top of AddEvent(). None of these mitigations apply to the APIs broken by replacing get_events().

We don't have to wait until C++20 to support range iteration. In geometry we use the MapKeyRange to provide a range iterator through the keys of a map. Essentially, any object that has a begin() and end() method can be used in range iteration. So, a little struct that walks through your internal vector and returns the pointers in turn would hide internals and be ready for C++20.

But that's the easy bit. The hard bit is changing the API that we want others to call. I did a once over the use of the current API. We certainly have a number of regrettable artifacts (APIs explicitly declared to take a vector<EventType*> stands out in my mind). But, as you say, "The use of a vector<EventType*> as an iteration API is the fundamental flaw of the entire mess." So, we're introducing more mess to support a flaw.

Having not tried to address the bad API with a deprecation strategy, I don't have a firm sense of the cost. But I do see the cost of supporting the bad API, and it's pretty undesirable. How would you feel if I tried doing the deprecation? I'm willing to put my money where my mouth is.

But, that said, even if we are stuck supporting the old API, there are simpler, less fraught ways of doing it.


a discussion (no related file):

To avoid allocation in most cases, we would need to reserve() storage ahead of time, so we still have something like kStride lurking around.

As you pointed out, we should be doing that for the pointer vector already. :) And doing that in the constructor for both vectors is good and proper in the paradigm indicated above.

Is it cheaper to rewrite the pointers proactively, in AddEvent, or lazily, in get_events?

I don't know, I merely suspect that it won't matter. We know that get_events() will get called after accumulating. We also are doing this with the expectation that we will seldom resize, so, it's not like we risk getting multiple resizings/relocations in a single allocation that would be replaced by a single handling at get_events(). From that perspective, maintaining an invariant as a result of calling AddEvent() holds the most apeal.

We still have issues of iterator invalidation, and a fairly non-obvious representation invariant, so I wouldn't be surprised if the patch weighed in at similar size to the present one.

I don't think we are referring to the same thing when we talk about "iterator invalidation". Right now, the act of calling AddEvent() has the potential of invalidating iterators on the vector returned by get_events(). Consider the following code:

const auto& events = collection.get_events();
auto itr = events.begin();
collection.AddEvent(my_event_defined_elsewhere);  // <-- Contract of vector says itr is now invalid!

This is what I think about when I think about "iterator invalidation". We've got an iterator to a collection, we perform an operation on that collection which is documented to invalidate all iterators. You've got that problem whether you have a list<Block> or a vector<EventType>.



systems/framework/event_collection.h, line 440 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

In theory, there is no difference between pointers and iterators; in practice, there is. :)

I guess I thought that parallel would be clear. The reason the C++ standard has to rattle on about iterator invalidation at all is exactly because storage moves, and most iterators are just pointers in fancy clothes.

See my note above about iterator invalidation and how none of this code keeps you from suffering from those issues.

In your case, there's a world of difference between iterators and pointers. By "iterator" I'm referring to a vector::iterator into a vector of values (where the values happen to be pointers). Your list guarantees that the values in the vector will point to actual events owned by the collection, but it does nothing for the vector::const_iterator that may be used.


systems/framework/event_collection.h, line 459 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I thought about that, but decided I couldn't justify it. Maybe worth a revisit.

Yeah. In my mind, it's worth capturing with a TODO where the trigger is people saying, "I've got a big system and I don't want to allocate! What can I do?"

@jwnimmer-tri
Copy link
Collaborator

I do agree with @SeanCurtis-TRI that this PR is much more complicated than necessary. I don't think "pointer invalidation" is so acute of a problem as you're treating it here -- just change the documentation of get_events to say that calling any other method afterward invalidates its returned reference. It would also be easy to deprecate it -- where its deprecated, vestigial implementation (currently used outside the framework only by tests) lazily calculates the return value on-demand.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @jwnimmer-tri and @SeanCurtis-TRI)

a discussion (no related file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

We don't have to wait until C++20 to support range iteration. In geometry we use the MapKeyRange to provide a range iterator through the keys of a map. Essentially, any object that has a begin() and end() method can be used in range iteration. So, a little struct that walks through your internal vector and returns the pointers in turn would hide internals and be ready for C++20.

But that's the easy bit. The hard bit is changing the API that we want others to call. I did a once over the use of the current API. We certainly have a number of regrettable artifacts (APIs explicitly declared to take a vector<EventType*> stands out in my mind). But, as you say, "The use of a vector<EventType*> as an iteration API is the fundamental flaw of the entire mess." So, we're introducing more mess to support a flaw.

Having not tried to address the bad API with a deprecation strategy, I don't have a firm sense of the cost. But I do see the cost of supporting the bad API, and it's pretty undesirable. How would you feel if I tried doing the deprecation? I'm willing to put my money where my mouth is.

But, that said, even if we are stuck supporting the old API, there are simpler, less fraught ways of doing it.

r2 continues to avoid API evolution; your arguments that building pointer lists are cheap have convinced me it's not necessary for the immediate goal. The move from add_event to AddEvent is more important. PTAL


a discussion (no related file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

To avoid allocation in most cases, we would need to reserve() storage ahead of time, so we still have something like kStride lurking around.

As you pointed out, we should be doing that for the pointer vector already. :) And doing that in the constructor for both vectors is good and proper in the paradigm indicated above.

Is it cheaper to rewrite the pointers proactively, in AddEvent, or lazily, in get_events?

I don't know, I merely suspect that it won't matter. We know that get_events() will get called after accumulating. We also are doing this with the expectation that we will seldom resize, so, it's not like we risk getting multiple resizings/relocations in a single allocation that would be replaced by a single handling at get_events(). From that perspective, maintaining an invariant as a result of calling AddEvent() holds the most apeal.

We still have issues of iterator invalidation, and a fairly non-obvious representation invariant, so I wouldn't be surprised if the patch weighed in at similar size to the present one.

I don't think we are referring to the same thing when we talk about "iterator invalidation". Right now, the act of calling AddEvent() has the potential of invalidating iterators on the vector returned by get_events(). Consider the following code:

const auto& events = collection.get_events();
auto itr = events.begin();
collection.AddEvent(my_event_defined_elsewhere);  // <-- Contract of vector says itr is now invalid!

This is what I think about when I think about "iterator invalidation". We've got an iterator to a collection, we perform an operation on that collection which is documented to invalidate all iterators. You've got that problem whether you have a list<Block> or a vector<EventType>.

list punted.



systems/framework/event_collection.h, line 202 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

I think this would be better as void AddEvent(EventType) override;. Generally, this is getting invoked with r-value instances of events. So, passing by value in the signature and doing a std::move in the body would eliminate unnecessary copies. Particularly valueable where the vent has non-null data (currently owned by a copyable_unique_ptr which is very amenable to std::move.

Done.


systems/framework/event_collection.h, line 459 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Yeah. In my mind, it's worth capturing with a TODO where the trigger is people saying, "I've got a big system and I don't want to allocate! What can I do?"

r2 provides Reserve(). No changes to LeafSystem are yet made to make it more useful. I'm going to claim a LeafSystem with more than 32 events (when there are only something like 9 or 12 combinations of type and schedule) has deeper problems.


systems/framework/leaf_system.h, line 959 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).

Maybe not? https://abseil.io/tips/117 https://herbsutter.com/2020/02/17/move-simply/


systems/framework/leaf_system.h, line 1000 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).

see above.


systems/framework/leaf_system.h, line 1039 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

nit: If you change AddEvent() to pass-by-value, this should be std::move(forced).

see above.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The teensiest of disagreements. :)

Reviewed 4 of 4 files at r2.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 459 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

r2 provides Reserve(). No changes to LeafSystem are yet made to make it more useful. I'm going to claim a LeafSystem with more than 32 events (when there are only something like 9 or 12 combinations of type and schedule) has deeper problems.

The key is that it's not an individual leaf system that will cause overflow. It's a large diagram with many leaf systems. They all accumulate into a common event collection. That said, the fact that we're now using vector instead of array means there's a very low bar in the future if we chose to give runtime control over the initial reserved size. Happily, that's tomorrow's problem.


systems/framework/event_collection.h, line 363 at r2 (raw file):

  // TODO(siyuan): provide an iterator instead.
  const std::vector<const EventType*>& get_events() const {
    events_.clear();

BTW This surprised me. I hadn't thought you'd rebuild the list every time you asked for it. I thought you'd rebuild it when you detected the two vectors came out of sync since it's such a trivial test. The advantage of doing it in AddEvent() is that you are guaranteed you can skip the events_.empty() predicate. :)

It may well be that the cost of copying N pointers for each round of "clear-add-add-...-add-get_events" is essentially the same cost as testing for synchronicity at each "add". (My gut says it's "more", but I have no idea how much more.)

Still, this has the wonderful elegance of being obviously correct without tricks. :)


systems/framework/leaf_system.h, line 959 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Maybe not? https://abseil.io/tips/117 https://herbsutter.com/2020/02/17/move-simply/

Definitely. forced is an l-value variable. The only way to pass it by value to AddEvent is to copy-construct the parameter. However, by creating an r-value reference, the parameter will use move construction and then move again into the collection's data.


systems/framework/leaf_system.h, line 1000 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

see above.

likewise.


systems/framework/leaf_system.h, line 1039 at r1 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

see above.

Ditto

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri)


systems/framework/leaf_system.h, line 959 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Definitely. forced is an l-value variable. The only way to pass it by value to AddEvent is to copy-construct the parameter. However, by creating an r-value reference, the parameter will use move construction and then move again into the collection's data.

Herb does feel strongly about compiler intelligence.....now I'm curious to try. BRB.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri)


systems/framework/leaf_system.h, line 959 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Herb does feel strongly about compiler intelligence.....now I'm curious to try. BRB.

The following code:

#include <iostream>
#include <utility>

class Foo {
public:
  Foo() {
    std::cout << "Constructr\n";
  }
  Foo(const Foo&) {
    std::cout << "Copy constructor\n";
  }
  Foo& operator=(const Foo&) {
    std::cout << "Copy assignment\n";
    return *this;
  }
  Foo(Foo&&) {
    std::cout << "Move constructor\n";
  }
  Foo& operator=(Foo&&) {
    std::cout << "Move assignement\n";
    return *this;
  }
};

void Bar(Foo f) {
  std::cerr << "Called bar!\n";
}

void transient() {
  std::cerr << "transient\n";
  Foo f;
  Bar(f);
}

int main() {
  Foo f;
  Bar(f);
  transient();
  Bar(std::move(f));
  return 0;
}

produces the following output (for both clang++ and g++):

Constructr
Copy constructor
Called bar!
transient
Constructr
Copy constructor
Called bar!
Move constructor
Called bar!

So, I'm not sure which compiler Herb has, but the compilers we're using don't seem to make any such promise.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee SeanCurtis-TRI(platform), needs at least two assigned reviewers (waiting on @rpoyner-tri)


systems/framework/leaf_system.h, line 959 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

The following code:

#include <iostream>
#include <utility>

class Foo {
public:
  Foo() {
    std::cout << "Constructr\n";
  }
  Foo(const Foo&) {
    std::cout << "Copy constructor\n";
  }
  Foo& operator=(const Foo&) {
    std::cout << "Copy assignment\n";
    return *this;
  }
  Foo(Foo&&) {
    std::cout << "Move constructor\n";
  }
  Foo& operator=(Foo&&) {
    std::cout << "Move assignement\n";
    return *this;
  }
};

void Bar(Foo f) {
  std::cerr << "Called bar!\n";
}

void transient() {
  std::cerr << "transient\n";
  Foo f;
  Bar(f);
}

int main() {
  Foo f;
  Bar(f);
  transient();
  Bar(std::move(f));
  return 0;
}

produces the following output (for both clang++ and g++):

Constructr
Copy constructor
Called bar!
transient
Constructr
Copy constructor
Called bar!
Move constructor
Called bar!

So, I'm not sure which compiler Herb has, but the compilers we're using don't seem to make any such promise.

Actually, the thought just popped in my head. I think he's warning against people using std::move on returned or thrown values. I.e., those values that get propagated out of the function, not those passed downwards.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM:

+@soonho-tri for platform review.

Reviewed 1 of 1 files at r3.
Reviewable status: LGTM missing from assignee soonho-tri(platform) (waiting on @soonho-tri)

@soonho-tri soonho-tri assigned ggould-tri and unassigned soonho-tri May 26, 2021
Copy link
Member

@soonho-tri soonho-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-@soonho-tri
+@ggould-tri for platform review (punting to tomorrow's reviewer)

Reviewable status: LGTM missing from assignee ggould-tri(platform) (waiting on @ggould-tri)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: with some naming/const concerns.

Reviewed 4 of 8 files at r1, 2 of 4 files at r2, 1 of 1 files at r3.
Reviewable status: 2 unresolved discussions (waiting on @rpoyner-tri)


systems/analysis/test/simulator_limit_malloc_test.cc, line 79 at r3 (raw file):

    // construction.
    // TODO(rpoyner-tri): whittle allocations down to 0.
    test::LimitMalloc heap_alloc_checker({.max_num_allocations = 21});

:applause:


systems/framework/event_collection.h, line 359 at r3 (raw file):

  /**
   * Returns a const reference to the vector of const pointers to all of the
   * events.

minor: Big uncommented hazard here.

This notionally const return value will nonetheless change out from under the caller, and not only when they expect it (actual mutation of the events) but upon any call to the notionally const "simple getter" get_events.


systems/framework/event_collection.h, line 362 at r3 (raw file):

   */
  // TODO(siyuan): provide an iterator instead.
  const std::vector<const EventType*>& get_events() const {

This method seems to violate both the letter and the spirit of https://drake.mit.edu/styleguide/cppguide.html#Function_Names , albeit in different ways: It is far from a simple accessor, it contains an O(n) loop, and it is six lines long.

I'm not inclined to be a stickler for length, but the const and mutability semantics here make me concerned that calling code that uses this as a simple getter as its name suggests will be sad.

Might it be better to deprecate-and-rename?

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions (waiting on @ggould-tri)


systems/framework/event_collection.h, line 359 at r3 (raw file):

Previously, ggould-tri wrote…

minor: Big uncommented hazard here.

This notionally const return value will nonetheless change out from under the caller, and not only when they expect it (actual mutation of the events) but upon any call to the notionally const "simple getter" get_events.

Excellent point. The following code:

auto itr1 = collection.get_events().begin();
auto itr2 = collection.get_events().end();

will have the surprising effect of sometimes invalidating itr1. It'd be a bit like playing Russian roulette -- most of the time itr1 would be fine. But if we ever need to increase the size of the vector and that requires moving the supporting data array, itr1 would no longer be valid. The worst kind of bug: the apparently random, rare kind.


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, ggould-tri wrote…

This method seems to violate both the letter and the spirit of https://drake.mit.edu/styleguide/cppguide.html#Function_Names , albeit in different ways: It is far from a simple accessor, it contains an O(n) loop, and it is six lines long.

I'm not inclined to be a stickler for length, but the const and mutability semantics here make me concerned that calling code that uses this as a simple getter as its name suggests will be sad.

Might it be better to deprecate-and-rename?

Amusing observation. This method is the center and crux of all of the difficulty and previous discussion of this PR. The function is fundamentally flawed (before it was flawed in that it exposed too much of the implementation details) and now it's flawed because it's fundamentally incompatible with the desire to reduce/eliminate heap allocations and defies the solution (collection-ownership of events).

In an ideal world, deprecation wouldn't just change the name, it would change the return type as well. But that's a very big fish to fry.

One way to mitigate your specific concerns above is to move the update of the vector of pointers to the end of AddEvent() (a comment I've made elsewhere). That would address most of your concerns:

  1. It wouldn't mutate simply by calling get_events()
  2. get_events() would, again, be justified in its snake_case.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 359 at r3 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Excellent point. The following code:

auto itr1 = collection.get_events().begin();
auto itr2 = collection.get_events().end();

will have the surprising effect of sometimes invalidating itr1. It'd be a bit like playing Russian roulette -- most of the time itr1 would be fine. But if we ever need to increase the size of the vector and that requires moving the supporting data array, itr1 would no longer be valid. The worst kind of bug: the apparently random, rare kind.

I'm not entirely convinced that itr1 would be invalidated without an intervening call to AddEvent, but no matter. If anyone is confused, it is confusing.


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Amusing observation. This method is the center and crux of all of the difficulty and previous discussion of this PR. The function is fundamentally flawed (before it was flawed in that it exposed too much of the implementation details) and now it's flawed because it's fundamentally incompatible with the desire to reduce/eliminate heap allocations and defies the solution (collection-ownership of events).

In an ideal world, deprecation wouldn't just change the name, it would change the return type as well. But that's a very big fish to fry.

One way to mitigate your specific concerns above is to move the update of the vector of pointers to the end of AddEvent() (a comment I've made elsewhere). That would address most of your concerns:

  1. It wouldn't mutate simply by calling get_events()
  2. get_events() would, again, be justified in its snake_case.

Sigh, https://martinfowler.com/bliki/TwoHardThings.html. It appears here we have, or will have, both.

@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented May 27, 2021

I'll reinforce what I've already said above. I still think it's an easy way out. What am I missing?

  • Change the documentation of get_events to say that calling any other method afterward invalidates its returned reference.
  • Deprecate get_events.
  • Change all of the call sites within Drake to use a version that access the other vector flavor instead const std::vector<EventType>& events() const

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only point I'd quibble about is: "...calling any other method afterward invalidates..." (emphasis added). The point I made below is that calling get_events() again also has the power to "invalidate" its previously returned reference.

Reviewable status: 2 unresolved discussions (waiting on @rpoyner-tri)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what Jeremy says there is correct (except the word "other" -- at least per the standard, if not in practice, get_events is as guilty as the rest).

Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees ggould-tri(platform),SeanCurtis-TRI(platform) (waiting on @ggould-tri and @SeanCurtis-TRI)


systems/framework/event_collection.h, line 359 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I'm not entirely convinced that itr1 would be invalidated without an intervening call to AddEvent, but no matter. If anyone is confused, it is confusing.

FYI per the spec: "clear()...Invalidates any references, pointers, or iterators referring to contained elements. Any past-the-end iterators are also invalidated."

I suspect that in practice because invalid iterator use is UB, our ubsan would catch this if the standard library ever actually made use of the invalidation allowance in the spec.

@jwnimmer-tri
Copy link
Collaborator

Sure, nix the other from my the comment.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See r4. I'm not against getting rid of get_events(); I just don't believe it has to be tackled in this PR. My primary goal in the entire #14543 PR train was to separate functional changes needed to reorganize allocations from (as seen in #14707) pretty arbitrary and obfuscating API breakage.

It's hard enough to get a 100-line patch right. Let's save the 1000-line patch for later, and have it be clear that no logic or properties are changed by it.

Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees ggould-tri(platform),SeanCurtis-TRI(platform) (waiting on @ggould-tri and @SeanCurtis-TRI)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees ggould-tri(platform),SeanCurtis-TRI(platform) (waiting on @rpoyner-tri)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, ggould-tri wrote…

Blocking comment while I look into the runtime complexity of this change.

Closing this as I opened discussions below for this: I think this PR leaks quadratic logic into the Simulator step loop.


Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions (waiting on @rpoyner-tri)

a discussion (no related file):
At this point, I think we've seen most of the ideas that could make sense here. The original objection to r1 was that it was too complex. However, the "simpler" implementations have other problems.

Seems to me there are some possible paths forward:

  1. Continue to try to defer API breakage, which probably leads back to a variation on r1.
  2. Blow up the API and write a proper one, which (for me) means probably both abandoning functions that require the return value of get_events(), and implementing a range iterator.


systems/framework/event_collection.h, line 373 at r4 (raw file):

Previously, ggould-tri wrote…

A snake_case method calling a CamelCase method is a bad smell; calling a linear-time one is a standards violation. I think this method needs to be deprecated or respelled.
Also this caused me to look up the call chain on this; see an additional objection below.

It's already deprecated. And yes, the performance consequence is unfortunate, but only in r4. I'm starting to like r1 (with some tweaks) a bit more.


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, ggould-tri wrote…

This method is now quadratic (well, O(n*m) for the two collection sizes) with the new AddEvent implementation. And it is called inside the inner loop of Simulator::AdvanceTo.

I don't think we should add quadratic code to the simulator inner loop.

Is there some performance bound guarantee that I'm missing here?

Nope. You are right. FWIW, Evan's primary goal is determinism (allocation lock avoidance) rather than pure speed, but still the (in principle) unbounded quadratic is no good.

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

At this point, I think we've seen most of the ideas that could make sense here. The original objection to r1 was that it was too complex. However, the "simpler" implementations have other problems.

Seems to me there are some possible paths forward:

  1. Continue to try to defer API breakage, which probably leads back to a variation on r1.
  2. Blow up the API and write a proper one, which (for me) means probably both abandoning functions that require the return value of get_events(), and implementing a range iterator.

I'm definitely at 2 right now. It seems more propitious to write the correct API, make everything use that API, and then reimplement the deprecated API in terms of the new one.

But I'm also happy to get on a call about this if you want to chat it through.



systems/framework/event_collection.h, line 373 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

It's already deprecated. And yes, the performance consequence is unfortunate, but only in r4. I'm starting to like r1 (with some tweaks) a bit more.

Oh, it's inheriting the deprecation from superclass and my brain didn't catch "final" as a flavor of "override". Quite right.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions (waiting on @ggould-tri and @rpoyner-tri)

a discussion (no related file):

Previously, ggould-tri wrote…

I'm definitely at 2 right now. It seems more propitious to write the correct API, make everything use that API, and then reimplement the deprecated API in terms of the new one.

But I'm also happy to get on a call about this if you want to chat it through.

I think my only reservation with 2 is that the damage will not be easily contained by deprecation, since it leaks into virtual methods (already hard to deprecate) that are open to be implemented by anyone.


Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+(status: do not merge) -- there may or may not be an API-preserving implementation that passes review, but r4 is not it. There's an open thread below for discussions if you like.

Reviewable status: 2 unresolved discussions, labeled "do not merge" (waiting on @ggould-tri and @rpoyner-tri)

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, labeled "do not merge" (waiting on @ggould-tri)


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Nope. You are right. FWIW, Evan's primary goal is determinism (allocation lock avoidance) rather than pure speed, but still the (in principle) unbounded quadratic is no good.

I've thrown the DNM flag against r4 while discussions continue. I can do some work to see how containable the necessary API rework can be. Marking myself "satisfied" here that any successors will not have this particular defect.

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: labeled "do not merge" (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I think my only reservation with 2 is that the damage will not be easily contained by deprecation, since it leaks into virtual methods (already hard to deprecate) that are open to be implemented by anyone.

If some of these have to be "changed without deprecation" it's probably not too bad because I doubt any users are subclassing the event mechanism.


Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: labeled "do not merge" (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, ggould-tri wrote…

If some of these have to be "changed without deprecation" it's probably not too bad because I doubt any users are subclassing the event mechanism.

It's not subclassing the event mechanism. It's the leakage of the iteration defect into LeafSystem virtuals: DoPublish, DoCalcDiscreteVariableUpdates, and DoCalcUnrestrictedUpdate. See #14707 for a sampling of the blast radius.


Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: labeled "do not merge" (waiting on @ggould-tri)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

It's not subclassing the event mechanism. It's the leakage of the iteration defect into LeafSystem virtuals: DoPublish, DoCalcDiscreteVariableUpdates, and DoCalcUnrestrictedUpdate. See #14707 for a sampling of the blast radius.

r5 is probably my last shot at path 1. path 2 is always open if we need it.



systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Sigh, https://martinfowler.com/bliki/TwoHardThings.html. It appears here we have, or will have, both.

Arguably r5 still violates the letter of the function names law, but it maintains the return signature. Deprecating just the name would not be too bad.


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I've thrown the DNM flag against r4 while discussions continue. I can do some work to see how containable the necessary API rework can be. Marking myself "satisfied" here that any successors will not have this particular defect.

r5 removes the quadratic explosion here. PTAL

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5.
Reviewable status: labeled "do not merge" (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

r5 is probably my last shot at path 1. path 2 is always open if we need it.

I like r5 but cannot figure out how you would test-cover its soundness; it has the characteristics of a landmine laid before our future feet.



systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Arguably r5 still violates the letter of the function names law, but it maintains the return signature. Deprecating just the name would not be too bad.

Explicit invalidity bits: Still the ugliest solution, still the most reliable solution.

I'm a bit brain-fried right now so let me make sure I'm reasoning through this proof correctly:

  • get_event will be constant-time unless an AddEvent intervenes, in which case it's linear time.
  • AddEvent itself is amortized constant time regardless of frequency.
  • Therefore a loop containing exactly one get_event call is linear time in its event work iff a constant fraction of its iterations contain a linear number of AddEvents.
  • So the AddEvent loop in DoAddToEnd is linear because its linear-manyAddEvent calls occur without an intervening get_event.

I think this implementation is sound. But I definitely don't want to reason through it again so I think you need implementation comments explaining it.

Also, and I apologize for being the wet blanket on this PR, I cannot find any unit testing of this class, which now has enough complexity to be wrong in annoying and subtle ways. I'm not sure how this can get test coverage, but I am not confident in the ability of subsequent readers to prove it statically correct.


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

r5 removes the quadratic explosion here. PTAL

See above: Looks correct to me, needs commentary and if possible testing or it will break the first time someone who doesn't understand it makes a change.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5.
Reviewable status: labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Arguably r5 still violates the letter of the function names law, but it maintains the return signature. Deprecating just the name would not be too bad.

I will never understand your insistence to put this code in this function. I think it complicates everything (and now has given birth to yet another member field). However, I also recognize that if I can't persuade you to my side, then my idea or my persuasion is defective and I can't do anything to address either of those. So, while I'm not happy with this, but I'm not going to block.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions, labeled "do not merge" (waiting on @rpoyner-tri)

a discussion (no related file):

Previously, ggould-tri wrote…

I like r5 but cannot figure out how you would test-cover its soundness; it has the characteristics of a landmine laid before our future feet.

My first thought is rather than (or in addition to) write external tests, to write an invariant checker similar to the one abandoned after r1. Of course the invariants are different, but the principle is the same. For r5, you have a fairly simple set of clauses: if the dirty bit is set, few or no invariants apply, otherwise, the vectors must be same-sized, pointers must point to proper storage entries, etc. Structural induction, yes?



systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

I will never understand your insistence to put this code in this function. I think it complicates everything (and now has given birth to yet another member field). However, I also recognize that if I can't persuade you to my side, then my idea or my persuasion is defective and I can't do anything to address either of those. So, while I'm not happy with this, but I'm not going to block.

Grant: I believe your analysis is correct. Perhaps invariant checking, and some tests, may allay your concerns.

Sean: I used to be a homeowner. "Hey, let's hook up to the splufty new town sewer!" "But to do that, we'll have to cut up the driveway." "Hey let's re-pave the driveway!" "Huh, the front walk is looking lousy, let's re- do that too..."

I don't particularly insist on putting code in that function. What I insist on is a limited change set that can be understood to be correct, to accomplish the immediate goal, and not borrow trouble from all of the other shiny objects lying nearby. The entire point of r1 was to not introduce code into get_events(), but you didn't like that solution either.


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, ggould-tri wrote…

See above: Looks correct to me, needs commentary and if possible testing or it will break the first time someone who doesn't understand it makes a change.

Agreed. Will add invariant checks and round out tests, after i've pondered existing (borrowed from elsewhere) coverage a bit.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions, labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Grant: I believe your analysis is correct. Perhaps invariant checking, and some tests, may allay your concerns.

Sean: I used to be a homeowner. "Hey, let's hook up to the splufty new town sewer!" "But to do that, we'll have to cut up the driveway." "Hey let's re-pave the driveway!" "Huh, the front walk is looking lousy, let's re- do that too..."

I don't particularly insist on putting code in that function. What I insist on is a limited change set that can be understood to be correct, to accomplish the immediate goal, and not borrow trouble from all of the other shiny objects lying nearby. The entire point of r1 was to not introduce code into get_events(), but you didn't like that solution either.

Your response is proof that I've failed to communicate. We don't have different values, we're not trying to achieve different ends. What I'm suggesting is a smaller change than the one you have and, I believe, simpler. I'll recreate it here.

  const std::vector<const EventType*>& get_events() const {
    return events_;
  }

  void AddEvent(EventType event) final {
    events_storage_.push_back(std::move(event));
    if (&events_storage_[0] == events_[0]) {
      events_.push_back(&events_storage_.back());
    } else {
      events_.clear();
      for (const auto& entry : events_storage_) {
        events_.push_back(&entry);
      }
    }
  }

This code:

  1. Has no extra tracking bits.
  2. Guarantees that at the conclusion of each AddEvent() invocation that events_storage_ and events_ are in sync.
  3. If events_storage_ gets resized there are two cases:
    • It gets resized in place, which means we only write one pointer to events_, or
    • It gets moved during resize, in case we write the whole of events_ (again, guaranteeing synchronization).
  4. Requires nothing to be declared mutable.
  5. Depending on what you believe about branch prediction, the assumption that the branch predicate is true is, by design, generally true. :)

This is all I've been suggesting all along. It's got some nice invariants, nice computation properties, and is compact.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions, labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 380 at r5 (raw file):

  void add_event(std::unique_ptr<EventType> event) final {
    DRAKE_DEMAND(event != nullptr);
    AddEvent(*event);

nit: to be consistent this should be std::move(*event), otherwise we're copying the contents of *event.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions, labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Your response is proof that I've failed to communicate. We don't have different values, we're not trying to achieve different ends. What I'm suggesting is a smaller change than the one you have and, I believe, simpler. I'll recreate it here.

  const std::vector<const EventType*>& get_events() const {
    return events_;
  }

  void AddEvent(EventType event) final {
    events_storage_.push_back(std::move(event));
    if (&events_storage_[0] == events_[0]) {
      events_.push_back(&events_storage_.back());
    } else {
      events_.clear();
      for (const auto& entry : events_storage_) {
        events_.push_back(&entry);
      }
    }
  }

This code:

  1. Has no extra tracking bits.
  2. Guarantees that at the conclusion of each AddEvent() invocation that events_storage_ and events_ are in sync.
  3. If events_storage_ gets resized there are two cases:
    • It gets resized in place, which means we only write one pointer to events_, or
    • It gets moved during resize, in case we write the whole of events_ (again, guaranteeing synchronization).
  4. Requires nothing to be declared mutable.
  5. Depending on what you believe about branch prediction, the assumption that the branch predicate is true is, by design, generally true. :)

This is all I've been suggesting all along. It's got some nice invariants, nice computation properties, and is compact.

Dawn breaks over Marblehead . Thank you.

Also, taking else branch would probably be a clue that kDefaultCapacity should be larger.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions, labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Agreed. Will add invariant checks and round out tests, after i've pondered existing (borrowed from elsewhere) coverage a bit.

It just got simpler; maybe r6 won't require so much commentary.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: labeled "do not merge" (waiting on @ggould-tri and @SeanCurtis-TRI)


systems/framework/event_collection.h, line 428 at r4 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

It just got simpler; maybe r6 won't require so much commentary.

New claim for r6; AddEvent is amortized O(1), with a linear worst case that we should approximately never hit, and can avoid by doing a larger Reserve() if necessary.

Copy link
Contributor

@SeanCurtis-TRI SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r6.
Reviewable status: labeled "do not merge" (waiting on @rpoyner-tri)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Dawn breaks over Marblehead . Thank you.

Also, taking else branch would probably be a clue that kDefaultCapacity should be larger.

Your statement inspired me. If you really wanted to get ambitious, you could put a

static const logging::Warn log_once(
        "Simulator event collection had to be resized while adding events. {} was not large enough, try calling `Reserve()` with a larger value.",
        events_.size());

right before you call `events_clear_.

That said, the primary reason to not give that warning is that we could have resized in place -- we still asked the heap for memory (the thing we're trying to avoid). So, this might capture some instances of detecting a heap allocation, but not all of them.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, labeled "do not merge" (waiting on @rpoyner-tri)

a discussion (no related file):
Adding a work item to investigate the value of kDefaultCapacity; at least with the systems already available in the Drake test suite.



systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Your statement inspired me. If you really wanted to get ambitious, you could put a

static const logging::Warn log_once(
        "Simulator event collection had to be resized while adding events. {} was not large enough, try calling `Reserve()` with a larger value.",
        events_.size());

right before you call `events_clear_.

That said, the primary reason to not give that warning is that we could have resized in place -- we still asked the heap for memory (the thing we're trying to avoid). So, this might capture some instances of detecting a heap allocation, but not all of them.

Testing also reveals a missing case in the r6 code; we don't handle the first step from empty properly. Easily fixed by amending the if condition.

Relevant to: RobotLocomotion#14543

Rewrite the storage of event collections to avoid most allocations,
while maintaining most of the pre-existing public API. To support this
change, allow fully-derived-type events to be copied and assigned. The
Clone() mechanism is still supported, primarily for interfacing with
Python.

This patch also deprecates EventCollection<E>::add_event() and any
overrides.
Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, labeled "do not merge" (waiting on @rpoyner-tri and @SeanCurtis-TRI)

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Adding a work item to investigate the value of kDefaultCapacity; at least with the systems already available in the Drake test suite.

FWIW, the current value of 32 is enough to avoid reallocations against any of the systems exercised in bazel test //...
The test setup involved putting a hard-stop assertion in the else clause of the r7 AddEvent implementation.



systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Testing also reveals a missing case in the r6 code; we don't handle the first step from empty properly. Easily fixed by amending the if condition.

I'm not sure resizing in place is a thing? Imma go with the invalidation advice from https://en.cppreference.com/w/cpp/container/vector/push_back : either all the iterators are dead, or just end().

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, labeled "do not merge" (waiting on @rpoyner-tri and @SeanCurtis-TRI)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I'm not sure resizing in place is a thing? Imma go with the invalidation advice from https://en.cppreference.com/w/cpp/container/vector/push_back : either all the iterators are dead, or just end().

I think my reason to not implement the warning string is that it maybe doesn't (yet) give the user a clear path to action. Is this class visible enough to users that they can actually do the relevant Reserve() call? I took a brief tour of the relevant code, but I'm not confident that the path to Reserve() is particularly easy.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: labeled "do not merge" (waiting on @SeanCurtis-TRI)


systems/framework/event_collection.h, line 362 at r3 (raw file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

I think my reason to not implement the warning string is that it maybe doesn't (yet) give the user a clear path to action. Is this class visible enough to users that they can actually do the relevant Reserve() call? I took a brief tour of the relevant code, but I'm not confident that the path to Reserve() is particularly easy.

FWIW, LimitMalloc has gained the ability to dump all allocation stacks at once: #14899. Perhaps that and the other commentary in this class will be enough for now.

Copy link
Contributor Author

@rpoyner-tri rpoyner-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-(status: do not merge) I'm satisfied with the current solution. Thanks for your patience, reviewers!

Reviewable status: labeled "do not merge" (waiting on @SeanCurtis-TRI)

@rpoyner-tri rpoyner-tri merged commit a910475 into RobotLocomotion:master May 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants