added ecs back and forth, part 12

skypjack · skypjack · commit 2be1c2c9eff0 · 2021-08-29T17:55:17.000+02:00
diff --git a/_posts/2021-08-29-ecs-baf-part-12.md b/_posts/2021-08-29-ecs-baf-part-12.md
@@ -0,0 +1,208 @@
+---
+layout: post
+title: ECS back and forth
+subtitle: Part 12 - Introduction to sparse sets and pointer stability
+gh-repo: skypjack/entt
+gh-badge: [star, follow]
+tags: [ecs, entt, cpp]
+---
+
+With my [last post](https://skypjack.github.io/2021-06-12-ecs-baf-part-11/) I've
+revised the big matrix model and given some hints on pointer stability among the
+other things.<br/>
+This time, I want to dig a little further into the sparse set model to describe
+how we can have pointer stability also in this case.
+
+There are actually many tricks to implement such an useful feature without
+sacrificing performance or wasting memory unnecessarily.<br/>
+This post will be an introductory and purely descriptive analysis to get to what
+is the new and definitive model adopted in
+[`EnTT`](https://github.com/skypjack/entt), which I'll describe in the next post
+instead.
+
+## Introduction
+
+Pointer stability is a really nice-to-have feature that is often sacrificed when
+it comes to working with an entity-component-system architecture.<br/>
+It has many advantages that are worth it in my humble opinion. Among the others:
+
+* Hierarchies are as trivial and efficient as they can get. No hidden costs, no
+  tricks, just a plain pointer to the parent element and that's it.
+
+* Third party libraries integration becomes straightforward, especially with C
+  ones that expect users to allocate objects and provide them to the library
+  itself.
+
+Unfortunately, some models make it especially hard to implement pointer
+stability, for example archetypes. This is due to the fact that they want to
+move entities around every time a component is added or removed. In this case,
+unless you are fine with finding a compromise like introducing a sparse set
+aside your tables, pointer stability isn't really an option.<br/>
+More in general, all models that don't rely on independent pools are in troubles
+to support this kind of feature out of the box. On the other side, the big
+matrix as well as sparse sets are a perfect fit.
+
+We've already seen how trivial it is to have pointer stability with the big
+matrix. Let's see now what it means to support the same with sparse sets.
+
+## Scope of the problem
+
+When working with sparse sets, pointers are usually invalidated in two cases by
+many implementations:
+
+* When adding entities and therefore components to pools.
+* When removing entities and therefore components from pools.
+
+At least, these are the functionalities for which users may expect pointer
+stability. We'll see how the former is trivial to address while the latter
+requires some clever tricks instead to get the maximum out of it.<br/>
+There exist also other operations which invalidate pointers, of course. For
+example, [`EnTT`](https://github.com/skypjack/entt) allows to
+[sort pools in-place](https://skypjack.github.io/2019-09-25-ecs-baf-part-5/). It
+goes without saying that all references, pointers and iterators are invalidated
+when sorting. However, as an user I don't expect pointers to remain stable if I
+sort a container and therefore we won't care about this kind of functionalities.
+
+## Component creation
+
+Adding a component can invalidate references if elements are tightly packed in a
+single chunk of memory. This is pretty intuitive.<br/>
+As soon as we go out of space, we need to allocate a larger piece of memory and
+migrate all components from one side to the other. No way a pointer can survive
+this operation (well, technically speaking it's somewhat be possible but I
+wouldn't rely on this assumption).
+
+To get around reference invalidation as introduced by a reallocation, all we
+need to do is to paginate the component array.<br/>
+This can introduce a few extra jumps (the number of which depends on the page
+size) but all in all their cost during iterations is negligible if not
+irrelevant at all. On the other side, _reallocating_ is much cheaper and no
+pointer is invalidated when we run out of space.<br/>
+The latter is true because moving the components means to only move the pointers
+to the component pages. This is both cheaper in terms of performance and safer
+in terms of stability. In other terms, no component will be mistreated anymore.
+
+## Component destruction
+
+Having pointer stability during component destruction is trickier instead. What
+happens in general with sparse sets is that we _swap-and-pop_ the component to
+remove with the last element in the array. This means that there exists a
+component somewhere the reference of which is invalidated after this
+operation.<br/>
+To get around this problem, we must get rid of the _swap_ operation and just
+_pop_. That is, we'll be literally creating _holes_ in our pools. However, how
+do we recognize these holes? Is it possible to recycle them later on?
+
+With one of my
+[previous posts](https://skypjack.github.io/2019-05-06-ecs-baf-part-3/) (a very
+old one actually) I described how `EnTT` embeds both the entity and its version
+within an identifier.<br/>
+Moreover, I also explained how this design can be used to create implicit lists
+of entities within an otherwise uniform vector of identifiers. Let's see how far
+we can push this trick now for supporting stable pointers in a sparse set.
+
+### Tombstone entity: yay or nay?
+
+The first thing to do is to find a way to _recognize_ our holes. One of the
+possibilities is to use the `null` entity as a _tombstone entity_. However, this
+has an inherent problem.<br/>
+From the post linked above, we know that we can use the entity part of an
+identifier to store the position of the next element of our implicit list. In
+fact, unless the version occupies half of the size of an identifier (and this
+would really be a waste), it's not suitable for the same purpose as well.
+Therefore, using the `null` entity to identify tombstones would prevent us from
+using our nice trick to create a _list of holes_ to recycle.
+
+It goes without saying that this solution has some advantages though. First of
+all, it doesn't require us to introduce anything to support pointer stability.
+We already have all we need to create tombstones. Moreover, it makes deletion
+trivial and fully thread safe: we can now delete multiple components of the same
+type in parallel without risking any conflict.<br/>
+On the other side, there are also some disadvantages. For example, we don't know
+where our holes are. In theory, we can make the elements in the sparse array
+still refer to the holes and recycle them when the entity is re-assigned the
+same component, though this risks to waste lot of space in the long run and need
+some syncing that may introduce bugs. The other way around is to _search_ the
+next hole when needed, even though this could be very expensive more often than
+not.<br/>
+All in all, our best bet seems to leave holes there and re-compact pools from
+time to time, without even trying to re-use them otherwise.
+
+### Tombstone version and in-place delete
+
+What if we introduced a tombstone version instead? Also in this case, we can
+still find tombstones in a pool with a check of the versions. However, this time
+we can also use the entity parts of our identifiers to construct an implicit
+list of holes within our list of entities. To do that, we only need to add an
+extra member to our pool to store aside the position of the first element to
+recycle.<br/>
+Concurrent deletion is only slightly more complex to manage but still not a
+problem. in fact, we can only conflict with other threads when it comes to
+updating the head of the list of holes during component destruction. An atomic
+variable gets the job done and will give us enough guarantees in this sense.
+
+## Iteration policy
+
+How much the solution above affects iteration performance?
+
+First of all, it's worth noting that pointer stability isn't necessarily enabled
+for all pools. The `EnTT` library allows to turn it on and off on a per-type
+basis. This is possible due to the independent pools nature of its design.<br/>
+This is a must have for this kind of feature. If a type is primarily accessed
+linearly, it doesn't make sense to enable pointer stability for it. On the other
+side, types that are usually queried randomly or that require pointer stability
+for other reasons (i.e. hierarchy support or because you want to pass them to a
+third party library that expects stable references) are good candidates for this
+feature.
+
+Consider now two pools for two different types: `T` for which pointer stability
+is enabled and `U` for which it is not.<br/>
+If you remember how things are iterated with sparse sets, we usually picks the
+shortest pool and perform a validity check on the other pool before returning
+entities (and components). There are two cases here:
+
+* `U` leads the iteration. In this case, pointer stability doesn't affect the
+  performance at all, since we don't have holes in the pools for `U`.
+* `T` leads the iteration. In this case, pointer stability introduces a branch
+  that is meant to recognize and discard tombstones.
+
+In many (likely the vast majority of) cases, this isn't a problem since the
+overall cost may still be negligible. However, it may show up in your
+measurements in some cases, especially if `T` is linearly iterated very often
+and/or along hot paths.<br/>
+This should clearly explain why pointer stability is less worth it when types
+are primarly accessed linearly. In this case, a tightly packed array without
+holes performs better without a doubt.<br/> 
+However, my advice is to measure, measure, measure! The number of entities and
+the type of operation performed on the components could easily hide or overcome
+the tombstone check. As a true story, I've seen very large projects with
+thousands of entities and components enabling this exact model for all types
+without observing any relevant change in performance despite everything.
+
+## No policy is the best policy
+
+If it's true that pointer stability is a super nice feature, it's also true that
+no policy is the best policy when it comes to iterating entities and
+components.<br/>
+Therefore, we can further refine our model and also get rid of the tombstone
+check. This will bring us to the old, dear model we already know without any
+additional costs.
+
+However, as this post has already gone on for a long time, I'll leave this topic
+to a _part 2_ that will follow shortly.<br/>
+To give you some context, `EnTT` went through more or less all the steps
+described above, up to the policy based implementation. Only recently I've
+further refined the model in use to eliminate the _tombstone check_ and I'm moving
+the library towards that solution.
+
+Stay tuned if you want to know more!!
+
+## Let me know that it helped
+
+I hope you enjoyed what you've read so far.
+
+If you liked this post and want to say thanks, consider to star the
+[GitHub project](https://github.com/skypjack/skypjack.github.io) that hosts this
+blog. It's the only way you have to let me know that you appreciate my work.
+
+Thanks.