-
-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How are components/entity stored in memory? #3
Comments
@raizam simple question, complex answer ;-) First of all, reflecs is an "archetype" based ECS framework. You can find a short description of this design approach here: https://github.com/SanderMertens/ecs-faq#what-is-an-archetype-based-ecs. Also, this blog (from the author of EnTT) has a pretty good description of archetype-based ECS frameworks. In short, an archetype based ECS stores entities with the same set of components in the same place (arrray). Entities with components (A, B, C) will be stored in a different location from (A, B), which will be stored in a different location from (B, C) etc. In reflecs, archetypes store their entities using the SoA (structure of arrays) approach. You could say that the storage for an archetype (A, B, C), for 1000 entities looks like this (simplified): struct MyType {
A a[1000] ;
B b[1000];
C c[1000];
}; The current master uses the AoS approach- but reflecs is almost migrated to SoA. See the aos_to_soa branch for the latest version This approach is very cache friendly, because if I create a system that only needs a subset of the components (say, A, C), I will not waste any CPU cache space because of B (as would be the case with AoS- array of structs). Systems are matched against the different archetypes. For example, a system can express interest in (A, C), and it will match with all archetypes that have these components. When you iterate over your entities, your system receives the arrays from all matching archetypes, like so: // Three simple component types
typedef int A;
typedef int B;
typedef int C;
// This function will run twice per iteration, once for each archetype
void MySystem(EcsRows *rows) {
A *a = ecs_column(rows, A, 1); // direct access to the array from the archtype
C *c = ecs_column(rows, C, 2);
for (int i = 0; i < rows->limit; i ++) {
a[i] += c[i];
}
}
int main(int argc, char *argv[]) {
EcsWorld *world = ecs_init();
// Register components and system
ECS_COMPONENT(world, A);
ECS_COMPONENT(world, B);
ECS_COMPONENT(world, C);
ECS_SYSTEM(world, MySystem, EcsOnFrame, A, C);
ECS_ENTITY(world, Entity_1, A, B, C); // Creates archetype A, B, C
ECS_ENTITY(world, Entity_2, A, B, C); // Adds another entity to A, B, C
ECS_ENTITY(world, Entity_3, A, C); // Creates archetype A, C
// Because we have two archetypes, MySystem will be invoked twice
ecs_progress(world, 0);
return ecs_fini(world);
} Hopefully this gives you an idea of the storage model! Let me know if you have more questions. |
Actually, that was my underlying question, I was hoping it implements archetypes :) Do you think it's stable already or this is still WIP ? |
It is still work in progress. The After the merge, the API will be pretty stable. It may take me 1-2 more weeks to make sure everything is tested properly before I merge. |
That SoA vs AoS aspect is very interesting... I've also read that stackoverflow thread, Now I'm reading your explanation again and I'm starting to understand. |
I'm not sure if I'll spend any time on that (though you're welcome to ;-). I did do some benchmarks with the new branch vs. EnTT, and it is quite fast: Adding/removing components is more expensive. This is to be expected because of the archetypes design (EnTT is not archetypes). Iterations are more or less in the same ballpark for a lot of the cases. |
A bit off-topic but the simple explanation above is super clear! I'm trying to understand how a sparse-array approach with groups (i.e. EnTT) works under the hood, but don't speak C++ and have trouble finding resources. By any chance do you know of a simple C-like explanation of the sparse-array (with groups) approach? |
I'll try ;) In EnTT, the registry has an array per component. Because not each entity has all components, the entities don't match up perfectly in the arrays (e.g. Essentially you start with the smallest set (lets say, Groups are a clever optimization that allow you to iterate entities for a certain set of components as regular C arrays, no sparse sets needed. To accomplish this, groups sort component arrays, so that all entities for a given component combination are at the top of the arrays, at the same indices. If I create a group It's a pretty elegant way to combine both the sparse set model (which is flexible) and the direct-array access model (which is fast). It's not perfect though: components can only appear in one group. I cannot, for example, have both Interestingly, EnTT add performance with groups is very similar to unoptimized adds in Flecs (perf measured on the type_graph branch): Add three components (n = 1000000):
Anyway, hopefully that gives you some idea. The author of EnTT also wrote an excellent series of blogs on this topic, you might want to check this out: [1] you strictly can have components in multiple groups if you make them non-owning, which essentially combines a group with an optimized sparse set, where the sparse set points to the indices that have the non-owned component. |
Fantastic, thanks!!! It's very helpful! As it happens I took some notes in the interim try and understand and @skypjack helped me confirm it... happy to see that it matches exactly what you wrote :) Those notes are here: https://gist.github.com/dakom/82551fff5d2b843cbe1601bbaff2acbf I'd think that in the situation of wholly-owned groups, the performance would be equal to archetypes since in both cases the components are iterated linearly? |
Iteration performance is similar, but as with anything there are trade-offs. The advantage of groups is that all component data is in a single array, whereas in an archetype-based system you can have multiple arrays. For example, if I want to iterate In extreme cases where you have lots of different archetypes, this fragmentation of the data space starts to show. When iterations move from one archetype to another you inevitably incur a few cache misses. However, as long as entities significantly outnumber your archetypes, these cache misses will inherently be insignificant. The big (in my opinion) advantage of archetypes is that you don't have to choose which combination of components you want to iterate as a straight array. You iterate any combination of components as a straight array, with as only downside that you may need to iterate multiple arrays. Aside from iteration, there is additional overhead in archetypes as entities have to be moved between archetypes when adding/removing components. This overhead is however quite similar to the overhead of groups as you can see from the perf numbers in my previous comment. The explanation for this is that groups also move entities around when adding/removing to keep entities in the group at the top of the array. |
I just added a chapter to the manual that explains the internals of Flecs in more detail: More content to follow... |
Hi, first of all this is a very interesting project and I've been keeping an eye on it.
My question is simple, how is organized the memory layout?
The text was updated successfully, but these errors were encountered: