Canonicalization of Entities #98

Closed
Anders429 opened this issue Aug 25, 2022 · 3 comments · Fixed by #100
Labels
A - Querying Area: Querying components of entities. A - Scheduling Area: Parallel scheduling of systems. A - Storage Area: Storage inside a World. A - Systems Area: Systems defined to operate over entities. C - Enhancement Category: New feature or request. P - Low Priority: Not particularly urgent.

Comments

@Anders429
Owner

Anders429 commented Aug 25, 2022

This is an idea that has haunted me for a while. Originally, brood was architected without archetype::Identifier, with Archetype being generic on the entity type. That only works if the entity type is "canonical", which requires some kind of canonicalization.

The original canonicalization I used didn't scale. It ended up generating 2^n code paths for n components, which blew up compile times. My small tests with 4 components worked fine, but when I expanded to the 27 components used by some of the ECS bench suite tests, it quickly fell apart.

So I decided to use the run-time archetype identifier solution. It incurs a runtime cost of allocating the archetype identifier, and that cost increases linearly with the number of components, but it works well enough.
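
To make the linear cost concrete, here is a minimal sketch of a run-time identifier, assuming one bit per registry component stored in a heap-allocated byte buffer. The `Identifier` type and its methods are hypothetical names for illustration, not brood's actual `archetype::Identifier` API.

```rust
/// Hypothetical run-time archetype identifier: one bit per registry component.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Identifier {
    bits: Vec<u8>,
}

impl Identifier {
    /// Allocates enough bytes to hold one bit for each component in the registry.
    pub fn new(component_count: usize) -> Self {
        Self {
            bits: vec![0; (component_count + 7) / 8],
        }
    }

    /// Marks the component at `index` (its position in the registry) as present.
    pub fn set(&mut self, index: usize) {
        self.bits[index / 8] |= 1 << (index % 8);
    }

    /// Returns whether the component at `index` is present.
    pub fn contains(&self, index: usize) -> bool {
        (self.bits[index / 8] & (1 << (index % 8))) != 0
    }

    /// Size of the heap allocation in bytes; grows linearly with the registry.
    pub fn byte_len(&self) -> usize {
        self.bits.len()
    }
}
```

Because the bits are indexed by registry position, two entities with the same components in a different order produce equal identifiers, at the price of one allocation per identifier built.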

However, I think I now have a method of actually doing canonicalization at compile time, transforming any arbitrary entity into a canonical entity. It exists here at this playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=06ce568a5291035c8da980d887de8892

Huge shoutout to frunk and Lloyd Chan's blog post about how frunk was written. He was right, the hardest part of this stuff was getting the type signatures right.

The solution above uses the Registry as the canonical form: it walks the registry's components in order and, for each one, tries to pull that component out of the entity. The Get trait, which is generated for each entity for each of its components, does the pulling. If the Get trait isn't generated for the entity for that component, the component isn't in the entity, and so we can safely skip it. Through crazy type shenanigans, we can encode this for the compiler to understand. Because the invalid cases of pulling a component that doesn't exist are never generated at compile time, and therefore never attempted, the compiler generates the code quickly, even for large numbers of components.
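
The idea can be sketched roughly as follows. This is not the code from the linked playground; it is a minimal illustration that assumes entities and registries are nested tuples terminated by `Null`. The names `Here`, `There`, `Contained`, `NotContained`, `Canonicalize`, and the example components are all hypothetical.

```rust
use std::marker::PhantomData;

/// Terminator for heterogeneous lists encoded as nested tuples.
pub struct Null;

// Index markers recording where a component sits inside an entity.
pub struct Here;
pub struct There<I>(PhantomData<I>);

/// Pulls component `T` out of an entity, leaving the remainder behind.
pub trait Get<T, I> {
    type Remainder;
    fn get(self) -> (T, Self::Remainder);
}

impl<T, Tail> Get<T, Here> for (T, Tail) {
    type Remainder = Tail;
    fn get(self) -> (T, Tail) {
        self
    }
}

impl<T, Head, Tail, I> Get<T, There<I>> for (Head, Tail)
where
    Tail: Get<T, I>,
{
    type Remainder = (Head, <Tail as Get<T, I>>::Remainder);
    fn get(self) -> (T, Self::Remainder) {
        let (head, tail) = self;
        let (found, rest) = <Tail as Get<T, I>>::get(tail);
        (found, (head, rest))
    }
}

// Per-registry-component markers: either the entity contains it, or it is skipped.
pub struct Contained<I>(PhantomData<I>);
pub struct NotContained;

/// Implemented on a registry: reorders entity `E` into registry order.
pub trait Canonicalize<E, M> {
    type Canonical;
    fn canonicalize(entity: E) -> Self::Canonical;
}

// Base case: the registry is exhausted and every entity component was consumed.
impl Canonicalize<Null, Null> for Null {
    type Canonical = Null;
    fn canonicalize(_: Null) -> Null {
        Null
    }
}

// The entity contains the registry's head component: pull it out and recurse.
impl<C, R, E, I, M> Canonicalize<E, (Contained<I>, M)> for (C, R)
where
    E: Get<C, I>,
    R: Canonicalize<<E as Get<C, I>>::Remainder, M>,
{
    type Canonical = (C, <R as Canonicalize<<E as Get<C, I>>::Remainder, M>>::Canonical);
    fn canonicalize(entity: E) -> Self::Canonical {
        let (component, rest) = <E as Get<C, I>>::get(entity);
        let tail = <R as Canonicalize<<E as Get<C, I>>::Remainder, M>>::canonicalize(rest);
        (component, tail)
    }
}

// The entity does not contain the head component: skip it.
impl<C, R, E, M> Canonicalize<E, (NotContained, M)> for (C, R)
where
    R: Canonicalize<E, M>,
{
    type Canonical = <R as Canonicalize<E, M>>::Canonical;
    fn canonicalize(entity: E) -> Self::Canonical {
        <R as Canonicalize<E, M>>::canonicalize(entity)
    }
}

/// Convenience wrapper so that the marker type `M` can be inferred.
pub fn canonicalize<R, E, M>(entity: E) -> <R as Canonicalize<E, M>>::Canonical
where
    R: Canonicalize<E, M>,
{
    <R as Canonicalize<E, M>>::canonicalize(entity)
}

// Example components and registry, purely for illustration.
pub struct Position(pub f32);
pub struct Velocity(pub f32);
pub struct Health(pub u32);
pub type Registry = (Position, (Velocity, (Health, Null)));
```

With this sketch, `canonicalize::<Registry, _, _>((Velocity(2.0), (Position(1.0), Null)))` has the type `(Position, (Velocity, Null))`, whatever order the entity was written in. An entity holding a component that is not in the registry matches none of the implementations, so the call fails to compile.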

To use this in the library will require a good amount of footwork, but in the end it will simplify quite a bit. This should allow us to rip out archetype identifiers, possibly rip out component maps, and make insertions much faster. Additionally, this canonicalization can be applied to views as well, which means queries should be much faster too.

@Anders429 Anders429 added the C - Enhancement Category: New feature or request. label Aug 25, 2022
@Anders429
Owner Author

Not sure if I can actually fully rip out archetype identifiers or component maps, even with this.

We still need archetype identifiers for internal locations, as well as for serialization and deserialization. And component maps are still needed to construct those archetype identifiers, and for locating component columns in archetypes on queries and additions/removals.

But this still adds great simplifications to push and insert. Rather than always allocating a new archetype identifier, we only have to allocate it on the first insertion. That's a huge benefit, especially for programs that insert and extend on the world a lot.
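
One way this could look, as a rough sketch: once every entity has a single canonical type, that type's `TypeId` can key a cache, so the identifier is only built the first time a given canonical type is inserted. The `Archetypes` struct, its fields, and `index_for` below are hypothetical, and the byte vector merely stands in for an archetype identifier.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Hypothetical archetype table that caches lookups by canonical entity type.
#[derive(Default)]
pub struct Archetypes {
    index_by_type: HashMap<TypeId, usize>,
    identifiers: Vec<Vec<u8>>,
}

impl Archetypes {
    /// Returns the archetype index for canonical entity type `E`.
    /// `build_identifier` only runs (and allocates) on the first call for `E`.
    pub fn index_for<E, F>(&mut self, build_identifier: F) -> usize
    where
        E: 'static,
        F: FnOnce() -> Vec<u8>,
    {
        let identifiers = &mut self.identifiers;
        *self
            .index_by_type
            .entry(TypeId::of::<E>())
            .or_insert_with(|| {
                identifiers.push(build_identifier());
                identifiers.len() - 1
            })
    }

    /// Number of identifiers that have been allocated so far.
    pub fn identifier_count(&self) -> usize {
        self.identifiers.len()
    }
}
```

This only avoids repeated allocations if `E` is already canonical; two differently ordered entity types would otherwise get separate cache entries.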

I'm looking into a way to improve performance on views as well. With this method, we can guarantee that a view has only one view on each component, so long as the registry only has one of each component.

It would also be great if there were a way to do insertion and removal of components with this canonicalization, but there just doesn't seem to be a way to encode the type for each location's archetype, since there can be 2^n possible types for n components. It's just not practical to store 2^n variants or 2^n fields, so the type erasure is necessary.

@Anders429
Owner Author

One other benefit of these new heterogeneous list powers: A lot of things in #49 are now possible. We can make them "compile time checks" by making the program not compile if they aren't satisfied. For example, we can make the canonical entity trait only compile if the components are in the registry.

@Anders429 Anders429 mentioned this issue Sep 6, 2022
5 tasks
@Anders429
Owner Author

In the work on this so far, it seems that this introduces a lot of generics into signatures. Basically, for every trait that can be reordered or indexed into in relation to a registry (which is basically all of the heterogeneous ones), you have to include a separate marker type to indicate the location of such a type. From what I can tell, it seems like that won't be necessary if we can do some kind of negative bounds, but that's most likely a long way off in stable Rust.

The issue that arises here is that World::query(), World::par_query(), Entry::remove(), Entry::query(), and the boilerplate surrounding System and ParSystem implementations all just got a lot busier, since they were best used by just supplying the types. There's no way to omit the trailing generics, it seems.

I wonder if it might be good to introduce some kind of Query<V, F> type that could be passed to a query() method. That would get us back to only having to specify two types. For systems, we could also explore some kind of macro for defining the systems without all of the repeated boilerplate.
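
A rough sketch of how such a type could hide the extra generics, assuming the marker types are trailing generic parameters that the compiler can infer. Everything here (`Query::new`, `NoFilter`, `IndexOf`, and this toy `World`) is hypothetical, and the returned column index only stands in for a real query result.

```rust
use std::marker::PhantomData;

pub struct Null;

// Index markers locating a component inside a registry.
pub struct Here;
pub struct There<I>(PhantomData<I>);

/// Position of component `C` in a registry; `I` is the marker to be inferred.
pub trait IndexOf<C, I> {
    const INDEX: usize;
}

impl<C, R> IndexOf<C, Here> for (C, R) {
    const INDEX: usize = 0;
}

impl<C, H, R, I> IndexOf<C, There<I>> for (H, R)
where
    R: IndexOf<C, I>,
{
    const INDEX: usize = 1 + <R as IndexOf<C, I>>::INDEX;
}

/// Placeholder filter that accepts everything.
pub struct NoFilter;

/// Zero-sized value carrying only the view and filter types.
pub struct Query<V, F> {
    view: PhantomData<V>,
    filter: PhantomData<F>,
}

impl<V, F> Query<V, F> {
    pub fn new() -> Self {
        Query {
            view: PhantomData,
            filter: PhantomData,
        }
    }
}

pub struct World<R> {
    registry: PhantomData<R>,
}

impl<R> World<R> {
    pub fn new() -> Self {
        World {
            registry: PhantomData,
        }
    }

    /// `V` and `F` come from the argument, so the marker `I` is inferred and
    /// callers never have to spell it out or pad the call with `_`.
    pub fn query<V, F, I>(&self, _query: Query<V, F>) -> usize
    where
        R: IndexOf<V, I>,
    {
        <R as IndexOf<V, I>>::INDEX
    }
}

// Example components, purely for illustration.
pub struct Position;
pub struct Velocity;
```

A call then reads `world.query(Query::<Velocity, NoFilter>::new())`, naming only the two types the caller cares about.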

@Anders429 Anders429 linked a pull request Sep 13, 2022 that will close this issue
@Anders429 Anders429 added P - Low Priority: Not particularly urgent. A - Storage Area: Storage inside a World. A - Querying Area: Querying components of entities. A - Systems Area: Systems defined to operate over entities. A - Scheduling Area: Parallel scheduling of systems. labels Sep 15, 2022