Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structure of Arrays Components and Custom Allocator Support #68

Open
randomPoison opened this issue Dec 4, 2014 · 15 comments
Open

Structure of Arrays Components and Custom Allocator Support #68

randomPoison opened this issue Dec 4, 2014 · 15 comments

Comments

@randomPoison
Copy link
Contributor

I've been looking at entityx and it seems like a great library, thanks for makings such a great library 👍 .

Is there any plan/desire to add support for defining components as a struct of arrays (as opposed to making them arrays of structs as they are now)? I'm thinking something along the lines of what's described in this BitSquid post and the one before it.

Hand-in-hand with that goes custom allocator support. I know entityx already uses memory pools to keep cache coherency, but is there any way to provide your own allocation scheme to be used? I'll admit that I haven't looked too much into whether this option is already available, but it would be necessary to allow struct-of-array type components.

Thanks again for your hard work on entityx.

@alecthomas
Copy link
Owner

Interesting article, though I don't see how the approach taken by EntityX precludes what he describes. Each pool manages a single component, so if you reduce your components to single attributes you have the same approach described at the beginning of the article. eg.
eg.

struct Position: Component<Position> {
  Vector3 position;
};
struct Velocity: Component<Velocity> {
  Vector3 velocity;
};
struct Acceleration: Component<Acceleration> {
  Vector3 acceleration;
};

The obvious difference with the latter part of the article is that component arrays aren't offset within a single block of memory, but I'm not convinced that's a big win in reality.

However all that said, I'm not opposed to supporting custom component allocators. I think it's a good idea. It also wouldn't be that difficult to refactor to achieve this.

PS. Glad you like EntityX!

@alecthomas
Copy link
Owner

Actually this could also address the concerns in #67, given an appropriate allocator.

@randomPoison
Copy link
Contributor Author

Thanks for the response! I'm curious, how would this help resolve #67? I'm not seeing the connection between custom component allocators and having to iterate over all of the entities every frame.

@alecthomas
Copy link
Owner

I think one of the main concerns in #67 is holes in the component vectors and how that affects cache coherency. That part (at least) could be solved with a custom allocator that compacts the holes.

@alecthomas
Copy link
Owner

I've just pushed an experimental branch that performs a lot of the component computations to compile time. As part of that I also added support for custom component storage, and implemented ColumnStorage.

I'm not 100% sure that the compile-time approach is a good one, I'll have to do some benchmarks, but I'm a bit concerned about compilation bloat.

@alecthomas
Copy link
Owner

I think to be more efficient, the storage implementations will also need to be in control of the entity component bitmaps. This would allow them to perform more intelligent compaction, etc.

@alecthomas
Copy link
Owner

I've been doing a bunch of research into data structures for entity-component systems. I think this is one of the best articles, and it basically states that there is no silver bullet (which is also the conclusion I had come to). If you have one column per component, then add packing, the IDs in each column will become desynced. This will lead to parallel iteration over multiple components blowing out the cache as it has to jump all over the place to match the IDs in whichever component is the "primary" iteration target.

The bitsquid article you mentioned at the beginning does not cover this at all.

I think what might be the best general purpose solution is to do two things:

  1. Provide a storage implementation that produces statistics on access patterns.
  2. Provide a storage implementation that lets you group packed columns of components. eg. PackedGroupedColumnStorage<ColumnGroup<Position, Velocity, Acceleration>, Body, Mesh>

This should provide general consumers with enough flexibility to get pretty close to optimal performance out of the library.

@wivlaro
Copy link

wivlaro commented Feb 13, 2015

Continued from #86 , I wasn't suggesting striping the data even more by splitting components further and having something convenient to re-assemble them.

Although cache-coherency can definitely be one advantage to using entity-component systems, another is simply the code-organisation (performance notwithstanding). Some use-cases are that you use large components on a few objects and want to take advantage of the ECS for consistency of application code organisation.

I was suggesting something like this e3cba3e so we can simply specify an alternative Pool implementation.

I was considering a couple of alternative pool implementations:

  • PackedPool - similar to the default pool (allocated in chunks) but tightly packed with a lookup and a minheap free list (similar to the EntityManager)
  • NoPool - just defers back to the std::allocator (a vector of pointers) (Granted, I could just make a component with the pointer inside, but it might be nice for the application developer to be able to easily switch strategy to adapt to their particular use-case without rewriting application code.)

[OT] Also, have you considered using a std::vector as a minheap for the free list of the EntityManager instead of the FIFO list? It might help cache consistency. e.g. If I allocate entities 100, then free id 100 down to 2 and then allocate 1 more, I'll be left with a huge gap. Just wondering if you tried benchmarking a minheap free list.

@wivlaro
Copy link

wivlaro commented Feb 14, 2015

Actually, a min-heap probably would make no difference.

@alecthomas
Copy link
Owner

Although cache-coherency can definitely be one advantage to using entity-component systems, another is simply the code-organisation (performance notwithstanding).

Yes... I am aware... In fact, code organisation is the primary reason to use an ECS in my opinion, and what most of the seminal articles cite as motivation.

I was suggesting something like this e3cba3e so we can simply specify an alternative Pool implementation.

Yes, that is the second part of this ticket: "Custom Allocator Support". Though perhaps I should consolidate into a new discussion ticket specifically about custom allocators.

I was suggesting something like this e3cba3e so we can simply specify an alternative Pool implementation.

This solves one problem: "overriding allocation for a single component". But I think there are a few other use cases it should support to be generally useful. I'd be interested in hearing thoughts, but here's what I have considered:

Required:

  • Custom allocation per-component.
  • Custom allocation for groups of components (eg. position+velocity are always used together).

Useful:

  • Support for per-allocator iteration? eg. for a PackedAllocator implementation, being able to iterate directly over the components without ID indirection could be useful.
  • Ability to construct an allocator with arguments before passing to EntityManager. eg. chunk size, memory alignment parameters, etc.

Other considerations:

  • Cache coherency.
  • Sparsity.
  • Large component support.

I haven't had time to come up with a design for this that I like yet, but I definitely would like to support some or all of these features.

@wivlaro
Copy link

wivlaro commented Feb 14, 2015

Required:

  • Custom allocation per-component.
  • Custom allocation for groups of components (eg. position+velocity
    are always used together).

Definitely per-component is useful. Different components have different usage patterns.

Maybe where components are always used together and it's a performance
bottleneck, they could just be written as one component? I would argue that
this feature is less important.

Useful:

  • Support for per-allocator iteration? eg. for a PackedAllocator
    implementation, being able to iterate directly over the components without
    ID indirection could be useful.

Yeah this would be quite good for isolated single-component updates.

  • Ability to construct an allocator with arguments before passing to
    EntityManager. eg. chunk size, memory alignment parameters, etc.

This could also be useful. It brings me onto another question: do you find
use-cases for having multiple EntityManagers in a project? Eg. where
interaction is minimal between two distinct sets of entities and generally
have different sets of components, such as where all of one set of
user-controlled entities can collide with all of one set of generated
enemies, but neither group collides with their own group.

I feel like I should just have one manager rather than having to manage the
interactions between many, but with the memory pooling currently there,
just wondered how you find yourself using it.

@waldnercharles
Copy link

Does entityx currently not support any form of cache coherency when deleting and re-adding components / entities? (in compile_time branch or master)

eg. When deleting an entity does it swap the deleted entity and its corresponding components with the last known valid entity / components? (swap would be the wrong term, more like, overwrite.)

@alecthomas
Copy link
Owner

@waldnercharles there is a fork of the compile_time branch that has a PackedColumnStorage that does just this. It is incomplete, but I got it to a stage where benchmarks showed generally worse performance than the naive solution.

I haven't profiled it in detail though.

@waldnercharles
Copy link

Worse performance? Is that just when deleting or is that also when iterating? or all of the above?

@alecthomas
Copy link
Owner

TBH I don't remember the exact details, but I'm assuming it was during iteration. The fact that the indices of stored components can differ means that iteration can become quite complex. For example if you have two component columns a=[0, 1, 2, 3]; b=[0, 1, 2, 3] and you remove b[1], the order will then be a=[0, 1, 2, 3]; b=[0,3,2]. This means that iterating over these two columns in parallel involves quite a bit of housekeeping, and added complexity.

Edit: The optimize() method sorts the columns so they are all in the same order.
Edit 2: This post outlines the problem quite well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants