finish and document our novel/unique event system #1169

slimsag · 2024-03-05T18:17:51Z

Important

The point of this issue is to share what I am exploring with others interested in Mach because this is a foundational part not yet established. We're exploring this path, and this is the direction we're going in for now.

This issue is to catch others up on where we are going, not an invitation to discuss ad nauseam the different approaches to ECS (in my experience, there are a lot of people who are very interested in discussing ECS as a code-golf/architecture challenge while not solving real problems with it).

meta: the point of that statement is to say that this is the direction we're currently exploring, at this point more discussion of ECS won't help us progress that exploration - we need code improvements and actual testing to further the discussion. It's not to say 'you are not welcome here' but rather to say 'I can't spend a ton of time discussing this at length, we want to play with it and experiment and find out' - it's an apology, not a 'go away' statement 🙂

Event system

Events (aka message passing) are our (yet-to-be-proven, novel/unique, currently less-than-half-implemented) approach to solving a lot of problems with typical ECS designs.

Mach modules will support sending events to one another. This is partially implemented today, with global (one->many) and module-to-module (one->one) events.
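A minimal sketch of the two kinds of send described above, written in Python for brevity. It is not Mach's API: `EventBus`, `register`, `send`, `send_global`, and the module and event names are all hypothetical, chosen only to show the one->one and one->many shapes.

```python
class EventBus:
    """Hypothetical event bus; illustrates the idea only, not Mach's API."""

    def __init__(self):
        # handlers[module_name][event_name] -> handler function
        self.handlers = {}

    def register(self, module, event, handler):
        self.handlers.setdefault(module, {})[event] = handler

    def send(self, module, event, data=None):
        """Module-to-module (one->one): deliver to exactly one module."""
        self.handlers[module][event](data)

    def send_global(self, event, data=None):
        """Global (one->many): deliver to every module that handles the event."""
        for module_handlers in list(self.handlers.values()):
            handler = module_handlers.get(event)
            if handler is not None:
                handler(data)


log = []
bus = EventBus()
bus.register("physics", "tick", lambda dt: log.append(f"physics tick {dt}"))
bus.register("renderer", "tick", lambda dt: log.append(f"renderer tick {dt}"))
bus.register("renderer", "resize", lambda size: log.append(f"renderer resize {size}"))

bus.send_global("tick", 16)                 # reaches physics and renderer
bus.send("renderer", "resize", (800, 600))  # reaches renderer only
```

Here a system is simply whatever handler a module registered for an event name, which is the framing the rest of this issue builds on.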

Order-of-execution challenges

ECS design patterns often face a challenge of order-of-execution with systems: e.g. you may want physics systems to run before rendering systems as a trivial example.

In other ECS implementations, you often describe systems as executing in a specific order via e.g. a sorting integer, or by specifying the systems they depend on, which should execute first. It is kind of like adding all functions to a big [][]fn, then executing them in order (using the 2nd list as 'systems that can be run in parallel').

Our solution to this is different, and based on the fact that a message send() will be a synchronization point in the sense that in A->B or A->BC, B and C will be guaranteed to execute after A.

In other words, our approach is a bit like 'call foo() then bar() if you want them to execute in that order' while the traditional approach is a bit more like 'set the priority of foo=1 and bar=2 if you want them to execute in that order, and don't forget about baz=2!'
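To make the contrast concrete, here is a rough sketch in Python (not Mach code). The first half orders systems by a priority integer; the second half orders them by sending an event. The `Bus` class, its queued `send`, and the event names are assumptions made for illustration; the issue only states that handlers of a sent event are guaranteed to run after the sender, not how that is implemented.

```python
from collections import deque

# Traditional approach: order comes from a priority number attached to each system.
systems = [(2, "render"), (1, "physics"), (2, "audio")]
priority_order = [name for _, name in sorted(systems, key=lambda s: s[0])]


# Event approach (hypothetical Bus): order comes from who sends what.
class Bus:
    def __init__(self):
        self.handlers = {}    # event name -> list of handlers
        self.queue = deque()  # events waiting to be dispatched

    def on(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)

    def send(self, event):
        # Assumption: sends are queued, so receivers run after the sender returns.
        self.queue.append(event)

    def run(self):
        while self.queue:
            event = self.queue.popleft()
            for handler in self.handlers.get(event, []):
                handler()


trace = []
bus = Bus()


def physics():
    trace.append("physics start")
    bus.send("physics_done")  # A -> B, C: render and audio run after physics
    trace.append("physics end")


bus.on("tick", physics)
bus.on("physics_done", lambda: trace.append("render"))
bus.on("physics_done", lambda: trace.append("audio"))

bus.send("tick")
bus.run()
```

In the second half nothing declares that render comes after physics; that ordering falls out of physics being the one that sends `physics_done`.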

Parallelism / multi-threading

Traditionally ECS requires some up-front declarative API design:

Specifying which order systems should execute in
Specifying which systems have dependencies on other systems
Specifying exactly what ECS data will be queried (potentially down to a specific set of entities, components, and whether you will write or just read component data)
Potentially specifying (or deferring until end-of-frame) creation of new entities.

Traditionally the ECS system scheduler takes all of this information into account when scheduling systems, because if enough information is declared ahead of time then one can work out which systems can execute in parallel without needing e.g. heavy-handed row-level locks on component data for entities to avoid data races.
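As an illustration of why those declarations matter, the following Python sketch groups systems into batches that could run in parallel, based purely on declared reads and writes. It is a simplified greedy scheme and not any particular ECS library's scheduler; the system and component names are made up for the example.

```python
def conflicts(a, b):
    """Two systems conflict if one writes data the other reads or writes."""
    return bool(
        a["writes"] & (b["reads"] | b["writes"])
        or b["writes"] & (a["reads"] | a["writes"])
    )


def schedule(systems):
    """Greedy batching that keeps declared order between conflicting systems."""
    batches = []
    for system in systems:
        # Find the last batch containing a system that conflicts with this one.
        last_conflict = -1
        for i, batch in enumerate(batches):
            if any(conflicts(system, other) for other in batch):
                last_conflict = i
        # Place the system in the batch right after that conflict.
        if last_conflict + 1 == len(batches):
            batches.append([])
        batches[last_conflict + 1].append(system)
    return [[s["name"] for s in batch] for batch in batches]


# Hypothetical systems with their declared component access.
systems = [
    {"name": "input", "reads": set(), "writes": {"Velocity"}},
    {"name": "physics", "reads": {"Velocity"}, "writes": {"Position"}},
    {"name": "ai", "reads": {"Position"}, "writes": {"Target"}},
    {"name": "render", "reads": {"Position", "Sprite"}, "writes": set()},
    {"name": "audio", "reads": {"Sound"}, "writes": set()},
]

batches = schedule(systems)
```

Every system in one batch touches disjoint data (or only reads shared data), so a batch can run on separate threads without locks. The cost is that every system must state its access up front.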

The downside to this traditional approach is that it requires you to be much more declarative in using the ECS API, which I believe makes it harder to iterate quickly. Additionally, it leads to some 'awkward' situations where one may need to e.g. break up a system that is logically just doing one thing into two independent systems to fit the requirements of ECS' declarative patterns.

Again, our solution to this is different:

The plan is to leverage sending events as an opportunistic point of parallelism. For example, every frame a global .tick event (a one->many event) is sent to every system that wishes to listen for it, and this would be an opportunity to execute all systems listening for that event in parallel on different threads.
As a result, a system in our ECS is not 'a function' but rather 'an event handler' (a function with optional data sent to it)
As a result, a system in our ECS does not need to declare ahead-of-time what exactly it will query and why. Instead, it can do as it likes - as the programmer intends - and ECS can be the good 'help me structure my code' abstraction that it tries to be.
- As a consequence, we either (1) have to accept data races if you run two systems in parallel that access the same data - or (2) come up with an alternative system to prevent this without harming performance. We choose (2).
- Our belief is that with some DB-inspired techniques (think: row locks) and additional tooling (to observe event-sending behavior and to measure different parallel execution orders at runtime for contention), we can do some runtime-guided optimization of parallelism points in our scheduler and get the best of both worlds.
- This is, in many ways, similar to the tradeoffs Zig makes compared to Rust's borrow checker - for example.
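The shape of that plan can be sketched in Python as follows. This is an assumption-heavy illustration, not Mach's design: `World`, `update`, `send_tick_parallel`, and both handlers are hypothetical, and a lock per entity row stands in for whatever DB-inspired technique is eventually chosen. Python threads also do not run bytecode truly in parallel, so this shows structure only, not performance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class World:
    """Hypothetical entity storage with one lock per entity row."""

    def __init__(self):
        self.rows = {}   # entity id -> dict of component values
        self.locks = {}  # entity id -> lock guarding that row

    def add(self, entity, **components):
        self.rows[entity] = dict(components)
        self.locks[entity] = threading.Lock()

    def update(self, entity, fn):
        # Row-level lock: only one handler mutates a given row at a time.
        with self.locks[entity]:
            fn(self.rows[entity])


def physics_on_tick(world, dt):
    def move(row):
        row["x"] += row["vx"] * dt

    for entity in list(world.rows):
        world.update(entity, move)


def damage_on_tick(world, dt):
    def hurt(row):
        row["hp"] -= 1

    for entity in list(world.rows):
        world.update(entity, hurt)


def send_tick_parallel(world, handlers, dt):
    """One->many send: run every tick handler on its own worker thread."""
    with ThreadPoolExecutor(max_workers=len(handlers)) as pool:
        futures = [pool.submit(handler, world, dt) for handler in handlers]
        for future in futures:
            future.result()  # wait for completion and surface any exception


world = World()
for entity in range(100):
    world.add(entity, x=0.0, vx=2.0, hp=10)

for _ in range(10):
    send_tick_parallel(world, [physics_on_tick, damage_on_tick], 0.5)
```

Neither handler declares what it will touch. The row locks prevent races at the cost of possible contention, which is what the proposed tooling would measure.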

Other use-cases

A keen reader of all the above will have observed something: If all things in our game/app state are derived from named events and data - then surely we could leverage this information for some interesting use cases?

Observing events as a method of debugging, visualizing and profiling execution flow
Sending events from another process to control your game/app (e.g. game editor sending events to your game)
Replaying events to arrive at the same effective game state, if you're careful, such as in e.g. networking code
Observing changes to entities at whatever level of granularity you prefer (not as 'an observer callback' but as an "I did something" event to be observed)
Solving the 'new entity creation is deferred until the end of the frame' problem in a different way.
Solving the traditional (e.g. Unity) mess of 'how do I run something before Update? Before FixedUpdate? Is there a PreUpdate? PrePreUpdate? PrePreFixedUpdate?'
Running different things at different frequencies - if all things stem from events, one can easily have multiple different loops firing different 'tick' events running at different frequencies, e.g. physics at 240 Hz and rendering at 60 Hz, with some synchronization point.
Modularity in general, because it opens the opportunity for modules to communicate with each other directly without passing data through entities-and-components from one system to another.
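As one example from the list above, replaying events can be sketched in a few lines of Python. `RecordingBus`, `make_game`, and the event names are hypothetical and not part of Mach. The sketch assumes handlers are deterministic and depend only on the event data, which is the 'if you're careful' part.

```python
class RecordingBus:
    """Hypothetical bus that logs every event it delivers."""

    def __init__(self):
        self.handlers = {}  # event name -> list of handlers
        self.log = []       # (event name, data) in the order sent

    def on(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)

    def send(self, event, data=None):
        self.log.append((event, data))
        for handler in self.handlers.get(event, []):
            handler(data)


def make_game():
    """Build a fresh game whose state changes only in response to events."""
    state = {"x": 0, "score": 0}
    bus = RecordingBus()

    def on_move(dx):
        state["x"] += dx

    def on_score(points):
        state["score"] += points

    bus.on("move", on_move)
    bus.on("score", on_score)
    return bus, state


# Live session: events arrive from input, networking, an editor, etc.
live_bus, live_state = make_game()
live_bus.send("move", 3)
live_bus.send("score", 10)
live_bus.send("move", -1)

# Replay: feed the recorded log into a fresh game to reach the same state.
replay_bus, replay_state = make_game()
for event, data in live_bus.log:
    replay_bus.send(event, data)
```

The same log could equally be printed for debugging or sent from another process, since the game only ever sees named events and their data.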

slimsag added this to the Mach 0.4 milestone Mar 5, 2024

slimsag mentioned this issue Mar 10, 2024

module system revamp #1182

Merged

17 tasks

slimsag closed this as completed in #1182 Apr 6, 2024
