Archetype Invariants #1481

Open
alice-i-cecile opened this issue Feb 19, 2021 · 16 comments
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Needs-Design-Doc This issue or PR is particularly complex, and needs an approved design doc before it can be merged

Comments

@alice-i-cecile
Member

alice-i-cecile commented Feb 19, 2021

Introduced in #1312. The complete design will be found at RFC #5.

The Basics

An archetype is the set of components that are found together on a single entity. The set of archetypes present in our World is the union of the archetypes of every entity in the world.

Archetype invariants are rules that limit which components can coexist, and therefore which archetypes can occur. Because of the flexible power of commands.spawn and commands.insert, the set of components that an entity could have is unknowable at compile time, preventing us from .

We can use this information to allow for more granular component access in ways that would otherwise result in inconsistent runtime behavior, or verify that certain logical rules are followed during execution.

Due to the overhead of verifying these rules against the World's archetypes, this is likely best done only during debug mode, or perhaps by post-hoc inspection of logs.

Use Cases

  1. Reducing false positives in system scheduling ambiguity detection (Ambiguous system ordering #1312)
  2. Assumption checking for safety
  3. More granular overlapping queries (to expand on Less permissive component conflict allowance in system queries #1320)

API for specifying archetype invariants

Primitives:

  • forbidden(my_bundle): an entity's archetype cannot contain the specific combination of components listed in the bundle as a subset
  • A.always_with(B): entities that have component A always also have component B
  • A.only_with(my_bundle): component A never occurs with components other than those in the bundle

Derived:

  • inseparable(my_bundle): the components in the bundle never occur without each other, so an entity has either all of them or none. Equivalent to combining A.always_with(B) with the reverse for every pairwise combination
  • disjoint(my_bundle): entities can only have at most one component in the bundle. Equivalent to creating a forbidden archetype invariant for each pairwise combination
  • bundle versions of all of the primitives, where A and B can be replaced by tuples of components (bundles) instead
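A minimal sketch of how these primitives and derived forms could be modeled, assuming archetypes are plain sets of component names rather than real Bevy types. The `Invariant` enum, `holds`, and the `archetype` helper are hypothetical names used only for illustration; they are not part of any existing API.

```rust
use std::collections::BTreeSet;

/// An archetype modeled as a set of component names (a stand-in for real types).
type Archetype = BTreeSet<&'static str>;

/// Hypothetical structured form of the three primitive invariants.
enum Invariant {
    /// The archetype must not contain every component in the bundle.
    Forbidden(Archetype),
    /// If the archetype has `a`, it must also have `b`.
    AlwaysWith { a: &'static str, b: &'static str },
    /// If the archetype has `a`, every other component must come from `allowed`.
    OnlyWith { a: &'static str, allowed: Archetype },
}

impl Invariant {
    fn holds(&self, archetype: &Archetype) -> bool {
        match self {
            Invariant::Forbidden(bundle) => !bundle.is_subset(archetype),
            Invariant::AlwaysWith { a, b } => !archetype.contains(a) || archetype.contains(b),
            Invariant::OnlyWith { a, allowed } => {
                !archetype.contains(a)
                    || archetype.iter().all(|c| c == a || allowed.contains(c))
            }
        }
    }
}

/// Builds an archetype from a slice of component names.
fn archetype(components: &[&'static str]) -> Archetype {
    components.iter().copied().collect()
}

/// Derived: at most one component of the bundle, expanded pairwise into `Forbidden`.
fn disjoint(bundle: &[&'static str]) -> Vec<Invariant> {
    let mut out = Vec::new();
    for (i, x) in bundle.iter().enumerate() {
        for y in &bundle[i + 1..] {
            out.push(Invariant::Forbidden(archetype(&[*x, *y])));
        }
    }
    out
}

/// Derived: all or nothing, expanded pairwise into `AlwaysWith` in both directions.
fn inseparable(bundle: &[&'static str]) -> Vec<Invariant> {
    let mut out = Vec::new();
    for x in bundle {
        for y in bundle {
            if x != y {
                out.push(Invariant::AlwaysWith { a: *x, b: *y });
            }
        }
    }
    out
}
```

Under this model, `disjoint(&["Dead", "Alive"])` yields a single `Forbidden` rule that rejects any archetype containing both markers, while leaving archetypes with only one of them valid.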
@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible C-Usability A targeted quality-of-life change that makes Bevy easier to use A-ECS Entities, components, systems, and events labels Feb 19, 2021
@alice-i-cecile
Member Author

alice-i-cecile commented Feb 27, 2021

A new thing that I want: element_of(bundle), which allows for "any one of the underlying components" and can be dropped in anywhere you could put a component. This enables enum-like component groups in a really nice way.

For context, I want this for overriding behavior in a checked and elegant way.

@alice-i-cecile
Member Author

As an interesting twist: a user asked in #1635 to enforce that a resource must always exist. In theory, because #1525 stores resources as components on a singleton entity, we could provide a wrapper to ensure that, by tying it to a Resource marker component or the like.

While I think that archetype invariants are the right tool for ultimately solving the other requests in #1635, I'm much less convinced on this use case. It feels like it's exploiting a quirk of the implementation details, and that systems panicking when they request a resource that doesn't exist gets us the desired behavior already.

@alice-i-cecile
Member Author

From @BoxyUwU, archetype invariants should only be checked once all commands are flushed, in order to allow for comfortable gradual creation / modification of new entities.

@alice-i-cecile
Member Author

Once all archetype invariants are known, we need to check that no contradictory (impossible to satisfy) rules exist.
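One possible reading of "contradictory" is that some component can no longer appear in any valid archetype. A brute-force sketch of that check, assuming components are numbered and archetypes are bitmasks; `Rule`, `forbidden_pair`, `always_with` and `unusable_components` are hypothetical names, and the exponential search is only practical for a handful of components.

```rust
/// Archetypes are bitmasks: bit `i` set means component `i` is present.
type Mask = u32;

/// A rule is any predicate over an archetype mask (hypothetical representation).
type Rule = Box<dyn Fn(Mask) -> bool>;

/// Components `a` and `b` must not both be present.
fn forbidden_pair(a: u32, b: u32) -> Rule {
    let both: Mask = (1u32 << a) | (1u32 << b);
    Box::new(move |m: Mask| (m & both) != both)
}

/// If component `a` is present, component `b` must be present too.
fn always_with(a: u32, b: u32) -> Rule {
    Box::new(move |m: Mask| (m & (1u32 << a)) == 0 || (m & (1u32 << b)) != 0)
}

/// Returns the components that no valid archetype can contain.
/// Enumerates all 2^n archetypes, so this is only usable for small `n`.
fn unusable_components(n: u32, rules: &[Rule]) -> Vec<u32> {
    assert!(n < 32, "mask is only 32 bits wide");
    let valid: Vec<Mask> = (0..(1u32 << n))
        .filter(|&m| rules.iter().all(|rule| rule(m)))
        .collect();
    (0..n)
        .filter(|&c| !valid.iter().any(|&m| (m & (1u32 << c)) != 0))
        .collect()
}
```

For example, combining `always_with(0, 1)` with `forbidden_pair(0, 1)` leaves component 0 unusable: it requires component 1 but may never coexist with it.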

@BoxyUwU
Member

BoxyUwU commented Mar 25, 2021

archetype invariants should only be checked once all commands are flushed

we actually need to check invariants basically after any time we hand out an &mut World as that would be enough to violate invariants and cause unsoundness if we rely on invariants for query disjointedness.

so this basically means that

  • Exclusive systems need invariant checking afterwards
  • Applying a command takes &mut World which means commands cannot rely on invariants for soundness unless we check invariants after EVERY command is run (this is extremely undesirable as it would destroy the ability to build entities across multiple commands)
  • All APIs on world that require invariants to hold (such as running systems) must either check invariants themselves or have users check it (it would be really easy to forget to do this, so we should try to leverage the type system for this, i.e. a CheckedWorld struct which is just a wrapper around a borrow of World that exposes all the APIs that want invariants to hold)
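A rough sketch of the CheckedWorld idea, using a toy `World` rather than Bevy's. Everything here is an assumption for illustration: the `World` fields, the `checked` constructor, `run_system`, and the `dead_xor_alive` invariant are hypothetical. The point is only that a shared borrow obtained through a validating constructor cannot coexist with an `&mut World`.

```rust
use std::collections::BTreeSet;

type Archetype = BTreeSet<&'static str>;

/// Toy stand-in for Bevy's `World`: one archetype per entity, plus invariants.
#[derive(Default)]
struct World {
    entities: Vec<Archetype>,
    invariants: Vec<fn(&Archetype) -> bool>,
}

/// Proof that the invariants held when the borrow was created.
/// It holds a shared borrow, so no `&mut World` can exist while it is alive.
struct CheckedWorld<'w> {
    world: &'w World,
}

impl World {
    /// The only way to obtain a `CheckedWorld`: validate every entity first.
    /// On failure, returns the index of the first offending entity.
    fn checked(&self) -> Result<CheckedWorld<'_>, usize> {
        for (index, archetype) in self.entities.iter().enumerate() {
            if !self.invariants.iter().all(|holds| holds(archetype)) {
                return Err(index);
            }
        }
        Ok(CheckedWorld { world: self })
    }
}

impl CheckedWorld<'_> {
    /// APIs that rely on invariants (such as running systems) would live here.
    fn run_system<R>(&self, system: impl FnOnce(&World) -> R) -> R {
        system(self.world)
    }
}

/// Example invariant: `Dead` and `Alive` are mutually exclusive.
fn dead_xor_alive(archetype: &Archetype) -> bool {
    !(archetype.contains("Dead") && archetype.contains("Alive"))
}
```

After any exclusive access, the caller has to go through `checked()` again before running systems, which is the "hard to forget" property the type system is meant to provide.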

@concave-sphere

These "invariants" could be viewed as a bottom-up "proto-schema". It might be worth comparing to how traditional database schemas are put together:

  • The database is broken into tables
  • Tables are broken into columns
  • Some columns of a table are the "primary key"
  • Columns might have constraints on them (like "always present")
  • Various additional finer grained functionality

Two things that jump out at me here are (1) the entire schema is explicitly defined in one place, not scattered around the code base, and (2) the definition includes a basic hierarchy that can be used to organize thinking about the system.

Purely as a thought experiment, here is an attempt at describing this in Bevy ECS terms. Parts of this are probably contrary to Bevy's design goals, but maybe some ideas can be pulled from it.

  1. A "table" is a value that implements the trait Table.
  2. Every spawn command now takes a type defining the table to spawn inside: commands.spawn::<MyTable>().insert(...).
  3. An additional QueryFilter is added: Table<MyTable>. Query, the scheduler, the ambiguity checker, etc understand that different Tables are disjoint.
  4. A "column" is a component.
  5. The "primary key" is Entity.
  6. Additional constraints are created when the Table is defined.

A rough realization of this in Rust:

trait Table { ... }
trait Column<Table: Table> { ... }

/// See below.
trait SubTable {
  type Parent;
}

The Column trait opts in a column to being usable with a particular table. The type signatures of spawn and insert require the inserted component to be a column of the table. The Bundle trait also is parameterized by a table, and all fields of a derived Bundle must implement Column<Table>. There's no way to stop a type from implementing both Table and Column, but doing so is going to be extremely confusing at best and should be avoided.
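To make the opt-in concrete, here is a small standalone sketch of how a `Column<T>` bound on insert could reject components from the wrong table at compile time. `Spawned`, `spawn`, and the `Units`/`Scenery`/`Health`/`Foliage` types are made up for the example and do not correspond to Bevy APIs.

```rust
use std::marker::PhantomData;

/// Marker trait for table types (hypothetical, following the sketch above).
trait Table {}

/// Opts a component type in to being stored in table `T`.
trait Column<T: Table> {}

/// Stand-in for an entity builder that is tied to one table.
struct Spawned<T: Table> {
    columns: Vec<&'static str>,
    _table: PhantomData<T>,
}

fn spawn<T: Table>() -> Spawned<T> {
    Spawned { columns: Vec::new(), _table: PhantomData }
}

impl<T: Table> Spawned<T> {
    /// Only components that implement `Column<T>` are accepted.
    fn insert<C: Column<T>>(mut self, _component: C) -> Self {
        self.columns.push(std::any::type_name::<C>());
        self
    }
}

struct Units;
struct Scenery;
impl Table for Units {}
impl Table for Scenery {}

struct Health(u32);
struct Foliage;
impl Column<Units> for Health {}
impl Column<Scenery> for Foliage {}

fn spawn_unit() -> Spawned<Units> {
    // Adding `.insert(Foliage)` here would fail to compile,
    // because `Foliage: Column<Units>` is not implemented.
    spawn::<Units>().insert(Health(10))
}
```

Because the table is a type parameter of the builder, a scheduler could in principle treat `Spawned<Units>` and `Spawned<Scenery>` entities as disjoint without inspecting their components.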

To ensure a single definition of a Table, it is added from AppBuilder like this:

app.define_table::<MyTable>().unwrap()
   .add_column::<MyColumn>()
   // Sub tables allow modules to have private Column types
   // that are inserted into the schema in a well defined way and
   // don't collide with the Columns of the parent table.
   //
   // A sub table is a collection of columns and constraints that
   // are logically separate from the main table.  Specifically, the
   // sub table can only add constraints that involve its own
   // columns.  Sub tables can NOT have multiple "rows" per
   // Entity -- they just add "normal" columns to the Entity that
   // are kept separate from the parent schema.
   //
   // The sub table *as a whole* is either present or not-present
   // for a given Entity.  Invoking `insert::<MySubTable>` on an
   // `EntityCommands` causes the sub table to be inserted.
   // Must-exist constraints are triggered at this time.  If the
   // sub-table is required for a given `Entity`, the parent table
   // should assert a must-exist constraint with the sub-table's
   // type (if known).
   //
   // `add_sub_schema_bundle` on `TableBuilder<MyTable>` invokes
   // the callback with a `SchemaBuilder`.  Any tables defined by
   // the callback will become sub-tables of `MyTable`.
   .add_sub_schema_bundle(|builder| foreign_package::define_table(builder))
   .add_constraint(Constraint::always_present::<(MyColumn, MyOtherColumn)>())
   .insert_table();

trait SchemaBuilder {
  fn define_table<Table>(&mut self) -> Result<impl TableBuilder<Table>, anyhow::Error>;
  fn add_column<C: Column>(self) -> Self;
  fn add_sub_schema(self, cb: FnOnce<impl TableDefiner>) -> Self;
  fn add_constraint(self, c: Constraint<Table>) -> Self;
  fn insert_table(self) -> Result<(), anyhow::Error>;
}

Constraints are defined like this:

impl Constraint<Table> {
  // This type is always present for any Entity in the table.
  //
  // ColumnSet is a trait that's implemented for anything that is
  // Column or a tuple of ColumnSets.
  fn always_present<CS: ColumnSet<Table>>();

  // More constraints can go here.  Constraints always involve columns of `Table`.
  // Cross `Table` constraints are not supported.
}

A table can only be defined once: invoking define_table twice for the same type will return an Err. The table can't be used until it is inserted (e.g, spawn will fail with an error that the table isn't defined).
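The define-once and insert-before-use rules could be tracked with a registry keyed by `TypeId`. This is only a sketch under simplifying assumptions: `Schema`, `SchemaError`, and the method shapes are hypothetical, and the builder chain from the example above is collapsed into direct calls.

```rust
use std::any::{type_name, TypeId};
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum SchemaError {
    AlreadyDefined(&'static str),
    NotInserted(&'static str),
}

/// Tracks, per table type, whether it has been inserted yet.
#[derive(Default)]
struct Schema {
    tables: HashMap<TypeId, bool>,
}

impl Schema {
    /// Fails if the table type was defined before.
    fn define_table<T: 'static>(&mut self) -> Result<(), SchemaError> {
        if self.tables.contains_key(&TypeId::of::<T>()) {
            return Err(SchemaError::AlreadyDefined(type_name::<T>()));
        }
        self.tables.insert(TypeId::of::<T>(), false);
        Ok(())
    }

    /// Marks a previously defined table as usable.
    fn insert_table<T: 'static>(&mut self) {
        if let Some(inserted) = self.tables.get_mut(&TypeId::of::<T>()) {
            *inserted = true;
        }
    }

    /// Spawning is only allowed into tables that have been inserted.
    fn spawn<T: 'static>(&self) -> Result<(), SchemaError> {
        match self.tables.get(&TypeId::of::<T>()).copied() {
            Some(true) => Ok(()),
            _ => Err(SchemaError::NotInserted(type_name::<T>())),
        }
    }
}

struct MyTable;
```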

I'm not sure that's implementable, but I've spent too much time on it already.

@alice-i-cecile
Member Author

@concave-sphere this is fascinating. I agree with your intuition that they feel like bottom-up schema. I'll come back to this in depth as I work on an RFC for these.

@alice-i-cecile alice-i-cecile added the S-Needs-Design-Doc This issue or PR is particularly complex, and needs an approved design doc before it can be merged label Apr 23, 2021
@alice-i-cecile
Member Author

alice-i-cecile commented May 21, 2021

What if you would store invariant components in separate tables. That way you'd never have to table-move them and you vastly reduce the amount of components to move on add/remove ops. I'd expect many of the components (pos,rot,transform,mesh,material,...) to be static anyway
It'd be a pretty easy thing to add into bevy's current design: in addition to the current table ptr & table index, just add an invariant_table_ptr and an invariant_index

From @SanderMertens on Discord.

@alice-i-cecile
Member Author

When combined with command error handling, we could use this to reliably swap out incompatible sets of components. Instead of panicking, we could handle the result of an insertion error by removing the offending components.

What I really want is negative or exclusive members in bundles, to make it easier when changing the state of an entity, like moving it to a different area that switches cameras and disables physics.
So some markers are added, others are removed in one direction, and the opposite in the other.

@arialpew

arialpew commented May 12, 2022

When you start to model something with marker components, you inevitably fall into this issue.

Marker components like Dead and Alive can be represented as an enum, and that's often the first intuitive way of doing it.

Then you will probably move to marker components, which flattens the game state.

This makes it easy for Bevy to filter components at runtime and parallelize systems with fine-grained queries. But there's a cost: by using marker components instead of an enum, you make illegal states representable in your own logic.

Not a trivial issue to solve. You really need these marker components because they are a major point of an ECS and the concept is easy to understand, but in the end you need to be careful about what you do with your game state and your code.

You can introduce illegal state like a Player that is Swimming and Dead when you expected a Player that is Idle and Dead.

@sixfold-origami
Contributor

sixfold-origami commented Jun 22, 2022

I started working on a basic prototype for this (with help from @alice-i-cecile), but came to an impasse trying to initialize components by TypeId. I'll probably take another crack at it soon with a different approach, but if I don't, here's what I have:
https://github.com/plof27/bevy/tree/basic-archetype-invariants

Some things we worked out:

  1. Archetype invariants should live on the world: Different worlds may have radically different archetypes, and app-less worlds should also be supported.
  2. I think that including helper methods on ArchetypeInvariant to construct common invariants is a nice design. These would then be accompanied by add_archetype_invariant methods on both World and App. The app method would add the invariant to all worlds by default.
  3. The archetype invariants stored on the world should be append-only, and should not be modified at runtime. Since we do not store which archetypes currently have entities, it is impossible to know which archetypes would need to be re-checked if/when a new invariant is added. Furthermore, it is also impossible to know which old, invalid archetypes are no longer a concern. (The prototype handles this by rechecking all archetypes whenever a new invariant is added, and discourages users from adding new invariants at runtime.)
  4. Storing invariants purely at the type level causes unacceptable amounts of API mess, since the types need to be generic over both single components and bundles of components. It also causes problems when storing a list of invariants, as we must resort to dynamic dispatch.
  5. Storing invariants as just ComponentIds means that the user needs to pass &mut World whenever they initialize an archetype invariant, since we cannot ensure that the invariant's components have been initialized in the world before the invariant is added. This causes problems with the app builder API, in addition to being messy.
  6. Storing TypeIds causes problems because we cannot initialize components from TypeIds, and we can't guarantee that all components will be initialized before the invariants are first checked.
  7. Since Archetypes store their components as a list (not really but it's basically a list) of ComponentIds, the archetype invariant system needs to use ComponentIds at some level. This means that dynamic component support will likely come for free if the architecture is designed with it in mind. (This is why the prototype exposes the UntypedArchetypeInvariant struct.)
  8. If at all possible, archetype invariants should be stored in a structured form. This way, they can be used as constraints when optimizing other parts of the engine. In particular, these are appealing as a method for ensuring safe parallelization of system parameters that otherwise appear to conflict.
  9. The structure shown in my branch does not cover all possible logical statements that can be made about archetypes, but it does cover the vast majority of common cases. In the future, I would add a Custom variant to ArchetypeStatement that allows the user to provide a custom function, as a safety valve.
  10. The most natural way to validate invariants is to simply check all the new archetypes whenever the world's archetypes are modified. The best way to cache which archetypes have already been checked is to store a cursor pointing to the most recently checked archetype in the archetype list. This relies on the fact that the archetype list is append-only.
  11. The efficiency of checking multiple invariants can likely be improved non-trivially using the following method for each archetype to check:
    1. For each unique ComponentId in the invariant list, check if the archetype has this component. Store the results in a boolean array or as bits in an unsigned integer
    2. For each invariant, use an AND mask to select the bits (components) that are present in the predicate and consequence (separately)
    3. Do the appropriate operation to determine the value of the predicate and of the consequence
    4. Compute (!predicate || consequence)
  12. Pruning redundant or tautological invariants is likely not worth the effort, since removing them is very unlikely to reduce the number of set containment checks we will need to do.
  13. Archetype invariants should simply panic if they fail at runtime, because they will likely be used for soundness. Before panicking, however, the system should output a list of all the invariants that the archetype violated (for debugging). These can just be checked sequentially and naively: performance does not matter if you are about to panic anyway.
  14. Certain reasonable archetype invariants cause weird problems with transitivity when chaining entity commands. For example, if {A} and {A, B, C} are valid, but {A, B} is not, adding B followed by C will cause the invariants to panic, but adding {B, C} simultaneously will not. We came to the conclusion that handling this case would be quite difficult without knowing which archetypes are currently in use. Furthermore, disallowing transitive invalidation improves safety. So, instead of handling this in the archetype invariant system, we propose two complementary strategies to mitigate this:
    1. Automatically merge chained entity commands. See Chained EntityCommands create useless temporary archetypes #5074
    2. Warn users of this danger, and instruct them to perform their operations in an order that will not break their invariants. It is relatively straightforward to prove that this is always (ish) possible, although it may be slower in some cases.
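The bitmask method from item 11 and the cursor from item 10 could be sketched roughly as follows. This is a standalone toy, not the prototype's code: `MaskedInvariant`, `presence_bits` and `Checker` are hypothetical names, components are plain strings, each side of an invariant is assumed to mean "contains all of these components", and a violation is returned as an `Err` instead of a panic so that it is easy to test.

```rust
/// One invariant in implication form: if the predicate holds, the consequence must hold.
/// Both sides are bitmasks over the list of unique components mentioned by any
/// invariant (bit `i` corresponds to tracked component `i`).
struct MaskedInvariant {
    predicate: u64,
    consequence: u64,
}

impl MaskedInvariant {
    /// Steps 2 to 4: mask, evaluate both sides, then compute `!predicate || consequence`.
    fn holds(&self, present: u64) -> bool {
        let predicate = (present & self.predicate) == self.predicate;
        let consequence = (present & self.consequence) == self.consequence;
        !predicate || consequence
    }
}

/// Step 1: one bit per tracked component, set if the archetype contains it.
fn presence_bits(tracked: &[&str], archetype: &[&str]) -> u64 {
    assert!(tracked.len() <= 64, "only 64 tracked components fit in a u64");
    tracked
        .iter()
        .enumerate()
        .filter(|(_, c)| archetype.contains(*c))
        .fold(0, |bits, (i, _)| bits | (1u64 << i))
}

/// Checks only archetypes appended since the last call, relying on the
/// archetype list being append-only.
struct Checker {
    tracked: Vec<&'static str>,
    invariants: Vec<MaskedInvariant>,
    cursor: usize,
}

impl Checker {
    /// Returns the index of the first new archetype that violates an invariant.
    fn check_new(&mut self, archetypes: &[Vec<&'static str>]) -> Result<(), usize> {
        for (index, archetype) in archetypes.iter().enumerate().skip(self.cursor) {
            let present = presence_bits(&self.tracked, archetype);
            if !self.invariants.iter().all(|inv| inv.holds(present)) {
                return Err(index);
            }
            self.cursor = index + 1;
        }
        Ok(())
    }
}
```

With tracked components `["A", "B", "C"]`, the rule "A and B together require C" is `MaskedInvariant { predicate: 0b011, consequence: 0b100 }`. It accepts `{A}` and `{A, B, C}` but rejects the temporary `{A, B}` archetype from item 14.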

Open questions

  1. We're still unsure of how best to store the invariant information. As described above, all the approaches considered thus far (raw types, TypeIds, ComponentIds) have non-trivial problems or limitations.
  2. The exact list of invariant construction helper methods still needs to be worked out.
  3. Some of the helper methods will likely be bijective (e.g. A <=> B), but the current API design doesn't have a nice way to generate and add multiple invariants at once.
  4. How can we ensure that the invariants will be checked every time the archetypes are modified? As far as I know, there isn't a universal modification hook for archetypes. We could manually add the check wherever &mut Archetypes appears, but this will likely cause bugs in the future. (The likely solution here is to add a Mut-style smart pointer.)
  5. Is there a stronger way to ensure archetype invariants are not added at runtime? Reverting back to AppBuilder would solve this 🙃

@Zeenobit
Contributor

Zeenobit commented Feb 2, 2023

I have been toying with a very simple solution to this issue. It's not perfect, mainly because I tried to implement it without touching Bevy internals.

In summary, my idea involves a bundle called Require<T: Bundle>:

#[derive(Bundle)]
struct Require<T: Bundle> {
    bundle: RequiredBundle,
    #[bundle(ignore)]
    marker: PhantomData<T>,
}

Usage:

#[derive(Bundle)]
struct B {
    require: Require<(X, Y)>,
    /* ... */
}

This bundle stores its given type. A system then checks all instances of this bundle at the end of every app cycle, and reacts as needed. In my implementation, I'm just calling panic!(), but it would be easy to change/configure this behavior.

This implementation could also be extended to support different kinds of requirements, such as RequireNone (require none of components in given bundle) or RequireAny (require any of the components in given bundle).

Full gist along with a test case is included here:
https://gist.github.com/Zeenobit/bbceaa4f54b952eb7d7cf47492ad71bf

The benefit of this approach is mainly its simplicity. There is no need to register anything with the app, and the bundle requirements are very explicitly stated on the bundle itself, which makes it very readable to me.

The biggest issue with this implementation is that the inner bundle must be spawned at least once so that it's registered with the app. I don't think I can solve this issue without touching Bevy internals, because init_bundle() isn't public. And even if it was, I'd have to force the user to register their requirement bundle with the app (so that the system can register it using init_bundle). I'm trying to avoid registering anything with the app, because I think the requirements of a bundle should reside with the bundle declaration for best readability.

The other issue is that bundles of bundles with the same requirements would cause a duplicate bundle error. This could be easily solved by ignoring duplicated Require<T> bundles.
Edit: As I'm typing this, I realize that all requirements would cause a duplication error, because the inner RequiredBundle has no type discriminator. I'd need to find a way to merge RequiredBundle components during insertion.

I'm curious what the Bevy community thinks of this approach, and if anyone has any ideas for the first issue I mentioned above.

@tigregalis
Contributor

As an alternative, this is a pattern I've used for "exclusive marker types":

mod cam_marker {
    #![allow(unused)]
    use bevy::prelude::{Component, With, Without};

    #[derive(Component)]
    pub struct Main;

    #[derive(Component)]
    pub struct LeftEye;

    #[derive(Component)]
    pub struct RightEye;

    pub type OnlyMain = (With<Main>, Without<LeftEye>, Without<RightEye>);
    pub type OnlyLeftEye = (Without<Main>, With<LeftEye>, Without<RightEye>);
    pub type OnlyRightEye = (Without<Main>, Without<LeftEye>, With<RightEye>);
}
use cam_marker::*;

The above would probably easily generalise to a macro.

Add the main structs as components to entities:

    // spawn the left camera
    commands
        .spawn(Camera3dBundle {
            transform: Transform::from_xyz(-0.1, 0.0, 8.0).looking_at(Vec3::ZERO, Vec3::Y),
            camera: Camera {
                priority: -1,
                ..default()
            },
            camera_3d: Camera3d {
                clear_color: ClearColorConfig::Default,
                ..default()
            },
            ..default()
        })
        .insert(LeftEye);
    // spawn the right camera
    commands
        .spawn(Camera3dBundle {
            transform: Transform::from_xyz(0.1, 0.0, 8.0).looking_at(Vec3::ZERO, Vec3::Y),
            camera: Camera {
                priority: 1,
                ..default()
            },
            camera_3d: Camera3d {
                clear_color: ClearColorConfig::None,
                ..default()
            },
            ..default()
        })
        .insert(RightEye);

The OnlyX type aliases are used as query filters:

// these systems can run in parallel

fn spin_left_camera(
    mut query: Query<&mut Transform, OnlyLeftEye>,
    // ...
) {
    // ...
}

fn spin_right_camera(
    mut query: Query<&mut Transform, OnlyRightEye>,
    // ...
) {
    // ...
}

This doesn't enforce anything as such at runtime, but it removes the ambiguities. If any entity has more than one member of that exclusive set, the queries simply won't match that entity.

It might be nicer with rust-lang/rfcs#2593

As I write this comment, I discover:
#6556
https://github.com/ChoppedStudio/bevy_ecs_markers

@Zeenobit
Contributor

Zeenobit commented May 6, 2023

Another attempt at this:
#8557

@robojeb
Contributor

robojeb commented Jun 26, 2023

As an extension of disjoint components it would be nice if there were a way to define a "disjoint wrapper" or "state" type.
A type of the form:

#[derive(Component)]
struct AIState<T>(PhantomData<T>);

Such that you can define that any AIState<T> is disjoint with any AIState<U> where T != U.
Ideally this would come with some ability to do something like entity.remove_disjoint_id(AIState::<()>::disjoint_id()), which would remove any component that is of type AIState<T>.
Even better would be if the act of entity.insert(AIState<T>) would automatically remove any other AIState<U> component.

Such a type would behave as one component type for the purposes of insertion, but act as distinct types for queries.

I know that something similar can be achieved with an enum, but I have two issues with using enums:

  1. Extending the set of states requires modifying the Enum definition, so I can't really abstract the logic out to a crate that could be easily shared.

  2. I have to process all the states in the same system which reduces potential parallelism by requiring me to bring all the components I need into that one system (or using an exclusive system).
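As a thought experiment, the "one slot for insertion, distinct types for queries" behavior could look like the toy below. It does not use Bevy at all: `Entity`, `insert_state`, `has_state` and `remove_state` are hypothetical, and the single `Option` slot is an assumed storage strategy chosen only to show the replace-on-insert semantics.

```rust
use std::any::{Any, TypeId};
use std::marker::PhantomData;

/// Wrapper whose instantiations are distinct types for lookup purposes.
struct AIState<T>(PhantomData<T>);

/// Toy entity holding at most one component from the "AI state" group.
/// The slot remembers the concrete type, so lookups can still tell states apart.
#[derive(Default)]
struct Entity {
    ai_state: Option<(TypeId, Box<dyn Any>)>,
}

impl Entity {
    /// Inserting any `AIState<T>` replaces whichever `AIState<U>` was there before.
    fn insert_state<T: 'static>(&mut self) {
        let state = AIState::<T>(PhantomData);
        self.ai_state = Some((TypeId::of::<AIState<T>>(), Box::new(state)));
    }

    /// Behaves like a `With<AIState<T>>` filter applied to this one entity.
    fn has_state<T: 'static>(&self) -> bool {
        matches!(&self.ai_state, Some((id, _)) if *id == TypeId::of::<AIState<T>>())
    }

    /// Removes whatever state is present, without naming its type.
    fn remove_state(&mut self) {
        self.ai_state = None;
    }
}

/// Example states; a downstream crate could add more without touching an enum.
struct Patrolling;
struct Fleeing;
```

Since new state types are just new `T`s, this keeps the extensibility that the enum approach lacks, at the cost of needing engine support to make the slot visible to the scheduler.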

@alice-i-cecile
Member Author

Required components (#7272) are on the path to being added to the engine, and will fill some of the niche of these. They don't account for mutually exclusive components, aren't enforced at runtime and can't be used for borrow checker purposes though.


8 participants