Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC Discussion] Legion ECS Evolution #22

Open
jaynus opened this issue Oct 24, 2019 · 47 comments
Open

[RFC Discussion] Legion ECS Evolution #22

jaynus opened this issue Oct 24, 2019 · 47 comments

Comments

@jaynus
Copy link

@jaynus jaynus commented Oct 24, 2019

Legion ECS Evolution

Following lengthy discussion on both Discord and the Amethyst Forum (most of which, including chat logs, can be found here), we propose with this RFC to move Amethyst from SPECS to Legion, an ECS framework building on concepts in SPECS Parallel ECS, as well as lessons learned since. This proposal stems from an improved foundational flexibility in the approach of Legion which would be untenable to affect on the current SPECS crate without forcing all users of SPECS to essentially adapt to a rewrite centered on the needs of Amethyst. The flexibility in Legion is filled with tradeoffs, generally showing benefits in performance and runtime flexibility, while generally trading off some of the ergonomics of the SPECS interface. While the benefits and the impetus for seeking them is described in the "Motivations" section, the implictions of tradeoffs following those benefits will be outlined in greater detail within the "Tradeoffs" section.

There are some core parts of Amethyst which may either need to considerably change when moving to Legion, or would otherwise just benefit from substantial changes to embrace the flexibility of Legion. Notably, systems in Legion are FnMut closures, and all systems require usage of SystemDesc to construct the closure and its associated Query structure. The dispatch technique in Legion is necessarily very different from SPECS, and the parts of the engine dealing with dispatch may also be modified in terms of Legion's dispatcher. Furthermore, the platform of Legion provides ample opportunity to improve our Transform system, with improved change detection tools at our disposal. These changes as we understand them are described below in the "Refactoring" section.

The evaluation of this large transition requires undertaking a progressive port of Amethyst to Legion with a temporary synchronization shim between SPECS and Legion. This effort exists here, utilizing the Legion fork here. Currently, this progressive fork has fully transitioned the Amethyst Renderer, one of the largest and most involved parts of the engine ECS-wise, and is capable of running that demo we're all familiar with:

image alt text

Not only can you take a peek at what actual code transitioned directly to Legion looks like in this fork, but the refactoring work in that fork can be utilized given this RFC is accepted while actively helping to better inform where there may be shortcomings or surprises in the present.

Motivations

The forum thread outlines the deficiencies we are facing with specs in detail. This table below is a high level summary of the problems we are having with specs, and how legion solves each one.

Specs Legion
Typed storage nature prevents proper FFI All underlying legion storage is based on TypeId lookups for resources and components
hibitsetcrate has allocation/reallocation overhead, branch misses Archetypes eliminate the need of entity ID collections being used for iteration
Sparse storages causes cache incoherence Legion guarantees allocation of simliar entities into contigious, aligned chunks with all their components in linear memory
Storage fetching inherently causes many branch mispredictions See previous
Storage methodology inherently makes FlaggedStorage not thread safe. Queries in legion store filter and change state, allowing for extremely granular change detection on a Archetype, Chunk and Entity level.
Component mutation flagging limited to any mutable access Legion dispatches on a Archetype basis instead of component, allowing to parallel execute across the same component data, but just different entities. *A special case exists for sparse read/write of components where this isnt the case
Parallelization limited to component-level, no granular accesses See previous information about Archetypes
Many elided and explicit lifetimes throughout make code less ergonomic System API designed to hide the majority of these lifetimes in safety and ergonomic wrappers
ParJoin has mutation limitations See previous statements about system dispatcher and Archetypes

Immediate Benefits in a Nutshell

  • Significant performance gains

  • Scripting RFC can move forward

  • Queries open up many new optimizations for change detection such as culling, the transform system, etc.

  • More granular parallelization than we already have achieved

  • Resolves the dispatcher Order of Insertion design flaws

  • ???

Tradeoffs

These are some things I have ran into that were cumbersome changes or thoughts while porting. This is by no means comprehensive. Some of these items may not make sense until you understand legion and/or read the rest of this RFC.

  • Systems are moved to a closure, but ergonomics are given for still maintaining state, mainly in the use of FnMut[ref] for the closure, and an alternative build_disposable [ref].

  • All systems are built with closures, causing some initialization design changes in regards to reference borrowing

  • The SystemDesc/System types have been removed.

  • a Trait type cannot be used for System declaration, due to the typed nature of Queries in legion. It is far more feasible and ergonomic to use a closures for type deduction. The except to this case is thread-local execution, which can still be typed for ease of use.

Refactoring

This port of amethyst from legion -> specs has aimed to keep to some of the consistencies of specs and what Amethyst users would already be familiar with. Much of the implementation of Legion and the amethyst-specific components was heavily inspired/copied from the current standing implementations.

SystemBundle, System and Dispatcher refactor

This portion of the port will have the most significant impact on users, as this is where their day-to-day coding exists. The following is an example of the same system, in both specs and legion.

High Level Changes

  • Systems are all now FnMut closures. This allows for easier declaration and type deduction. They can capture variables from their builder for state. Additional ‘disposable’ build types are available for more complex stateful modes.

  • System data declarations are now all within a builder, and not on a trait.

  • Component data is now accessed via "queries" instead of “component storages”

  • Component addition/removal is now deferred, in line with entity creation/removal

  • Default resource allocation is removed, all world resource access now return Option<Ref>

  • System registration explicit dependecies are removed, now execution is ordered based on "Stages", which can be explicit priorities, but all system execution is flattened into a single data-dependent execution.

Following is an example of a basic system in both specs and legion

Specs
impl<'a> System<'a> for OrbitSystem {
    type SystemData = (
        Read<'a, Time>,
        ReadStorage<'a, Orbit>,
        WriteStorage<'a, Transform>,
        Write<'a, DebugLines>,
    );

    fn run(&mut self, (time, orbits, mut transforms, mut debug): Self::SystemData) {
        for (orbit, transform) in (&orbits, &mut transforms).join() {
            let angle = time.absolute_time_seconds() as f32 * orbit.time_scale;
            let cross = orbit.axis.cross(&Vector3::z()).normalize() * orbit.radius;
            let rot = UnitQuaternion::from_axis_angle(&orbit.axis, angle);
            let final_pos = (rot * cross) + orbit.center;

            debug.draw_line(
                orbit.center.into(),
                final_pos.into(),
                Srgba::new(0.0, 0.5, 1.0, 1.0),

            );
            transform.set_translation(final_pos);
        }
    }
}
Legion
fn build_orbit_system(
    world: &mut amethyst::core::legion::world::World,
) -> Box<dyn amethyst::core::legion::schedule::Schedulable> {
    SystemBuilder::<()>::new("OrbitSystem")
        .with_query(<(Write<Transform>, Read<Orbit>)>::query())
        .read_resource::<Time>()
        .write_resource::<DebugLines>()
        .build(move |commands, world, (time, debug), query| {
            query
                .iter_entities()
                .for_each(|(entity, (mut transform, orbit))| {
                    let angle = time.absolute_time_seconds() as f32 * orbit.time_scale;
                    let cross = orbit.axis.cross(&Vector3::z()).normalize() * orbit.radius;
                    let rot = UnitQuaternion::from_axis_angle(&orbit.axis, angle);

                    let final_pos = (rot * cross) + orbit.center;

                    debug.draw_line(
                        orbit.center.into(),
                        final_pos.into(),
                        Srgba::new(0.0, 0.5, 1.0, 1.0),
                    );
                    transform.set_translation(final_pos);
                });
        })
}

Example bundle Changes

RenderBundle - Specs

impl<'a, 'b, B: Backend> SystemBundle<'a, 'b> for RenderingBundle<B> {
    fn build(
        mut self,
        world: &mut World,
        builder: &mut DispatcherBuilder<'a, 'b>,
    ) -> Result<(), Error> {
        builder.add(MeshProcessorSystem::<B>::default(), "mesh_processor", &[]);
        builder.add(
            TextureProcessorSystem::<B>::default(),
            "texture_processor",
            &[],
        );

        builder.add(Processor::<Material>::new(), "material_processor", &[]);
        builder.add(
            Processor::<SpriteSheet>::new(),
            "sprite_sheet_processor",
            &[],
        );

        // make sure that all renderer-specific systems run after game code
        builder.add_barrier();
        for plugin in &mut self.plugins {
            plugin.on_build(world, builder)?;
        }
        builder.add_thread_local(RenderingSystem::<B, _>::new(self.into_graph_creator()));
        Ok(())
    }
}

RenderBundle - Legion

impl<'a, 'b, B: Backend> SystemBundle for RenderingBundle<B> {
    fn build(mut self, world: &mut World, builder: &mut DispatcherBuilder) -> Result<(), Error> {
        builder.add_system(Stage::Begin, build_mesh_processor::<B>);
        builder.add_system(Stage::Begin, build_texture_processor::<B>);
        builder.add_system(Stage::Begin, build_asset_processor::<Material>);
        builder.add_system(Stage::Begin, build_asset_processor::<SpriteSheet>);

        for mut plugin in &mut self.plugins {
            plugin.on_build(world, builder)?;

        }

        let config: rendy::factory::Config = Default::default();
        let (factory, families): (Factory<B>, _) = rendy::factory::init(config).unwrap();
        let queue_id = QueueId {
            family: families.family_by_index(0).id(),
            index: 0,
        };

        world.resources.insert(factory);
        world.resources.insert(queue_id);

        let mat = crate::legion::system::create_default_mat::<B>(&world.resources);
        world.resources.insert(crate::mtl::MaterialDefaults(mat));

        builder.add_thread_local(move |world| {
            build_rendering_system(world, self.into_graph_creator(), families)
        });

        Ok(())
    }
}

Parallelization of mutable queries

One of the major benefits of legion is its granularity with queries. Specs is not capable of performing a parralel join of Transform currently, because FlaggedStorage is not thread safe. Additionally, a mutable join such as above automatically flags all Transform components as mutated, meaning any readers will get N(entities) events.

In legion, however, we get this short syntax: query.par_for_each(|(entity, (mut transform, orbit))| { Under the hood, this code actually accomplishes more than what ParJoin may in specs. This method threads on a per-chunk basis on legion, meaning similiar data is being linearly iterated, and all components of those entities are in cache.

Transform Refactor (legion_transform)

Legion transform implementation

@AThilenius has taken on the task of refactoring the core Transform system. This system had some faults of its own, which were also exacerbated by specs. The system itself is heavily tied in with how specs operates, so a rewrite of the transform system was already in the cards for this migration.

Hierarchy

This refactor is aimed towards following the Unity design, where the source-of-truth for the hierarchy (hot data) is stored in Parent components (ie. a child has a parent). This has the added benefit of ensuring only tree structures can be formed at the API level. Along with the Parent component, the transform system will create/update a Children component on each parent entity. This is necessary for efficient root->leaf iteration of trees, which is a needed operation for many systems but it should be noted that the Children component is only guaranteed valid after the transform systems have run and before any hierarchy edits have been made. Several other methods of storing the hierarchy were considered and prototyped, including an implicit linked-list, detailed here. Given all the tradeoffs and technical complexity of various methods (and because a very large game engine company has come to the same conclusion) the current method was chosen. More info can be found in the readme of legion_transform.

Transform

The original Amethyst transform was problematic for several reasons, largely because it was organically grown:

  • The local_to_world matrix was stored in the same component as the affine-transform values.

    • This also implies that use of the original transform component was tightly coupled with the hierarchy it belongs to (namely it’s parent chain).
  • The component was a full Affine transform (for some reason split between an Isometry and a non-uniform scale stored as a Vector3).

  • Much of the nalgebra API for 3D space transform creation/manipulation was replicated with little benefit.

Given the drawbacks of the original transform, it was decided to start from a clean slate, again taking inspiration from the new Unity ECS. User defined space transforms come in the form of the following components:

  • Translation (Vector3 XYZ translation)

  • Rotation (UnityQuaternion rotation)

  • Scale (single f32, used for uniform scaling, ie. where scale x == y == z)

  • NonUniformScale (a Vector3 for non-uniform scaling, which should be avoided when possible)

Any valid combinatoric of these components can also be added (although Scale and NonUniformScale are mutually exclusive). For example, if your entity only needs to translate, you need only pay the cost of storing and computing the translation and can skip the cost of storing and computing (into the final homogenius matrix4x4) the Rotation and Scale.

The final homogeneous matrix is stored in the LocalToWorld component, which described the space transform from entity->world space regardless of hierarchy membership. In the event that an entity is a member of a Hierarchy, an additional LocalToParent components (also a homogeneous matrix4x4) will be computed first and used for the final LocalToWorld update. This has the benefits of:

  • The LocalToWorld matrix will always exist and be updated for any entity with a space transform (ex any entity that should be rendered) regardless of hierarchy membership.

  • Any entity that is static (or is part of a static hierarchy) can have it’s LocalToWorld matrix pre-baked and the other transform components need not be stored.

  • No other system that doesn’t explicitly care about the hierarchy needs to know anything about it (ex rendering needs only the LocalToWorld) component.

Dispatcher refactor

The Dispatcher has been rewritten to utilize the built-in legion StageExecutor, while layering amethyst needs on top of it.

The builder and registration process still looks fairly similar; the main difference being naming is now debug only, and explicit dependencies have been removed in favor of inferred insertion order via Stages, and then full parallel execution. ThreadLocal execution still also exists, to execute the end of any given frame in the local game thread. This means that a Stage or RelativeStage can be used to infer the insertion order of the system, but it will still execute based on its data dependencies, and not strictly its place "in line".

Migration Story

World Synchronization Middleware

Because of the fundemental changes inherent in this migration, significant effort has gone into making transitioning, and using both "old" and “new” systems as seamless as possible. This does come with significant performance cost, but should allow people to utilize the a mix of specs and legion while testing migration.

  1. The engine provides a "LegionSyncer" trait; this is dispatched to configure and handle syncing of resources and components between specs and legion

  2. Underneath these LegionSyncer traits, lies a set of automatic syncer implementations for the common use cases. This includes resource and component synchronization between the two worlds

  3. Dispatching already exists for both worlds; dispatch occurs as:

    1. Run specs

    2. Sync world Specs -> Legion

    3. Run Legion

    4. Sync world Legion -> Specs

  4. Helper functions have been added to the GameDataBuilder and DispatcherBuilder to streamline this process.

In the current design, syncers are not enabled by default and must be explicitly selected by the user via the game data builder. For example:

.migration_resource_sync::<Scene>()
.migration_resource_sync::<RenderMode>()
.migration_component_sync::<Orbit>()
.migration_sync_bundle(amethyst::core::legion::Syncer::default())
.migration_sync_bundle(amethyst::renderer::legion::Syncer::<DefaultBackend>::default())

The above will explicitly synchronize the specifed resources and components. Additionally, the "Sync bundles" are provided for synchronizing the features out of any given crate.

This synchronization use can be seen in the current examples, as they still utilize a large amount of unported specs systems.

Proposed Timeline

With the synchronization middleware available, it gives users the ability to slowly transition to the new systems while actively testing their project. I propose the following release timeline for this, allowing users to skip versions as we go and work between:

  1. The current implementation is feature gated behind "legion-ecs" feature. This can be released as a new version of Amethyst to begin migration

  2. The next release performs a "hard swap", from “specs default” to “Legion default”. This would rename all the migration_with_XXX functions to specs, and make legion default. This would also include ironing out the legion world defaulting in the dispatchers and builders.

  3. The next release removes specs entirely, leaving legion in its place.

Brain-fart of changes needed by users

  • Render Plugins/Passes

    • Change renderer::bundle::* imports to renderer::legion::bundle

    • Change renderer::submodules::* imports to renderer::legion::submodules

    • All resource access changes from world to world.resources

      • fetch/fetch_mut change to get/get_mut

        • get/get_mut return "Option", not default
      • Read and Write remove lifetime

      • ReadExpect/WriteExpect change to Read/Write

      • You can still use the same <(Data)>::fetch(&world.resources) syntax

    • Resources need to cache their own queries

      • THIS IS RECOMMENDED DONE IN A CLOSURE, TO PREVENT TYPING NIGHTMARE
    • amethyst/amethyst-imgui@06c1a58 a commit showing a port

@jaynus jaynus changed the title [RFC Discussion] Legion Migration [RFC Discussion] Legion ECS Evolution Oct 24, 2019
@0x6273

This comment has been minimized.

Copy link

@0x6273 0x6273 commented Oct 24, 2019

Looks good! Maybe you could add some benchmarks to back up the "Significant performance gains" point?

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Oct 24, 2019

Benchmark visualization is here: https://github.com/jaynus/legion/raw/master/bench.png

To discuss more candid, discussion-style explanation of why:

Specs specifically utilizes sparse component storage. That means whenever you are joining 2 or more components, you are jumping between 2 arbitrary locations in memory repeatedly, for every single entity. This is very hard on CPU caching, as you are repeatedly loading and processing N completely separate regions of memory.

Legion allocations like-entities in contiguous memory, and iterates based on allocated "chunks" for iteration. This means that whenever performing a query against N-components, the memory is almost always guaranteed to be local and adjacent because its archetype is the same for that query.

This is a high level explanation, and leaving out some edge cases, but that is the gist of why we see significant performance benefits with legion. The more complex of these benefits can't really be realized in any current use of amethyst, so those are basically theoretical.

@OvermindDL1

This comment has been minimized.

Copy link

@OvermindDL1 OvermindDL1 commented Oct 24, 2019

For full disclosure, that image only shows creation and iteration and so forth, it does not demonstrate the timings of when components are added and/or removed as is common in many ECS engine styles. The adding/removing pattern is very slow compared to specs because it forced a copy of every component an entity has into a new 'chunk'. A combined pattern where components can be marked as what 'chunks' (if any) they should appear with in regards to other components would fix the issue as it would keep the bulk components together and the rapidly changing ones would then remain out of band.

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Oct 25, 2019

I've started adding benchmarks to my fork jaynus/amethyst-ecs-benchmarks

The reports can be seen Criterion Reports Here

They are a WIP as I add more optimized cases.

@OvermindDL1

This comment has been minimized.

Copy link

@OvermindDL1 OvermindDL1 commented Oct 25, 2019

Awesome @jaynus, looking forward. My minigames tend to do a significant amount of component swapping of a set amount of component types with rare swapping of another set of component types and little to no swapping of the remaining amount of component types, so optimizing for these is very doable with legion with some more design.

@Moxinilian

This comment has been minimized.

Copy link
Member

@Moxinilian Moxinilian commented Oct 26, 2019

Really excited for the new transforms. It could also potentially let us easily create alternative coordinate systems (most notably spherical coordinates) to very easily create 3rd person cameras for example.

@khionu

This comment has been minimized.

Copy link
Member

@khionu khionu commented Nov 10, 2019

There's been a lot of positive feedback to this change. Do we have any issues that would require delaying confirming this as our course of action? Would we want to publicize this discussion more before doing so?

@zicklag

This comment has been minimized.

Copy link

@zicklag zicklag commented Nov 10, 2019

It seem to me like it would be good to make this as public as possible before confirming it, being that it is probably one of the single largest possible changes that we could make.

It seems unlikely at this point that we would change our direction, but I think it makes sense to give people as much an opportunity to state their thoughts against it before we go ahead with it.

@distransient

This comment has been minimized.

Copy link
Member

@distransient distransient commented Nov 11, 2019

Khionu and I have posted it around now, thanks.

@Ralith

This comment has been minimized.

Copy link

@Ralith Ralith commented Nov 11, 2019

As a non-amethyst-user who is hopeful about legion's future, I'm concerned that amethyst adopting it wholesale in the near future might make it difficult for legion to make significant changes to its public API. Legion's a very new crate and issues like TomGillen/legion#27 illustrate that significant refinements are still occurring, and that locking in a stable API is therefore premature.

Conversely, adoption by amethyst could provide a substantial amount of the experimental use required to drive future design revisions, if those revisions aren't prevented by stability requirements, so the key question is how much churn amethyst is willing to tolerate in the name of improvements to upstream libraries.

@Voultapher

This comment has been minimized.

Copy link

@Voultapher Voultapher commented Nov 11, 2019

Great work all around! This definitely looks promising. A word of caution, I've seen a world like concept crop up in different game engines and it nearly always turned out to have a lot of the same problems as global mutable state, especially the more complex a project got.

Also with something this fundamental for the engine, I'd highly advise also running benchmarks in more complex use cases like a simple 3d game, as microbenchmarking tells an incomplete story.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Nov 11, 2019

Are the changes to the transform component tied to the legion change, or can they be done independently?

I really like the transform suggestion, but I do have some comments:

  • having the ability to do custom transforms is fairly important to me, whether that's by directly editing the LocalToWorld component or by a custom transform component
  • the order that scale, rotation, and translation are applied needs to be well documented.
@OriontheCat

This comment has been minimized.

Copy link

@OriontheCat OriontheCat commented Nov 11, 2019

As long as it is enabled only as a feature first, it's fine by me.

@AThilenius

This comment has been minimized.

Copy link

@AThilenius AThilenius commented Nov 11, 2019

@mrhatman

  • having the ability to do custom transforms is fairly important to me, whether that's by directly editing the LocalToWorld component or by a custom transform component

LocalToWorld is a public component that you can freely add/edit at-will. The beauty of it being the only transform component that the vast majority of other systems care about is that you can compute this component (which is just a mat4) how ever you like. It's not tied to the hierarchy, or any specific space. Right now there is a system that provides a path for [Translation, Rotation, Scale/NonUniformScale] => LocalToWorld as well as hierarchy support for the aforementioned => LocalToParent (which is tern feeds the LocalToWorld). To opt out of this for a single entity, simply don't add any of [Translation, Rotation, Scale/NonUniformScale] at which point the entire thing will be ignored by the existing transform systems, or you can selectively disable systems if (ex) you have a totally different coordinate system in your game.

There are also lots of other cool transforms that can be added to legion_transform for things like skinned meshes. Because it's a 'pay for what you use' approach, adding any number of other supported transform types has little/no performance impact on existing ones.

TL;DR: Yes, it's high customizable and you only pay the cost of the transformations you actually use.

  • the order that scale, rotation, and translation are applied needs to be well documented.

Agreed. Need lots more documentation.

Nalgebra Transform API Issue

One big issue I have with my own library: the nalgebra API for computer graphics is... to be polite let's just say "difficult to use", and their documentation is hard to navigate. Because Translation is just a wrapper around nalgebra::Translation3 (as with all the other types), it means creating/manipulating these types is all nalgebra API. I'm opposed to providing duplicate APIs, but that also means, well... frankly a painful public API for these types. I also don't want to duplicate documentation with nalgebra, but right now I run into so much pain using legion_transform types myself that it almost seems worth it, at least for common computer-game cases :(

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Nov 11, 2019

LocalToWorld is a public component that you can freely add/edit at-will.

TL;DR: Yes, it's high customizable and you only pay the cost of the transformations you actually use.

These answer my first question well.

Great to hear this whole system is being thought out well.

@zicklag

This comment has been minimized.

Copy link

@zicklag zicklag commented Nov 11, 2019

the nalgebra API for computer graphics is... to be polite let's just say "difficult to use",the nalgebra API for computer graphics is... to be polite let's just say "difficult to use",

I agree that nalgebra's API is pretty difficult. It isn't like it is totally bad, but the whole dimensions abstraction really leverages Rust's type system in a way that, while it actually works fine, is very non-intuitive and comparatively difficult to think about. It also doesn't seem to be the fastest solution out there, but that is probably another discussion.

I absolutely love what you are saying about the transforms being pay for what you use and highly customizable. 👍

@seanmiddleditch

This comment has been minimized.

Copy link

@seanmiddleditch seanmiddleditch commented Nov 11, 2019

Small personal pet peeve around terminology. :)

Sparse storages causes cache incoherence

Should say "Sparse storage has poor cache locality."

Cache coherency is the (generally in-hardware) process by which memory writes from one CPU are propagated so the caches on other CPUs in a shared-memory system don't have stale data.

Cache locality is the likelihood of subsequent memory operations being in cache and is the root of why we prefer contiguous and sequential memory access pattern.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Nov 12, 2019

It would be great to see a few more benchmarks. As just the iteration time of a basic component is only marginally faster and all of the other benchmarks are slower ( I realize they are a WIP ).

One case that isn't really discussed is memory usage? Does one have more overhead?

Also, these benchmarks seem to focus on cases where all entities have all components instead of the more likely scenario where things like render targets and transforms are used alot and other things like Uibuttons are used more sparingly. If we could get a benchmark where we can test both cases (and one in between maybe) that could be revealing.

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Nov 13, 2019

@mrhatman @Voultapher @Ralith

Reminder that I linked above https://github.com/jaynus/amethyst-ecs-benchmarks, which is the WIP benchmark comparisons I showed the results for here. The current benchmark results show specs ahead of legion because these are worst case scenario benchmarks for legion, and best case scenario benchmarks for specs. E.g. the only benches I implemented are basically to show legion in the worst possible light, and it still holds up fairly well.

Anyways, the whole point of this is that I could really use help writing more benchmarks more reflective of real world use. I only have so much time, so any help will be greatly appreciated. I'll take all the PR's in the world to that project to start visualizing some true usage.

@zicklag

This comment has been minimized.

Copy link

@zicklag zicklag commented Nov 13, 2019

Also, just to reiterate and not to negate the need for benchmarks, but performance isn't the only reason for using Legion, though it is among the biggest reasons. Legion is also more flexible and easier to setup an FFI with, which is needed for scripting.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Nov 14, 2019

@jaynus I tried cloning that repo bit ran into alot of issues with getting it to build, I think because you are using custom crates. So decided to make my own.

I can publish the source later if there is interest ,but I ran two tests:
-First I repeated the basic position velocity test and it showed legion doubling the speed.

-Second I varied the frequency that entities had velocity components. I tested (90%,50%, and 10%). Legion had significant speed increases as that frequency decreased as you would expect due to less components to join. Specs was relatively the same regardless of the frequency which was concerning.

I am on board that the change should lead to significant improvements in amethyst.

@OvermindDL1

This comment has been minimized.

Copy link

@OvermindDL1 OvermindDL1 commented Nov 14, 2019

I'm for legion being chosen as well, however it still sorely needs grouping support like other 'batched' ECS's as well. Sometimes components change rapidly so they should be in a different group. Sometimes components need to be tightly packed together with only themselves to allow for OpenCL or whatever work, those should be in a null group or a group by themselves, etc... Regardless, grouping needs to be supported.

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Nov 14, 2019

@OvermindDL1 We've been thinking about it here: TomGillen/legion#16

This issue is the "correct" way to provide that kind of functionality, so its not like its not on our minds or something that is not going to be implemented; its just not implemented yet. I know you've brought this up many times in discord and on the community post, so I just wanted to inform you on the plan.

@mrhatman Oh sorry, I don't think I've updated it for the latest API safety changes. I'll get it updated shortly for you to PR into.

@reesmanp

This comment has been minimized.

Copy link

@reesmanp reesmanp commented Nov 19, 2019

I like the current API for creating systems as it makes sense in my mind. Is there a need/reason the change the developer facing API? I'd love to hear any arguments for or against moving to the new API.

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Nov 19, 2019

I'd love to hear any arguments for or against moving to the new API.

The shred trait-based API /w a Query type is just not possible in legion without GATs; because the types are fairly complex for a query. We will most likely be adding a simple case for single-query systems, but more complex ones will still require the closure builders.

You can see the current thoughts on that here: TomGillen/legion/issues/31

@moonheart08

This comment has been minimized.

Copy link

@moonheart08 moonheart08 commented Nov 26, 2019

I'm onboard with choosing Legion, primarily for performance gains, secondarily for the scripting API (Which will be great to finally have moving forward again).

@ivan-roksandic

This comment has been minimized.

Copy link

@ivan-roksandic ivan-roksandic commented Dec 9, 2019

I like with the idea of having a scripting API.
Tho if you are gonna re attach all systems inside atm, it would probobally be a good idea to try to see if its possible to make some modularity making Isolated bundles loadable/reloadable through FFI/DLLs.
Just sort of keep a definition of all components and thier members in memory, to compare when loading in new bundles/modules.
Could allow for a lot faster compilation and even hot-reloadability.
Interaction between bundles could just be as simple as just communicating with Component data.

@khionu

This comment has been minimized.

Copy link
Member

@khionu khionu commented Dec 9, 2019

There's discussion of scripting in other threads, including the Scripting RFC and on the forums. It'd be best if discussion of features that are not exactly in scope are held where they are.

@joonazan

This comment has been minimized.

Copy link

@joonazan joonazan commented Dec 18, 2019

One of my biggest concerns with Amethyst is that it is easy to accidentally cause nondeterministic multi-frame delays by not properly specifying dependencies.

Would it be possible to automate this? After all, the systems tell what they depend on.

In Legion it is possible to select entities that have a certain component without reading those components. We should also have a way to signal that a component is written, but not read.

Systems that have read and write access to a component are problematic. There is the worst case where entities have to be topologically sorted, for example a scene graph. How is this dealt with currently?

@khionu

This comment has been minimized.

Copy link
Member

@khionu khionu commented Dec 18, 2019

System dependencies can be qualified by factors other than the components they iterate over. For example, a Damage-over-Time system might need to be ran before or after systems that apply the effects, depending on the design of the game mechanics.

@joonazan

This comment has been minimized.

Copy link

@joonazan joonazan commented Dec 18, 2019

@khionu

This comment has been minimized.

Copy link
Member

@khionu khionu commented Dec 18, 2019

It's not about frame delay, it's about whether you apply a piece of logic before or after another. Even if Rust supported checks to a degree as perfect as technology could get, it would still not be able to understand design.

For another example, you would want to check whether a character has feinted after calculating everything else going on in a frame. Calculating DOT, IsFainted, and Regen, in that order, could yield a false positive Fainted status, but those concepts are impossible for us to detect and account for in the order we run systems. They are non-technical concepts being represented in code. It's not possible to be 100% sure about their order.

@joonazan

This comment has been minimized.

Copy link

@joonazan joonazan commented Dec 18, 2019

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Dec 18, 2019

You bring up an missing feature of Legion compared to specs, in Specs the last parameter of adding a system to a dispatcher was any systems it depended on to run previously. Is the a way preform the same in legion?

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Dec 18, 2019

Please review my fork with my new amethyst dispatcher. There is a concept of Stages. Stage and RelativeStage are Ord and allow you to infer run order for systems. These give the sort order of systems when flattened into the final execution list. After that, dispatch order is based on insertion order. Finally, "dependency" is inferred by resource write access; there is no more defining explicit dependencies.

You can either infer you want a system to get dispatched around a Stage, or explicitly use resources to force exclusive execution order.

I'm not sure where these statements about nondetermism are coming from, unless your speaking about specs. For legion dispatches, the systems are always dispatched in the same order every frame, and will always deduce the same dependency order.

Explicit named dependencies were removed as they are messy and force synchronousity and naming when it's not needed.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Dec 18, 2019

Ok, that makes sense on execution order.

I read your branch but it's a little confusing.

I assume things with in a stage can run in parallel assuming no conflicting dependencies.
For example if you have 3 systems in a stages A,B,C entered in that order. And B depends on A because of shared resources. At start A & C will be running on separate threads and once A is done then B runs. And them when they are all done, you move to the next stage.

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Dec 18, 2019

Almost, except Stages are not discrete, they don't imply any sort of explicit bucketing. They just infer the order of execution in a linear list (sorted into a flat list). So systems in various stages will still run in parralel if no data dependency exists between them.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Dec 18, 2019

Ok, I guess I am confused on how stages are different than just using the order they are added to the builder in. Other than just clarity. Like what problem do they solve?

@joonazan

This comment has been minimized.

Copy link

@joonazan joonazan commented Dec 18, 2019

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Dec 18, 2019

Okay, lets take a step back and go over an example here. Here is a case I have:

let builder = DispatcherBuilder::default()
       .with_system(Stage::Begin, build_map_entity_position_system)
       .with_system(RelativeStage(Stage::Begin, -1), build_camera_movement_system)
       .with_system(Stage::Begin, build_mouse_world_state_system)
       .with_system(Stage::Render, build_mouse_selection_system,)
       .with_thread_local(rl_ui::system);

When a DispatcherBuilder.build() is called, it consumes each builder function (FnOnce(&mut World) -> Box<dyn Schedulable>) explicitly in the order they were added to the builder. We have two cases to worry about here: Build order, and execution order. Because of this, it is not ideal to to strictly rely on insertion order because there may be cases where we need to build a system first, but run it after another system. In specs, this was done by using named dependencies between systems. However, this naming forced synchronicity for systems, which was not flexible. In a parallel executor, there are two, not one, scenarios we care about: Waiting for another system to finish running, and the order in which we start systems. These are two different scenarios, and both can be described in this system.

The named dependencies in specs are dirty (They require explicit names for systems, which means string comparisons and storage, etc) - when the feat it accomplishes is really defined in the data of systems themselves, being that if your system is accessing data that a prior system needs to transform, then you've already defined that dependency. You shouldn't need to explicitly state it by names, it should just exist.

Now, shifting gears, the dispatcher builds its list of execution like this:

  • On build, consume all builder functions in insertion order (internally, this is done with ConsumeDesc implemented for these closures)
  • Each created Box<dyn Schedulable> is inserted into a BTreeMap, which is sorted based on RelativeStage. Stage implements IntoRelativeStage, which defaults to RelativeStage(Stage::N, 0).
  • This sorting means you can infer whether you want systems to be inserted before, or after, others in the flattened system list..
  • Finally, we build the Schedule in legion. This is the data-driven description of the order in which systems are executed. This order is defined by data dependencies. Therefore, if you have:
RelativeStage(Begin, 0) SystemA - Write Res1
RelativeStage(Begin, -1) SystemB - Write Res1
RelativeStage(Begin, 1) SystemC - Read Res1

These get re-ordered prior to creating our schedule, as

RelativeStage(Begin, -1) SystemB - Write Res1
RelativeStage(Begin, 0) SystemA - Write Res1
RelativeStage(Begin, 1) SystemC - Read Res1

Because System A and B are writes, then these are guaranteed to dispatch linearly, one after the other, with no overlap, because the writes are exclusive. HOWEVER, this is a simple case. There are cases where there will be mixed reads and writes, of course, and this system allows us to begin execution of order dependent systems, or run them in parralel, across stages based on their data dependencies. But that data dependency order is defined by the stages.

Hope that makes more sense, and explains the problem space and why the string dependencies was a messy, incomplete solution, and this allows more granular and explicit usage. The question, in this case, is "What if I have a system that needs to run after another, but they have no data dependencies between them?". My first question is "Why is that the case then, why cant they run in parralel?" Many times, it doesn't matter. And the answer is, create a ZST resource between them, then, to enforce that synchronicity.

@mrhatman

This comment has been minimized.

Copy link

@mrhatman mrhatman commented Dec 18, 2019

Awesome explanation. I was missing that there might be times where you want build order and execution order to be different.

@CleanCut

This comment has been minimized.

Copy link

@CleanCut CleanCut commented Feb 13, 2020

What's the status of legion? I checked the forks linked to from the original post (1 and 2), but no commits have been pushed since last November.

@zicklag

This comment has been minimized.

Copy link

@zicklag zicklag commented Feb 16, 2020

I haven't directly done any work on it and @jaynus might be able to speak better to this, but as far as I know there hasn't been any further development of the port of Amethyst to Legion, but there has been miscellaneous development work on Legion itself. I've seen a bit of general activity, PRs, issues, etc. on the Legion repo.

@CleanCut

This comment has been minimized.

Copy link

@CleanCut CleanCut commented Feb 16, 2020

Ah, it looks like I was inadvertently on a non-master branch when I was looking at commit dates. There is definitely some more recent activity.

@CleanCut

This comment has been minimized.

Copy link

@CleanCut CleanCut commented Feb 19, 2020

Still, I think folks would all love to hear how things are coming along and if there are places that help is needed!

@jaynus

This comment has been minimized.

Copy link
Author

@jaynus jaynus commented Feb 19, 2020

I've been asked a bit for an update, and I've taken a small break for amethyst related work, hence my fork stalled for a bit.

I have pushed the latest legion work @ https://github.com/amethyst/amethyst/tree/legion This includes another rewrite of the dispatcher and some other minor changes. This branch DOES NOT CURRENTLY BUILD due to being in the middle of updating everything to the latest legion master code.

TLDR Status
The port of the core, rendering, tiles and window is complete and works (once above mentioned legion update is complete). The core mechanics of the dispatcher, world syncs and execution are all complete. Remaining work items is all documentation, systems and example porting.

Here is a task list of items that are complete:

  • amethyst_rendy
  • amethyst_core
  • amethyst_tiles
  • New legion/specs hybrid dispatcher
  • amethyst_window
  • Asset loader ported to legion
  • Temporary synchronization between specs and legion worlds as documented in this RFC.
  • Two new examples added and working. legion and legion_tiles

Following are items that STILL NEED TO BE COMPLETED from a implementation standpoint:

  • amethyst_input, complete port
  • amethyst_animation, complete port (This should probably just get gutted and removed)
  • amethyst_controls, complete port
  • amethyst_audio, complete port
  • amethyst_gltf, complete port
  • amethyst_locale, complete port
  • amethyst_network, complete port
  • amethyst_test, complete rewrite
  • amethyst_utils, complete port
  • amethyst_ui, complete port (This should probably just get gutted and removed)

Supporting work that needs to be completed:

  • Documentation pass updated all inline code examples
  • Book update for new systems
  • amethyst_tools needs to be updated or ganked, its still very old
  • All examples need to be ported to legion
@azriel91

This comment has been minimized.

Copy link
Member

@azriel91 azriel91 commented Feb 19, 2020

Hm, perhaps it's useful to do a release of amethyst with its current changes, and have the version after that purely the legion switchover (2-smaller-step upgrades, instead of one big step).

One benefit is people get the non-"nightly" specs upgrade first while we transition over to legion.

I'll likely take on amethyst_test, since i wrote most of it, but of course the base crates come first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.