Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove reflection #44

Closed
wants to merge 11 commits into from

Conversation

andreakarasho
Copy link
Contributor

@andreakarasho andreakarasho commented Jan 2, 2023

Main

To follow the project phylosophy, this PR aims to make Arch faster.
As discussed in #40 Arch still using Reflection and an unconventional components access with the Unsafe.As<T> API.

The solution: use a bytes buffer

The Archetype class accepts a set of Components which the corrispective types are unknown at compile time and the opaque Array class get casted to a specific type using the Unsafe.As<T> API with some magic.
I replaced the Array with the following implementation
https://github.com/andreakarasho/Arch/blob/remove-reflection/Arch/Core/Chunk.cs#L12-L28
The hack is to use a byte[sizeof(T) * Capacity] array to store the components set.
In this way we can cast our components using the MemoryMarshal.Cast<byte, TComponent> API which garantees a O(1) operation.

No managed references declared as struct field

Solution: usually ECS components contains an index to the managed resource

// old
record struct Sprite(object Texture, Color color);

// new: use 'TextureID' to grab the resource
record struct Sprite(int TextureID, Color color);

About the string type I think to implement a struct which will handle the implicit convertion to unmanaged memory automatically. Something like in flecs-cs:

record struct NameComponent(CString Value);

Reference: https://github.com/flecs-hub/flecs-cs/blob/main/src/cs/production/Flecs/flecs.cs#L11708-L11937

No class components

I don't consider this a bad thing actually. I don't see a valid reason to use class components becase they cause addresses jump). So struct is the way to go!
Unity discourages the class usage too: https://docs.unity3d.com/Packages/com.unity.entities@0.16/manual/component_data.html#managed-icomponentdata

The sample

No reflection sample

Arch.Sample.NoReflection.mp4

(The initial fps drops was due to a misslick to the tiltle bar during the video recording)

Current master

Arch.Samples.WithReflection.mp4

TODO

  • code cleanup
  • documentation

@andreakarasho andreakarasho marked this pull request as draft January 2, 2023 16:20
@genaray
Copy link
Owner

genaray commented Jan 2, 2023

Thanks! Looks great :)

So it's actually a bit faster than the current version that uses "type magic" under the hood. This is great ^^

The main advantage of managed structs and classes is their "easy" usage... there are some cases where the user does not want to write boilerplate code for converting existing courses to components, indexes, and containers. This could probably scare some people away since they will need to write more boilerplate code to reach their goal.

So I wonder if its possible to combine both ways and if its worth it. By a flag or combined API. Unity for example has the mechanic of "ObjectComponents" which are basically classes and structs with managed references.

We could theoretically do the same, without losing performance... while maintaining flexibility for some scenarios.

About the string type I think to implement a struct which will handle the implicit convertion to unmanaged memory automatically.

Thats also sounding pretty nice ^^ Probably there could be even some sort of inbuild lookup mechanism which helps to map the index of an sprite, string e.g. to its class, without forcing the user to do it himself.
var id = collection.Register<Sprite>(sprite); var sprite = collection.Get(cmp.Id);

@andreakarasho
Copy link
Contributor Author

The main advantage of managed structs and classes is their "easy" usage... there are some cases where the user does not want to write boilerplate code for converting existing courses to components, indexes, and containers. This could probably scare some people away since they will need to write more boilerplate code to reach their goal.

Somebody already answered to this question :D

Thats also sounding pretty nice ^^ Probably there could be even some sort of inbuild lookup mechanism which helps to map the index of an sprite, string e.g. to its class, without forcing the user to do it himself.

It's pretty close to the Bevy Handles concept and I really like it,
This could solves nicely the managed references in structs.
Probably better to put this to the Arch-TODO list until/if this PR get merged [?]

@genaray
Copy link
Owner

genaray commented Jan 3, 2023

It's pretty close to the Bevy Handles concept and I really like it, This could solves nicely the managed references in structs. Probably better to put this to the Arch-TODO list until/if this PR get merged [?]

I see, that's actually very close to those Handles ^^

So there actually like three ways for providing managed struct or class support.

  1. We could use flags to add/remove reflection support from the API and chunks
  2. We could add ObjectComponents like unity did, they are also stored in chunks like the current solution, but are accessed with a different API : entity.SetObjectComponent(new Texture2D());
  3. We could use Handles or some sort of content registration system to access managed structs and classes.

I think the first one is not worth it, since it causes the most work and could be a nightmare to maintain.
Whats your opinion to unitys workflow with ComponentObjects´ ? ^^ Would you favor Handles` ?

Probably better to put this to the Arch-TODO list until/if this PR get merged [?]

This will be merged, however, it will still take some time. Im currently working on the documentation on the feature/code-style branch. I will finish it, than merge it :) This could still take several days however since writing documentation is incredibly... boring.

@andreakarasho
Copy link
Contributor Author

I definitely prefer the Handle<T> approach because:

  • we don’t need to edit the arch core code (heavily at least)
  • sounds like more ecs style
  • Arch will maintain the best performances given by the span tricks
  • it’s a nice solution to manage any external resource of any type

@andreakarasho andreakarasho marked this pull request as ready for review January 4, 2023 15:48
@andreakarasho andreakarasho changed the title [WIP] Remove reflection Remove reflection Jan 4, 2023
@genaray
Copy link
Owner

genaray commented Jan 4, 2023

I definitely prefer the Handle<T> approach because:

  • we don’t need to edit the arch core code (heavily at least)
  • sounds like more ecs style
  • Arch will maintain the best performances given by the span tricks
  • it’s a nice solution to manage any external resource of any type

Well in that case we should create a issue for it :)
Im still working on the docs, gonna merge it the next few days.

Should we probably wait for the handle api before merging this ?

@andreakarasho
Copy link
Contributor Author

I think this is up to you :)
If you don't plan to make a release suddenly, merging this PR is fine. It's ok to go ahead with the development when we have nugets that work good.
I'll be OK with any decision you will take np

@genaray
Copy link
Owner

genaray commented Jan 5, 2023

In this case, i think its the best to wait till I finally finished the code style rework ^^ ( Which is hella annoying ).
I wish I could do this more quickly, but I currently have a lot of work to do. My bachelor thesis starts soon and I currently have many meetings and stuff to prepare.

Out of curiosity, could the Handle<T> API and the removal of managed struct and class support actually harm the performance ? I thought about it a bit and did some research. This came into my mind :

Is the PR approach really faster compared to the previous reflection one ?

Is it possible that the speed advantage came from component restructuring instead of reflection removal ?

In your updated Systems.cs from Arch.Samples, you simply pass the Texture to the system and then assign it to each draw call manually. This is the most efficient way, no lookups are required and no additional data is being passed around.

I assume that the PR and the master actually perform similar when both use the same technique.
Could you try the same with the master and see if it performs similar ?

Could such Handle<T> API actually harm the performance ?

This is a theoretical one, but it still makes kinda sense.

Having no support for managed structs or classes, required the discussed Handle<T> API #47.
However, this could come at a cost... array lookups, bound checks, or additional struct spam on the stack.
Especially when the data is custom for entities, which is kind often the case. Different sprites, gameobjects, meshes, materials, animations or whatever. This could have a huge negative impact since it would actually create a lot of lookups.

// Class/Managed struct support
world.Query(..., (ref Sprite sprite) => {
    batch.Draw(sprite.Texture2D);   // No lookup, direct reference to the Texture2D class, many different textures no problem
});

// PR & Handle
world.Query(...,(ref Sprite sprite) => {

   // Lookup since the handle would need to access an array
   // Results in either bound checks or usage of spans for accessing the array without checks. 
   // Creating such a span for every access, however, could pollute the stack drastically which would result in a big slowdown. 
   // Furthermore accessing an external array will load it into L1 cache and disrupt the iteration performance. 

   var texture = sprite.TextureHandle.Get(); 
   batch.Draw(texture);
});

// Snippet from #47 
public ref T Get(in Handle<T> handle)
{

  // AsSpan creates a new span struct EVERY call, even if its just a struct, its kinda bad.
  // _list is being accessed, which is a list and its internal array. Those are being loaded into the cache every Get operation
  // This could completely disrupt the L1 cache used for the fast iteration, which could result in an extremely slow iteration speed for handle accesses in general. 
   return ref CollectionsMarshal.AsSpan(_list)[handle.ID];
}

Representation of why it would disrupt the cache :

  1. Chunk 1 accessed.
  2. First Entity & Components are being accessed | Arrays being loaded into L1 cache.
  3. Handle<T>.Get is called, | Handle list is being loaded into cache, replacing the chunk arrays in the L1 Cache.
  4. Second Entity & Components are being accessed. | Chunk arrays loaded back into L1 cache again.
  5. Handle<T>.Get is called again, circle repeats.

This could cause a lot of cache misses, on every single entity actually... If they access the handles of course.

So i actually assume that a Handle<T> API or removal of managed structs and classes, could actually result in a worse performance... even more worse compared the the current master :/

What are your thoughts on this ? Or have I missed something ? ^^

@andreakarasho
Copy link
Contributor Author

However, this could come at a cost... array lookups, bound checks, or additional struct spam on the stack.
Especially when the data is custom for entities, which is kind often the case. Different sprites, gameobjects, meshes, materials, animations or whatever. This could have a huge negative impact since it would actually create a lot of lookups.

Don't forget that managed objects are simply pointers so you will do random memory jumps in all cases [what OOP does actually :D].


Could you try the same with the master and see if it performs similar ?

Sure. Both now run the same code:

Master

diff --git a/Arch.Samples/Components.cs b/Arch.Samples/Components.cs
index e0a791e..16445b4 100644
--- a/Arch.Samples/Components.cs
+++ b/Arch.Samples/Components.cs
@@ -24,6 +24,6 @@ public struct Velocity
 /// </summary>
 public struct Sprite
 {
-    public Texture2D Texture2D;
+    //public Texture2D Texture2D;
     public Color Color;
 }
\ No newline at end of file
diff --git a/Arch.Samples/Game.cs b/Arch.Samples/Game.cs
index 6634ede..deb973e 100644
--- a/Arch.Samples/Game.cs
+++ b/Arch.Samples/Game.cs
@@ -32,6 +32,10 @@ public class Game : Microsoft.Xna.Framework.Game
     {
         _graphics = new GraphicsDeviceManager(this);
         Content.RootDirectory = "Content";
+        this.IsFixedTimeStep = false;
+        _graphics.SynchronizeWithVerticalRetrace = false;
+        this.IsMouseVisible = true;
+        this.InactiveSleepTime = TimeSpan.Zero;
     }
     
     protected override void Initialize()
@@ -61,7 +65,7 @@ public class Game : Microsoft.Xna.Framework.Game
         _jobScheduler = new("SampleWorkerThreads");
         _movementSystem = new MovementSystem(_world, GraphicsDevice.Viewport.Bounds);
         _colorSystem = new ColorSystem(_world);
-        _drawSystem = new DrawSystem(_world, _spriteBatch);
+        _drawSystem = new DrawSystem(_world, _spriteBatch, _texture2D);
 
         // Spawn in entities with position, velocity and sprite
         for (var index = 0; index < 1000; index++)
@@ -69,7 +73,7 @@ public class Game : Microsoft.Xna.Framework.Game
             _world.Create(
                 new Position{ Vec2 = _random.NextVector2(GraphicsDevice.Viewport.Bounds) }, 
                 new Velocity{ Vec2 = _random.NextVector2(-0.25f,0.25f) }, 
-                new Sprite{ Texture2D = _texture2D, Color = _random.NextColor() }
+                new Sprite{ Color = _random.NextColor() }
             );
         }
     }
@@ -93,7 +97,7 @@ public class Game : Microsoft.Xna.Framework.Game
     {
         _graphics.GraphicsDevice.Clear(Color.CornflowerBlue);
         _drawSystem.Update(in gameTime);
-        Console.WriteLine($"FPS : {(1 / gameTime.ElapsedGameTime.TotalSeconds)}");
+        //Console.WriteLine($"FPS : {(1 / gameTime.ElapsedGameTime.TotalSeconds)}");
         base.Draw(gameTime);
     }
 
diff --git a/Arch.Samples/Systems.cs b/Arch.Samples/Systems.cs
index 44e1407..7719a6a 100644
--- a/Arch.Samples/Systems.cs
+++ b/Arch.Samples/Systems.cs
@@ -132,8 +132,9 @@ public class ColorSystem : SystemBase<GameTime>
 public class DrawSystem : SystemBase<GameTime>
 {
 
+    private readonly Texture2D _texture2D;
     private readonly SpriteBatch _batch;
-    public DrawSystem(World world, SpriteBatch batch) : base(world) { _batch = batch;}
+    public DrawSystem(World world, SpriteBatch batch, Texture2D texture) : base(world) { _batch = batch; _texture2D = texture; }
 
     /// <summary>
     /// Targets all entities with a position and sprite
@@ -161,7 +162,7 @@ public class DrawSystem : SystemBase<GameTime>
                 // Get refs to position and sprite.
                 ref var position = ref positions[index];
                 ref var sprite = ref sprites[index];
-                _batch.Draw(sprite.Texture2D, position.Vec2, sprite.Color);  // Draw
+                _batch.Draw(_texture2D, position.Vec2, sprite.Color);  // Draw
             }
         }
         

... and this is the results:
Screenshot done after some minutes of both app running.
image

Using the Handle<T> prototype to store/grab the Texture2D, i see the same results of the ss.


Handle<T>

Mine was just a prototype done in 30secs :D just to give an idea

@andreakarasho
Copy link
Contributor Author

I actually found a memory overlap somewhere in the code.
Im going to check it out

@genaray
Copy link
Owner

genaray commented Jan 5, 2023

Don't forget that managed objects are simply pointers so you will do random memory jumps in all cases [what OOP does actually :D].

Well that's right, but using a Handle<T> should actually cause one more lookup and a little bit more pollution compared with a simple reference inside a struct... since the reference in the struct is already there while the handle needs to lookup it first ^^

Atleast in theory,

Screenshot done after some minutes of both app running.

This however looks incredible... probably a bit "too" good :D
I did some research and both array variants should actually perform the same. Both array variants are the same size ( in kb ), converted to spans and accessed during the loop. Both arrays are still managed ( since [] in c# is always a managed type ). They also should generate equal IL and assembly code.

C# should not really treat a raw byte array different from an Array. Neither should the CPU L1 Cache prioritize any of these. The access itself is what matters and the access is similar. The only difference is that the PR uses MemoryMarshal and the master the Unsafe API.

So Im curious why exactly the PR is "faster" and if it really is, or if we probably misunderstand something ^^

Sure. Both now run the same code:

Have you probably looked into benchmarking the versions to see how they perform under heavy load ?
I think rerunning the query benchmarks should give us enough insight about all its details :D

Side note, I'm just asking all these questions to understand and learn. Part of my learning behaviour ^^

@andreakarasho
Copy link
Contributor Author

andreakarasho commented Jan 5, 2023

Looks like I fall into the hell.
For a large struct [32 bytes]:
Marshal.SizeOf(type) returns an higher value.
this https://github.com/andreakarasho/Arch/blob/remove-reflection/Arch/Core/Utils/CompileTimeStatics.cs#L87
Instead when the raw bytes buffer get converted from bytes to this struct using MemoryMarshal.Cast<byte, T>, the size is lower.

During the development of this PR I tried to replace the Marshal.SizeOf(type) with the Unsafe.SizeOf<T> but it's a pain because I should use source generator and remove all public methods that accepts ComponentType[] as argument.

@andreakarasho andreakarasho marked this pull request as draft January 5, 2023 23:33
@genaray
Copy link
Owner

genaray commented Jan 6, 2023

For a large struct [32 bytes]:
Marshal.SizeOf(type) returns an higher value.

Is it probably a padding problem ? I ran into this problem a lot since c#/c++ will automatic add padding bytes to make it fit the registers better. If the types of the 32 bytes struct arent aligned properly, it will add a few bytes which results in wrong sizes.

and remove all public methods that accepts ComponentType[] as argument.

This is hella work ^^ Furthermore it would remove flexbility. There many many usecases where its great to e.g. create entities by types instead generics. For example custom serialization.

I actually planned to add entity.Set(type, t); entity.Get(type);, entity.Add(type); entity.Remove(type) extensions in the future ( as additions NOT replacements ) since they could come in pretty handy for custom persistence ^^

@andreakarasho
Copy link
Contributor Author

Closed as not applicable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants