Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Godot's .NET math types are way slower than System.Numerics #69490

Open
wihrl opened this issue Dec 2, 2022 · 10 comments
Open

Godot's .NET math types are way slower than System.Numerics #69490

wihrl opened this issue Dec 2, 2022 · 10 comments

Comments

@wihrl
Copy link

wihrl commented Dec 2, 2022

Godot version

4.0 beta 6

System information

Win 11, Ryzen 5 5600H, 32GB RAM

Issue description

Godot's integrated math library for .NET 6 is considerably slower than standard .NET System.Numerics.

image

Steps to reproduce

Do pretty much any Vector ops and compare it against System.Numerics. See included project for a small benchmark.

Minimal reproduction project

VectorBenchmark.zip

@Calinou
Copy link
Member

Calinou commented Dec 2, 2022

Can you reproduce this in 3.5.1 which uses Mono?

@raulsntos
Copy link
Member

System.Numerics uses intrinsics so it's expected it would be faster.

We could probably use SIMD to speed things up a bit, note that SIMD is also not implemented on the C++ side of the engine, there is a proposal for that godotengine/godot-proposals#4563 but it likely won't affect C# since we reimplement the math in order to avoid marshaling.

If you really need the performance, there's nothing wrong about converting your inputs into System.Numerics types to perform the math operations and then convert back to Godot types to return the output.

@tannergooding
Copy link

tannergooding commented Mar 1, 2023

Happy to provide input here and/or recommendations for implementation (whether for C/C++ or C#).

Recently completed a large multi-release refactoring of the general backing JIT code for System.Numerics and a more general rewrite of Matrix3x2 and Matrix4x4 for .NET 8 that resulted in up to 48x faster code for some APIs and of course that's all MIT as well so shouldn't be overly complicated to port to godot if necessary.

Edit: For those unaware, I'm the owner of the System.Numerics and System.Runtime.Intrinsics namespaces on the .NET libraries team. Love what's been going on in godot land and congrats on the 4.0 release!

@AbeniMatteo
Copy link
Contributor

I'm taking a look at this as part of #79395.

I have a suite of benchmarks that compares:

  • Godot.VectorX 4.1
  • Godot.VectorX PR impl
  • System.Numerics.VectorX
  • System.Runtime.Intrinsics.Vector128
  • Godot.VectorX wrapping System.Numerics.VectorX
  • Godot.VectorX wrapping System.Runtime.Intrinsics.Vector128
  • Hand-rolled System.Runtime.Intrinsics.Vector128

Summary:

  • The difference between net6 and net8 is gigantic and using SnVec (net6) is not always the best solution.
  • Net8 is still in preview (6 right now), in some case the codegen is "strange".
  • By fully using xmm registers, stack spill is less of a problem but it still there and sometimes it leads to unexpected regressions.
  • The PR i'm working on is almost useless with net8, the jit is a lot smarter now.
  • Most of the problems caused by the odd size of Vec3 (96bits) are solved with net8.

@tannergooding I'll contact you if these problems are not solved with next preview (or RC). Thanks.

Results: (example ported to C#, 512x512px)

godot/core/io/image.cpp

Lines 3646 to 3692 in 725beaa

void Image::bump_map_to_normal_map(float bump_scale) {
ERR_FAIL_COND(!_can_modify(format));
clear_mipmaps();
convert(Image::FORMAT_RF);
Vector<uint8_t> result_image; //rgba output
result_image.resize(width * height * 4);
{
const uint8_t *rp = data.ptr();
uint8_t *wp = result_image.ptrw();
ERR_FAIL_COND(!rp);
unsigned char *write_ptr = wp;
float *read_ptr = (float *)rp;
for (int ty = 0; ty < height; ty++) {
int py = ty + 1;
if (py >= height) {
py -= height;
}
for (int tx = 0; tx < width; tx++) {
int px = tx + 1;
if (px >= width) {
px -= width;
}
float here = read_ptr[ty * width + tx];
float to_right = read_ptr[ty * width + px];
float above = read_ptr[py * width + tx];
Vector3 up = Vector3(0, 1, (here - above) * bump_scale);
Vector3 across = Vector3(1, 0, (to_right - here) * bump_scale);
Vector3 normal = across.cross(up);
normal.normalize();
write_ptr[((ty * width + tx) << 2) + 0] = (127.5 + normal.x * 127.5);
write_ptr[((ty * width + tx) << 2) + 1] = (127.5 + normal.y * 127.5);
write_ptr[((ty * width + tx) << 2) + 2] = (127.5 + normal.z * 127.5);
write_ptr[((ty * width + tx) << 2) + 3] = 255;
}
}
}
format = FORMAT_RGBA8;
data = result_image;
}

Namespace Method Runtime Mean Error Ratio Code Size
Godot_Old BumpToNormal .NET 6.0 4.750 ms NA 1.00 735 B
Godot_New BumpToNormal .NET 6.0 3.593 ms NA 0.76 500 B
SnVec_Struct BumpToNormal .NET 6.0 4.143 ms NA 0.87 675 B
SnVec BumpToNormal .NET 6.0 3.858 ms NA 0.81 663 B
Vec128_Struct BumpToNormal .NET 6.0 2.886 ms NA 0.61 580 B
Vec128 BumpToNormal .NET 6.0 1.936 ms NA 0.41 503 B
Godot_Old BumpToNormal .NET 8.0 3.597 ms NA 0.76 482 B
Godot_New BumpToNormal .NET 8.0 3.569 ms NA 0.75 586 B
SnVec_Struct BumpToNormal .NET 8.0 2.907 ms NA 0.61 564 B
SnVec BumpToNormal .NET 8.0 2.495 ms NA 0.53 556 B
Vec128_Struct BumpToNormal .NET 8.0 2.181 ms NA 0.46 484 B
Vec128 BumpToNormal .NET 8.0 1.871 ms NA 0.39 470 B

@tannergooding
Copy link

I'll contact you if these problems are not solved with next preview (or RC). Thanks.

Sounds good, thanks! You can notably get nightly builds here as well: https://github.com/dotnet/installer#installers-and-binaries

Also glad to see the improvements done in .NET 8 are paying off. Did quite a bit of refactoring and improvements, particularly to Quaternion, Plane, and Matrix4x4. There's more still to land, but .NET 8 will be in a lot better shape than previous releases when it comes out in November.

@API-Beast
Copy link

API-Beast commented Dec 24, 2023

^ .NET 8 is out now.
Guess I will use .NET vectors for non-interfacing code for now, but some kind of parity would be nice for Godot Vectors.

Edit: Nevermind, seems like Godot vectors are already quite good according to AbeniMatteo tests.

@lucasteles
Copy link

Maybe some implicit cast from Numerics would be nice

@Ziflin
Copy link

Ziflin commented Apr 12, 2024

@tannergooding If you're ever bored, you might take a look at System.Numerics.Vector3.Cross(). It's the only method that we're able to get noticeably faster performance from a custom function by converting to Vector128 and back. I believe the Numerics version is still only using scalar instructions.

@tannergooding
Copy link

If you're ever bored, you might take a look at System.Numerics.Vector3.Cross(). It's the only method that we're able to get noticeably faster performance from a custom function by converting to Vector128 and back. I believe the Numerics version is still only using scalar instructions.

This was done in dotnet/runtime#103527, as was several other improvements including to matrix multiplication. It should be available in .NET 9 Preview 6.

@hawkerm
Copy link

hawkerm commented Sep 7, 2024

Maybe some implicit cast from Numerics would be nice

Agree, is there a separate issue/discussion tracking that? I haven't found one yet, but there's a lot of different terms to search for. I guess that'd go in the proposals repo?

I was just doing some work to keep my game logic in an isolated C# project and was going to leverage the System.Numerics Vector3 to store my resulting data. It'd be nice if I could just directly assign it to the Node3D transforms and such within my game project layer.

I guess the question would be if implicit operators would be wanted, someone could run into issues with perf if they're not aware it's creating/converting behind the scenes, but an explicit conversion operator and/or extension method ToGodotVector() would be nice:

public static class VectorExtensionMethods
{
    public static Godot.Vector2 ToGodotVector(this System.Numerics.Vector2 v) => new(v.X, v.Y);

    public static Godot.Vector3 ToGodotVector(this System.Numerics.Vector3 v) => new(v.X, v.Y, v.Z);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants