Support older core2 processors #115485

Open
xtremeqg wants to merge 1 commit into godotengine:master from xtremeqg:no-sse42

@xtremeqg

Force-enabling SSE4.2 causes signal 4 (Illegal instruction) on older processors. By only enabling SSE4.1, similar performance benefits are retained while still maintaining compatibility.

Ideally, multiple code paths should be compiled into the same binary. At runtime, the most optimal path can then be selected.

This way, newer CPU features (e.g. SSE4.2, AVX-512) can be used on modern CPUs while simultaneously maintaining compatibility with systems that do not support them.

closes #115001, closes #113617

xtremeqg requested a review from a team as a code owner on January 28, 2026 at 00:13

@Calinou
Member

Calinou commented Jan 28, 2026

> Force-enabling SSE4.2 causes signal 4 (Illegal instruction) on older processors. By only enabling SSE4.1, similar performance benefits are retained while still maintaining compatibility.

Did you test this on an actual Core 2 CPU? As far as I know, downgrading to SSE4.1 won't help. We would need to downgrade all the way to SSE2 (which matches Godot 4.4 and prior behavior). See also godotengine/godot-proposals#13644.

The 32-bit binaries still target SSE2, so they will keep working on Core 2 CPUs for the foreseeable future.

> Ideally, multiple code paths should be compiled into the same binary. At runtime, the most optimal path can then be selected.

This can't be done for autovectorization, only hand-written intrinsics.
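
To illustrate the hand-written-intrinsics case, here is a minimal sketch. It is not Godot code, and it assumes GCC or Clang on x86; the function names (`max_i32`, `max_i32_sse41`, `max_i32_scalar`) are hypothetical. Only the function marked with the `target` attribute is allowed to use SSE4.1, and a runtime check with `__builtin_cpu_supports` decides whether it gets called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <smmintrin.h> // SSE4.1 intrinsics
#define HAVE_X86_DISPATCH 1
#endif

// Portable fallback: needs nothing beyond the build's baseline.
static int32_t max_i32_scalar(const int32_t *v, size_t n) {
    assert(n > 0);
    int32_t best = v[0];
    for (size_t i = 1; i < n; ++i) {
        if (v[i] > best) best = v[i];
    }
    return best;
}

#ifdef HAVE_X86_DISPATCH
// Only this function may use SSE4.1; the rest of the file keeps
// whatever baseline was chosen on the command line.
__attribute__((target("sse4.1")))
static int32_t max_i32_sse41(const int32_t *v, size_t n) {
    assert(n > 0);
    size_t i = 0;
    int32_t best = v[0];
    if (n >= 4) {
        __m128i acc = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v));
        for (i = 4; i + 4 <= n; i += 4) {
            __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v + i));
            acc = _mm_max_epi32(acc, x); // PMAXSD, an SSE4.1 instruction
        }
        alignas(16) int32_t lanes[4];
        _mm_store_si128(reinterpret_cast<__m128i *>(lanes), acc);
        best = lanes[0];
        for (int k = 1; k < 4; ++k) {
            if (lanes[k] > best) best = lanes[k];
        }
    }
    // Leftover elements that did not fill a whole 4-lane vector.
    for (; i < n; ++i) {
        if (v[i] > best) best = v[i];
    }
    return best;
}
#endif

// Public entry point: checks the running CPU, then picks a path.
int32_t max_i32(const int32_t *v, size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.1")) return max_i32_sse41(v, n);
#endif
    return max_i32_scalar(v, n);
}
```

On a CPU without SSE4.1, `max_i32(data, n)` would take the scalar path, so no SSE4.1 instruction is ever executed there.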

@xtremeqg
Author

xtremeqg commented Jan 28, 2026

> > Force-enabling SSE4.2 causes signal 4 (Illegal instruction) on older processors. By only enabling SSE4.1, similar performance benefits are retained while still maintaining compatibility.
>
> Did you test this on an actual Core 2 CPU? As far as I know, downgrading to SSE4.1 won't help. We would need to downgrade all the way to SSE2 (which matches Godot 4.4 and prior behavior). See also godotengine/godot-proposals#13644.

Yes, I own a Q9550 (which I used to build and run Godot on), just like the OP of godot-proposals#13644. Its /proc/cpuinfo lists the sse4_1 flag.

I also own a Q6600, which supports up to SSE3.

Intel ARK used to have a nice listing of what each Core 2 processor supports, but a couple of years ago they decided to delete all the pages covering Core 2 processors.

This information can still be found on Wikipedia. Apparently, all Core 2 processors on the 45 nm node support SSE4.1.
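
The `/proc/cpuinfo` check mentioned above can also be done programmatically. The following is a minimal sketch assuming Linux on x86; the function names are made up for illustration. Note that Linux spells the flags `sse4_1` and `sse4_2`, and reports SSE3 as `pni`.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Returns true if `flag` appears as a whole word on the first "flags" line
// of text formatted like Linux's /proc/cpuinfo on x86.
bool cpuinfo_has_flag(const std::string &cpuinfo_text, const std::string &flag) {
    assert(!flag.empty());
    std::istringstream lines(cpuinfo_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.compare(0, 5, "flags") != 0) continue;
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;
        std::istringstream words(line.substr(colon + 1));
        std::string word;
        while (words >> word) {
            if (word == flag) return true;
        }
        // Assumption: all cores list the same flags, so one line is enough.
        return false;
    }
    return false;
}

// Convenience wrapper that reads the real file (Linux only).
// If the file cannot be opened, the text is empty and the result is false.
bool cpu_has_flag(const std::string &flag) {
    std::ifstream file("/proc/cpuinfo");
    std::stringstream buffer;
    buffer << file.rdbuf();
    return cpuinfo_has_flag(buffer.str(), flag);
}
```

For example, `cpu_has_flag("sse4_1") && !cpu_has_flag("sse4_2")` would describe a CPU that can run an SSE4.1 build but not an SSE4.2 one.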

@xtremeqg
Author

> > Ideally, multiple code paths should be compiled into the same binary. At runtime, the most optimal path can then be selected.
>
> This can't be done for autovectorization, only hand-written intrinsics.

One could compile the same file multiple times (once for SSE2, once for SSE3, once for SSE4, and so on), then use objcopy to extract the specific symbols that received performance improvements (a profiler will tell you which), adding a prefix or suffix in the process. Those functions would then be refactored into function pointers, each set to the correct variant after enumerating the processor's features.
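
A rough sketch of the function-pointer half of this idea follows. It swaps in a different mechanism for producing the variants: instead of compiling the file several times and renaming symbols with objcopy, it uses the GCC/Clang per-function `target` attribute, so it assumes one of those compilers on x86. All names are hypothetical, and whether each wrapper really gets vectorized differently depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define X86_VARIANTS 1
#endif

using MulFn = void (*)(uint32_t *, const uint32_t *, const uint32_t *, size_t);

// One plain loop, written once, with no intrinsics. It is inlined into each
// wrapper below, where the compiler may vectorize it with that wrapper's ISA.
#if defined(__GNUC__)
__attribute__((always_inline))
#endif
static inline void mul_u32_body(uint32_t *dst, const uint32_t *a,
                                const uint32_t *b, size_t n) {
    for (size_t i = 0; i < n; ++i) dst[i] = a[i] * b[i];
}

// Variant built with the file's default baseline.
static void mul_u32_baseline(uint32_t *dst, const uint32_t *a,
                             const uint32_t *b, size_t n) {
    mul_u32_body(dst, a, b, n);
}

#ifdef X86_VARIANTS
__attribute__((target("sse4.1")))
static void mul_u32_sse41(uint32_t *dst, const uint32_t *a,
                          const uint32_t *b, size_t n) {
    mul_u32_body(dst, a, b, n);
}

__attribute__((target("sse4.2")))
static void mul_u32_sse42(uint32_t *dst, const uint32_t *a,
                          const uint32_t *b, size_t n) {
    mul_u32_body(dst, a, b, n);
}
#endif

// Enumerates processor features and returns the best available variant.
static MulFn resolve_mul_u32() {
#ifdef X86_VARIANTS
    __builtin_cpu_init(); // safe to call even if already initialized
    if (__builtin_cpu_supports("sse4.2")) return mul_u32_sse42;
    if (__builtin_cpu_supports("sse4.1")) return mul_u32_sse41;
#endif
    return mul_u32_baseline;
}

// The function pointer that callers use; resolved once during static init.
MulFn mul_u32 = resolve_mul_u32();

// Debug-only sanity check: the selected variant must agree with the baseline.
void check_mul_u32() {
    const uint32_t a[6] = {1, 2, 3, 4, 5, 6};
    const uint32_t b[6] = {7, 8, 9, 10, 11, 12};
    uint32_t got[6], want[6];
    mul_u32(got, a, b, 6);
    mul_u32_baseline(want, a, b, 6);
    for (size_t i = 0; i < 6; ++i) assert(got[i] == want[i]);
}
```

Callers simply write `mul_u32(dst, a, b, n)`. The indirect call has a small cost, which is one reason to limit this treatment to functions a profiler has shown to benefit.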

@xtremeqg
Author

I can probably rewrite the PR to add a build flag, as proposed in 13644.

@asyync1024

asyync1024 commented Jan 29, 2026

> I can probably rewrite the PR to add a build flag, as proposed in 13644.

I would also recommend that, as it would keep the SSE4.2 baseline as the default while letting both users and distros compile for older CPUs. If you can do it, please do; it would be an ideal way of handling baselines.

For example, the Azahar 3DS emulator has an option that disables the SSE4.2 baseline and allows custom instruction-set flags in CFLAGS and the like. That means you can disable SSE4.2 and still use the highest instruction set that the non-SSE4.2 CPU supports.
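
As a small illustration of what such a flag changes from the code's point of view: GCC and Clang predefine macros such as `__SSE4_2__` according to the instruction-set options they were given (for example `-msse4.2` or `-march=...`). The sketch below is not taken from Godot or Azahar; the enum and function names are hypothetical. It reports which baseline a file was compiled for.

```cpp
#include <cassert>

// Hypothetical names; ordered from lowest to highest requirement.
enum class SimdBaseline { Unknown, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX2 };

// Highest x86 SIMD level the compiler was allowed to assume for this file.
// MSVC does not define the __SSE*__ macros, so it may report Unknown.
constexpr SimdBaseline compiled_baseline() {
#if defined(__AVX2__)
    return SimdBaseline::AVX2;
#elif defined(__SSE4_2__)
    return SimdBaseline::SSE42;
#elif defined(__SSE4_1__)
    return SimdBaseline::SSE41;
#elif defined(__SSSE3__)
    return SimdBaseline::SSSE3;
#elif defined(__SSE3__)
    return SimdBaseline::SSE3;
#elif defined(__SSE2__)
    return SimdBaseline::SSE2;
#else
    return SimdBaseline::Unknown;
#endif
}

// Human-readable name, e.g. for a startup log line.
const char *baseline_name(SimdBaseline b) {
    switch (b) {
        case SimdBaseline::AVX2: return "avx2";
        case SimdBaseline::SSE42: return "sse4.2";
        case SimdBaseline::SSE41: return "sse4.1";
        case SimdBaseline::SSSE3: return "ssse3";
        case SimdBaseline::SSE3: return "sse3";
        case SimdBaseline::SSE2: return "sse2";
        case SimdBaseline::Unknown: return "unknown";
    }
    assert(false && "unhandled SimdBaseline value");
    return "unknown";
}
```

Printing `baseline_name(compiled_baseline())` in a binary built with `-msse4.1` instead of `-msse4.2` is a quick way to confirm that a build option actually lowered the baseline.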

Thank you for not letting this topic go cold; I was going to bring it up within a few days.

@asyync1024

asyync1024 commented Feb 1, 2026

I made a PR that adds a build flag (enable_sse42). It's still a work in progress, and it can be found here.

Successfully merging this pull request may close these issues.

Crash at startup with SIGILL
godot 4.5.1 crash on ubuntu 24.04.1
