support building for e2k (Elbrus2000) CPU architecture (VLIW) #871

Merged: jrouwe merged 1 commit into jrouwe:master on Jan 12, 2024

NicSavichev (Contributor) commented Jan 11, 2024

Elbrus 2000 (aka e2k) is a 64-bit little-endian VLIW architecture (more info: https://github.com/ilyakurdyukov/e2k-ports).
This change adds support for compiling for e2k with the LCC compiler (which mimics GCC 9.3.0), using SSE intrinsics that LCC translates to close equivalents among the native Elbrus 2000 instructions.
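To illustrate the kind of code this relies on, here is a minimal sketch of a routine written purely with SSE intrinsics. `AddVec4` is a hypothetical helper made up for this example, not a JoltPhysics function. The assumption, taken from the description above, is that LCC accepts `<xmmintrin.h>` and lowers each intrinsic to native e2k instructions, so the same source should build on both x86 and e2k.

```cpp
#include <xmmintrin.h> // SSE intrinsics; assumed to be provided by LCC on e2k as well

// Hypothetical helper (not part of JoltPhysics): adds two 4-float vectors.
// Unaligned loads/stores are used so the caller's arrays need no special alignment.
inline void AddVec4(const float *inA, const float *inB, float *outResult)
{
    __m128 a = _mm_loadu_ps(inA);               // load 4 floats from inA
    __m128 b = _mm_loadu_ps(inB);               // load 4 floats from inB
    _mm_storeu_ps(outResult, _mm_add_ps(a, b)); // component-wise add, then store
}
```

Nothing in this function is specific to an x86 CPU; it only depends on the compiler understanding the intrinsic names.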

Because LCC issues additional warnings, JoltPhysics cannot be built with warnings treated as errors, so we configure and build with:
cmake . -DENABLE_ALL_WARNINGS=OFF

Internal tests passed:
./UnitTests

Single precision 64-bit with instructions: SSE2
[doctest] doctest version is "2.4.11"
[doctest] run with "--help" for options
===============================================================================
[doctest] test cases:    409 |    409 passed | 0 failed | 0 skipped
[doctest] assertions: 280481 | 280481 passed | 0 failed |
[doctest] Status: SUCCESS!

./PerformanceTest

Single precision 64-bit with instructions: SSE2
Running scene: Ragdoll
Motion Quality, Thread Count, Steps / Second, Hash
Discrete, 1, 9.346963, 0xa9daaf344fe673db
Discrete, 2, 17.908885, 0xa9daaf344fe673db
Discrete, 3, 25.916333, 0xa9daaf344fe673db
Discrete, 4, 33.719118, 0xa9daaf344fe673db
Discrete, 5, 41.038943, 0xa9daaf344fe673db
Discrete, 6, 47.863349, 0xa9daaf344fe673db
Discrete, 7, 54.226738, 0xa9daaf344fe673db
Discrete, 8, 52.689932, 0xa9daaf344fe673db
LinearCast, 1, 8.790343, 0xcdcbb4da185d1a13
LinearCast, 2, 16.644860, 0xcdcbb4da185d1a13
LinearCast, 3, 24.083228, 0xcdcbb4da185d1a13
LinearCast, 4, 31.272998, 0xcdcbb4da185d1a13
LinearCast, 5, 38.307285, 0xcdcbb4da185d1a13
LinearCast, 6, 44.485573, 0xcdcbb4da185d1a13
LinearCast, 7, 50.048080, 0xcdcbb4da185d1a13
LinearCast, 8, 48.670700, 0xcdcbb4da185d1a13

NicSavichev added a commit to GaijinEntertainment/DagorEngine that referenced this pull request Jan 11, 2024
mlknz commented Jan 11, 2024

Wow, exciting 💪

// Elbrus e2k architecture
#define JPH_CPU_E2K
#define JPH_CPU_ADDRESS_BITS 64
#define JPH_USE_SSE
Contributor review comment on these lines:

It is safer to first check the availability of the SSE macro, and then enable it.

#if defined(__SSE__) && !defined(JPH_USE_SSE)
 #define JPH_USE_SSE
#endif

You can do the same with the rest of the macros (SSE4_1, SSE4_2, AVX, AVX2, F16C, LZCNT, BMI).
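The same guard, extended to the other instruction sets named in the review comment, might look like the sketch below. The compiler-side macros (`__SSE4_1__`, `__AVX2__`, and so on) are the feature macros GCC-compatible compilers predefine; the `JPH_USE_*` names on the right-hand side are assumed to match Jolt's own defines, and mapping `__BMI__` to `JPH_USE_TZCNT` is likewise an assumption. `EnabledFeatures` is a hypothetical helper added only so the result can be inspected.

```cpp
#include <string>

// Enable each feature only if the compiler reports it and it is not already set.
// JPH_USE_* names are assumed to match Jolt's defines; verify against Core.h.
#if defined(__SSE__) && !defined(JPH_USE_SSE)
 #define JPH_USE_SSE
#endif
#if defined(__SSE4_1__) && !defined(JPH_USE_SSE4_1)
 #define JPH_USE_SSE4_1
#endif
#if defined(__SSE4_2__) && !defined(JPH_USE_SSE4_2)
 #define JPH_USE_SSE4_2
#endif
#if defined(__AVX__) && !defined(JPH_USE_AVX)
 #define JPH_USE_AVX
#endif
#if defined(__AVX2__) && !defined(JPH_USE_AVX2)
 #define JPH_USE_AVX2
#endif
#if defined(__F16C__) && !defined(JPH_USE_F16C)
 #define JPH_USE_F16C
#endif
#if defined(__LZCNT__) && !defined(JPH_USE_LZCNT)
 #define JPH_USE_LZCNT
#endif
#if defined(__BMI__) && !defined(JPH_USE_TZCNT) // assumption: BMI implies TZCNT support
 #define JPH_USE_TZCNT
#endif

// Hypothetical helper: lists the features that ended up enabled.
inline std::string EnabledFeatures()
{
    std::string result;
#ifdef JPH_USE_SSE
    result += "SSE ";
#endif
#ifdef JPH_USE_SSE4_1
    result += "SSE4_1 ";
#endif
#ifdef JPH_USE_SSE4_2
    result += "SSE4_2 ";
#endif
#ifdef JPH_USE_AVX
    result += "AVX ";
#endif
#ifdef JPH_USE_AVX2
    result += "AVX2 ";
#endif
#ifdef JPH_USE_F16C
    result += "F16C ";
#endif
#ifdef JPH_USE_LZCNT
    result += "LZCNT ";
#endif
#ifdef JPH_USE_TZCNT
    result += "TZCNT ";
#endif
    return result;
}
```

With this pattern the feature set follows whatever the compiler flags advertise, instead of being hard-coded per architecture.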

jrouwe (Owner) commented Jan 11, 2024

Hello,

It's a cool project for sure, but I'm not sure if I want to support this in Jolt. I have no idea how widespread these CPUs are (I had never heard of them).

Looking at the change, isn't it possible to support it without having a JPH_CPU_E2K define? Since E2K is an x86 CPU (and I'm hoping that the compiler defines the relevant define for it), I think you can achieve the same by just disabling the relevant instruction sets by adding one or more of the following defines to cmake:

-DUSE_SSE4_1=OFF -DUSE_SSE4_2=OFF -DUSE_AVX=OFF -DUSE_AVX2=OFF -DUSE_AVX512=OFF -DUSE_LZCNT=OFF -DUSE_TZCNT=OFF -DUSE_F16C=OFF -DUSE_FMADD=OFF

NicSavichev (Contributor Author) commented Jan 12, 2024

> Hello,
>
> It's a cool project for sure, but I'm not sure if I want to support this in Jolt. I have no idea how widespread these CPUs are (I had never heard of them).
>
> Looking at the change, isn't it possible to support it without having a JPH_CPU_E2K define? Since E2K is an x86 CPU (and I'm hoping that the compiler defines the relevant define for it), I think you can achieve the same by just disabling the relevant instruction sets by adding one or more of the following defines to cmake:
>
> -DUSE_SSE4_1=OFF -DUSE_SSE4_2=OFF -DUSE_AVX=OFF -DUSE_AVX2=OFF -DUSE_AVX512=OFF -DUSE_LZCNT=OFF -DUSE_TZCNT=OFF -DUSE_F16C=OFF -DUSE_FMADD=OFF

Unfortunately, E2K is not x86 (or anything close to it); it is a VLIW (very long instruction word) architecture.
The compiler simply converts SSE intrinsics into an equivalent sequence of native instructions at compile time.
I will try adding the defines you suggested, but I think I have tried that before, and it will not work since none of these

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#elif defined(__aarch64__) || defined(_M_ARM64) || defined(__arm__) || defined(_M_ARM)
#elif defined(JPH_PLATFORM_WASM)

are defined, and they CANNOT be defined globally for this arch.
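A detection chain with a dedicated e2k branch could be sketched as follows. This is an illustration rather than the merged code: the `EXAMPLE_*` names and `ExampleCpuName` are hypothetical, `__e2k__` is assumed to be the macro LCC predefines for this target, and the WASM branch is omitted for brevity.

```cpp
#include <string>

// Hypothetical detection chain; EXAMPLE_* names are made up for this sketch.
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
    #define EXAMPLE_CPU_NAME "x86"
#elif defined(__aarch64__) || defined(_M_ARM64) || defined(__arm__) || defined(_M_ARM)
    #define EXAMPLE_CPU_NAME "arm"
#elif defined(__e2k__) // assumption: predefined by LCC when targeting Elbrus 2000
    #define EXAMPLE_CPU_NAME "e2k"
    #define EXAMPLE_USE_SSE // SSE intrinsics are available through the compiler, not the CPU
#else
    #define EXAMPLE_CPU_NAME "unknown"
#endif

// Returns the architecture name chosen at compile time.
inline std::string ExampleCpuName()
{
    return EXAMPLE_CPU_NAME;
}
```

Keeping e2k in its own branch avoids pretending to be x86 while still opting in to the SSE code path.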

NicSavichev (Contributor Author) commented

> It's a cool project for sure, but I'm not sure if I want to support this in Jolt. I have no idea how widespread these CPUs are (I had never heard of them).

If you would rather not merge this, just say so: I will abandon the PR and keep maintaining the modified JoltPhysics source code in the DagorEngine project. I don't think any maintenance is required on your side. At most I will occasionally send patches if the code stops compiling after some large change, which I don't expect because the core headers do not change frequently.

@jrouwe jrouwe merged commit 106a35e into jrouwe:master Jan 12, 2024
68 checks passed
jrouwe (Owner) commented Jan 12, 2024

Ok, I've merged it. It is indeed unlikely to get in the way.

How common is this CPU?

NicSavichev (Contributor Author) commented

> How common is this CPU?

It is a CPU developed in Russia; development started in the 2000s, and the 6th generation is now available.
It used to be rather uncommon, but it is now targeting the mass market (at least in Russia), with home workstations being built around this CPU.
https://en.wikipedia.org/wiki/Elbrus_2000
