Added Elbrus 2000 architecture #3425

EntityFX · 2021-04-12T14:02:03Z

e2k (Elbrus 2000) is a VLIW/EPIC architecture, like the Intel Itanium (IA-64) architecture.
The architecture has half-native, half-software support for most Intel/AMD SIMD extensions (e.g. MMX/SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2/AES/AVX/AVX2 & 3DNow!/SSE4a/XOP/FMA4) via intrinsics.

https://en.wikipedia.org/wiki/Elbrus_2000

Cpu Info (Elbrus 8CB):

processor       : 0
vendor_id       : E8C
cpu family      : 4
model           : 9
model name      : E8C2
revision        : 2
cpu MHz         : 1549.910240
cache0          : level=1 type=Instruction scope=Private size=128K line_size=256 associativity=4
cache1          : level=1 type=Data scope=Private size=64K line_size=32 associativity=4
cache2          : level=2 type=Unified scope=Private size=512K line_size=64 associativity=4
cache3          : level=3 type=Unified scope=Private size=16384K line_size=64 associativity=16
bogomips        : 3100.46

..


processor       : 7
vendor_id       : E8C
cpu family      : 4
model           : 9
model name      : E8C2
revision        : 2
cpu MHz         : 1549.910240
cache0          : level=1 type=Instruction scope=Private size=128K line_size=256 associativity=4
cache1          : level=1 type=Data scope=Private size=64K line_size=32 associativity=4
cache2          : level=2 type=Unified scope=Private size=512K line_size=64 associativity=4
cache3          : level=3 type=Unified scope=Private size=16384K line_size=64 associativity=16
bogomips        : 3099.84

Compiler: Elbrus C Compiler (in code: LCC), MCST's own compiler with GCC compatibility (not the Little C Compiler)

Architecture name: E2k (in code: e2k)

Architecture versions: elbrus-v2 (generic), elbrus-v3 (e2k-v3), elbrus-v4 (e2k-v4), elbrus-v5 (e2k-v5), elbrus-v6 (e2k-v6)

EntityFX · 2021-04-12T14:03:19Z

Benchmark results:

Intel(R) Core(TM)2 Duo CPU     T9400  @ 2.53GHz (2 threads)

===========================
Total time (ms) : 674701
Nodes searched  : 846305220
Nodes/second    : 1254341


Celeron(R) CPU N3350 @ 1.10GHz (2 threads)

===========================
Total time (ms) : 899814
Nodes searched  : 830772446
Nodes/second    : 923271


Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz (8 threads)
===========================
Total time (ms) : 5307
Nodes searched  : 57637842
Nodes/second    : 10860720


MCST Elbrus 8CB @ 1.55 GHz (8 threads)

===========================
Total time (ms) : 502943
Nodes searched  : 1570786593
Nodes/second    : 3123190


MCST Elbrus 8C @ 1.2 GHz (1 CPU 8 threads)

===========================
Total time (ms) : 26857
Nodes searched  : 47084162
Nodes/second    : 1753143


MCST Elbrus 8C @ 1.2 GHz (4 CPU 32 threads)
===========================
Total time (ms) : 14678
Nodes searched  : 105723120
Nodes/second    : 7202828

Sopel97 · 2021-04-12T14:19:13Z

what's the difference with just using an x86-64 compile?

EntityFX · 2021-04-12T14:25:23Z

what's the difference with just using an x86-64 compile?

Different compiler, different optimization levels. -O4 with LCC on E2K corresponds to -O3 with GCC on x86-64, and the -march flag (elbrus-v2, elbrus-v3, elbrus-v4, elbrus-v5, elbrus-v6) is needed to tune the binary for a specific architecture version.

Sopel97 · 2021-04-12T14:32:08Z

sure, but elbrus can run x86-64 code. How much better is this patch?

EntityFX · 2021-04-12T14:42:26Z

sure, but elbrus can run x86-64 code. How much better is this patch?

It can run x86-64 in emulation mode, but this patch brings e2k native support.

MichaelB7 · 2021-04-13T03:32:32Z

sure, but elbrus can run x86-64 code. How much better is this patch?

It can run x86-64 in emulation mode, but this patch brings e2k native support.

Can you answer the question posed by @Sopel97 , what are the bench numbers for this patch before and after?

EntityFX · 2021-04-13T11:17:28Z

sure, but elbrus can run x86-64 code. How much better is this patch?

It can run x86-64 in emulation mode, but this patch brings e2k native support.

Can you answer the question posed by @Sopel97 , what are the bench numbers for this patch before and after?

./stockfish bench 256 8 16

In ARCH=x86-64 compilation mode results are:

Total time (ms) : 26857
Nodes searched  : 47084162
Nodes/second    : 1753143

In ARCH=e2k-v4 compilation mode results are:

Total time (ms) : 22904
Nodes searched  : 46801912
Nodes/second    : 2043394

g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -m64 -DUSE_PTHREADS -DNDEBUG -O4 -DIS_64BIT -DNO_PREFETCH -msse3 -mpopcnt -DUSE_POPCNT -march=elbrus-v4 -DUSE_SSE2 -msse2 -flto
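
To put the two runs above side by side, here is a minimal sketch of the arithmetic. It assumes nodes/second is computed as the integer quotient 1000 * nodes / elapsed milliseconds; that formula is an assumption, although it does reproduce both figures quoted above. The function names are hypothetical and not part of the patch.

```cpp
#include <cstdint>

// Nodes per second from a node count and an elapsed time in milliseconds.
// Assumption: integer division, which matches the numbers in the bench output above.
constexpr std::int64_t nodes_per_second(std::int64_t nodes, std::int64_t elapsed_ms) {
    return 1000 * nodes / elapsed_ms;
}

// Relative throughput of a candidate build over a baseline build.
constexpr double speedup(std::int64_t baseline_nps, std::int64_t candidate_nps) {
    return static_cast<double>(candidate_nps) / static_cast<double>(baseline_nps);
}

// ARCH=x86-64 run: 47084162 nodes in 26857 ms.
static_assert(nodes_per_second(47084162, 26857) == 1753143, "x86-64 figure");
// ARCH=e2k-v4 run: 46801912 nodes in 22904 ms.
static_assert(nodes_per_second(46801912, 22904) == 2043394, "e2k-v4 figure");
```

With these inputs, speedup(1753143, 2043394) comes out between 1.16 and 1.17, so the e2k-v4 build searches roughly 16-17% more nodes per second than the x86-64 build in this particular run.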

snicolet · 2021-04-14T13:40:38Z

Thanks for the patch, good to discover the Elbrus processor brand :-)

I would need at least the default bench number to check that it is the same as Intel/arm etc. What is the result of this command ?

./stockfish bench > /dev/null

snicolet · 2021-04-14T13:46:08Z

Another remark is that the changes to the makefile and to the output of make help are quite invasive, especially because of all the variants.

Is there a way to use a generic flag for the compiler which would say "use the best available variant for the current machine", or something like that?

vondele · 2021-04-18T14:22:58Z

@EntityFX can you report the output of ./stockfish bench 256 1 16 so we can see the result matches that of other architectures?

Honestly, I'm wondering if we should support this, I think it is a really rare architecture. Do you have any projects based on this code that would run on this architecture? Just trying to get a feel for it.

EntityFX · 2021-04-19T17:51:33Z

@EntityFX can you report the output of ./stockfish bench 256 1 16 so we can see the result matches that of other architectures?

Honestly, I'm wondering if we should support this, I think it is a really rare architecture. Do you have any projects based on this code that would run on this architecture? Just trying to get a feel for it.

Elbrus-8CB (E8C2) @ 1.55 GHz

===========================
Total time (ms) : 42845
Nodes searched  : 16013252
Nodes/second    : 373748

EntityFX · 2021-04-19T18:07:55Z

@EntityFX can you report the output of ./stockfish bench 256 1 16 so we can see the result matches that of other architectures?

Honestly, I'm wondering if we should support this, I think it is a really rare architecture. Do you have any projects based on this code that would run on this architecture? Just trying to get a feel for it.

https://en.wikipedia.org/wiki/Elbrus-8S

ilyakurdyukov · 2021-05-08T10:52:06Z

(in code: LCC) (own MCST's compiler with GCC compatibility

Not entirely its own: it uses the frontend from EDG (Edison Design Group). You can check that with the predefined __EDG_VERSION__ macro.

ifeq ($(findstring -v5,$(ARCH)),-v5)
sse41 = yes
endif

You can assume that elbrus-v2 is legacy hardware and deprecated. Starting from elbrus-v3 you can assume that SSE4.1 is supported. It also supports MMX intrinsics, which can be used in places where you don't need 128-bit wide vectors. 128-bit vectors on Elbrus before v5 are emulated using two 64-bit vector instructions. Even AVX/AVX2 is emulated that way, and is enabled by default in LCC (as if -mavx2 were given).

ilyakurdyukov · 2021-05-08T11:02:37Z

#elif __GNUC__
compiler += "g++ (GNUC) ";
compiler += make_version_string(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#elif __LCC__ && !defined(_WIN32)
compiler += "Elbrus C Compiler";

I think this is unnecessary and it is also wrong, because __GNUC__ is defined by the LCC and the GNUC case will always be picked first. Overall, I think the Elbrus support patch could be much smaller (no need for an exact e2k version selector).

Is there a way to use a generic flag for the compiler which would say "use the best available variant for the current machine", or something like that?

It always happens automatically, no flag to compiler needed at all (only if you want to cross-compile), so in my opinion the build system shouldn't care about that.

... || (defined(__LCC__) && !defined(_WIN32))
#define POSIXALIGNEDALLOC

Use || defined(__e2k__).
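
As a rough illustration of what such a guard might select, the sketch below keys a posix_memalign path on __e2k__ alone. The elided part of the original condition is deliberately left out, and the names SKETCH_POSIX_ALIGNED_ALLOC, sketch_aligned_alloc and sketch_aligned_free are hypothetical, not the ones used in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc / _aligned_free
#endif

// Assumption: e2k builds take the POSIX path, as suggested above.
#if defined(__e2k__)
#define SKETCH_POSIX_ALIGNED_ALLOC
#endif

// Allocates `size` bytes aligned to `alignment`, which must be a power of two
// and a multiple of sizeof(void*). Returns nullptr on failure.
inline void* sketch_aligned_alloc(std::size_t alignment, std::size_t size) {
#if defined(SKETCH_POSIX_ALIGNED_ALLOC)
    void* mem = nullptr;
    return posix_memalign(&mem, alignment, size) == 0 ? mem : nullptr;
#elif defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    // std::aligned_alloc expects the size to be a multiple of the alignment.
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
#endif
}

// Releases memory obtained from sketch_aligned_alloc.
inline void sketch_aligned_free(void* ptr) {
#if defined(_WIN32)
    _aligned_free(ptr);
#else
    std::free(ptr);  // valid for both posix_memalign and std::aligned_alloc
#endif
}
```

Keying on the architecture macro rather than on the compiler macro keeps the guard independent of which GCC-compatible front end happens to be in use.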

And if you really need to print exact LCC version, then use #elif defined(__e2k__) && defined(__LCC__) before #elif __GNUC__. Use __LCC__ and __LCC_MINOR__ predefined macros, if __LCC__ is 125 and __LCC_MINOR__ is 9, then it's "LCC 1.25.09".

    #define dot_ver2(n) \
      compiler += (char)'.'; \
      compiler += (char)('0' + (n) / 10); \
      compiler += (char)('0' + (n) % 10);
    compiler += "LCC ";
    compiler += std::to_string(__LCC__ / 100);
    dot_ver2(__LCC__ % 100)
    dot_ver2(__LCC_MINOR__)
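
The same formatting can also be written as a plain function, which is easier to check in isolation. This is only a sketch of the idea described above: lcc_version_string and compiler_name are hypothetical names, and the strings returned for non-LCC compilers are placeholders.

```cpp
#include <string>

// Formats the values of __LCC__ and __LCC_MINOR__ as "LCC <major>.<NN>.<MM>".
// Assumes lcc % 100 and lcc_minor are both in the range 0..99.
inline std::string lcc_version_string(int lcc, int lcc_minor) {
    auto two_digits = [](int n) {
        std::string s;
        s += static_cast<char>('0' + n / 10);
        s += static_cast<char>('0' + n % 10);
        return s;
    };
    return "LCC " + std::to_string(lcc / 100) + "." + two_digits(lcc % 100) + "."
           + two_digits(lcc_minor);
}

// The LCC branch has to come before the __GNUC__ branch, because LCC also
// defines __GNUC__ and would otherwise be reported as GCC.
inline std::string compiler_name() {
#if defined(__e2k__) && defined(__LCC__)
    return lcc_version_string(__LCC__, __LCC_MINOR__);
#elif defined(__GNUC__)
    return "GCC-compatible compiler";  // placeholder text
#else
    return "unknown compiler";         // placeholder text
#endif
}
```

For example, lcc_version_string(125, 9) yields "LCC 1.25.09", matching the mapping described above.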

ifeq ($(findstring e2k,$(ARCH)),e2k)
CXXFLAGS += -O4

Don't use -O4, it generates too much binary code. MCST is also going to change -O4 to -O3, -O3 to -O2, so less tuning is required for various software build systems. Therefore, I also recommend removing this change.

noobpwnftw · 2021-05-08T14:12:34Z

Considering that the development of the architecture has come a long way and is still quite active, I guess adding compiler support is more or less equivalent to supporting Apple M1 for benchmark purposes rather than for practical use, which we already did, so I'm for it.

vondele · 2021-05-08T14:57:10Z

I suggest the patch can be updated to include the review comments, and made as small as meaningful, e.g. drop some of the legacy architectures.

EntityFX · 2021-05-11T10:28:09Z

Thank you, I've updated my code according to your recommendations, please review.

snicolet · 2021-05-11T10:44:39Z

Patch looks much better now, thanks!

ilyakurdyukov · 2021-05-11T14:51:35Z

Fine.

LCC 1.25.09 failed to compile due to constexpr related errors, newer LCC 1.25.15 compiles without errors.

$ ./stockfish compiler
Stockfish 110521 by the Stockfish developers (see AUTHORS file)

Compiled by MCST LCC (version 1.25.15) on Linux
Compilation settings include:  64bit SSE41 SSSE3 SSE2 POPCNT MMX
__VERSION__ macro expands to: 7.3

vondele · 2021-05-11T17:28:20Z

@EntityFX let me know your full name for the AUTHORS file, if you wish.

vondele · 2021-05-11T17:45:46Z

using your name as inferred from the email address for now, let me know if there is a mistake.

EntityFX · 2021-05-11T20:34:21Z

@EntityFX let me know your full name for the AUTHORS file, if you wish.

EntityFX artem.solopiy@gmail.com

vondele · 2021-05-12T07:49:37Z

thanks that's what I used.

BTW, since we don't have access to the hardware and no CI on it, I would appreciate it if you could check from time to time if things still work for you.

e2k (Elbrus 2000) is a VLIW/EPIC architecture, like the Intel Itanium (IA-64) architecture. The architecture has half native / half software support for most Intel/AMD SIMD (e.g. MMX/SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2/AES/AVX/AVX2 & 3DNow!/SSE4a/XOP/FMA4) via intrinsics.

https://en.wikipedia.org/wiki/Elbrus_2000

closes official-stockfish#3425

No functional change

EntityFX added 5 commits April 12, 2021 13:42

E2K: added initial support of MCST Elbrus 2000 CPU architecture

48f0ef9

Use 64 bit, added help

630c947

Elbrus 2000 optimization flags fixes

fec1dbc

Use optimization for any Elbrus version

9ea49fe

Added Elbrus 2000 different versions support

6f12c60

Corrected compilation for Elbrus 2000

d4bc8a9

vondele added the "to be merged" label May 11, 2021

vondele closed this in b62af7a May 11, 2021
