Solve compile errors on Raspbian buster #2641

Closed
mayweed opened this issue Apr 18, 2020 · 20 comments

mayweed commented Apr 18, 2020

Hello,
If anyone encounters the same compile errors while building Stockfish on Raspbian Buster:

/usr/bin/ld: /tmp/cciyPLsk.ltrans0.ltrans.o: in function `ThreadPool::start_thinking(Position&, std::unique_ptr<std::deque<StateInfo, std::allocator<StateInfo> >, std::default_delete<std::deque<StateInfo, std::allocator<StateInfo> > > >&, Search::LimitsType const&, bool) [clone .constprop.62]':
<artificial>:(.text+0x3e34): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x3e48): undefined reference to `__atomic_store_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans0.ltrans.o: in function `TimeManagement::elapsed() const [clone .isra.105] [clone .constprop.36]':
<artificial>:(.text+0x6294): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans0.ltrans.o: in function `Value (anonymous namespace)::search<((anonymous namespace)::NodeType)1>(Position&, Search::Stack*, Value, Value, int, bool) [clone .constprop.31]':
<artificial>:(.text+0x6f9c): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x7108): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x7cbc): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: <artificial>:(.text+0x7da8): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x8278): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x8390): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x885c): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans5.ltrans.o: in function `dbg_print()':
<artificial>:(.text+0x8e44): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x8e58): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x8e84): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x8eb0): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x8edc): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans5.ltrans.o:<artificial>:(.text+0x8ef0): more undefined references to `__atomic_load_8' follow
/usr/bin/ld: /tmp/cciyPLsk.ltrans3.ltrans.o: in function `Thread::search()':
<artificial>:(.text+0x45e8): undefined reference to `__atomic_store_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans3.ltrans.o: in function `Position::do_move(Move, StateInfo&, bool)':
<artificial>:(.text+0x74a8): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans3.ltrans.o: in function `MainThread::search()':
<artificial>:(.text+0xa578): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xa698): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xa920): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xa954): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xaab8): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xb090): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xb138): undefined reference to `__atomic_store_8'
/usr/bin/ld: /tmp/cciyPLsk.ltrans3.ltrans.o: in function `Value (anonymous namespace)::search<((anonymous namespace)::NodeType)0>(Position&, Search::Stack*, Value, Value, int, bool) [clone .lto_priv.281]':
<artificial>:(.text+0xd748): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xd810): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xeb28): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xf0a0): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: <artificial>:(.text+0xf248): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xf2ec): undefined reference to `__atomic_load_8'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:541: stockfish] Error 1
make[1]: Leaving directory '/home/pi/build/Stockfish-fishnet-180120/src'
make: *** [Makefile:458: build] Error 2

I solved it with these commands:

$ sudo apt-get install gcc-5 g++-5
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 60 --slave /usr/bin/g++ g++ /usr/bin/g++-5
$ sudo update-alternatives --set gcc "/usr/bin/gcc-5"

And it finally compiles...

Dantist commented Apr 18, 2020

I can confirm that it is compiling well on Raspbian Stretch (GCC 6.3) and failing on Raspbian Buster (GCC 8.3).

The solution is pretty straightforward.
You can still build with GCC 8 by adding -latomic to the linker flags:

make build ARCH=armv7 EXTRALDFLAGS=-latomic

I don't know whether there are any drawbacks.
I think this flag could be added to the Makefile for the ARM architectures.

P.S. On the now-obsolete Raspberry Pi 1, the executable produced by GCC 8 (with -latomic) is 23% bigger and 9% slower than the one produced by GCC 6 (tested on SF10). But it still hits 23 kNps ))

Correction:
I forgot to strip the GCC 8 executable, so the conclusion about the bigger size was wrong. GCC 8 (with -latomic) and GCC 6 (with -latomic) are pretty much the same in terms of file size and performance.

-latomic itself is what causes the performance drop (~3.6% on an RPi 1, averaged over 16 bench runs at different depths).

Update 2: GCC on Raspbian Buster:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/8/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Raspbian 8.3.0-6+rpi1' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 8.3.0 (Raspbian 8.3.0-6+rpi1)

Member

vondele commented Apr 18, 2020

can -latomic also be used together with gcc 6.3 on raspbian stretch?

Member

vondele commented Apr 18, 2020

some related gcc issues : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358

Dantist commented Apr 18, 2020

> can -latomic also be used together with gcc 6.3 on raspbian stretch?

Just tested. Compiled fine on GCC 6.3 with and without -latomic.
The produced executables are the same in terms of file size, but performance differs (the checksums are different too, as expected).
I've updated my statement regarding performance in a previous comment.

Best runs:

$ ./stockfish-11-gcc6-latomic bench
===========================
Total time (ms) : 247409
Nodes searched  : 5156767
Nodes/second    : 20843
$ ./stockfish-11-gcc6 bench
===========================
Total time (ms) : 238671
Nodes searched  : 5156767
Nodes/second    : 21606
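
As a cross-check on the two runs above, here is a small C++ sketch that recomputes the speed figures. It assumes that the reported Nodes/second is nodes × 1000 / time in milliseconds with integer division, which is consistent with the numbers printed here; `nodes_per_second` and `slowdown_percent` are hypothetical helper names, not part of Stockfish.

```cpp
#include <cassert>
#include <cstdint>

// Nodes per second from a node count and the elapsed time in milliseconds.
// Integer division is an assumption that matches the bench output above.
std::uint64_t nodes_per_second(std::uint64_t nodes, std::uint64_t elapsed_ms) {
    assert(elapsed_ms > 0);
    return nodes * 1000 / elapsed_ms;
}

// How much slower `slow_nps` is than `fast_nps`, in percent of `fast_nps`.
double slowdown_percent(std::uint64_t slow_nps, std::uint64_t fast_nps) {
    assert(fast_nps > 0);
    return 100.0 * (static_cast<double>(fast_nps) - static_cast<double>(slow_nps))
                 / static_cast<double>(fast_nps);
}
```

With these two best runs, `nodes_per_second(5156767, 247409)` gives 20843 and `nodes_per_second(5156767, 238671)` gives 21606, so the -latomic build is roughly 3.5% slower, close to the ~3.6% average quoted earlier.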

Hardware (of Raspberry Pi 1):

$ lscpu
Architecture:          armv6l
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
Model:                 7
Model name:            ARMv6-compatible processor rev 7 (v6l)
CPU max MHz:           700.0000
CPU min MHz:           700.0000
BogoMIPS:              697.95
Flags:                 half thumb fastmult vfp edsp java tls
$ cat /proc/cpuinfo
processor       : 0
model name      : ARMv6-compatible processor rev 7 (v6l)
BogoMIPS        : 697.95
Features        : half thumb fastmult vfp edsp java tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xb76
CPU revision    : 7

Hardware        : BCM2835
Revision        : 000e
Serial          : 00000000b346df22

GCC on Raspbian Stretch:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/6/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Raspbian 6.3.0-18+rpi1+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 6.3.0 20170516 (Raspbian 6.3.0-18+rpi1+deb9u1)

@MichaelB7
Contributor

Everything here compiles fine with my custom Makefile on the Honey fork on both 32-bit and 64-bit Buster (with different flags, of course) using GCC 8.3. The atomic flag is required.
GCC 10.1 is currently problematic on 64-bit Buster, but that appears to be a distro still in development. GCC 10.1 works fine on 32-bit Buster. Tested on both RP3+ and RP4. The RP4 now approaches 900K nps classical and 330K nps NNUE, single core.

Contributor

gsobala commented Aug 13, 2020

An update on compilation: on a Raspberry Pi 4 running 64-bit Gentoo, a profile-build NNUE binary gets about 850,000 nps when running on all four cores. That is only 36% of the speed of classical, but fast enough to be about 70 Elo stronger at STC. Compilation is straightforward: `make -j profile-build ARCH=armv8` compiles "out of the box". The gcc version is 10.1.

Dantist commented Aug 14, 2020

Just wanted to say that:
@mayweed most probably had armv7 hardware (RPi 2).
I had armv6 hardware (RPi 1), but have had no issues compiling with ARCH=armv7 for years.
@gsobala and @MichaelB7 both have armv8 hardware (RPi 3, RPi 4).

I will check the newest master on an RPi 1 this weekend, but if someone can check it earlier, feel free to give it a try.

Member

vondele commented Aug 14, 2020

I'm trying to understand the issue. For me, the issue would be resolved if we can compile 'out of the box' with the unmodified Makefile. Is this now the case (after the NNUE merge and the required compiler versions > 6), or not yet?

Dantist commented Aug 14, 2020

OK, I have tested now :-)

  • There is no armv6 ARCH in the Makefile, so a plain make build fails because the default ARCH is x86-64-modern.
  • As I mentioned above, I was able to compile Stockfish with ARCH=armv7 on my armv6 hardware for years (until Raspbian Buster with gcc 8.3 was released) :-)

I have now tried to build again on Raspbian Buster (gcc 8.3) on an RPi 1 (armv6). It fails as in the first message, and a few new warnings from the NNUE code have appeared:

pi@pi:~/sf_nnue/src $ make build ARCH=armv7

Config:
debug: 'no'
sanitize: 'no'
optimize: 'yes'
arch: 'armv7'
bits: '32'
kernel: 'Linux'
os: 'GNU/Linux'
prefetch: 'yes'
popcnt: 'no'
sse: 'no'
ssse3: 'no'
sse41: 'no'
avx2: 'no'
pext: 'no'
avx512: 'no'
vnni: 'no'
neon: 'no'

Flags:
CXX: g++
CXXFLAGS: -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto
LDFLAGS:  -Wl,--no-as-needed -lpthread -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto -flto=jobserver

Testing config sanity. If this fails, try 'make help' ...

make ARCH=armv7 COMP=gcc all
make[1]: Entering directory '/home/pi/sf_nnue/src'
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o benchmark.o benchmark.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o bitbase.o bitbase.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o bitboard.o bitboard.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o endgame.o endgame.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o evaluate.o evaluate.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o main.o main.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o material.o material.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o misc.o misc.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o movegen.o movegen.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o movepick.o movepick.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o pawns.o pawns.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o position.o position.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o psqt.o psqt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o search.o search.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o thread.o thread.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o timeman.o timeman.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o tt.o tt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o uci.o uci.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o ucioption.o ucioption.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o tune.o tune.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o tbprobe.o syzygy/tbprobe.cpp
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o evaluate_nnue.o nnue/evaluate_nnue.cpp
nnue/evaluate_nnue.cpp: In function ‘Value Eval::NNUE::ComputeScore(const Position&, bool)’:
nnue/evaluate_nnue.cpp:135:61: warning: requested alignment 64 is larger than 8 [-Wattributes]
         transformed_features[FeatureTransformer::kBufferSize];
                                                             ^
nnue/evaluate_nnue.cpp:137:61: warning: requested alignment 64 is larger than 8 [-Wattributes]
     alignas(kCacheLineSize) char buffer[Network::kBufferSize];
                                                             ^
g++ -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto   -c -o half_kp.o nnue/features/half_kp.cpp
g++ -o stockfish benchmark.o bitbase.o bitboard.o endgame.o evaluate.o main.o material.o misc.o movegen.o movepick.o pawns.o position.o psqt.o search.o thread.o timeman.o tt.o uci.o ucioption.o tune.o tbprobe.o evaluate_nnue.o half_kp.o  -Wl,--no-as-needed -lpthread -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto -flto=jobserver
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans0.ltrans.o: in function `ThreadPool::start_thinking(Position&, std::unique_ptr<std::deque<StateInfo, std::allocator<StateInfo> >, std::default_delete<std::deque<StateInfo, std::allocator<StateInfo> > > >&, Search::LimitsType const&, bool) [clone .constprop.49]':
<artificial>:(.text+0x858c): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x85a8): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x85bc): undefined reference to `__atomic_store_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans0.ltrans.o: in function `TimeManagement::elapsed() const [clone .isra.75] [clone .constprop.37]':
<artificial>:(.text+0x948c): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans0.ltrans.o: in function `Value (anonymous namespace)::search<((anonymous namespace)::NodeType)1>(Position&, Search::Stack*, Value, Value, int, bool) [clone .constprop.34]':
<artificial>:(.text+0xa184): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xa32c): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: <artificial>:(.text+0xa558): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xacec): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: <artificial>:(.text+0xb074): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0xb4d4): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0xb580): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans4.ltrans.o: in function `dbg_print()':
<artificial>:(.text+0x3c40): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x3c54): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x3c88): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x3cc0): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans4.ltrans.o:<artificial>:(.text+0x3cf8): more undefined references to `__atomic_load_8' follow
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans2.ltrans.o: in function `Value (anonymous namespace)::search<((anonymous namespace)::NodeType)0>(Position&, Search::Stack*, Value, Value, int, bool) [clone .lto_priv.233]':
<artificial>:(.text+0x7218): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x7308): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x78e8): undefined reference to `__atomic_fetch_add_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans3.ltrans.o: in function `Thread::search()':
<artificial>:(.text+0x3cb8): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x3f38): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x564c): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x5670): undefined reference to `__atomic_store_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans3.ltrans.o: in function `MainThread::search()':
<artificial>:(.text+0x5fc8): undefined reference to `__atomic_load_8'
/usr/bin/ld: <artificial>:(.text+0x63bc): undefined reference to `__atomic_store_8'
/usr/bin/ld: <artificial>:(.text+0x63f8): undefined reference to `__atomic_load_8'
/usr/bin/ld: /tmp/ccNhu0Kx.ltrans3.ltrans.o: in function `Position::do_move(Move, StateInfo&, bool)':
<artificial>:(.text+0x8d54): undefined reference to `__atomic_fetch_add_8'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:699: stockfish] Error 1
make[1]: Leaving directory '/home/pi/sf_nnue/src'
make: *** [Makefile:594: build] Error 2
pi@pi:~/sf_nnue/src $

Adding -latomic still helps (no make clean between builds):

pi@pi:~/sf_nnue/src $ make build ARCH=armv7 EXTRALDFLAGS=-latomic

Config:
debug: 'no'
sanitize: 'no'
optimize: 'yes'
arch: 'armv7'
bits: '32'
kernel: 'Linux'
os: 'GNU/Linux'
prefetch: 'yes'
popcnt: 'no'
sse: 'no'
ssse3: 'no'
sse41: 'no'
avx2: 'no'
pext: 'no'
avx512: 'no'
vnni: 'no'
neon: 'no'

Flags:
CXX: g++
CXXFLAGS: -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto
LDFLAGS: -latomic -Wl,--no-as-needed -lpthread -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto -flto=jobserver

Testing config sanity. If this fails, try 'make help' ...

make ARCH=armv7 COMP=gcc all
make[1]: Entering directory '/home/pi/sf_nnue/src'
g++ -o stockfish benchmark.o bitbase.o bitboard.o endgame.o evaluate.o main.o material.o misc.o movegen.o movepick.o pawns.o position.o psqt.o search.o thread.o timeman.o tt.o uci.o ucioption.o tune.o tbprobe.o evaluate_nnue.o half_kp.o -latomic -Wl,--no-as-needed -lpthread -Wall -Wcast-qual -fno-exceptions -std=c++17  -pedantic -Wextra -Wshadow -DNDEBUG -O3 -flto -flto=jobserver
make[1]: Leaving directory '/home/pi/sf_nnue/src'
pi@pi:~/sf_nnue/src $ 

Member

vondele commented Aug 14, 2020

The new warnings are most likely harmless; we're requesting over-alignment. They should go away with gcc 9.

So is the need to add -latomic specific to armv7, or does it depend on the compiler version that happens to be available on that system? If we need -latomic for all armv7 builds, we could easily add it. I see that in the Android issue #2860 -latomic is also sometimes needed. If we need -latomic on armv7 for both Linux and Android, it is easy enough to add to the Makefile. If, however, the flag interferes with building on certain OS/compiler versions, it is more difficult.
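
For readers unfamiliar with the over-alignment behind the -Wattributes warnings: the NNUE code asks for a stack buffer aligned more strictly than the target's default maximum, which GCC 8 on this ARM target reports it cannot guarantee. A minimal C++ sketch of the same pattern follows. The name `kCacheLineSize` is taken from the warning text and its value of 64 is assumed from the "requested alignment 64" message; the helper functions are hypothetical and not Stockfish code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed value, taken from the "requested alignment 64" warning.
constexpr std::size_t kCacheLineSize = 64;

// True if the pointer is a multiple of the cache line size.
bool is_cache_line_aligned(const void* p) {
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLineSize == 0;
}

// Requests an over-aligned stack buffer, as the NNUE code in the warning
// does, and reports whether the compiler honoured the request.
bool stack_buffer_is_aligned() {
    alignas(kCacheLineSize) char buffer[256];
    buffer[0] = 0;  // touch the buffer so it is actually materialised
    return is_cache_line_aligned(buffer);
}
```

On a toolchain that supports the requested alignment for automatic variables, `stack_buffer_is_aligned()` returns true; the warning indicates that the compiler on this target could only promise 8 bytes.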

Member

vondele commented Aug 14, 2020

So, from reading up a bit, I think it would be fine to add -latomic to the linker flags for armv7. The library seems to be needed whenever certain atomic operations are not supported by the hardware, and it has the same name for both gcc and clang.
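
To illustrate the point about atomic operations: the three unresolved symbols in the linker output (`__atomic_load_8`, `__atomic_store_8`, `__atomic_fetch_add_8`) correspond to loads, stores and fetch-adds on 8-byte atomics. The C++ sketch below uses exactly those three operations. `NodeCounter` and `exercise_counter` are hypothetical names for illustration, not Stockfish code. On a target where the compiler has no inline instruction sequence for 64-bit atomics, GCC emits calls to these library functions instead, and libatomic provides them, which is why the link needs -latomic there.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a shared 64-bit counter such as a node count.
struct NodeCounter {
    std::atomic<std::uint64_t> nodes{0};

    // 8-byte atomic store
    void reset() { nodes.store(0, std::memory_order_relaxed); }

    // 8-byte atomic fetch-add; returns the new value
    std::uint64_t add(std::uint64_t n) {
        return nodes.fetch_add(n, std::memory_order_relaxed) + n;
    }

    // 8-byte atomic load
    std::uint64_t read() const { return nodes.load(std::memory_order_relaxed); }

    // False on targets where these operations need library support.
    bool lock_free() const { return nodes.is_lock_free(); }
};

// Small self-check that exercises all three operations.
std::uint64_t exercise_counter() {
    NodeCounter c;
    c.add(40);
    c.add(2);
    assert(c.read() == 42);
    return c.read();
}
```

Checking `NodeCounter{}.lock_free()` on a given board is one way to see whether that target falls back to the library for these operations.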

Dantist commented Aug 14, 2020

@vondele, thank you for taking care of this! I am not a C++ dev, so I can't help you with that and will leave it up to you :-) We could add it and test it in the wild to see whether any new complaints emerge. :-)

I want to remind you that:

  1. My actual hardware is armv6, but it seems that it's not relevant to the situation.
  2. -latomic isn't needed on gcc 6.3, but adding -latomic was safe and just dropped performance a little bit (3%).

Offtopic:
When you just clone the repo and try to execute make help, it prints an error (before printing the help itself):

pi@pi:~/sf_nnue/src $ make help
make: [Makefile:738: .depend] Error 1 (ignored)

To compile stockfish, type:
....

It also creates an empty .depend file in the directory and stops printing the error message on all subsequent make help runs. Also, an empty .depend file speeds up the start of the build (obviously). This seems to have nothing to do with the issue we are discussing; I just wanted to let you know in case you have not observed it before.

Member

vondele commented Aug 14, 2020

Interestingly, the error on make help is not visible here, so something seems to go wrong while generating the dependencies at that point. Maybe you can see what the error message is when you apply this change:

diff --git a/src/Makefile b/src/Makefile
index 38f607cb2..ba3564789 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -735,6 +735,6 @@ icc-profile-use:
        all
 
 .depend:
-       -@$(CXX) $(DEPENDFLAGS) -MM $(SRCS) > $@ 2> /dev/null
+       -@$(CXX) $(DEPENDFLAGS) -MM $(SRCS) > $@
 
 -include .depend

Dantist commented Aug 14, 2020

Here it is:

pi@pi:~/sf_nnue/src $ make help
g++: error: unrecognized command line option ‘-msse’; did you mean ‘-fdse’?
make: [Makefile:738: .depend] Error 1 (ignored)

To compile stockfish, type:
....

Member

vondele commented Aug 14, 2020

ah, clear, somehow it passes the DEPENDFLAGS flags to dependency generation and they make no sense on arm (since -msse is x86). I'll remove that as part of the patch.

vondele added a commit to vondele/Stockfish that referenced this issue Aug 14, 2020
Pass -latomic to the linker as this appears needed for gcc on armv7

Don't pass the target specific `-msse` to the dependency generation,
as this leads to an error with `make help`

based on discussion in official-stockfish#2641
and official-stockfish#2860

No functional change.
Member

vondele commented Aug 14, 2020

Maybe you can have a look if that would fix this issue: #3006

Dantist commented Aug 14, 2020

Definitely, this helped to build the binary on my setup. This also resolves the -msse issue. Thank you! :-)

P.S. I was eager to bench the NNUE on RPi 1 with PGO:

$ ./stockfish bench 16 1 13 default depth mixed >/dev/null
===========================
Total time (ms) : 305108
Nodes searched  : 3905447
Nodes/second    : 12800

$ ./stockfish bench 16 1 13 default depth classical >/dev/null
===========================
Total time (ms) : 196804
Nodes searched  : 4243037
Nodes/second    : 21559

$ ./stockfish bench 16 1 13 default depth NNUE >/dev/null
===========================
Total time (ms) : 503838
Nodes searched  : 4189131
Nodes/second    : 8314

profile-build vs build:

Classic, nps : 21559 vs 19024 (+13.3%)
NNUE,    nps :  8314 vs  6470 (+28.5%)

NNUE/classical ratio: 38.56
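
The percentages in this comment can be reproduced from the bench figures. The sketch below is a hedged illustration with hypothetical helper names (`percent_of`, `gain_percent`); it reads the NNUE/classical ratio as NNUE nps divided by classical nps, expressed in percent.

```cpp
#include <cassert>
#include <cmath>

// Share of `part` relative to `whole`, in percent.
double percent_of(double part, double whole) {
    assert(whole > 0.0);
    return 100.0 * part / whole;
}

// Relative gain of `after` over `before`, in percent.
double gain_percent(double after, double before) {
    assert(before > 0.0);
    return 100.0 * (after - before) / before;
}

// Round to two decimals, for display purposes only.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```

With the profile-build numbers, `percent_of(8314, 21559)` is about 38.56, `gain_percent(21559, 19024)` is about 13.3 and `gain_percent(8314, 6470)` is about 28.5, matching the figures listed above.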

Member

vondele commented Aug 16, 2020

actually that speed ratio has a typo, it is a more reasonable 3.856

Any chance you do a match between NNUE and Classical on the hardware to see what the Elo difference would be? Maybe you need a light-weight game manager (see e.g. https://github.com/lucasart/c-chess-cli)

This might be one of the few pieces of hardware where classical still outperforms NNUE, probably depending a bit on the TC.

Dantist commented Aug 18, 2020

> actually that speed ratio has a typo, it is a more reasonable 3.856

Oh, sure, there is a typo, but a different one: I meant 38.56% (profile-build) :-)

> Maybe you need a light-weight game manager

I communicate with Stockfish on the Pi via SSH; it works like a "Remote UCI Engine" on my local network, so I can even run it using cutechess-cli on my PC.

> Any chance you do a match between NNUE and Classical on the hardware

I will try to run cutechess-cli with noob_3moves.epd at STC, LTC and VLTC (with the TC adapted to this hardware, as fishtest does).
Please reply if I am doing something wrong in my test or if there are other cutechess-cli options that I need to set :-)

Update:
The scale factor is 74 for this hardware. It seems, though, that the TC should not be scaled at all when measuring the Elo difference on one particular piece of hardware. I will run the usual 10+0.1, 60+0.6, and 120+1.2.

Member

vondele commented Aug 18, 2020

I have updated master with what I believe is the best patch so far. There might/will still be issues, let's try to improve as a follow up. Thanks for the feedback and testing.

joergoster pushed a commit to joergoster/Stockfish-old that referenced this issue Aug 18, 2020
The easiest way to use the NDK in conjunction with this Makefile (tested on linux-x86_64):

1. Download the latest NDK (r21d) from Google from https://developer.android.com/ndk/downloads
2. Place and unzip the NDK in $HOME/ndk folder
3. Export the path variable e.g., `export PATH=$PATH:$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin`
4. cd to your Stockfish/src dir
5. Issue `make -j ARCH=armv8 COMP=ndk build`  (use `ARCH=armv7` or `ARCH=armv7-neon` for older CPUs)
6. Optionally `make -j ARCH=armv8 COMP=ndk strip`
7. That's all. Enjoy!

Improves support for Raspberry Pi (incomplete?) and for compiling on ARM in general

closes official-stockfish/Stockfish#3015

fixes official-stockfish/Stockfish#2860

fixes official-stockfish/Stockfish#2641

Support is still fragile as we're missing CI on these targets. Nevertheless tested with:

```bash
  # build crosses from ubuntu 20.04 on x86 to various arch/OS combos
  # tested with suitable packages installed
  # (build-essential, mingw-w64, g++-arm-linux-gnueabihf, NDK (r21d) from google)

  # cross to Android
  export PATH=$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin:$PATH
  make clean && make -j build ARCH=armv7         COMP=ndk  && make -j build ARCH=armv7 COMP=ndk strip
  make clean && make -j build ARCH=armv7-neon    COMP=ndk  && make -j build ARCH=armv7-neon COMP=ndk strip
  make clean && make -j build ARCH=armv8         COMP=ndk  && make -j build ARCH=armv8 COMP=ndk strip

  # cross to Raspberry Pi
  make clean && make -j build ARCH=armv7         COMP=gcc COMPILER=arm-linux-gnueabihf-g++
  make clean && make -j build ARCH=armv7-neon    COMP=gcc COMPILER=arm-linux-gnueabihf-g++

  # cross to Windows
  make clean && make -j build ARCH=x86-64-modern COMP=mingw
```

No functional change
lucabrivio pushed a commit to lucabrivio/Stockfish that referenced this issue Aug 18, 2020
Dantist pushed a commit to Dantist/Stockfish that referenced this issue Dec 22, 2020