Cross-compilation for RPi3 (armv7) fails on assembly #1465
This is a toolchain issue. Either the assembler doesn't support NEON dot product instructions, or the compiler emits wrong directives for the assembler.
I ran into this too, when building tensorflow-lite 2.5.0 with a crosstool-ng based toolchain I had built. I then used the suggested toolchain, and checked to see what the differences were for the file that has the error: My toolchain had gcc 8.4; the suggested toolchain has gcc 8.3. My toolchain was using binutils 2.36.1; the suggested toolchain was using binutils 2.32. I replaced the binutils in my toolchain with 2.32 and the xnnpack build succeeded with that. A bit of git bisect led me to this commit in binutils that changes the behavior of the assembler when it sees a file test.s:
GCC 8, 9 and 10 all generate similar assembly with the extension listed first, then the fpu directive, which causes the assembler to reject this input. So I think the workaround at the moment is either to switch to binutils 2.33.1, or maybe to temporarily disable the fpu reset behavior by patching the latest binutils. I'll try to report this to binutils.
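The directive ordering described above can be checked mechanically. The following is a minimal sketch, not something from the thread: it assumes the extension is emitted with the GNU as `.arch_extension` directive and the FPU with `.fpu`, and the sample input is hypothetical because the actual failing file is not shown here. The function name `find_fpu_after_extension` is made up for illustration.

```python
import re

# Assumption: the compiler output uses the GNU as ARM directives
# `.arch_extension` and `.fpu`. The exact lines of the failing file
# are not shown in this thread.
EXT_RE = re.compile(r"^\s*\.arch_extension\s+(\S+)")
FPU_RE = re.compile(r"^\s*\.fpu\s+(\S+)")


def find_fpu_after_extension(asm_text):
    """Return (extension, fpu, line_number) for every .fpu directive
    that appears after an earlier .arch_extension directive."""
    hits = []
    last_ext = None
    for lineno, line in enumerate(asm_text.splitlines(), start=1):
        ext = EXT_RE.match(line)
        if ext:
            last_ext = ext.group(1)
            continue
        fpu = FPU_RE.match(line)
        if fpu and last_ext is not None:
            hits.append((last_ext, fpu.group(1), lineno))
    return hits


# Hypothetical input, written only to exercise the check.
sample = """\
    .arch armv8.2-a
    .arch_extension dotprod
    .fpu neon-fp-armv8
"""

if __name__ == "__main__":
    for ext, fpu, lineno in find_fpu_after_extension(sample):
        print(f"line {lineno}: .fpu {fpu} follows .arch_extension {ext}")
```

Running this over the `.s` file that gcc produces with `-S` would show whether a given compiler emits the ordering that newer assemblers reject.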
@happyalu Thank you for investigating! I don't think it is possible to work around this issue on XNNPACK side, please report to binutils.
Thanks. Reported here. https://sourceware.org/bugzilla/show_bug.cgi?id=28078
@happyalu @Maratyszcza,
I went to the official Arm toolchain website and found that the vsdot asm instruction is only supported on armv8 or later. It seems the only option is to avoid the vectorized operations from XNNPACK just to get it to build, but then there will be no performance gain. Hope this helps you guys.
Cortex-A72 cores in Raspberry Pi 4 are ARMv8 cores.
Update from the upstream issue: This has now been fixed in gcc (master, and active versions), thanks to Richard Earnshaw!
Hi everyone, I get a similar error at some point when building python 3 on an ODROID XU4 with the command "python3 setup.py build". I am new to installing pytorch on hardware. Any ideas to get this fixed will be appreciated. Thanks
me too!!!
Until the next patch release of gcc is out, I'm not sure of any workaround besides using binutils <= 2.33.1.
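To see quickly whether a given toolchain falls under this workaround, the assembler's version banner can be compared against 2.33.1. The sketch below is only an illustration: it assumes the version is the last dotted number on the first line of the `as --version` output, and the function names are hypothetical. The thread only establishes that 2.32 worked and 2.36.1 failed with an unpatched gcc, so treat the cutoff as the poster's suggestion and not as a verified boundary.

```python
import re

# Newest binutils release suggested in this thread as safe with unpatched gcc.
LAST_UNAFFECTED = (2, 33, 1)


def parse_binutils_version(banner):
    """Extract (major, minor, patch) from an `as --version` banner.

    Assumption: the version is the last dotted number on the first line,
    e.g. "GNU assembler (GNU Binutils) 2.36.1".
    """
    first_line = banner.splitlines()[0] if banner else ""
    matches = re.findall(r"(\d+)\.(\d+)(?:\.(\d+))?", first_line)
    if not matches:
        raise ValueError("no version number found in banner")
    major, minor, patch = matches[-1]
    return (int(major), int(minor), int(patch or 0))


def newer_than_workaround(banner):
    """True if the assembler is newer than the suggested 2.33.1 cutoff."""
    return parse_binutils_version(banner) > LAST_UNAFFECTED


if __name__ == "__main__":
    # Banner strings below are illustrative, not copied from the thread.
    for banner in ("GNU assembler (GNU Binutils) 2.36.1",
                   "GNU assembler (GNU Binutils) 2.32"):
        print(banner, "->", newer_than_workaround(banner))
```

In practice the banner would come from running the cross assembler (for example via `subprocess.run` on the toolchain's `as` binary with `--version`); the binary name depends on the toolchain prefix.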
Simplified and cleaned build scripts. Cleaned unnecessary *.pc files. Unified toolchain (using the tensorflow one; mandatory because XNNPACK needs this specific toolchain, see: google/XNNPACK#1465 (comment)). All builds fine but untested on raspberry.
The TFLite-specific toolchain was incompatible with the opencv build. It kept failing with errors such as (I couldn't find any fix, despite googling everywhere): .../c++/8.3.0/ext/concurrence.h:122:34: error: '__PTHREAD_SPINS' was not declared in this scope __gthread_mutex_t _M_mutex = __GTHREAD_MUTEX_INIT; ^~~~~~~~~~~~~~~~~~~~ However, the Ubuntu 20.04 crossbuild-essential-armhf toolchain was also incompatible with the tensorflow lite build, due to the issue mentioned in: google/XNNPACK#1465 (comment). This fix is a hack: I use the Ubuntu 20.04 crossbuild-essential-armhf toolchain but replace the assembler with the one coming from the tensorflow lite toolchain (binutils 2.32). Luckily, all of this is done within the Docker image. Both builds, local and on raspberry, work and run now.
Well, I believe the patch has been applied to gcc>=9. I use Crosstool-NG to build an aarch64 toolchain with gcc 12.3 and binutils 2.29.1. But building Xnnpack still gives me the errors:
What works for me is: gcc>=9 and binutils>=2.34. This is somewhat unfortunate because it prevents me from choosing a glibc with a low version number, and consequently I cannot use Xnnpack on all client systems. Does anybody have experience with or thoughts about this issue that have not yet been mentioned in this thread?
@misterBart configure with
@Maratyszcza Yes, good that you mention that, because I used that as a workaround to build for low glibc systems. Still, using those flags felt like applying a workaround and made me wonder how future-proof that solution is.
You can't build software that uses recent CPU instructions with a compiler which doesn't support those instructions. The only solution is to disable the use of instructions not supported by the compiler. This is exactly what
Don't get me wrong, I am using a new compiler (gcc 12.3). It's glibc that I would like to keep old when building.
In addition to the compiler, you may need to use a new version of binutils, to avoid issues like in the first post.
Trying to cross-compile TFLite 2.5.0 for RPi3 with CMake and XNNPACK enabled:
armv8-rpi3-linux-gnueabihf
ARMCC_FLAGS="-march=armv7-a -mfpu=neon-vfpv4 -funsafe-math-optimizations"
says this: