armeabi-v7a assembler error #59970
Comments
Hi, Is there any way to track Thanks |
Hi, Any update or information will be posted in this issue thread. |
Hi @RobertFlatt I have tried to build for With XNNPACK using the command Can you please let us know if you are still facing the issue? Thanks. |
yes
Edit: same result with NDK "current LTS release" |
I was able to replicate on r2.13 branch:
|
There seems to have been some analysis at google/XNNPACK#4775 (comment), but I can't see (I'm outside Google) whether this has resulted in an updated implementation. I can infer from some of those posts that the answer is 'just wait for some future NDK where some future Clang will be fixed'; I would consider that an insufficient response. |
Does the recommended work around in both those threads suffice for you so far? |
Hi @RobertFlatt, Android NDK version 25 is not currently supported; the official documentation https://www.tensorflow.org/lite/android/lite_build currently recommends version 21e. Can you try with that version instead? |
Thanks for your interest. For the recommended workaround google/XNNPACK#4775 (comment) EDIT: Due to typo in NDK specification, my previous comments (below) were using gcc not Clang, and thus incorrect.
Fails with
I think the issue still exists. Can you confirm? ================================== But I'm still exploring unexpected issues. For example, the value of the flag (ON/OFF) is ignored: the flag's presence alone disables the bogus assembly code error, and the argument is always interpreted as OFF. So maybe this will be usable; I need to do more testing. But I still think the project build script should implement the fix, because the long-term viability and side effects of a workaround are not visible to an end user, and the core issue, which is build interaction with Clang, is clearly in the domain of the tool developers. ===================================
I think the referenced page has bit rot. The default NDK is 25c as shown here https://developer.android.com/ndk/downloads I use 25c. |
Hi @RobertFlatt, While there is always some bit rot in documentation, I have confirmed that that part is accurate. We more or less only support NDK 19, 20, 21. We do not currently support NDK version 25; as such, we can't help with this issue unless we see it with the recommended version. |
This statement is clearly wrong. |
Hi @RobertFlatt, Please check the configuration file below, which lists the supported NDK versions. Lines 38 to 41 in 6088a22
|
OK, but those are the NDK versions for Bazel: https://github.com/tensorflow/tensorflow/blob/master/configure.py#L737-L738 The following simple tests use TensorFlow v2.13.0-rc1. Moving away from the issue at hand, we use arm64-v8a to test CMake builds with 3 NDKs. Both NDK LTS r25c and r23b build without error; try:
However r19c fails with
You can easily validate this for yourself using the commands above. Returning to the issue at hand, we use armeabi-v7a and CMake to test the proposed workaround. The proposed workaround
But this fails in the usual Presumably I don't understand how to use the workaround. Please explain how to use |
Further investigation reveals And we should not expect it to, because this is defined in the default armeabi-v7a build behavior https://github.com/google/XNNPACK/blob/master/scripts/build-android-armv7.sh#L59 and we know the default fails to build. It is not clear why it was suggested as a workaround, as adding it clearly has no impact. This not-a-workaround was apparently suggested here google/XNNPACK#4775 (comment) by the same account that provided non-resolutions to this issue here google/XNNPACK#4348 (comment), and here google/XNNPACK#4348 (comment), and (in retrospect) here google/XNNPACK#4348 (comment). Two of these non-resolutions are referenced in the first post in this issue. This code used to work, but somebody broke it between 2.9 and 2.12. Enough already! Please, a resolution. |
@terryheo can you please take a look? Also can we verify NDK supported versions for CMake Android workflow? |
@pkgoogle |
### Description Use a different march flag to work around what appears to be a clang issue. See tensorflow/tensorflow#59970 for links to various relevant pieces of info/discussions.
Trying to compile v2.15.0 for armeabi-v7a with NDK 25c (officially supported now, I believe) and running into the same problem. Long cryptic error:
[ 23%] Building C object _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o
fatal error: error in backend: Cannot select: 0x83465a8: v4bf16 = ARMISD::VEXT 0x836ab38, 0x836ab38, Constant:i32<2>, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:119:22
0x836ab38: v4bf16,ch = CopyFromReg 0x84e41b8, Register:v4bf16 %54, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:117:9
0x851e1f0: v4bf16 = Register %54
0x836ab38: v4bf16,ch = CopyFromReg 0x84e41b8, Register:v4bf16 %54, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:117:9
0x851e1f0: v4bf16 = Register %54
0x836afb0: i32 = Constant<2>
In function: xnn_bf16_gemm_minmax_ukernel_1x4c8__neonbf16_bfdot
PLEASE submit a bug report to https://github.com/android-ndk/ndk/issues and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang --target=armv7-none-linux-androideabi26 --sysroot=/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/sysroot -DEIGEN_MPL2_ONLY -DFXDIV_USE_INLINE_ASSEMBLY=0 -DNOMINMAX=1 -DPTHREADPOOL_NO_DEPRECATED_API=1 -DXNN_ENABLE_ARM_BF16=1 -DXNN_ENABLE_ARM_DOTPROD=1 -DXNN_ENABLE_ARM_FP16_SCALAR=1 -DXNN_ENABLE_ARM_FP16_VECTOR=1 -DXNN_ENABLE_ARM_I8MM=1 -DXNN_ENABLE_ASSEMBLY=1 -DXNN_ENABLE_CPUINFO=1 -DXNN_ENABLE_DWCONV_MULTIPASS=0 -DXNN_ENABLE_GEMM_M_SPECIALIZATION=1 -DXNN_ENABLE_JIT=0 -DXNN_ENABLE_MEMOPT=1 -DXNN_ENABLE_RISCV_VECTOR=1 -DXNN_ENABLE_SPARSE=1 -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/third_party/xla/third_party/tsl -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/opencl_headers -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/vulkan_headers/include -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/tensorflow/lite/delegates/gpu/common -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/tensorflow/lite/delegates/gpu/common/task -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/pthreadpool-source/include -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/FXdiv-source/include -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/FP16-source/include -g -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -march=armv7-a -mthumb -Wformat -Werror=format-security -O3 -DNDEBUG -std=c99 -fPIC -O2 -pthread -fno-math-errno -marm -march=armv8.2-a+bf16 -mfpu=neon-fp-armv8 -MD -MT _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o -MF 
CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o.d -o CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o -c /home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c
1. <eof> parser at end of file
2. Code generation
3. Running pass 'Function Pass Manager' on module '/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c'.
4. Running pass 'ARM Instruction Selection' on function '@xnn_bf16_gemm_minmax_ukernel_1x4c8__neonbf16_bfdot'
#0 0x00000000047d91d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d91d8)
#1 0x00000000047d8340 llvm::sys::RunSignalHandlers() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d8340)
#2 0x00000000047a3dc3 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47a3dc3)
#3 0x00000000047a3d7b (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47a3d7b)
#4 0x00000000047d7a87 llvm::sys::Process::Exit(int, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d7a87)
#5 0x00000000040dc70a (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x40dc70a)
#6 0x0000000003083072 llvm::report_fatal_error(llvm::Twine const&, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x3083072)
#7 0x000000000282b5f5 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x282b5f5)
#8 0x0000000006cf4e77 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6cf4e77)
#9 0x000000000641f425 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x641f425)
#10 0x0000000005e86b63 llvm::SelectionDAGISel::DoInstructionSelection() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5e86b63)
#11 0x0000000005e8710a llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5e8710a)
#12 0x0000000006417d3c llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6417d3c)
#13 0x0000000006457ad3 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6457ad3)
#14 0x00000000064572df (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x64572df)
#15 0x0000000005d9faea llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5d9faea)
#16 0x0000000005da0113 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5da0113)
#17 0x0000000005d9fc6f llvm::FPPassManager::runOnModule(llvm::Module&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5d9fc6f)
#18 0x00000000063aa794 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63aa794)
#19 0x00000000065d6968 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x65d6968)
#20 0x00000000060524d5 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x60524d5)
#21 0x0000000005ea25a9 clang::ParseAST(clang::Sema&, bool, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5ea25a9)
#22 0x00000000063c128d clang::FrontendAction::Execute() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c128d)
#23 0x00000000063c112d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c112d)
#24 0x00000000063c1541 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c1541)
#25 0x00000000066a9f54 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a9f54)
#26 0x00000000066a6de3 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6de3)
#27 0x00000000066a6c92 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6c92)
#28 0x00000000066a6c61 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6c61)
#29 0x00000000066a69f4 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a69f4)
#30 0x00000000066a685f clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a685f)
#31 0x00000000066a66f2 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a66f2)
#32 0x00000000066752ee main (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66752ee)
#33 0x00007f6833640cd0 (/usr/lib/libc.so.6+0x27cd0)
#34 0x00007f6833640d8a __libc_start_main (/usr/lib/libc.so.6+0x27d8a)
#35 0x00000000064cce69 _start (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x64cce69)
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
Android (9352603, based on r450784d1) clang version 14.0.7 (https://android.googlesource.com/toolchain/llvm-project 4c603efb0cca074e9238af8b4106c30add4418f6)
Target: armv7-none-linux-android26
Thread model: posix
InstalledDir: /home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin
clang: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/bf16-gemm-1x4c8-minmax-neonbf16-bfdot-463081.c
clang: note: diagnostic msg: /tmp/bf16-gemm-1x4c8-minmax-neonbf16-bfdot-463081.sh
clang: note: diagnostic msg:
********************
make[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:49874: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o] Error 70
make[1]: *** [CMakeFiles/Makefile2:6653: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
Any update regarding this?
-----
Update
[ 1%] Building C object _deps/xnnpack-build/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/neoni8mm.c.o
clang: error: the clang compiler does not support '-march=armv8.2-a+i8mm' |
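One way to separate "what this Clang accepts" from "what the XNNPACK build asks for" is to compile a tiny probe with the same target flags and look at the feature macros the compiler defines. The following is a minimal sketch, assuming the standard ACLE macros `__ARM_ARCH`, `__ARM_FEATURE_BF16_VECTOR_ARITHMETIC` and `__ARM_FEATURE_MATMUL_INT8`; the `sketch_*` names are hypothetical and nothing here is taken from XNNPACK or TensorFlow.

```c
#include <stdio.h>

/* ARM architecture level the compiler is targeting, or 0 when not ARM. */
static int sketch_arm_arch(void) {
#if defined(__ARM_ARCH)
  return __ARM_ARCH;
#else
  return 0;
#endif
}

/* 1 if the target enables BF16 vector arithmetic (ACLE macro), else 0. */
static int sketch_has_bf16(void) {
#if defined(__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)
  return 1;
#else
  return 0;
#endif
}

/* 1 if the target enables the int8 matrix-multiply extension, else 0. */
static int sketch_has_i8mm(void) {
#if defined(__ARM_FEATURE_MATMUL_INT8)
  return 1;
#else
  return 0;
#endif
}

/* Writes a one-line summary into buf and returns the snprintf result. */
static int sketch_describe_target(char *buf, size_t len) {
  return snprintf(buf, len, "arm_arch=%d bf16=%d i8mm=%d",
                  sketch_arm_arch(), sketch_has_bf16(), sketch_has_i8mm());
}
```

Calling `sketch_describe_target` from a small `main` and building it with the NDK clang and the flags seen in the log above (for example `--target=armv7-none-linux-androideabi26 -march=armv8.2-a+bf16 -mfpu=neon-fp-armv8`) should show whether the driver rejects the `-march` value outright, as in the i8mm error, or accepts it and defines the macro. It does not exercise instruction selection, so it would not reproduce the bf16 backend crash by itself.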
I tried to build according to these docs: https://www.tensorflow.org/lite/guide/build_cmake_arm but it failed. I also tried to use the NDK the same way as you, with the same error. |
The same issue, tested at r2.16 with ndk:26.1.10909125. |
Hi @pfk-beta, @RobertFlatt, I found a workaround. You can directly modify the XNNPACK source code inside the tflite build folder. The main idea is to bypass the failing code and use the default branch. |
related onnx issue: google/XNNPACK#6164, pr: google/XNNPACK#6179 |
Issue Type: Bug
Have you reproduced the bug with TF nightly? No
Source: source
Tensorflow Version: v2.12.0-rc1
Custom Code: No
OS Platform and Distribution: Ubuntu 22.04
Mobile device: N/A
Python version: N/A
Bazel version: Using CMake
GCC/Compiler version: Clang, NDK 25b
CUDA/cuDNN version: No response
GPU model and memory: No response
Current Behaviour?
Building tensorflow lite (v2.12.0-rc1) for Android armeabi-v7a using CMake and NDK 25b, I get the following invalid assembly code error:
The cause is here (a more recent freeze) https://github.com/google/XNNPACK/blob/test_515720556/src/xnnpack/math.h#L332
Android arm64-v8a builds and runs without error. With an earlier tensorflow lite version (v2.8.0) both armeabi-v7a and arm64-v8a built and ran without error.
As I read it, '=t' is documented as a valid constraint for "ARM family", but the assembler thinks this is not the case. https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
Support at XNNPACK said a compiler flag -mfpu=vfp is required to enable the assembly code google/XNNPACK#4348 (comment), and that the flag was set. They then suggested, without reference, that this was a Clang bug, and did not offer a workaround. Further investigation suggested Clang was not the issue google/XNNPACK#4348 (comment)
The CMake build script covers eight conditions for various ARM 32-bit devices. Only two of these (both -march=armv6) set the required flag. The -mfpu=vfp flag is not set for -march=armv7-a, which I suspect is the cause of this issue. https://github.com/google/XNNPACK/blob/master/CMakeLists.txt#L546-L553
XNNPACK support responded, but we did not communicate successfully (as shown by google/XNNPACK#4348 (comment) and google/XNNPACK#4348 (comment)), and we did not get a resolution. Since tflite depends on XNNPACK, I look for resolution here. Thank you.
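For readers unfamiliar with the constraint letters discussed above, here is a minimal sketch of the general pattern: a GCC-style inline assembly statement that uses '=t' (a single-precision VFP register) for the output and 'w' (a double-precision VFP register) for the input, next to a portable fallback branch. This is not copied from XNNPACK's math.h; the function name, the `vcvt.u32.f64` instruction choice, and the `SKETCH_NO_INLINE_ASM` opt-out macro are all assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical opt-out macro (not an XNNPACK option): define it to force the
 * portable branch when the toolchain rejects the inline assembly. */
#if defined(__GNUC__) && defined(__arm__) && !defined(SKETCH_NO_INLINE_ASM)
#define SKETCH_USE_VFP_ASM 1
#else
#define SKETCH_USE_VFP_ASM 0
#endif

/* Saturating double -> uint32 conversion, rounding toward zero. */
static inline uint32_t sketch_cvt_sat_u32_f64(double x) {
#if SKETCH_USE_VFP_ASM
  /* "=t": output lives in a single-precision VFP register (s0-s31).
   * "w":  input lives in a double-precision VFP register; the %P modifier
   *       prints its d-register name. Requires a VFP-enabled target. */
  float bits;
  __asm__("vcvt.u32.f64 %[out], %P[in]" : [out] "=t"(bits) : [in] "w"(x));
  uint32_t result;
  memcpy(&result, &bits, sizeof(result)); /* reinterpret bits, no conversion */
  return result;
#else
  /* Portable branch: NaN and non-positive values map to 0, large values clamp. */
  if (!(x > 0.0)) return 0;
  if (x >= 4294967295.0) return UINT32_MAX;
  return (uint32_t)x;
#endif
}
```

On a non-ARM host only the portable branch is compiled, so the sketch builds anywhere. On a 32-bit ARM target the assembly branch is where a missing or mismatched FPU setting would surface, and defining the hypothetical `SKETCH_NO_INLINE_ASM` macro selects the fallback instead.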
Relevant log output