armeabi-v7a assembler error #59970

RobertFlatt · 2023-03-13T18:42:44Z


Issue Type

Bug

Have you reproduced the bug with TF nightly?

No

Source

source

Tensorflow Version

v2.12.0-rc1

Custom Code

No

OS Platform and Distribution

Ubuntu 22.04

Mobile device

N/A

Python version

N/A

Bazel version

Using CMake

GCC/Compiler version

Clang, NDK 25b

CUDA/cuDNN version

No response

GPU model and memory

No response

Current Behaviour?

Building TensorFlow Lite (v2.12.0-rc1) for Android armeabi-v7a using CMake and NDK 25b, I get the following inline assembly error:

tflite-runtime/tensorflow/lite/tools/pip_package/gen/tflite_pip/python3/cmake_build/xnnpack/src/xnnpack/math.h:311:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)

The cause is this inline assembly in XNNPACK's math.h (the link points to a more recent freeze, so the line number differs): https://github.com/google/XNNPACK/blob/test_515720556/src/xnnpack/math.h#L332

Android arm64-v8a builds and runs without error. With an earlier tensorflow lite version (v2.8.0) both armeabi-v7a and arm64-v8a built and ran without error.

As I read it, '=t' is documented as a valid constraint for the "ARM family", but the compiler rejects it.
https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints

Support at XNNPACK said that the compiler flag -mfpu=vfp is required to enable this assembly code (google/XNNPACK#4348 (comment)) and that the flag was set. They then suggested, without a reference, that this was a Clang bug, and did not offer a workaround.

Further investigation suggested that Clang was not the issue (google/XNNPACK#4348 (comment)).

The CMake build script covers eight conditions for various 32-bit Arm devices. Only two of these (both -march=armv6) set the required flag. The -mfpu=vfp flag is not set for -march=armv7-a, which I suspect is the cause of this issue. https://github.com/google/XNNPACK/blob/master/CMakeLists.txt#L546-L553
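One way to test this suspicion in isolation is to compile a single function that uses the constraint, once without and once with -mfpu=vfp. The sketch below is a probe under stated assumptions, not part of the XNNPACK build: probe_t_constraint is a hypothetical helper name, and only the `: [i] "=t" (i)` operand line comes from the reported error. The vcvt instruction and the input operand around it are assumed for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of TensorFlow or XNNPACK).
# Usage: probe_t_constraint <compiler> [compiler flags...]
# Returns the compiler's exit status for a tiny file that uses "=t".
probe_t_constraint() {
  cc="$1"
  shift
  tmpdir="$(mktemp -d)" || return 2
  # Only the output operand line is taken from the reported error;
  # the instruction and the input operand are assumptions.
  cat > "$tmpdir/probe.c" <<'EOF'
unsigned int cvt_sat_u32_f64(double x) {
  float i;
  __asm__("vcvt.u32.f64 %[i], %P[x]"
      : [i] "=t" (i)
      : [x] "w" (x));
  union { float f; unsigned int u; } bits;
  bits.f = i;
  return bits.u;
}
EOF
  "$cc" "$@" -c "$tmpdir/probe.c" -o "$tmpdir/probe.o"
  status=$?
  rm -rf "$tmpdir"
  return "$status"
}

# Example (compiler path and target assumed from the NDK r25c layout):
# CC=android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang
# probe_t_constraint "$CC" --target=armv7-none-linux-androideabi26 -march=armv7-a
# probe_t_constraint "$CC" --target=armv7-none-linux-androideabi26 -march=armv7-a -mfpu=vfp
```

If both invocations fail the same way, the missing flag alone would not explain the error; if only the first fails, that would support the suspicion.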

XNNPACK support responded, but we did not communicate successfully (as shown by google/XNNPACK#4348 (comment) and google/XNNPACK#4348 (comment)), and we did not get a resolution. Since tflite depends on XNNPACK, I am looking for a resolution here. Thank you.



### Standalone code to reproduce the issue

```shell
This is a build issue, no extra code.
```

Relevant log output

tflite-runtime/tensorflow/lite/tools/pip_package/gen/tflite_pip/python3/cmake_build/xnnpack/src/xnnpack/math.h:311:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)


RobertFlatt · 2023-03-30T20:27:01Z

Hi,

Is there any way to track the status of stat:awaiting tensorflower?

Thanks

sachinprasadhs · 2023-03-30T22:39:42Z

Hi, any update or information will be posted in this issue thread.

pjpratik · 2023-04-22T03:45:37Z

Hi @RobertFlatt

I tried to build for armeabi-v7a using the Android cross-compilation build instructions on r2.12 and was able to build successfully without any errors. Please find the screenshot below.

With XNNPACK using the command
cmake -DCMAKE_TOOLCHAIN_FILE=android-ndk-r25/build/cmake/android.toolchain.cmake -DANDROID_ABI=armeabi-v7a tensorflow/lite -DTFLITE_ENABLE_XNNPACK=ON

Can you please let us know if you are still facing the issue?

Thanks.

RobertFlatt · 2023-04-25T19:37:39Z

> Can you please let us know if you are still facing the issue?

yes ☹️

I can reproduce your configure step (I used git clone -b v2.12.0 --single-branch https://github.com/tensorflow/tensorflow.git tensorflow_src and NDK 25b), but the subsequent cmake --build . -j fails with the same error that was reported in the original post:

In file included from /home/bobf/ex/tflite_build/xnnpack/src/qc8-gemm/gen/qc8-gemm-1x1c4-minmax-fp32-armsimd32.c:15:
/home/bobf/ex/tflite_build/xnnpack/src/xnnpack/math.h:316:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)

FYI my use case is slightly different as I am building a pip package for Python (yes, on Android), but the use case in your post illustrates presumably the same issue.

Edit: same result with NDK "current LTS release" 25c https://github.com/android/ndk/wiki#current-lts-release

pkgoogle · 2023-05-31T18:42:03Z

I was able to replicate on r2.13 branch:

cmake -DCMAKE_TOOLCHAIN_FILE=~/Android/Sdk/ndk/25.2.9519653/build/cmake/android.toolchain.cmake  -DANDROID_ABI=armeabi-v7a ../tensorflow/lite -DTFLITE_ENABLE_XNNPACK=ON
cmake --build . -j

...
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16694: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-igemm/gen/qs8-igemm-1x2c4-minmax-fp32-armsimd32.c.o] Error 1
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qu8-gemm/gen/qu8-gemm-1x1c4-minmax-fp32-armsimd32.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qs8-vlrelu/gen/qs8-vlrelu-armsimd32-x4.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16680: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-igemm/gen/qs8-igemm-1x1c4-minmax-fp32-armsimd32.c.o] Error 1
1 error generated.
1 error generated.
1 error generated.
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16554: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qc8-gemm/gen/qc8-gemm-2x2c4-minmax-fp32-armsimd32.c.o] Error 1
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16666: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-gemm/gen/qs8-gemm-2x2c4-minmax-fp32-armsimd32.c.o] Error 1
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qs8-vlrelu/gen/qs8-vlrelu-armsimd32-x8.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
1 error generated.
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16708: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-igemm/gen/qs8-igemm-2x1c4-minmax-fp32-armsimd32.c.o] Error 1
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qu8-igemm/gen/qu8-igemm-1x1c4-minmax-fp32-armsimd32.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16736: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-armsimd32-x4.c.o] Error 1
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16750: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-armsimd32-x8.c.o] Error 1
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qu8-gemm/gen/qu8-gemm-2x2c4-minmax-fp32-armsimd32.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
1 error generated.
1 error generated.
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qu8-vcvt/gen/qu8-vcvt-armsimd32-x4.c:15:
/usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/xnnpack/math.h:332:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:16596: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/qc8-igemm/gen/qc8-igemm-2x1c4-minmax-fp32-armsimd32.c.o] Error 1
In file included from /usr/local/google/home/pisethk/tensorflow/tflite_build/xnnpack/src/qu8-vlrelu/gen/qu8-vlrelu-armsimd32-x4.c:15:

RobertFlatt · 2023-05-31T23:25:13Z

There seems to have been some analysis

google/XNNPACK#4775 (comment)
and
google/XNNPACK#4775 (comment)

But I can't see (I'm outside Google) whether this has resulted in an updated implementation.

From some of those posts I infer the advice is 'just wait for some future NDK in which some future Clang will be fixed'. I would consider that an insufficient response.

pkgoogle · 2023-06-01T16:58:24Z

@RobertFlatt

Does the recommended workaround in both those threads suffice for you so far?

pkgoogle · 2023-06-01T21:54:42Z

Hi @RobertFlatt

Android NDK version=25 is actually not currently supported, the official documentation https://www.tensorflow.org/lite/android/lite_build currently recommends version 21e.

Can you try with that version instead?

RobertFlatt · 2023-06-02T02:20:11Z

@pkgoogle

Thanks for your interest.

For the recommended workaround google/XNNPACK#4775 (comment) -DXNNPACK_ENABLE_ARM_BF16=OFF :

EDIT:

Due to a typo in the NDK specification, my previous comments (below) were based on a build that used gcc, not Clang, and are thus incorrect.

cmake -DCMAKE_TOOLCHAIN_FILE=../android-ndk-r25c/build/cmake/android.toolchain.cmake -DANDROID_ABI=armeabi-v7a -DTFLITE_ENABLE_XNNPACK=ON -DXNNPACK_ENABLE_ARM_BF16=OFF ../tensorflow_src/tensorflow/lite

Fails with

/home/bobf/ex/tflite_build/xnnpack/src/qc8-igemm/gen/qc8-igemm-1x2c4-minmax-fp32-armsimd32.c:15:
/home/bobf/ex/tflite_build/xnnpack/src/xnnpack/math.h:316:13: error: invalid output constraint '=t' in asm
      : [i] "=t" (i)
            ^

I think the issue still exists. Can you confirm?

==================================
THESE PREVIOUS COMMENTS ARE INCORRECT
The compile flag does enable an error free compile with the test case above, which is a good first step.

But I'm still exploring unexpected issues.

For example the value of the flag (ON/OFF) is ignored, the flag's presence disables the bogus assembly code error - the argument is always interpreted as OFF.

So maybe this will be usable, I need to do more testing.

But I still think the project build script should implement the fix. Because the long term viability and side effects of a workaround are not visible to an end user. And the core issue which is build interaction with Clang is clearly in the domain of the tool developers.

===================================

> Android NDK version=25 is actually not currently supported, the official documentation https://www.tensorflow.org/lite/android/lite_build currently recommends version 21e.

I think the referenced page has bit rot.

The default NDK is 25c as shown here https://developer.android.com/ndk/downloads
This is confirmed by the tflite CMake build which defaults to 25c if an NDK is not set.

I use 25c.

pkgoogle · 2023-06-02T21:33:59Z

Hi @RobertFlatt,

While there is always some bit rot in documentation, I have confirmed that this part is accurate. We more or less only support NDK 19, 20, and 21. We do not currently support NDK version=25 for now, so we can't help with this issue unless it also occurs with the recommended version.

RobertFlatt · 2023-06-03T00:12:47Z

> We do not currently support NDK version=25 for now,

This statement is clearly wrong.

sachinprasadhs · 2023-06-05T17:14:20Z

Hi @RobertFlatt ,

Please check the configuration file below, which lists the supported NDK versions.

tensorflow/configure.py

Lines 38 to 41 in 6088a22

    
    _SUPPORTED_ANDROID_NDK_VERSIONS = [
        19, 20, 21
    ]

RobertFlatt · 2023-06-11T02:45:07Z

@sachinprasadhs

OK, but those are the NDK versions for Bazel https://github.com/tensorflow/tensorflow/blob/master/configure.py#L737-L738
The issue at hand is CMake.

The following simple tests use Tensorflow v2.13.0-rc1

Moving away from the issue at hand we use arm64-v8a to test CMake builds with 3 NDKs

Both NDK LTS r25c and r23b build without error, try:

cmake  -DCMAKE_TOOLCHAIN_FILE=../android-ndk-r25c/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a  ../tensorflow_src/tensorflow/lite 
cmake --build . -j

cmake  -DCMAKE_TOOLCHAIN_FILE=../android-ndk-r23b/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a  ../tensorflow_src/tensorflow/lite 
cmake --build . -j

However r19c fails with

clang: error: the clang compiler does not support '-march=armv8.2-a+bf16'

cmake  -DCMAKE_TOOLCHAIN_FILE=../android-ndk-r19c/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a  ../tensorflow_src/tensorflow/lite 
cmake --build . -j

You can easily validate this for yourself using the commands above.

Returning to the issue at hand we use armeabi-v7a and CMake to test the proposed workaround

The proposed workaround -DXNNPACK_ENABLE_ARM_BF16=OFF

cmake  -DCMAKE_TOOLCHAIN_FILE=../android-ndk-r25c/build/cmake/android.toolchain.cmake -DANDROID_ABI=armeabi-v7a -DTFLITE_ENABLE_XNNPACK=ON -DXNNPACK_ENABLE_ARM_BF16=OFF ../tensorflow_src/tensorflow/lite  
cmake --build . -j

But this fails in the usual asm : [i] "=t" (i) way.

Presumably I don't understand how to use the workaround.

Please explain how to use -DXNNPACK_ENABLE_ARM_BF16=OFF, and if you could test the explanation as well, that might save us some back and forth. Thank you.

RobertFlatt · 2023-06-12T00:06:58Z

Further investigation reveals that -DXNNPACK_ENABLE_ARM_BF16=OFF does not work around the issue.

And we should not expect it to, because this option is already set in the default armeabi-v7a build script (https://github.com/google/XNNPACK/blob/master/scripts/build-android-armv7.sh#L59), and we know the default fails to build.

Not clear why it was suggested as a workaround, as adding it clearly has no impact.
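To check whether an option passed with -D actually reached the configured build, one can read it back from CMakeCache.txt in the build directory. This is a minimal sketch, assuming a POSIX shell; cmake_cache_value is a hypothetical helper name, and whether XNNPACK_ENABLE_ARM_BF16 appears in the cache for a given TensorFlow version has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: print the value recorded for a variable in CMakeCache.txt.
# Usage: cmake_cache_value <build dir> <variable name>
cmake_cache_value() {
  build_dir="$1"
  var="$2"
  # Cache entries have the form NAME:TYPE=VALUE; strip everything up to "=".
  sed -n "s/^${var}:[A-Z]*=//p" "$build_dir/CMakeCache.txt"
}

# Example, run from the build directory used in the commands above:
# cmake_cache_value . XNNPACK_ENABLE_ARM_BF16
# cmake_cache_value . TFLITE_ENABLE_XNNPACK
```

If the value is already OFF before the flag is added on the command line, adding it cannot change the build, which would be consistent with the observation above.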

This not-a-workaround was apparently suggested here google/XNNPACK#4775 (comment) by the same account that provided non-resolutions to this issue here google/XNNPACK#4348 (comment) , and here google/XNNPACK#4348 (comment) , and (in retrospect) here google/XNNPACK#4348 (comment) . Two of these non-resolutions are referenced in the first post in this issue.

This code used to work, but somebody broke it between 2.9 and 2.12.

Enough already! Please, a resolution.

pkgoogle · 2023-06-12T16:40:51Z

@terryheo can you please take a look? Also can we verify NDK supported versions for CMake Android workflow?

RobertFlatt · 2023-07-19T02:39:43Z

@pkgoogle
5 weeks have passed, any clarity on this issue?

Referenced from microsoft/onnxruntime#18291 (Fix xnnpack compile error on arm32): Use a different march flag to work around what appears to be a Clang issue. See tensorflow/tensorflow#59970 for links to various relevant pieces of info and discussions.

derpda · 2023-12-08T04:59:42Z

Trying to compile v2.15.0 for armeabi-v7a, with NDK 25c (officially supported now I believe) and running into the same problem.
Applying the fixes from microsoft/onnxruntime@8d298f6 to XNNPACK's CMakeLists.txt gets me past the first issue, but the build later fails during XNNPACK microkernel compilation with the (rather cryptic) error below:

Long cryptic error:

[ 23%] Building C object _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o
fatal error: error in backend: Cannot select: 0x83465a8: v4bf16 = ARMISD::VEXT 0x836ab38, 0x836ab38, Constant:i32<2>, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:119:22
  0x836ab38: v4bf16,ch = CopyFromReg 0x84e41b8, Register:v4bf16 %54, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:117:9
    0x851e1f0: v4bf16 = Register %54
  0x836ab38: v4bf16,ch = CopyFromReg 0x84e41b8, Register:v4bf16 %54, xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c:117:9
    0x851e1f0: v4bf16 = Register %54
  0x836afb0: i32 = Constant<2>
In function: xnn_bf16_gemm_minmax_ukernel_1x4c8__neonbf16_bfdot
PLEASE submit a bug report to https://github.com/android-ndk/ndk/issues and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang --target=armv7-none-linux-androideabi26 --sysroot=/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/sysroot -DEIGEN_MPL2_ONLY -DFXDIV_USE_INLINE_ASSEMBLY=0 -DNOMINMAX=1 -DPTHREADPOOL_NO_DEPRECATED_API=1 -DXNN_ENABLE_ARM_BF16=1 -DXNN_ENABLE_ARM_DOTPROD=1 -DXNN_ENABLE_ARM_FP16_SCALAR=1 -DXNN_ENABLE_ARM_FP16_VECTOR=1 -DXNN_ENABLE_ARM_I8MM=1 -DXNN_ENABLE_ASSEMBLY=1 -DXNN_ENABLE_CPUINFO=1 -DXNN_ENABLE_DWCONV_MULTIPASS=0 -DXNN_ENABLE_GEMM_M_SPECIALIZATION=1 -DXNN_ENABLE_JIT=0 -DXNN_ENABLE_MEMOPT=1 -DXNN_ENABLE_RISCV_VECTOR=1 -DXNN_ENABLE_SPARSE=1 -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/third_party/xla/third_party/tsl -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/opencl_headers -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/vulkan_headers/include -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/tensorflow/lite/delegates/gpu/common -I/home/peter/dev/sandbox/tensorflow/tensorflow-2.15.0/tensorflow/lite/delegates/gpu/common/task -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/pthreadpool-source/include -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/FXdiv-source/include -I/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/FP16-source/include -g -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -march=armv7-a -mthumb -Wformat -Werror=format-security -O3 -DNDEBUG -std=c99 -fPIC -O2 -pthread -fno-math-errno -marm -march=armv8.2-a+bf16 -mfpu=neon-fp-armv8 -MD -MT _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o -MF 
CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o.d -o CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o -c /home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/home/peter/dev/sandbox/tensorflow/build_2.15.0_ndk_25c_armeabi-v7a/xnnpack/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c'.
4.	Running pass 'ARM Instruction Selection' on function '@xnn_bf16_gemm_minmax_ukernel_1x4c8__neonbf16_bfdot'
 #0 0x00000000047d91d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d91d8)
 #1 0x00000000047d8340 llvm::sys::RunSignalHandlers() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d8340)
 #2 0x00000000047a3dc3 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47a3dc3)
 #3 0x00000000047a3d7b (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47a3d7b)
 #4 0x00000000047d7a87 llvm::sys::Process::Exit(int, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x47d7a87)
 #5 0x00000000040dc70a (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x40dc70a)
 #6 0x0000000003083072 llvm::report_fatal_error(llvm::Twine const&, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x3083072)
 #7 0x000000000282b5f5 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x282b5f5)
 #8 0x0000000006cf4e77 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6cf4e77)
 #9 0x000000000641f425 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x641f425)
#10 0x0000000005e86b63 llvm::SelectionDAGISel::DoInstructionSelection() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5e86b63)
#11 0x0000000005e8710a llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5e8710a)
#12 0x0000000006417d3c llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6417d3c)
#13 0x0000000006457ad3 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x6457ad3)
#14 0x00000000064572df (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x64572df)
#15 0x0000000005d9faea llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5d9faea)
#16 0x0000000005da0113 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5da0113)
#17 0x0000000005d9fc6f llvm::FPPassManager::runOnModule(llvm::Module&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5d9fc6f)
#18 0x00000000063aa794 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63aa794)
#19 0x00000000065d6968 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x65d6968)
#20 0x00000000060524d5 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x60524d5)
#21 0x0000000005ea25a9 clang::ParseAST(clang::Sema&, bool, bool) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x5ea25a9)
#22 0x00000000063c128d clang::FrontendAction::Execute() (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c128d)
#23 0x00000000063c112d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c112d)
#24 0x00000000063c1541 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x63c1541)
#25 0x00000000066a9f54 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a9f54)
#26 0x00000000066a6de3 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6de3)
#27 0x00000000066a6c92 (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6c92)
#28 0x00000000066a6c61 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a6c61)
#29 0x00000000066a69f4 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*, bool*) const (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a69f4)
#30 0x00000000066a685f clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a685f)
#31 0x00000000066a66f2 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*> >&) (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66a66f2)
#32 0x00000000066752ee main (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x66752ee)
#33 0x00007f6833640cd0 (/usr/lib/libc.so.6+0x27cd0)
#34 0x00007f6833640d8a __libc_start_main (/usr/lib/libc.so.6+0x27d8a)
#35 0x00000000064cce69 _start (/home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin/clang+0x64cce69)
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
Android (9352603, based on r450784d1) clang version 14.0.7 (https://android.googlesource.com/toolchain/llvm-project 4c603efb0cca074e9238af8b4106c30add4418f6)
Target: armv7-none-linux-android26
Thread model: posix
InstalledDir: /home/peter/dev/sandbox/tensorflow/android-ndk-r25c/toolchains/llvm/prebuilt/linux-x86_64/bin
clang: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/bf16-gemm-1x4c8-minmax-neonbf16-bfdot-463081.c
clang: note: diagnostic msg: /tmp/bf16-gemm-1x4c8-minmax-neonbf16-bfdot-463081.sh
clang: note: diagnostic msg: 

********************
make[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:49874: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o] Error 70
make[1]: *** [CMakeFiles/Makefile2:6653: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Any update regarding this?

----- Update
Building with NDK 21e also fails, with the error below:

[  1%] Building C object _deps/xnnpack-build/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/neoni8mm.c.o
clang: error: the clang compiler does not support '-march=armv8.2-a+i8mm'
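Failures like this one can be checked up front by asking the compiler whether it accepts a given -march value before starting a full build. The following is a rough sketch, assuming a POSIX shell and a Clang-style driver that reads C source from stdin with -x c -; supports_march is a hypothetical helper name, and the compiler path in the example is assumed from the NDK directory layout quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the compiler accepts -march=<value>.
# Usage: supports_march <compiler> <march value> [other flags...]
supports_march() {
  cc="$1"
  march="$2"
  shift 2
  # Compile a trivial translation unit from stdin and discard the object file.
  "$cc" "$@" "-march=$march" -x c -c - -o /dev/null >/dev/null 2>&1 <<'EOF'
int probe_function(void) { return 0; }
EOF
}

# Example (compiler path assumed, adjust to the NDK under test):
# CC=android-ndk-r21e/toolchains/llvm/prebuilt/linux-x86_64/bin/clang
# for m in armv8.2-a+bf16 armv8.2-a+i8mm; do
#   if supports_march "$CC" "$m" --target=armv7-none-linux-androideabi26; then
#     echo "$m: accepted"
#   else
#     echo "$m: rejected"
#   fi
# done
```

A rejected value only shows that this compiler does not know the architecture string; it says nothing about the '=t' constraint error, which is a separate failure.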

pfk-beta · 2024-01-16T13:56:05Z

I tried to build according to these docs: https://www.tensorflow.org/lite/guide/build_cmake_arm, but it failed. I used the NDK in the same way as you and got the same error: invalid output constraint '=t' in asm. The Bazel build completed without error.

zihaomu · 2024-03-12T07:36:42Z

Same issue, tested on r2.16 with NDK 26.1.10909125.

zihaomu · 2024-03-12T09:40:20Z

Hi @pfk-beta, @RobertFlatt, I found a workaround. You can directly modify the XNNPACK source code inside the tflite build folder.

The main idea is to bypass the code that triggers the error and use the default branch.

zihaomu · 2024-03-14T02:27:57Z

related onnx issue: google/XNNPACK#6164, pr: google/XNNPACK#6179


google-ml-butler bot added type:bug Bug type:support Support issues labels Mar 13, 2023

google-ml-butler bot assigned pjpratik Mar 13, 2023

pjpratik added type:build/install Build and install issues comp:lite TF Lite related issues comp:lite-xnnpack TensorFlow Lite XNNPack related issues and removed type:support Support issues labels Mar 14, 2023

pjpratik assigned sachinprasadhs and unassigned pjpratik Mar 15, 2023

sachinprasadhs assigned terryheo Mar 21, 2023

sachinprasadhs added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Mar 21, 2023

pjpratik self-assigned this Apr 22, 2023

pjpratik added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Apr 22, 2023

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Apr 25, 2023

sachinprasadhs added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Apr 26, 2023

pkgoogle added stat:awaiting response Status - Awaiting response from author type:feature Feature requests and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower type:bug Bug labels Jun 1, 2023

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Jun 2, 2023

sachinprasadhs removed their assignment Jun 2, 2023

pkgoogle added the stat:awaiting response Status - Awaiting response from author label Jun 2, 2023

pkgoogle self-assigned this Jun 2, 2023

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Jun 3, 2023

sachinprasadhs added the stat:awaiting response Status - Awaiting response from author label Jun 5, 2023

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Jun 11, 2023

pkgoogle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jun 12, 2023

skottmckay mentioned this issue Nov 6, 2023

Fix xnnpack compile error on arm32 microsoft/onnxruntime#18291

Merged

pkgoogle unassigned pjpratik Mar 12, 2024
