Xnnpack still builds with +dotprod and +fp16 with -DXNNPACK_ENABLE_ARM_DOTPROD=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_FP16_VECTOR=OFF #6165

Open
misterBart opened this issue Mar 12, 2024 · 10 comments

@misterBart

I'm building an Arm64 target with a fairly old toolchain (gcc 7.5, binutils 2.29.1) in order to support old Linux platforms.
I use:
-DXNNPACK_ENABLE_ARM_BF16=OFF -DXNNPACK_ENABLE_ARM_I8MM=OFF -DXNNPACK_ENABLE_ARM_DOTPROD=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_FP16_VECTOR=OFF
Yet Xnnpack still seems to build with +dotprod and +fp16:

In file included from /home/personau/LinuxToolchainsTest/tflite_aarch64_release/xnnpack/src/f16-dwconv2d-chw/gen/5x5s2p2-minmax-neonfp16arith-1x4.c:12:0:
/home/personau/x-tools/aarch64-unknown-linux-gnu-glibc2.25-gcc7.5/lib/gcc/aarch64-unknown-linux-gnu/7.5.0/include/arm_neon.h:17259:1: note: expected 'const float16_t * {aka const __fp16 *}' but argument is of type 'const uint16_t * {aka const short unsigned int *}'
 vld1_dup_f16 (const float16_t* __a)
 ^~~~~~~~~~~~
cc1: error: invalid feature modifier in '-march=armv8.2-a+fp16+dotprod'
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/XNNPACK.dir/build.make:4093: _deps/xnnpack-build/CMakeFiles/XNNPACK.dir/src/f16-gemm/gen-inc/1x8inc-minmax-aarch64-neonfp16arith-ld64.S.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:6137: _deps/xnnpack-build/CMakeFiles/XNNPACK.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....
gmake: *** [Makefile:136: all] Error 2
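
One way to narrow down which feature modifier the toolchain rejects is to probe the compiler directly with each `-march` string. The following is a minimal sketch, assuming a POSIX shell; the helper name `supports_march` is hypothetical, and the compiler name in the usage comment is only inferred from the toolchain path in the log above.

```shell
# Hypothetical helper: returns 0 if the given compiler accepts -march=<value>.
# It compiles an empty C translation unit read from /dev/null and discards the result.
supports_march() {
  cc="$1"
  march="$2"
  "$cc" -march="$march" -x c -c /dev/null -o /dev/null >/dev/null 2>&1
}

# Example probe (compiler name is an assumption; adjust to your toolchain):
# for m in armv8.2-a armv8.2-a+fp16 armv8.2-a+dotprod; do
#   if supports_march aarch64-unknown-linux-gnu-gcc "$m"; then
#     echo "$m: accepted"
#   else
#     echo "$m: rejected"
#   fi
# done
```

Probing each modifier separately shows whether `+fp16`, `+dotprod`, or both trigger the "invalid feature modifier" error on a given compiler.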

@fbarchard
Contributor

The build system determines which kernels to build. The macros reflect what was enabled and won't test/use the disabled kernels. With Bazel there are flags to control each instruction set:

--define=xnn_enable_arm_fp16_vector=false
--define=xnn_enable_arm_dotprod=false

cmake has options, but I'm not familiar with the usage

XNNPACK_ENABLE_ARM_FP16_VECTOR
XNNPACK_ENABLE_ARM_DOTPROD

On Intel I added some gcc version checking to force the flags off, and that could be done for Arm gcc with a change to CMakeLists.txt. It would be something like:

IF(CMAKE_C_COMPILER_ID STREQUAL "GNU")
  IF(CMAKE_C_COMPILER_VERSION VERSION_LESS "11")
    SET(XNNPACK_ENABLE_ARM_FP16_VECTOR OFF)
    SET(XNNPACK_ENABLE_ARM_DOTPROD OFF)
  ENDIF()
ENDIF()

@misterBart
Author

cmake has options, but I'm not familiar with the usage

XNNPACK_ENABLE_ARM_FP16_VECTOR
XNNPACK_ENABLE_ARM_DOTPROD

Yes, I already turned these off, see my opening post. The problem is that, even though I set these CMake options to OFF, Xnnpack still builds with +dotprod and +fp16.

@alankelly
Collaborator

What version of XNNPack are you building? The failing file was removed on Sep 26, 2022

@misterBart
Author

The version that is part of TfLite 2.10. (Can I check the specific Xnnpack version in the TfLite source code?)
TfLite 2.10.1 was released Nov 16, 2022. Perhaps that TfLite still includes the failing file.
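
For the question of which Xnnpack revision a TfLite checkout pins, a rough sketch follows. It assumes the CMake build fetches Xnnpack through a declaration containing a `GIT_TAG` line, and that the declaration lives in `tensorflow/lite/tools/cmake/modules/xnnpack.cmake`; both are assumptions to verify in your own tree. The helper name `xnnpack_git_tag` is hypothetical.

```shell
# Hypothetical helper: print the value following GIT_TAG in a CMake file.
# awk splits on whitespace, so indentation before GIT_TAG is ignored.
xnnpack_git_tag() {
  awk '$1 == "GIT_TAG" { print $2 }' "$1"
}

# Assumed location in a TensorFlow checkout (verify that the file exists):
# xnnpack_git_tag tensorflow_src/tensorflow/lite/tools/cmake/modules/xnnpack.cmake
```

The printed commit hash can then be looked up in the Xnnpack repository history to see whether a given file was still present at that revision.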

@alankelly
Collaborator

Can you update to the latest release? We can't fix old releases.

@misterBart
Author

misterBart commented Mar 28, 2024

Still getting the errors with the latest TfLite release (2.16):

cc1: error: invalid feature modifier in '-march=armv8.2-a+fp16+dotprod'
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-prod.dir/build.make:173: _deps/xnnpack-build/CMakeFiles/microkernels-prod.dir/src/f16-gemm/gen/f16-gemm-1x8-minmax-asm-aarch64-neonfp16arith-ld64.S.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:6832: _deps/xnnpack-build/CMakeFiles/microkernels-prod.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....
cc1: error: invalid feature modifier in '-march=armv8.2-a+fp16+dotprod'
gmake[2]: *** [_deps/xnnpack-build/CMakeFiles/microkernels-all.dir/build.make:40157: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-1x8-minmax-asm-aarch64-neonfp16arith-ld64.S.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:6806: _deps/xnnpack-build/CMakeFiles/microkernels-all.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2

Steps I execute:

git clone --single-branch --branch r2.16 https://github.com/tensorflow/tensorflow tensorflow_src
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchain_aarch64.cmake -DCMAKE_BUILD_TYPE=release -DXNNPACK_ENABLE_ARM_BF16=OFF -DXNNPACK_ENABLE_ARM_I8MM=OFF -DXNNPACK_ENABLE_ARM_DOTPROD=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_FP16_VECTOR=OFF ../tensorflow_src/tensorflow/lite
cmake --build . -j 8 --config release
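
To confirm that the `-D` options actually reached the configured build, and to see which `-march` strings ended up in the generated rules, something like the sketch below can be run against the build directory. The helper names are hypothetical; the first relies on the `NAME:TYPE=VALUE` layout of `CMakeCache.txt`, and the second assumes the Makefile generator writes the per-file flags into the generated build files, which may not hold for every generator.

```shell
# Hypothetical helper: list XNNPACK_ENABLE_* entries from <build_dir>/CMakeCache.txt.
# Cache lines look like NAME:TYPE=VALUE; the output is NAME=VALUE.
xnnpack_cache_options() {
  sed -n 's/^\(XNNPACK_ENABLE_[A-Z0-9_]*\):[A-Z]*=\(.*\)$/\1=\2/p' "$1/CMakeCache.txt"
}

# Hypothetical helper: print the distinct -march=... strings found anywhere
# under <build_dir> (for example in generated build.make files).
xnnpack_march_flags() {
  grep -rhoE -e '-march=[A-Za-z0-9.+_-]+' "$1" | sort -u
}

# Usage from inside the build directory:
# xnnpack_cache_options .
# xnnpack_march_flags .
```

If the cache shows the options as OFF while `-march=armv8.2-a+fp16+dotprod` still appears in the generated rules, the flags are being attached to some sources regardless of those options.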

@alankelly
Collaborator

Can you try adding -DXNNPACK_ENABLE_ASSEMBLY=OFF?

@misterBart
Author

After adding that option, TfLite 2.16 builds without errors, and I can run a test program on an Arm64 board using TfLite 2.16. However, the test program now runs slower, which naturally comes from disabling the assembly code. -DXNNPACK_ENABLE_ASSEMBLY=OFF is too drastic: the Arm64 board does not support float16, etc., but I would still like to use the other assembly micro-kernels in Xnnpack.

@alankelly
Collaborator

Ok, we know what the problem is now. The solution is to get the update-microkernels script to split the assembly files into ones with and without arm V8 and to create new targets with the appropriate compilation options. Would you like to send a PR to do this?
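
As a rough illustration of the split described above (not the actual update-microkernels script), the sketch below labels assembly files by whether their names suggest an Armv8.2 extension. The helper name `classify_asm` is hypothetical, and of the filename tokens used only `neonfp16arith` appears in the logs in this thread; `neondot`, `neoni8mm`, and `neonbf16` are assumptions about the naming scheme.

```shell
# Hypothetical helper: read assembly file paths on stdin, one per line,
# and prefix each with "armv8.2" or "baseline" based on its name.
# Only neonfp16arith is confirmed by the logs; the other tokens are assumed.
classify_asm() {
  while IFS= read -r f; do
    case "$f" in
      *neonfp16arith*|*neondot*|*neoni8mm*|*neonbf16*) echo "armv8.2 $f" ;;
      *) echo "baseline $f" ;;
    esac
  done
}

# Usage from an Xnnpack source tree:
# find src -name '*.S' | classify_asm | sort
```

Files in the "baseline" group could then be compiled without the extended `-march` string, while the "armv8.2" group would be built only when the matching option is enabled.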

@misterBart
Author

A PR suggests I know what to fix in the codebase, which I don't.


3 participants