Error building Tensorflow Lite on AARCH64 #26731
Comments
I built tensorflow-lite on a Raspberry Pi 3B+ and hit this problem.
It also happened to me when cross-compiling for Raspberry Pi
I have been doing some testing; I share this here in case it helps. I get the error during the compilation when executing.
But, if I remove, I also noticed that when executing
I had the exact same error on a Pine64 A64+ board.
Same problem with the master branch of tensorflow on an ARMv8 platform.
It also happened to me when cross-compiling for an ARMv8 platform.
Same for me.
I think the code was not tested with newer gcc. If you build it with gcc or clang for
it goes well. To build it for aarch64 machines running Linux with gcc, either natively or cross-compiling, as suggested in the error message, I'll submit a PR for this issue later.
It seems the problem is gone after 152095e. Those who met the problem may want to
@freedomtan I tried it again on my Pine64 A64+ with the process described in the guide, and this time it ended with a different error:
nnapi_delegate.cc:(.text+0x28): undefined reference to `NnApiImplementation()'
/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIAllocation::NNAPIAllocation(char const*, tflite::ErrorReporter*)':
nnapi_delegate.cc:(.text+0x18c): undefined reference to `NnApiImplementation()'
/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::NNAPIDelegate::~NNAPIDelegate()':
nnapi_delegate.cc:(.text+0x200): undefined reference to `NnApiImplementation()'
nnapi_delegate.cc:(.text+0x21c): undefined reference to `NnApiImplementation()'
/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(nnapi_delegate.o): In function `tflite::addTensorOperands(tflite::Subgraph*, ANeuralNetworksModel*, unsigned int*, std::vector<long, std::allocator<long> >*)':
nnapi_delegate.cc:(.text+0x298): undefined reference to `NnApiImplementation()'
/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(nnapi_delegate.o):nnapi_delegate.cc:(.text+0x578): more undefined references to `NnApiImplementation()' follow
collect2: error: ld returned 1 exit status
tensorflow/lite/tools/make/Makefile:227: recipe for target '/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/bin/minimal' failed
make: *** [/home/pine/Downloads/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/bin/minimal] Error 1

Edit 1: Maybe this is the same as issue 25120?
Edit 2: Indeed, this appears to be issue 25120. I did as suggested there and modified the Makefile to set
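For context, the workaround discussed in issue 25120 amounts to building without the NNAPI delegate so that `NnApiImplementation()` is never referenced. The variable name below is an assumption based on Makefiles of that era, not a quote from the issue; check your checkout's Makefile for the exact knob:

```shell
# Assumed workaround (variable name may differ between TF versions):
# disable the NNAPI delegate in the Makefile-based build.
make -f tensorflow/lite/tools/make/Makefile BUILD_WITH_NNAPI=false -j4
```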
I have a travis-ci build running to build tflite and can reproduce the error on the latest commit: https://travis-ci.org/kmader/tflite_lib_builder/jobs/529685375
Hi, I still have the same error under both the latest master branch and |
@freedomtan
/usr/bin/llvm-ar-6.0: creating /home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a
/usr/bin/clang++ -O3 -DNDEBUG -fPIC --std=c++11 -march=armv8-a -funsafe-math-optimizations -ftree-vectorize -fPIC -I. -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/../../../../../ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/../../../../../../ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/eigen -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/absl -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/gemmlowp -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/neon_2_sse -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/farmhash/src -I -I/usr/local/include \
-o /home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/bin/minimal /home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/obj/tensorflow/lite/examples/minimal/minimal.o \
/home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a -Wl,--no-export-dynamic -Wl,--exclude-libs,ALL -Wl,--gc-sections -Wl,--as-needed -lrt -lstdc++ -lpthread -lm -ldl
/home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(audio_spectrogram.o): In function `flexbuffers::Reference::AsUInt64() const':
audio_spectrogram.cc:(.text._ZNK11flexbuffers9Reference8AsUInt64Ev[_ZNK11flexbuffers9Reference8AsUInt64Ev]+0x2f0): undefined reference to `flatbuffers::ClassicLocale::instance_'
audio_spectrogram.cc:(.text._ZNK11flexbuffers9Reference8AsUInt64Ev[_ZNK11flexbuffers9Reference8AsUInt64Ev]+0x2f4): undefined reference to `flatbuffers::ClassicLocale::instance_'
/home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/lib/libtensorflow-lite.a(while.o): In function `flexbuffers::Reference::AsInt64() const':
while.cc:(.text._ZNK11flexbuffers9Reference7AsInt64Ev[_ZNK11flexbuffers9Reference7AsInt64Ev]+0x2f0): undefined reference to `flatbuffers::ClassicLocale::instance_'
while.cc:(.text._ZNK11flexbuffers9Reference7AsInt64Ev[_ZNK11flexbuffers9Reference7AsInt64Ev]+0x2f4): undefined reference to `flatbuffers::ClassicLocale::instance_'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
tensorflow/lite/tools/make/Makefile_clang:264: recipe for target '/home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/bin/minimal' failed
make: *** [/home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/bin/minimal] Error 1
make: *** Waiting for unfinished jobs....

And after adding -flax-vector-conversions:

aarch64-linux-gnu-g++ -O3 -DNDEBUG -fPIC -flax-vector-conversions --std=c++11 -march=armv8-a -funsafe-math-optimizations -ftree-vectorize -fPIC -I. -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/../../../../../ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/../../../../../../ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/ -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/eigen -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/absl -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/gemmlowp -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/neon_2_sse -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/farmhash/src -I/home/pi/code/tensorflow/tensorflow/lite/tools/make/downloads/flatbuffers/include -I -I/usr/local/include -c tensorflow/lite/kernels/dequantize.cc -o /home/pi/code/tensorflow/tensorflow/lite/tools/make/gen/aarch64_armv8-a/obj/tensorflow/lite/kernels/dequantize.o
In file included from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8.h:23:0,
from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_multithread.h:22,
from tensorflow/lite/kernels/depthwise_conv.cc:28:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h: In static member function ‘static void tflite::optimized_ops::depthwise_conv::KernelMacroBlock<(tflite::DepthwiseConvImplementation)3, (tflite::DepthwiseConvDepthMultiplication)0, 2>::Run(const int8*, const int8*, const int32*, uint8*, const tflite::optimized_ops::depthwise_conv::DepthwiseConvDotProdParams*)’:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:8255:3: error: x29 cannot be used in asm here
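On AArch64, x29 is the frame-pointer register, so GCC refuses to hand it to inline asm while a frame pointer is in use. A commonly suggested workaround (an assumption here, not something confirmed in this thread; the variable name is also an assumption) is to compile with the frame pointer omitted:

```shell
# Assumed workaround: free up x29 for the handwritten asm by omitting
# the frame pointer (flag-passing mechanism may differ per TF version).
make -f tensorflow/lite/tools/make/Makefile EXTRA_CXXFLAGS="-fomit-frame-pointer" -j4
```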
@PhilipXue try adding
@freedomtan
It seems that flatbuffers is to blame; I am trying both the latest GitHub flatbuffers and the one downloaded by
To resolve the following two issues on a linux_aarch64 system: 1. prohibited conversions between vectors (tensorflow#26731 (comment)) 2. prohibited explicit use of frame pointer register x29 in asm (tensorflow#26731 (comment))
I submitted a pull request, #29515, that can hopefully resolve the two issues.
I have the same issue. Any solutions yet?
Does anyone have a workaround for the flatbuffers error? I also have that problem.
I don't have the flatbuffer issue when applying the patch at #29515 and using gcc. Do you guys have to use llvm?
Haven't really checked what happened to cmake, |
We're working on that internally.
Hi, can you sync to head and try again? Thanks!
I successfully built head: https://github.com/tensorflow/tensorflow/tree/fc7bce9b4ada6ef123b899ed88889923c9fafae6 Thanks!
Marking this issue as fixed; please reopen if any other issue occurs.
System information
Describe the problem
I am trying to build Tensorflow Lite for ARM64 boards.
I followed the instructions on https://tensorflow.google.cn/lite/guide/build_arm64 and executed the following commands:
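For reference, the guide linked above described roughly the following steps at the time; this is a reconstruction based on the guide, not the poster's exact command history, and script names may differ between TF versions:

```shell
# Clone the sources and fetch TFLite's vendored dependencies
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
./tensorflow/lite/tools/make/download_dependencies.sh

# Build the static library and example binaries for AArch64
./tensorflow/lite/tools/make/build_aarch64_lib.sh
```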
But at the last step I got lots of errors such as:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3823:22: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
filter_reg_0_b = vdupq_n_u8(kSignBit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3823:22: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ in assignment
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3824:22: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ in assignment
filter_reg_1_b = vdupq_n_u8(kSignBit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3825:22: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ in assignment
filter_reg_2_b = vdupq_n_u8(kSignBit);
^
In file included from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8.h:21:0,
from tensorflow/lite/kernels/depthwise_conv.cc:25:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3828:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_0_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_0_a, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3830:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_0_b = vld1q_lane_s8x8(filter_block_ptr, filter_reg_0_b, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3832:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_0_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_0_a, 1);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3834:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_1_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_1_a, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3836:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_1_b = vld1q_lane_s8x8(filter_block_ptr, filter_reg_1_b, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3838:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_1_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_1_a, 1);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3840:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_2_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_2_a, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3842:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_2_b = vld1q_lane_s8x8(filter_block_ptr, filter_reg_2_b, 0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:50:71: error: cannot convert ‘int8x16_t {aka __vector(16) signed char}’ to ‘uint64x2_t {aka __vector(2) long unsigned int}’ for argument ‘2’ to ‘uint64x2_t vld1q_lane_u64(const uint64_t*, uint64x2_t, int)’
vld1q_lane_u64(reinterpret_cast<const uint64_t*>(src), reg, lane_num)
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3844:24: note: in expansion of macro ‘vld1q_lane_s8x8’
filter_reg_2_a = vld1q_lane_s8x8(filter_block_ptr, filter_reg_2_a, 1);
^
In file included from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8.h:21:0,
from tensorflow/lite/kernels/depthwise_conv.cc:25:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3846:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_0_a = veorq_s8(filter_reg_0_a, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3847:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_0_b = veorq_s8(filter_reg_0_b, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3848:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_1_a = veorq_s8(filter_reg_1_a, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3849:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_1_b = veorq_s8(filter_reg_1_b, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3850:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_2_a = veorq_s8(filter_reg_2_a, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3851:57: error: cannot convert ‘const uint8x16_t {aka const __vector(16) unsigned char}’ to ‘int8x16_t {aka __vector(16) signed char}’ for argument ‘2’ to ‘int8x16_t veorq_s8(int8x16_t, int8x16_t)’
filter_reg_2_b = veorq_s8(filter_reg_2_b, sign_bit);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h: In static member function ‘static void tflite::optimized_ops::depthwise_conv::PackMacroBlock<(tflite::DepthwiseConvImplementation)3, (tflite::DepthwiseConvDepthMultiplication)0, 0>::PackMacroBlockNeon(const uint8*, int8*, const tflite::optimized_ops::depthwise_conv::DepthwiseConvDotProdParams*)’:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3954:53: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘const int8x16_t {aka const __vector(16) signed char}’ in initialization
const int8x16_t perm_data_0 = vld1q_u8(perm_data);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3955:58: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘const int8x16_t {aka const __vector(16) signed char}’ in initialization
const int8x16_t perm_data_1 = vld1q_u8(perm_data + 16);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3956:58: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘const int8x16_t {aka const __vector(16) signed char}’ in initialization
const int8x16_t perm_data_2 = vld1q_u8(perm_data + 32);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:3957:58: error: cannot convert ‘uint8x16_t {aka __vector(16) unsigned char}’ to ‘const int8x16_t {aka const __vector(16) signed char}’ in initialization
const int8x16_t perm_data_3 = vld1q_u8(perm_data + 48);
How can I fix it?