Support cross compilation for ARM Linux #36

Closed
llhe opened this issue Jul 4, 2018 · 50 comments

Comments

@llhe
Member

llhe commented Jul 4, 2018

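If you need this, please say so here. It would also help to provide:

  1. Chip type and specifications
  2. Linux version
  3. Cross-compiler version
  4. If the GPU is supported, the OpenCL version and the path to the shared library (32/64-bit)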
@zhy520xp

zhy520xp commented Jul 4, 2018

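For cross compilation there should be no need to provide the chip type and specifications. In principle, as long as the cross-compilation toolchain supports C++11 (MACE uses C++11), a tutorial on how to cross compile is all that is needed.
Besides, there are many kinds of chips, and each one has a different cross-compilation toolchain...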

@leogift

leogift commented Jul 4, 2018

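Thanks for the great work on MACE. Demand for ARM Linux is huge; supporting armv8/aarch64 plus Mali would cover most cases. An example:
1. RK3399, dual A72 + four A53
2. Ubuntu 16.04
3. aarch64-linux-g**
4. OpenCL 1.2

Thanks again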

@llhe
Member Author

llhe commented Jul 4, 2018

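@zhy520xp
We need to provide a guide for the bazel toolchain; or, if you get it working, please report back.

We also need to confirm the state of OpenCL support and whether compatibility is feasible, because systems differ considerably. We have no development environment for this, so we would like to collect information. Likewise, if you get it working, please report back. Thank you very much.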

@colorfulCloud

as @leogift mentioned, rk3399 is widely used in industry

@xiaqing10

Indeed, the RK3399 is really needed.

@xiaqing10

https://github.com/zhy520xp/mace-makefile-project has cross compilation working; you can use it as a reference.

llhe added the "feature request" label Jul 4, 2018
@zhy520xp

zhy520xp commented Jul 5, 2018

For embedded platforms such as the 3559A and 3536, how do we build something that can run the GPU version? Should these be supported?

@hbwangjinwu

The bazel build is currently stuck on compiling protobuf. The MACE build invokes the tool produced by cross-compiling protobuf, which results in an exec format error.

@hbwangjinwu

I built the MACE library on ARM Linux with bazel, but it takes a long time to run. Is NEON enabled by default in the bazel rules?

@hbwangjinwu

@llhe I have now mostly worked out how to build with bazel, but the run time on the 3516D is not ideal. I can share the method I put together.

@llhe
Member Author

llhe commented Jul 23, 2018

@hbwangjinwu Great, could you send a PR? NEON has to be enabled explicitly; see the macro definition here: https://github.com/XiaoMi/mace/blob/master/mace/kernels/arm/conv_2d_neon_15x1.cc#L76

@hbwangjinwu

https://github.com/hbwangjinwu/mace_cross_compile_guide contains the guide and my cross-compilation settings.
@llhe I still don't know how to turn on -DMACE_ENABLE_NEON through the bazel rules.

@llhe
Member Author

llhe commented Jul 23, 2018

Add --define neon=true to the bazel command.

@madhavajay
Contributor

What about the ASUS TinkerBoard with RK3288?

@madhavajay
Contributor

I got it working on the TinkerBoard and made a fork of the Makefile repo with English instructions:
https://github.com/madhavajay/mace-makefile-project

#167

@shuxiao9058

@madhavajay Hi, I tried running the demo program from mace-makefile-project, but why does the CPU turn out to be faster than the GPU?

@zhy520xp

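@shuxiao9058 Generally speaking, when a mobile GPU and CPU run the same model, the GPU is somewhat slower. For example, the GPU of the 3288 delivers only half the performance of its CPU. This is very platform dependent: it comes down to which CPU model and which GPU model you have. The number of GPU cores also matters a lot; more cores means faster. For example, with the same Mali-G71, four cores are much faster than two.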

@madhavajay
Contributor

@shuxiao9058 @zhy520xp I don't know why the performance is as it is; perhaps I can improve it with different compile flags? However, 6 fps is much better than 0.8-1 fps on an RPi 3, so I think the acceleration from MACE is fairly impressive. Any suggestions for speeding it up would be appreciated. I read that Winograd is theoretically 2.2x faster?

@zhy520xp

@madhavajay If your model is deployed on the CPU, you can try NCNN (https://github.com/Tencent/ncnn). NCNN uses Winograd and 8-bit quantization for convolution computation. It is worth a try!

@madhavajay
Contributor

@zhy520xp okay great I will look at it, thank you! :)

@madhavajay
Contributor

@zhy520xp do you know if it supports SSD MobileNet architecture?

@llhe
Member Author

llhe commented Sep 21, 2018

We will add a Linaro toolchain soon, which will enable official cross compilation for ARM Linux.

@llhe
Member Author

llhe commented Sep 21, 2018

A default ARM cross-compiler toolchain based on Linaro will be added soon, supporting cross compilation for ARM Linux.

@tataganesh95

tataganesh95 commented Sep 24, 2018

@llhe Would the MACE documentation be updated with this? And can the build command be added to build-standalone-lib.sh? Thank you!

@llhe
Member Author

llhe commented Sep 25, 2018

@tataganesh95 It will be integrated into the tools soon, and the documents will be updated accordingly.

Before that, you can simply build it with the following commands (not tested yet):

bazel build -s --config aarch64_linux --define openmp=true --define opencl=true --define neon=true //mace/libmace:libmace.so

Note: there is an issue with armeabi-v7a + NEON, which will be resolved soon.
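
A minimal sketch, in Python, of how the command above is put together from its options. The helper name bazel_build_command is hypothetical and not part of MACE; the config name, the defines, and the target are the ones quoted in this thread, and nothing here has been verified against a real toolchain.

```python
import shlex


def bazel_build_command(config, target, defines):
    """Assemble a `bazel build` command line as a list of arguments.

    `defines` maps a feature name (e.g. "neon") to True/False and is
    turned into repeated `--define name=true|false` options.
    """
    cmd = ["bazel", "build", "-s", "--config", config]
    for name, enabled in defines.items():
        cmd += ["--define", f"{name}={'true' if enabled else 'false'}"]
    cmd.append(target)
    return cmd


# Values taken from the command quoted in the thread.
cmd = bazel_build_command(
    "aarch64_linux",
    "//mace/libmace:libmace.so",
    {"openmp": True, "opencl": True, "neon": True},
)
print(shlex.join(cmd))
```

Passing {"opencl": False} instead would produce a CPU-only variant of the same command; whether a given combination builds depends on the toolchain and the MACE revision.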

@madhavajay
Contributor

@llhe You guys are awesome! Will test this soon. I assume that this will build on Ubuntu and just needs the correct ARM toolchain installed? Are there more instructions for someone like me who isn't experienced with cross compiling?

@madhavajay
Contributor

Where do the NEON headers come from? Are they part of the ARM toolchain?

@tataganesh95

@llhe Sorry for extending this discussion, but I am not very experienced with cross compilation and am a little confused about the model deployment process. Here are the steps I followed for cross compilation.
Note: I used the MACE lite edition Docker image.

  1. Changed build-standalone-lib.sh to cross-compile for aarch64. This creates the shared and static libraries.
  2. Converted the model (in this case, the MobileNetV2 model) using
    python tools/converter.py convert --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml with the configuration file present in mace-models. This generates the .pb and .data files.
  3. Now, I am assuming that running python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --example generates an executable in bazel-bin, and that this executable, along with the static and shared libraries, is to be deployed on the target machine? (Target OS: Linux; target architecture: aarch64.)
    Thank you!

llhe changed the title from "ARM Linux 开发板支持" (ARM Linux development board support) to "Support cross compilation for ARM Linux" Sep 26, 2018
@llhe
Member Author

llhe commented Sep 26, 2018

@tataganesh95 Do you need Android aarch64 or Linux aarch64? They are different ABIs and need different toolchains. Currently the Android build is well supported and you can follow the steps in the documents.

This issue addresses the Linux aarch64 build, which is not fully supported yet (you can use the previously mentioned bazel command to compile). The Python tools wrapper is not well supported for it at the moment: for example, tools/converter.py may not work, because it assumes adb is used to connect to the device, which is not true for Linux aarch64 boards. We will be working on these tasks.

@llhe
Member Author

llhe commented Sep 26, 2018

@tataganesh95 If you want to try ARM Linux aarch64 before all the tools are ready, you can build libmace.so with bazel build -s --config aarch64_linux --define openmp=true --define opencl=true --define neon=true //mace/libmace:libmace.so and do the model conversion using the current tools.

@tataganesh95

Do you need Android aarch64 or Linux aarch64?

Linux aarch64
Cool! I will wait for Linux aarch64 to be fully supported. I was just trying to understand whether I am cross-compiling mace for linux aarch64 correctly.

if you want to try with ARM Linux aarch64 before all the tools ready, you can build libmace.so by bazel build -s --config aarch64_linux --define openmp=true --define opencl=true --define neon=true //mace/libmace:libmace.so and make the model conversion using the current tools.

I have done the same, and ran the example script as well. But since my target OS is Linux, I am not sure how I am supposed to run example.cc. I could see that an executable named example_static is created when I run python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --example. When I ran this executable on my aarch64 device, I was able to pass command-line parameters to it (--input_node, --output_node, --device, etc.) and run it (I am still not getting the desired output for MobileNet-V2, though).

I just wanted to know whether the steps I followed for cross-compiling MACE and running the example on the target device are right, or whether I have missed something (the steps are listed in my previous comment).

Eagerly looking forward to tools that can facilitate cross compilation for Arm linux! Thank you once again for such a prompt response!

@llhe
Member Author

llhe commented Sep 26, 2018

@tataganesh95 The result of python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --example is targeted at the Android ABI now. The behavior is undefined when you run the binary on a normal ARM Linux system. You can check the difference here: https://wiki.linaro.org/WorkingGroups/ToolChain/FAQ#What_is_the_difference_between_arm-linux-androideabi_arm-linux-gnueabi_toolchain_linux_toolchain.3F
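
One way to tell which ABI a binary targets is to look at its ELF header and program interpreter. The sketch below is an illustration only and is not part of MACE: the function names elf_info and looks_like_android are hypothetical, and it assumes that Android executables name /system/bin/linker or /system/bin/linker64 as their interpreter, while glibc-based aarch64 Linux uses /lib/ld-linux-aarch64.so.1.

```python
import struct

PT_INTERP = 3  # program header type holding the interpreter path
MACHINES = {40: "arm", 62: "x86_64", 183: "aarch64"}  # e_machine values


def elf_info(data):
    """Return (machine, interpreter) for the raw bytes of an ELF file.

    interpreter is None when the file has no PT_INTERP segment,
    for example a fully static executable.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(end + "H", data, 18)
    if is64:
        (phoff,) = struct.unpack_from(end + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        (phoff,) = struct.unpack_from(end + "I", data, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 42)

    interp = None
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is64:  # p_offset at +8, p_filesz at +32
            (off,) = struct.unpack_from(end + "Q", data, base + 8)
            (size,) = struct.unpack_from(end + "Q", data, base + 32)
        else:     # p_offset at +4, p_filesz at +16
            (off,) = struct.unpack_from(end + "I", data, base + 4)
            (size,) = struct.unpack_from(end + "I", data, base + 16)
        interp = data[off:off + size].rstrip(b"\0").decode()
        break
    return MACHINES.get(machine, str(machine)), interp


def looks_like_android(interp):
    """Heuristic: Android's dynamic linker lives under /system/bin."""
    return interp is not None and interp.startswith("/system/bin/linker")


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            machine, interp = elf_info(f.read())
        print(path, machine, interp, "android" if looks_like_android(interp) else "")
```

Running the script with the path to a binary such as example_static prints its machine type and interpreter. The check is only a heuristic: it says nothing about which C library symbols the binary expects.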

@tataganesh95

tataganesh95 commented Sep 26, 2018

@llhe I wasn't aware of that. Thank you! I will wait till the tools for ARM linux are built.

@madhavajay
Contributor

@tataganesh95 I was able to skip the Android stuff just by installing adb which is really easy:
#176

I didn't install the NDK. To get rid of the adb error I installed adb from here:
https://askubuntu.com/questions/34702/how-do-i-set-up-android-adb

Then:

python tools/converter.py convert --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml

@tataganesh95

@madhavajay I was able to run the convert script and obtained the .data and .pb files (for MobileNet). But I was trying to run example.cc through the same script. With a few minor changes I was able to run that as well, and that in turn generated an executable, example_static. I ran this on my target machine and it did run successfully, but it generated the wrong output for a sample image (I was testing the Grace Hopper image), so I was just wondering whether I missed a step or am doing something wrong here.

@madhavajay
Contributor

@tataganesh95 You have gotten further than me. I literally just got it compiled and ran the test code in the Makefile project. My task was to evaluate whether it was possible; I haven't had a chance to actually use it yet. Sorry. :) Please let me know if you solve the issue, as I will likely have the same problems.

@pathwayai

@llhe Is there any update on when Linux aarch64 support might be ready? Do you have any provisional benchmark results for it?

@madhavajay
Contributor

I ran this:

$ bazel build --config arm_linux --define openmp=true --define opencl=true --define neon=true //mace/libmace:libmace.so --sandbox_debug

Got this issue:

arm-linux-gnueabihf-gcc: error: mace_version_script.lds: No such file or directory

Full output:

INFO: Analysed target //mace/libmace:libmace.so (0 packages loaded).
INFO: Found 1 target...
ERROR: /home/pathwayai/mace/mace/libmace/BUILD:47:1: Linking of rule '//mace/libmace:libmace.so' failed (Exit 1): linux-sandbox failed: error executing command
  (cd /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/execroot/mace && \
  exec env - \
    PATH=/home/pathwayai/bin:/home/pathwayai/.local/bin:/home/pathwayai/caffe/build/install/bin:/home/pathwayai/bin:/home/pathwayai/.local/bin:/home/pathwayai/caffe/build/install/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/pathwayai/bin:/home/pathwayai/bin \
    PWD=/proc/self/cwd \
    TMPDIR=/tmp \
  /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/execroot/mace/_bin/linux-sandbox -t 15 -w /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace -w /tmp -w /dev/shm -D -- tools/arm_compiler/linaro_linux_gcc/arm-linux-gnueabihf-gcc -shared -o bazel-out/armeabi-v7a-fastbuild/bin/mace/libmace/libmace.so -Wl,-soname,libmace.so -Wl,--version-script mace_version_script.lds -fopenmp '--sysroot=external/gcc_linaro_7_3_1_arm_linux_gnueabihf/arm-linux-gnueabihf/libc' '-fuse-ld=gold' -Wl,-no-as-needed -no-canonical-prefixes -v -Wl,-z,relro,-z,now '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S -Wl,@bazel-out/armeabi-v7a-fastbuild/bin/mace/libmace/libmace.so-2.params)
src/main/tools/linux-sandbox.cc:154: linux-sandbox-pid1 has PID 28549
src/main/tools/linux-sandbox-pid1.cc:175: working dir: /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace
src/main/tools/linux-sandbox-pid1.cc:194: writable: /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace
src/main/tools/linux-sandbox-pid1.cc:194: writable: /tmp
src/main/tools/linux-sandbox-pid1.cc:194: writable: /dev/shm
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /dev
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /dev/pts
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /dev/shm
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /dev/hugepages
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /dev/mqueue
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /run
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /run/lock
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /run/user/1005
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/kernel/security
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/systemd
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/pids
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/rdma
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/cpu,cpuacct
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/blkio
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/perf_event
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/devices
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/cpuset
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/net_cls,net_prio
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/hugetlb
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/freezer
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/cgroup/memory
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/pstore
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/kernel/debug
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/kernel/debug/tracing
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/fs/fuse/connections
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /sys/kernel/config
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /proc
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /proc/sys/fs/binfmt_misc
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /proc/sys/fs/binfmt_misc
src/main/tools/linux-sandbox-pid1.cc:265: remount ro: /var/lib/lxcfs
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /tmp
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /dev/shm
src/main/tools/process-tools.cc:118: sigaction(32, &sa, nullptr) failed
src/main/tools/process-tools.cc:118: sigaction(33, &sa, nullptr) failed
arm-linux-gnueabihf-gcc: error: mace_version_script.lds: No such file or directory
Using built-in specs.
COLLECT_GCC=external/gcc_linaro_7_3_1_arm_linux_gnueabihf/bin/arm-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=external/gcc_linaro_7_3_1_arm_linux_gnueabihf/bin/../libexec/gcc/arm-linux-gnueabihf/7.3.1/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: '/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/snapshots/gcc.git~linaro-7.3-2018.05/configure' SHELL=/bin/bash --with-mpc=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu --with-mpfr=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu --with-gmp=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu --with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto --enable-shared --without-included-gettext --enable-nls --with-system-zlib --disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id --disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu --enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no --with-isl=no --disable-multilib --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a --enable-threads=posix --enable-multiarch --enable-libstdcxx-time=yes --enable-gnu-indirect-function --with-build-sysroot=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/sysroots/arm-linux-gnueabihf --with-sysroot=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu/arm-linux-gnueabihf/libc --enable-checking=release --disable-bootstrap --enable-languages=c,c++,fortran,lto --build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu --target=arm-linux-gnueabihf 
--prefix=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu
Thread model: posix
gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05)
src/main/tools/linux-sandbox-pid1.cc:437: waitpid returned 2
src/main/tools/linux-sandbox-pid1.cc:457: child exited with code 1
src/main/tools/linux-sandbox.cc:204: child exited normally with exitcode 1
Target //mace/libmace:libmace.so failed to build
INFO: Elapsed time: 28.551s, Critical Path: 0.29s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

@madhavajay
Contributor

@llhe Any idea why I can't build the same cross-compiled libmace.so that the GitLab CI file says it builds?

@pathwayai

Anyone know how to solve the above? I'm also trying to solve this.

@llhe
Member Author

llhe commented Oct 16, 2018

@pathwayai Do you have the same problem?

@llhe
Member Author

llhe commented Oct 16, 2018

@madhavajay
Can you reproduce this error by executing this in the shell?

cd /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/execroot/mace

/home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/execroot/mace/_bin/linux-sandbox -t 15 -w /home/pathwayai/.cache/bazel/_bazel_pathwayai/c7986ad00fc23123aa9184aa80c62d45/sandbox/linux-sandbox/1/execroot/mace -w /tmp -w /dev/shm -D -- tools/arm_compiler/linaro_linux_gcc/arm-linux-gnueabihf-gcc -shared -o bazel-out/armeabi-v7a-fastbuild/bin/mace/libmace/libmace.so -Wl,-soname,libmace.so -Wl,--version-script mace_version_script.lds -fopenmp '--sysroot=external/gcc_linaro_7_3_1_arm_linux_gnueabihf/arm-linux-gnueabihf/libc' '-fuse-ld=gold' -Wl,-no-as-needed -no-canonical-prefixes -v -Wl,-z,relro,-z,now '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S -Wl,@bazel-out/armeabi-v7a-fastbuild/bin/mace/libmace/libmace.so-2.params

@Alnlll

Alnlll commented Dec 12, 2018

Based on this:

Before that, you can simply build it with the following commands (not tested yet): bazel build -s --config aarch64_linux --define openmp=true --define opencl=true --define neon=true //mace/libmace:libmace.so

I modified the "tools/build-standalone-lib.sh" to get aarch64 libraries:

echo "build shared lib for aarch64 + cpu"
bazel build -s --config optimization --config aarch64_linux --define openmp=true --define opencl=false --define neon=true //mace/libmace:libmace_dynamic
# bazel build --config android --config optimization mace/libmace:libmace_dynamic --define neon=true --define openmp=true --define opencl=true --define quantize=true --cpu=arm64-v8a
cp bazel-bin/mace/libmace/libmace.so $LIB_DIR/aarch64/cpu/

if [[ "$OSTYPE" != "darwin"* ]];then
	echo "build shared lib for linux-x86-64"
	bazel build mace/libmace:libmace_dynamic --config optimization --define quantize=true --define openmp=true
	cp bazel-bin/mace/libmace/libmace.so $LIB_DIR/linux-x86-64/
fi

then got a benchmark result on mobilenet-v1:

---------------------------------------------------------------------
                               Warm Up
----------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |   std |
----------------------------------------------------------------------
|     1 |   598.455 |  598.455 | 598.455 | 598.455 | 598.455 | 0.000 |
----------------------------------------------------------------------

-----------------------------------------------------------------------
                         Run without statistics
------------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |     std |
------------------------------------------------------------------------
|    18 |   566.921 |  566.962 | 566.583 | 568.177 | 566.871 | 315.777 |
------------------------------------------------------------------------

-----------------------------------------------------------------------
                          Run with statistics
------------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |     std |
------------------------------------------------------------------------
|    18 |   568.146 |  566.793 | 566.776 | 570.255 | 567.344 | 851.766 |
------------------------------------------------------------------------

But compared with the benchmark result for the same model using the latest MACE release, without building the libraries for aarch64, this is a severe latency degradation:

---------------------------------------------------------------------
                               Warm Up
----------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |   std |
----------------------------------------------------------------------
|     1 |   229.561 |  229.561 | 229.561 | 229.561 | 229.561 | 0.000 |
----------------------------------------------------------------------

-------------------------------------------------------------------------
                          Run without statistics
--------------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |       std |
--------------------------------------------------------------------------
|    48 |   216.909 |  205.191 | 204.749 | 424.953 | 210.717 | 31344.958 |
--------------------------------------------------------------------------

------------------------------------------------------------------------
                          Run with statistics
-------------------------------------------------------------------------
| round | first(ms) | curr(ms) | min(ms) | max(ms) | avg(ms) |      std |
-------------------------------------------------------------------------
|    49 |   212.687 |  205.001 | 204.994 | 212.687 | 205.952 | 1464.004 |
-------------------------------------------------------------------------

Question

  • All tests were run on the host machine (Ubuntu 16.04, x86_64, Intel).
  • Is this latency degradation normal?
  • Or am I doing something wrong?
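
To put a number on the gap, the average latencies from the two "Run with statistics" tables above can be compared directly. This is plain arithmetic on the reported figures; the variable names are illustrative only.

```python
# Average latency (ms) from the "Run with statistics" tables above.
cross_build_avg_ms = 567.344  # library built with --config aarch64_linux
release_avg_ms = 205.952      # latest release, no aarch64 libraries built

slowdown = cross_build_avg_ms / release_avg_ms
print(f"slowdown: {slowdown:.2f}x")  # prints: slowdown: 2.75x

# The same figures expressed as inferences per second.
print(f"cross build: {1000 / cross_build_avg_ms:.2f} runs/s")
print(f"release:     {1000 / release_avg_ms:.2f} runs/s")
```

The ratio only summarizes the two runs reported here; it does not say which build option is responsible for the difference.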

nolanliou added a commit that referenced this issue Dec 13, 2018
1. Abstract android and arm linux to one format
2. Support cross compilation for ARM linux
3. Related issue #36
@llhe
Member Author

llhe commented Dec 14, 2018

@Alnlll You can try with the official support. If there is an abnormal performance degradation, please file a new issue.

@llhe
Member Author

llhe commented Dec 14, 2018

Closing this, as it is now officially supported.

llhe closed this as completed Dec 14, 2018
@ysyyork

ysyyork commented Dec 15, 2018

Has anyone already got benchmarks on the RK33XX series? We are using the RK3399 and would love to leverage the Mali-T860 on the board, but it seems a hard task. I just want to get a sense of whether the GPU can actually outperform the CPU, because I noticed in the conversation above that the GPU is not always faster than the CPU on mobile platforms. BTW, this is really an awesome project! Thanks, guys!

@nolanliou
Member

@ysyyork We have tested mobilenet-v1 on the RK3399 with the GPU. Only the buffer-based OpenCL implementation could outperform the CPU, but only a small number of ops support buffer-based OpenCL so far. For detailed usage, please refer to the documentation.

@ysyyork

ysyyork commented Dec 19, 2018

@nolanliou thanks so much! This is very helpful
