[feature] Support Cross Compiling with tfcompile #9661

Closed
cancan101 opened this issue May 4, 2017 · 21 comments
Labels
comp:xla, stale, stat:contribution welcome, type:feature

Comments

@cancan101
Contributor

TensorFlow (using XLA) is able to AOT-compile a graph using tfcompile. There does not seem to be a way to cross-compile the graph, or it is not documented (i.e. compile on OS X for deployment on iOS). (Related SO question.)

I suggest adding a means of performing this cross-compilation.

@skye
Member

skye commented May 4, 2017

I don't think anyone is currently working on this, but maybe @tatatodd can correct me. I'm gonna mark as contributions welcome for now.

@skye added the stat:contribution welcome and type:feature labels May 4, 2017
@petecoup

Can you get what you want by changing the --target_triple option handed to tfcompile? I don't have either target to see if the result works here... but from Linux, I can do:
--target_triple=thumbv7-apple-macosx10.6.7
and
--target_triple=thumbv7-apple-ios
each of which creates a Mach-O object (cross-compiling from a Linux host to iOS/macOS targets).
Or do you mean something else?
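For concreteness, a full cross-compiling invocation from a Linux host might look like this (a sketch only; the graph, config, and class names are placeholders, and the flags are the ones discussed later in this thread):

bazel-bin/tensorflow/compiler/aot/tfcompile \
    --graph=model.pb \
    --config=model.config.pbtxt \
    --cpp_class=foo::Model \
    --out_header=model.h \
    --out_object=model.o \
    --target_triple=thumbv7-apple-ios

The resulting model.o should then be a Mach-O object for the requested triple rather than an object for the build host.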

@lissyx
Contributor

lissyx commented Jul 25, 2017

I can confirm that doing what @petecoup suggests does work. I'm able to cross-compile from Linux amd64 targeting rpi3 hardware (though I still lack proper tooling to set --target_triple= automatically, changing it by hand works fine so far).

@ruppeshnalwaya1993

@petecoup @lissyx In which file do we need to make changes, tfcompile.bzl or tfcompile_main.cc?

I see the line " --target_triple=" + target_llvm_triple() + in tfcompile.bzl and flags.target_triple = "x86_64-pc-linux"; in tfcompile_main.cc.

I want to compile for both Android and iOS.

For iOS I first tried just replacing the line in tfcompile_main.cc with flags.target_triple = "arm64-apple-ios";. The bazel build completed successfully, but when I include the .h and .o in an iOS app, it says the .o is not built for arm64.

Next, in addition to the change in tfcompile_main.cc, when I also made the change in tfcompile.bzl by replacing the line with " --target_triple=" + "arm64-apple-ios" +, the bazel build itself failed with this error:

ERROR: /tensorflow/test_graph_build/BUILD:4:1: Linking of rule '//test_graph_build:test_graph_tfmatmul' failed: link_dynamic_library.sh failed: error executing command external/bazel_tools/tools/cpp/link_dynamic_library.sh no ignored ignored ignored /usr/bin/gcc -shared -o bazel-out/local-opt/bin/test_graph_build/libtest_graph_tfmatmul.so '-fuse-ld=gold' ... (remaining 7 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/usr/bin/ld.gold: error: bazel-out/local-opt/genfiles/test_graph_build/test_graph_tfmatmul.o:1:1: invalid character
collect2: error: ld returned 1 exit status
Target //test_graph_build:test_graph_tfmatmul failed to build

@lissyx
Contributor

lissyx commented Aug 28, 2017

@ruppeshnalwaya1993 This is what I have in our code, inside the tf_library() section:

tfcompile_flags = select({
        "//tensorflow:rpi3": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"'),
        "//conditions:default": str('')
    }),

So basically your BUILD file would contain:

$ cat native_client/BUILD 
load("//tensorflow/compiler/aot:tfcompile.bzl",
     "tf_library")

tf_library(
    name = "project_model",
    cpp_class = "class::method",
    graph = "model.pb",
    config = "tfcompile.config.pbtxt",
    tfcompile_flags = select({
        "//tensorflow:rpi3": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"'),
        "//conditions:default": str('')
    }),
)
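As an aside on how the branch gets picked: //tensorflow:rpi3 names a Bazel config_setting, and select() resolves to that branch when the build flags match the setting. A minimal sketch of the mechanism, with a hypothetical trigger flag (the real setting lives in TensorFlow's BUILD files and may be keyed differently):

config_setting(
    name = "rpi3",
    define_values = {"target_system": "rpi3"},  # hypothetical --define key
)

# which a build would then select with, e.g.:
#   bazel build --define=target_system=rpi3 //native_client:project_model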

@ruppeshnalwaya1993

@lissyx Thanks for your reply! I want to build for iOS, not Raspberry Pi. Nevertheless, to make sure I understand the process correctly, I tried to build using the changes that you suggested. One thing I am not sure of: given

tfcompile_flags = select({
        "//tensorflow:rpi3": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"'),
        "//conditions:default": str('')
    }),

how do I specify in the bazel build command in the terminal to build for //tensorflow:rpi3 rather than //conditions:default or anything else?

Since I was unsure how to specify the target in the terminal command, I set the default to be Raspberry Pi as well, like this:

tfcompile_flags = select({
        "//tensorflow:rpi3": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"'),
        "//conditions:default": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"')
    }),

I tried building the test graph mentioned on the TensorFlow AOT page by adding tfcompile_flags as written above to the BUILD file and running the command bazel build test_graph_tfmatmul. The BUILD file, kept in a directory //test_graph_build, is as follows:

load("//tensorflow/compiler/aot:tfcompile.bzl", "tf_library")
tf_library(
    name = "test_graph_tfmatmul",
    cpp_class = "foo::bar::MatMulComp",
    graph = "test_graph_tfmatmul.pb",
    config = "test_graph_tfmatmul.config.pbtxt",
    tfcompile_flags = select({
        "//tensorflow:ios": str('--target_triple="arm64-apple-ios"'),
        "//conditions:default": str('--target_triple="armv6-linux-gnueabihf" --target_cpu="cortex-a53" --target_features="+neon-fp-armv8"')
    }),
)

But after a lot of time compiling, this throws an error as follows:

INFO: Reading 'startup' options from /etc/bazel.bazelrc: --batch
WARNING: /tensorflow/tensorflow/core/BUILD:1634:1: in includes attribute of cc_library rule //tensorflow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'tensorflow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /tensorflow/tensorflow/tensorflow.bzl:911:30.
INFO: Found 1 target...
INFO: From Executing genrule //test_graph_build:gen_test_graph_tfmatmul:
2017-08-29 20:20:29.702831: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
'+neon-fp-armv8' is not a recognized feature for this target (ignoring feature)
'+neon-fp-armv8' is not a recognized feature for this target (ignoring feature)
'+neon-fp-armv8' is not a recognized feature for this target (ignoring feature)
'+neon-fp-armv8' is not a recognized feature for this target (ignoring feature)
'+neon-fp-armv8' is not a recognized feature for this target (ignoring feature)
ERROR: /tensorflow/test_graph_build/BUILD:4:1: Linking of rule '//test_graph_build:test_graph_tfmatmul' failed: link_dynamic_library.sh failed: error executing command external/bazel_tools/tools/cpp/link_dynamic_library.sh no ignored ignored ignored /usr/bin/gcc -shared -o bazel-out/local-opt/bin/test_graph_build/libtest_graph_tfmatmul.so '-fuse-ld=gold' ... (remaining 7 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/usr/bin/ld.gold: fatal error: bazel-out/local-opt/genfiles/test_graph_build/test_graph_tfmatmul.o: unsupported ELF machine number 40
collect2: error: ld returned 1 exit status
Target //test_graph_build:test_graph_tfmatmul failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 16.549s, Critical Path: 3.46s

What would be the right way to build this?

@lissyx
Contributor

lissyx commented Aug 29, 2017

I'm not sure about your error. What does file bazel-out/local-opt/genfiles/test_graph_build/test_graph_tfmatmul.o say? It should be some ARM64/iOS object, I'd guess. However, the error '+neon-fp-armv8' is not a recognized feature for this target (ignoring feature) would indicate that you might be emitting ARMv6 code.

According to tensorflow/tensorflow.bzl, //tensorflow:ios should be the proper condition to check. The fact that you see those +neon-fp-armv8 errors would indicate that it does not get selected.

My best bet is that you should not be using ld.gold but your target's ld. In our case, we disable the test and benchmark binaries with gen_test=False, gen_benchmark=False. You should make sure you have properly configured TensorFlow for cross-compiling for iOS (you should not see the warnings about +neon-fp-armv8). And then you may want to disable the test and benchmark binaries, in case Bazel is still unable to properly cross-build them.
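For illustration, disabling those binaries on the earlier test target would look roughly like this (a sketch reusing the names from the BUILD file above; gen_test and gen_benchmark are the tf_library attributes just mentioned):

tf_library(
    name = "test_graph_tfmatmul",
    cpp_class = "foo::bar::MatMulComp",
    graph = "test_graph_tfmatmul.pb",
    config = "test_graph_tfmatmul.config.pbtxt",
    gen_test = False,       # skip the generated test binary
    gen_benchmark = False,  # skip the generated benchmark binary
)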

@ruppeshnalwaya1993

@lissyx I resolved the error by building tfcompile separately with the command bazel build tensorflow/compiler/aot:tfcompile and then using the tfcompile tool directly in the terminal like this:

bazel-bin/tensorflow/compiler/aot/tfcompile --graph=test_graph_build/test_graph_tfmatmul.pb --config=test_graph_build/test_graph_tfmatmul.config.pbtxt --cpp_class=foo::bar::MatMulComp --out_header=test_graph_build/test_graph_tfmatmul.h --out_object=test_graph_build/test_graph_tfmatmul.o --target_triple=arm64-apple-ios

The above command generated test_graph_tfmatmul.h and test_graph_tfmatmul.o for iOS arm64, which I link into the simple iOS example project. I copied the basic program to run the graph, as given on the TensorFlow AOT page, and built the app. But for some reason some symbols are not found and the linker gives this error:

Undefined symbols for architecture armv7:
  "___xla_cpu_runtime_EigenMatMulF32", referenced from:
      _entry in test_graph_tfmatmul.o
  "tensorflow::tfcompile::runtime::FreeContiguous(void*)", referenced from:
      foo::bar::MatMulComp::~MatMulComp() in RunModelViewController.o
  "tensorflow::tfcompile::runtime::MallocContiguousBuffers(long const*, unsigned long, void**, bool)", referenced from:
      foo::bar::MatMulComp::MatMulComp(foo::bar::MatMulComp::AllocMode) in RunModelViewController.o
  "xla::ExecutableRunOptions::set_intra_op_thread_pool(Eigen::ThreadPoolDevice const*)", referenced from:
      foo::bar::MatMulComp::set_thread_pool(Eigen::ThreadPoolDevice const*) in RunModelViewController.o
ld: symbol(s) not found for architecture armv7
clang: error: linker command failed with exit code 1 (use -v to see invocation)

I had already built the simple iOS example project once before, so I assume all the header search paths, linker paths, and libs are properly set in the project's build settings. But after additionally adding the .h and .o generated by AOT, along with the accompanying code to test it, the project is not building. What am I missing? Is there something additional I have to link?
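For reference, the basic program from the AOT tutorial looks roughly like this (a sketch following the public tfcompile documentation; the generated class is the foo::bar::MatMulComp from the config above, the include path assumes the generated header was copied next to the source, and the argument values are the tutorial's example data):

#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL

#include <algorithm>
#include <iostream>
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "test_graph_tfmatmul.h"  // header generated by tfcompile

int main(int argc, char** argv) {
  Eigen::ThreadPool tp(2);  // size the thread pool as appropriate
  Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

  foo::bar::MatMulComp matmul;
  matmul.set_thread_pool(&device);

  // Fill the two input buffers and run the compiled computation.
  const float args[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
  std::copy(args + 0, args + 6, matmul.arg0_data());
  std::copy(args + 6, args + 12, matmul.arg1_data());
  matmul.Run();

  std::cout << "result[0,0] = " << matmul.result0(0, 0) << std::endl;
  return 0;
}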

@lissyx
Contributor

lissyx commented Aug 30, 2017

@ruppeshnalwaya1993 Ah, I hit the same kind of thing. For now I just chose to build the related libs, link my binary against them, and ship them as shared objects. I am not convinced it is the proper way; I would have preferred that Bazel link them statically, but for now it's good enough for me. The targets are //tensorflow/compiler/aot:runtime //tensorflow/compiler/xla/service/cpu:runtime_matmul //tensorflow/compiler/xla:executable_run_options, and the resulting shared objects are -lruntime -lruntime_matmul -lexecutable_run_options under bazel-bin/tensorflow/compiler/[...] (subdirectories).
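Concretely, that amounts to something like the following (a sketch; the -L paths assume you link straight out of bazel-bin):

bazel build //tensorflow/compiler/aot:runtime \
    //tensorflow/compiler/xla/service/cpu:runtime_matmul \
    //tensorflow/compiler/xla:executable_run_options

# then link the app against the resulting shared objects, e.g.:
#   -Lbazel-bin/tensorflow/compiler/aot \
#   -Lbazel-bin/tensorflow/compiler/xla \
#   -Lbazel-bin/tensorflow/compiler/xla/service/cpu \
#   -lruntime -lruntime_matmul -lexecutable_run_options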

@ruppeshnalwaya1993

@lissyx What would the bazel build command be to build, say, //tensorflow/compiler/aot:runtime for an iOS or Android target? Just running bazel build tensorflow/compiler/aot:runtime builds the default libruntime.a, which is built for neither Android nor iOS.

@lissyx
Contributor

lissyx commented Aug 31, 2017

I don't know, I'm just documenting how I got things to work in my case.

@ruppeshnalwaya1993

@lissyx Ok, np. Thanks for your help. :)
@cancan101 Any update on cross-compilation documentation? Any comment from you will be very helpful.

@ruppeshnalwaya1993

@lissyx @cancan101 I got some success. I ran the following command to build libruntime for Android, and similarly built libruntime_matmul and libexecutable_run_options:

bazel build -c opt --cxxopt='-std=c++11' tensorflow/compiler/aot:runtime --crosstool_top=//external:android/crosstool --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a

And I succeeded in running the test program with the test graph.
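For completeness, the analogous commands for the other two libraries would presumably follow the same pattern (an untested sketch, reusing the exact flags from the command above):

bazel build -c opt --cxxopt='-std=c++11' tensorflow/compiler/xla/service/cpu:runtime_matmul --crosstool_top=//external:android/crosstool --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a
bazel build -c opt --cxxopt='-std=c++11' tensorflow/compiler/xla:executable_run_options --crosstool_top=//external:android/crosstool --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a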

Now I am trying to run the frozen and optimized graph of pix2pix (https://github.com/affinelayer/pix2pix-tensorflow), and the output is a buffer with all values zero. I double-checked that the input image values passed in are correct (floats between 0 and 1). Has anyone faced such an issue? Is there a way to debug AOT-generated code?

@ruppeshnalwaya1993

ruppeshnalwaya1993 commented Sep 11, 2017

@cancan101 @skye @petecoup I also checked with a graph that has just an input node and a following identity node as output. Even in this case the result from XLA is a buffer with all values either 0.000 or -nan.

@cassinaooo

@lissyx, I'm trying to do basically the same thing: cross-compile a tfcompile-built binary for a Raspberry Pi. However, I'm using the VGG16 network in Keras. How did you find the targets needed for your model (//tensorflow/compiler/aot:runtime, //tensorflow/compiler/xla/service/cpu:runtime_matmul, //tensorflow/compiler/xla:executable_run_options), and how did you compile them for ARM?

@powderluv
Contributor

@ruppeshnalwaya1993 were you able to get pix2pix running on iOS? How did you debug it further?

@ruppeshnalwaya1993

@powderluv Yes, I was able to run pix2pix using XLA AOT compilation, but I ultimately did not use it, as there were no speed gains compared to regular TensorFlow.
Anyway, other than the things mentioned above, I remember that I had to remove all the code and dependencies of normal TensorFlow for XLA to work; it seems the two can't function properly together. I have not checked recent XLA developments, though.

@lhx0612

lhx0612 commented Mar 1, 2018

@ruppeshnalwaya1993 can you share your BUILD file to cross-compile your model for both Android and iOS? I managed to use AOT to compile my model into a shared library and run it on Linux, but cannot find anything workable on Android or iOS. Thanks!

@xujialiang

@ruppeshnalwaya1993 can you share your BUILD file to cross-compile your model for both Android and iOS? I managed to use AOT to compile my model into a shared library and run it on Linux, but cannot find anything workable on Android or iOS. Thanks!

Can you share something?

@mohantym added the comp:xla label Oct 7, 2022
@github-actions

github-actions bot commented Apr 6, 2023

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Apr 6, 2023

github-actions bot commented Apr 6, 2024

This issue was closed because it has been inactive for 1 year.

github-actions bot closed this as completed Apr 6, 2024