TensorFlow Makefile

The recommended way to build TensorFlow from source is using the Bazel open-source build system. Sometimes this isn't possible. For example, if you are building for iOS, you currently need to use the Makefile. Other situations where Bazel may not be an option:

  • The build system may not have the RAM or processing power to support Bazel.
  • Bazel or its dependencies may not be available.
  • You may want to cross-compile for an unsupported target system.

This experimental project supplies a Makefile automatically derived from the dependencies listed in the Bazel project that can be used with GNU's make tool. With it, you can compile the core C++ runtime into a static library.

This static library will not contain:

  • Python or other language bindings
  • GPU support

You can target:

  • iOS
  • OS X (macOS)
  • Android
  • Raspberry Pi

You will compile tensorflow and protobuf libraries that you can link into other applications. You will also compile the benchmark application that will let you check your application.

Before you start (all platforms)

First, clone this TensorFlow repository.

You will need to download all dependencies as well. We have provided a script that does so, to be run (as with all commands) at the root of the repository:

tensorflow/contrib/makefile/download_dependencies.sh

You should only need to do this step once. It downloads the required libraries, such as Eigen, into the tensorflow/contrib/makefile/downloads/ folder.
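Because the download only needs to happen once, a build script can skip it when the folder is already populated. The following is a minimal sketch, not part of the repository: the function name ensure_deps is hypothetical, and it assumes that a non-empty downloads/ folder means the dependencies are complete.

```shell
# Hypothetical helper (not part of the repository): run the download
# script only when the downloads folder is missing or empty.
# $1 is the root of the TensorFlow checkout (defaults to the current directory).
ensure_deps() {
  tf_root="${1:-.}"
  dl_dir="$tf_root/tensorflow/contrib/makefile/downloads"
  if [ -d "$dl_dir" ] && [ -n "$(ls -A "$dl_dir" 2>/dev/null)" ]; then
    echo "dependencies already present in $dl_dir"
  else
    # The script is documented to run from the root of the repository.
    (cd "$tf_root" && tensorflow/contrib/makefile/download_dependencies.sh)
  fi
}
```

Call it as ensure_deps /path/to/tensorflow. Note that an interrupted download would also leave a non-empty folder, so delete downloads/ and run the script again if a later build step complains about missing sources.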

You should download the example graph from https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip.

Building on Linux

Note: This has only been tested on Ubuntu.

As a first step, you need to make sure the required packages are installed:

sudo apt-get install autoconf automake libtool curl make g++ unzip zlib1g-dev \
git python

You should then be able to run the build_all_linux.sh script to compile:

tensorflow/contrib/makefile/build_all_linux.sh

This should compile a static library in tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a, and create an example executable at tensorflow/contrib/makefile/gen/bin/benchmark.

Get the graph file, if you have not already:

mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception

To run the executable, use:

tensorflow/contrib/makefile/gen/bin/benchmark \
 --graph=$HOME/graphs/inception/tensorflow_inception_graph.pb

Android

First, you will need to download and unzip the Native Development Kit (NDK). You will not need to install the standalone toolchain, however.

Assign your NDK location to $NDK_ROOT:

export NDK_ROOT=/absolute/path/to/NDK/android-ndk-rxxx/
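A wrong or relative NDK_ROOT tends to surface only as a confusing error later in the build. As a rough sketch, a small check like the one below can catch it early. The function name check_ndk_root is hypothetical, and the check only confirms that the variable is set, absolute, and a directory; it does not verify the NDK contents.

```shell
# Hypothetical sanity check: NDK_ROOT must be set, absolute, and a directory.
check_ndk_root() {
  case "${NDK_ROOT:-}" in
    "")  echo "NDK_ROOT is not set" >&2; return 1 ;;
    /*)  ;;  # absolute path, keep going
    *)   echo "NDK_ROOT must be an absolute path: $NDK_ROOT" >&2; return 1 ;;
  esac
  if [ ! -d "$NDK_ROOT" ]; then
    echo "NDK_ROOT is not a directory: $NDK_ROOT" >&2
    return 1
  fi
}
```

For example, run check_ndk_root || exit 1 at the top of a build script, before any of the compile steps.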

Download the graph if you haven't already:

mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception

Then, execute the following:

tensorflow/contrib/makefile/download_dependencies.sh
tensorflow/contrib/makefile/compile_android_protobuf.sh -c
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB=`CC_PREFIX="${CC_PREFIX}" NDK_ROOT="${NDK_ROOT}" \
	tensorflow/contrib/makefile/compile_nsync.sh -t android -a armeabi-v7a`
make -f tensorflow/contrib/makefile/Makefile TARGET=ANDROID

At this point, you will have compiled libraries in gen/lib/* and the benchmark app compiled for Android.

Run the benchmark by pushing both the benchmark and the graph file to your attached Android device:

adb push ~/graphs/inception/tensorflow_inception_graph.pb /data/local/tmp/
adb push tensorflow/contrib/makefile/gen/bin/benchmark /data/local/tmp/
adb shell '/data/local/tmp/benchmark \
 --graph=/data/local/tmp/tensorflow_inception_graph.pb \
 --input_layer="input:0" \
 --input_layer_shape="1,224,224,3" \
 --input_layer_type="float" \
 --output_layer="output:0"
'

For more details, see the benchmark documentation.

CUDA support for Tegra devices running Android (Nvidia Shield TV, etc)

With the release of TF 1.6 and JetPack for Android 3.2 (currently pending), you can build a version of TensorFlow for compatible devices that receives the full benefits of GPU acceleration by following the instructions below.

Environment setup:

First, download and install JetPack for Android version 3.2 or greater from Nvidia. Note that as of the TF 1.6 release the JetPack for Android 3.2 release is still pending, and regular JetPack for L4T will not work.

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
JETPACK=$HOME/JetPack_Android_3.2
TEGRA_LIBS="$JETPACK/cuDNN/aarch64/cuda/lib64/libcudnn.so  $JETPACK/cuda-9.0/extras/CUPTI/lib64/libcupti.so $JETPACK/cuda/targets/aarch64-linux-androideabi/lib64/libcufft.so"
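The library paths in TEGRA_LIBS depend on how JetPack was installed, so it can help to confirm they exist before pushing them to a device. This is a minimal sketch with a hypothetical function name, missing_libs. Like the commands in this section, it assumes the paths contain no spaces.

```shell
# Hypothetical helper: print every path in a space-separated list that
# does not exist, and return non-zero if any is missing.
missing_libs() {
  status=0
  for lib in $1; do   # unquoted on purpose: split the list on whitespace
    if [ ! -f "$lib" ]; then
      echo "$lib"
      status=1
    fi
  done
  return "$status"
}
```

For example, missing_libs "$TEGRA_LIBS" || echo "check the JETPACK path" prints the paths that could not be found, followed by the reminder.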

Building all CUDA-enabled native binaries:

This will build CUDA-enabled versions of libtensorflow_inference.so and the benchmark binary. (libtensorflow_demo.so will also be built incidentally, but it does not support CUDA.)

NDK_ROOT=$JETPACK/android-ndk-r13b
CC_PREFIX=ccache tensorflow/contrib/makefile/build_all_android.sh -s tensorflow/contrib/makefile/sub_makefiles/android/Makefile.in -t "libtensorflow_inference.so libtensorflow_demo.so all" -a tegra

(add -T on subsequent builds to skip protobuf downloading/building)

Testing the CUDA-enabled benchmark via adb:

Build binaries first as above, then run:

adb shell mkdir -p /data/local/tmp/lib64
adb push $TEGRA_LIBS /data/local/tmp/lib64
adb push tensorflow/contrib/makefile/gen/bin/android_arm64-v8a/benchmark /data/local/tmp
wget  https://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/tensorflow_demo.apk
unzip tensorflow_demo.apk -d /tmp/tensorflow_demo
adb push /tmp/tensorflow_demo/assets/*.pb /data/local/tmp
adb shell "LD_LIBRARY_PATH=/data/local/tmp/lib64 /data/local/tmp/benchmark --graph=/data/local/tmp/tensorflow_inception_graph.pb"

Building the CUDA-enabled TensorFlow AAR with Bazel:

Build the native binaries first as above. Then, build the aar and package the native libs by executing the following:

mkdir -p /tmp/tf/jni/arm64-v8a
cp tensorflow/contrib/makefile/gen/lib/android_tegra/libtensorflow_*.so /tmp/tf/jni/arm64-v8a/
cp $TEGRA_LIBS /tmp/tf/jni/arm64-v8a
bazel build //tensorflow/contrib/android:android_tensorflow_inference_java.aar
cp bazel-bin/tensorflow/contrib/android/android_tensorflow_inference_java.aar /tmp/tf/tensorflow.aar
cd /tmp/tf
chmod +w tensorflow.aar
zip -ur tensorflow.aar $(find jni -name "*.so")

Building the CUDA-enabled TensorFlow Android demo with Bazel:

Build binaries first as above, then edit tensorflow/examples/android/BUILD and replace:

    srcs = [
       ":libtensorflow_demo.so",
       "//tensorflow/contrib/android:libtensorflow_inference.so",
    ],

with:

srcs = glob(["libs/arm64-v8a/*.so"]),

If you are building for Android TV (Shield TV devices), replace "portrait" with "landscape" for android:screenOrientation in all four activities in tensorflow/examples/android/AndroidManifest.xml.
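The orientation change for Android TV can also be scripted rather than edited by hand. The sketch below uses a hypothetical function name, set_landscape, and assumes the attribute appears in the manifest exactly as android:screenOrientation="portrait".

```shell
# Hypothetical helper: rewrite android:screenOrientation from "portrait"
# to "landscape" everywhere in the manifest given as $1.
set_landscape() {
  manifest="$1"
  manifest_tmp="$manifest.tmp"
  # Write to a temporary file and rename it, which sidesteps the
  # differences between GNU and BSD `sed -i`.
  sed 's/android:screenOrientation="portrait"/android:screenOrientation="landscape"/g' \
    "$manifest" > "$manifest_tmp" && mv "$manifest_tmp" "$manifest"
}
```

For example, set_landscape tensorflow/examples/android/AndroidManifest.xml, followed by git diff to review the change before building.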

Then run:

# Create dir for native libs
mkdir -p tensorflow/examples/android/libs/arm64-v8a

# Copy JetPack libs
cp $TEGRA_LIBS  tensorflow/examples/android/libs/arm64-v8a

# Copy native TensorFlow libraries
cp tensorflow/contrib/makefile/gen/lib/android_arm64-v8a/libtensorflow_*.so tensorflow/examples/android/libs/arm64-v8a/

# Build APK
bazel build -c opt --fat_apk_cpu=arm64-v8a tensorflow/examples/android:tensorflow_demo

# Install
adb install -r -f bazel-bin/tensorflow/examples/android/tensorflow_demo.apk 

Building the CUDA-enabled Android demo with gradle/Android Studio:

Add tensorflow/examples/android as an Android project in Android Studio as normal.

Edit build.gradle and:

  • set nativeBuildSystem = 'makefile'
  • set cpuType = 'arm64-v8a'
  • in "buildNativeMake", replace cpuType with 'tegra' (optional speedups like -T and ccache also work)
  • set the environment "NDK_ROOT" var to $JETPACK/android-ndk-r13b

Click "build apk" to build.

Install:

adb install -r -f tensorflow/examples/android/gradleBuild/outputs/apk/debug/android-debug.apk

iOS

Note: To use this library in an iOS application, see related instructions in the iOS examples directory.

Install Xcode 7.3 or more recent. If you have not already, you will need to install the command-line tools using xcode-select:

xcode-select --install

If this is a new install, you will need to run Xcode once to agree to the license before continuing.

(You will also need to have Homebrew installed.)

Then install automake/libtool:

brew install automake
brew install libtool

Also, download the graph if you haven't already:

mkdir -p ~/graphs
curl -o ~/graphs/inception.zip \
 https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \
 && unzip ~/graphs/inception.zip -d ~/graphs/inception

Building all at once

If you just want to get the libraries compiled in a hurry, you can run this from the root of your TensorFlow source folder:

tensorflow/contrib/makefile/build_all_ios.sh

This process will take around twenty minutes on a modern MacBook Pro.

When it completes, you will have a unified library for all architectures (i386sim, x86_64sim, armv7, armv7s and arm64) and the benchmark program. Although successfully compiling the benchmark program is a sign of success, the program is not a complete iOS app.
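To confirm which architectures ended up in the unified library, you can inspect it with lipo -info on macOS. The following sketch only parses that command's output. The function name has_arch is hypothetical, and it assumes the output ends with a colon followed by a space-separated list of architectures.

```shell
# Hypothetical helper: succeed if architecture $2 appears in the
# `lipo -info` output passed as $1.
has_arch() {
  archs="${1##*: }"   # keep only the text after the last ": "
  case " $archs " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example use on macOS (commented out so the sketch has no side effects):
#   info="$(lipo -info tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a)"
#   has_arch "$info" arm64 && echo "arm64 slice present"
```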

If you would like to build only one architecture to save time (iOS 11+ only supports 64-bit, so you can get away with arm64):

tensorflow/contrib/makefile/build_all_ios.sh -a arm64

After the first build, if you would like to build just the tensorflow library, you can pass the -T flag to avoid a clean & rebuild. If you modified only one file, generating the library should take just a few seconds.

tensorflow/contrib/makefile/build_all_ios.sh -a arm64 -T

To see TensorFlow running on iOS, the example Xcode project in tensorflow/examples/ios shows how to use the static library in a simple app.

Building by hand

This section covers each step of building. For all the code in one place, see build_all_ios.sh.

If you have not already, you will need to download dependencies:

tensorflow/contrib/makefile/download_dependencies.sh

Next, you will need to compile protobufs for iOS (optionally takes the -a $ARCH flag):

tensorflow/contrib/makefile/compile_ios_protobuf.sh

Then, you will need to compile the nsync library for iOS (optionally takes -a $ARCH flag):

export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh -t ios`
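The two exports above capture whatever compile_nsync.sh prints, so a failed nsync build can leave the variables empty or pointing at nothing. As a hedged sketch (check_nsync_libs is a hypothetical name, and it assumes the script prints the path of the library it built), the values can be verified before invoking make:

```shell
# Hypothetical check: both nsync variables should name an existing file
# before the Makefile is invoked.
check_nsync_libs() {
  for var in HOST_NSYNC_LIB TARGET_NSYNC_LIB; do
    eval "lib_path=\${$var:-}"   # read the variable whose name is in $var
    if [ -z "$lib_path" ] || [ ! -f "$lib_path" ]; then
      echo "$var does not point to a file: '$lib_path'" >&2
      return 1
    fi
  done
}
```

Running check_nsync_libs || exit 1 right after the exports stops the script before a long build that would fail at link time.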

Then, you can run the makefile specifying iOS as the target, along with the architecture you want to build for:

make -f tensorflow/contrib/makefile/Makefile \
 TARGET=IOS \
 IOS_ARCH=ARM64

This creates a library in tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a that you can link any Xcode project against.

To see TensorFlow running on iOS, the example Xcode project in tensorflow/examples/ios shows how to use the static library in a simple app.

Universal binaries

In some situations, you will need a universal library. In that case, you will still need to run compile_ios_protobuf.sh and compile_nsync.sh, but this time follow them with:

compile_ios_tensorflow.sh

compile_ios_tensorflow.sh takes the -a flag to build only for one architecture. If you run into issues with unresolved nsync symbols, you can also pass -h ${HOST_NSYNC_LIB} and -n ${TARGET_NSYNC_LIB}, so the command would look like:

tensorflow/contrib/makefile/compile_ios_tensorflow.sh -f "-O3" -h tensorflow/contrib/makefile/downloads/nsync/builds/default.macos.c++11/nsync.a -n tensorflow/contrib/makefile/downloads/nsync/builds/lipo.ios.c++11/nsync.a -a arm64

In Xcode, you will need to use -force_load in the linker flags section of the build settings to pull in the global constructors that are used to register ops and kernels.

Optimization

The build_all_ios.sh script can take optional command-line arguments to selectively register only the operators used in your graph.

tensorflow/contrib/makefile/build_all_ios.sh -a arm64 -g $HOME/graphs/inception/tensorflow_inception_graph.pb

Please note this is an aggressive optimization of the operators and the resulting library may not work with other graphs but will reduce the size of the final library.

The compile_ios_tensorflow.sh script can take optional command-line arguments. The first argument will be passed as a C++ optimization flag and defaults to debug mode. If you are concerned about performance or are working on a release build, you would likely want a higher optimization setting, like so:

compile_ios_tensorflow.sh -f "-Os"

For other variations of valid optimization flags, see clang optimization levels.

Raspberry Pi

Building on the Raspberry Pi is similar to building on a normal Linux system. First download the dependencies, install the required packages, and build protobuf:

tensorflow/contrib/makefile/download_dependencies.sh
sudo apt-get install -y autoconf automake libtool gcc-4.8 g++-4.8
cd tensorflow/contrib/makefile/downloads/protobuf/
./autogen.sh
./configure
make
sudo make install
sudo ldconfig  # refresh shared library cache
cd ../../../../..
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
export TARGET_NSYNC_LIB="$HOST_NSYNC_LIB"

Once that's done, you can use make to build the library and example:

make -f tensorflow/contrib/makefile/Makefile HOST_OS=PI TARGET=PI OPTFLAGS="-Os" CXX=g++-4.8

If you're only interested in building for the Raspberry Pi 2 and 3, you can supply some extra optimization flags to give you code that will run faster:

make -f tensorflow/contrib/makefile/Makefile HOST_OS=PI TARGET=PI \
 OPTFLAGS="-Os -mfpu=neon-vfpv4 -funsafe-math-optimizations -ftree-vectorize" CXX=g++-4.8

One thing to be careful of is that the gcc version 4.9 currently installed on Jessie by default will hit an error mentioning __atomic_compare_exchange. This is why the examples above specify CXX=g++-4.8 explicitly, and why we install it using apt-get. If you have partially built using the default gcc 4.9, hit the error and switch to 4.8, you need to do a make -f tensorflow/contrib/makefile/Makefile clean before you build. If you don't, the build will appear to succeed but you'll encounter malloc(): memory corruption errors when you try to run any programs using the library.
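One way to avoid mixing objects from two compilers is to remember which compiler produced the last build and clean whenever it changes. The sketch below is only an illustration: the function name compiler_changed and the stamp file location are hypothetical and not part of the Makefile.

```shell
# Hypothetical helper: record the compiler name in a stamp file and
# report (exit status 0) when it differs from the previous build.
compiler_changed() {
  stamp="$1"
  cxx="$2"
  if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$cxx" ]; then
    return 1   # same compiler as last time, nothing to clean
  fi
  printf '%s\n' "$cxx" > "$stamp"
  return 0
}

# Example use (commented out; the stamp path is an assumption):
#   if compiler_changed tensorflow/contrib/makefile/.last_cxx g++-4.8; then
#     make -f tensorflow/contrib/makefile/Makefile clean
#   fi
```

The stamp file in the example is kept outside the gen folder in case cleaning removes that folder. The first run has no stamp, so it also triggers a clean, which is the safe default.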

For more examples, look at the tensorflow/contrib/pi_examples folder in the source tree, which contains code samples aimed at the Raspberry Pi.

Other notes

Supported Systems

The Make script has been tested on Ubuntu and OS X. If you look in the Makefile itself, you'll see it's broken up into host and target sections. If you are cross-compiling, you should look at customizing the target settings to match what you need for your desired system.

Dependency Management

The Makefile loads in a list of dependencies stored in text files. These files are generated from the main Bazel build by running tensorflow/contrib/makefile/gen_file_lists.sh. You'll need to re-run this if you make changes to the files that are included in the build.

Header dependencies are not automatically tracked by the Makefile, so if you make header changes you will need to run this command to recompile cleanly:

make -f tensorflow/contrib/makefile/Makefile clean

Cleaning up

In some situations, you may want to completely clean up. The dependencies, intermediate stages, and generated files are stored in:

tensorflow/contrib/makefile/downloads
tensorflow/contrib/makefile/gen

Those directories can safely be removed, but you will have to start over with download_dependencies.sh once you delete them.
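A full cleanup can be wrapped in a small function so the two paths are never mistyped. This is a minimal sketch with a hypothetical name, clean_makefile_build. It refuses to run without an argument so that an empty variable cannot turn the paths into something unintended.

```shell
# Hypothetical helper: remove the two generated folders under the
# TensorFlow checkout given as $1.
clean_makefile_build() {
  # ${1:?...} aborts with a message when no checkout root is given.
  mk_dir="${1:?usage: clean_makefile_build <tensorflow-root>}/tensorflow/contrib/makefile"
  rm -rf "$mk_dir/downloads" "$mk_dir/gen"
}
```

For example, clean_makefile_build "$PWD" from the root of the repository, followed by tensorflow/contrib/makefile/download_dependencies.sh to start over.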

Fixing Makefile Issues

Because the main development of TensorFlow is done using Bazel, changes to the codebase can sometimes break the makefile build process. If you find that tests relying on this makefile are failing with a change you're involved in, here are some troubleshooting steps:

  • Try to reproduce the issue on your platform. If you're on Linux, running make -f tensorflow/contrib/makefile/Makefile should be enough to recreate most issues. For other platforms, see the sections earlier in this document.

  • The most common cause of breakages is files that have been added to the Bazel build scripts, but that the makefile isn't aware of. Typical symptoms of this include linker errors mentioning missing symbols or protobuf headers that aren't found. To address these problems, take a look at the *.txt files in tensorflow/contrib/makefile. If you have a new operator, you may need to add it to tf_op_files.txt, or for a new proto to tf_proto_files.txt.

  • There's also a wildcard system in Makefile that defines what core C++ files are included in the library. This is designed to match the equivalent rule in tensorflow/core/BUILD, so if you change the wildcards there to include new files you'll need to also update CORE_CC_ALL_SRCS and CORE_CC_EXCLUDE_SRCS in the makefile.

  • Some of the supported platforms use clang instead of gcc as their compiler, so if you're hitting compile errors you may need to tweak your code to be more friendly to different compilers by avoiding gcc extensions or idioms.
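For the missing-file case described in the list above, registering a new source file amounts to adding one line to the relevant *.txt file. The following sketch uses a hypothetical function name, add_to_file_list, and a made-up operator path. It appends at the end of the file and does not try to preserve any ordering the list may follow.

```shell
# Hypothetical helper: append source path $2 to list file $1 unless an
# identical line is already there.
add_to_file_list() {
  list="$1"
  src="$2"
  if grep -qxF -- "$src" "$list"; then
    echo "already listed: $src"
  else
    printf '%s\n' "$src" >> "$list"
    echo "added: $src"
  fi
}

# Example use (commented out; my_new_op.cc is a made-up file name):
#   add_to_file_list tensorflow/contrib/makefile/tf_op_files.txt \
#     tensorflow/core/kernels/my_new_op.cc
```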

These are the most common reasons for makefile breakages, but it's also possible you may hit something unusual, like a platform incompatibility. For those, you'll need to see if you can reproduce the issue on that particular platform and debug it there. You can also reach out to the broader TensorFlow team by filing a GitHub issue to ask for help.