[RFC] Add TVMDSOOp to integrate any TVM operator with TensorFlow #4464

Closed
tobegit3hub opened this issue Dec 4, 2019 · 16 comments

@tobegit3hub
Contributor

Problem

TensorFlow is one of the most popular machine learning libraries, and most developers are used to training and serving models with TensorFlow and TensorFlow Serving. TVM is a flexible compiler that runs computation efficiently on different devices. Although TensorFlow has implemented some efficient GPU operators, developers can benefit from TVM to get more than 10x speedups as well as FPGA support. However, TensorFlow and TVM have two different code stacks and runtime APIs.

There are two ways to integrate TVM with TensorFlow. The first is tensorflow-to-tvm, which is already supported by the Relay importer. Most TensorFlow operators can be “translated” to TVM operators, which is useful if we want to run the TVM stack with a model structure imported from another framework.

The second is tvm-to-tensorflow. This requires embedding TVM operators in the TensorFlow graph so that we can use a TensorFlow session to run both built-in operators and TVM-optimized operators. This is really helpful if we want to use TVM to optimize part of the computation graph while developers keep using the TensorFlow Python API to describe the model and TensorFlow Serving for inference. Embedding TVM in TensorFlow requires minimal effort to apply TVM optimizations to existing models and extends TensorFlow with functionality such as FPGA support.

This RFC describes how we propose to support tvm-to-tensorflow with the TensorFlow custom op API, along with the details of the implementation.

Considerations

Developers can use TVM stack to build operators without limitation.

Developers can use TVM Python package to import and load TVM operators in TensorFlow graph.

Developers can specify the output_shape/output_dtype/target_device/memory_align for TVM operators.

Proposal

The best way to extend TensorFlow's functionality is to build a TensorFlow custom op for the TVM runtime. We built an operator called TVMDSOOp, which implements CPU and GPU kernels that can load any TVM dynamic library. We can run a TensorFlow graph containing this op, which invokes TVM inference with zero-copy tensor data. Here is a walk-through example.

Developers can implement TVM operators with the TVM Python API. All they need to do is export the dynamic libraries to the local file system.

import tvm

# Build an "add one" operator for CPU and export it as a dynamic library.
n = tvm.var("n")
A = tvm.placeholder((n,), name='A')
B = tvm.compute(A.shape, lambda *i: A(*i) + 1, name='B')
s = tvm.create_schedule(B.op)
fadd_dylib = tvm.build(s, [A, B], "llvm", name="addone")
fadd_dylib.export_library("tvm_addone_dll.so")

# Bind the same computation to CUDA threads and export a GPU version.
bx, tx = s[B].split(B.op.axis[0], factor=64)
s[B].bind(bx, tvm.thread_axis("blockIdx.x"))
s[B].bind(tx, tvm.thread_axis("threadIdx.x"))
fadd_dylib = tvm.build(s, [A, B], "cuda", name="addone")
fadd_dylib.export_library("tvm_addone_cuda_dll.so")

With the code in our pull request, we can add set(USE_TFOP ON) to config.cmake and build TVM from scratch with CMake. This generates the tvm_dso_op.so library and provides the tvm.contrib.tf_op Python API. Then we can use TensorFlow and TVM together to build a graph with TVM operators and run it in a TensorFlow session.
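
Concretely, the build flow described above might look like the following (a sketch only; the checkout path, build directory layout, and job count are illustrative, not taken from the PR):

```shell
# Rebuild TVM with the TVMDSOOp custom op enabled (illustrative paths).
git clone https://github.com/apache/incubator-tvm.git
cd incubator-tvm
mkdir -p build && cp cmake/config.cmake build/
echo "set(USE_TFOP ON)" >> build/config.cmake
cd build && cmake .. && make -j4
# build/ should now contain tvm_dso_op.so alongside libtvm.so
```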

import tensorflow as tf
from tvm.contrib import tf_op

def main():
  mod = tf_op.Module("tvm_addone_dll.so")
  addone = mod.func("addone", output_shape=[2])

  with tf.Session() as sess:
    with tf.device("/cpu:0"):
      placeholder = tf.placeholder("float32", shape=[2])
      print(sess.run(addone(placeholder), feed_dict={placeholder: [1.0, 2.0]}))

    with tf.device("/gpu:0"):
      placeholder = tf.placeholder("float32")
      addone_gpu = tf_op.Module("tvm_addone_cuda_dll.so")["addone"]
      print(sess.run(addone_gpu(placeholder), feed_dict={placeholder: [1.0, 2.0]}))

if __name__ == "__main__":
  main()

Since every TensorFlow custom op must declare its input tensors, we wrap the TVM Python API to support operators with up to 8 input tensors. Users can pass multiple TensorFlow tensors to TVMDSOOp if the underlying TVM operator takes multiple inputs. The Python API looks the same as in the single-input case.

import tensorflow as tf
from tvm.contrib import tf_op

def main():
  left = tf.placeholder("float32", shape=[4])
  right = tf.placeholder("float32", shape=[4])

  feed_dict = {
    left: [1.0, 2.0, 3.0, 4.0],
    right: [5.0, 6.0, 7.0, 8.0]
  }

  module = tf_op.Module("tvm_add_dll.so")
  add = module.func("vector_add", output_shape=tf.shape(left), output_dtype="float")

  with tf.Session() as sess:
    with tf.device("/cpu:0"):
      print(sess.run(add(left, right), feed_dict))

if __name__ == "__main__":
  main()

For more examples, please refer to https://github.com/tobegit3hub/tftvm/tree/master/examples .

Any TVM operator can be embedded into a TensorFlow graph with this TVMDSOOp and Python API. Tensor data is passed between TensorFlow (Tensor) and TVM (DLPack) with zero copy, so the performance overhead should be minimal.
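
The zero-copy claim rests on the DLPack protocol, which lets frameworks hand a tensor's memory across library boundaries without copying. As a small self-contained illustration of the idea (an editor's sketch using NumPy's DLPack support, available in NumPy >= 1.22, rather than the TF/TVM APIs in this RFC):

```python
import numpy as np

# Export a's buffer through the DLPack protocol and re-import it.
# No data is copied; both arrays view the same memory.
a = np.arange(4, dtype="float32")
b = np.from_dlpack(a)

# The two views report the same underlying data pointer:
print(a.__array_interface__["data"][0] == b.__array_interface__["data"][0])  # True
```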

@tobegit3hub
Contributor Author

The implementation of this proposal has been submitted in #4459.

Anyone can try to test their TVM operators by re-compiling TVM with set(USE_TFOP ON).

@tqchen
Member

tqchen commented Dec 4, 2019

cc @jwfromm @jroesch @soiferj

@jwfromm
Contributor

jwfromm commented Dec 6, 2019

The motivations of this RFC are extremely similar to those in pytorch-tvm, however the two implementations are very different and it is worth discussing the tradeoffs.

  • torch-tvm is self-contained: it doesn't use any special functions or classes in TVM. Instead it modifies TorchScript to use existing TVM functions.
  • torch-tvm uses relay to represent subgraphs and then dynamically builds functions rather than using prebuilt libraries as proposed here.

I understand that the current implementation is the shortest path to getting TVM functions working in TensorFlow, and that a torch-tvm approach would be a much larger undertaking. However, I don't think it will scale well. The use of prebuilt libraries means there will be a lot of back and forth between regular tvm and tensorflow-tvm during development, and it seems like developers would be better off just importing their TF model into Relay and doing everything within TVM. Contrast this with the torch-tvm approach, where all the TVM magic happens transparently, making it very straightforward for PyTorch users.

We should also consider where the code belongs. I personally prefer having projects like torch-tvm and tf-tvm separate from the main tvm repo if possible, since we are already dealing with frontend bloat.

All that said, I think something like tf-tvm is a great idea and something we should work towards. I just want to make sure we make the first step carefully.

@tobegit3hub
Contributor Author

Thanks @jwfromm, you're definitely right. This is the fastest way to integrate TVM functions into TensorFlow when we cannot convert the whole model to TVM. It may be meaningful for TensorFlow developers who want to try TVM and leverage TVM's sub-graph optimization.

Actually, this project is a TensorFlow custom op backed by the TVM runtime. We originally developed it in the standalone project https://github.com/tobegit3hub/tftvm . Since it depends on both TVM and TensorFlow to compile, it could either live as one of the TVM contrib libraries or be maintained as an independent project.

@jwfromm
Contributor

jwfromm commented Dec 11, 2019

That makes sense, you're right that having it in contrib clears up a lot of my concerns. Thanks for those clarifications!

@tobegit3hub
Contributor Author

Hi @tqchen @jwfromm @jroesch @soiferj , do you have any other comments?

We may add more docs about the implementation and usage so that everyone can see how it works.

@tobegit3hub
Contributor Author

The PR has been merged and we will close this issue.

@652994331

@tobegit3hub Hi guys, I built TVM before, and I built TVMDSOOp separately (not via USE_TF_TVMSOOP=ON), following https://github.com/tobegit3hub/tftvm/tree/master/examples .
After I got libxx.so and linked them into my TVM home, I ran this test:

import tensorflow as tf
from tvm.contrib import tf_op

mod = tf_op.Module("tvm_addone_dll.so")
addone = mod.func("addone", output_shape=[4])

with tf.Session() as sess:
  a = tf.constant([10.1, 20.0, 11.2, -30.3])
  b = addone(a)
  print(sess.run(b))

and i got this error:
Traceback (most recent call last):
File "test_python.py", line 5, in
addone = mod.func("add_one, output_shape=[4]")
File "/opt/cephfs1/asr/users/qizhou.huang/.local/lib/python3.6/site-packages/tvm-0.7.dev1-py3.6-linux-x86_64.egg/tvm/contrib/tf_op/module.py", line 27, in func
return Func(self.lib_path, name, output_dtype, output_shape)
File "/opt/cephfs1/asr/users/qizhou.huang/.local/lib/python3.6/site-packages/tvm-0.7.dev1-py3.6-linux-x86_64.egg/tvm/contrib/tf_op/module.py", line 55, in init
self.module = load_library.load_op_library('tvm_dso_op.so')
File "/opt/cephfs1/asr/users/qizhou.huang/anaconda3/envs/tvm/lib/python3.6/site-packages/tensorflow_core/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /opt/cephfs1/asr/users/qizhou.huang/qizhou/PycharmProjects/incubator-tvm/build/tvm_dso_op.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs

My gcc is 6.4.0 and my TensorFlow is tf-1.15.0, which I built from source with Bazel and -D_GLIBCXX_USE_CXX11_ABI=1.
Btw, I also tried pip install tensorflow==1.13.1, with the same error. Could you please help me out? Thanks.

@tobegit3hub
Contributor Author

tobegit3hub commented Aug 14, 2020

@652994331 You should not use tftvm, which is deprecated; please rebuild TVM with USE_TF_TVMSOOP=ON. Here is the complete tutorial: https://discuss.tvm.ai/t/add-the-document-for-tvmdsoop/6622 .
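
An aside on the original error: the undefined symbol `_ZN10tensorflow12OpDefBuilder4AttrESs` is characteristic of a C++ ABI mismatch. The `Ss` suffix is the pre-C++11 `std::string` mangling, which suggests the custom op and the TensorFlow binary were compiled with different `_GLIBCXX_USE_CXX11_ABI` settings. One way to check which flags the installed TensorFlow expects (a suggestion using TensorFlow's standard `tf.sysconfig` module, not a step from the original build scripts) is:

```shell
# Print the compile and link flags the installed TensorFlow was built with;
# the -D_GLIBCXX_USE_CXX11_ABI value must match when compiling the custom op.
python -c "import tensorflow as tf; print(tf.sysconfig.get_compile_flags())"
python -c "import tensorflow as tf; print(tf.sysconfig.get_link_flags())"
```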

@652994331

@tobegit3hub I tried that before, but unfortunately I got this error:
Traceback (most recent call last):
File "", line 1, in
ModuleNotFoundError: No module named 'tensorflow'
CMake Error at cmake/modules/contrib/TF_TVMDSOOP.cmake:25 (message):
Fail to get TensorFlow compile flags
Call Stack (most recent call first):
CMakeLists.txt:334 (include)

@652994331

@tobegit3hub and here is the entire cmake log:
sudo cmake ..
-- The C compiler identification is GNU 6.4.0
-- The CXX compiler identification is GNU 6.4.0
-- Check for working C compiler: /bin/cc
-- Check for working C compiler: /bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /bin/c++
-- Check for working CXX compiler: /bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build with RPC support...
-- Build with Graph runtime support...
-- VTA build with VTA_HW_PATH=/opt/cephfs1/asr/users/qizhou.huang/qizhou/PycharmProjects/incubator-tvm/3rdparty/vta-hw
-- Build VTA runtime with target: sim
-- Build with standalone CRT
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
-- Found CUDA_CUDA_LIBRARY=/usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so
-- Found CUDA_CUDART_LIBRARY=/usr/local/cuda/lib64/libcudart.so
-- Found CUDA_NVRTC_LIBRARY=/usr/local/cuda/lib64/libnvrtc.so
-- Found CUDA_CUDNN_LIBRARY=/usr/lib64/libcudnn.so
-- Found CUDA_CUBLAS_LIBRARY=/usr/local/cuda/lib64/libcublas.so
-- Found CUDA_CUBLASLT_LIBRARY=CUDA_CUBLASLT_LIBRARY-NOTFOUND
-- Build with CUDA support
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Build with OpenMP /opt/cephfs1/asr/users/qizhou.huang/qizhou/intel-mkl/lib/intel64/libiomp5.so
-- Use llvm-config=/opt/cephfs1/asr/users/qizhou.huang/qizhou/llvm/bin/llvm-config
-- /opt/cephfs1/asr/users/qizhou.huang/qizhou/llvm/include
-- Found LLVM_INCLUDE_DIRS=/opt/cephfs1/asr/users/qizhou.huang/qizhou/llvm/include
-- Found LLVM_DEFINITIONS= -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-- Found TVM_LLVM_VERSION=90
-- Build with LLVM
-- Set TVM_LLVM_VERSION=90
-- Use BLAS library /opt/cephfs1/asr/users/qizhou.huang/qizhou/intel-mkl/mkl/lib/intel64/libmkl_rt.so
-- Use MKLDNN library /usr/local/lib64/libdnnl.so
-- Build with contrib.sort
-- Build with contrib.hybriddump
-- Found Python3: /usr/local/bin/python3.6 (found version "3.6.8") found components: Interpreter
Traceback (most recent call last):
File "", line 1, in
ModuleNotFoundError: No module named 'tensorflow'
CMake Error at cmake/modules/contrib/TF_TVMDSOOP.cmake:25 (message):
Fail to get TensorFlow compile flags
Call Stack (most recent call first):
CMakeLists.txt:334 (include)

@tobegit3hub
Contributor Author

@652994331 You need to install tensorflow in the Python environment that CMake uses so that TVMDSOOp can link against the TensorFlow libraries.

Here is the error message from your cmake.

ModuleNotFoundError: No module named 'tensorflow'

@652994331

@tobegit3hub thanks for the reply. I installed TensorFlow with pip and also built it from source; if I run import tensorflow as tf; tf.__version__ I can see 1.15.0 in my env. I guess there's something wrong with the path?

@652994331

@tobegit3hub Maybe I should set the TensorFlow path in config.cmake before running cmake for TVM; sorry, I am not quite sure.

@652994331

@tobegit3hub It seems there are some lines about the TensorFlow path in the CMakeLists.txt of the tftvm project (which is deprecated, as you said): https://github.com/tobegit3hub/tftvm/blob/master/CMakeLists.txt
But in the CMakeLists.txt of the incubator-tvm project (I am using the master branch), I can't find these lines about the TensorFlow path.
I think this is the problem. Am I right to use the master branch of incubator-tvm to build TVM with TVMDSOOp?

Thanks

@652994331

@tobegit3hub Hi, I checked the cmake files again. I think the problem is that TF_TVMDSOOP.cmake uses find_package() to locate ${PYTHON3_EXECUTABLE}; the path it found is /usr/local/bin/python, but I am actually using an anaconda3 env where I installed tensorflow. I tried to set PYTHON3_EXECUTABLE to my anaconda python path, but it didn't work. Could you please help me with this? Thank you!
