Fatal messages mixing C libtensorflow with python tensorflow #7541

Closed
pwaller opened this Issue Feb 15, 2017 · 12 comments


pwaller commented Feb 15, 2017

I'm trying to mix C TensorFlow code with Python TensorFlow code by
embedding the CPython interpreter in my application.

I'm mainly doing this because defining the model is only really possible in
Python at the moment due to the lack of gradients (#6268), and I want to define
new models from the C side quickly, without needing to invoke or
communicate with an external Python process to get a new model.

Reproducing the problem is straightforward: import tensorflow
in Python after the libtensorflow library has already been dynamically linked.
Here is a quick reproducer in pure Python, which fails to run:

import ctypes

tf_dll = ctypes.CDLL("/usr/local/lib/libtensorflow.so")

import tensorflow
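To see which TensorFlow shared objects are already mapped into a process before attempting the import, something like the following might help. It is only a sketch and assumes Linux, where `/proc/self/maps` lists the mapped files. The helper names `mapped_shared_objects` and `tensorflow_objects_in_this_process` are made up for this example.

```python
import os


def mapped_shared_objects(maps_text, needle="tensorflow"):
    """Return sorted, de-duplicated file paths from /proc/<pid>/maps text
    whose file name contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in os.path.basename(path):
                paths.add(path)
    return sorted(paths)


def tensorflow_objects_in_this_process():
    # Linux only (assumption): other platforms do not provide this file.
    with open("/proc/self/maps") as maps_file:
        return mapped_shared_objects(maps_file.read())
```

Calling `tensorflow_objects_in_this_process()` after the `ctypes.CDLL(...)` line and before `import tensorflow` should list libtensorflow.so if it is already loaded, which is the situation that leads to the fatal message.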

libtensorflow can be obtained like so:

TF_TYPE=cpu # Set to gpu for GPU support
TF_OS=linux
curl -L \
  "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${TF_OS}-x86_64-1.0.0.tar.gz" |
sudo tar -C /usr/local -xz

Here are two fatal messages I have encountered (the first from the Python reproducer above, the second from a C program):

F tensorflow/stream_executor/cuda/cuda_platform.cc:180] Check failed: ::perftools::gputools::port::Status::OK() == (MultiPlatformManager::RegisterPlatform(std::move(platform))) (OK vs. Internal: platform is already registered with name: "CUDA")
F tensorflow/core/lib/monitoring/collection_registry.cc:77] Cannot register 2 metrics with the same name: /tensorflow/cc/saved_model/load_attempt_count

I assume the problem is that _pywrap_tensorflow.so has TensorFlow
statically linked into it, so it doesn't use libtensorflow. You then have
two shared libraries that conflict with one another.

Is there a way to avoid this conflict?

aselle Feb 15, 2017

Member

@jhseu, do you have any insight on this? I think @pwaller's analysis is correct. There is clearly some singleton that is getting initialized twice.
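The double initialization can be illustrated with a toy model. This is not TensorFlow's code: `Registry`, `DuplicateRegistration`, and `run_static_initializers` are hypothetical names, and the real libraries abort with a fatal log message instead of raising an exception. The sketch only shows why two copies of the same initializers sharing one process-wide registry would collide.

```python
class DuplicateRegistration(RuntimeError):
    """Raised when a name is registered twice (stand-in for the fatal check)."""


class Registry:
    """A process-wide registry that refuses duplicate names."""

    def __init__(self):
        self._owners = {}

    def register(self, name, owner):
        if name in self._owners:
            raise DuplicateRegistration(
                "Cannot register 2 entries with the same name: %s "
                "(first from %s, again from %s)"
                % (name, self._owners[name], owner)
            )
        self._owners[name] = owner


def run_static_initializers(registry, library):
    # Each copy of the library carries the same registration code, so
    # loading a second copy repeats the registration.
    registry.register("/tensorflow/cc/saved_model/load_attempt_count", library)


PROCESS_WIDE = Registry()
run_static_initializers(PROCESS_WIDE, "libtensorflow.so")
try:
    run_static_initializers(PROCESS_WIDE, "_pywrap_tensorflow.so")
except DuplicateRegistration as err:
    print(err)
```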

jhseu Feb 16, 2017

Member

Yeah, I think this happens because we import pywrap_tensorflow with RTLD_GLOBAL. Unfortunately I don't think there's an easy way to make this work at the moment. It's possible we can go the other way and make the C API accessible from import tensorflow.

Keeping this open, but it's uncommon and difficult, so we may not work on it anytime soon. A possible workaround is to mangle the symbols (namespace?) in the TensorFlow shared library.
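For anyone who wants to check the flags CPython uses when it loads extension modules, here is a small sketch. It assumes a Unix-like platform where `sys.getdlopenflags()` and the `os.RTLD_*` constants are available. `flag_names` is a helper invented for this example.

```python
import os
import sys


def flag_names(flags):
    """Return the names of the RTLD_* bits that are set in `flags`."""
    names = []
    for name in ("RTLD_LAZY", "RTLD_NOW", "RTLD_GLOBAL"):
        value = getattr(os, name, 0)  # 0 if the platform lacks the constant
        if value and flags & value == value:
            names.append(name)
    return names


if hasattr(sys, "getdlopenflags"):
    # Flags the interpreter passes to dlopen() when importing extensions.
    print(flag_names(sys.getdlopenflags()))
```

If RTLD_GLOBAL shows up in that list at the point where the extension module is imported, its symbols become visible to libraries loaded later, which matches the explanation given in this comment.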

fritzo Jun 6, 2017

Contributor

@jhseu it would be great if the C API were available through import tensorflow. I'd be happy to submit a PR if you could suggest how. My first guess was to add //tensorflow/c:c_api to the deps of the pywrap_tensorflow_internal BUILD rule, but it is already there. Any ideas?

fritzo Jun 6, 2017

Contributor

I have gotten @jhseu's solution to work in #10469.

Before #10469:

$ bazel build -c opt //tensorflow/python:pywrap_tensorflow_internal
$ nm -D bazel-bin/tensorflow/python/_pywrap_tensorflow_internal.so | grep '\<TF_New'

After #10469:

$ bazel build -c opt //tensorflow/python:pywrap_tensorflow_internal
$ nm -D bazel-bin/tensorflow/python/_pywrap_tensorflow_internal.so | grep '\<TF_New'
0000000001acb4d0 T TF_NewBuffer
0000000001acb500 T TF_NewBufferFromString
0000000001acb5b0 T TF_NewDeprecatedSession
0000000001accfe0 T TF_NewGraph
0000000001acd3e0 T TF_NewImportGraphDefOptions
0000000001acb970 T TF_NewOperation
0000000001acd770 T TF_NewSession
0000000001acb470 T TF_NewSessionOptions
0000000001acaf10 T TF_NewStatus
0000000001acb160 T TF_NewTensor
0000000001adc990 T TF_NewWhile
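A similar check can be made from Python with ctypes instead of nm. This is a rough sketch, and `exports_symbol` is a name made up for it. Unlike `nm -D`, the lookup goes through the dynamic loader, so it can also find a symbol that comes from one of the library's dependencies.

```python
import ctypes


def exports_symbol(library, name):
    """Return True if `name` can be resolved through `library`.

    `library` is a path to a shared object, or None for the symbols
    already visible in the current process.
    """
    handle = ctypes.CDLL(library)
    # ctypes looks the symbol up lazily; a missing symbol raises
    # AttributeError, which hasattr() turns into False.
    return hasattr(handle, name)
```

For example, `exports_symbol(path, "TF_NewStatus")` with `path` pointing at the built `_pywrap_tensorflow_internal.so`. Note that loading a second TensorFlow library into a process that already has one is exactly what triggers the fatal messages in this issue, so a check like this is safest in a fresh process.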

bipedalBit Nov 29, 2017

I'm trying to mix C++ TensorFlow code with Python TensorFlow code instead, using TensorFlow 1.4.0, and I encounter the same fatal messages. Is the suggestion that I should link against _pywrap_tensorflow_internal.so instead of libtensorflow_cc.so when compiling the C++ DLL, to avoid the conflict for now? I tried that, but it failed: it seems that some of the C++ APIs I use are not part of the C API and are not exported from _pywrap_tensorflow_internal.so.

tensorflowbutler Dec 20, 2017

Member

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

fritzo Dec 20, 2017

Contributor

I think something may have broken in the C symbol export from TF 1.3 -> 1.4, e.g. nimble-dev/nimble#638

tensorflowbutler Jan 4, 2018

Member

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

gunan Feb 8, 2018

Member

@allenlavoie fixed the issue @jhseu mentioned. Allen, can we close the original issue now?
There may be a new issue now, but should we create a new GitHub issue for that?


gunan closed this Feb 8, 2018

allenlavoie Feb 8, 2018

Member

If you want to use the C API along with Python in TF >= 1.4, you need to link against _pywrap_tensorflow_internal.so (which includes the C API) and libtensorflow_framework.so (for protobuf symbols; also includes many C++ symbols). You can also build with --config=monolithic if you want the protobuf symbols wrapped up in _pywrap_tensorflow_internal.so.

The Python and C language bindings both include ops and kernels, so you (still) can't use both of those together, even though they both link against libtensorflow_framework.so. Potentially you could remove a bunch of duplicate registration errors to make it work, or build a version of the libtensorflow.so without ops or kernels (making that easy to build manually would be neat).

I think these are issues tracked elsewhere.
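From the Python side, reusing the already-loaded extension library instead of a separate libtensorflow.so might look roughly like the sketch below. It makes several assumptions: a TF 1.x pip install whose package directory contains `_pywrap_tensorflow_internal.so`, and a build that actually exports the C API symbols from that file (the thread suggests this has varied between versions). `find_shared_object` and `tf_version_via_c_api` are hypothetical helper names. `TF_Version` is the C API function that returns the version string.

```python
import ctypes
import os


def find_shared_object(root, filename):
    """Return the first path under `root` whose file name is `filename`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None


def tf_version_via_c_api():
    # Import first, so the Python package loads its own copy of the library.
    import tensorflow

    root = os.path.dirname(tensorflow.__file__)
    path = find_shared_object(root, "_pywrap_tensorflow_internal.so")
    if path is None:
        raise RuntimeError("extension library not found (layout assumption)")
    # Opening the same file again should reuse the loaded library
    # rather than bring in a second copy of TensorFlow.
    lib = ctypes.CDLL(path)
    lib.TF_Version.restype = ctypes.c_char_p  # const char* TF_Version(void)
    return lib.TF_Version().decode()
```

The point of the design is that only one library containing ops and kernels is ever loaded, so the duplicate registrations never happen. It does not help with C++ symbols that the extension library does not export.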

kerolos Jul 24, 2018

Hello Peter (@pwaller),

Did you solve this error when mixing Python and C++ code?
I built TensorFlow from source and also built the tensorflow/tools/pip_package:build_pip_package target to get the Python package, which should match the C++ version, but unfortunately I still get this error: "Cannot register 2 metrics with the same name: /tensorflow/cc/saved_model/load_attempt_count".

Thanks in advance.

pwaller Jul 25, 2018

@kerolos I didn't get this working in this way in the end, but I would still be interested in seeing it be possible.
