
Fatal messages mixing C libtensorflow with python tensorflow #7541

Closed
pwaller opened this issue Feb 15, 2017 · 13 comments
Labels
stat:contribution welcome (Status - Contributions welcome), type:feature (Feature requests)

Comments

pwaller commented Feb 15, 2017

I'm trying to mix C TensorFlow code with Python TensorFlow code by
embedding the CPython interpreter in my application.

I'm mainly doing this because defining the model is only really possible in
Python at the moment due to the lack of gradients (#6268), and I want to define
new models from the C side quickly, without needing to invoke or
communicate with an external Python process to get a new model.

Reproducing the problem is straightforward: simply import tensorflow
in Python after the libtensorflow library has already been dynamically linked.
Here is a quick reproducer in pure Python which will not run:

import ctypes

# Load the standalone C library first.
tf_dll = ctypes.CDLL("/usr/local/lib/libtensorflow.so")

# Importing the Python package afterwards aborts with the fatal messages below.
import tensorflow

libtensorflow can be obtained like so:

TF_TYPE=cpu # Set to gpu for GPU support
TF_OS=linux
curl -L \
  "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${TF_OS}-x86_64-1.0.0.tar.gz" |
sudo tar -C /usr/local -xz

Here are two fatal messages I have encountered (the first from the Python reproducer above, the second from a C program):

F tensorflow/stream_executor/cuda/cuda_platform.cc:180] Check failed: ::perftools::gputools::port::Status::OK() == (MultiPlatformManager::RegisterPlatform(std::move(platform))) (OK vs. Internal: platform is already registered with name: "CUDA")
F tensorflow/core/lib/monitoring/collection_registry.cc:77] Cannot register 2 metrics with the same name: /tensorflow/cc/saved_model/load_attempt_count
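
For reference, a minimal C host along the same lines is enough to set up the conflict: link against libtensorflow, then import the Python package from an embedded interpreter. This is only a sketch (the include paths and build flags are assumptions), not the exact program I ran:

/* Sketch: link against libtensorflow, then import the Python tensorflow
 * package from an embedded CPython interpreter.
 * Build roughly like (adjust for your setup):
 *   cc repro.c -I/usr/local/include -L/usr/local/lib -ltensorflow \
 *      $(python-config --cflags --ldflags)
 */
#include <Python.h>
#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    /* Touch the C API so libtensorflow is actually loaded and initialized. */
    printf("libtensorflow %s\n", TF_Version());

    /* Importing the Python package now pulls a second, statically linked
     * copy of TensorFlow into the process and aborts with the
     * duplicate-registration fatals shown above. */
    Py_Initialize();
    PyRun_SimpleString("import tensorflow");
    Py_Finalize();
    return 0;
}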

I assume the problem is that _pywrap_tensorflow.so has TensorFlow
statically linked into it, so it doesn't use libtensorflow. Then you have
two shared libraries conflicting with one another.

Is there a way to avoid this conflict?

aselle (Contributor) commented Feb 15, 2017

@jhseu, do you have any insight on this? I think @pwaller's analysis is correct. There is clearly some singleton that is getting double-initialized.

aselle added the stat:awaiting tensorflower (Status - Awaiting response from tensorflower) and type:feature (Feature requests) labels on Feb 15, 2017
jhseu (Contributor) commented Feb 16, 2017

Yeah, I think this happens because we import pywrap_tensorflow with RTLD_GLOBAL. Unfortunately I don't think there's an easy way to make this work at the moment. It's possible we can go the other way and make the C API accessible from import tensorflow.

Keeping this open, but it's uncommon and difficult, so we may not work on it anytime soon. A possible workaround is to mangle the symbols (namespace?) in the TensorFlow shared library.
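
In dlopen terms the clash looks roughly like the sketch below. It is purely illustrative: the extension path is an assumption, CPython's import machinery (not a plain dlopen from C) performs the real load, and loading the extension outside a Python process also needs the interpreter's symbols. The point is the RTLD_GLOBAL flag, which makes both copies of TensorFlow target the same process-wide registries.

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* First copy of the runtime: the standalone C library, whose static
     * initializers register the CUDA platform, metrics, ops, etc. */
    void *c_copy = dlopen("/usr/local/lib/libtensorflow.so",
                          RTLD_NOW | RTLD_GLOBAL);
    if (!c_copy) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Second copy: the Python extension with TensorFlow statically linked
     * in, loaded the way pywrap_tensorflow is imported. With RTLD_GLOBAL
     * its registrations collide with the first copy's, which is what the
     * "already registered" / "Cannot register 2 metrics" checks report.
     * (Illustrative path; resolve it from your site-packages.) */
    void *py_copy = dlopen("tensorflow/python/_pywrap_tensorflow.so",
                           RTLD_LAZY | RTLD_GLOBAL);
    if (!py_copy) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    return 0;
}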

fritzo (Contributor) commented Jun 6, 2017

@jhseu it would be great if the C API were available through import tensorflow. I'd be happy to submit a PR if you could suggest how. My first guess was to add //tensorflow/c:c_api to the deps of the pywrap_tensorflow_internal BUILD rule, but it is already there. Any ideas?

fritzo (Contributor) commented Jun 6, 2017

I have gotten @jhseu's solution to work in #10469.

Before #10469:

$ bazel build -c opt //tensorflow/python:pywrap_tensorflow_internal
$ nm -D bazel-bin/tensorflow/python/_pywrap_tensorflow_internal.so | grep '\<TF_New'

After #10469:

$ bazel build -c opt //tensorflow/python:pywrap_tensorflow_internal
$ nm -D bazel-bin/tensorflow/python/_pywrap_tensorflow_internal.so | grep '\<TF_New'
0000000001acb4d0 T TF_NewBuffer
0000000001acb500 T TF_NewBufferFromString
0000000001acb5b0 T TF_NewDeprecatedSession
0000000001accfe0 T TF_NewGraph
0000000001acd3e0 T TF_NewImportGraphDefOptions
0000000001acb970 T TF_NewOperation
0000000001acd770 T TF_NewSession
0000000001acb470 T TF_NewSessionOptions
0000000001acaf10 T TF_NewStatus
0000000001acb160 T TF_NewTensor
0000000001adc990 T TF_NewWhile
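
With those symbols exported, a C host that embeds Python can pick up the C API from the copy of the runtime that import tensorflow loads, instead of linking a second copy in. A rough sketch (assumes a build containing #10469; link against your Python's embedding libraries and -ldl):

#include <Python.h>
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Let the Python package load _pywrap_tensorflow_internal.so; as noted
     * above it is imported with RTLD_GLOBAL, so its exported TF_* symbols
     * become visible process-wide. */
    Py_Initialize();
    if (PyRun_SimpleString("import tensorflow") != 0)
        return 1;

    /* Resolve the C API from the process's global symbol scope rather than
     * from a separately linked libtensorflow.so. */
    void *global_scope = dlopen(NULL, RTLD_LAZY);
    const char *(*tf_version)(void) =
        (const char *(*)(void))dlsym(global_scope, "TF_Version");
    if (!tf_version) {
        fprintf(stderr, "TF_Version not found: %s\n", dlerror());
        return 1;
    }
    printf("C API from the Python-loaded runtime: %s\n", tf_version());

    Py_Finalize();
    return 0;
}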

bipedalBit commented Nov 29, 2017

I'm trying to mix C++ TensorFlow code with Python TensorFlow code instead, with TensorFlow 1.4.0, and I hit the same fatal messages. Is the suggestion that I should link against _pywrap_tensorflow_internal.so instead of libtensorflow_cc.so when compiling the C++ DLL, to avoid the conflict for now? Unfortunately I failed: it seems that some of the C++ APIs I use are not covered by the C API and have not been exported to _pywrap_tensorflow_internal.so.

tensorflowbutler (Member) commented

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

fritzo (Contributor) commented Dec 20, 2017

I think something may have broken in the C symbol export from TF 1.3 -> 1.4, e.g. nimble-dev/nimble#638

tensorflowbutler (Member) commented

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

jhseu removed the stat:awaiting tensorflower label on Jan 4, 2018
shivaniag added the stat:contribution welcome label on Jan 9, 2018
gunan (Contributor) commented Feb 8, 2018

@allenlavoie fixed the issue @jhseu mentioned. Allen, can we close the original issue now?
There may be a new issue now; if so, should we create a separate GitHub issue for that?

gunan closed this as completed on Feb 8, 2018
allenlavoie (Member) commented

If you want to use the C API along with Python, in TF >= 1.4 you need to link against _pywrap_tensorflow_internal.so (which includes the C API) and libtensorflow_framework.so (for protobuf symbols; it also includes many C++ symbols). You can also build with --config=monolithic if you want the protobuf symbols wrapped up in _pywrap_tensorflow_internal.so.

The Python and C language bindings both include ops and kernels, so you (still) can't use both of those together, even though they both link against libtensorflow_framework.so. Potentially you could remove a bunch of duplicate registration errors to make it work, or build a version of libtensorflow.so without ops or kernels (making that easy to build manually would be neat).

I think these are issues tracked elsewhere.
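
As a concrete sketch of that first setup, a C-API-only consumer can be built against those two libraries instead of libtensorflow.so. The paths and flags below are assumptions for a source build; adjust them for your checkout:

/* Minimal C API consumer linked against the Python extension and the
 * framework library. Build roughly like:
 *   cc use_c_api.c -I<tensorflow-src> \
 *      bazel-bin/tensorflow/python/_pywrap_tensorflow_internal.so \
 *      bazel-bin/tensorflow/libtensorflow_framework.so \
 *      -Wl,-rpath,<dir-containing-both-libraries>
 */
#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    TF_Status *status = TF_NewStatus();
    TF_Graph *graph = TF_NewGraph();
    printf("TensorFlow %s, initial status code %d\n",
           TF_Version(), TF_GetCode(status));
    TF_DeleteGraph(graph);
    TF_DeleteStatus(status);
    return 0;
}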

kerolos commented Jul 24, 2018

Hello Peter (@pwaller),

Did you solve this error from mixing Python and C++ code? I built TensorFlow from source and also built the tensorflow/tools/pip_package:build_pip_package target so that the Python package would match the C++ version, but unfortunately I still hit this error: ''Cannot register 2 metrics with the same name: /tensorflow/cc/saved_model/load_attempt_count''.

Thanks in advance.

pwaller (Author) commented Jul 25, 2018

@kerolos I didn't get this working this way in the end, but I would still be interested in seeing it become possible.

dbousamra commented

Any updates on this double singleton initialisation issue? I am trying to call some Python code (that imports and runs TensorFlow) from within a C program that is linked against TensorFlow.
