Errors_impl - NotFoundError - stringpiece #6473
Comments
No more problems with the new version 0.12 on macOS.
The problem still occurs on Linux, as described above. One more piece of information: to compile the plugin correctly on Linux I had to add libprotobuf and libprotobuf_lite as dependencies. I took the libraries from the build output directory after running bazel build.
@jart, could you take a look at this issue?
So you're writing a custom op. Can you post your Bazel build configuration?
I'm sorry, but what do you mean by build configuration? I was working on a clean system with Ubuntu 16.04. The fresh installation has only the basics needed to build TensorFlow; I followed the instructions from the main website, and for bazel I used the repo installation. The system was previously configured with the CUDA framework, the proper video driver and toolkit, and cuDNN. When compiling the TensorFlow library I enabled the AVX instruction set. To build the custom op I'm using gcc as described in the tutorial, so I don't have a bazel project for the custom op. I hope this information is useful for the moment. If you need more specific info I will try to respond as soon as I can. Thank you for the support.
I'm glad to be of service. But please note for future reference that support is community-driven on StackOverflow. That is a more appropriate venue for issues like these. We try to keep this issue tracker limited to bugs and feature requests. That said, you're most likely forgetting to link one of the TensorFlow shared objects into your program. Wild guess but try saying Also, I strongly recommend using Bazel. I don't know why the documentation says to use gcc. Basically there's a directory called tensorflow/user_ops that shows you how to do what you want to do. You can customize that directory to your heart's content. If this doesn't solve your issue, let me know and I'll reopen this.
I will switch to bazel, but basically I followed these instructions: https://www.tensorflow.org/versions/r0.11/how_tos/adding_an_op/#building_the_op_library I wrote here because the problem happens only with the binary package provided on the official website; I didn't have any problems with the package compiled from source. For similar cases in the future I will use StackOverflow. Thank you for the response.
Thank you for clarifying. So this only happens if you try to use something in stringpiece.h with the binary release, but compiling TensorFlow from source on your computer works fine. So maybe this is an ABI compatibility type issue. Maybe you're using a different version of GCC than what we used to build the release. Hey @mrry would we consider something like this to be a bug? Does
Yes it looks like the extension and the binary package use a different definition of
When I look for the corresponding symbol in the binary package I get the following:
Note the difference between There are a few workarounds, which seem to involve defining
I already compiled the extension with the -D_GLIBCXX_USE_CXX11_ABI=0 flag, but I still have to try the suggested define inside my sources. I will do it soon, but I think the problem is perfectly explained. Is it possible to add that suggestion to the documentation? Thank you so much for the support. Please be patient; I can't try the fix right now.
I just took a look at the docs, and apparently there is a note buried in there:
If there's something you'd like to improve there, please feel free to submit a PR!
I solved the problem with the tip you gave me, but I want to explain the situation better. I have a new op for TensorFlow that I compile with gcc. I was already using the flag mentioned above, the one in the documentation note, but I forgot it for some libraries, which caused the function mismatch. With the correct build of the op (using -D_GLIBCXX_USE_CXX11_ABI=0) I had similar problems, because I use a TensorFlow Python package built from source with bazel. Building from source without the option For the moment I don't have a bazel-based project for the new op, so I compiled the TensorFlow package from source again with that option and all went well. If I can draw a conclusion, working with the TF sources and building the new op from there causes no problems, despite the use or not of I do not know if it is appropriate to add a note about building from source with that flag, to stay aligned with the official package. This is only a misunderstanding, because you can choose to compile the new op with different options. I'm sorry for the delay, but I ran some tests to be sure about the situation.
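The gcc build described in this comment can be sketched as a small helper that assembles the g++ command line with the ABI flag included. This is only an illustration, not something posted in the thread: the function name build_op_command, the file names zero_out.cc and zero_out.so, and the include path are hypothetical placeholders. In a real setup the include directory would normally come from tf.sysconfig.get_include().

```python
from typing import List

# Flag discussed in this thread; it selects the older libstdc++ string ABI.
OLD_ABI_FLAG = "-D_GLIBCXX_USE_CXX11_ABI=0"


def build_op_command(source: str, output: str, include_dir: str,
                     old_abi: bool = True) -> List[str]:
    """Assemble a g++ command line for a custom op shared library.

    Hypothetical helper: it only builds the argument list and does not
    run the compiler.
    """
    cmd = ["g++", "-std=c++11", "-shared", source, "-o", output,
           "-fPIC", "-I", include_dir]
    if old_abi:
        # Per the thread, the official binary package expects this setting.
        cmd.append(OLD_ABI_FLAG)
    return cmd


if __name__ == "__main__":
    # The include path below is a placeholder, not a real location.
    print(" ".join(build_op_command("zero_out.cc", "zero_out.so",
                                    "/path/to/tensorflow/include")))
```

The same flag has to be applied to every object and library that ends up in the op, which is the step that was missed in the case described here.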
Thank you for providing more useful information for future people googling this issue. If you want to contribute to the documentation, we would be happy to review any pull requests you would be generous enough to contribute.
I will try to write something for the documentation. Thank you for the support!
What related GitHub issues or StackOverflow threads have you found by searching the web for your problem?
None.
Environment info
Operating System: Ubuntu 16.04.1 LTS
Installed version of CUDA and cuDNN: None
Binary pip package info:
Output of python -c "import tensorflow; print(tensorflow.__version__)": 0.12.0
The error comes from the binary packages indicated above. I had no problems with the package built from source.
Source info:
Commit hash (git rev-parse HEAD): 48fb73a
bazel version:
If possible, provide a minimal reproducible example (We usually don't have time to read hundreds of lines of your code)
I guess the problem is that, inside a new op, I use some functions from this part of the core:
What other attempted solutions have you tried?
This is strange because, as I said, with the package built from source I get no error when running the script, but with the binary package provided on the official website I get the runtime error below.
Logs or other output that would be helpful
tensorflow.python.framework.errors_impl.NotFoundError: /.../newop.so: undefined symbol: _ZN10tensorflow9LogMemory21RecordRawDeallocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEExPvPNS_9AllocatorEb
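The undefined symbol above contains the fragment NSt7__cxx1112basic_string, which is how GCC's name mangling encodes std::__cxx11::basic_string, the string type used when _GLIBCXX_USE_CXX11_ABI is 1. Below is a minimal sketch of a check for that marker, not part of the original report. The function name uses_cxx11_abi is hypothetical, and it performs a plain substring test rather than a full demangle, so treat it as a heuristic.

```python
# Length-prefixed identifier that GCC's mangling emits for the
# std::__cxx11 inline namespace ("__cxx11" is 7 characters long).
CXX11_MARKER = "7__cxx11"


def uses_cxx11_abi(mangled_symbol: str) -> bool:
    """Return True if the mangled name references std::__cxx11 types.

    Heuristic only: this is a substring test, not a demangler.
    """
    return CXX11_MARKER in mangled_symbol


# Symbol copied from the NotFoundError reported in this issue.
missing = (
    "_ZN10tensorflow9LogMemory21RecordRawDeallocationERKNSt7__cxx11"
    "12basic_stringIcSt11char_traitsIcESaIcEEExPvPNS_9AllocatorEb"
)

if __name__ == "__main__":
    print(uses_cxx11_abi(missing))
```

If the check is true for a symbol the op is looking for, the op was most likely compiled with the new ABI while the loaded TensorFlow library exports the old-ABI variant of that function.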