Undefined reference to symbol 'ceil@@GLIBC_2.2.5' in building Tensorflow #934

Closed
drgriffis opened this Issue Feb 18, 2016 · 7 comments

drgriffis commented Feb 18, 2016

(Copy of Tensorflow issue #1171; it was suggested that I bring this problem over here, since it's a build issue) I'm working on compiling Tensorflow from source, using non-standard GCC/etc. installations. Environment info: RHEL 6.7, GCC 5.2.1, Bazel 0.1.5. I'm installing Tensorflow from HEAD (commit f82ad36). I'm using a non-CUDA configuration. I've followed the steps @sethbruder suggests in his comment on #649, including copying the contents of tools from bazel into tensorflow/tools/ and into tensorflow/google/protobuf/tools/. This is possibly related to Tensorflow issue #332, as I'm getting the same error, but at build time as opposed to API usage.

I've tried to set up the relevant paths for my non-standard system resource install using the following settings before invoking bazel:

export LDFLAGS="-Wl,-rpath,/opt/rh/devtoolset-4/root/usr/lib64 -lrt -lm"
export CC="/opt/rh/devtoolset-4/root/usr/bin/gcc"
export CXX="/opt/rh/devtoolset-4/root/usr/bin/g++"
export JAVA_HOME="/u/drspeech/opt/jdks/jdk1.8.0_25"
export LD_LIBRARY_PATH="/opt/rh/devtoolset-4/root/usr/lib:${LD_LIBRARY_PATH}"
export BAZEL_ARGS="--verbose_failures"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-Wl,-rpath,/opt/rh/devtoolset-4/root/usr/lib64"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-Wl,-rpath,/u/drspeech/opt/jdks/jdk1.8.0_25/lib"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-lz"
#export EXTRA_BAZEL_ARGS="${BAZEL_ARGS} --linkopt=-Wl,-rpath,/usr/local/cuda-7.0/lib64"
export PYTHON_MAJOR_VERSION=3
export PYTHON_BINARY=/u/drspeech/opt/python-3.5.1/bin/python3
export MYBAZEL=/u/drspeech/opt/bazel-0.1.5/0.1.5/bazel-0.1.5/output/bazel

(there may be some leftover settings; this is adapted from what I used for building bazel in #925)

When I invoke bazel with

${MYBAZEL} build -c opt //tensorflow/tools/pip_package:build_pip_package

I get the following error output:

WARNING: Output base '/homes/2/griffisd/.cache/bazel/_bazel_griffisd/294e12ab714f8384c060bacb49311f55' is on NFS. This may lead to surprising failures and undetermined behavior.
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
____Loading...
____Found 1 target...
____Building...
____[1 / 12] Compiling google/protobuf/python/google/protobuf/pyext/descriptor.cc
____[1 / 147] Compiling external/re2/re2/compile.cc
ERROR: /homes/0/drspeech/opt/tensorflow-0.6.0/0.7.0/tensorflow/google/protobuf/BUILD:272:1: Linking of rule '//google/protobuf:protoc' failed: gcc failed: error executing command 
  (cd /homes/2/griffisd/.cache/bazel/_bazel_griffisd/294e12ab714f8384c060bacb49311f55/tensorflow && \
  exec env - \
  /opt/rh/devtoolset-4/root/usr/bin/gcc -o bazel-out/host/bin/google/protobuf/protoc -B/opt/rh/devtoolset-4/root/usr/bin/ -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S -Wl,--gc-sections -Wl,@bazel-out/host/bin/google/protobuf/protoc-2.params): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1: gcc failed: error executing command 
  (cd /homes/2/griffisd/.cache/bazel/_bazel_griffisd/294e12ab714f8384c060bacb49311f55/tensorflow && \
  exec env - \
  /opt/rh/devtoolset-4/root/usr/bin/gcc -o bazel-out/host/bin/google/protobuf/protoc -B/opt/rh/devtoolset-4/root/usr/bin/ -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S -Wl,--gc-sections -Wl,@bazel-out/host/bin/google/protobuf/protoc-2.params): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/opt/rh/devtoolset-4/root/usr/bin/ld: /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.2.1/libstdc++_nonshared.a(hashtable_c++0x.o): undefined reference to symbol 'ceil@@GLIBC_2.2.5'
//lib64/libm.so.6: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
Target //tensorflow/tools/pip_package:build_pip_package failed to build
____Elapsed time: 0.707s, Critical Path: 0.34s

It seems like I may be missing an "-lm" flag in the gcc invocation for the protobuf target; I've tried including it in LDFLAGS as shown above, but it doesn't appear in the gcc invocation.
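For context, "DSO missing from command line" generally means the linker found the symbol in a shared library (here libm) that was not itself named on the link line. One way to confirm whether -lm reaches the linker is to look inside the params file that the failing command passes via -Wl,@... The sketch below is a minimal, hypothetical helper (has_link_flag is not part of Bazel) and assumes the params file lists one argument per line:

```shell
#!/bin/sh
# Report whether a linker params (response) file contains a given flag.
# has_link_flag is a hypothetical helper, not a Bazel command.
# Assumption: the params file holds one argument per line.
has_link_flag() {
    params_file=$1
    flag=$2
    # -x matches whole lines, -F treats the flag as a literal string,
    # -e keeps the leading dash from being parsed as a grep option.
    grep -qxF -e "$flag" "$params_file"
}

# Example use, with the path taken from the failing command above
# (run from the directory shown in the "cd" part of the error):
# has_link_flag bazel-out/host/bin/google/protobuf/protoc-2.params -lm \
#     && echo "-lm present" || echo "-lm missing"
```

If the flag is missing from that file, the environment variables above are not being propagated to this particular link action.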

damienmg commented Feb 19, 2016

Try running echo "build ${EXTRA_BAZEL_ARGS}" > ~/.bazelrc and then rerunning the bazel build.
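Note that redirecting with > replaces any existing ~/.bazelrc. A minimal sketch of the same step that keeps a backup first is shown here; write_bazelrc and the .bak suffix are hypothetical names chosen for illustration, and EXTRA_BAZEL_ARGS is assumed to be set as in the original post:

```shell
#!/bin/sh
# Write the accumulated options as a single "build" line in a bazelrc file.
# write_bazelrc is a hypothetical helper; the default path mirrors the thread.
write_bazelrc() {
    rc_file=${1:-"$HOME/.bazelrc"}
    # Keep a copy of any existing file, since ">" would overwrite it.
    if [ -f "$rc_file" ]; then
        cp "$rc_file" "$rc_file.bak"
    fi
    # %s prints the value literally, so dashes and commas in the
    # options are not interpreted by printf.
    printf 'build %s\n' "$EXTRA_BAZEL_ARGS" > "$rc_file"
}

# Example use:
# write_bazelrc            # writes ~/.bazelrc
# write_bazelrc ./bazelrc  # writes a file in the current directory
```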

damienmg commented Feb 19, 2016

(with EXTRA_BAZEL_ARGS set correctly, of course)

drgriffis commented Feb 20, 2016

Still getting the same issue with that in my ~/.bazelrc. With some other edits to handle a custom Swig install (per Tensorflow#706), I've managed to get a GPU build working, but not the CPU build.

For CUDA compatibility, I rebuilt Bazel with GCC 4.9.8 and am using that for TensorFlow as well. My updated environment settings (stored in a .sh script and sourced before executing the build command) are:

export LDFLAGS="-Wl,-rpath,/opt/rh/devtoolset-3/root/usr/lib64 -lrt -lm"
export CC="/opt/rh/devtoolset-3/root/usr/bin/gcc"
export CXX="/opt/rh/devtoolset-3/root/usr/bin/g++"
export JAVA_HOME="/u/drspeech/opt/jdks/jdk1.8.0_25"
export LD_LIBRARY_PATH="/opt/rh/devtoolset-3/root/usr/lib:${LD_LIBRARY_PATH}"
export EXTRA_BAZEL_ARGS="--verbose_failures"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-Wl,-rpath,/opt/rh/devtoolset-3/root/usr/lib64"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-Wl,-rpath,/u/drspeech/opt/jdks/jdk1.8.0_25/lib"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-lz --linkopt=-lrt --linkopt=-lm"
#export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --linkopt=-Wl,-rpath,/usr/local/cuda-7.0/lib64"
export EXTRA_BAZEL_ARGS="${EXTRA_BAZEL_ARGS} --genrule_strategy=standalone --spawn_strategy=standalone"
export PYTHON_MAJOR_VERSION=3
export PYTHON_BINARY=/u/drspeech/opt/python-3.5.1/bin/python3

export PATH="/u/drspeech/opt/swig-3.0.8/bin:${PATH}"
export PATH="/opt/rh/devtoolset-3/root/usr/bin:${PATH}"

echo "build ${EXTRA_BAZEL_ARGS}" > ~/.bazelrc

export MYBAZEL=/u/drspeech/opt/bazel-0.1.5/0.1.5/bazel-0.1.5.devtools-3/output/bazel

~/.bazelrc looks like:

build --verbose_failures --linkopt=-Wl,-rpath,/opt/rh/devtoolset-3/root/usr/lib64 --linkopt=-Wl,-rpath,/u/drspeech/opt/jdks/jdk1.8.0_25/lib --linkopt=-lz --linkopt=-lrt --linkopt=-lm --genrule_strategy=standalone --spawn_strategy=standalone

And updated build command is:

$MYBAZEL build -c opt --genrule_strategy=standalone --spawn_strategy=standalone //tensorflow/tools/pip_package:build_pip_package

As I said, I'm able to generate a GPU build with no issues; the only differences between the builds are that I enable GPU support when running the configure script, and that the EXTRA_BAZEL_ARGS line linking to the CUDA libraries above is uncommented.

Thanks a lot for being so helpful with this!

damienmg commented Feb 25, 2016

Sorry, I missed your message.

Looking back at it, you need to edit the tools/cpp/CROSSTOOL file to point to the correct paths for your compilers. Have you done so? The GPU build works because it uses some wrappers around gcc, so it manages to find it.

drgriffis commented Mar 7, 2016

I had in fact edited those; I ended up finding the issue in another location. Somehow the protobuf part of the build isn't picking up the linker flags, specifically -lm (despite those flags and the right paths being set in google/protobuf/tools/cpp/CROSSTOOL).

What ended up fixing the issue was editing google/protobuf/BUILD, and changing

# Bazel should provide portable link_opts for pthread.
LINK_OPTS = ["-lpthread"]

to also include the -lrt, -lm switches:

# Bazel should provide portable link_opts for pthread.
LINK_OPTS = ["-lpthread","-lrt","-lm"]

This made the pip package compile with no issues.
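The same edit can be scripted rather than made by hand. The sketch below is a hypothetical helper (patch_link_opts is not part of Bazel or protobuf) and assumes the line in google/protobuf/BUILD reads exactly as quoted above; if the file differs, the substitution does nothing:

```shell
#!/bin/sh
# Add -lrt and -lm to protobuf's LINK_OPTS, mirroring the manual edit above.
# patch_link_opts is a hypothetical helper; it assumes the BUILD file
# contains the exact line: LINK_OPTS = ["-lpthread"]
patch_link_opts() {
    build_file=$1
    tmp_file="$build_file.tmp"
    # Write to a temporary file and move it into place, which avoids
    # relying on the non-portable "sed -i" option.
    sed 's/^LINK_OPTS = \["-lpthread"]$/LINK_OPTS = ["-lpthread", "-lrt", "-lm"]/' \
        "$build_file" > "$tmp_file" && mv "$tmp_file" "$build_file"
}

# Example use, from the TensorFlow source root:
# patch_link_opts google/protobuf/BUILD
```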

damienmg commented Mar 7, 2016

Ok great!

@damienmg damienmg closed this Mar 7, 2016

vrv added a commit to tensorflow/tensorflow that referenced this issue Jun 30, 2016

solve the building issues "Undefined reference to symbol 'ceil@@GLIBC_2.2.5'" (#3097)

* solve the building issues

This is to solve #3070,
the solution is inspired from bazelbuild/bazel#934 and #1171

* Update BUILD

sammoes commented Aug 4, 2016

@drgriffis what did you do to get the GPU build? I'm trying to build syntaxnet but get the same error you got:

git clone --recursive https://github.com/tensorflow/models.git
cd models/syntaxnet/tensorflow
./configure
cd ..
bazel test syntaxnet/... util/utf8/...

gives
ERROR: /home/tf/.cache/bazel/_bazel_sam/5cd71b2b91989f3dd022ee2c43ab916c/external/org_tensorflow/tensorflow/tools/proto_text/BUILD:31:1: Linking of rule '@org_tensorflow//tensorflow/tools/proto_text:gen_proto_text_functions' failed: gcc failed: error executing command /usr/bin/gcc -o bazel-out/host/bin/external/org_tensorflow/tensorflow/tools/proto_text/gen_proto_text_functions -pthread -no-canonical-prefixes -B/usr/bin -B/usr/bin -pass-exit-codes '-Wl,--build-id=md5' ... (remaining 12 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/usr/bin/ld: bazel-out/host/bin/external/org_tensorflow/tensorflow/core/liblib_internal.a(numbers.o): undefined reference to symbol 'ceil@@GLIBC_2.2.5'
//lib/x86_64-linux-gnu/libm.so.6: error adding symbols: DSO missing from command line

I have CUDA 8, cuDNN, Ubuntu 16.04, and Bazel 0.2.2.
What exactly is the problem?
Is the extra TensorFlow build for syntaxnet required?
