Version `GLIBCXX_3.4.18' not found when compiling tensorflow example #1358

Closed
kheuton opened this issue Jun 9, 2016 · 19 comments

Comments

@kheuton

kheuton commented Jun 9, 2016

Hi all,

I'm trying to install bazel and tensorflow with cuda support in my home directory on a network drive of a research cluster. I am building them using tools from a conda virtual environment, so almost nothing I have is in a default location.

Bazel version: 0.2.3 tag from the master branch
Gcc version: gcc version 4.8.5 (GCC)

When building the first tensorflow example, I get this error:

/homes/krheuton/bazel/output/bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer --verbose_failures
........

WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.

WARNING: /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/protobuf/WORKSPACE:1: Workspace name in /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/protobuf/WORKSPACE (@main) does not match the name given in the repository's definition (@protobuf); this will cause a build error in future versions.

WARNING: /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/re2/WORKSPACE:1: Workspace name in /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/re2/WORKSPACE (@main) does not match the name given in the repository's definition (@re2); this will cause a build error in future versions.

WARNING: /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/highwayhash/WORKSPACE:1: Workspace name in /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/external/highwayhash/WORKSPACE (@main) does not match the name given in the repository's definition (@highwayhash); this will cause a build error in future versions.

INFO: Found 1 target...

Slow read: a 111231960-byte read from /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/tensorflow/bazel-out/local_linux-opt/genfiles/third_party/gpus/cuda/lib64/libcufft.so took 8004ms.
Slow read: a 111231960-byte read from /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/tensorflow/bazel-out/host/genfiles/third_party/gpus/cuda/lib64/libcufft.so took 7987ms.
Slow read: a 111231960-byte read from /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/tensorflow/bazel-out/local_linux-opt/bin/_solib_local/_U_S_Sthird_Uparty_Sgpus_Scuda_Ccufft___Uthird_Uparty_Sgpus_Scuda_Slib64/libcufft.so took 7999ms.

ERROR: /snfs2/HOME/krheuton/tensorflow/tensorflow/core/BUILD:90:1: null failed: protoc failed: error executing command
(cd /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/tensorflow &&
exec env -
bazel-out/host/bin/external/protobuf/protoc '--cpp_out=bazel-out/local_linux-opt/genfiles/' -I. -Iexternal/protobuf/src -Ibazel-out/local_linux-opt/genfiles/external/protobuf/src tensorflow/core/example/example.proto tensorflow/core/example/feature.proto tensorflow/core/framework/allocation_description.proto tensorflow/core/framework/attr_value.proto tensorflow/core/framework/cost_graph.proto tensorflow/core/framework/device_attributes.proto tensorflow/core/framework/function.proto tensorflow/core/framework/graph.proto tensorflow/core/framework/kernel_def.proto tensorflow/core/framework/log_memory.proto tensorflow/core/framework/op_def.proto tensorflow/core/framework/step_stats.proto tensorflow/core/framework/summary.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_description.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/tensor_slice.proto tensorflow/core/framework/types.proto tensorflow/core/framework/variable.proto tensorflow/core/framework/versions.proto tensorflow/core/lib/core/error_codes.proto tensorflow/core/protobuf/config.proto tensorflow/core/protobuf/meta_graph.proto tensorflow/core/protobuf/named_tensor.proto tensorflow/core/protobuf/queue_runner.proto tensorflow/core/protobuf/saver.proto tensorflow/core/protobuf/tensorflow_server.proto tensorflow/core/util/event.proto tensorflow/core/util/memmapped_file_system.proto tensorflow/core/util/saved_tensor_slice.proto tensorflow/core/util/test_log.proto): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by bazel-out/host/bin/external/protobuf/protoc)
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 154.944s, Critical Path: 83.57s

This seems to mirror issue #898 but I can't quite determine what the resolution was there.
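
For anyone debugging the same error: the message means the libstdc++.so.6 that the loader picked does not define the symbol version that protoc was linked against. One rough way to see which GLIBCXX versions a given library file mentions is to scan it for the version strings, much like `strings <file> | grep GLIBCXX` would. The sketch below does that in Python; the function names are made up for illustration, and a plain string scan is only a heuristic, not a proper ELF version-table parse.

```python
import re


def glibcxx_versions(data: bytes) -> list:
    """Return the GLIBCXX_x.y[.z] version numbers found in raw library bytes."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    # Sort numerically so that 3.4.9 comes before 3.4.18.
    return sorted(
        (tag.decode() for tag in tags),
        key=lambda version: tuple(int(part) for part in version.split(".")),
    )


def glibcxx_versions_in_file(path: str) -> list:
    """Convenience wrapper: scan a library file on disk."""
    with open(path, "rb") as handle:
        return glibcxx_versions(handle.read())
```

Calling `glibcxx_versions_in_file("/usr/lib64/libstdc++.so.6")` on the affected machine should show whether 3.4.18 is in the list.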

@meteorcloudy
Member

$ export GCC=`which gcc`
$ sed -i -e "s=/usr/bin/gcc=$GCC=g" tools/cpp/CROSSTOOL

These two commands replace the location of gcc in the CROSSTOOL file with its real location on your machine.
Then you can rebuild Bazel.

//cc @damienmg Is it the right solution?

@damienmg
Contributor

damienmg commented Jun 9, 2016

No this is not.

The problem is with the protocol compiler that TensorFlow drags in. I am taking a guess here: protoc is compiled with --config=cuda, and therefore with nvcc, so it is not executable on the host machine.

Pulling in @davidzchen who did modifications to the protobuf build for TensorFlow.

@kheuton
Author

kheuton commented Jun 10, 2016

Thanks meteorcloudy, I have done that and it did indeed fix certain bugs I encountered, but not this one.

@philwo
Member

philwo commented Jun 20, 2016

@kheuton I'm curious, do you have LD_LIBRARY_PATH set in the environment where you run Bazel?

@davidzchen davidzchen self-assigned this Jun 20, 2016
@kheuton
Author

kheuton commented Jun 21, 2016

Thanks for asking @philwo. I do have LD_LIBRARY_PATH set, and the first directory is the lib directory of the virtual environment I want to use. The error message references /usr/lib64/libstdc++.so.6, but if my LD_LIBRARY_PATH were being used, it should find libstdc++.so.6 in my virtual environment before it checks /usr/lib64.

@kheuton
Author

kheuton commented Jun 21, 2016

I'm not sure if I can re-open the issue, but the latest commit doesn't appear to fix the problem. I am still seeing the same errors:

ERROR: /snfs2/HOME/krheuton/tensorflow/tensorflow/core/BUILD:90:1: null failed: protoc failed: error executing command 
  (cd /snfs2/HOME/krheuton/.cache/bazel/_bazel_krheuton/4bbcdf630a5812f0ffcb10ef621943b3/execroot/tensorflow && \
  exec env - \
  bazel-out/host/bin/external/protobuf/protoc '--cpp_out=bazel-out/host/genfiles/' -I. -Iexternal/protobuf/src -Ibazel-out/host/genfiles/external/protobuf/src tensorflow/core/example/example.proto tensorflow/core/example/feature.proto tensorflow/core/framework/allocation_description.proto tensorflow/core/framework/attr_value.proto tensorflow/core/framework/cost_graph.proto tensorflow/core/framework/device_attributes.proto tensorflow/core/framework/function.proto tensorflow/core/framework/graph.proto tensorflow/core/framework/kernel_def.proto tensorflow/core/framework/log_memory.proto tensorflow/core/framework/op_def.proto tensorflow/core/framework/step_stats.proto tensorflow/core/framework/summary.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_description.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/tensor_slice.proto tensorflow/core/framework/types.proto tensorflow/core/framework/variable.proto tensorflow/core/framework/versions.proto tensorflow/core/lib/core/error_codes.proto tensorflow/core/protobuf/config.proto tensorflow/core/protobuf/meta_graph.proto tensorflow/core/protobuf/named_tensor.proto tensorflow/core/protobuf/queue_runner.proto tensorflow/core/protobuf/saver.proto tensorflow/core/protobuf/tensorflow_server.proto tensorflow/core/util/event.proto tensorflow/core/util/memmapped_file_system.proto tensorflow/core/util/saved_tensor_slice.proto tensorflow/core/util/test_log.proto): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by bazel-out/host/bin/external/protobuf/protoc)
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 80.675s, Critical Path: 25.12s

@davidzchen davidzchen removed their assignment Jun 22, 2016
@damienmg damienmg reopened this Jun 22, 2016
@damienmg
Contributor

Protobuf rules should set use_default_shell_env for invoking the protobuf compiler.

@philwo
Member

philwo commented Jun 22, 2016

@kheuton Yes, definitely reopen bugs if they're not actually fixed by commits :) Damien is right: your build now gets a bit further than before, but it stumbles on the protobuf rules, which do not set use_default_shell_env, so they don't pick up the LD_LIBRARY_PATH that seems to be necessary on your system.

I'll see what I can do about that. We're also working on a real fix for this whole issue and think we have mostly figured it out, but implementation will take a few days.
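
To illustrate the mechanism: the failing action in the log is launched as `exec env - bazel-out/host/bin/external/protobuf/protoc ...`, and `env -` starts the command with an empty environment, so whatever LD_LIBRARY_PATH is set in the calling shell never reaches protoc. The Python sketch below reproduces the effect with a throwaway child process. Passing `env={}` to `subprocess.run` is used here as an analogue of `env -`; it is not a claim about how Bazel spawns actions internally, and the function name and the `/opt/demo/lib` path are made up.

```python
import subprocess


def ld_library_path_seen_by_child(env: dict) -> str:
    """Start /bin/sh with exactly `env` and return the LD_LIBRARY_PATH it sees."""
    # ${VAR-unset} expands to the literal word "unset" when VAR is not defined.
    script = 'printf "%s" "${LD_LIBRARY_PATH-unset}"'
    done = subprocess.run(
        ["/bin/sh", "-c", script],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout


if __name__ == "__main__":
    # Empty environment, as with `env -`: the variable is gone.
    print(ld_library_path_seen_by_child({}))
    # Environment passed through explicitly: the child sees it.
    print(ld_library_path_seen_by_child({"LD_LIBRARY_PATH": "/opt/demo/lib"}))
```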

@wanguanglu

@philwo What's the workaround? I am stuck here.

@gbkedar

gbkedar commented Jul 11, 2016

Confirming that c9bc051 does not fix this issue.

The problem seems to be the bazel file tensorflow/tensorflow/core/BUILD

Other parts of tensorflow invoke the right env command. For example:

//third_party/gpus/cuda:cuda_check [action 'Executing genrule //third_party/gpus/cuda:cuda_check']
(cd /home/gbkedar/.cache/bazel/_bazel_gbkedar/2cce0d25c048758b29b453d7b4e29b85/execroot/tensorflow && \
  exec env - \
    PATH= ...... \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; OUTPUTDIR=`readlink -f bazel-out/local_linux-opt/genfiles/third_party/gpus/cuda/../../..`; cd `dirname third_party/gpus/cuda/cuda_config.sh`; OUTPUTDIR=$OUTPUTDIR ./cuda_config.sh --check;')

The change adds the LD_LIBRARY_PATH with:

//third_party/gpus/cuda:cuda_check [action 'Executing genrule //third_party/gpus/cuda:cuda_check']
(cd /home/gbkedar/.cache/bazel/_bazel_gbkedar/2cce0d25c048758b29b453d7b4e29b85/execroot/tensorflow && \
  exec env - \
    LD_LIBRARY_PATH= ...... \
    PATH= ...... \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; OUTPUTDIR=`readlink -f bazel-out/local_linux-opt/genfiles/third_party/gpus/cuda/../../..`; cd `dirname third_party/gpus/cuda/cuda_config.sh`; OUTPUTDIR=$OUTPUTDIR ./cuda_config.sh --check;')

However, with protobuf neither PATH nor LD_LIBRARY_PATH is passed:

(cd /home/gbkedar/.cache/bazel/_bazel_gbkedar/2cce0d25c048758b29b453d7b4e29b85/execroot/tensorflow && \
  exec env - \
  bazel-out/host/bin/external/protobuf/protoc '--cpp_out=bazel-out/local_linux-opt/genfiles/' -I. -Iexternal/protobuf/src -Ibazel-out/local_linux-opt/genfiles/external/protobuf/src tensorflow/core/example/example.proto tensorflow/core/example/example_parser_configuration.proto tensorflow/core/example/feature.proto tensorflow/core/framework/allocation_description.proto tensorflow/core/framework/attr_value.proto tensorflow/core/framework/cost_graph.proto tensorflow/core/framework/device_attributes.proto tensorflow/core/framework/function.proto tensorflow/core/framework/graph.proto tensorflow/core/framework/kernel_def.proto tensorflow/core/framework/log_memory.proto tensorflow/core/framework/op_def.proto tensorflow/core/framework/step_stats.proto tensorflow/core/framework/summary.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_description.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/tensor_slice.proto tensorflow/core/framework/types.proto tensorflow/core/framework/variable.proto tensorflow/core/framework/versions.proto tensorflow/core/lib/core/error_codes.proto tensorflow/core/protobuf/config.proto tensorflow/core/protobuf/meta_graph.proto tensorflow/core/protobuf/named_tensor.proto tensorflow/core/protobuf/queue_runner.proto tensorflow/core/protobuf/saver.proto tensorflow/core/protobuf/tensorflow_server.proto tensorflow/core/util/event.proto tensorflow/core/util/memmapped_file_system.proto tensorflow/core/util/saved_tensor_slice.proto tensorflow/core/util/test_log.proto)
ERROR: /project/roysam/compiledLibs/tensorflow/tensorflow/core/BUILD:91:1: null failed: protoc failed: error executing command
  (cd /home/gbkedar/.cache/bazel/_bazel_gbkedar/2cce0d25c048758b29b453d7b4e29b85/execroot/tensorflow && \
  exec env - \
  bazel-out/host/bin/external/protobuf/protoc '--cpp_out=bazel-out/host/genfiles/' -I. -Iexternal/protobuf/src -Ibazel-out/host/genfiles/external/protobuf/src tensorflow/core/example/example.proto tensorflow/core/example/example_parser_configuration.proto tensorflow/core/example/feature.proto tensorflow/core/framework/allocation_description.proto tensorflow/core/framework/attr_value.proto tensorflow/core/framework/cost_graph.proto tensorflow/core/framework/device_attributes.proto tensorflow/core/framework/function.proto tensorflow/core/framework/graph.proto tensorflow/core/framework/kernel_def.proto tensorflow/core/framework/log_memory.proto tensorflow/core/framework/op_def.proto tensorflow/core/framework/step_stats.proto tensorflow/core/framework/summary.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_description.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/tensor_slice.proto tensorflow/core/framework/types.proto tensorflow/core/framework/variable.proto tensorflow/core/framework/versions.proto tensorflow/core/lib/core/error_codes.proto tensorflow/core/protobuf/config.proto tensorflow/core/protobuf/meta_graph.proto tensorflow/core/protobuf/named_tensor.proto tensorflow/core/protobuf/queue_runner.proto tensorflow/core/protobuf/saver.proto tensorflow/core/protobuf/tensorflow_server.proto tensorflow/core/util/event.proto tensorflow/core/util/memmapped_file_system.proto tensorflow/core/util/saved_tensor_slice.proto tensorflow/core/util/test_log.proto): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by bazel-out/host/bin/external/protobuf/protoc)
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by bazel-out/host/bin/external/protobuf/protoc)
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by bazel-out/host/bin/external/protobuf/protoc)
Target //tensorflow/cc:tutorials_example_trainer failed to build

Further, checking the dependencies of protoc from my shell shows that they are set correctly:

ldd bazel-out/host/bin/external/protobuf/protoc
        linux-vdso.so.1 =>  (0x00007ffffadff000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b5195f33000)
        libstdc++.so.6 => /share/apps/gcc-4.9.2/lib64/libstdc++.so.6 (0x00002b5196150000)
        libgcc_s.so.1 => /share/apps/gcc-4.9.2/lib64/libgcc_s.so.1 (0x00002b5196463000)
        libc.so.6 => /lib64/libc.so.6 (0x00002b5196679000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003e37c00000)
        libm.so.6 => /lib64/libm.so.6 (0x00002b5196a0d000)
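
Note that `ldd` resolves libraries using the environment of the shell it is run from, so the output above reflects the LD_LIBRARY_PATH of the interactive shell, not the empty environment that the `env -` invocation gives protoc. To compare the two cases, it can help to pull the resolved paths out of the `ldd` output. The helper below is a small sketch for that; its name is hypothetical and it only handles the `name => path (address)` line shape shown above.

```python
def resolved_libs(ldd_output: str) -> dict:
    """Map each library name in `ldd` output to the path it resolved to."""
    libs = {}
    for line in ldd_output.splitlines():
        name, arrow, rest = line.strip().partition(" => ")
        rest = rest.strip()
        # Skip lines without "=>" and entries with no path, e.g. linux-vdso.
        if arrow and rest and not rest.startswith("("):
            libs[name] = rest.split(" (")[0]
    return libs
```

Running `ldd` once normally and once under `env -`, then comparing `resolved_libs(...)["libstdc++.so.6"]` for both, should show whether the loader falls back to /usr/lib64.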

Added an issue in tensorflow tensorflow/tensorflow#3261 since this seems to be more of a tensorflow bazel configuration issue than one with bazel.

@gbkedar

gbkedar commented Jul 12, 2016

@kheuton You can add env=ctx.configuration.default_shell_env to the ctx.action call in bazel-tensorflow/external/protobuf/protobuf.bzl to get past the problem.
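
In other words, the workaround is a one-line addition of a keyword argument inside the existing `ctx.action(` call. For those who want to script it (for example because the external directory gets regenerated), here is a rough Python sketch that applies the edit textually. The helper name is hypothetical, and it assumes the call is written with `ctx.action(` at the end of a line followed by one argument per line; check the result by hand, since the real protobuf.bzl may be laid out differently.

```python
import re

ENV_LINE = "env=ctx.configuration.default_shell_env,"


def add_default_shell_env(bzl_source: str) -> str:
    """Insert the env= argument after the first `ctx.action(` line, if absent."""
    if ENV_LINE in bzl_source:
        return bzl_source  # already patched, leave untouched
    lines = bzl_source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if re.search(r"\bctx\.action\(\s*$", line):
            # Indent the new argument four spaces deeper than the call itself.
            indent = re.match(r"[ \t]*", line).group(0) + "    "
            lines.insert(index + 1, indent + ENV_LINE + "\n")
            break
    return "".join(lines)
```

Typical use would be to read the .bzl file, pass its text through `add_default_shell_env`, and write the result back.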

@kheuton
Author

kheuton commented Jul 24, 2016

Thanks @gbkedar, however, now I end up with:
gcc: error trying to exec 'as': execvp: No such file or directory

Running which gcc on the command line results in:
~/.conda/envs/tensor/bin/gcc
and which as:
/usr/bin/as

@gbkedar

gbkedar commented Jul 25, 2016

@kheuton in third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc
change the line
PREFIX_DIR = os.path.dirname(GCC_HOST_COMPILER_PATH)
to
PREFIX_DIR = os.path.dirname("Path to AS")
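
Here "Path to AS" is a placeholder for the full path of the assembler, e.g. the `/usr/bin/as` reported by `which as` above. As a sketch of how that directory could be looked up rather than typed in by hand, the function below uses `shutil.which`; the function name is made up for illustration and is not part of the wrapper script.

```python
import os
import shutil


def assembler_prefix_dir(as_path=None):
    """Return the directory containing `as`, or None if it cannot be found."""
    if as_path is None:
        as_path = shutil.which("as")  # searches PATH, like `which as`
    return os.path.dirname(as_path) if as_path else None
```

With the path from this thread, `assembler_prefix_dir("/usr/bin/as")` gives `/usr/bin`, which is the value PREFIX_DIR would need.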

@damienmg
Contributor

Closing as the problem should be fixed on protobuf and tensorflow.

@kheuton
Author

kheuton commented Jul 29, 2016

Thanks @damienmg, so you're saying there is no fix in bazel? Where should I move the issue, to both protobuf and tensorflow? Or is it just a protobuf thing?

@taylorpaul

@gbkedar Your comment from Jul 12 got me through this error after several hours of adjusting CROSSTOOL and recompile attempts! Thanks! (I was using the latest bazel build 0.3.2-2016-11-02 (@03afc7d) and tensorflow v0.11.0rc1)

I wanted to document my steps for other users attempting to compile tensorflow without root on a cluster. This is the first link that pops up in a google search of the error and the only one that helped so I thought I'd post the info here in hopes of saving others the hours I spent trying to get past this error. I was able to compile bazel from source, compile tensorflow from source, and successfully pip install from the created .whl file after adding the line from the Jul 12 comment above. Please find a more detailed explanation of my steps here:

Installing Tensorflow on CENTOS 6.8 without Root

@mukul1992

mukul1992 commented Dec 7, 2016

I was also able to compile using the hack suggested by @gbkedar and with the help of the instructions provided by @taylorpaul.
Tensorflow version 0.12.0-rc0 with bazel 0.4.1 on CentOS 6.
Thank you peeps!

@i3v

i3v commented Dec 7, 2016

@taylorpaul's guide is nice, but it looks like there's a simpler way to fix version 'GLIBCXX_3.4.20' not found.

I haven't tried his trick of copying "as", though (if it works, it should be a faster and simpler workaround). Instead, I recompiled gcc, hardcoding the paths to as, nm and ld: build tensorflow 0.12rc0 on CentOS6.5, which only had the gcc-4.4.7 compiler by default.

@VittalP

VittalP commented Jan 23, 2017

@taylorpaul's fix for the GLIBCXX issue worked for me. Thanks.
