Tensorflow GPU r0.7 compilation => gcc: language cuda not recognized #2253

Closed
puneet336 opened this issue May 6, 2016 · 7 comments

puneet336 commented May 6, 2016

Hi,
I am hitting an error while compiling TensorFlow r0.7 on CentOS 6 with gcc 4.9.3 (configured with: ./configure --prefix=/home/soft/gcc-4.9.3), CUDA 7.0, and cudnn 7.0.3.0:

____[72 / 801] Compiling tensorflow/core/kernels/cwise_op_gpu_real.cu.cc
ERROR: /home/user/TENSOR_GPU/tensorflow/tensorflow/core/BUILD:334:1: C++ compilation of rule '//tensorflow/core:gpu_kernels' failed: gcc failed: error executing command
  (cd /home/user/.cache/bazel/_bazel_jca142469/fc6103a815c76da222bbe3b7c887440a/tensorflow && \
  exec env - \
    PATH=/home/soft/cuda-7.0/bin:/home/apps/BINUTILS/2.25/gnu/bin:/home/soft/intel2015/composer_xe_2015.3.187/mkl/bin:/home/soft/gcc-4.9.3/bin:/home/user/TENSOR_GPU/jdk1.8.0_92/jre/bin:/home/user/TENSOR_GPU/jdk1.8.0_92/bin:/home/apps/PROTOBUF/2.6.1/gnu/include:/home/apps/PROTOBUF/2.6.1/gnu/bin:/home/apps/Caffe/CaffeDependencies/include:/home/soft/cuda-6.5/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/opt/pbs/default/bin:/opt/pbs/default/sbin:/opt/pbs/default/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/ibutils/bin:/home/apps/MATLAB/R2014b/bin:/home/user/bin \
  /home/soft/gcc-4.9.3/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++11' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem third_party/gpus/cuda/include -isystem bazel-out/local_linux-opt/genfiles/third_party/gpus/cuda/include -isystem external/bazel_tools/tools/cpp/gcc3 -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-c5e90d9e764e -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-c5e90d9e764e -isystem third_party/gpus/cuda -isystem bazel-out/local_linux-opt/genfiles/third_party/gpus/cuda -x cuda '-DGOOGLE_CUDA=1' -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fno-canonical-system-headers '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/_objs/gpu_kernels/tensorflow/core/kernels/cwise_op_gpu_equal_to.cu.pic.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/_objs/gpu_kernels/tensorflow/core/kernels/cwise_op_gpu_equal_to.cu.pic.d -fPIC -c tensorflow/core/kernels/cwise_op_gpu_equal_to.cu.cc -o bazel-out/local_linux-opt/bin/tensorflow/core/_objs/gpu_kernels/tensorflow/core/kernels/cwise_op_gpu_equal_to.cu.pic.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
gcc: error: language cuda not recognized

Also, the gcc man page does not list cuda as a possible value for the -x language switch:
-x language
Specify explicitly the language for the following input files (rather than letting the compiler choose a default based on the file name suffix). This option applies to all following input files until the next -x option. Possible values for language are:
c c-header c-cpp-output c++ c++-header c++-cpp-output objective-c objective-c-header objective-c-cpp-output objective-c++ objective-c++-header objective-c++-cpp-output assembler assembler-with-cpp ada f77 f77-cpp-input f95 f95-cpp-input java

Any help or hint would be much appreciated.

@puneet336
Author

In tensorflow/tensorflow.bzl, if I set
cuda_copts = ["-x", "c++", "-DGOOGLE_CUDA=1",
" "] + cuda_copts
the compilation gets further, but as expected it then fails on CUDA constructs such as threadIdx.x, since the .cu.cc files are now compiled as plain C++.
I usually see nvcc invoked like this:
nvcc -ccbin "gcc" -Xcompiler -fPIC

so passing -x cuda to gcc looks odd to me. What am I missing here?

@zheng-xq
Contributor

zheng-xq commented May 6, 2016

The problem is that this file should not be sent to gcc directly. It could be a bazel misconfiguration somewhere.

The correct sequence is as follows. You can follow it and see where things went wrong.

  • bazel.rc has this line: "build:cuda --crosstool_top=@tf//third_party/gpus/crosstool". So under "--config=cuda", third_party/gpus/crosstool should be used instead of the default crosstool.
  • third_party/gpus/crosstool should have this line: "tool_path { name: "gcc" path: "clang/bin/crosstool_wrapper_driver_is_not_gcc" }", which instructs bazel to use that wrapper as the compiler instead of gcc. The wrapper understands the "-x cuda" option and knows which compiler to use.
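
To make the second point concrete, here is a minimal sketch of the kind of dispatch such a wrapper performs. It is not the actual crosstool_wrapper_driver_is_not_gcc source: the function name pick_compiler and the default tool names are hypothetical, and the real wrapper also has to translate the remaining flags into a form nvcc accepts.

```python
def pick_compiler(argv, cpu_compiler="gcc", nvcc="nvcc"):
    """Return (tool, args) for one compile command line.

    Hypothetical helper: if the arguments contain the pair "-x", "cuda",
    the pair is dropped and the command is routed to nvcc. Any other
    command line goes to the host compiler unchanged.
    """
    args = list(argv)
    for i in range(len(args) - 1):
        if args[i] == "-x" and args[i + 1] == "cuda":
            # gcc itself rejects "-x cuda", so it must not see this pair.
            return nvcc, args[:i] + args[i + 2:]
    return cpu_compiler, args


if __name__ == "__main__":
    print(pick_compiler(["-x", "cuda", "-DGOOGLE_CUDA=1", "-c", "k.cu.cc"]))
    print(pick_compiler(["-std=c++11", "-c", "set.cc"]))
```

If tool_path points straight at gcc, this routing step is skipped and gcc receives "-x cuda" itself, which matches the "language cuda not recognized" error reported above.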

@puneet336
Author

puneet336 commented May 7, 2016

Hi,
Initially, with crosstool_wrapper_driver_is_not_gcc, I ran into compilation errors (log below) such as error: unrecognized command line option "-std=c++11", so I manually replaced the wrapper with gcc. I modified third_party/gpus/crosstool/CROSSTOOL for my gcc 4.9.3, following rdipietro's comment (7 Mar) at bazelbuild/bazel#760.
One of the replacements was:
tool_path { name: "gcc" path: "clang/bin/crosstool_wrapper_driver_is_not_gcc" } ===> tool_path { name: "gcc" path: "/home/soft/gcc-4.9.3/bin/gcc" }.

You may want to look at the CROSSTOOL file I am currently using. Please suggest modifications, if any.

error log:
____Building...
____[1 / 139] Compiling external/re2/util/valgrind.cc
____[1 / 303] Compiling external/re2/util/stringprintf.cc [for host]
____[1 / 324] Compiling external/re2/re2/dfa.cc [for host]
ERROR: /home/user/.cache/bazel/_bazel_jca142469/fc6103a815c76da222bbe3b7c887440a/external/re2/BUILD:9:1: C++ compilation of rule '@re2//:re2' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
  (cd /home/user/.cache/bazel/_bazel_jca142469/fc6103a815c76da222bbe3b7c887440a/tensorflow && \
  exec env - \
    PATH=/home/soft/cuda-7.0/bin:/home/apps/BINUTILS/2.25/gnu/bin:/home/soft/intel2015/composer_xe_2015.3.187/mkl/bin:/home/soft/gcc-4.9.3/bin:/home/user/TENSOR_GPU/jdk1.8.0_92/jre/bin:/home/user/TENSOR_GPU/jdk1.8.0_92/bin:/home/apps/PROTOBUF/2.6.1/gnu/include:/home/apps/PROTOBUF/2.6.1/gnu/bin:/home/apps/Caffe/CaffeDependencies/include:/home/soft/cuda-6.5/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/opt/pbs/default/bin:/opt/pbs/default/sbin:/opt/pbs/default/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/ibutils/bin:/home/apps/MATLAB/R2014b/bin:/home/user/bin \
  third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++11' -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem external/bazel_tools/tools/cpp/gcc3 -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fno-canonical-system-headers '-frandom-seed=bazel-out/local_linux-opt/bin/external/re2/_objs/re2/external/re2/re2/set.pic.o' -MD -MF bazel-out/local_linux-opt/bin/external/re2/_objs/re2/external/re2/re2/set.pic.d -fPIC -c external/re2/re2/set.cc -o bazel-out/local_linux-opt/bin/external/re2/_objs/re2/external/re2/re2/set.pic.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
gcc: unrecognized option '-no-canonical-prefixes'
cc1plus: error: unrecognized command line option "-std=c++11"
cc1plus: error: unrecognized command line option "-fno-canonical-system-headers"
cc1plus: warning: unrecognized command line option "-Wno-free-nonheap-object"
____Building complete.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
____Elapsed time: 2.826s, Critical Path: 2.08s

@puneet336
Author

puneet336 commented May 7, 2016

Hi,
I got past the cc1plus: error: unrecognized command line option "-std=c++11" error by changing the compiler settings in third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc to:

import pipes
print (sys.version)
CURRENT_DIR = os.path.dirname(sys.argv[0])
CPU_COMPILER = ('/home/soft/gcc-4.9.3/bin/gcc')
NVCC_PATH = ('/home/soft/cuda-7.0/bin/nvcc')
GCC_HOST_COMPILER_PATH = ('/home/soft/gcc-4.9.3/bin/gcc')
LLVM_HOST_COMPILER_PATH = ('/home/soft/gcc-4.9.3/bin/gcc')
PREFIX_DIR = os.path.dirname(GCC_HOST_COMPILER_PATH)

But now I am getting ImportError: No module named argparse, even though argparse 1.4.0 is installed as a Python module and imports fine in an interactive session:

[user@gpulogin01 ~/TENSOR_GPU/tensorflow]
$ python
Python 2.7.10 (default, May  5 2016, 17:12:28)
[GCC 4.9.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from argparse import ArgumentParser
>>>
>>>

I guess the wrapper is being run with /usr/bin/python instead of ~/TENSOR_GPU/jdk1.8.0_92/bin/python.
Can you suggest which file controls the Python interpreter path?

$ bazel build --verbose_failures -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package 2>&1|tee tensorflow_build.log
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
____Loading...
____Found 1 target...
____Building...
____[1 / 7] Linking google/protobuf/libprotobuf_lite.a [for host]
____[1 / 42] Compiling google/protobuf/src/google/protobuf/compiler/java/java_primitive_field_lite.cc [for host]
____[1 / 481] Linking external/re2/libre2.pic.a
ERROR: /home/user/TENSOR_GPU/tensorflow/google/protobuf/BUILD:520:1: Linking of rule '//google/protobuf:internal/_api_implementation.so' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
  (cd /home/user/.cache/bazel/_bazel_user/fc6103a815c76da222bbe3b7c887440a/tensorflow && \
  exec env - \
  third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -shared -o bazel-out/local_linux-opt/bin/google/protobuf/internal/_api_implementation.so -Wl,-whole-archive bazel-out/local_linux-opt/bin/google/protobuf/_objs/internal/_api_implementation.so/google/protobuf/python/google/protobuf/internal/api_implementation.pic.o -Wl,-no-whole-archive -lstdc++ -B/usr/bin/ -Wl,-R/home/soft/gcc-4.9.3/lib64 -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
Traceback (most recent call last):
  File "third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 39, in <module>
    from argparse import ArgumentParser
ImportError: No module named argparse
Target //tensorflow/tools/pip_package:build_pip_package failed to build
____Elapsed time: 0.598s, Critical Path: 0.16s
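
The traceback above comes from a command launched through exec env -, which starts it with an empty environment, so the interpreter that runs the wrapper need not be the one found in an interactive shell. If it is the stock CentOS 6 /usr/bin/python (Python 2.6), argparse is not in its standard library. A minimal sketch for checking this follows; the helper name interpreter_report is hypothetical, and the idea is to run the script with the same interpreter line the wrapper uses.

```python
import sys


def interpreter_report():
    """Describe the running interpreter and whether argparse can be imported."""
    try:
        import argparse  # in the standard library since Python 2.7 / 3.2
        has_argparse = True
    except ImportError:
        has_argparse = False
    return {
        "executable": sys.executable,
        "version": tuple(sys.version_info[:3]),
        "has_argparse": has_argparse,
    }


if __name__ == "__main__":
    print(interpreter_report())
```

Comparing the reported executable between an interactive run and a run started the way the wrapper is started should show whether two different Pythons are involved.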

@puneet336
Author

puneet336 commented May 7, 2016

Hi,
After adding cxx_builtin_include_directory: "/home/soft/cuda-7.0/include/" to the CROSSTOOL file
and setting the default interpreter to /home/user/TENSOR_GPU/jdk1.8.0_92/bin/python,
I got past the previous error. Now I am stuck at:

____From Compiling tensorflow/core/kernels/cwise_op_equal_to.cc:
2.7.10 (default, May 5 2016, 17:12:28) [GCC 4.9.3]
In file included from ./tensorflow/core/platform/default/logging.h:23:0,
                 from ./tensorflow/core/platform/logging.h:24,
                 from ./tensorflow/core/lib/core/status.h:24,
                 from ./tensorflow/core/framework/op_def_builder.h:25,
                 from ./tensorflow/core/framework/op.h:23,
                 from ./tensorflow/core/kernels/cwise_ops_common.h:25,
                 from tensorflow/core/kernels/cwise_op_equal_to.cc:16:
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = long unsigned int; T2 = int; std::string = std::basic_string<char>]':
./tensorflow/core/kernels/cwise_ops_common.h:55:5: required from 'static Eigen::array<long int, NDIMS> tensorflow::BinaryOpShared::ToIndexArray(const Vec&) [with int NDIMS = 2; tensorflow::BCast::Vec = tensorflow::gtl::InlinedVector<long long int, 4>]'
./tensorflow/core/kernels/cwise_ops_common.h:110:7: required from 'void tensorflow::BinaryOp<Device, Functor>::Compute(tensorflow::OpKernelContext*) [with Device = Eigen::GpuDevice; Functor = tensorflow::functor::equal_to<long long int>]'
tensorflow/core/kernels/cwise_op_equal_to.cc:37:1: required from here
./tensorflow/core/platform/default/logging.h:194:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 == ) // Compilation error with CHECK_EQ(NULL, x)?
 ^
./tensorflow/core/platform/macros.h:54:28: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) x
 ^
./tensorflow/core/platform/default/logging.h:193:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_EQ,
 ^
____From Compiling tensorflow/core/kernels/l2loss_op_gpu.cu.cc:
2.7.10 (default, May 5 2016, 17:12:28) [GCC 4.9.3]
gcc: error trying to exec 'as': execvp: No such file or directory
ERROR: /home/user/TENSOR_GPU/tensorflow/tensorflow/core/BUILD:334:1: output 'tensorflow/core/_objs/gpu_kernels/tensorflow/core/kernels/l2loss_op_gpu.cu.pic.o' was not created.
ERROR: /home/user/TENSOR_GPU/tensorflow/tensorflow/core/BUILD:334:1: not all outputs were created.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
____Elapsed time: 35.525s, Critical Path: 34.98s
However, the as binary is on my PATH, both at /home/apps/BINUTILS/2.25/gnu/bin/as and at /usr/bin/as.
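
One possible explanation, not confirmed in this thread, is that the PATH of the login shell is not the PATH the compile action sees, since the earlier logs show actions being started through exec env -. A small sketch for checking which as a given PATH string would resolve is shown below; the helper name find_on_path is hypothetical, and the PATH values are whatever you pass in.

```python
import os


def find_on_path(name, path):
    """Return every executable called `name` in the PATH string `path`, in order."""
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries, including a completely empty PATH
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # PATH as seen by this process; a build action may see a different one.
    print(find_on_path("as", os.environ.get("PATH", "")))
    # An empty PATH, as a stand-in for an environment-stripped action.
    print(find_on_path("as", ""))
```

Running the same check from inside the wrapper script, using the PATH it actually receives, would show whether as is reachable at the point where gcc tries to exec it.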

@girving
Contributor

girving commented Jun 7, 2016

Unfortunately we don't officially support CentOS. We'll do our best to support installation on unsupported platforms, but in this case, it's unclear what we can do. If it turns out there is a problem that can be fixed with changes to TensorFlow, we're happy to accept a PR.

@girving girving closed this as completed Jun 7, 2016
@yliu120
Contributor

yliu120 commented Dec 17, 2016

I built the latest TensorFlow (GitHub master branch) with GPU support at a supercomputing center (CentOS 6.7 with gcc 4.9.2). I wrote up the environment variable settings that are necessary for a successful build. You can refer to my protocol:

http://biophysics.med.jhmi.edu/~yliu120/tensorflow.html

TensorFlow can be built on almost any Linux distribution with gcc >= 4.8 (libstdc++ usually comes with gcc 4.8). You just need to set the correct environment variables.
