ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path. #4105

Closed
trevorwelch opened this Issue Aug 30, 2016 · 74 comments


trevorwelch commented Aug 30, 2016

Environment info

Operating System:
OS 10.10.5

Installed version of CUDA and cuDNN:

$ ls -l /usr/local/cuda/lib/libcud*
-rwxr-xr-x  1 root           wheel      8280 Apr 13 01:02 /usr/local/cuda/lib/libcuda.dylib
lrwxr-xr-x  1 root           wheel        45 Apr 13 01:03 /usr/local/cuda/lib/libcudadevrt.a -> /Developer/NVIDIA/CUDA-7.5/lib/libcudadevrt.a
lrwxr-xr-x  1 root           wheel        50 Apr 13 01:03 /usr/local/cuda/lib/libcudart.7.5.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudart.7.5.dylib
lrwxr-xr-x  1 root           wheel        46 Apr 13 01:03 /usr/local/cuda/lib/libcudart.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudart.dylib
lrwxr-xr-x  1 root           wheel        49 Apr 13 01:03 /usr/local/cuda/lib/libcudart_static.a -> /Developer/NVIDIA/CUDA-7.5/lib/libcudart_static.a
-rwxr-xr-x@ 1 production204  staff  60108616 Feb  8  2016 /usr/local/cuda/lib/libcudnn.4.dylib
lrwxr-xr-x  1 root           admin        47 Aug 29 18:08 /usr/local/cuda/lib/libcudnn.5.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudnn.5.dylib
lrwxr-xr-x  1 root           admin        45 Aug 29 18:08 /usr/local/cuda/lib/libcudnn.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudnn.dylib
-rw-r--r--@ 1 production204  staff  59311504 Feb  8  2016 /usr/local/cuda/lib/libcudnn_static.a
  1. The output from python -c "import tensorflow; print(tensorflow.__version__)":
    (can't get that far, but I'm using 0.10)
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.dylib locally
Segmentation fault: 11
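Since several of the libraries in the listing above are symlinks into /Developer/NVIDIA/CUDA-7.5/lib, one quick sanity check is that none of them are dangling. A minimal sketch (the check_links helper is hypothetical, not part of this thread):

```shell
# check_links: report any dangling symlinks in a directory.
# A dangling symlink (-L true, but -e false) would explain load failures.
check_links() {
  for f in "$1"/*; do
    if [ -L "$f" ] && [ ! -e "$f" ]; then
      echo "BROKEN: $f -> $(readlink "$f")"
    fi
  done
}

# e.g.: check_links /usr/local/cuda/lib
```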

If installed from source, provide

  1. The commit hash (git rev-parse HEAD)
4c49dbebef05442c7e72d6129a30574fcd13f0e1
  2. The output of bazel version
$ bazel version
Build label: 0.3.1-homebrew
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Aug 4 09:59:58 2016 (1470304798)
Build timestamp: 1470304798
Build timestamp as int: 1470304798

If possible, provide a minimal reproducible example (We usually don't have time to read hundreds of lines of your code)

$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.
ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.
INFO: Elapsed time: 0.076s
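When this error appears, it can help to look at what Bazel actually materialized for the repository under its output base. A guarded sketch (assuming bazel is on PATH and this is run from the workspace root; the show_cuda_crosstool function name is made up):

```shell
# show_cuda_crosstool: list the generated @local_config_cuda//crosstool
# package under Bazel's output base, degrading gracefully if absent.
show_cuda_crosstool() {
  if ! command -v bazel >/dev/null 2>&1; then
    echo "bazel not on PATH"
    return 0
  fi
  base=$(bazel info output_base 2>/dev/null) || true
  if [ -n "$base" ] && [ -d "$base/external/local_config_cuda/crosstool" ]; then
    ls -l "$base/external/local_config_cuda/crosstool"
  else
    echo "no generated crosstool package found; try re-running ./configure"
  fi
}

show_cuda_crosstool
```

If the directory exists and contains BUILD and CROSSTOOL files, the error is more likely a caching problem than a genuinely missing package.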

What other attempted solutions have you tried?

  • Downgrading to cuDNN4, switching between 4 and 5
  • Re-installing bazel
  • Modifying CROSSTOOL file according to various threads
  • Manually linking CUDA libraries during ./configure to not use symlinked libraries
  • Various other hacks over the last week 😭

@trevorwelch changed the title from "ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path. ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path. INFO: Elapsed time: 1.327s" to "ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path." (Aug 30, 2016)

vrv commented Aug 30, 2016

I got this recently too, I was somehow successful by just re-running ./configure and then immediately running bazel build, but I'm not sure what's going on.

trevorwelch commented Aug 30, 2016

Following that, @vrv, I just tried uninstalling TF completely and then re-running with the same ./configure settings:

$ ./configure
Please specify the location of python. [Default is /Library/Frameworks/Python.framework/Versions/2.7/bin/python]: 
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
Google Cloud Platform support will be enabled for TensorFlow
Found possible Python library paths:
  /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
  /Library/Python/2.7/site-packages
Please input the desired Python library path to use.  Default is [/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages]

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
.
INFO: Waiting for response from Bazel server (pid 63884)...
WARNING: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/boringssl_git/WORKSPACE:1: Workspace name in /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/boringssl_git/WORKSPACE (@boringssl) does not match the name given in the repository's definition (@boringssl_git); this will cause a build error in future versions.
INFO: All external dependencies fetched successfully.
Configuration finished

Then a simplified build command:

$ bazel build -c opt --config=cuda tensorflow/...

After a very verbose and lengthy compile attempt, I received this error message (it caused OS X Terminal app to hang permanently as well, so I couldn't copy-paste, had to take a screenshot):

https://www.dropbox.com/s/riu5f4n5aj1opmk/Screenshot%202016-08-30%2016.33.48.png?dl=0

Yet again, some sort of a CROSSTOOL issue. I've made sure to pull and update my local TF often over the week that I've been trying to build, as I've seen lots of activity related to this component of TF.

Mistobaan commented Aug 30, 2016

Yes, if you are developing TF this happens quite often. I think it might be related to Bazel's caching system: after a few hours it resets and I have to re-run ./configure.

davidzchen commented Aug 30, 2016

+cc @damienmg

This seems to be a bug in Bazel. (Edit: to clarify: I meant the occasional no such package '@local_config_cuda//crosstool': BUILD file not found on package path. error).

Next time this happens, can you take a look at the directory bazel-tensorflow/external/local_config_cuda/crosstool and let me know which files are there?

trevorwelch commented Aug 30, 2016

@davidzchen thanks for your reply.

This error or some version of it is consistent:

tensorflow$ cd bazel-tensorflow/external/local_config_cuda/crosstool

production204@Trevors-MacBook-Pro crosstool$ ls -l
total 32
-rwxr-xr-x  1 production204  wheel   925 Aug 30 14:45 BUILD
-rwxr-xr-x  1 production204  wheel  9068 Aug 30 14:45 CROSSTOOL
drwxr-xr-x  3 production204  wheel   102 Aug 30 14:45 clang

production204@Trevors-MacBook-Pro crosstool$ 

davidzchen commented Aug 31, 2016

I just saw your screenshot, and that appears to be a different problem than the no such package '@local_config_cuda//crosstool': BUILD file not found on package path. error, which seems to be a caching issue.

Do you mean that the crosstool_wrapper_driver_is_not_gcc error occurs consistently? Is it causing your Terminal.app to hang every time? If it reproduces consistently, can you re-run the command with --verbose_failures?

trevorwelch commented Aug 31, 2016

I meant that my TF builds seem to fail with crosstool-related errors consistently; it was probably naivete about the specifics on my part to think that crosstool_wrapper_driver_is_not_gcc and CROSSTOOL could be the same problem!

davidzchen commented Aug 31, 2016

No problem. The naming can be a bit confusing. The @local_config_cuda//crosstool error may be an issue with Bazel's caching; I have run into it a couple of times myself, and it usually goes away after I re-run ./configure.

Were you able to reproduce the crosstool_wrapper_driver_is_not_gcc error again? Looking at your screenshot, it looks like the headerpad_max_install_names flag is not spelled correctly for some reason since it is complaining about eaderpad_max_install_names. Did you change this flag in your CROSSTOOL file?

Dapid commented Aug 31, 2016

I am getting the same missing crosstool on Linux. The strange thing is that there isn't even a bazel-tensorflow directory:

[david@SQUIDS tensorflow]$ ls
ACKNOWLEDGMENTS  avro.BUILD   boringssl.BUILD  bzip2.BUILD  CONTRIBUTING.md  farmhash.BUILD  gmock.BUILD  ISSUE_TEMPLATE.md  jsoncpp.BUILD  nanopb.BUILD  png.BUILD      README.md   six.BUILD   third_party  util       zlib.BUILD
AUTHORS          boost.BUILD  bower.BUILD      configure    eigen.BUILD      gif.BUILD       grpc.BUILD   jpeg.BUILD         LICENSE        navbar.md     _python_build  RELEASE.md  tensorflow  tools        WORKSPACE

bazel is 0.3.1, and I have run ./configure four times now.

trevorwelch commented Aug 31, 2016

I do see that eaderpad_max_install_names reference in the screenshot above; however, after searching through all the files and folders in my TF directory, I don't see that text anywhere, only headerpad_max_install_names. I don't know why the h is getting clipped off.

Update: Same error message related to crosstool_wrapper_driver_is_not_gcc. The '@local_config_cuda//crosstool': BUILD file not found on package path. error seems to have gone away after doing a pip uninstall on TF and then re-installing.

INFO: From Linking tensorflow/cc/ops/io_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/random_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/parsing_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/sparse_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/logging_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/string_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/user_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/candidate_sampling_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/control_flow_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/image_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/array_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/linalg_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/no_op_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
INFO: From Linking tensorflow/cc/ops/training_ops_gen_cc [for host]:
clang: warning: argument unused during compilation: '-pthread'
ERROR: /Users/production204/Github/tensorflow/tensorflow/cc/BUILD:179:1: Executing genrule //tensorflow/cc:io_ops_genrule failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 5.
dyld: Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow/bazel-out/host/bin/tensorflow/cc/ops/io_ops_gen_cc
  Reason: image not found
/bin/bash: line 1: 44071 Trace/BPT trap: 5       bazel-out/host/bin/tensorflow/cc/ops/io_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/io_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/io_ops.cc 0
Target //tensorflow/cc:tutorials_example_trainer failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 3015.469s, Critical Path: 3002.51s

The complete log was too big for pastebin, here it is on Dropbox: https://www.dropbox.com/home/Documents%20Dropbox?preview=TW-TF-error-log-083116.txt

Then I ran the same commands as above, but with --verbose_failures (hard to imagine it being more verbose than the previous log, which was almost 15,000 lines!); the final error message was:

ERROR: /Users/production204/Github/tensorflow/tensorflow/cc/BUILD:179:1: Executing genrule //tensorflow/cc:training_ops_genrule failed: bash failed: error executing command 
  (cd /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow && \
  exec env - \
    PATH=/usr/local/cuda/bin:/Library/Frameworks/Python.framework/Versions/2.7/bin:/usr/local/bin:usr/local/sbin:/usr/local/mysql/bin:/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin \
    TMPDIR=/var/folders/h3/pn9k79xn6qd9jgksqbkpn3l80000gn/T/ \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.cc 0'): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 5.
dyld: Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow/bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc
  Reason: image not found
/bin/bash: line 1: 74845 Trace/BPT trap: 5       bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.cc 0
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 3111.405s, Critical Path: 3097.65s

production204@Trevors-MacBook-Pro tensorflow $ 

Here's the complete log: https://www.dropbox.com/s/nozqcscnc9ho5uz/TW-TF-error-log-083116--verbose_failures.txt?dl=0
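For dyld "Library not loaded: @rpath/..." failures like the one in the log above, it can be useful to print the install names the failing binary actually links against. A small macOS-guarded sketch (print_dylib_deps is a hypothetical helper, not from this thread):

```shell
# print_dylib_deps: show the shared-library install names a Mach-O
# binary references (via otool -L), so @rpath entries can be inspected.
print_dylib_deps() {
  if command -v otool >/dev/null 2>&1; then
    otool -L "$1"
  else
    echo "otool not available (not macOS?)"
  fi
}

# e.g.:
# print_dylib_deps bazel-out/host/bin/tensorflow/cc/ops/io_ops_gen_cc
```

A common workaround for this class of failure is exporting DYLD_LIBRARY_PATH=/usr/local/cuda/lib before building, so dyld can find libcudart.7.5.dylib even when the @rpath entry does not resolve; whether that applies to this particular build is untested.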

davidzchen commented Sep 1, 2016

@Dapid That is interesting. How did you run the ./configure script? Can you paste the output?

@damienmg Is there currently a way to inspect the contents of /external after running bazel fetch but before the Bazel output directories get symlinked?

@trevorwelch FWIW, most of the noise in the output are compiler warnings. The dyld: Library not loaded: @rpath/libcudart.7.5.dylib error is interesting. Can you verify whether the file bazel-bin/tensorflow/cc/tutorials_example_trainer.runfiles/local_config_cuda/cuda/lib/libcudart.7.5.dylib exists? If not, what files are under the bazel-bin/tensorflow/cc/tutorials_example_trainer.runfiles/local_config_cuda/cuda/lib directory?

Dapid commented Sep 1, 2016

@davidzchen

./configure 
Please specify the location of python. [Default is /home/david/.virtualenvs/py35/bin/python]: 
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /bin/gcc]: /usr/local/cuda/bin/gcc
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 
libcudnn.so resolves to libcudnn.4
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 5.0
INFO: Reading 'startup' options from /home/david/.bazelrc: --batch
Warning: ignoring LD_PRELOAD in environment.
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
INFO: Reading 'startup' options from /home/david/.bazelrc: --batch
Warning: ignoring LD_PRELOAD in environment.
WARNING: /home/david/.cache/bazel/_bazel_david/47d00ffdd2fc0515138a34f138cebd63/external/boringssl_git/WORKSPACE:1: Workspace name in /home/david/.cache/bazel/_bazel_david/47d00ffdd2fc0515138a34f138cebd63/external/boringssl_git/WORKSPACE (@boringssl) does not match the name given in the repository's definition (@boringssl_git); this will cause a build error in future versions.
INFO: All external dependencies fetched successfully.
Configuration finished
git log
commit 6ce5b5c8298273e3861a75fb6ccde63b9dd157c5
Author: Sanders Kleinfeld <sandersk@users.noreply.github.com>
Date:   Sun Aug 28 01:00:52 2016 -0400

On branch r0.10.

If I leave the default GCC it is created, but the build fails because it is incompatible with CUDA.

trevorwelch commented Sep 1, 2016

@davidzchen
It does exist:

tensorflow$ cd bazel-bin/tensorflow/cc/tutorials_example_trainer.runfiles/local_config_cuda/cuda/lib/

lib$ ls -l
total 40
lrwxr-xr-x  1 production204  wheel  126 Aug 31 11:03 libcublas.7.5.dylib -> /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/local_config_cuda/cuda/lib/libcublas.7.5.dylib
lrwxr-xr-x  1 production204  wheel  126 Aug 31 11:03 libcudart.7.5.dylib -> /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/local_config_cuda/cuda/lib/libcudart.7.5.dylib
lrwxr-xr-x  1 production204  wheel  123 Aug 31 11:03 libcudnn.5.dylib -> /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/local_config_cuda/cuda/lib/libcudnn.5.dylib
lrwxr-xr-x  1 production204  wheel  125 Aug 31 11:03 libcufft.7.5.dylib -> /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/local_config_cuda/cuda/lib/libcufft.7.5.dylib
lrwxr-xr-x  1 production204  wheel  126 Aug 31 11:03 libcurand.7.5.dylib -> /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/external/local_config_cuda/cuda/lib/libcurand.7.5.dylib

lib$ 
jmhodges commented Sep 1, 2016

Whoa, I'm running into this, too, but on master with OS X 10.11.6.

./configure:

/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 
Please specify the location where CUDA  toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 
Please specify the location where cuDNN  library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0

Here's my file listings. All of the symlinks work and the files are all Mach-O so no weird accidental ELF or something: https://gist.github.com/jmhodges/a5de9cc5760333f5b57040d1947ec190

This was after going to sleep and coming back to this just now. Last night, I was debugging a different error condition and just came back to find my builds no longer working. I thought it was me hand-hacking in extra linkopts (-L/usr/local/cuda/lib, specifically) into various BUILD files trying to get the dyld error fixed.

lissyx commented Sep 1, 2016

I can confirm this as well, building on a Debian (sid, up to date as of today) system.

jmhodges commented Sep 1, 2016

I've found I can induce this by Ctrl-C'ing in the middle of a fresh bazel build.

FlorinAndrei commented Sep 3, 2016

Ubuntu-16.04, CUDA 8, java 1.8.0_101, bazel 0.3.1

Building from master today

Started ./configure in a virtual instance in VirtualBox, did a CTRL-C because it was taking too long. Went home, fired up the instance again, deleted tensorflow repo, cloned it again.

Did ./configure again with same options as before, it worked well except one warning at the beginning:

Found stale PID file (pid=20777). Server probably died abruptly, continuing...

Ignored it, and did the command to build for GPU:

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

And it failed immediately:

INFO: Waiting for response from Bazel server (pid 9635)...
ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.
ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path.
INFO: Elapsed time: 4.667s

EDIT:

Tried bazel clean and try again. bazel clean --expunge and try again ./configure and bazel build. Nothing helps. Fails the same way always. :(

davidzchen commented Sep 3, 2016

There are two issues being discussed in this thread. @trevorwelch, let's move the Library not loaded: @rpath/libcudart.7.5.dylib discussion over to #4187.

For those experiencing the '@local_config_cuda//crosstool': BUILD file not found issue:

  • If the Bazel output directories (i.e. bazel-tensorflow, etc.) exist, what is the output of ls -l bazel-tensorflow/external/local_config_cuda/crosstool?
  • If not, what is the output of ls -l $(bazel info output_base)/external/local_config_cuda/crosstool?

In the meantime, I am still trying to reproduce this.
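The two checks above can be wrapped in a minimal shell sketch; `TF_DIR` and `check_dir` are hypothetical helpers added here for illustration, and the paths are the ones asked about:

```shell
#!/bin/sh
# Minimal sketch of the diagnostic above. TF_DIR and check_dir are
# hypothetical helpers, not part of TensorFlow or Bazel.
TF_DIR="${TF_DIR:-.}"

check_dir() {
  # Print PRESENT or MISSING so broken and working checkouts are easy to compare.
  if [ -d "$1" ]; then
    echo "PRESENT: $1"
  else
    echo "MISSING: $1"
  fi
}

check_dir "$TF_DIR/bazel-tensorflow/external/local_config_cuda/crosstool"
check_dir "$TF_DIR/bazel-tensorflow/external/local_config_cuda/cuda"
```

On an affected checkout this is expected to print MISSING for crosstool while cuda is PRESENT, matching the reports below.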

asimonov commented Sep 3, 2016

I experience the same issue: TF 0.10, Mac OS El Capitan, Bazel 0.3.0.

davidzchen commented Sep 3, 2016

@asimonov - Can you print the contents of the local_config_cuda/crosstool directory as I mentioned in my comment above?

FlorinAndrei commented Sep 3, 2016

root@machine-learning:/vagrant/packages/tensorflow# ls *bazel*
ls: cannot access '*bazel*': No such file or directory
root@machine-learning:/vagrant/packages/tensorflow# ls -l $(bazel info output_base)/external/local_config_cuda/crosstool
ls: cannot access '/root/.cache/bazel/_bazel_root/b0bb79a433b74dfa52314ef9af1d2ddd/external/local_config_cuda/crosstool': No such file or directory

contents of bazel cache after BUILD file not found

Here's how to reproduce it:

Clone this repo: https://github.com/FlorinAndrei/ml-setup

Checkout the ubuntu1604 branch, then launch the virtual machine and run the ansible installer, then compile TF by hand:

git clone https://github.com/FlorinAndrei/ml-setup
cd ml-setup
git checkout ubuntu1604
vagrant up
vagrant ssh

sudo su -
cp /vagrant/bash_profile_example /root/.bash_profile
exit
sudo su -

cd /vagrant
# this will take a long time
ansible-playbook -i inventory main.yml
exit
sudo su -

cd /vagrant/packages/tensorflow
./configure

# Hit ENTER on every question except:
# Do you wish to build TensorFlow with GPU support? (answer: y)
# Please specify a list of comma-separated Cuda compute capabilities you want to build with. (answer: 6.1)

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

However, if you delete the tensorflow repo, re-clone and try again, it starts compiling:

cd
rm -rf /vagrant/packages/tensorflow
cd /vagrant
ansible-playbook -i inventory 40-tensorflow.yml

cd /vagrant/packages/tensorflow
./configure

# Hit ENTER on every question except:
# Do you wish to build TensorFlow with GPU support? (answer: y)
# Please specify a list of comma-separated Cuda compute capabilities you want to build with. (answer: 6.1)

contents of bazel cache after ./configure

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

And now it starts compiling.

EDIT: Even on second try, it still fails to compile all the way to the end, but that seems like a different issue, which I've opened here:

#4190

asimonov commented Sep 4, 2016

David,

I cannot find local_config_cuda/crosstool directory anywhere in tensorflow directory.

Kind Regards,
Alexey


rasmi commented Sep 5, 2016

@davidzchen At bazel-tensorflow/external/local_config_cuda/crosstool, I'm getting a broken symlink to

~/.cache/bazel/_bazel_user/d217f35631206796f447d50c6f1d6243/external/local_config_cuda/crosstool

Maybe worth noting that there does exist a cuda directory at

~/.cache/bazel/_bazel_/d217f35631206796f447d50c6f1d6243/external/local_config_cuda/cuda

And so the symlink to it at bazel-tensorflow/external/local_config_cuda/cuda is valid.

davidzchen commented Sep 5, 2016

@asimonov @rasmi - Do the contents of your local_config_cuda/cuda directory look similar to those of the directory listing in @FlorinAndrei's gist?

rasmi commented Sep 5, 2016

@davidzchen -- assuming you mean @jmhodges's gist, yes. Just a bunch of cuda library files. Sure enough, running ./configure and building again produces the crosstool directory with BUILD, CROSSTOOL, and clang/bin/crosstool_wrapper_driver_is_not_gcc files. I was getting errors with crosstool_wrapper_driver_is_not_gcc earlier when using gcc 4.9.1, and now I'm getting entirely unrelated errors in highwayhash when using gcc 4.7.2. Not sure if that has anything to do with it or if that's helpful, since this error seems further upstream from these compilation errors.

Dapid commented Sep 5, 2016

On the error of '@local_config_cuda//crosstool': BUILD file not found:

$ bazel info output_base
Warning: ignoring LD_PRELOAD in environment.
.
/home/david/.cache/bazel/_bazel_david/47d00ffdd2fc0515138a34f138cebd63
$ ls -l $(bazel info output_base)/external/local_config_cuda/crosstool
Warning: ignoring LD_PRELOAD in environment.
total 20
-rwxrwxr-x. 1 david david  925 Sep  1 13:38 BUILD
drwxrwxr-x. 3 david david 4096 Sep  1 13:38 clang
-rwxrwxr-x. 1 david david 8870 Sep  1 13:38 CROSSTOOL

Note that this only happens when I set a non-default GCC, but since the default is incompatible with CUDA I can only build for CPU.

tornadomeet commented Sep 6, 2016

Same here: when I run ls -l $(bazel info output_base)/external/local_config_cuda, there is only a cuda directory, no crosstool dir.

tornadomeet commented Sep 6, 2016

I solved this by carefully setting the cuda/cudnn/gcc versions when running ./configure.

asimonov commented Sep 6, 2016

This is what I have:

AlexPro:tensorflow alex$ bazel info output_base
/private/var/tmp/_bazel_alex/9a8b7d02e7f6ce832d52efe09806ba70

AlexPro:tensorflow alex$ ls -la /private/var/tmp/_bazel_alex/9a8b7d02e7f6ce832d52efe09806ba70/external/local_config_cuda
total 8
drwxr-xr-x 4 alex wheel 136 6 Sep 07:10 .
drwxr-xr-x 168 alex wheel 5712 6 Sep 07:10 ..
-rw-r--r-- 1 alex wheel 116 6 Sep 07:10 WORKSPACE
drwxr-xr-x 9 alex wheel 306 6 Sep 07:10 cuda


Dapid commented Sep 6, 2016

@tornadomeet can you show exactly what you used in configure?

davidzchen commented Sep 6, 2016

Odd. It appears that even though you are running the ./configure script to build with GPU support, cuda_configure seems to think that GPU support is disabled.

The way that cuda_configure determines whether GPU support is enabled is whether TF_NEED_CUDA is set to "1", which is what the ./configure script sets if you answered y to whether to build with GPU support.

If you are consistently reproducing the '@local_config_cuda//crosstool': BUILD file not found error, can you apply this patch, then run your ./configure script and paste the debug (warning) messages from cuda_configure.bzl in the output?
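The gate described here can be sketched in a few lines of shell. Only the variable name TF_NEED_CUDA comes from the configure script; cuda_enabled is a hypothetical stand-in for illustration, not actual cuda_configure code:

```shell
#!/bin/sh
# Hypothetical sketch of the TF_NEED_CUDA gate described above: the crosstool
# package is only generated when TF_NEED_CUDA is exactly "1".
cuda_enabled() {
  [ "${TF_NEED_CUDA:-0}" = "1" ]
}

if cuda_enabled; then
  echo "GPU build: would generate @local_config_cuda//crosstool"
else
  echo "CPU build: crosstool package skipped"
fi
```

If the variable is lost between running ./configure and running bazel build, the else branch is taken, which would explain the missing BUILD file.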

nsuke commented Sep 6, 2016

I've solved the same problem by putting bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package into the last line of configure.
It looks like the Bazel configuration does not complete on bazel fetch //... in some cases.
Running source configure before bazel build ... might work too.
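The source configure suggestion hinges on shell semantics: exports made inside a child process do not reach the parent shell. A toy demonstration, using /tmp/fake_configure.sh as a stand-in for the real configure script:

```shell
#!/bin/sh
# Demonstrates why sourcing matters: an export made inside a child process is
# lost when that process exits, while a sourced script modifies the current
# shell. /tmp/fake_configure.sh is a stand-in, not TensorFlow's configure.
unset TF_NEED_CUDA
echo 'export TF_NEED_CUDA=1' > /tmp/fake_configure.sh

sh /tmp/fake_configure.sh                       # runs in a child process
echo "after sh:     TF_NEED_CUDA=${TF_NEED_CUDA:-unset}"

. /tmp/fake_configure.sh                        # sourced into this shell
echo "after source: TF_NEED_CUDA=${TF_NEED_CUDA:-unset}"
```

The first echo reports unset, the second reports 1, which is why sourcing (or chaining the build into the same script) keeps the configuration visible to the build step.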

davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 11, 2016

Force clean+fetch when re-running configure with different settings.
* Run bazel clean and bazel fetch in the configure script even when building
  without GPU support to force clean+fetch if the user re-runs ./configure
  with a different setting.
* Print a more actionable error message if the user attempts to build with
  --config=cuda but did not configure TensorFlow to build with GPU support.
* Update the BUILD file in @local_config_cuda to use repository-local labels.

Fixes #4105
anirudh2290 commented Sep 12, 2016

Hi,

I am getting a similar error when running ./configure. "ERROR: The specified --crosstool_top '@local_config_cuda//crosstool:CROSSTOOL' is not a valid cc_toolchain_suite rule."
My configuration :
export TF_NEED_GCP=0
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=7.5
export TF_CUDNN_VERSION=5

Is this related to this issue ? I switched to r0.10 branch but still able to reproduce the issue.

nsuke commented Sep 12, 2016

@anirudh2290 which version of Bazel are you using?
I've got the same error with this commit and after: bazelbuild/bazel@0d32fc8.
0.3.1 should be fine.

anirudh2290 commented Sep 12, 2016

@nsuke Thank you, that was the problem; reverting to a previous commit worked.

damienmg commented Sep 12, 2016

@anirudh2290 Yes, this is a recent change in Bazel. @davidzchen, we should no longer use a filegroup to refer to the crosstool but use cc_toolchain_suite instead.

davidzchen commented Sep 13, 2016

@damienmg Understood. I have added those changes to #4285

davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 13, 2016


davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 16, 2016

@gopi77


gopi77 commented Sep 17, 2016

I hit the problem ERROR: no such package '@local_config_cuda//crosstool' again today.
After various attempts, the steps below worked.

  1. sudo apt-get upgrade bazel

  2. ./configure (I did not repeat the earlier $ git clone https://github.com/tensorflow/tensorflow step)

  3. To build with GPU support:

    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

  4. mkdir _python_build
    cd _python_build
    ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
    ln -s ../tensorflow/tools/pip_package/* .
    python setup.py develop

  5. cd tensorflow/models/image/mnist
    python convolutional.py
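Steps 4 and 5 above can be sketched as one guarded script (a sketch that assumes the bazel build in step 3 already succeeded inside a tensorflow checkout; the directory test makes it a no-op anywhere else):

```shell
# Dev-install TensorFlow by symlinking the bazel build tree into
# _python_build, then registering it with setup.py develop.
if [ -d bazel-bin/tensorflow/tools/pip_package ]; then
  mkdir -p _python_build
  cd _python_build
  # -f replaces stale symlinks left over from an earlier run
  ln -sf ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
  ln -sf ../tensorflow/tools/pip_package/* .
  python setup.py develop
fi
```

Symlinking instead of copying means a rebuilt bazel-bin is picked up without re-running the install step.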

@suiyuan2009

Contributor

suiyuan2009 commented Sep 18, 2016

I'm hitting the same issue.

davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 19, 2016

@obo


obo commented Sep 19, 2016

Hi. I'm getting "ERROR: no such package '@local_config_cuda//crosstool': BUILD file not found on package path." as well.

The issue happens deterministically for me if I run TensorFlow's ./configure while trying to avoid the interactive questions:

Set vars to avoid the interactive prompts:

export PYTHON_BIN_PATH=/usr/bin/python

No way to confirm the following default value for ./util/python/python_config.sh without actually hitting the Return key :-((

/usr/lib/python3/dist-packages

export TF_NEED_GCP=n
export TF_NEED_CUDA=y
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export TF_CUDA_VERSION=7.5
export CUDA_TOOLKIT_PATH=$CUDA_HOME
export TF_CUDNN_VERSION=4
export CUDNN_INSTALL_PATH=$CUDA_HOME
export TF_CUDA_COMPUTE_CAPABILITIES=3.0

If I run ./configure without setting the above variables, i.e.:

ubuntu@aws17:~/tensorflow$ ./configure
~/tensorflow ~/tensorflow
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python3.5/dist-packages
/usr/lib/python3/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]

/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
libcudnn.so resolves to libcudnn.4
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.

then the compilation works.

Note that there is probably no way to pass blank values (indicating "use the default", as opposed to unset values indicating "I have not answered yet") for several of the variables. So ./configure in interactive mode leaves various values blank, while the less interactive ./configure has them filled in.
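For scripting this fully, one workaround (a sketch, not verified against every TensorFlow version; the TF_*/PYTHON_* names are the ones quoted in the comment above) is to pre-seed the answers that matter and let `yes ""` press Return at every remaining prompt so the script's own defaults are accepted:

```shell
# Pre-seed the values we care about (names as used in the comment
# above; adjust to your checkout).
export PYTHON_BIN_PATH=/usr/bin/python
export TF_NEED_GCP=n
export TF_NEED_CUDA=y
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export TF_CUDA_VERSION=7.5
export TF_CUDNN_VERSION=4
export TF_CUDA_COMPUTE_CAPABILITIES=3.0

# `yes ""` emits an endless stream of empty lines, i.e. one Return
# keystroke per prompt configure still asks. Guarded so this snippet
# is a no-op outside a tensorflow checkout.
if [ -x ./configure ]; then
  yes "" | ./configure
fi
```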

davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 19, 2016


davidzchen added a commit to davidzchen/tensorflow that referenced this issue Sep 21, 2016


@martinwicke martinwicke closed this in #4285 Sep 21, 2016

martinwicke added a commit that referenced this issue Sep 21, 2016

Force clean+fetch when re-running configure with different settings. (#4285)

* Run bazel clean and bazel fetch in the configure script even when building
  without GPU support to force clean+fetch if the user re-runs ./configure
  with a different setting.
* Print a more actionable error message if the user attempts to build with
  --config=cuda but did not configure TensorFlow to build with GPU support.
* Update the BUILD file in @local_config_cuda to use repository-local labels.

Fixes #4105
@kamal94


kamal94 commented Sep 24, 2016

@martinwicke A similar error now appears during the building process:

/home/kamal/.cache/bazel/_bazel_kamal/f9ae4eca457b390bb2ebe780caca64e0/external/protobuf/BUILD:333:1: Linking of rule '@protobuf//:protoc' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command 
  (cd /home/kamal/.cache/bazel/_bazel_kamal/f9ae4eca457b390bb2ebe780caca64e0/execroot/tensorflow && \
  exec env - \
  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/host/bin/external/protobuf/protoc bazel-out/host/bin/external/protobuf/_objs/protoc/external/protobuf/src/google/protobuf/compiler/main.o bazel-out/host/bin/external/protobuf/libprotoc_lib.a bazel-out/host/bin/external/protobuf/libprotobuf.a bazel-out/host/bin/external/protobuf/libprotobuf_lite.a -lpthread -lstdc++ -B/usr/bin/ -pie -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S -Wl,--gc-sections): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/host/bin/external/protobuf/_objs/protoc/external/protobuf/src/google/protobuf/compiler/main.o: In function `main':
main.cc:(.text.startup.main+0x2ad): undefined reference to `vtable for google::protobuf::compiler::php::Generator'
main.cc:(.text.startup.main+0x5fc): undefined reference to `vtable for google::protobuf::compiler::php::Generator'
main.cc:(.text.startup.main+0x707): undefined reference to `vtable for google::protobuf::compiler::php::Generator'
collect2: error: ld returned 1 exit status
Target //tensorflow/cc:tutorials_example_trainer failed to build

I think this might be related to 4316aeb

@sskgit


sskgit commented Oct 2, 2016

Hi,

I'm facing similar issues with the TensorFlow build.
I am building TensorFlow from source and get build errors:
a) C++ compilation of rule '@grpc//:gpr' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc

b) ERROR: I/O error while writing action log: No space left on device.

Environment:

Cuda 8.0
CuDNN 5
Ubuntu 16.04
bazel 0.31
Nvidia K80
Azure VM N6 (56 GB Memory)

I've tried ./configure and the build several times; configure succeeds but the build fails.

Also tried bazel clean and bazel clean --expunge, and ran the build with a reduced number of jobs:
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

but the error continues.

Looked at this thread and also

#190

Here is the full error message:

52adb8ea4f53b1b72067611e8a7eb020/external/grpc/BUILD:69:1: C++ compilation of rule '@grpc//:gpr' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter ... (remaining 38 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
ERROR: I/O error while writing action log: No space left on device.
java.util.logging.ErrorManager: 2
java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.util.logging.FileHandler$MeteredStream.flush(FileHandler.java:196)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:297)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at java.util.logging.StreamHandler.flush(StreamHandler.java:259)
at java.util.logging.FileHandler.publish(FileHandler.java:683)
at java.util.logging.Logger.log(Logger.java:738)
at java.util.logging.Logger.doLog(Logger.java:765)
at java.util.logging.Logger.log(Logger.java:788)
at java.util.logging.Logger.info(Logger.java:1489)
at com.google.devtools.build.lib.profiler.AutoProfiler$LoggingElapsedTimeReceiver.accept(AutoProfiler.java:315)
at com.google.devtools.build.lib.profiler.AutoProfiler$SequencedElapsedTimeReceiver.accept(AutoProfiler.java:262)
at com.google.devtools.build.lib.profiler.AutoProfiler.completeAndGetElapsedTimeNanos(AutoProfiler.java:226)
at com.google.devtools.build.lib.buildtool.ExecutionTool.saveCaches(ExecutionTool.java:725)
at com.google.devtools.build.lib.buildtool.ExecutionTool.executeBuild(ExecutionTool.java:470)
at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:201)
at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:333)
at com.google.devtools.build.lib.runtime.commands.BuildCommand.exec(BuildCommand.java:69)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:488)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:324)
at com.google.devtools.build.lib.runtime.CommandExecutor.exec(CommandExecutor.java:49)
at com.google.devtools.build.lib.server.RPCService.executeRequest(RPCService.java:70)

@darrengarvey

Contributor

darrengarvey commented Oct 2, 2016

@sskgit - It looks like (b) no space left on device is the reason that (a) compilation of a file fails. You'll notice the bazel-* symlinks in the tensorflow directory point to $HOME/.cache/bazel/..., so check that you've got enough space on the partition $HOME is mounted on. You'll need about 10GB, possibly more.
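A quick way to check both things mentioned above (the partition $HOME lives on, and what the bazel cache already occupies) might be:

```shell
# Free space on the filesystem that holds $HOME, where bazel writes
# its build outputs under ~/.cache/bazel by default.
df -h "$HOME"

# Size of the bazel cache itself, if one exists yet.
if [ -d "$HOME/.cache/bazel" ]; then
  du -sh "$HOME/.cache/bazel"
fi
```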

@sskgit


sskgit commented Oct 3, 2016

@darrengarvey Thanks for your response.

I tried sudo df and here is the output

df
Filesystem 1K-blocks Used Available Use% Mounted on
udev 28837212 0 28837212 0% /dev
tmpfs 5771196 9244 5761952 1% /run
/dev/sda1 29711408 29329736 365288 99% /
tmpfs 28855968 0 28855968 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 28855968 0 28855968 0% /sys/fs/cgroup
none 64 0 64 0% /etc/network/interfaces.dynamic.d
/dev/sdb1 356513788 135524 356378264 1% /mnt
tmpfs 5771196 0 5771196 0% /run/user/1000

It shows that the / mount has almost no space, and I think $VM/Username (i.e. $HOME) is on that same / mount, since no separate $HOME filesystem appears in the output. Is this the right command?

At / (used space)

du -sch
2.3G .
2.3G total

At $HOME (used space)

du -sch
4.4G .
4.4G total

So, I am not sure what is occupying the remaining space? Total storage on this VM is 380 GB.

Is there any way to get rid of bazel logs? Is it causing the space issue? If so, where?
Also, if it is how do I free up some space?

Thanks
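To answer "what is occupying the remaining space", a depth-limited du that stays on one filesystem usually works; the sketch below runs against $HOME, and running it against / with sudo would cover the system directories too:

```shell
# Per-directory usage, one level deep, staying on one filesystem (-x)
# so /mnt, /dev, and the tmpfs mounts are skipped; biggest entries
# sort to the bottom.
du -xh --max-depth=1 "$HOME" 2>/dev/null | sort -h
```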

@davidzchen

Member

davidzchen commented Oct 3, 2016

@sskgit Running bazel clean --expunge should remove all of the generated files for the workspace. How much free space do you have after running that command?

@martinwicke Sorry for the late reply. I have been on call for a good part of the past week. That looks like a linker error in protobuf and does not seem related to this particular change. crosstool_wrapper_driver_is_not_gcc is a wrapper script that calls gcc. Is this on a newer version of protobuf?
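If / itself simply stays too small, another option (a sketch; --output_user_root is a Bazel startup option, and the /mnt/bazel-cache path here is only an example) is to point Bazel's output tree at the large /dev/sdb1 mount from the df output above instead of the default ~/.cache/bazel:

```
# As a startup option, passed before the command:
bazel --output_user_root=/mnt/bazel-cache build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

# Or persistently, as a line in ~/.bazelrc:
startup --output_user_root=/mnt/bazel-cache
```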

@sskgit


sskgit commented Oct 3, 2016

@davidzchen Thanks for your response.

I used bazel clean --expunge (this freed 500 MB of space) and re-configured TensorFlow using ./configure.

Then I ran the bazel build again:

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

ERROR: /$HOME/Downloads/tensorflow/tensorflow/core/kernels/BUILD:1710:1: error while parsing .d file: /$HOME/.cache/bazel/_bazel_gpuadmin/52adb8ea4f53b1b72067611e8a7eb020/execroot/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/depth_space_ops_gpu/tensorflow/core/kernels/depthtospace_op_gpu.cu.pic.d (No such file or directory).
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
: fatal error: when writing output to : No space left on device
compilation terminated.
Target //tensorflow/tools/pip_package:build_pip_package failed to build

At $HOME:
du -sch
4.6G .
4.6G total

At /:
du -sch
2.3G .
2.3G total

df -h
Filesystem Size Used Avail Use% Mounted on
udev 28G 0 28G 0% /dev
tmpfs 5.6G 9.1M 5.5G 1% /run
/dev/sda1 29G 29G 9.8M 100% /
tmpfs 28G 0 28G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 28G 0 28G 0% /sys/fs/cgroup
none 64K 0 64K 0% /etc/network/interfaces.dynamic.d
/dev/sdb1 340G 133M 340G 1% /mnt
tmpfs 5.6G 0 5.6G 0% /run/user/1000

This is a fairly new machine with little installed software (<200 MB).

@davidzchen

Member

davidzchen commented Oct 3, 2016

@sskgit Interesting. Can you open a bug at https://github.com/bazelbuild/bazel for this issue? Thanks.

@sskgit

sskgit commented Oct 3, 2016

@davidzchen Opened an issue with bazel as well. I have tried the build almost 10 times in the last couple of days, and am still trying to figure out why it fails and how to complete it successfully.

npanpaliya added a commit to ibmsoe/tensorflow that referenced this issue Oct 12, 2016

@sskgit

sskgit commented Oct 14, 2016

A disk space issue caused my TensorFlow build to fail. Clearing some space on the / mount made the build succeed, and TensorFlow now works as expected.

Thanks everyone for your help!

@TensorTom

TensorTom commented Jul 29, 2017

This is happening for me too. I already have TF installed and working for GPU via the runfile but I wanted to compile it for optimizations. I get:

ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': error loading package 'tensorflow/tools/pip_package': Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
                _create_local_cuda_repository(repository_ctx)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
                _host_compiler_includes(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
                get_cxx_inc_directories(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
                set(includes_cpp)
depsets cannot contain mutable items
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//crosstool': Traceback (most recent call last):
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
                _create_local_cuda_repository(repository_ctx)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
                _host_compiler_includes(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
                get_cxx_inc_directories(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
                set(includes_cpp)
depsets cannot contain mutable items
INFO: Elapsed time: 4.869s
FAILED: Build did NOT complete successfully (3 packages loaded)
    currently loading: tensorflow/tools/pip_package
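For context, the "depsets cannot contain mutable items" failure is typically a Bazel-version mismatch: newer Bazel releases dropped the mutable set() that this checkout's cuda_configure.bzl still uses, so checking the installed version is a reasonable first step (a sketch, guarded in case bazel is not on PATH):

```shell
# Report the installed Bazel version so it can be compared against
# the versions this tensorflow checkout supports.
if command -v bazel >/dev/null 2>&1; then
  bazel version | head -n 1
else
  echo "bazel not found on PATH"
fi
```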

Steps to reproduce:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
$ ./configure
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Please specify the location of python. [Default is /home/user/.pyenv/versions/3.6.2/bin/python]: 
Found possible Python library paths:
/home/user/.pyenv/versions/3.6.2/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is /home/user/.pyenv/versions/3.6.2/lib/python3.6/site-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]: n
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-8.0
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-8.0]:/usr/lib/x86_64-linux-gnu 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
$ bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package

If you notice that I've done something wrong, please let me know. I saw someone before mention something about LD_LIBRARY_PATH which I have set to /usr/local/cuda-8.0/lib64. I figured it was fine since as I said, I already have TF installed and running from the runfile download. I would like to be able to compile from source though.

TensorTom commented Jul 29, 2017

This is happening for me too. I already have TF installed and working for GPU via the runfile but I wanted to compile it for optimizations. I get:

ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': error loading package 'tensorflow/tools/pip_package': Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
                _create_local_cuda_repository(repository_ctx)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
                _host_compiler_includes(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
                get_cxx_inc_directories(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
                set(includes_cpp)
depsets cannot contain mutable items
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//crosstool': Traceback (most recent call last):
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
                _create_local_cuda_repository(repository_ctx)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
                _host_compiler_includes(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
                get_cxx_inc_directories(repository_ctx, cc)
        File "/home/user/bin/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
                set(includes_cpp)
depsets cannot contain mutable items
INFO: Elapsed time: 4.869s
FAILED: Build did NOT complete successfully (3 packages loaded)
    currently loading: tensorflow/tools/pip_package

Steps to reproduce:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
$ ./configure
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Please specify the location of python. [Default is /home/user/.pyenv/versions/3.6.2/bin/python]: 
Found possible Python library paths:
/home/user/.pyenv/versions/3.6.2/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is /home/user/.pyenv/versions/3.6.2/lib/python3.6/site-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]: n
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-8.0
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-8.0]:/usr/lib/x86_64-linux-gnu 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
$ bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package

If you notice that I've done something wrong, please let me know. I saw someone earlier mention LD_LIBRARY_PATH, which I have set to /usr/local/cuda-8.0/lib64. I figured it was fine since, as I said, I already have TF installed and running from the runfile download. I would like to be able to compile from source, though.
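For anyone puzzled by the "depsets cannot contain mutable items" message: Bazel's depsets (including the deprecated `set()` constructor that `cuda_configure.bzl` still calls at line 120) require immutable contents, and Bazel 0.5.3 started enforcing this. A rough plain-Python analogy (this is an illustration, not Bazel's actual Starlark API) is `frozenset`, which accepts immutable members such as strings but rejects mutable ones such as lists:

```python
# Plain-Python analogy (not Bazel's Starlark API): like a Bazel depset,
# a frozenset rejects mutable (unhashable) members.
def try_freeze(items):
    try:
        return frozenset(items)
    except TypeError:
        # Mutable member; analogous to Bazel's
        # "depsets cannot contain mutable items" error.
        return None

print(try_freeze(["-I/usr/include", "-I/usr/local/include"]))  # strings: OK
print(try_freeze([["-I/usr/include"]]))  # a list member is mutable: None
```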

itssujeeth commented Aug 1, 2017

I'm also facing the same challenge: unable to build TensorFlow on a GPU server. Details are given below. OS is Ubuntu 16.04 LTS.

user@gpu-devbox:~/Workouts/tensorflow$ python --version
Python 2.7.12
user@gpu-devbox:~/Workouts/tensorflow$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1-16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
user@gpu-devbox:~/Workouts/tensorflow$ bazel version
Build label: 0.5.3
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jul 28 08:34:59 2017 (1501230899)
Build timestamp: 1501230899
Build timestamp as int: 1501230899
user@gpu-devbox:~/Workouts/tensorflow$ ./configure 
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Please specify the location of python. [Default is /usr/bin/python]: 
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: 
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: 
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]: 
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: 
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: 
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]: 
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]6.1
Do you want to use clang as CUDA compiler? [y/N]: 
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Do you wish to build TensorFlow with MPI support? [y/N]: 
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
user@gpu-devbox:~/Workouts/tensorflow$ echo $CUDA_HOME
/usr/local/cuda-8.0

user@gpu-devbox:~/Workouts/tensorflow$ echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64
user@gpu-devbox:~/Workouts/tensorflow$ bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" ./tensorflow/tools/pip_package:build_pip_package 
.......
ERROR: no such package '@local_config_cuda//crosstool': Traceback (most recent call last):
    File "/home/u19061/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
        _create_local_cuda_repository(repository_ctx)
    File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
        _host_compiler_includes(repository_ctx, cc)
    File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
        get_cxx_inc_directories(repository_ctx, cc)
    File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
        set(includes_cpp)
depsets cannot contain mutable items
INFO: Elapsed time: 5.488s
FAILED: Build did NOT complete successfully (3 packages loaded)
itssujeeth commented Aug 1, 2017

Also, when I checked, the cache doesn't contain the crosstool package contents (only a BUILD file). Am I missing something here?

user@devbox:~/Workouts/tensorflow$ ls -l $(bazel info output_base)/external/local_config_cuda/crosstool
total 4
-rwxrwxr-x 1 user user 1267 Aug  1 16:32 BUILD
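A quick way to see which generated files are present versus missing in the cached repository. The demo directory and the expected file names below are illustrative assumptions; substitute `$(bazel info output_base)/external/local_config_cuda` for the stand-in path on a real checkout:

```shell
# Illustrative sketch: check a generated-repository directory for the
# files the CUDA crosstool normally provides. The demo directory stands
# in for "$(bazel info output_base)/external/local_config_cuda".
repo="${REPO_DIR:-$(mktemp -d)/local_config_cuda}"
mkdir -p "$repo/crosstool"
touch "$repo/crosstool/BUILD"   # mimic the state observed above

for f in crosstool/BUILD crosstool/CROSSTOOL; do
  if [ -e "$repo/$f" ]; then
    echo "present: $f"
  else
    echo "missing: $f"
  fi
done
```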
sh1r0 commented Aug 3, 2017

@itssujeeth #11949 fixes the issue when building TensorFlow with GPU support using Bazel 0.5.3.
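As a sketch of the kind of change that sidesteps the error (an assumption about the approach, not the literal diff in #11949): instead of wrapping the include directories in the deprecated `set()` constructor, the configure logic can return an order-preserving deduplicated list, which places no immutability demand on its members:

```python
# Sketch (an assumption about the approach, not the literal patch in
# PR #11949): deduplicate the C++ include directories while preserving
# order, instead of building a set()/depset from them.
def uniq(iterable):
    seen = []
    for item in iterable:
        if item not in seen:
            seen.append(item)
    return seen

includes_cpp = ["/usr/include/c++/5", "/usr/include", "/usr/include"]
print(uniq(includes_cpp))  # -> ['/usr/include/c++/5', '/usr/include']
```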
