Library not loaded: @rpath/libcudart.7.5.dylib #4187

Closed
davidzchen opened this Issue Sep 3, 2016 · 28 comments

Member

davidzchen commented Sep 3, 2016

+@trevorwelch

From the discussion in #4105 and #4145, this is the tracking bug for the following error:

ERROR: /Users/production204/Github/tensorflow/tensorflow/cc/BUILD:179:1: Executing genrule //tensorflow/cc:training_ops_genrule failed: bash failed: error executing command 
  (cd /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow && \
  exec env - \
    PATH=/usr/local/cuda/bin:/Library/Frameworks/Python.framework/Versions/2.7/bin:/usr/local/bin:usr/local/sbin:/usr/local/mysql/bin:/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin \
    TMPDIR=/var/folders/h3/pn9k79xn6qd9jgksqbkpn3l80000gn/T/ \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.cc 0'): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 5.
dyld: Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow/bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc
  Reason: image not found
/bin/bash: line 1: 74845 Trace/BPT trap: 5       bazel-out/host/bin/tensorflow/cc/ops/training_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/training_ops.cc 0
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 3111.405s, Critical Path: 3097.65s

production204@Trevors-MacBook-Pro tensorflow $ 

JimmyKon Sep 5, 2016

@davidzchen

Your error is exactly like mine, and I have fixed it.
My computer:
Early 2012 iMac
CPU: i5
Graphics card: GT640M

Xcode 7.3
CUDA 7.5.27
CUDNN 4

Here is my solution:
https://github.com/JimmyKon/tensorflow_build_issue_fix/tree/master

I found that genrule-setup.sh is executed right before the error occurs:

...execroot/tensorflow/external/bazel_tools/tools/genrule/genrule-setup.sh

First, print the file's timestamps:

stat genrule-setup.sh

The output looks like this:

16777217 56288053 -rwxr-xr-x 1 ****** wheel 0 242 "Sep  4 23:26:23 2016" "Sep  2 22:34:23 2026" "Sep  4 22:34:24 2016" "Sep  4 22:34:21 2016" 4096 8 0 genrule-setup.sh

Note the second timestamp, the modification time: "Sep 2 22:34:23 2026". Yes, 2026. Record this timestamp.

Open the file and add this environment setting at the end:

export DYLD_LIBRARY_PATH=/usr/local/cuda/lib

Then restore the original timestamp of genrule-setup.sh:

touch -t YYYYMMDDhhmm.SS genrule-setup.sh

YYYYMMDDhhmm.SS is the timestamp you recorded; in my case it is 202609022234.23.

Compile again, and you are done:

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
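
The manual steps above (record the timestamp, edit the file, restore the timestamp) could be sketched as a small shell helper. This is only an illustration and not part of the original workaround: append_preserving_mtime is a hypothetical name, and it relies on touch -r with a scratch reference file instead of typing the timestamp by hand.

```shell
#!/bin/sh
# Hypothetical helper (not from Bazel or TensorFlow): append one line to a file,
# then restore the file's original access and modification times.
append_preserving_mtime() {
    file=$1
    line=$2
    ref=$(mktemp) || return 1
    # Copy the file's current timestamps onto a scratch reference file.
    if ! touch -r "$file" "$ref"; then
        rm -f "$ref"
        return 1
    fi
    printf '%s\n' "$line" >> "$file"
    # Put the recorded timestamps back on the edited file.
    touch -r "$ref" "$file"
    status=$?
    rm -f "$ref"
    return "$status"
}
```

From the directory that contains genrule-setup.sh, it might be called as: append_preserving_mtime genrule-setup.sh 'export DYLD_LIBRARY_PATH=/usr/local/cuda/lib'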

tatatodd Sep 7, 2016

Member

Thanks for filing this bug, David! I'm triaging issues and it seems you might be the most appropriate owner; if not just un-assign yourself and I'll re-triage it.

trevorwelch Sep 13, 2016

Thanks for your help on this one. I'm still getting the same error on branch r0.10, although on master I'm getting #4105.

Following @JimmyKon's instructions on branch r0.10, including adding DYLD_LIBRARY_PATH to genrule-setup.sh:

$ stat genrule-setup.sh
16777220 36599236 -rwxr-xr-x 1 production204 staff 0 242 "Sep 12 19:37:15 2016" "Aug 19 16:39:34 2016" "Aug 29 16:10:10 2016" "Aug 19 16:39:34 2016" 4096 8 0 genrule-setup.sh

$ touch -t 201609121937.15 genrule-setup.sh

Upon compiling I get:

ERROR: /Users/production204/Github/tensorflow/tensorflow/cc/BUILD:116:1: Executing genrule //tensorflow/cc:parsing_ops_genrule failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 5.
dyld: Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow/bazel-out/host/bin/tensorflow/cc/ops/parsing_ops_gen_cc
  Reason: image not found
/bin/bash: line 1: 73929 Trace/BPT trap: 5       bazel-out/host/bin/tensorflow/cc/ops/parsing_ops_gen_cc bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/parsing_ops.h bazel-out/local_darwin-opt/genfiles/tensorflow/cc/ops/parsing_ops.cc 0
Target //tensorflow/cc:tutorials_example_trainer failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 2525.303s, Critical Path: 1420.09s

I've confirmed the library is where it needs to be. I've even tried pointing directly at it with export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-7.5/lib and repeating the rest of the instructions above, to no avail.

JimmyKon Sep 13, 2016

@trevorwelch

Do you still get the error after following my instructions?
Did you miss a step like the ones below?

$ sudo mv include/cudnn.h /Developer/NVIDIA/CUDA-7.5/include/
$ sudo mv lib/libcudnn* /Developer/NVIDIA/CUDA-7.5/lib
$ sudo ln -s /Developer/NVIDIA/CUDA-7.5/lib/libcudnn* /usr/local/cuda/lib/

BTW, if you get this error:

Segmentation fault: 11

please try this command:

ln -sf /usr/local/cuda/lib/libcuda.dylib /usr/local/cuda/lib/libcuda.1.dylib

trevorwelch Sep 13, 2016

@JimmyKon Thanks for engaging with this. Yes (truncated output is indicated by ...):

$ cd Developer/NVIDIA/CUDA-7.5/include/

$ ls -l
...
-r--r--r--@  1 production204  staff     99657 Jun  9 10:49 cudnn.h
...
CUDA-7.5$ cd lib

lib$ ls -l

...
-rwxr-xr-x@ 1 production204  wheel   60108616 Feb  8  2016 libcudnn.4.dylib
-rwxr-xr-x@ 1 production204  staff   58975112 Jun 10 03:30 libcudnn.5.dylib
lrwxr-xr-x@ 1 production204  staff         16 Jun 10 03:31 libcudnn.dylib -> libcudnn.5.dylib
-rw-r--r--@ 1 production204  staff   56392320 Jun 10 03:30 libcudnn_static.a
...

I get a slightly different error after re-running with the pip package target instead of the tutorial target, but the same library is missing:

ERROR: /Users/production204/Github/tensorflow/tensorflow/contrib/layers/BUILD:30:1: Executing genrule //tensorflow/contrib/layers:bucketization_op_pygenrule failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 5.
dyld: Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /private/var/tmp/_bazel_production204/ed2bbf43bcd665c40f1e3ebaa04f68f6/execroot/tensorflow/bazel-out/host/bin/tensorflow/contrib/layers/gen_bucketization_op_py_wrappers_cc
  Reason: image not found
/bin/bash: line 1:  5368 Trace/BPT trap: 5       bazel-out/host/bin/tensorflow/contrib/layers/gen_bucketization_op_py_wrappers_cc 0 > bazel-out/local_darwin-opt/genfiles/tensorflow/contrib/layers/ops/gen_bucketization_op.py
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 1748.825s, Critical Path: 1673.19s

JimmyKon Sep 13, 2016

@trevorwelch you are welcome :)

You can use --verbose_failures to see the error details. Please let me know what it shows.

davidzchen Sep 13, 2016

Member

@JimmyKon Thanks for looking into this and for providing the workaround!

It would be good to figure out why the Bazel build is not providing the correct libcudart library since it should have been linked from @local_config_cuda//cuda:cudart.

@JimmyKon @trevorwelch When you ran your configure script, did you explicitly provide a CUDA version?

JimmyKon Sep 13, 2016

@davidzchen

When you ran your configure script, did you explicitly provide a CUDA version?

I did.

I resolved the problem by adding the library path to genrule-setup.sh.

JimmyKon Sep 13, 2016

@davidzchen

Maybe my solution is a bit of a trick, but it works. :)

trevorwelch Sep 13, 2016

@davidzchen

When you ran your configure script, did you explicitly provide a CUDA version?

Yes, I've tried both ways (system default, or explicitly pointing).

I imagine that in future versions of Bazel I won't have this issue, but Bazel seems not to work in general on my particular system configuration where CUDA is involved. For example, on tensorflow/magenta, all of my Bazel tests fail, but magenta runs fine on the GPU.

$ bazel run //magenta/models/lookback_rnn:lookback_rnn_generate -- \
> --run_dir=/tmp/lookback_rnn/logdir/run1 \
> --hparams="{'batch_size':64,'rnn_layer_sizes':[64,64]}" \
> --output_dir=/tmp/lookback_rnn/generated \
> --num_outputs=10 \
> --num_steps=128 \
> --primer_melody="[60]"
INFO: Found 1 target...
Target //magenta/models/lookback_rnn:lookback_rnn_generate up-to-date:
  bazel-bin/magenta/models/lookback_rnn/lookback_rnn_generate
INFO: Elapsed time: 0.460s, Critical Path: 0.00s

INFO: Running command line: bazel-bin/magenta/models/lookback_rnn/lookback_rnn_generate '--run_dir=/tmp/lookback_rnn/logdir/run1' '--hparams={'\''batch_size'\'':64,'\''rnn_layer_sizes'\'':[64,64]}' '--output_dir=/tmp/lookback_rnn/generated' '--num_outputs=10' '--num_steps=128' '--primer_melody=[60]'
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.1.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.dylib locally
INFO:tensorflow:hparams = {'rnn_layer_sizes': [64, 64], 'temperature': 1.0, 'decay_rate': 0.95, 'dropout_keep_prob': 1.0, 'batch_size': 1, 'decay_steps': 1000, 'clip_norm': 5, 'initial_learning_rate': 0.01, 'skip_first_n_losses': 0}
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x1274a7a50>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x127126b50>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] OS X does not support NUMA - returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GT 650M
major: 3 minor: 0 memoryClockRate (GHz) 0.9
pciBusID 0000:01:00.0
Total memory: 1023.69MiB
Free memory: 151.57MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0)
INFO:tensorflow:Checkpoint used: /tmp/lookback_rnn/logdir/run1/train/model.ckpt-111
INFO:tensorflow:Wrote 10 MIDI files to /tmp/lookback_rnn/generated

Anyway, in the meantime I have TF running on the GPU, and I don't need to build from source for any particular project at the moment.

tcfuji Sep 17, 2016

Contributor

Getting the same error while installing version 0.10.0 from source. Not getting the error with version 0.10.0rc0 though.

thefonseca Sep 24, 2016

@JimmyKon's fix worked perfectly in my case (master branch).

My specs:

  • Clang: 7.3 build 703
  • macOS: 10.11.6-x86_64
  • CUDA V7.5.26
  • CUDNN 5.1
  • GPU: GT750M

cwindolf Oct 2, 2016

@JimmyKon's fixes worked for me. I had to modify the genrule script and add the symlink as in #3263. Setup: r0.10 on OS X 10.10.5 with CUDA 7.5, cuDNN 5, and a GT 650M.

virajago Oct 12, 2016

@JimmyKon's fix worked for me.

Config
OSX 10.12
CUDA 8
XCode 7.3
r0.11

BKJackson Oct 14, 2016

I couldn't find a "genrule-setup.sh" anywhere on my Mac.

I've been going in circles--trying brew/pip/conda, 2.7/3.5, trying to get TensorFlow w/NVIDIA GPU support working. It seems that Mac OS X is simply poorly supported by TensorFlow and/or NVIDIA.

virajago Oct 14, 2016

@BKJackson Can you send the versions of OS X, CUDA, Xcode, TensorFlow, etc.?

genrule-setup.sh is located in a tmp directory. There is a symbolic link called bazel-tensorflow in your tensorflow folder which points to that tmp directory.

So try this: tensorflow-dir/bazel-tensorflow/external/bazel_tools/tools/genrule/genrule-setup.sh
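
The lookup described above could be sketched as a shell helper. It is an illustration only: find_genrule_setup is a hypothetical name, and it assumes the convenience symlink is named bazel-<workspace directory name> (bazel-tensorflow for a checkout directory called tensorflow). That symlink is created by a build, so it may be absent in a checkout that has never been built.

```shell
#!/bin/sh
# Hypothetical helper: print the path of genrule-setup.sh for a Bazel workspace
# by going through the bazel-<workspace directory name> symlink.
find_genrule_setup() {
    workspace=${1%/}
    name=$(basename "$workspace")
    candidate=$workspace/bazel-$name/external/bazel_tools/tools/genrule/genrule-setup.sh
    if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        # Report the path that was tried so the user can inspect it by hand.
        echo "not found: $candidate" >&2
        return 1
    fi
}
```

For example: find_genrule_setup ~/Github/tensorflow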

gongwr Oct 14, 2016

I tried @JimmyKon's solution and it worked for me. Thanks.

OS X 10.11,
CUDA 7.5.27 CUDNN 5.1,
Xcode 7.3.1
tensorflow-0.11.0rc0-py2-none-any.whl
anaconda/python2.7
gt 750m
didn't specify cuda/cudnn versions in ./configure

BKJackson Oct 14, 2016

OSX: 10.11.6 (El Capitan)
GPU: NVIDIA GeForce GT 750M 2048 MB (0.5 in the NVIDIA/CUDA number system)
Xcode: 7.3.1
CUDA 7.5.27 CUDNN 5.1
Python default environment: 3.5.0rc4 | Anaconda custom (x86_64)
GCC: 4.2.1 (Apple build 5577) on darwin
TensorFlow: Installed in Conda Python 2.7 environment ("py27"): 0.11.0rc0 installed with pip (& protobuf 3.0.0 also installed with pip).

When I try "import tensorflow as tf" in my "py27" Anaconda environment, I get the error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: dlopen(/Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so, 10): no suitable image found.  Did find:
    /Applications/anaconda/envs/py27/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x03

It looks like it's complaining about swig (?), which I installed with brew. I'm wondering if it's a brew/anaconda problem, because I don't see swig in the "conda list". Can Anaconda's python environment see and use libraries downloaded with brew?
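
A note on the ImportError above: the first four bytes it reports, 0x7F 0x45 0x4C 0x46, are the ELF magic number (0x7F followed by "ELF"). That suggests the installed _pywrap_tensorflow.so was built for Linux rather than macOS, which expects Mach-O files, so SWIG may not be the culprit. A minimal sketch for checking a file's format follows; binary_format is a hypothetical helper name, and it only recognizes the few magic numbers listed in its comments.

```shell
#!/bin/sh
# Hypothetical helper: classify a binary by its first four bytes.
binary_format() {
    # Dump the first four bytes as hex and strip all whitespace, e.g. 7f454c46.
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    case $magic in
        7f454c46) echo "ELF" ;;               # 0x7F 'E' 'L' 'F', used on Linux
        cffaedfe|cefaedfe) echo "Mach-O" ;;   # little-endian Mach-O, 64-bit or 32-bit
        cafebabe) echo "Mach-O universal" ;;  # fat binary (Java class files share this magic)
        *) echo "unknown" ;;
    esac
}
```

Running it on the .so named in the traceback would be expected to print ELF if the bytes in the error message are accurate.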

virajago Oct 17, 2016

I don't think Anaconda's Python environment sees packages installed for brew's Python. Usually the search path is set up in the bash profile, where one takes precedence over the other.

vade Nov 18, 2016

Contributor

As a data point, configuring TF 0.11.0 source from git (282823b) for GPU usage, specifying '8.0' in the configure script and using Xcode Command Line Tools 7.3.1, results in the same error:

Executing genrule //tensorflow/contrib/layers:bucketization_op_pygenrule failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 6.
dyld: Library not loaded: @rpath/libcudart.8.0.dylib

I do in fact have a libcudart.8.0.dylib in the default CUDA install location.

tolomaus Dec 11, 2016

@JimmyKon's solution worked here as well:

OS X 10.12,
CUDA 8.0 CUDNN 5.2,
Xcode Command line tools
tensorflow 0.12.0rc1
python3.5
gt 650m
didn't specify cuda/cudnn versions in ./configure

Thanks Jimmy!

gunan Dec 21, 2016

Member

I am curious, as our Mac test machines have been working fine so far.

Is it possible that the LD_LIBRARY_PATH environment variable is not set to point to CUDA when you run into the issue?

kovek Dec 24, 2016

@gunan 's comment helped me!

I've been reading about this issue and possible fixes for it for a bit now.

People have had success with

sudo ln -s /usr/local/cuda/lib/libcuda.dylib /usr/local/cuda/lib/libcuda.1.dylib

and with setting

export DYLD_LIBRARY_PATH="/Developer/NVIDIA/CUDA-8.0/lib:/usr/local/cuda/lib"

However, in my case these two did not fully solve the issue. I also had to do:

export LD_LIBRARY_PATH=/usr/local/cuda/lib

Now, it works!
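
For anyone copying these exports into a shell profile, here is a small sketch that prepends a directory only when it is not already listed, so re-sourcing the profile does not keep growing the variable. The `prepend_path` helper is hypothetical; the directory is the one quoted above.

```shell
# Hypothetical helper: prepend DIR to the colon-separated variable VAR
# unless DIR is already one of its entries.
prepend_path() {
  var_name=$1
  dir=$2
  eval "current=\${$var_name:-}"
  case ":$current:" in
    *":$dir:"*) ;;   # already present, nothing to do
    *) eval "export $var_name=\"\$dir\${current:+:\$current}\"" ;;
  esac
}

prepend_path DYLD_LIBRARY_PATH /usr/local/cuda/lib
prepend_path LD_LIBRARY_PATH /usr/local/cuda/lib
```

Run this in the same shell session that later invokes the build, since exported variables only reach child processes started afterwards.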


gunan Dec 25, 2016

Member

OK, then I will close this issue.
Please reopen if you still run into it.


sholtodouglas Jan 6, 2017

Hi, I'm running into the same issue (MBP late 2013, CUDA 8.0, cuDNN 5.1, OS X 10.12.1, Python 2.7), but I can't find the genrule folder. In fact, I can't even find tensorflow/external. Any idea why this might be?


yaroslavvb Jan 6, 2017

Contributor

The error means you have TensorFlow built for CUDA 7.5 (I'm guessing you have 0.11 or earlier)
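
To see which CUDA runtime a failing binary expects, the version can be read straight out of the dyld message. A rough sketch, assuming the message has the `libcudart.<version>.dylib` form seen in this thread (`cudart_version_from_error` is a made-up name):

```shell
# Hypothetical helper: extract the libcudart version number from a
# dyld "Library not loaded" message; prints nothing if there is no match.
cudart_version_from_error() {
  printf '%s\n' "$1" | sed -n 's/.*libcudart\.\([0-9][0-9.]*\)\.dylib.*/\1/p'
}

cudart_version_from_error 'dyld: Library not loaded: @rpath/libcudart.7.5.dylib'
# prints 7.5
```

If the printed version differs from the toolkit installed on the machine, the binary was most likely built against a different CUDA release.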


gojira Jan 14, 2017

Contributor

I just hit this error in r1.0 alpha.

dyld: Library not loaded: @rpath/libcudart.8.0.dylib
Referenced from: /private/var/tmp/_bazel_Keiji/1dc08e629317533c84e553c6b75a1110/execroot/tensorflow/bazel-out/host/bin/tensorflow/contrib/layers/gen_bucketization_op_py_wrappers_cc
Reason: image not found
/bin/bash: line 1: 83868 Abort trap: 6 bazel-out/host/bin/tensorflow/contrib/layers/gen_bucketization_op_py_wrappers_cc 0 > bazel-out/local_darwin-py3-opt/genfiles/tensorflow/contrib/layers/ops/gen_bucketization_op.py

The @JimmyKon instructions worked here as well.

macOS 10.12.2
Anaconda Python 3.5
CUDA 8.0


WellDone2094 Jan 19, 2017

The first solution worked for me, thanks.
