cuda/cuda_config.h missing when compiling custom ops with nvcc #12860

Closed
pronobis opened this issue Sep 7, 2017 · 23 comments
Labels: stat:awaiting tensorflower (awaiting response from tensorflower), type:build/install (build and install issues)

Comments

@pronobis

pronobis commented Sep 7, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
    Yes, see below.
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Ubuntu 16.04
  • TensorFlow installed from (source or binary):
    Source
  • TensorFlow version (use command below):
    v1.3.0-0-g9e76bf324 1.3.0
  • Python version:
    3.5.2
  • Bazel version (if compiling from source):
    0.5.4
  • CUDA/cuDNN version:
    8.0.44 / 5.1.5
  • GPU model and memory:
    Any.
  • Exact command to reproduce:
    See below.

Describe the problem

When compiling a custom op using nvcc, which includes tensorflow/core/util/cuda_kernel_helper.h, I get the following error:

/usr/local/cuda-8.0/bin/nvcc -c -o ~/Code/libspn/build/ops/gather_columns_functor_gpu.cu.cc.o ~/Code/libspn/libspn/ops/gather_columns_functor_gpu.cu.cc -std=c++11 -x=cu -Xcompiler -fPIC -DGOOGLE_CUDA=1 --expt-relaxed-constexpr -I ~/.local/lib/python3.5/site-packages/tensorflow/include -gencode=arch=compute_35,"code=sm_35,compute_35" -gencode=arch=compute_52,"code=sm_52,compute_52" -gencode=arch=compute_61,"code=sm_61,compute_61"
In file included from ~/.local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/platform/default/stream_executor.h:26:0,
                 from ~/.local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/platform/stream_executor.h:24,
                 from ~/.local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:26,
                 from ~/Code/libspn/libspn/ops/gather_columns_functor_gpu.cu.h:11,
                 from ~/Code/libspn/libspn/ops/gather_columns_functor_gpu.cu.cc:5:
~/.local/lib/python3.5/site-packages/tensorflow/include/tensorflow/stream_executor/dso_loader.h:32:30: fatal error: cuda/cuda_config.h: No such file or directory
compilation terminated.

Copying cuda_config.h to /site-packages/tensorflow/include/tensorflow/stream_executor/cuda solves the problem.

The same issue has been observed by several other users in #6602 (see the comments added after the issue was closed).
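
The copy workaround above can be sketched as a small helper. This is only an illustration: `install_cuda_config` is a hypothetical name, and the location of the generated cuda_config.h in a source build is not stated in this thread, so it is passed in as a parameter.

```python
import shutil
from pathlib import Path


def install_cuda_config(cuda_config, tf_include):
    """Copy cuda_config.h into <tf_include>/tensorflow/stream_executor/cuda/.

    dso_loader.h lives in tensorflow/stream_executor/, so its quoted include
    "cuda/cuda_config.h" can be found relative to that directory.
    """
    dest_dir = Path(tf_include) / "tensorflow" / "stream_executor" / "cuda"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / "cuda_config.h"
    shutil.copyfile(cuda_config, dest)  # overwrites an existing copy
    return dest
```

For `tf_include`, the directory returned by `tf.sysconfig.get_include()` should correspond to the `site-packages/tensorflow/include` path shown in the error output.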

@yangyu12

yangyu12 commented Sep 7, 2017

Same problem here. Is there a better way to solve it?

@aselle
Contributor

aselle commented Sep 20, 2017

@allenlavoie, have you looked at this as part of your library cleanup?

@aselle added the stat:awaiting response and stat:awaiting tensorflower labels and removed the stat:awaiting response label on Sep 20, 2017
@allenlavoie
Member

I haven't. I may have run into it when I was following our custom op documentation. It sounds like worst case we could add a copy in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/build_pip_package.sh ? Or maybe a genrule before that.

@jmlago

jmlago commented Sep 28, 2017

I have the same problem when compiling with nvcc. It also happens because tensorflow/core/util/cuda_kernel_helper.h is included.

But when I copy the file, I get the following error:

../site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:28:80: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.

@PatWie

PatWie commented Oct 11, 2017

The commit 2c598e8 breaks my makefile as well. This commit consistently changes the includes in several places:

-#include "third_party/gpus/cuda/include/cuda.h"
+#include "cuda/include/cuda.h"

Which gives me the error message:

../site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:28:80: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.

Compiling user ops with GPU support has not worked since 1.3.1.

Where does that guy, tensorflower-gardener, live? He is the one who usually introduces these breaking changes!

@HelloSeeing

@PatWie +1
I encounter this problem when building the Android tensorflow_demo, with the following error message:

tensorflow/core/kernels/lrn_op.cc:34:10: fatal error: 'cuda/include/cuda.h' file not found

@pronobis
Author

With TF 1.3, only cuda_config.h is missing, but when compiling against current master (1.4-rc0/1), I additionally get:

tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:24:31: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.

@bigbigda

bigbigda commented Nov 1, 2017

I also encounter the same problem in TF 1.3.0. The error is triggered by #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor":
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/stream_executor/dso_loader.h:32:30: fatal error: cuda/cuda_config.h: No such file or directory

@Leo-Mrl

Leo-Mrl commented Nov 3, 2017

I also encounter this problem in TF 1.4.0. Any clue?

@kkk324

kkk324 commented Nov 10, 2017

Also stuck on this problem in TF 1.4.0...

@Leo-Mrl

Leo-Mrl commented Nov 10, 2017

@CR-Ko Downgrading to 1.2.0 "solved" the issue for me

@nikste
Contributor

nikste commented Nov 13, 2017

I have a similar issue, resulting in:
.tensorflow-venv/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:24:31: fatal error: cuda/include/cuda.h: No such file or directory
Any other workarounds for this besides downgrading?

@Queequeg92

Queequeg92 commented Nov 17, 2017

@nikste The downgrading "solution" causes an incompatibility with cuDNN, so you also need to downgrade cuDNN.
@pronobis @PatWie Could installing from source solve this problem?

@PatWie

PatWie commented Nov 17, 2017

No change here. I currently stick to TF1.2 compiled from source (I always compile the library from source). It is frustrating. @allenlavoie or @aselle can you give us at least a hint?

edit: I tracked it down to commit 2c598e8.

This is an MWE for v1.2 that reproduces the error. It works under TF1.2 but not TF1.4.

@nikste
Contributor

nikste commented Nov 27, 2017

@Queequeg92 Installing from source did not fix the problem. I ended up manually adding the paths to the .h files as additional include directories and copying cuda_config.h from the source tree into my project.
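
The include-directory workaround described here might look like the sketch below. The flags are taken from the nvcc command in the first post (with `-x cu` written as two arguments); `nvcc_command` and all paths are hypothetical placeholders, not a confirmed recipe.

```python
import shlex


def nvcc_command(source, output, include_dirs):
    """Assemble an nvcc invocation with one -I flag per include directory."""
    cmd = [
        "nvcc", "-c", "-o", output, source,
        "-std=c++11", "-x", "cu", "-Xcompiler", "-fPIC",
        "-DGOOGLE_CUDA=1", "--expt-relaxed-constexpr",
    ]
    for directory in include_dirs:
        cmd += ["-I", directory]
    return cmd


# Placeholder paths: the TensorFlow include tree, plus a project directory
# assumed to hold the copied cuda/cuda_config.h.
example = nvcc_command(
    "my_op_gpu.cu.cc",
    "my_op_gpu.cu.o",
    ["/path/to/site-packages/tensorflow/include", "/path/to/my_project/include"],
)
print(" ".join(shlex.quote(arg) for arg in example))
```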

@tensorflowbutler
Member

It has been 14 days with no activity and the awaiting tensorflower label was assigned. Please update the label and/or status accordingly.

@John1231983

Hello. I also have this problem with TF 1.4, although it worked in TF 1.2. Has anyone fixed it with TF 1.4?

@gunan
Contributor

gunan commented Jan 5, 2018

Folding all the duplicates into #15002

@gunan closed this as completed on Jan 5, 2018
@PatWie

PatWie commented Jan 9, 2018

@Queequeg92 I have it working under v1.5rc0 now, see https://github.com/cgtuebingen/tf_custom_op which requires installing from source or copying the cuda_config.h from #15002.

@Queequeg92

@PatWie Thank you! I'll have a try.

@Queequeg92

@PatWie I have installed v1.5 from source successfully. The cuda_config.h error is gone, but I still get the cuda.h error. 😫

@xysmlx

xysmlx commented Feb 6, 2018

I solved this problem by commenting out the line #include "cuda/cuda_config.h" in dso_loader.h. The custom op then works normally.

@PatWie

PatWie commented Feb 6, 2018

@Queequeg92
cuda.h is part of the CUDA toolkit (or whatever NVIDIA ships), not of TensorFlow; that is what the lines

# use cuda
find_package(CUDA 9.0 EXACT REQUIRED)
set(CUDA_SAMPLE_INC "${CUDA_INCLUDE_DIRS}/../samples/common/inc")
message(STATUS "CUDA_INCLUDE_DIRS: ${CUDA_INCLUDE_DIRS}")
include_directories(SYSTEM "${CUDA_INCLUDE_DIRS}/../../")

are for. But the latest version of TF is broken again (#15002 (comment))

@xysmlx
When writing cuda ops, you probably want to use the TensorFlow cuda_config part.
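
On the cuda.h point: an include such as "cuda/include/cuda.h" is looked up relative to each include directory, so the directory passed to the compiler has to be the parent of a folder named `cuda`, not the toolkit's own `include` folder. The sketch below imitates that lookup to check candidate directories; `find_include_root` is a hypothetical helper and the paths in the usage note are assumptions.

```python
from pathlib import Path


def find_include_root(header, include_dirs):
    """Return the first directory under which `header` exists, or None.

    This mirrors how a compiler walks its -I directories in order.
    """
    for directory in include_dirs:
        if (Path(directory) / header).is_file():
            return Path(directory)
    return None
```

For example, assuming the toolkit is installed at /usr/local/cuda, `find_include_root("cuda/include/cuda.h", ["/usr/local/cuda/include", "/usr/local"])` would return `/usr/local`. That is consistent with the CMake line above passing `${CUDA_INCLUDE_DIRS}/../../`, if `CUDA_INCLUDE_DIRS` points at `/usr/local/cuda/include`.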
