
Fix compiler error with cuda-clang #17171

Merged
merged 1 commit into tensorflow:master on Feb 23, 2018
Conversation

ilya-biryukov
Contributor

segment_reduction_ops.h requires cuda_kernel_helper.h to be
included under clang because it uses some of the helpers directly in the
header (e.g. CudaAtomicMax). This works with nvcc because the usage is
in a template context, and nvcc only checks that the function is available
later, at template instantiation.
Clang, however, performs stricter error-checking for functions found
during template instantiation and requires them to be found either by
ADL or at the point of template declaration.

@@ -16,6 +16,14 @@ limitations under the License.
#ifndef THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_
#define THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_


// This file requires the following include because it uses CudaAtomicMax:
// #include "tensorflow/core/util/cuda_kernel_helper.h"
Contributor
What if we guard the include with "#ifdef GOOGLE_CUDA"?
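The suggestion above would look roughly like this (a sketch of the guard pattern; GOOGLE_CUDA is the macro TensorFlow's build defines for CUDA compilations):

```cpp
// Only pull in the CUDA helpers when building CUDA code.
#ifdef GOOGLE_CUDA
#include "tensorflow/core/util/cuda_kernel_helper.h"
#endif
```

As the reply below explains, this does not actually help here, because CPU builds also end up defining GOOGLE_CUDA.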

Contributor Author

I tried doing that, but discovered that CPU builds also set GOOGLE_CUDA (see tensorflow.bzl#1 and tensorflow.bzl#2; both seem to propagate to '.cc' files as well as '.cu.cc' files).
Probably tensorflow.bzl shouldn't set GOOGLE_CUDA, since this define is already handled by the GPU crosstool:

return if_cuda(["-x", "cuda", "-DGOOGLE_CUDA=1"] + %{cuda_extra_copts})

However, when I tried removing '-DGOOGLE_CUDA=1' from tensorflow.bzl, I got a bunch of link errors and didn't investigate further. Even if we want to dig deeper into this, it would be nice to have a workaround to unbreak cuda-clang while we investigate.

Contributor

You are right, that seems like a bug.
If this works as is now, I am OK with merging it.

@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 23, 2018
@ekelsen ekelsen merged commit 6eb8f8c into tensorflow:master Feb 23, 2018