
Fix compiler error with cuda-clang #17171

Merged
merged 1 commit into tensorflow:master on Feb 23, 2018
Conversation

ilya-biryukov
Contributor

segment_reduction_ops.h requires cuda_kernel_helper.h to be
included under clang because it uses some of the helpers directly in the
header (e.g. CudaAtomicMax). This works with nvcc because the usage is
in a template context, and nvcc only checks that the function is available
later, at template instantiation.
Clang, however, performs stricter error-checking for functions found
during template instantiation and requires them to be found either by
ADL or at the point of template declaration.

@@ -16,6 +16,14 @@ limitations under the License.
#ifndef THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_
#define THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_


// This file requires the following include because it uses CudaAtomicMax:
// #include "tensorflow/core/util/cuda_kernel_helper.h"
Contributor
What if we guard the include with "#ifdef GOOGLE_CUDA"?
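The suggestion above would look roughly like this (a sketch of the guard pattern; GOOGLE_CUDA is the macro TensorFlow's build defines for CUDA compilations):

```cpp
// Only pull in the CUDA helpers when building CUDA code.
#ifdef GOOGLE_CUDA
#include "tensorflow/core/util/cuda_kernel_helper.h"
#endif
```

As the reply below explains, this does not actually help here, because CPU builds also end up defining GOOGLE_CUDA.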

Contributor Author

I tried doing that, but discovered that CPU builds also set GOOGLE_CUDA (see tensorflow.bzl#1 and tensorflow.bzl#2; both seem to propagate to '.cc' files as well as '.cu.cc' files).
Probably tensorflow.bzl shouldn't set GOOGLE_CUDA, since this define is already handled by the GPU crosstool:

return if_cuda(["-x", "cuda", "-DGOOGLE_CUDA=1"] + %{cuda_extra_copts})

However, when I tried removing '-DGOOGLE_CUDA=1' from tensorflow.bzl, I got a bunch of link errors and didn't investigate further. Even if we want to dig deeper into this, it would be nice to have a workaround to unbreak cuda-clang while we investigate.

Contributor

You are right, that seems like a bug.
If this works as is now, I am OK with merging it.

@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 23, 2018
@ekelsen ekelsen merged commit 6eb8f8c into tensorflow:master Feb 23, 2018