Fix compiler error with cuda-clang #17171

Merged 1 commit on Feb 23, 2018
8 changes: 8 additions & 0 deletions tensorflow/core/kernels/segment_reduction_ops.h
@@ -16,6 +16,14 @@ limitations under the License.
#ifndef THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_
#define THIRD_PARTY_TENSORFLOW_CORE_KERNELS_SEGMENT_REDUCTION_OPS_H_


// This file requires the following include because it uses CudaAtomicMax:
// #include "tensorflow/core/util/cuda_kernel_helper.h"
Contributor:
What if we guard the include with "ifdef GOOGLE_CUDA" ?
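The guard the reviewer suggests would look roughly like this (a sketch only, not code from the PR; as the follow-up explains, it turns out not to help because the build also defines GOOGLE_CUDA for CPU compilations):

```cpp
// Hypothetical guard; ineffective in this build because tensorflow.bzl
// also defines GOOGLE_CUDA for CPU compilations.
#if GOOGLE_CUDA
#include "tensorflow/core/util/cuda_kernel_helper.h"
#endif
```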

Contributor (Author):
I tried doing that, but discovered that CPU builds also set GOOGLE_CUDA (see tensorflow.bzl#1 and tensorflow.bzl#2; both seem to propagate to '.cc' files as well as '.cu.cc' files).
Probably tensorflow.bzl shouldn't set GOOGLE_CUDA, as this define is already handled by the GPU crosstool:

return if_cuda(["-x", "cuda", "-DGOOGLE_CUDA=1"] + %{cuda_extra_copts})

However, when I tried removing '-DGOOGLE_CUDA=1' from tensorflow.bzl, I got a bunch of link errors and didn't investigate further. Even if we want to dig deeper into this, it would be nice to have a workaround to unbreak cuda-clang while we investigate.

Contributor:
You are right, that seems like a bug.
If this works as is now, I am OK with merging it.


// Unfortunately we can't add the #include here, since it breaks compilation for
// non-GPU targets. The breakage only shows up under clang, which is stricter
// about template code, and CudaAtomicMax is used in a template context.

#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_shape.h"
5 changes: 4 additions & 1 deletion tensorflow/core/kernels/segment_reduction_ops_gpu.cu.cc
@@ -17,10 +17,13 @@ limitations under the License.

#define EIGEN_USE_GPU

// We need to include cuda_kernel_helper.h before segment_reduction_ops.h
// See comment in segment_reduction_ops.h for more details.
#include "tensorflow/core/util/cuda_kernel_helper.h"

#include "tensorflow/core/kernels/segment_reduction_ops.h"
#include "tensorflow/core/framework/register_types.h"
#include "tensorflow/core/util/cuda_device_functions.h"
#include "tensorflow/core/util/cuda_kernel_helper.h"


namespace tensorflow {