Fix for TF build failure with ROCm 3.9 (error: call to 'min' is ambiguous)

When building TF with ROCm 3.9, we run into the following compile error:

```
In file included from tensorflow/core/kernels/reduction_ops_half_mean_sum.cu.cc:20:
./tensorflow/core/kernels/reduction_gpu_kernels.cu.h:430:9: error: call to 'min' is ambiguous
        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
        ^~~
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1183:23: note: candidate function
__DEVICE__ inline int min(int __arg1, int __arg2) {
                      ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1197:14: note: candidate function
inline float min(float __x, float __y) { return fminf(__x, __y); }
             ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1200:15: note: candidate function
inline double min(double __x, double __y) { return fmin(__x, __y); }
              ^
1 error generated when compiling for gfx803.
```

The build error appears to occur because ROCm 3.9 uses the LLVM header files from `llvm/lib/clang/12.0.0/include` (ROCm 3.8 uses the `11.0.0` version). The `12.0.0` directory has a new `__clang_hip_math.h` file, which is not present in `11.0.0`, and this file overloads the `min` function for the `float` and `double` types.

The first argument in the call to `min` (which leads to the error) is `blockDim.y`, which has a `uint` type, so the compiler cannot decide which overload to resolve to. Previously (i.e. ROCm 3.8 and before) there was only one option (`int`); with ROCm 3.9 there are three (`int`, `float`, and `double`), hence the error.
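
The ambiguity can be reproduced on the host with ordinary C++ overload resolution. The sketch below is a minimal illustration, not the ROCm header: the `demo` namespace and the `min_is_callable` trait are hypothetical names, and the three overloads only mirror the candidate signatures listed in the error log. It assumes both arguments at the call site are `unsigned int` (in C++, subtracting an `unsigned int` from an `int` yields `unsigned int`), so every candidate needs a conversion for every argument and none is a better match than the others.

```cpp
#include <type_traits>
#include <utility>

namespace demo {  // hypothetical stand-ins for the candidates in the error log
inline int min(int a, int b) { return b < a ? b : a; }
inline float min(float a, float b) { return b < a ? b : a; }
inline double min(double a, double b) { return b < a ? b : a; }
}  // namespace demo

// Primary template: demo::min(A, B) does not resolve to a single overload.
template <typename A, typename B, typename = void>
struct min_is_callable : std::false_type {};

// Chosen only when the call expression is well-formed, i.e. overload
// resolution finds exactly one best candidate.
template <typename A, typename B>
struct min_is_callable<
    A, B,
    decltype(void(demo::min(std::declval<A>(), std::declval<B>())))>
    : std::true_type {};

// Two unsigned arguments: all three candidates are equally good, so the
// call is rejected as ambiguous.
static_assert(!min_is_callable<unsigned, unsigned>::value,
              "expected an ambiguous call");

// An int first argument is an exact match only for min(int, int).
static_assert(min_is_callable<int, unsigned>::value,
              "expected a unique best overload");
```

With only the `int` overload present, the same `unsigned` call would compile, since there would be a single viable candidate to convert to.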

The "fix" is to explicitly cast the first argument to `int` to remove the ambiguity (the second argument is already an `int` type).
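
As a rough host-side sketch of the cast, the snippet below evaluates the patched expression with plain integers. Everything in it is an assumption made for illustration: `demo::min` stands in for the three overloads from the error log, `rows_this_block` is a hypothetical helper, and its parameters stand in for `blockDim.y`, `blockIdx.y`, and `num_rows`. With the first argument cast to `int`, the `int` overload is an exact match for that argument and is selected whether the second argument is `int` or `unsigned int`.

```cpp
namespace demo {  // hypothetical stand-ins for the candidates in the error log
constexpr int min(int a, int b) { return b < a ? b : a; }
constexpr float min(float a, float b) { return b < a ? b : a; }
constexpr double min(double a, double b) { return b < a ? b : a; }
}  // namespace demo

// Hypothetical helper mirroring the patched line of ColumnReduceKernel.
// block_dim_y and block_idx_y play the role of blockDim.y and blockIdx.y.
constexpr int rows_this_block(unsigned block_dim_y, unsigned block_idx_y,
                              int num_rows) {
  // int(block_dim_y) makes min(int, int) the unique best overload; the
  // second argument is converted to int when it is passed.
  return demo::min(int(block_dim_y), num_rows - block_idx_y * block_dim_y);
}

// 100 rows in blocks of 32: a full block has 32 rows, the last one has 4.
static_assert(rows_this_block(32u, 0u, 100) == 32, "full block");
static_assert(rows_this_block(32u, 3u, 100) == 4, "partial last block");
```

The sketch assumes the remaining row count fits in an `int`, which holds whenever `num_rows` itself is a non-negative `int`.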
deven-amd authored and serach24 committed Jun 4, 2021
1 parent 14d995f commit c0e144b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion tensorflow/core/kernels/reduction_gpu_kernels.cu.h
@@ -387,7 +387,7 @@ __global__ __launch_bounds__(1024) void ColumnReduceKernel(
// - =
// =
const int numRowsThisBlock =
-      min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
+      min(int(blockDim.y), num_rows - blockIdx.y * blockDim.y);

for (int row = 1; row < numRowsThisBlock; ++row) {
value_type t = partial_sums[threadIdx.x * (TF_RED_WARPSIZE + 1) + row];