
[r2.5 port][ROCm] Port PR 47980 to r2.5 #48444

Merged
mihaimaruseac merged 2 commits into tensorflow:r2.5 from ROCm:google_upstream_r25_port_pr_47980
Apr 22, 2021
deven-amd (Contributor) commented Apr 9, 2021

The following commit adds int32/int64 GPU support for the Unique/UniqueWithCounts ops, but breaks the ROCm build in the process:

tensorflow@02585ac

```
tensorflow/core/kernels/unique_op_gpu.cu.cc:292:9: error: no matching constructor for initialization of 'gpuprim::TransformInputIterator<int, SegmentIndicatorFunctor<unsigned long, int>, gpuprim::CountingInputIterator<int>>' (aka 'transform_iterator<rocprim::counting_iterator<int, long>, tensorflow::(anonymous namespace)::SegmentIndicatorFunctor<unsigned long, int>, int>')
        segment_indicator_iter(0, {sorted_input_ptr});
        ^                      ~~~~~~~~~~~~~~~~~~~~~
tensorflow/core/kernels/unique_op_gpu.cu.cc:176:12: note: in instantiation of member function 'tensorflow::UniqueOpGPU<unsigned long, int>::ComputeAsync' requested here
  explicit UniqueOpGPU(OpKernelConstruction* context)
           ^
tensorflow/core/kernels/unique_op_gpu.cu.cc:461:27: note: in instantiation of member function 'tensorflow::UniqueOpGPU<unsigned long, int>::UniqueOpGPU' requested here
TF_CALL_REAL_NUMBER_TYPES(REGISTER_UNIQUE_GPU);
```

This commit disables ROCm support for the newly added ops to get the CSB passing again. We are looking into resolving the build errors and will file a separate PR to re-enable ROCm functionality for these ops. This commit also adds the `no_rocm` tag to a couple of unit tests that start failing because these ops no longer have ROCm support:

```
//tensorflow/python/keras/optimizer_v2:adamax_test_gpu                   FAILED in 3 out of 3 in 7.0s
//tensorflow/python/training:adam_test_gpu                               FAILED in 3 out of 3 in 6.3s
```
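For readers unfamiliar with how a GPU kernel can be switched off for one platform only, here is a minimal sketch. TensorFlow sources commonly guard GPU code with `#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM`; narrowing such a guard to `GOOGLE_CUDA` alone is one way to drop ROCm support. Whether this PR did exactly that is not shown in this conversation, and the registry below (`Registry`, `RegisterKernels`, `RegisteredKernels`) is a hypothetical stand-in, not TensorFlow's kernel registration API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a kernel registry (not TensorFlow's real API).
std::vector<std::string>& Registry() {
  static std::vector<std::string> kernels;
  return kernels;
}

void RegisterKernels() {
  // The CPU kernel is always available.
  Registry().push_back("Unique/CPU");

  // A guard covering both GPU platforms would read:
  //   #if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
  // Narrowing it to CUDA only (assumed here for illustration) means a
  // ROCm build compiles this file without registering the GPU kernel.
#if GOOGLE_CUDA
  Registry().push_back("Unique/GPU");
#endif
}

// Returns the kernels registered for the current build configuration.
std::vector<std::string> RegisteredKernels() {
  Registry().clear();
  RegisterKernels();
  return Registry();
}
```

Built with `-DTENSORFLOW_USE_ROCM=1` (or with neither macro defined), `RegisteredKernels()` contains only the CPU entry. Tests that expect the op to run on a GPU then fail on ROCm, which is why such tests get tagged `no_rocm`.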
…m platform

tensorflow@50f8897

```
//tensorflow/python/keras/distribute:dataset_creator_model_fit_test_gpu  FAILED in 3 out of 3 in 112.5s
```

This commit adds a `no_rocm` tag to temporarily disable that unit-test on ROCm.

google-cla bot added the cla: yes label Apr 9, 2021
gbaned self-assigned this Apr 12, 2021
gbaned added the size:XS (CL Change Size: Extra Small) label Apr 12, 2021
gbaned assigned mihaimaruseac and unassigned gbaned Apr 12, 2021
gbaned requested a review from mihaimaruseac April 12, 2021 05:12
mihaimaruseac merged commit d25f3bc into tensorflow:r2.5 Apr 22, 2021
deven-amd deleted the google_upstream_r25_port_pr_47980 branch May 12, 2021 23:47