[ROCm] Adding Cuda Alias to Gpu functions #28343

jerryyin · 2019-05-02T18:15:23Z

This PR is a follow-up to PR #24293. It depends on, and includes code changes from, PR #28116. The code updates for this PR are in the tensorflow/core/util path.

Through utilities defined in gpu_cuda_alias.h, this PR provides Cuda* aliases for the existing Gpu* functions. The goal is backward compatibility with both the TensorFlow master branch and ROCm's develop-upstream branch. It lets the TensorFlow master branch gradually deprecate the Cuda*-aliased functions, with the final goal of using GPU as the prefix to unify the Cuda and ROCm builds. All aliasing is done through perfect-forwarding macros, so the aliases are easy to maintain and to take out later.
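For readers unfamiliar with the technique, the following is a minimal sketch of a perfect-forwarding alias macro. It is not copied from gpu_cuda_alias.h: the macro name `EXAMPLE_CREATE_FUNCTION_ALIAS` and the functions `GpuExampleAdd` / `CudaExampleAdd` are hypothetical names chosen for illustration.

```cpp
#include <utility>

// Hypothetical macro name, for illustration only. It defines `alias` as a
// variadic template that perfect-forwards every argument to `func`.
#define EXAMPLE_CREATE_FUNCTION_ALIAS(func, alias)     \
  template <typename... Args>                          \
  auto alias(Args&&... args)                           \
      -> decltype(func(std::forward<Args>(args)...)) { \
    return func(std::forward<Args>(args)...);          \
  }

// Hypothetical stand-ins for Gpu*-prefixed helpers.
inline int GpuExampleAdd(int a, int b) { return a + b; }
inline double GpuExampleAdd(double a, double b) { return a + b; }

// One line creates the Cuda*-prefixed alias. Overload resolution happens
// inside the alias, so both overloads above are reachable through it.
EXAMPLE_CREATE_FUNCTION_ALIAS(GpuExampleAdd, CudaExampleAdd)
```

With this in place, `CudaExampleAdd(2, 3)` resolves to the `int` overload of `GpuExampleAdd`, and removing the alias later means deleting a single macro invocation.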

Tests performed: both the Cuda and ROCm builds succeed on both the master and develop-upstream branches.

@whchung @deven-amd

angerson · 2019-05-02T19:14:06Z

I don't have the expertise to review this; removing myself.

jerryyin · 2019-05-06T21:49:29Z

I will post back to address the review feedback after PR439 is approved. I will also integrate the macro-only change from GPU_LAUNCH_KERNEL to GpuLaunchKernel.
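As a rough, host-only illustration of why a launch function can be preferable to a launch macro, the sketch below uses a function template that checks the arguments against the kernel's signature and returns a status. Everything here is an assumption made for illustration: `ExampleLaunchKernel` and `ExampleFill` are hypothetical, and the "launch" simply calls the function on the host instead of enqueueing a device kernel on a stream.

```cpp
#include <utility>

// Hypothetical host-only stand-in for a launch helper. A real launcher would
// also take grid/block dimensions and a stream; those are omitted here.
template <typename... Ts, typename... Args>
bool ExampleLaunchKernel(void (*kernel)(Ts...), Args&&... args) {
  if (kernel == nullptr) return false;
  // The call below fails to compile if `args` do not convert to `Ts...`,
  // which is the type checking a plain text-substitution macro cannot offer.
  kernel(std::forward<Args>(args)...);
  return true;
}

// Hypothetical "kernel": fills the first n elements of out with value.
inline void ExampleFill(int* out, int n, int value) {
  for (int i = 0; i < n; ++i) out[i] = value;
}
```

A call such as `ExampleLaunchKernel(ExampleFill, buffer, 4, 7)` reads like an ordinary function call, and a Cuda*-prefixed alias can then forward to it in the same way as any other function.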

jerryyin · 2019-05-09T17:25:29Z

The ongoing effort in commit d8d67ef pursues the same goal as this PR, so it appears to duplicate this work, and it is creating hard-to-resolve conflicts with this PR.

Commit d8d67ef creates the aliases in a less desirable way: it manually copies and renames every function signature. Compared to a dedicated macro that perfect-forwards to the original function name, that approach looks hard to maintain and error-prone.

Could you help contact the internal contributor to make sure we are on the same page about this PR? @chsigg @mrry @rthadur

whchung · 2019-05-09T18:16:02Z

@jerryyin for the time being, I recommend reverting d8d67ef in this PR, as perfect forwarding makes the code more maintainable.

chsigg · 2019-05-09T18:33:03Z

Hi Zhuoran. I started the rename from our side, sorry that I didn't look more carefully at your approach first. Generally, please do not mix several changes into the same PR. It makes it hard to understand and review.
Could you please make a PR that introduces the macros only, and a follow-up to generate the various alias functions/macros/types, maybe even split into multiple PRs. At that point I can start preparing changes to use the new 'gpu' symbols.

jerryyin · 2019-05-09T18:58:55Z

@chsigg No worries, good feedback on breaking up the PR.

Please prioritize the following reviews:

One PR to create the forwarding macros: [ROCm] Creating Cuda forwarding alias macros #28564
One PR to create GpuLaunchKernel: [ROCm] Creating GpuLaunchKernel #28565

Then the dependent ones:

Corresponding changes using the forwarding type macro: [ROCm] Forward type names from gpu prefix to cuda prefix #28567
Corresponding changes using the forwarding host function macro: [ROCm] Forward host function names from gpu prefix to cuda prefix #28568
Corresponding changes using the forwarding device function macro: [ROCm] Forward device function name from gpu prefix to cuda prefix #28571
One PR to remove GPU_LAUNCH_KERNEL: [ROCm] remove GPU_LAUNCH_KERNEL macro #28566

The final goal is to close this large PR.
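The type-forwarding step in the plan above can be sketched in a similar way. This is only an assumption about the general shape of such a macro, not the contents of the linked PRs: `EXAMPLE_CREATE_TYPE_ALIAS`, `GpuExampleConfig`, and `CudaExampleConfig` are hypothetical names.

```cpp
#include <type_traits>

// Hypothetical macro name, for illustration only. A type alias needs no
// forwarding logic; a `using` declaration is enough.
#define EXAMPLE_CREATE_TYPE_ALIAS(type, alias) using alias = type;

// Hypothetical stand-in for a Gpu*-prefixed type.
struct GpuExampleConfig {
  int block_count = 0;
  int thread_per_block = 0;
};

EXAMPLE_CREATE_TYPE_ALIAS(GpuExampleConfig, CudaExampleConfig)

// Both names denote the same type, so code written against the old
// Cuda*-prefixed name keeps compiling against Gpu*-prefixed functions.
static_assert(std::is_same<GpuExampleConfig, CudaExampleConfig>::value,
              "alias must name the same type");

inline int ExampleTotalThreads(const GpuExampleConfig& config) {
  return config.block_count * config.thread_per_block;
}
```

Because the alias names the same type rather than a copy of it, a `CudaExampleConfig` object can be passed directly to `ExampleTotalThreads`.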

The --config=rocm build was broken by the merge of PR tensorflow#26840. This commit backs out the ROCm support in the file avgpooling_op.cc (added by that PR). The template instantiations required for GPU support of the average pooling operator (which are in avgpooling_op_gpu.cu.cc) must be enabled for ROCm at the same time as the code in avgpooling_op.cc in order to avoid link errors with the `--config=rocm` build. Enabling ROCm support for the code in avgpooling_op_gpu.cu.cc requires other PRs (the set spawned from PR tensorflow#28343) to be merged first. Once those PRs are merged, we will file another PR to re-enable ROCm support in the avgpooling*.cc files.

@chsigg

Imported from GitHub PR #28565

This PR is a follow-up to the original PR #28343. The reviewer requested to break down the original large PR to a series of smaller ones. According to the plan here, this PR is the fifth one in the whole series. @chsigg @whchung

Copybara import of the project:

- c4756f5 Creating GpuLaunchKernel by zhuoryin <zhuoryin@amd.com>
- bd98ac6 Merge c4756f5 into b5170... by Zhuoran Yin <jerryyin@users.noreply.github.com>

COPYBARA_INTEGRATE_REVIEW=#28565 from ROCmSoftwarePlatform:google-upstream-pr-GpuLaunchKernel c4756f5
PiperOrigin-RevId: 248006235

jerryyin · 2019-05-17T20:14:01Z

Closing the review, as all sub-PRs are merged.

tensorflow-bot bot added the size:L CL Change Size: Large label May 2, 2019

googlebot added the cla: yes label May 2, 2019

rthadur self-assigned this May 2, 2019

rthadur added this to Assigned Reviewer in PR Queue via automation May 2, 2019

rthadur added the cuda label May 2, 2019

rthadur requested review from chsigg and angerson May 2, 2019 18:43

angerson removed their request for review May 2, 2019 19:14

whchung added the kokoro:force-run Tests on submitted change label May 2, 2019

kokoro-team removed the kokoro:force-run Tests on submitted change label May 2, 2019

jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch 2 times, most recently from 5f09d2c to 6223014 Compare May 3, 2019 15:34

rthadur requested review from mrry and removed request for chsigg May 3, 2019 17:38

jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch from 6223014 to 8652f82 Compare May 6, 2019 14:46

whchung added the kokoro:force-run Tests on submitted change label May 6, 2019

kokoro-team removed the kokoro:force-run Tests on submitted change label May 6, 2019

whchung requested a review from chsigg May 6, 2019 16:25

whchung requested changes May 6, 2019

tensorflow/core/util/gpu_launch_config.h (outdated)

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes May 6, 2019

whchung reviewed May 6, 2019

tensorflow/core/util/gpu_launch_config.h (outdated)

whchung reviewed May 6, 2019

tensorflow/core/util/gpu_kernel_helper.h (outdated)

whchung mentioned this pull request May 6, 2019

Replace GPU_LAUNCH_KERNEL with GpuLaunchKernel ROCm/tensorflow-upstream#439

Merged

deven-amd mentioned this pull request May 7, 2019

[ROCm] Adding ROCm support for "searchsorted" op #28479

Merged

whchung added kokoro:force-run Tests on submitted change and removed kokoro:force-run Tests on submitted change labels May 9, 2019

kokoro-team removed the kokoro:force-run Tests on submitted change label May 9, 2019

whchung added the kokoro:force-run Tests on submitted change label May 9, 2019

jerryyin added 5 commits May 9, 2019 17:18

Adding Cuda Alias to Gpu functions

9a60a98

Creating GpuLaunchKernel, making CudaLaunchKernel an alias

f7cb8f1

Removing definition of GPU_LAUNCHLK_KERNEL

ca0a5cd

Addressing review feedbacks and rebase conflicts

b3a9f77

Merging latest update from develop-upstream branch

0f07388

jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch from bff8831 to 0f07388 Compare May 9, 2019 17:19

kokoro-team removed the kokoro:force-run Tests on submitted change label May 9, 2019

rthadur requested review from chsigg and whchung May 9, 2019 17:53

rthadur removed the ready to pull PR ready for merge process label May 14, 2019

jerryyin closed this May 17, 2019

PR Queue automation moved this from Reviewer Requested Changes to Closed/Rejected May 17, 2019

jerryyin deleted the google-upstream-pr-gpu-backwardcompat-alias branch December 20, 2019 17:33
