Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROCm] Adding Cuda Alias to Gpu functions #28343

Conversation

jerryyin
Copy link
Member

@jerryyin jerryyin commented May 2, 2019

This PR is a follow-up to PR #24293. This PR depends on & includes code changes from PR #28116. Code updates for this PR are in the tensorflow/core/util path

Through utility functions defined in gpu_cuda_alias.h, this PR provides a Cuda* alias to existing Gpu* functions. The reason for this PR is to create backward compatibility to both the TensorFlow master branch as well as ROCm's develop-upstream branch. It allows TensorFlow master branch to slowly deprecate Cuda* aliased function, to the final goal of using GPU as the prefix to unify both Cuda and ROCm builds. All aliasing is done through perfect forwarding macro so that it is easy to maintain/take out.

Tests performed: Both Cuda/ROCm build successfully in both master and develop-upstream branch.

@whchung @deven-amd

@tensorflow-bot tensorflow-bot bot added the size:L CL Change Size: Large label May 2, 2019
@rthadur rthadur self-assigned this May 2, 2019
@rthadur rthadur added this to Assigned Reviewer in PR Queue via automation May 2, 2019
@rthadur rthadur added the cuda label May 2, 2019
@rthadur rthadur requested review from chsigg and angerson May 2, 2019 18:43
@angerson
Copy link
Contributor

angerson commented May 2, 2019

I don't have the expertise to review this; removing myself.

@angerson angerson removed their request for review May 2, 2019 19:14
@whchung whchung added the kokoro:force-run Tests on submitted change label May 2, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 2, 2019
@jerryyin jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch 2 times, most recently from 5f09d2c to 6223014 Compare May 3, 2019 15:34
@rthadur rthadur requested review from mrry and removed request for chsigg May 3, 2019 17:38
@jerryyin jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch from 6223014 to 8652f82 Compare May 6, 2019 14:46
@whchung whchung added the kokoro:force-run Tests on submitted change label May 6, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 6, 2019
@whchung whchung requested a review from chsigg May 6, 2019 16:25
tensorflow/core/util/gpu_launch_config.h Outdated Show resolved Hide resolved
PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes May 6, 2019
@jerryyin
Copy link
Member Author

jerryyin commented May 6, 2019

Will post back to address review feedback after PR439 approved. Will also integrate macro only change from GPU_LAUNCH_KERNEL to GpuLaunchKernel

@whchung whchung added kokoro:force-run Tests on submitted change and removed kokoro:force-run Tests on submitted change labels May 9, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 9, 2019
@whchung whchung added the kokoro:force-run Tests on submitted change label May 9, 2019
@jerryyin jerryyin force-pushed the google-upstream-pr-gpu-backwardcompat-alias branch from bff8831 to 0f07388 Compare May 9, 2019 17:19
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label May 9, 2019
@jerryyin
Copy link
Member Author

jerryyin commented May 9, 2019

The on-going effort in commit d8d67ef to achieve the same goal with this PR seems to be duplicated, and is creating hard-to-resolve conflicts with this PR.

Commit d8d67ef used a less desirable way to create aliases - manually copy and rename every function signature. Compared to the approach of creating dedicated macro to perfect forward function name, the commit looks hard to maintain and error prone.

Could you help contact the internal contributor make sure our opinion on the same page with this PR? @chsigg @mrry @rthadur

@rthadur rthadur requested review from chsigg and whchung May 9, 2019 17:53
@whchung
Copy link
Contributor

whchung commented May 9, 2019

@jerryyin for the time being, I recommend reverting d8d67ef in this PR, as perfect forwarding makes the code better maintainable.

@chsigg
Copy link
Contributor

chsigg commented May 9, 2019

Hi Zhuoran. I started the rename from our side, sorry that I didn't look more carefully at your approach first. Generally, please do not mix several changes into the same PR. It makes it hard to understand and review.
Could you please make a PR that introduces the macros only, and a follow-up to generate the various alias functions/macros/types, maybe even split into multiple PRs. At that point I can start preparing changes to use the new 'gpu' symbols.

@jerryyin
Copy link
Member Author

jerryyin commented May 9, 2019

Hi Zhuoran. I started the rename from our side, sorry that I didn't look more carefully at your approach first. Generally, please do not mix several changes into the same PR. It makes it hard to understand and review.
Could you please make a PR that introduces the macros only, and a follow-up to generate the various alias functions/macros/types, maybe even split into multiple PRs. At that point I can start preparing changes to use the new 'gpu' symbols.

@chsigg No worries, good feedback on breaking the PR.

Please prioritize following reviews:

Then the dependent ones:

The final goal of closing the large PR.

deven-amd added a commit to ROCm/tensorflow-upstream that referenced this pull request May 13, 2019
The --config=rocm build was broken by the merge for PR tensorflow#26840

This commit backs out the ROCm support in the file avgpooling_op.cc (added by the above PR). This is because the the template instantiations required for GPU support of the average pooling operator (which are in avgpooling_op_gpu.cu.cc) also need to be enabled for ROCm at the same time (as the code in avgpooling_op.cc) in order to avoid link errors with the `--config=rocm` build.  Enabling ROCm support for the code in avgpooling_op_gpu.cu.cc requires other PRs (the set spwaned from PR tensorflow#28343) to be merged first. Once those PRs are merged, we will file another PR to re-enable ROCm support in the avgpooling*.cc files.
tensorflow-copybara pushed a commit that referenced this pull request May 13, 2019
Imported from GitHub PR #28565

This PR is a follow-up to the original PR #28343. The reviewer requested to break down the original large PR to a series of smaller ones. According to the plan here, this PR is the fifth one in the whole series.

@chsigg @whchung

Copybara import of the project:

  - c4756f5 Creating GpuLaunchKernel by zhuoryin <zhuoryin@amd.com>
  - bd98ac6 Merge c4756f5 into b5170... by Zhuoran Yin <jerryyin@users.noreply.github.com>

COPYBARA_INTEGRATE_REVIEW=#28565 from ROCmSoftwarePlatform:google-upstream-pr-GpuLaunchKernel c4756f5
PiperOrigin-RevId: 248006235
@rthadur rthadur removed the ready to pull PR ready for merge process label May 14, 2019
@jerryyin
Copy link
Member Author

Closing the review as all sub-PRs are merged

@jerryyin jerryyin closed this May 17, 2019
PR Queue automation moved this from Reviewer Requested Changes to Closed/Rejected May 17, 2019
@jerryyin jerryyin deleted the google-upstream-pr-gpu-backwardcompat-alias branch December 20, 2019 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes size:L CL Change Size: Large
Projects
PR Queue
  
Closed/Rejected
Development

Successfully merging this pull request may close these issues.

None yet

7 participants