Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROCm] Update eigen_contraction_kernel subroutine to make it device compatible #29395

Conversation

jerryyin
Copy link
Member

@jerryyin jerryyin commented Jun 4, 2019

This PR addresses ROCm specific compilation issues. Below is the summary of the PR changes.


First of all, UseCustomContractionKernels() should be declared with EIGEN_DEVICE_FUNC and EIGEN_DONT_INLINE as well to be consistent with the signature from packLhs() or packRhs(), invoke().

Secondly, std::call_once() is not available in gpu device code, and this function should not be invoked in device as well. If it does happen, take the short cut and return.


The change has been thoroughly tested under different compilation targets in develop-upstream branch.

@tatianashp @whchung @chsigg

@tensorflow-bot tensorflow-bot bot added the size:S CL Change Size: Small label Jun 4, 2019
@whchung whchung requested a review from chsigg June 4, 2019 15:21
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jun 4, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jun 4, 2019
@rthadur rthadur self-assigned this Jun 4, 2019
@rthadur rthadur added this to Assigned Reviewer in PR Queue via automation Jun 4, 2019
@rthadur rthadur added cuda and removed ready to pull PR ready for merge process labels Jun 4, 2019
@jerryyin
Copy link
Member Author

jerryyin commented Jun 4, 2019

Looking at the test results, Linux GPU has a large number of failures on Broken by missing target @mkl_dnn//:mkldnn_single_threaded. I do realize that this change might be related because eigen_contraction_kernel_with_mkl is the only target build with mkldnn_single_threaded. However, I'm not able to view the corresponding build log online or reproduce locally.

An alternative of fixing this for the rocm target is to remove the defined macros in this target to make the chunk of code not build for rocm, i.e. "//tensorflow:using_rocm_hipcc": [], @chsigg Let me know if you prefer this alternative fix.

@deven-amd
Copy link
Contributor

posting here for reference, the error(s) we see in the --config=rocm build without this fix

[[32mINFO: ^[[0mFrom Compiling tensorflow/core/kernels/conv_2d_gpu_uint64.cu.cc [for host]:                                                                                                                                                                      
clang-9: warning: /usr/bin/gcc: 'linker' input unused [-Wunused-command-line-argument]                                                                                                                                                                            
clang-9: warning: /usr/bin/gcc: 'linker' input unused [-Wunused-command-line-argument]                                                                                                                                                                            
In file included from tensorflow/core/kernels/conv_2d_gpu_uint64.cu.cc:25:                                                                                                                                                                                        
In file included from ./tensorflow/core/kernels/conv_2d.h:22:                                                                                                                                                                                                     
In file included from ./tensorflow/core/kernels/eigen_spatial_convolutions.h:26:                                                                                                                                                                                  
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'packLhs'                                     
REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK(float, float, float);                                                                                                                                                                                            
^                                                                                                                                                                                                                                                                 
./tensorflow/core/kernels/eigen_contraction_kernel.h:605:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'packLhs'                                     
./tensorflow/core/kernels/eigen_contraction_kernel.h:605:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'packRhs'                                     
./tensorflow/core/kernels/eigen_contraction_kernel.h:624:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'packRhs'                                     
./tensorflow/core/kernels/eigen_contraction_kernel.h:624:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'invoke'                                      
./tensorflow/core/kernels/eigen_contraction_kernel.h:644:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
./tensorflow/core/kernels/eigen_contraction_kernel.h:860:1: error:  'Eigen::internal::UseCustomContractionKernels':  no overloaded function has restriction specifiers that are compatible with the ambient context 'invoke'                                      
./tensorflow/core/kernels/eigen_contraction_kernel.h:644:38: note: expanded from macro 'REGISTER_TENSOR_CONTRACTION_KERNEL_WITH_FALLBACK'                                                                                                                         
      if (UseCustomContractionKernels()) {                                     \                                                                                                                                                                                  
                                     ^                                                                                                                                                                                                                            
6 errors generated.                                                                                                                                                                                                                                               
                                                                                                                                          

One curious aspect of these errors, is that the they do not cause the build to terminate, and the build actually finishes with a non-error exit status! That was the reason we failed to spot it before.

Have you (TF developers) seen this with errors (in Eigen headers) before, where in the errors do not lead to a build failure? just curious.

@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jun 7, 2019
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Jun 7, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jun 7, 2019
@tensorflow-copybara tensorflow-copybara merged commit d07363c into tensorflow:master Jun 11, 2019
PR Queue automation moved this from Approved by Reviewer to Merged Jun 11, 2019
@jerryyin jerryyin deleted the google-upstream-pr-eigen_contraction_kernel branch December 20, 2019 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process size:S CL Change Size: Small
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

8 participants