[ROCm] Update eigen_contraction_kernel subroutine to make it device compatible #29395
Looking at the test results, Linux GPU has a large number of failures on An alternative fix for the ROCm target is to remove the macro definitions from this target so that this chunk of code is not built for ROCm, i.e.
posting here for reference, the error(s) we see in the
One curious aspect of these errors is that they do not cause the build to terminate, and the build actually finishes with a non-error exit status! That is why we failed to spot them before. Have you (TF developers) seen this before with errors in Eigen headers, where the errors do not lead to a build failure? Just curious.
This PR addresses ROCm-specific compilation issues. Below is a summary of the changes.

First, `UseCustomContractionKernels()` should also be declared with `EIGEN_DEVICE_FUNC` and `EIGEN_DONT_INLINE`, to be consistent with the signatures of `packLhs()`, `packRhs()`, and `invoke()`.

Second, `std::call_once()` is not available in GPU device code, and this function should not be invoked on the device in the first place. If it is, it takes the shortcut and returns.

The change has been thoroughly tested under different compilation targets in the develop-upstream branch.
@tatianashp @whchung @chsigg