[ROCm] Fix for ROCm CSB Breakage - 201022 #44233
Merged
The following commit introduces a failure in the //tensorflow/python/kernel_tests:linalg_grad_test_gpu test on the ROCm platform: 3ce466a

The failure is in the MatrixExponentialGradient subtest, and the errors we get are of the following form:

The regression was fixed by the subsequent commit, 7d57263, and then re-introduced when parts of the above commits were rolled back by the following commit: fb22dff
The regression seems to occur on the ROCm platform because the "MatrixSolve" operator is currently only enabled for the CUDA platform for GPUs, and not on the ROCm platform.
This commit temporarily disables the subtest on the ROCm platform so that the ROCm CSB passes. It can be reverted once the rolled-back changes are put back in.
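
For illustration only, one way to disable a single dynamically registered subtest on a ROCm build might look like the sketch below. It is not the actual diff in this PR. The names `is_built_with_rocm`, `add_test`, `register_tests`, and `LinalgGradTest` are hypothetical stand-ins so that the snippet runs without TensorFlow; in the real test file the build check would presumably come from TensorFlow's test utilities (for example `tf.test.is_built_with_rocm()`).

```python
import unittest


def is_built_with_rocm():
    """Hypothetical stand-in for TensorFlow's ROCm build check."""
    return True  # assumption: pretend this is a ROCm build


def add_test(test_class, name, test_fn):
    """Attach test_fn to test_class under the name test_<name>."""
    setattr(test_class, "test_" + name, test_fn)


class LinalgGradTest(unittest.TestCase):
    """Empty container; subtests are attached dynamically."""


def register_tests(test_class, rocm_build):
    """Register subtests, skipping the one that needs MatrixSolve on ROCm.

    Returns the sorted list of registered test method names.
    """
    # Placeholder test bodies; the real tests would check gradients.
    add_test(test_class, "MatrixInverseGradient", lambda self: None)
    if not rocm_build:
        # MatrixSolve is not enabled on ROCm, so this subtest is only
        # registered on non-ROCm builds. Remove the guard once the
        # rolled-back changes are restored.
        add_test(test_class, "MatrixExponentialGradient", lambda self: None)
    return sorted(n for n in vars(test_class) if n.startswith("test_"))


registered = register_tests(LinalgGradTest, is_built_with_rocm())
```

Guarding the registration, rather than deleting the subtest, keeps the change small and easy to revert.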
/cc @cheshire @chsigg @nvining-work