CompatTest.test_cuda_eigh_cusolver_syev_dtype=f32_syevd: skip for CUDA 12.4+ #21258

Merged

@olupton commented May 16, 2024

This test assumes that a workspace size queried from an older cuSOLVER version can be used with a newer cuSOLVER version. This is not safe and leads to an error when upgrading the CUDA toolkit from 12.3 to 12.4.1:
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: CustomCall failed: jaxlib/gpu/solver_kernels.cc:485: operation gpusolverDnSsyevd(handle.get(), jobz, d.uplo, d.n, a, d.n, w, static_cast<float*>(work), d.lwork, info) failed: cuSolver execution failed
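
The failure mode described above can be sketched with a toy model in plain Python. This is only an illustration under stated assumptions: the size formulas, `query_lwork`, `run_syevd`, and `WorkspaceTooSmall` are all hypothetical and do not reflect the real cuSOLVER API or its actual workspace requirements. The point is only that a size recorded against one library version need not satisfy another.

```python
# Toy model only: the size formulas and the version-to-size mapping below are
# invented for illustration and do not reflect real cuSOLVER behaviour.


class WorkspaceTooSmall(RuntimeError):
    """Stands in for the 'cuSolver execution failed' error."""


# Hypothetical workspace requirement (in elements) for each library version.
_REQUIRED_LWORK = {
    (11, 5): lambda n: 1 + 6 * n + 2 * n * n,
    (11, 6): lambda n: 1 + 8 * n + 3 * n * n,
}


def query_lwork(version, n):
    """Mimic a buffer-size query against a given library version."""
    return _REQUIRED_LWORK[version](n)


def run_syevd(version, n, lwork):
    """Mimic the solver call: fail if the supplied workspace is too small."""
    if lwork < query_lwork(version, n):
        raise WorkspaceTooSmall(f"lwork={lwork} too small for version {version}")
    return "ok"


n = 4
# Size recorded when the test data was exported with the older version.
baked_in = query_lwork((11, 5), n)

# Same version as the one that was queried: the call succeeds.
assert run_syevd((11, 5), n, baked_in) == "ok"

# Newer version with the stale size: the call fails.
try:
    run_syevd((11, 6), n, baked_in)
    stale_ok = True
except WorkspaceTooSmall:
    stale_ok = False

# Querying the installed version at run time avoids the mismatch.
fresh_ok = run_syevd((11, 6), n, query_lwork((11, 6), n)) == "ok"
```

In this model the stale size is rejected by the newer version, while a size queried at run time is accepted.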

Eventually the workspace size query + allocation should be moved out of JAX and into an XLA optimisation pass.

cc: @hawkinsp @nouiz

@hawkinsp requested a review from gnecula May 21, 2024 12:30
cuSolver 11.6 is shipped with CUDA 12.4. The test assumes that a
workspace size baked in with an older version of cuSolver can be used
with a newer version of cuSolver. This is not safe, and leads to an
error when upgrading cuSolver from 11.5 to 11.6.
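
A version-gated skip of this kind might look roughly like the following, using only the standard `unittest` module. This is a sketch, not the code in this PR: `get_cusolver_version` is a hypothetical helper, hard-coded here so the example runs anywhere, and the `(11, 6)` threshold is taken from the version numbers in the commit message.

```python
import unittest


def get_cusolver_version():
    """Hypothetical helper: a real suite would query the installed library."""
    return (11, 6)


class CompatTestSketch(unittest.TestCase):
    def test_eigh_with_baked_in_workspace(self):
        # Skip when the installed version is newer than the one whose
        # workspace size was baked into the serialized test data.
        if get_cusolver_version() >= (11, 6):
            self.skipTest("baked-in workspace size is not valid for cuSolver 11.6+")
        # The real test body would run the serialized custom call here.


# Run the sketch programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompatTestSketch)
result = unittest.TestResult()
suite.run(result)
```

With the hard-coded version above, the test is recorded as skipped rather than failed, so the run as a whole still counts as successful.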
@olupton force-pushed the skip-cusolver-test-with-cuda-12.4 branch from e0358fc to 9ba77f8 May 21, 2024 14:49
@google-ml-butler bot added the kokoro:force-run and pull ready labels May 21, 2024
@copybara-service bot merged commit 5350bc9 into google:main May 21, 2024
13 checks passed
@olupton deleted the skip-cusolver-test-with-cuda-12.4 branch May 22, 2024 09:49