CUBLAS fails on new CUDA.jl v4 #1852
Comments
Please run with |
Here is the error output: ┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=0
│ value: type=int; val=POINTER (IN HEX:0x0x7f5075c79f90)
│ Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
CUBLASError: ┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=1
│ value: type=int; val=POINTER (IN HEX:0x0x7f5075c79fa0)
│ Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
the GPU program failed to execute┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=2
│ value: type=int; val=POINTER (IN HEX:0x0x7f5075c79fb0)
│ Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
(code 13, CUBLAS_STATUS_EXECUTION_FAILED)┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasCreate_v2(cublasContext**) called:
│ handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0x7ffff6596530)
│ Time: 2023-04-05T09:44:38 elapsed from start 0.200000 minutes or 12.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
Stacktrace:┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasSetStream_v2(cublasHandle_t, cudaStream_t) called:
│ handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│ streamId: type=SOME TYPE; val=POINTER (IN HEX:0x0x1b49680)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x(nil)) (defaultStream); MathMode=CUBLAS_DEFAULT_MATH
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=0
│ value: type=int; val=POINTER (IN HEX:0x0x7f5074134390)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=1
│ value: type=int; val=POINTER (IN HEX:0x0x7f50741343a0)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
[1]┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=2
│ value: type=int; val=POINTER (IN HEX:0x0x7f50741343b0)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=0
│ value: type=int; val=POINTER (IN HEX:0x0x7f50741343c0)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
throw_api_error┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=1
│ value: type=int; val=POINTER (IN HEX:0x0x7f50741343d0)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
(┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│ type: type=SOME TYPE; val=2
│ value: type=int; val=POINTER (IN HEX:0x0x7f50741343e0)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
res┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasSetMathMode(cublasHandle_t, cublasMath_t) called:
│ handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│ mode: type=cublasMath_t; val=CUBLAS_DEFAULT_MATH | CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION(16)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x0x1b49680); MathMode=CUBLAS_DEFAULT_MATH
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
::┌ Debug: cuBLAS (v11.8) function cublasStatus_t cublasGemmEx(cublasHandle_t, cublasOperation_t, cublasOperation_t, int, int, int, const void*, const void*, cudaDataType_t, int, const void*, cudaDataType_t, int, const vo
id*, void*, cudaDataType_t, int, cublasComputeType_t, cublasGemmAlgo_t) called:
│ handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│ transa: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│ transb: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│ m: type=int; val=100
│ n: type=int; val=100
│ k: type=int; val=100
│ alpha: type=void; val=POINTER (IN HEX:0x0x7f5074134410)
│ A: type=void; val=POINTER (IN HEX:0x0x602000000)
│ Atype: type=cudaDataType_t; val=CUDA_R_32F(0)
│ lda: type=int; val=100
│ B: type=void; val=POINTER (IN HEX:0x0x602000000)
│ Btype: type=cudaDataType_t; val=CUDA_R_32F(0)
│ ldb: type=int; val=100
│ beta: type=void; val=POINTER (IN HEX:0x0x7f5074134420)
│ C: type=void; val=POINTER (IN HEX:0x0x602009e00)
│ Ctype: type=cudaDataType_t; val=CUDA_R_32F(0)
│ ldc: type=int; val=100
│ computeType: type=cublasComputeType_t; val=CUBLAS_COMPUTE_32F(68)
│ algo: type=SOME TYPE; val=CUBLAS_GEMM_DEFAULT(-1)
│ Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x0x1b49680); MathMode=CUBLAS_DEFAULT_MATH | CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION
│ COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
CUDA.CUBLAS.cublasStatus_t)
@ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/libcublas.jl:11
[2] macro expansion
@ ~/.julia/packages/CUDA/is36v/lib/cublas/libcublas.jl:24 [inlined]
[3] cublasGemmEx(handle::Ptr{CUDA.CUBLAS.cublasContext}, transa::Char, transb::Char, m::Int64, n::Int64, k::Int64, alpha::Base.RefValue{Float32}, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Atype::Type, lda::Int64, B:
:CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Btype::Type, ldb::Int64, beta::Base.RefValue{Float32}, C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Ctype::Type, ldc::Int64, computeType::CUDA.CUBLAS.cublasComputeType_t, algo
::CUDA.CUBLAS.cublasGemmAlgo_t)
@ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/utils/call.jl:26
[4] gemmEx!(transA::Char, transB::Char, alpha::Number, A::StridedCuVecOrMat, B::StridedCuVecOrMat, beta::Number, C::StridedCuVecOrMat; algo::CUDA.CUBLAS.cublasGemmAlgo_t)
@ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/wrappers.jl:897
[5] gemmEx!
@ ~/.julia/packages/CUDA/is36v/lib/cublas/wrappers.jl:875 [inlined]
[6] gemm_dispatch!(C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, alpha::Bool, beta::Bool)
@ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/linalg.jl:298
[7] mul!
@ ~/.julia/packages/CUDA/is36v/lib/cublas/linalg.jl:309 [inlined]
[8] mul!
@ ~/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:276 [inlined]
[9] *(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ LinearAlgebra ~/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:148
[10] top-level scope
@ REPL[3]:1
[11] top-level scope
@ ~/.julia/packages/CUDA/is36v/src/initialization.jl:162 |
Nothing looks wrong with that. Maybe try upgrading your NVIDIA driver; 11.4 is pretty old. |
I cannot reproduce with:
|
Not sure this is useful to you, but I came to this thread because my CUDA.jl tests fail on CUBLAS 22 times (e.g. However, when I run the a * a test provided here, it succeeds.
|
Can you verify you don't have LD_LIBRARY_PATH set? |
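One way to check this from inside the same Julia session is sketched below. It only reads the environment and does not touch CUDA.jl; `describe_ld_library_path` is a hypothetical helper name introduced here for illustration, not something provided by CUDA.jl.

```julia
# Hypothetical helper (not part of CUDA.jl): report whether LD_LIBRARY_PATH
# is set in the given environment, listing one directory per line.
function describe_ld_library_path(env = ENV)
    value = get(env, "LD_LIBRARY_PATH", "")
    if isempty(value)
        return "LD_LIBRARY_PATH is not set"
    end
    # Split on ':' so that any CUDA toolkit directories are easy to spot.
    entries = filter(!isempty, split(value, ':'))
    return "LD_LIBRARY_PATH is set:\n" * join("  " .* entries, "\n")
end

println(describe_ld_library_path())
```

If the output lists a directory that contains a system CUDA toolkit, that toolkit's libraries may be picked up instead of the ones CUDA.jl downloads, which is presumably why the question was asked.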
|
Can you try upgrading your driver so that CUDA 12.1 can be installed? Maybe this is a CUBLAS bug that got fixed. |
Upgraded the driver (Asus and NVIDIA software did not suggest any new updates, so I downloaded it) and made a fresh project environment in which I added only CUDA. The
It did give a warning:
The a*a test also works. It seems the problems with CUBLAS are resolved. |
That's great! I guess we'll chalk this up to a CUDA/CUBLAS bug, which isn't unreasonable for Lovelace/Hopper (i.e. fairly recent) hardware on older CUDA toolkits. If this still reproduces on recent toolkits, feel free to comment or open a new issue. |
Describe the bug
When any CUBLAS operation is invoked, it crashes. Non-CUBLAS operations seem to work fine. I've tried to use older CUDA runtimes to no avail.
To reproduce
The Minimal Working Example (MWE) for this bug:
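The original MWE was not preserved in this copy of the issue. The sketch below is a reconstruction, not the reporter's code: it assumes a 100×100 `Float32` matrix multiplied by itself, which matches the `cublasGemmEx` call in the log (m = n = k = 100, `CUDA_R_32F`, A and B at the same address) and the "a * a test" mentioned in the comments. The helper name `cublas_matmul_repro` and the use of `CUDA.rand` to fill the matrix are assumptions.

```julia
using CUDA

# Hypothetical helper: multiply an n×n Float32 GPU matrix by itself, which
# dispatches to CUBLAS (gemmEx! in the stack trace in this thread).
# Returns the product copied to the host, or `nothing` when no functional
# GPU is available.
function cublas_matmul_repro(n::Integer = 100)
    CUDA.functional() || return nothing
    a = CUDA.rand(Float32, n, n)
    return Array(a * a)
end

result = cublas_matmul_repro()
println(result === nothing ? "no functional CUDA device" :
                             "matmul succeeded: $(size(result))")
```

On an affected setup the multiplication would be expected to throw the `CUBLASError` shown in the log rather than print the success message.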
Version info
CUDA.jl 4.1.3
Details on Julia:
Details on CUDA: