CUBLAS fails on new CUDA.jl v4 #1852

Closed
marmarelis opened this issue Apr 3, 2023 · 10 comments

Labels
bug (Something isn't working) · needs information (Further information is requested)
Comments

marmarelis commented Apr 3, 2023

Describe the bug

When any CUBLAS operation is invoked, it crashes. Non-CUBLAS operations seem to work fine. I've tried to use older CUDA runtimes to no avail.

To reproduce

The Minimal Working Example (MWE) for this bug:

using CUDA
a = randn(100, 100) |> cu
a * a
ERROR: CUBLASError: the GPU program failed to execute (code 13, CUBLAS_STATUS_EXECUTION_FAILED)
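
(For contrast, a minimal sketch of CUBLAS-free operations that exercise the "non-CUBLAS operations work fine" claim; broadcasts and reductions are lowered to CUDA.jl's own native kernels rather than calling into CUBLAS:)

using CUDA
a = cu(randn(100, 100))
a .+ a   # broadcast: compiled to a native CUDA kernel, no CUBLAS involved
sum(a)   # reduction: likewise handled by CUDA.jl's own kernels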

Version info

CUDA.jl 4.1.3

Details on Julia:

Julia Version 1.8.0
Commit 5544a0fab76 (2022-08-17 13:38 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 72 × Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, cascadelake)
  Threads: 1 on 72 virtual cores

Details on CUDA:

CUDA runtime 11.8, artifact installation
CUDA driver 11.4
NVIDIA driver 470.57.2

Libraries: 
- CUBLAS: 11.5.1
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 11.0.0+470.57.2

Toolchain:
- Julia: 1.8.0
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

4 devices:
  0: NVIDIA GeForce RTX 2080 Ti (sm_75, 10.757 GiB / 10.761 GiB available)
  1: NVIDIA GeForce RTX 2080 Ti (sm_75, 10.757 GiB / 10.761 GiB available)
  2: NVIDIA GeForce RTX 2080 Ti (sm_75, 10.757 GiB / 10.761 GiB available)
  3: NVIDIA GeForce RTX 2080 Ti (sm_75, 10.757 GiB / 10.761 GiB available)
marmarelis added the bug (Something isn't working) label Apr 3, 2023
maleadt added the needs information (Further information is requested) label Apr 4, 2023
maleadt (Member) commented Apr 4, 2023

Please run with JULIA_DEBUG=CUBLAS and post the output.
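
For example, from a shell (a minimal sketch; any equivalent way of setting the environment variable works):

$ JULIA_DEBUG=CUBLAS julia
julia> using CUDA
julia> a = cu(randn(100, 100));
julia> a * a   # each cuBLAS API call is now traced in the debug log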

marmarelis (Author) commented Apr 5, 2023

Here is the error output:

┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=0
│   value: type=int; val=POINTER (IN HEX:0x0x7f5075c79f90)
│  Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=1
│   value: type=int; val=POINTER (IN HEX:0x0x7f5075c79fa0)
│  Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=2
│   value: type=int; val=POINTER (IN HEX:0x0x7f5075c79fb0)
│  Time: 2023-04-05T09:44:37 elapsed from start 0.183333 minutes or 11.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasCreate_v2(cublasContext**) called:
│   handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0x7ffff6596530)
│  Time: 2023-04-05T09:44:38 elapsed from start 0.200000 minutes or 12.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasSetStream_v2(cublasHandle_t, cudaStream_t) called:
│   handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│   streamId: type=SOME TYPE; val=POINTER (IN HEX:0x0x1b49680)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x(nil)) (defaultStream); MathMode=CUBLAS_DEFAULT_MATH
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=0
│   value: type=int; val=POINTER (IN HEX:0x0x7f5074134390)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=1
│   value: type=int; val=POINTER (IN HEX:0x0x7f50741343a0)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=2
│   value: type=int; val=POINTER (IN HEX:0x0x7f50741343b0)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=0
│   value: type=int; val=POINTER (IN HEX:0x0x7f50741343c0)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=1
│   value: type=int; val=POINTER (IN HEX:0x0x7f50741343d0)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGetProperty(libraryPropertyType, int*) called:
│   type: type=SOME TYPE; val=2
│   value: type=int; val=POINTER (IN HEX:0x0x7f50741343e0)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x(nil))
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasSetMathMode(cublasHandle_t, cublasMath_t) called:
│   handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│   mode: type=cublasMath_t; val=CUBLAS_DEFAULT_MATH | CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION(16)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x0x1b49680); MathMode=CUBLAS_DEFAULT_MATH
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224
┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGemmEx(cublasHandle_t, cublasOperation_t, cublasOperation_t, int, int, int, const void*, const void*, cudaDataType_t, int, const void*, cudaDataType_t, int, const void*, void*, cudaDataType_t, int, cublasComputeType_t, cublasGemmAlgo_t) called:
│   handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│   transa: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│   transb: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│   m: type=int; val=100
│   n: type=int; val=100
│   k: type=int; val=100
│   alpha: type=void; val=POINTER (IN HEX:0x0x7f5074134410)
│   A: type=void; val=POINTER (IN HEX:0x0x602000000)
│   Atype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   lda: type=int; val=100
│   B: type=void; val=POINTER (IN HEX:0x0x602000000)
│   Btype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   ldb: type=int; val=100
│   beta: type=void; val=POINTER (IN HEX:0x0x7f5074134420)
│   C: type=void; val=POINTER (IN HEX:0x0x602009e00)
│   Ctype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   ldc: type=int; val=100
│   computeType: type=cublasComputeType_t; val=CUBLAS_COMPUTE_32F(68)
│   algo: type=SOME TYPE; val=CUBLAS_GEMM_DEFAULT(-1)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x0x1b49680); MathMode=CUBLAS_DEFAULT_MATH | CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224

CUBLASError: the GPU program failed to execute (code 13, CUBLAS_STATUS_EXECUTION_FAILED)
Stacktrace:
  [1] throw_api_error(res::CUDA.CUBLAS.cublasStatus_t)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/libcublas.jl:11
  [2] macro expansion
    @ ~/.julia/packages/CUDA/is36v/lib/cublas/libcublas.jl:24 [inlined]
  [3] cublasGemmEx(handle::Ptr{CUDA.CUBLAS.cublasContext}, transa::Char, transb::Char, m::Int64, n::Int64, k::Int64, alpha::Base.RefValue{Float32}, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Atype::Type, lda::Int64, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Btype::Type, ldb::Int64, beta::Base.RefValue{Float32}, C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Ctype::Type, ldc::Int64, computeType::CUDA.CUBLAS.cublasComputeType_t, algo::CUDA.CUBLAS.cublasGemmAlgo_t)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/utils/call.jl:26
  [4] gemmEx!(transA::Char, transB::Char, alpha::Number, A::StridedCuVecOrMat, B::StridedCuVecOrMat, beta::Number, C::StridedCuVecOrMat; algo::CUDA.CUBLAS.cublasGemmAlgo_t)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/wrappers.jl:897
  [5] gemmEx!
    @ ~/.julia/packages/CUDA/is36v/lib/cublas/wrappers.jl:875 [inlined]
  [6] gemm_dispatch!(C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, alpha::Bool, beta::Bool)
    @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/linalg.jl:298
  [7] mul!
    @ ~/.julia/packages/CUDA/is36v/lib/cublas/linalg.jl:309 [inlined]
  [8] mul!
    @ ~/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:276 [inlined]
  [9] *(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ LinearAlgebra ~/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:148
 [10] top-level scope
    @ REPL[3]:1
 [11] top-level scope
    @ ~/.julia/packages/CUDA/is36v/src/initialization.jl:162

maleadt (Member) commented Apr 7, 2023

┌ Debug:  cuBLAS (v11.8) function cublasStatus_t cublasGemmEx(cublasHandle_t, cublasOperation_t, cublasOperation_t, int, int, int, const void*, const void*, cudaDataType_t, int, const void*, cudaDataType_t, int, const void*, void*, cudaDataType_t, int, cublasComputeType_t, cublasGemmAlgo_t) called:
│   handle: type=cublasHandle_t; val=POINTER (IN HEX:0x0xa2cf1a0)
│   transa: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│   transb: type=cublasOperation_t; val=CUBLAS_OP_N(0)
│   m: type=int; val=100
│   n: type=int; val=100
│   k: type=int; val=100
│   alpha: type=void; val=POINTER (IN HEX:0x0x7f5074134410)
│   A: type=void; val=POINTER (IN HEX:0x0x602000000)
│   Atype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   lda: type=int; val=100
│   B: type=void; val=POINTER (IN HEX:0x0x602000000)
│   Btype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   ldb: type=int; val=100
│   beta: type=void; val=POINTER (IN HEX:0x0x7f5074134420)
│   C: type=void; val=POINTER (IN HEX:0x0x602009e00)
│   Ctype: type=cudaDataType_t; val=CUDA_R_32F(0)
│   ldc: type=int; val=100
│   computeType: type=cublasComputeType_t; val=CUBLAS_COMPUTE_32F(68)
│   algo: type=SOME TYPE; val=CUBLAS_GEMM_DEFAULT(-1)
│  Time: 2023-04-05T09:44:42 elapsed from start 0.266667 minutes or 16.000000 seconds
│ Process=20286; Thread=139997437605696; GPU=0; Handle=POINTER (IN HEX:0x0xa2cf1a0); StreamId=POINTER (IN HEX:0x0x1b49680); MathMode=CUBLAS_DEFAULT_MATH | CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION
│  COMPILED WITH: GNU GCC/G++ / 6.3.1 20170216 (Red Hat 6.3.1-3)
└ @ CUDA.CUBLAS ~/.julia/packages/CUDA/is36v/lib/cublas/CUBLAS.jl:224

Nothing looks wrong with that. Maybe try upgrading your NVIDIA driver; 11.4 is pretty old.
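
(For reference, one way to check the installed driver, and the newest CUDA version it supports, is nvidia-smi; a sketch using its standard query flags, with illustrative output:)

$ nvidia-smi --query-gpu=driver_version --format=csv,noheader
470.57.02
$ nvidia-smi | head -n 3   # the banner also reports "CUDA Version: ...", the newest runtime the driver supports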

bjarthur (Contributor) commented:

I cannot reproduce this with:

julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.1
NVIDIA driver 530.30.2

Libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 12.0.0+530.30.2

Toolchain:
- Julia: 1.8.5
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

2 devices:
  0: NVIDIA TITAN RTX (sm_75, 23.246 GiB / 24.000 GiB available)
  1: NVIDIA TITAN RTX (sm_75, 23.636 GiB / 24.000 GiB available)

oscarvdvelde commented Jun 5, 2023

Not sure if this is useful for you; I came to this thread because my CUDA.jl test suite fails 22 times on CUBLAS, e.g.
Test Failed at C:\Users\oscar\.julia\packages\CUDA\pCcGc\test\cublas.jl:1649 Expression: ≈(C, Array(dC), rtol = rtol)
Maybe I need a clean install, because it could not update certain packages.

However, when I run the a * a test provided here, it succeeds.
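
(For what it's worth, a minimal sketch of re-testing CUDA.jl from a throwaway environment, as an alternative to a full clean install:)

julia> using Pkg
julia> Pkg.activate(; temp=true)   # fresh, disposable environment
julia> Pkg.add("CUDA")
julia> Pkg.test("CUDA")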

┌ Info: System information:
│ CUDA runtime 11.8, artifact installation
│ CUDA driver 11.4
│ NVIDIA driver 471.86.0
│ 
│ CUDA libraries:
│ - CUBLAS: 11.11.3
│ - CURAND: 10.3.0
│ - CUFFT: 10.9.0
│ - CUSOLVER: 11.4.1
│ - CUSPARSE: 11.7.5
│ - CUPTI: 18.0.0
│ - NVML: 11.0.0+471.86
│ 
│ Julia packages:
│ - CUDA.jl: 4.3.2
│ - CUDA_Driver_jll: 0.5.0+1
│ - CUDA_Runtime_jll: 0.6.0+0
│ - CUDA_Runtime_Discovery: 0.2.2
│ 
│ Toolchain:
│ - Julia: 1.9.0
│ - LLVM: 14.0.6
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4
│ - Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
│ 
│ 1 device:
└   0: NVIDIA RTX A3000 Laptop GPU (sm_86, 5.875 GiB / 6.000 GiB available)

maleadt (Member) commented Jun 5, 2023

Can you verify you don't have LD_LIBRARY_PATH set?

oscarvdvelde commented Jun 5, 2023

> Can you verify you don't have LD_LIBRARY_PATH set?

julia> ENV["LD_LIBRARY_PATH"]
ERROR: KeyError: key "LD_LIBRARY_PATH" not found

maleadt (Member) commented Jun 5, 2023

Can you try upgrading your driver so that CUDA 12.1 can be installed? Maybe this is a CUBLAS bug that got fixed.
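
(After a driver upgrade, CUDA.jl normally selects the newest compatible runtime on its own; if it doesn't, the runtime can be pinned explicitly via CUDA.jl v4's preference mechanism. A sketch:)

julia> using CUDA
julia> CUDA.set_runtime_version!(v"12.1")   # writes LocalPreferences.toml; restart Julia for it to take effect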

oscarvdvelde commented Jun 5, 2023

Upgraded the driver (the Asus and NVIDIA software did not suggest any new updates, so I downloaded it manually) and made a fresh project environment in which I added only CUDA.

The CUDA test suite completed successfully!

⌅ [856f044c] MKL_jll v2022.2.0+0
Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading.  
Precompiling project...
  31 dependencies successfully precompiled in 65 seconds. 29 already precompiled.
     Testing Running tests...
┌ Info: System information:
│ CUDA runtime 12.1, artifact installation
│ CUDA driver 12.0
│ NVIDIA driver 528.2.0
│ 
│ CUDA libraries: 
│ - CUBLAS: 12.1.3
│ - CURAND: 10.3.2
│ - CUFFT: 11.0.2
│ - CUSOLVER: 11.4.5
│ - CUSPARSE: 12.1.0
│ - CUPTI: 18.0.0
│ - NVML: 12.0.0+528.2
│ 
│ Julia packages:
│ - CUDA.jl: 4.3.2
│ - CUDA_Driver_jll: 0.5.0+1
│ - CUDA_Runtime_jll: 0.6.0+0
│ - CUDA_Runtime_Discovery: 0.2.2
│ 
│ Toolchain:
│ - Julia: 1.9.0
│ - LLVM: 14.0.6
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
│ - Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
│ 
│ 1 device:
└   0: NVIDIA RTX A3000 Laptop GPU (sm_86, 5.863 GiB / 6.000 GiB available)
Testing finished in 10 minutes, 23 seconds, 641 milliseconds

Test Summary: |  Pass  Broken  Total  Time
  Overall     | 20340       9  20349
    SUCCESS
     Testing CUDA tests passed 

It did give a warning:

cufft                                         (8) |    41.69 |   0.01 |  0.0 |     144.27 |      N/A |   3.38 |  8.1 |    1936.71 |  4226.76 |
      From worker 7:    WARNING: Method definition #4326#kernel(Any) in module Main at C:\Users\oscar\.julia\packages\CUDA\pCcGc\test\execution.jl:321 overwritten at C:\Users\oscar\.julia\packages\CUDA\pCcGc\test\execution.jl:329.

The a * a test also works. It seems the problems with CUBLAS are resolved.

maleadt (Member) commented Jun 5, 2023

That's great! I guess we'll chalk this up to a CUDA/CUBLAS bug, which isn't unreasonable for Lovelace/Hopper (i.e. fairly recent) hardware on older CUDA toolkits.

If this still reproduces on recent toolkits, feel free to comment or open a new issue.

maleadt closed this as completed Jun 5, 2023