
CUDA 11.2 CUBLASError and "CUDA.jl does not yet support CUDA with nvdisasm 11.2.67" #607

Closed · Fixed by #622
alexkyllo opened this issue Dec 24, 2020 · 0 comments
Labels: bug (Something isn't working)

Describe the bug


I am using a fresh install of Ubuntu 20.04 with CUDA 11.2. The CUDA toolkit is installed and I can run the NVIDIA samples and compile CUDA code with nvcc successfully.

CUDA.jl does not work, though. With the default artifact installation, it downloads and installs the CUDA 11.1 artifacts, and when I then attempt to fit a model using Flux, I get the following CUBLASError:

[ Info: Loading data set
[ Info: Building model...
ERROR: CUBLASError: an absent device architectural feature is required (code 8, CUBLAS_STATUS_ARCH_MISMATCH)
Stacktrace:
 [1] throw_api_error(::CUDA.CUBLAS.cublasStatus_t) at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/error.jl:47
 [2] macro expansion at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/error.jl:58 [inlined]
 [3] cublasGemmEx(::Ptr{Nothing}, ::Char, ::Char, ::Int64, ::Int64, ::Int64, ::Base.RefValue{Float32}, ::CuArray{Float32,2}, ::Type{T} where T, ::Int64, ::CuArray{Float32,2}, ::Type{T} where T, ::Int64, ::Base.RefValue{Float32}, ::CuArray{Float32,2}, ::Type{T} where T, ::Int64, ::CUDA.CUBLAS.cublasComputeType_t, ::CUDA.CUBLAS.cublasGemmAlgo_t) at /home/alex/.julia/packages/CUDA/YeS8q/lib/utils/call.jl:93
 [4] gemmEx!(::Char, ::Char, ::Number, ::Union{CuArray{T,1}, CuArray{T,2}} where T, ::Union{CuArray{T,1}, CuArray{T,2}} where T, ::Number, ::Union{CuArray{T,1}, CuArray{T,2}} where T; algo::CUDA.CUBLAS.cublasGemmAlgo_t) at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/wrappers.jl:836
 [5] gemmEx! at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/wrappers.jl:818 [inlined]
 [6] gemm_dispatch!(::CuArray{Float32,2}, ::CuArray{Float32,2}, ::CuArray{Float32,2}, ::Bool, ::Bool) at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/linalg.jl:216
 [7] mul! at /home/alex/.julia/packages/CUDA/YeS8q/lib/cublas/linalg.jl:227 [inlined]
 [8] mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:208 [inlined]
 [9] * at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:160 [inlined]
 [10] (::Flux.Dense{typeof(identity),CuArray{Float32,2},CuArray{Float32,1}})(::CuArray{Float32,2}) at /home/alex/.julia/packages/Flux/q3zeA/src/layers/basic.jl:123
 [11] Dense at /home/alex/.julia/packages/Flux/q3zeA/src/layers/basic.jl:134 [inlined]
 [12] applychain at /home/alex/.julia/packages/Flux/q3zeA/src/layers/basic.jl:36 [inlined] (repeats 8 times)
 [13] (::Flux.Chain{Tuple{Flux.Conv{2,2,typeof(NNlib.relu),CuArray{Float32,4},CuArray{Float32,1}},Flux.MaxPool{2,4},Flux.Conv{2,2,typeof(NNlib.relu),CuArray{Float32,4},CuArray{Float32,1}},Flux.MaxPool{2,4},Flux.Conv{2,2,typeof(NNlib.relu),CuArray{Float32,4},CuArray{Float32,1}},Flux.MaxPool{2,4},typeof(Flux.flatten),Flux.Dense{typeof(identity),CuArray{Float32,2},CuArray{Float32,1}}}})(::CuArray{Float32,4}) at /home/alex/.julia/packages/Flux/q3zeA/src/layers/basic.jl:38
 [14] train(; kws::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/alex/.julia/dev/FakeFaces/src/FakeFaces.jl:106
 [15] train() at /home/alex/.julia/dev/FakeFaces/src/FakeFaces.jl:90
 [16] top-level scope at REPL[4]:1

If I disable the artifact build and use my system CUDA by setting JULIA_CUDA_USE_BINARYBUILDER=false, I get a different error:

julia> CUDA.versioninfo()
ERROR: CUDA.jl does not yet support CUDA with nvdisasm 11.2.67; please file an issue.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] parse_toolkit_version(::String, ::String) at /home/alex/.julia/packages/CUDA/YeS8q/deps/discovery.jl:394
 [3] use_local_cuda() at /home/alex/.julia/packages/CUDA/YeS8q/deps/bindeps.jl:204
 [4] __init_dependencies__() at /home/alex/.julia/packages/CUDA/YeS8q/deps/bindeps.jl:370
 [5] __runtime_init__() at /home/alex/.julia/packages/CUDA/YeS8q/src/initialization.jl:114
 [6] macro expansion at /home/alex/.julia/packages/CUDA/YeS8q/src/initialization.jl:32 [inlined]
 [7] macro expansion at ./lock.jl:183 [inlined]
 [8] _functional(::Bool) at /home/alex/.julia/packages/CUDA/YeS8q/src/initialization.jl:26
 [9] functional(::Bool) at /home/alex/.julia/packages/CUDA/YeS8q/src/initialization.jl:19
 [10] macro expansion at /home/alex/.julia/packages/CUDA/YeS8q/src/initialization.jl:50 [inlined]
 [11] toolkit_version at /home/alex/.julia/packages/CUDA/YeS8q/deps/bindeps.jl:25 [inlined]
 [12] versioninfo(::Base.TTY) at /home/alex/.julia/packages/CUDA/YeS8q/src/utilities.jl:43 (repeats 2 times)
 [13] top-level scope at REPL[3]:1
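For reference, a minimal sketch of how the local-toolkit path above was selected in-session (the variable must be set before CUDA.jl loads; requires this machine's CUDA 11.2 installation):

```julia
# Opt out of the artifact toolkit BEFORE loading CUDA.jl, so that it
# discovers the system CUDA 11.2 installation instead.
ENV["JULIA_CUDA_USE_BINARYBUILDER"] = "false"

using CUDA
CUDA.versioninfo()  # errors: "CUDA.jl does not yet support CUDA with nvdisasm 11.2.67"
```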

Is there a fix for this, or will I need to downgrade my system NVIDIA driver and CUDA version to 11.1 in order to use CUDA.jl?
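Judging from the stacktrace, the second error comes from parse_toolkit_version in deps/discovery.jl, which apparently maps a tool's reported version string onto a list of known toolkit releases and refuses unknown ones. A simplified, hypothetical sketch of that behavior (KNOWN_TOOLKITS and the function name are illustrative, not CUDA.jl's actual code):

```julia
# Hypothetical simplification: map a tool's version string (e.g. from
# nvdisasm) to a known toolkit release, erroring on unrecognized versions.
const KNOWN_TOOLKITS = [v"10.2", v"11.0", v"11.1"]  # 11.2 not yet listed

function parse_toolkit_version_sketch(verstr::AbstractString)
    m = match(r"(\d+)\.(\d+)\.(\d+)", verstr)
    m === nothing && error("could not parse version from $verstr")
    ver = VersionNumber(parse(Int, m[1]), parse(Int, m[2]))
    ver in KNOWN_TOOLKITS ||
        error("CUDA.jl does not yet support CUDA with nvdisasm $verstr; please file an issue.")
    return ver
end
```

If this is roughly what happens, supporting a new toolkit release amounts to adding it to the known-version list, which would explain the "please file an issue" wording.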

To reproduce

The Minimal Working Example (MWE) for this bug:

The code from the Flux model zoo found here, copied verbatim: https://github.com/FluxML/model-zoo/blob/master/vision/mnist/conv.jl

It blows up at line 103:

model(train_set[1][1])
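The failing call can likely be reduced further to a bare GPU matrix multiply, which goes through the same gemmEx! path shown in the stacktrace (a sketch; requires a CUDA-capable GPU and the affected configuration):

```julia
using CUDA, LinearAlgebra

# A plain Float32 GEMM on the GPU; LinearAlgebra.mul! dispatches through
# gemm_dispatch! to CUDA.CUBLAS.gemmEx!, the frame that raises
# CUBLAS_STATUS_ARCH_MISMATCH in the stacktrace above.
A = CUDA.rand(Float32, 32, 32)
B = CUDA.rand(Float32, 32, 32)
C = similar(A)
mul!(C, A, B)  # throws CUBLASError on the affected setup
```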
Manifest.toml


[[CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "DataStructures", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "Libdl", "LinearAlgebra", "Logging", "MacroTools", "NNlib", "Pkg", "Printf", "Random", "Reexport", "Requires", "SparseArrays", "Statistics", "TimerOutputs"]
git-tree-sha1 = "7663b61782b569b03fba91d330a5ed2f86cd4cb8"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "2.3.0"

[[GPUArrays]]
deps = ["AbstractFFTs", "Adapt", "LinearAlgebra", "Printf", "Random", "Serialization"]
git-tree-sha1 = "2c1dd57bca7ba0b3b4bf81d9332aeb81b154ef4c"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "6.1.2"

[[GPUCompiler]]
deps = ["DataStructures", "InteractiveUtils", "LLVM", "Libdl", "Scratch", "Serialization", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "c853c810b52a80f9aad79ab109207889e57f41ef"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.8.3"

[[LLVM]]
deps = ["CEnum", "Libdl", "Printf", "Unicode"]
git-tree-sha1 = "a2101830a761d592b113129887fda626387f68d4"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "3.5.1"

Expected behavior

Expected these exceptions not to be thrown.

Version info

Details on Julia (output of versioninfo()):

Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 7 3700X 8-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver2)

Details on CUDA (output of CUDA.versioninfo()):

CUDA toolkit 11.1.1, artifact installation
CUDA driver 11.2.0
NVIDIA driver 460.27.4

Libraries: 
- CUBLAS: 11.3.1
- CURAND: 10.2.2
- CUFFT: 10.3.0
- CUSOLVER: 11.0.1
- CUSPARSE: 11.3.0
- CUPTI: 14.0.0
- NVML: 11.0.0+460.27.4
- CUDNN: 8.0.4 (for CUDA 11.1.0)
- CUTENSOR: 1.2.1 (for CUDA 11.1.0)

Toolchain:
- Julia: 1.5.3
- LLVM: 9.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
- Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75

1 device:
  0: GeForce RTX 2060 SUPER (sm_75, 6.403 GiB / 7.792 GiB available)