Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqrt broken in kernels 'Format of __nvvm__reflect function not recognized' #1817

Closed
michel2323 opened this issue Mar 21, 2023 · 1 comment
Closed
Labels
bug Something isn't working

Comments

@michel2323
Copy link

michel2323 commented Mar 21, 2023

Describe the bug

When using sqrt in a kernel CUDA.jl throws the following error:

ERROR: LoadError: Format of __nvvm__reflect function not recognized
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] macro expansion
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/ptx.jl:439 [inlined]
  [3] macro expansion
    @ /disk/mschanen/julia_depot/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
  [4] nvvm_reflect!(fun::LLVM.Function)
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/ptx.jl:413
  [5] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
    @ LLVM /disk/mschanen/julia_depot/packages/LLVM/X1AeZ/src/pass.jl:49
  [6] LLVMRunPassManager
    @ /disk/mschanen/julia_depot/packages/LLVM/X1AeZ/lib/13/libLLVM_h.jl:4898 [inlined]
  [7] run!
    @ /disk/mschanen/julia_depot/packages/LLVM/X1AeZ/src/passmanager.jl:39 [inlined]
  [8] macro expansion
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/ptx.jl:176 [inlined]
  [9] macro expansion
    @ /disk/mschanen/julia_depot/packages/LLVM/X1AeZ/src/base.jl:102 [inlined]
 [10] optimize_module!(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget}, mod::LLVM.Module)
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/ptx.jl:149
 [11] optimize!(job::GPUCompiler.CompilerJob, mod::LLVM.Module)
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/optim.jl:245
 [12] macro expansion
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/driver.jl:366 [inlined]
 [13] macro expansion
    @ /disk/mschanen/julia_depot/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
 [14] macro expansion
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/driver.jl:365 [inlined]
 [15] macro expansion
    @ /disk/mschanen/julia_depot/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
 [16] macro expansion
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/driver.jl:355 [inlined]
 [17] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any; libraries::Bool, deferred_codegen::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool, ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/utils.jl:83
 [18] emit_llvm
    @ /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/utils.jl:77 [inlined]
 [19] compile(job::GPUCompiler.CompilerJob, ctx::LLVM.ThreadSafeContext)
    @ CUDA /disk/mschanen/julia_depot/dev/CUDA/src/compiler/compilation.jl:106
 [20] #203
    @ /disk/mschanen/julia_depot/dev/CUDA/src/compiler/compilation.jl:100 [inlined]
 [21] LLVM.ThreadSafeContext(f::CUDA.var"#203#204"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})
    @ LLVM /disk/mschanen/julia_depot/packages/LLVM/X1AeZ/src/executionengine/ts_module.jl:14
 [22] JuliaContext(f::CUDA.var"#203#204"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/driver.jl:74
 [23] compile
    @ /disk/mschanen/julia_depot/dev/CUDA/src/compiler/compilation.jl:99 [inlined]
 [24] actual_compilation(cache::Dict{UInt64, Any}, key::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, ft::Type, tt::Type, world::UInt64, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/cache.jl:184
 [25] cached_compilation(cache::Dict{UInt64, Any}, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, ft::Type, tt::Type, compiler::Function, linker::Function)
    @ GPUCompiler /disk/mschanen/julia_depot/packages/GPUCompiler/anMCs/src/cache.jl:163
 [26] macro expansion
    @ /disk/mschanen/julia_depot/dev/CUDA/src/compiler/execution.jl:310 [inlined]
 [27] macro expansion
    @ ./lock.jl:267 [inlined]
 [28] cufunction(f::typeof(sqrt_kernel), tt::Type{Tuple{CuDeviceVector{Float64, 1}}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA /disk/mschanen/julia_depot/dev/CUDA/src/compiler/execution.jl:306
 [29] cufunction(f::typeof(sqrt_kernel), tt::Type{Tuple{CuDeviceVector{Float64, 1}}})
    @ CUDA /disk/mschanen/julia_depot/dev/CUDA/src/compiler/execution.jl:303
 [30] macro expansion
    @ /disk/mschanen/julia_depot/dev/CUDA/src/compiler/execution.jl:104 [inlined]
 [31] top-level scope
    @ /disk/mschanen/julia_depot/dev/CUDA/src/utilities.jl:25
in expression starting at /home/mschanen/test/mwe/mwe.jl:11

To reproduce

The Minimal Working Example (MWE) for this bug:

using CUDA

function sqrt_kernel(x)
    tx = threadIdx().x
    if tx <= length(x)
        x[tx] = sqrt(x[tx])
    end
    return nothing
end
x = zeros(10) |> CuArray
CUDA.@sync @cuda sqrt_kernel(x)
Manifest.toml

# This file is machine-generated - editing it directly is not advised

julia_version = "1.9.0-rc1"
manifest_format = "2.0"
project_hash = "3509c5bf235fb7c0326b865a545a502f318a7ac8"

[[deps.AbstractFFTs]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "16b6dbc4cf7caee4e1e75c49485ec67b667098a0"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.3.1"

    [deps.AbstractFFTs.extensions]
    AbstractFFTsChainRulesCoreExt = "ChainRulesCore"

    [deps.AbstractFFTs.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"

[[deps.Adapt]]
deps = ["LinearAlgebra", "Requires"]
git-tree-sha1 = "cc37d689f599e8df4f464b2fa3870ff7db7492ef"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "3.6.1"

    [deps.Adapt.extensions]
    AdaptStaticArraysExt = "StaticArrays"

    [deps.Adapt.weakdeps]
    StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"

[[deps.ArgTools]]
uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
version = "1.1.1"

[[deps.Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"

[[deps.BFloat16s]]
deps = ["LinearAlgebra", "Printf", "Random", "Test"]
git-tree-sha1 = "dbf84058d0a8cbbadee18d25cf606934b22d7c66"
uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b"
version = "0.4.2"

[[deps.Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[deps.CEnum]]
git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90"
uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
version = "0.4.2"

[[deps.CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Preferences", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions"]
path = "/disk/mschanen/julia_depot/dev/CUDA"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "4.0.1"

[[deps.CUDA_Driver_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"]
git-tree-sha1 = "a1efe2bfb96c10906dede10f1580aca0ae6d092f"
uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc"
version = "0.4.0+2"

[[deps.CUDA_Runtime_Discovery]]
deps = ["Libdl"]
git-tree-sha1 = "58dd8ec29f54f08c04b052d2c2fa6760b4f4b3a4"
uuid = "1af6417a-86b4-443c-805f-a4643ffb695f"
version = "0.1.1"

[[deps.CUDA_Runtime_jll]]
deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "de2e366b4e66a8690567d12dec6f550da9addec7"
uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
version = "0.4.0+0"

[[deps.CompilerSupportLibraries_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
version = "1.0.2+0"

[[deps.Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[deps.DocStringExtensions]]
deps = ["LibGit2"]
git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d"
uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
version = "0.9.3"

[[deps.Downloads]]
deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"]
uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
version = "1.6.0"

[[deps.ExprTools]]
git-tree-sha1 = "c1d06d129da9f55715c6c212866f5b1bddc5fa00"
uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
version = "0.1.9"

[[deps.FileWatching]]
uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"

[[deps.GPUArrays]]
deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"]
git-tree-sha1 = "7a2e790b1e2e6f648cfb25c4500c5de1f7b375ef"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "8.6.5"

[[deps.GPUArraysCore]]
deps = ["Adapt"]
git-tree-sha1 = "1cd7f0af1aa58abc02ea1d872953a97359cb87fa"
uuid = "46192b85-c4d5-4398-a991-12ede77f4527"
version = "0.1.4"

[[deps.GPUCompiler]]
deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "fd6431121f31fed05a5386ac88b9bb3f97fdfa69"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.18.0"

[[deps.InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

[[deps.IrrationalConstants]]
git-tree-sha1 = "630b497eafcc20001bba38a4651b327dcfc491d2"
uuid = "92d709cd-6900-40b7-9082-c6be49f344b6"
version = "0.2.2"

[[deps.JLLWrappers]]
deps = ["Preferences"]
git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1"
uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
version = "1.4.1"

[[deps.LLVM]]
deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"]
git-tree-sha1 = "1c614dfbecbaee4897b506bba2b432bf0d21f2ed"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "4.17.0"

[[deps.LLVMExtra_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "e46e3a40daddcbe851f86db0ec4a4a3d4badf800"
uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab"
version = "0.0.19+0"

[[deps.LazyArtifacts]]
deps = ["Artifacts", "Pkg"]
uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"

[[deps.LibCURL]]
deps = ["LibCURL_jll", "MozillaCACerts_jll"]
uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
version = "0.6.3"

[[deps.LibCURL_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
version = "7.84.0+0"

[[deps.LibGit2]]
deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"

[[deps.LibSSH2_jll]]
deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
version = "1.10.2+0"

[[deps.Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[deps.LinearAlgebra]]
deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[[deps.LogExpFunctions]]
deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"]
git-tree-sha1 = "0a1b7c2863e44523180fdb3146534e265a91870b"
uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
version = "0.3.23"

    [deps.LogExpFunctions.extensions]
    LogExpFunctionsChainRulesCoreExt = "ChainRulesCore"
    LogExpFunctionsChangesOfVariablesExt = "ChangesOfVariables"
    LogExpFunctionsInverseFunctionsExt = "InverseFunctions"

    [deps.LogExpFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0"
    InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112"

[[deps.Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"

[[deps.Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

[[deps.MbedTLS_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
version = "2.28.2+0"

[[deps.MozillaCACerts_jll]]
uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
version = "2022.10.11"

[[deps.NetworkOptions]]
uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
version = "1.2.0"

[[deps.OpenBLAS_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"]
uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
version = "0.3.21+4"

[[deps.OpenLibm_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "05823500-19ac-5b8b-9628-191a04bc5112"
version = "0.8.1+0"

[[deps.OpenSpecFun_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1"
uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
version = "0.5.5+0"

[[deps.Pkg]]
deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"]
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
version = "1.9.0"

[[deps.Preferences]]
deps = ["TOML"]
git-tree-sha1 = "47e5f437cc0e7ef2ce8406ce1e7e24d44915f88d"
uuid = "21216c6a-2e73-6563-6e65-726566657250"
version = "1.3.0"

[[deps.Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"

[[deps.REPL]]
deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"

[[deps.Random]]
deps = ["SHA", "Serialization"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[[deps.Random123]]
deps = ["Random", "RandomNumbers"]
git-tree-sha1 = "7a1a306b72cfa60634f03a911405f4e64d1b718b"
uuid = "74087812-796a-5b5d-8853-05524746bad3"
version = "1.6.0"

[[deps.RandomNumbers]]
deps = ["Random", "Requires"]
git-tree-sha1 = "043da614cc7e95c703498a491e2c21f58a2b8111"
uuid = "e6cf234a-135c-5ec9-84dd-332b85af5143"
version = "1.5.3"

[[deps.Reexport]]
git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
uuid = "189a3867-3050-52da-a836-e630ba90ab69"
version = "1.2.2"

[[deps.Requires]]
deps = ["UUIDs"]
git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7"
uuid = "ae029012-a4dd-5104-9daa-d747884805df"
version = "1.3.0"

[[deps.SHA]]
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.7.0"

[[deps.Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[[deps.Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"

[[deps.SparseArrays]]
deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"]
uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"

[[deps.SpecialFunctions]]
deps = ["IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"]
git-tree-sha1 = "ef28127915f4229c971eb43f3fc075dd3fe91880"
uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
version = "2.2.0"

    [deps.SpecialFunctions.extensions]
    SpecialFunctionsChainRulesCoreExt = "ChainRulesCore"

    [deps.SpecialFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"

[[deps.Statistics]]
deps = ["LinearAlgebra", "SparseArrays"]
uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
version = "1.9.0"

[[deps.SuiteSparse_jll]]
deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"]
uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c"
version = "5.10.1+6"

[[deps.TOML]]
deps = ["Dates"]
uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
version = "1.0.3"

[[deps.Tar]]
deps = ["ArgTools", "SHA"]
uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
version = "1.10.0"

[[deps.Test]]
deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[deps.TimerOutputs]]
deps = ["ExprTools", "Printf"]
git-tree-sha1 = "f2fd3f288dfc6f507b0c3a2eb3bac009251e548b"
uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
version = "0.5.22"

[[deps.UUIDs]]
deps = ["Random", "SHA"]
uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[[deps.Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

[[deps.Zlib_jll]]
deps = ["Libdl"]
uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
version = "1.2.13+0"

[[deps.libblastrampoline_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
version = "5.4.0+0"

[[deps.nghttp2_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
version = "1.48.0+0"

[[deps.p7zip_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
version = "17.4.0+0"

Expected behavior

The kernel executes successfully as it has with a previous stack.

Version info

Details on Julia:

# please post the output of:
versioninfo()
Julia Version 1.9.0-rc1
Commit 3b2e0d8fbc1 (2023-03-07 07:51 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 144 × Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
  Threads: 1 on 144 virtual cores
Environment:
  JULIA_FOLDER = /disk/mschanen/julia
  JULIA_HIOP_LIBRARY_PATH = /home/mschanen/local/lib
  LD_LIBRARY_PATH = /home/mschanen/local/lib:
  JULIA_DEPOT_PATH = /disk/mschanen/julia_depot

Details on CUDA:

# please post the output of:
CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.1
NVIDIA driver 525.78.1, originally for CUDA 12.0

Libraries:
- CUBLAS: 12.1.0
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.4
- CUSPARSE: 12.0.2
- CUPTI: 18.0.0
- NVML: 12.0.0+525.78.1

Toolchain:
- Julia: 1.9.0-rc1
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

2 devices:
  0: Quadro GV100 (sm_70, 31.733 GiB / 32.000 GiB available)
  1: Quadro GV100 (sm_70, 31.733 GiB / 32.000 GiB available)

Additional context:

Attached is a Manifest.toml where the kernel still works. Unfortunately, it is from another project.

Edit: Hardware reproduced:

  • V100 (the one filed here in this report) with Julia 1.8 and 1.9
  • RTX 4080 with Windows WSL (also Julia 1.8 and Julia 1.9)
  • A A1000 based system installs the CUDA 11.8 runtime. There it still works.
julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.0
NVIDIA driver 525.89.2

Libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 12.0.0+525.89.2

Toolchain:
- Julia: 1.8.5
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA RTX A1000 Laptop GPU (sm_86, 3.700 GiB / 4.000 GiB available)

Manifest.zip

@michel2323 michel2323 added the bug Something isn't working label Mar 21, 2023
@maleadt
Copy link
Member

maleadt commented Mar 21, 2023

@maleadt maleadt closed this as not planned Won't fix, can't repro, duplicate, stale Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants