This repository has been archived by the owner on Mar 12, 2021. It is now read-only.

import CuArrays always fails with CUDA 10.2.89 (but works fine with CUDA 10.0.130 and 10.1.105) #601

Closed
DilumAluthge opened this issue Feb 24, 2020 · 14 comments

Comments


DilumAluthge commented Feb 24, 2020

Summary

I am unable to run import CuArrays with CUDA 10.2.89. However, I am able to successfully run import CuArrays with either CUDA 10.0.130 or CUDA 10.1.105 on the same cluster. (This is an HPC cluster with multiple different versions of CUDA available.)

The error I get looks like this:

┌ Error: CuArrays.jl failed to initialize
│   exception =
│    could not load library "libcublas"
│    libcublas.so: cannot open shared object file: No such file or directory
│    Stacktrace:
│     [1] dlopen(::String, ::UInt32; throw_error::Bool) at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
│     [2] dlopen at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
│     [3] (::CuArrays.CUBLAS.var"#509#lookup_fptr#28")() at /users/daluthge/.julia/packages/CUDAapi/wYUAO/src/call.jl:29
│     [4] macro expansion at /users/daluthge/.julia/packages/CUDAapi/wYUAO/src/call.jl:37 [inlined]
│     [5] macro expansion at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/error.jl:65 [inlined]
│     [6] cublasGetProperty at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/libcublas.jl:27 [inlined]
│     [7] cublasGetProperty at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/wrappers.jl:38 [inlined]
│     [8] version() at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/wrappers.jl:42
│     [9] __init__() at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/CuArrays.jl:98
│     [10] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:697
│     [11] _require_from_serialized(::String) at ./loading.jl:748
│     [12] _require(::Base.PkgId) at ./loading.jl:1039
│     [13] require(::Base.PkgId) at ./loading.jl:927
│     [14] require(::Module, ::Symbol) at ./loading.jl:922
│     [15] eval(::Module, ::Any) at ./boot.jl:331
│     [16] eval_user_input(::Any, ::REPL.REPLBackend) at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:118
│     [17] macro expansion at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:150 [inlined]
│     [18] (::REPL.var"#31#32"{REPL.REPLBackend})() at ./task.jl:358
└ @ CuArrays ~/.julia/packages/CuArrays/HE8G6/src/CuArrays.jl:141

How to reproduce

First run these commands in Bash:

export JULIA_CUDA_VERBOSE="true"
export JULIA_DEBUG="all"
rm -rf ~/.julia

Then open Julia and run the following:

julia> versioninfo(verbose = true)

julia> import Pkg

julia> Pkg.add("CuArrays")

julia> import CuArrays

Full output

CUDA 10.2.89: (fails)

$ which nvcc
/gpfs/runtime/opt/cuda/10.2/cuda/bin/nvcc

$ which nvdisasm
/gpfs/runtime/opt/cuda/10.2/cuda/bin/nvdisasm

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

$ nvdisasm --version
nvdisasm: NVIDIA (R) CUDA disassembler
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:25:30_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0-DEV.274 (2020-02-15)
 _/ |\__'_|_|_|\__'_|  |  Commit 8eb0f9fefb (8 days old master)
|__/                   |

julia> versioninfo(verbose = true)
Julia Version 1.5.0-DEV.274
Commit 8eb0f9fefb (2020-02-15 12:41 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      "Red Hat Enterprise Linux Server release 7.3 (Maipo)"
  uname: Linux 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz:
              speed         user         nice          sys         idle          irq
       #1  3601 MHz  260888663 s        209 s   79076485 s  314462620 s          0 s
       #2  3601 MHz  225435476 s        556 s   66634601 s  362178032 s          0 s
       #3  3601 MHz  153758996 s        331 s   47119890 s  453453880 s          0 s
       #4  3601 MHz  133428383 s        355 s   44200083 s  476426058 s          0 s
       #5  3601 MHz  231463336 s        286 s   61205976 s  362261076 s          0 s
       #6  3601 MHz  154414432 s        604 s   45161270 s  455708660 s          0 s
       #7  3601 MHz  104946172 s        268 s   34510105 s  515709887 s          0 s
       #8  3601 MHz   94884882 s        496 s   31751569 s  528198402 s          0 s

  Memory: 93.04103088378906 GB (80644.640625 MB free)
  Uptime: 6.578586e6 sec
  Load Avg:  1.201171875  1.01953125  0.62353515625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_DEBUG = all
  JULIA_CUDA_VERBOSE = true
  CPLUS_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  MANPATH = /gpfs/runtime/opt/gcc/8.3/share/man:/gpfs/runtime/opt/python/3.7.4/share/man:/gpfs/runtime/opt/git/2.20.2/share/man:/gpfs/runtime/opt/binutils/2.31/share/man:/gpfs/runtime/opt/intel/2017.0/man/common/man1:
  TERM = xterm-256color
  LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.2/cuda/lib64:/gpfs/runtime/opt/cuda/10.2/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  CUDA_HOME = /gpfs/runtime/opt/cuda/10.2/cuda
  LD_LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.2/cuda/lib64:/gpfs/runtime/opt/cuda/10.2/cuda/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64:/gpfs/runtime/opt/java/8u111/jre/lib/amd64
  CPATH = /gpfs/runtime/opt/cuda/10.2/cuda/include:/gpfs/runtime/opt/gcc/8.3/include:/gpfs/runtime/opt/python/3.7.4/include:/gpfs/runtime/opt/binutils/2.31/include:/gpfs/runtime/opt/intel/2017.0/mkl/include
  NLSPATH = /gpfs/runtime/opt/intel/2017.0/lib/intel64/locale/en_US:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/locale/en_US
  PATH = /gpfs/runtime/opt/cuda/10.2/cuda/bin:/gpfs/runtime/opt/gcc/8.3/bin:/users/daluthge/bin:/gpfs/runtime/opt/python/3.7.4/bin:/gpfs/runtime/opt/git/2.20.2/bin:/gpfs/runtime/opt/binutils/2.31/bin:/gpfs/runtime/opt/intel/2017.0/bin:/gpfs/runtime/opt/matlab/R2017b/bin:/gpfs/runtime/opt/java/8u111/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lpp/mmfs/bin:/usr/lpp/mmfs/sbin:/opt/ibutils/bin:/gpfs/runtime/bin
  C_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  LD_RUN_PATH = /gpfs/runtime/opt/cuda/10.2/cuda/lib64:/gpfs/runtime/opt/cuda/10.2/cuda/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  JAVA_HOME = /gpfs/runtime/opt/java/8u111
  MODULEPATH = /gpfs/runtime/modulefiles
  HOME = /users/daluthge
  IPP_PATH = /gpfs/runtime/opt/intel/2017.0/ipp
  MODULEHOME = /gpfs/runtime/pymodules
  PKG_CONFIG_PATH = /gpfs/runtime/opt/python/3.7.4/lib/pkgconfig
  QT_PLUGIN_PATH = /usr/lib64/kde4/plugins:/usr/lib/kde4/plugins

julia> import Pkg

julia> Pkg.add("CuArrays")
    Cloning default registries into `~/.julia`
######################################################################## 100.0%
      Added registry `General` to `~/.julia/registries/General`
  Resolving package versions...
  Installed Requires ─────────── v1.0.1
  Installed Adapt ────────────── v1.0.1
  Installed TimerOutputs ─────── v0.5.3
  Installed CUDAapi ──────────── v3.1.0
  Installed AbstractFFTs ─────── v0.5.0
  Installed GPUArrays ────────── v2.0.1
  Installed CuArrays ─────────── v1.7.2
  Installed CUDAnative ───────── v2.10.2
  Installed CEnum ────────────── v0.2.0
  Installed OrderedCollections ─ v1.1.0
  Installed DataStructures ───── v0.17.9
  Installed MacroTools ───────── v0.5.4
  Installed BinaryProvider ───── v0.5.8
  Installed NNlib ────────────── v0.6.4
  Installed CUDAdrv ──────────── v6.0.0
  Installed LLVM ─────────────── v1.3.3
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Project.toml`
  [3a865a2d] + CuArrays v1.7.2
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Manifest.toml`
  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2
  [3a865a2d] + CuArrays v1.7.2
  [864edb3b] + DataStructures v0.17.9
  [0c68f7d7] + GPUArrays v2.0.1
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [872c559c] + NNlib v0.6.4
  [bac558e1] + OrderedCollections v1.1.0
  [ae029012] + Requires v1.0.1
  [a759f4b9] + TimerOutputs v0.5.3
  [2a0f44e3] + Base64
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [de0858da] + Printf
  [9a3f8284] + Random
  [ea8e919c] + SHA
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
   Building NNlib → `~/.julia/packages/NNlib/3krvM/deps/build.log`

julia> import CuArrays
[ Info: Precompiling CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]
┌ Debug: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAdrv [c5f51814-7f29-56b8-a69c-e4d8f6be1fde]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CEnum [fa961155-64e5-5f13-b03f-caf6b980ea82]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAnative [be33ccc6-a3ff-5ff2-a52e-74243cff1e17]
└ @ Base loading.jl:1276
┌ Debug: Precompiling LLVM [929cbde3-209d-540e-8aea-75f648917ca0]
└ @ Base loading.jl:1276
┌ Debug: Found LLVM v9.0.1 at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/bin/../lib/libLLVM-9.so with support for AArch64, AMDGPU, ARC, ARM, AVR, BPF, Hexagon, Lanai, MSP430, Mips, NVPTX, PowerPC, RISCV, Sparc, SystemZ, WebAssembly, X86, XCore
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:47
┌ Debug: Using LLVM.jl wrapper for LLVM v9.0
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:75
┌ Debug: Precompiling Adapt [79e6a3ab-5dfb-504d-930d-738a2a938a0e]
└ @ Base loading.jl:1276
┌ Debug: Precompiling TimerOutputs [a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f]
└ @ Base loading.jl:1276
┌ Debug: Precompiling DataStructures [864edb3b-99cc-5e75-8d2d-829cb0a9cfe8]
└ @ Base loading.jl:1276
┌ Debug: Precompiling OrderedCollections [bac558e1-5e72-5ebc-8fee-abe8a469f55d]
└ @ Base loading.jl:1276
┌ Debug: Looking for CUDA toolkit via environment variables CUDA_HOME
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Request to look for binary nvdisasm
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvdisasm
│   locations =
│    20-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/bin"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/bin"
│     "/gpfs/runtime/opt/gcc/8.3/bin"
│     "/users/daluthge/bin"
│     "/gpfs/runtime/opt/python/3.7.4/bin"
│     "/gpfs/runtime/opt/git/2.20.2/bin"
│     "/gpfs/runtime/opt/binutils/2.31/bin"
│     ⋮
│     "/usr/bin"
│     "/usr/local/sbin"
│     "/usr/sbin"
│     "/usr/lpp/mmfs/bin"
│     "/usr/lpp/mmfs/sbin"
│     "/opt/ibutils/bin"
│     "/gpfs/runtime/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found binary nvdisasm at /gpfs/runtime/opt/cuda/10.2/cuda/bin
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:141
┌ Debug: CUDA toolkit identified as 10.2.89
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Request to look for libdevice
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Look for libdevice
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/nvvm/libdevice"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found unified device library at /gpfs/runtime/opt/cuda/10.2/cuda/nvvm/libdevice/libdevice.10.bc
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:327
┌ Debug: Request to look for libcudadevrt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA device runtime library libcudadevrt.a
│   locations =
│    3-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found CUDA device runtime library libcudadevrt.a at /gpfs/runtime/opt/cuda/10.2/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:379
┌ Debug: Request to look for library nvToolsExt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libnvToolsExt.so, libnvToolsExt.so.1, libnvToolsExt.so.1.0
│   locations =
│    4-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libnvToolsExt.so at /gpfs/runtime/opt/cuda/10.2/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Request to look for library cupti
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libcupti.so, libcupti.so.10, libcupti.so.10.2
│   locations =
│    8-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.2/cuda"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/libx64"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib64"
│     "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libcupti.so at /gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Using LLVM v9.0.1
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:170
┌ Debug: LLVM supports capabilities 2.0, 2.1, 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:191
┌ Debug: Using CUDA driver v10.2.0 and toolkit v10.2.0
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:196
┌ Debug: CUDA driver supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:213
┌ Debug: CUDA toolkit supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:214
┌ Debug: CUDAnative supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:174
┌ Debug: Precompiling GPUArrays [0c68f7d7-f131-5f86-a1c3-88cf8149b2d7]
└ @ Base loading.jl:1276
┌ Debug: Precompiling AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
└ @ Base loading.jl:1276
┌ Debug: Precompiling Requires [ae029012-a4dd-5104-9daa-d747884805df]
└ @ Base loading.jl:1276
┌ Debug: Precompiling MacroTools [1914dd2f-81c6-5fcd-8719-6d5c9610ff09]
└ @ Base loading.jl:1276
┌ Debug: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
└ @ Base loading.jl:1276
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
┌ Error: CuArrays.jl failed to initialize
│   exception =
│    could not load library "libcublas"
│    libcublas.so: cannot open shared object file: No such file or directory
│    Stacktrace:
│     [1] dlopen(::String, ::UInt32; throw_error::Bool) at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
│     [2] dlopen at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
│     [3] (::CuArrays.CUBLAS.var"#509#lookup_fptr#28")() at /users/daluthge/.julia/packages/CUDAapi/wYUAO/src/call.jl:29
│     [4] macro expansion at /users/daluthge/.julia/packages/CUDAapi/wYUAO/src/call.jl:37 [inlined]
│     [5] macro expansion at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/error.jl:65 [inlined]
│     [6] cublasGetProperty at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/libcublas.jl:27 [inlined]
│     [7] cublasGetProperty at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/wrappers.jl:38 [inlined]
│     [8] version() at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/blas/wrappers.jl:42
│     [9] __init__() at /users/daluthge/.julia/packages/CuArrays/HE8G6/src/CuArrays.jl:98
│     [10] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:697
│     [11] _require_from_serialized(::String) at ./loading.jl:748
│     [12] _require(::Base.PkgId) at ./loading.jl:1039
│     [13] require(::Base.PkgId) at ./loading.jl:927
│     [14] require(::Module, ::Symbol) at ./loading.jl:922
│     [15] eval(::Module, ::Any) at ./boot.jl:331
│     [16] eval_user_input(::Any, ::REPL.REPLBackend) at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:118
│     [17] macro expansion at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:150 [inlined]
│     [18] (::REPL.var"#31#32"{REPL.REPLBackend})() at ./task.jl:358
└ @ CuArrays ~/.julia/packages/CuArrays/HE8G6/src/CuArrays.jl:141

CUDA 10.1.105: (works fine)

$ which nvcc
/gpfs/runtime/opt/cuda/10.1.105/cuda/bin/nvcc

$ which nvdisasm
/gpfs/runtime/opt/cuda/10.1.105/cuda/bin/nvdisasm

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105

$ nvdisasm --version
nvdisasm: NVIDIA (R) CUDA disassembler
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:51_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
bash-4.2$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0-DEV.274 (2020-02-15)
 _/ |\__'_|_|_|\__'_|  |  Commit 8eb0f9fefb (8 days old master)
|__/                   |

julia> versioninfo(verbose = true)
Julia Version 1.5.0-DEV.274
Commit 8eb0f9fefb (2020-02-15 12:41 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      "Red Hat Enterprise Linux Server release 7.3 (Maipo)"
  uname: Linux 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz:
              speed         user         nice          sys         idle          irq
       #1  3601 MHz  260888239 s        209 s   79076296 s  314453175 s          0 s
       #2  3601 MHz  225435401 s        556 s   66634561 s  362168060 s          0 s
       #3  3601 MHz  153758683 s        331 s   47119620 s  453444427 s          0 s
       #4  3601 MHz  133428321 s        355 s   44199667 s  476416450 s          0 s
       #5  3601 MHz  231457860 s        286 s   61205722 s  362256718 s          0 s
       #6  3601 MHz  154411110 s        604 s   45161031 s  455702131 s          0 s
       #7  3601 MHz  104945705 s        268 s   34510058 s  515700335 s          0 s
       #8  3601 MHz   94884872 s        496 s   31751407 s  528188490 s          0 s

  Memory: 93.04103088378906 GB (80663.54296875 MB free)
  Uptime: 6.578485e6 sec
  Load Avg:  1.19580078125  0.96240234375  0.56298828125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_DEBUG = all
  JULIA_CUDA_VERBOSE = true
  CPLUS_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  MANPATH = /gpfs/runtime/opt/python/3.7.4/share/man:/gpfs/runtime/opt/git/2.20.2/share/man:/gpfs/runtime/opt/gcc/8.3/share/man:/gpfs/runtime/opt/binutils/2.31/share/man:/gpfs/runtime/opt/intel/2017.0/man/common/man1:
  TERM = xterm-256color
  LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.1.105/cuda/lib64:/gpfs/runtime/opt/cuda/10.1.105/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  CUDA_HOME = /gpfs/runtime/opt/cuda/10.1.105/cuda
  LD_LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.1.105/cuda/lib64:/gpfs/runtime/opt/cuda/10.1.105/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64:/gpfs/runtime/opt/java/8u111/jre/lib/amd64
  CPATH = /gpfs/runtime/opt/cuda/10.1.105/cuda/include:/gpfs/runtime/opt/python/3.7.4/include:/gpfs/runtime/opt/gcc/8.3/include:/gpfs/runtime/opt/binutils/2.31/include:/gpfs/runtime/opt/intel/2017.0/mkl/include
  NLSPATH = /gpfs/runtime/opt/intel/2017.0/lib/intel64/locale/en_US:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/locale/en_US
  PATH = /gpfs/runtime/opt/cuda/10.1.105/cuda/bin:/users/daluthge/bin:/gpfs/runtime/opt/python/3.7.4/bin:/gpfs/runtime/opt/git/2.20.2/bin:/gpfs/runtime/opt/gcc/8.3/bin:/gpfs/runtime/opt/binutils/2.31/bin:/gpfs/runtime/opt/intel/2017.0/bin:/gpfs/runtime/opt/matlab/R2017b/bin:/gpfs/runtime/opt/java/8u111/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lpp/mmfs/bin:/usr/lpp/mmfs/sbin:/opt/ibutils/bin:/gpfs/runtime/bin
  C_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  LD_RUN_PATH = /gpfs/runtime/opt/cuda/10.1.105/cuda/lib64:/gpfs/runtime/opt/cuda/10.1.105/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  JAVA_HOME = /gpfs/runtime/opt/java/8u111
  MODULEPATH = /gpfs/runtime/modulefiles
  HOME = /users/daluthge
  IPP_PATH = /gpfs/runtime/opt/intel/2017.0/ipp
  MODULEHOME = /gpfs/runtime/pymodules
  PKG_CONFIG_PATH = /gpfs/runtime/opt/python/3.7.4/lib/pkgconfig
  QT_PLUGIN_PATH = /usr/lib64/kde4/plugins:/usr/lib/kde4/plugins

julia> import Pkg

julia> Pkg.add("CuArrays")
    Cloning default registries into `~/.julia`
######################################################################## 100.0%
      Added registry `General` to `~/.julia/registries/General`
  Resolving package versions...
  Installed Requires ─────────── v1.0.1
  Installed Adapt ────────────── v1.0.1
  Installed TimerOutputs ─────── v0.5.3
  Installed AbstractFFTs ─────── v0.5.0
  Installed CUDAapi ──────────── v3.1.0
  Installed GPUArrays ────────── v2.0.1
  Installed CuArrays ─────────── v1.7.2
  Installed CUDAnative ───────── v2.10.2
  Installed CEnum ────────────── v0.2.0
  Installed OrderedCollections ─ v1.1.0
  Installed DataStructures ───── v0.17.9
  Installed MacroTools ───────── v0.5.4
  Installed BinaryProvider ───── v0.5.8
  Installed NNlib ────────────── v0.6.4
  Installed CUDAdrv ──────────── v6.0.0
  Installed LLVM ─────────────── v1.3.3
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Project.toml`
  [3a865a2d] + CuArrays v1.7.2
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Manifest.toml`
  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2
  [3a865a2d] + CuArrays v1.7.2
  [864edb3b] + DataStructures v0.17.9
  [0c68f7d7] + GPUArrays v2.0.1
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [872c559c] + NNlib v0.6.4
  [bac558e1] + OrderedCollections v1.1.0
  [ae029012] + Requires v1.0.1
  [a759f4b9] + TimerOutputs v0.5.3
  [2a0f44e3] + Base64
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [de0858da] + Printf
  [9a3f8284] + Random
  [ea8e919c] + SHA
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
   Building NNlib → `~/.julia/packages/NNlib/3krvM/deps/build.log`

julia> import CuArrays
[ Info: Precompiling CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]
┌ Debug: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAdrv [c5f51814-7f29-56b8-a69c-e4d8f6be1fde]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CEnum [fa961155-64e5-5f13-b03f-caf6b980ea82]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAnative [be33ccc6-a3ff-5ff2-a52e-74243cff1e17]
└ @ Base loading.jl:1276
┌ Debug: Precompiling LLVM [929cbde3-209d-540e-8aea-75f648917ca0]
└ @ Base loading.jl:1276
┌ Debug: Found LLVM v9.0.1 at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/bin/../lib/libLLVM-9.so with support for AArch64, AMDGPU, ARC, ARM, AVR, BPF, Hexagon, Lanai, MSP430, Mips, NVPTX, PowerPC, RISCV, Sparc, SystemZ, WebAssembly, X86, XCore
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:47
┌ Debug: Using LLVM.jl wrapper for LLVM v9.0
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:75
┌ Debug: Precompiling Adapt [79e6a3ab-5dfb-504d-930d-738a2a938a0e]
└ @ Base loading.jl:1276
┌ Debug: Precompiling TimerOutputs [a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f]
└ @ Base loading.jl:1276
┌ Debug: Precompiling DataStructures [864edb3b-99cc-5e75-8d2d-829cb0a9cfe8]
└ @ Base loading.jl:1276
┌ Debug: Precompiling OrderedCollections [bac558e1-5e72-5ebc-8fee-abe8a469f55d]
└ @ Base loading.jl:1276
┌ Debug: Looking for CUDA toolkit via environment variables CUDA_HOME
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Request to look for binary nvdisasm
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvdisasm
│   locations =
│    20-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/bin"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/bin"
│     "/users/daluthge/bin"
│     "/gpfs/runtime/opt/python/3.7.4/bin"
│     "/gpfs/runtime/opt/git/2.20.2/bin"
│     "/gpfs/runtime/opt/gcc/8.3/bin"
│     "/gpfs/runtime/opt/binutils/2.31/bin"
│     ⋮
│     "/usr/bin"
│     "/usr/local/sbin"
│     "/usr/sbin"
│     "/usr/lpp/mmfs/bin"
│     "/usr/lpp/mmfs/sbin"
│     "/opt/ibutils/bin"
│     "/gpfs/runtime/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found binary nvdisasm at /gpfs/runtime/opt/cuda/10.1.105/cuda/bin
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:141
┌ Debug: CUDA toolkit identified as 10.1.105
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Request to look for libdevice
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Look for libdevice
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/nvvm/libdevice"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found unified device library at /gpfs/runtime/opt/cuda/10.1.105/cuda/nvvm/libdevice/libdevice.10.bc
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:327
┌ Debug: Request to look for libcudadevrt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA device runtime library libcudadevrt.a
│   locations =
│    3-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found CUDA device runtime library libcudadevrt.a at /gpfs/runtime/opt/cuda/10.1.105/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:379
┌ Debug: Request to look for library nvToolsExt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libnvToolsExt.so, libnvToolsExt.so.1, libnvToolsExt.so.1.0
│   locations =
│    4-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libnvToolsExt.so at /gpfs/runtime/opt/cuda/10.1.105/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Request to look for library cupti
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libcupti.so, libcupti.so.10, libcupti.so.10.1
│   locations =
│    8-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/libx64"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI/lib"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI/lib64"
│     "/gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libcupti.so at /gpfs/runtime/opt/cuda/10.1.105/cuda/extras/CUPTI/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Using LLVM v9.0.1
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:170
┌ Debug: LLVM supports capabilities 2.0, 2.1, 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:191
┌ Debug: Using CUDA driver v10.2.0 and toolkit v10.1.0
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:196
┌ Debug: CUDA driver supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:213
┌ Debug: CUDA toolkit supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:214
┌ Debug: CUDAnative supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:174
┌ Debug: Precompiling GPUArrays [0c68f7d7-f131-5f86-a1c3-88cf8149b2d7]
└ @ Base loading.jl:1276
┌ Debug: Precompiling AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
└ @ Base loading.jl:1276
┌ Debug: Precompiling Requires [ae029012-a4dd-5104-9daa-d747884805df]
└ @ Base loading.jl:1276
┌ Debug: Precompiling MacroTools [1914dd2f-81c6-5fcd-8719-6d5c9610ff09]
└ @ Base loading.jl:1276
┌ Debug: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
└ @ Base loading.jl:1276
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114

CUDA 10.0.130: (works fine)

$ which nvcc
/gpfs/runtime/opt/cuda/10.0.130/cuda/bin/nvcc

$ which nvdisasm
/gpfs/runtime/opt/cuda/10.0.130/cuda/bin/nvdisasm

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

$ nvdisasm --version
nvdisasm: NVIDIA (R) CUDA disassembler
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:11_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
bash-4.2$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0-DEV.274 (2020-02-15)
 _/ |\__'_|_|_|\__'_|  |  Commit 8eb0f9fefb (8 days old master)
|__/                   |

julia> versioninfo(verbose = true)
Julia Version 1.5.0-DEV.274
Commit 8eb0f9fefb (2020-02-15 12:41 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      "Red Hat Enterprise Linux Server release 7.3 (Maipo)"
  uname: Linux 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz:
              speed         user         nice          sys         idle          irq
       #1  3601 MHz  260887968 s        209 s   79076153 s  314446557 s          0 s
       #2  3601 MHz  225435375 s        556 s   66634534 s  362161063 s          0 s
       #3  3601 MHz  153758324 s        331 s   47119535 s  453437851 s          0 s
       #4  3601 MHz  133428293 s        355 s   44199325 s  476409768 s          0 s
       #5  3601 MHz  231454591 s        286 s   61205506 s  362253151 s          0 s
       #6  3601 MHz  154408009 s        604 s   45160867 s  455698340 s          0 s
       #7  3601 MHz  104945700 s        268 s   34510038 s  515693325 s          0 s
       #8  3601 MHz   94884863 s        496 s   31751307 s  528181552 s          0 s

  Memory: 93.04103088378906 GB (80678.2890625 MB free)
  Uptime: 6.578414e6 sec
  Load Avg:  1.1494140625  0.908203125  0.515625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_DEBUG = all
  JULIA_CUDA_VERBOSE = true
  CPLUS_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  MANPATH = /gpfs/runtime/opt/python/3.7.4/share/man:/gpfs/runtime/opt/git/2.20.2/share/man:/gpfs/runtime/opt/gcc/8.3/share/man:/gpfs/runtime/opt/binutils/2.31/share/man:/gpfs/runtime/opt/intel/2017.0/man/common/man1:
  TERM = xterm-256color
  LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64:/gpfs/runtime/opt/cuda/10.0.130/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  CUDA_HOME = /gpfs/runtime/opt/cuda/10.0.130/cuda
  LD_LIBRARY_PATH = /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64:/gpfs/runtime/opt/cuda/10.0.130/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64:/gpfs/runtime/opt/java/8u111/jre/lib/amd64
  CPATH = /gpfs/runtime/opt/cuda/10.0.130/cuda/include:/gpfs/runtime/opt/python/3.7.4/include:/gpfs/runtime/opt/gcc/8.3/include:/gpfs/runtime/opt/binutils/2.31/include:/gpfs/runtime/opt/intel/2017.0/mkl/include
  NLSPATH = /gpfs/runtime/opt/intel/2017.0/lib/intel64/locale/en_US:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/locale/en_US
  PATH = /gpfs/runtime/opt/cuda/10.0.130/cuda/bin:/users/daluthge/bin:/gpfs/runtime/opt/python/3.7.4/bin:/gpfs/runtime/opt/git/2.20.2/bin:/gpfs/runtime/opt/gcc/8.3/bin:/gpfs/runtime/opt/binutils/2.31/bin:/gpfs/runtime/opt/intel/2017.0/bin:/gpfs/runtime/opt/matlab/R2017b/bin:/gpfs/runtime/opt/java/8u111/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lpp/mmfs/bin:/usr/lpp/mmfs/sbin:/opt/ibutils/bin:/gpfs/runtime/bin
  C_INCLUDE_PATH = /gpfs/runtime/opt/gcc/8.3/include
  LD_RUN_PATH = /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64:/gpfs/runtime/opt/cuda/10.0.130/cuda/lib:/gpfs/runtime/opt/python/3.7.4/lib:/gpfs/runtime/opt/gcc/8.3/lib64:/gpfs/runtime/opt/binutils/2.31/lib:/gpfs/runtime/opt/intel/2017.0/lib/intel64:/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64
  JAVA_HOME = /gpfs/runtime/opt/java/8u111
  MODULEPATH = /gpfs/runtime/modulefiles
  HOME = /users/daluthge
  IPP_PATH = /gpfs/runtime/opt/intel/2017.0/ipp
  MODULEHOME = /gpfs/runtime/pymodules
  PKG_CONFIG_PATH = /gpfs/runtime/opt/python/3.7.4/lib/pkgconfig
  QT_PLUGIN_PATH = /usr/lib64/kde4/plugins:/usr/lib/kde4/plugins

julia> import Pkg

julia> Pkg.add("CuArrays")
    Cloning default registries into `~/.julia`
######################################################################## 100.0%
      Added registry `General` to `~/.julia/registries/General`
  Resolving package versions...
  Installed TimerOutputs ─────── v0.5.3
  Installed Adapt ────────────── v1.0.1
  Installed Requires ─────────── v1.0.1
  Installed AbstractFFTs ─────── v0.5.0
  Installed CUDAapi ──────────── v3.1.0
  Installed GPUArrays ────────── v2.0.1
  Installed CEnum ────────────── v0.2.0
  Installed OrderedCollections ─ v1.1.0
  Installed DataStructures ───── v0.17.9
  Installed CuArrays ─────────── v1.7.2
  Installed MacroTools ───────── v0.5.4
  Installed BinaryProvider ───── v0.5.8
  Installed CUDAnative ───────── v2.10.2
  Installed NNlib ────────────── v0.6.4
  Installed CUDAdrv ──────────── v6.0.0
  Installed LLVM ─────────────── v1.3.3
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Project.toml`
  [3a865a2d] + CuArrays v1.7.2
   Updating `/gpfs_home/daluthge/.julia/environments/v1.5/Manifest.toml`
  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2
  [3a865a2d] + CuArrays v1.7.2
  [864edb3b] + DataStructures v0.17.9
  [0c68f7d7] + GPUArrays v2.0.1
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [872c559c] + NNlib v0.6.4
  [bac558e1] + OrderedCollections v1.1.0
  [ae029012] + Requires v1.0.1
  [a759f4b9] + TimerOutputs v0.5.3
  [2a0f44e3] + Base64
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [de0858da] + Printf
  [9a3f8284] + Random
  [ea8e919c] + SHA
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
   Building NNlib → `~/.julia/packages/NNlib/3krvM/deps/build.log`

julia> import CuArrays
[ Info: Precompiling CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]
┌ Debug: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAdrv [c5f51814-7f29-56b8-a69c-e4d8f6be1fde]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CEnum [fa961155-64e5-5f13-b03f-caf6b980ea82]
└ @ Base loading.jl:1276
┌ Debug: Precompiling CUDAnative [be33ccc6-a3ff-5ff2-a52e-74243cff1e17]
└ @ Base loading.jl:1276
┌ Debug: Precompiling LLVM [929cbde3-209d-540e-8aea-75f648917ca0]
└ @ Base loading.jl:1276
┌ Debug: Found LLVM v9.0.1 at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/bin/../lib/libLLVM-9.so with support for AArch64, AMDGPU, ARC, ARM, AVR, BPF, Hexagon, Lanai, MSP430, Mips, NVPTX, PowerPC, RISCV, Sparc, SystemZ, WebAssembly, X86, XCore
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:47
┌ Debug: Using LLVM.jl wrapper for LLVM v9.0
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:75
┌ Debug: Precompiling Adapt [79e6a3ab-5dfb-504d-930d-738a2a938a0e]
└ @ Base loading.jl:1276
┌ Debug: Precompiling TimerOutputs [a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f]
└ @ Base loading.jl:1276
┌ Debug: Precompiling DataStructures [864edb3b-99cc-5e75-8d2d-829cb0a9cfe8]
└ @ Base loading.jl:1276
┌ Debug: Precompiling OrderedCollections [bac558e1-5e72-5ebc-8fee-abe8a469f55d]
└ @ Base loading.jl:1276
┌ Debug: Looking for CUDA toolkit via environment variables CUDA_HOME
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Request to look for binary nvdisasm
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvdisasm
│   locations =
│    20-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/bin"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/bin"
│     "/users/daluthge/bin"
│     "/gpfs/runtime/opt/python/3.7.4/bin"
│     "/gpfs/runtime/opt/git/2.20.2/bin"
│     "/gpfs/runtime/opt/gcc/8.3/bin"
│     "/gpfs/runtime/opt/binutils/2.31/bin"
│     ⋮
│     "/usr/bin"
│     "/usr/local/sbin"
│     "/usr/sbin"
│     "/usr/lpp/mmfs/bin"
│     "/usr/lpp/mmfs/sbin"
│     "/opt/ibutils/bin"
│     "/gpfs/runtime/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found binary nvdisasm at /gpfs/runtime/opt/cuda/10.0.130/cuda/bin
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:141
┌ Debug: CUDA toolkit identified as 10.0.130
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Request to look for libdevice
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Look for libdevice
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/nvvm/libdevice"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found unified device library at /gpfs/runtime/opt/cuda/10.0.130/cuda/nvvm/libdevice/libdevice.10.bc
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:327
┌ Debug: Request to look for libcudadevrt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA device runtime library libcudadevrt.a
│   locations =
│    3-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found CUDA device runtime library libcudadevrt.a at /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:379
┌ Debug: Request to look for library nvToolsExt
│   locations =
│    1-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libnvToolsExt.so, libnvToolsExt.so.1, libnvToolsExt.so.1.0
│   locations =
│    4-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libnvToolsExt.so at /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Request to look for library cupti
│   locations =
│    2-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libcupti.so, libcupti.so.10, libcupti.so.10.0
│   locations =
│    8-element Array{String,1}:
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/libx64"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI/lib"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI/lib64"
│     "/gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libcupti.so at /gpfs/runtime/opt/cuda/10.0.130/cuda/extras/CUPTI/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Using LLVM v9.0.1
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:170
┌ Debug: LLVM supports capabilities 2.0, 2.1, 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3 and 6.4
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:191
┌ Debug: Using CUDA driver v10.2.0 and toolkit v10.0.0
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:196
┌ Debug: CUDA driver supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:213
┌ Debug: CUDA toolkit supports capabilities 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5 with PTX 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.2 and 6.3
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/compatibility.jl:214
┌ Debug: CUDAnative supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2 and 7.5; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1 and 6.3
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:174
┌ Debug: Precompiling GPUArrays [0c68f7d7-f131-5f86-a1c3-88cf8149b2d7]
└ @ Base loading.jl:1276
┌ Debug: Precompiling AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
└ @ Base loading.jl:1276
┌ Debug: Precompiling Requires [ae029012-a4dd-5104-9daa-d747884805df]
└ @ Base loading.jl:1276
┌ Debug: Precompiling MacroTools [1914dd2f-81c6-5fcd-8719-6d5c9610ff09]
└ @ Base loading.jl:1276
┌ Debug: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
└ @ Base loading.jl:1276
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
@DilumAluthge changed the title from "import CuArrays always fails with CUDA 10.2 (but works fine with CUDA 10.0.130 and 10.1.105)" to "import CuArrays always fails with CUDA 10.2.89 (but works fine with CUDA 10.0.130 and 10.1.105)" on Feb 24, 2020
@DilumAluthge (Author)

cc: @maleadt


maleadt commented Feb 24, 2020

The relevant code is here:

# discover libraries
for (name,version) in (("cublas", CUDAnative.version()),
                       ("cusparse", CUDAnative.version()),
                       ("cusolver", CUDAnative.version()),
                       ("cufft", CUDAnative.version()),
                       ("curand", CUDAnative.version()),
                       ("cudnn", v"7"),
                       ("cutensor", v"1"))
    mod = getfield(CuArrays, Symbol(uppercase(name)))
    lib = Symbol("lib$name")
    handle = getfield(mod, lib)

    # on Windows, the library name is version dependent
    if Sys.iswindows()
        cuda = CUDAnative.version()
        suffix = cuda >= v"10.1" ? "$(cuda.major)" : "$(cuda.major)$(cuda.minor)"
        handle[] = "$(name)$(Sys.WORD_SIZE)_$(suffix)"
    end

    # check if we can't find the library
    if Libdl.dlopen_e(handle[]) == C_NULL
        path = find_cuda_library(name, CUDAnative.prefix(), [version])
        if path !== nothing
            handle[] = path
        end
    end
end

# library dependencies
CUBLAS.version()

There isn't any mention of searching for libcublas in the debug output, which is strange given that the dlopen subsequently fails. The initial value of the libcublas handle is just "libcublas"; could you show what the output of Libdl.dlopen_e("libcublas") is? Maybe it's nothing and not C_NULL? I've seen BB try to check for both (if dlopen_e in (nothing, C_NULL)).

The difference between CUDA versions is probably caused by libcublas being on the default library search path for some versions and not for others.
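For reference, a few shell checks one could use to test that hypothesis (a sketch; the paths are the ones from the output above, and ldconfig may show nothing on clusters that rely purely on LD_LIBRARY_PATH):

# Is a CUDA library directory on the loader search path for this shell?
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep -i cuda

# Does the system loader cache know about libcublas?
ldconfig -p | grep libcublas

# Does the toolkit directory CUDAapi discovered actually contain it?
ls /gpfs/runtime/opt/cuda/10.2/cuda/lib64 | grep -i cublas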

@DilumAluthge (Author)

With CUDA 10.0.130:

julia> import Libdl

julia> Libdl.dlopen("libcublas")
Ptr{Nothing} @0x00000000013c7360

julia> Libdl.dlopen_e("libcublas")
Ptr{Nothing} @0x00000000013c7360

With CUDA 10.1.105:

julia> import Libdl

julia> Libdl.dlopen("libcublas")
Ptr{Nothing} @0x00000000012f0300

julia> Libdl.dlopen_e("libcublas")
Ptr{Nothing} @0x00000000012f0300

With CUDA 10.2.89:

julia> import Libdl

julia> Libdl.dlopen("libcublas")
ERROR: could not load library "libcublas"
libcublas.so: cannot open shared object file: No such file or directory
Stacktrace:
 [1] dlopen(::String, ::UInt32; throw_error::Bool) at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
 [2] dlopen at /gpfs_home/daluthge/dev/JuliaLang/julia/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
 [3] top-level scope at REPL[2]:1

julia> Libdl.dlopen_e("libcublas")
Ptr{Nothing} @0x0000000000000000

julia> Libdl.dlopen_e("libcublas") == C_NULL
true

julia> Libdl.dlopen_e("libcublas") === C_NULL
true


maleadt commented Feb 24, 2020

Confirms my suspicion, but doesn't explain the lack of debug info. Could you run the following with JULIA_DEBUG=CUDAapi:

find_cuda_library("cublas", CUDAnative.prefix(), [CUDAnative.version()]) 


DilumAluthge commented Feb 24, 2020

julia> ENV["JULIA_DEBUG"] = "CUDAapi"
"CUDAapi"

julia> import CUDAapi

julia> import CUDAnative
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
┌ Debug: Looking for CUDA toolkit via environment variables CUDA_HOME
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Request to look for binary nvdisasm
│   locations =1-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvdisasm
│   locations =20-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/bin""/gpfs/runtime/opt/gcc/8.3/bin""/gpfs/runtime/opt/cuda/10.2/cuda/bin""/users/daluthge/bin""/gpfs/runtime/opt/python/3.7.4/bin""/gpfs/runtime/opt/git/2.20.2/bin""/gpfs/runtime/opt/binutils/2.31/bin""/usr/bin""/usr/local/sbin""/usr/sbin""/usr/lpp/mmfs/bin""/usr/lpp/mmfs/sbin""/opt/ibutils/bin""/gpfs/runtime/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found binary nvdisasm at /gpfs/runtime/opt/cuda/10.2/cuda/bin
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:141
┌ Debug: CUDA toolkit identified as 10.2.89
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Request to look for libdevice
│   locations =1-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Look for libdevice
│   locations =2-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/nvvm/libdevice"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found unified device library at /gpfs/runtime/opt/cuda/10.2/cuda/nvvm/libdevice/libdevice.10.bc
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:327
┌ Debug: Request to look for libcudadevrt
│   locations =1-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA device runtime library libcudadevrt.a
│   locations =3-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/lib""/gpfs/runtime/opt/cuda/10.2/cuda/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found CUDA device runtime library libcudadevrt.a at /gpfs/runtime/opt/cuda/10.2/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:379
┌ Debug: Request to look for library nvToolsExt
│   locations =1-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libnvToolsExt.so, libnvToolsExt.so.1, libnvToolsExt.so.1.0
│   locations =4-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/lib""/gpfs/runtime/opt/cuda/10.2/cuda/lib64""/gpfs/runtime/opt/cuda/10.2/cuda/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libnvToolsExt.so at /gpfs/runtime/opt/cuda/10.2/cuda/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90
┌ Debug: Request to look for library cupti
│   locations =2-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libcupti.so, libcupti.so.10, libcupti.so.10.2
│   locations =8-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/lib""/gpfs/runtime/opt/cuda/10.2/cuda/lib64""/gpfs/runtime/opt/cuda/10.2/cuda/libx64""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib64""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Found library libcupti.so at /gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib64
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:90

julia> CUDAnative.prefix()
2-element Array{String,1}:
 "/gpfs/runtime/opt/cuda/10.2/cuda"
 "/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI"

julia> CUDAnative.version()
v"10.2.89"

julia> CUDAapi.find_cuda_library("cublas", CUDAnative.prefix(), [CUDAnative.version()])
┌ Debug: Request to look for library cublas
│   locations =2-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8
┌ Debug: Looking for library libcublas.so, libcublas.so.10, libcublas.so.10.2
│   locations =8-element Array{String,1}:"/gpfs/runtime/opt/cuda/10.2/cuda""/gpfs/runtime/opt/cuda/10.2/cuda/lib""/gpfs/runtime/opt/cuda/10.2/cuda/lib64""/gpfs/runtime/opt/cuda/10.2/cuda/libx64""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/lib64""/gpfs/runtime/opt/cuda/10.2/cuda/extras/CUPTI/libx64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/CUDAapi.jl:8


maleadt commented Feb 24, 2020

So it returns nothing? What's the actual name of libcublas in /gpfs/runtime/opt/cuda/10.2/cuda/lib*?

@DilumAluthge (Author)

Yeah, it returns nothing.

$ find /gpfs/runtime/opt/cuda/10.2/ -name "*blas*"
/gpfs/runtime/opt/cuda/10.2/cuda/doc/man/man7/libcublas.7
/gpfs/runtime/opt/cuda/10.2/cuda/doc/man/man7/libcublas.so.7
/gpfs/runtime/opt/cuda/10.2/cuda/doc/html/cublas
/gpfs/runtime/opt/cuda/10.2/cuda/doc/html/cublas/graphics/cublasmg_gemm.jpg
/gpfs/runtime/opt/cuda/10.2/cuda/doc/html/nvblas
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/cublas_api.h
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/nvblas.h
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/cublas.h
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/cublas_v2.h
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/cublasLt.h
/gpfs/runtime/opt/cuda/10.2/cuda/targets/x86_64-linux/include/cublasXt.h
/gpfs/runtime/opt/cuda/10.2/src/include/cublas_api.h
/gpfs/runtime/opt/cuda/10.2/src/include/nvblas.h
/gpfs/runtime/opt/cuda/10.2/src/include/cublas.h
/gpfs/runtime/opt/cuda/10.2/src/include/cublas_v2.h
/gpfs/runtime/opt/cuda/10.2/src/include/cublasLt.h
/gpfs/runtime/opt/cuda/10.2/src/include/cublasXt.h
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublasLt.so
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublas.so.10.2.2.89
/gpfs/runtime/opt/cuda/10.2/src/lib64/libnvblas.so.10.2.2.89
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublasLt.so.10
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublas_static.a
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublas.so.10
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublasLt_static.a
/gpfs/runtime/opt/cuda/10.2/src/lib64/stubs/libcublasLt.so
/gpfs/runtime/opt/cuda/10.2/src/lib64/stubs/libcublas.so
/gpfs/runtime/opt/cuda/10.2/src/lib64/libnvblas.so.10
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublasLt.so.10.2.2.89
/gpfs/runtime/opt/cuda/10.2/src/lib64/libnvblas.so
/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublas.so

$ find /gpfs/runtime/opt/cuda/10.1.105/ -name "*blas*"
/gpfs/runtime/opt/cuda/10.1.105/cuda/doc/man/man7/libcublas.7
/gpfs/runtime/opt/cuda/10.1.105/cuda/doc/man/man7/libcublas.so.7
/gpfs/runtime/opt/cuda/10.1.105/cuda/doc/html/cublas
/gpfs/runtime/opt/cuda/10.1.105/cuda/doc/html/cublas/graphics/cublasmg_gemm.jpg
/gpfs/runtime/opt/cuda/10.1.105/cuda/doc/html/nvblas
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/cublas_api.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/nvblas.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/cublas.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/cublas_v2.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/cublasLt.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/include/cublasXt.h
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublasLt.so
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libnvblas.so.10.1.0.105
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublasLt.so.10.1.0.105
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublasLt.so.10
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublas_static.a
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublas.so.10
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublasLt_static.a
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/stubs/libcublasLt.so
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/stubs/libcublas.so
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libnvblas.so.10
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublas.so.10.1.0.105
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libnvblas.so
/gpfs/runtime/opt/cuda/10.1.105/cuda/targets/x86_64-linux/lib/libcublas.so

$ find /gpfs/runtime/opt/cuda/10.0.130/ -name "*blas*"
/gpfs/runtime/opt/cuda/10.0.130/cuda/include/cublas_api.h
/gpfs/runtime/opt/cuda/10.0.130/cuda/include/nvblas.h
/gpfs/runtime/opt/cuda/10.0.130/cuda/include/cublas.h
/gpfs/runtime/opt/cuda/10.0.130/cuda/include/cublas_v2.h
/gpfs/runtime/opt/cuda/10.0.130/cuda/include/cublasXt.h
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libnvblas.so.10.0
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libcublas_static.a
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/stubs/libcublas.so
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libnvblas.so
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libnvblas.so.10.0.130
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libcublas.so.10.0.130
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libcublas.so.10.0
/gpfs/runtime/opt/cuda/10.0.130/cuda/lib64/libcublas.so
/gpfs/runtime/opt/cuda/10.0.130/cuda/doc/man/man7/libcublas.7
/gpfs/runtime/opt/cuda/10.0.130/cuda/doc/man/man7/libcublas.so.7
/gpfs/runtime/opt/cuda/10.0.130/cuda/doc/html/cublas
/gpfs/runtime/opt/cuda/10.0.130/cuda/doc/html/cublas/graphics/cublasmg_gemm.jpg
/gpfs/runtime/opt/cuda/10.0.130/cuda/doc/html/nvblas
/gpfs/runtime/opt/cuda/10.0.130/cuda/pkgconfig/cublas-10.0.pc

@DilumAluthge (Author)

/gpfs/runtime/opt/cuda/10.2/src/lib64/libcublas.so

Why is it in the src directory???


maleadt commented Feb 24, 2020

OK, this is your cluster being set up in an unusual way. nvdisasm is picked up in /gpfs/runtime/opt/cuda/10.2/cuda/bin, so CUDAapi assumes /gpfs/runtime/opt/cuda/10.2/cuda is the toolkit root. However, the CUBLAS library is located in /gpfs/runtime/opt/cuda/10.2/src/lib64... I haven't seen such a src prefix anywhere, so it's not a common layout. If the admins want to structure it like that, there should at least be some configuration pointing towards it, e.g. adding that directory to the library search path (as was apparently done for CUDA 10.0 and 10.1).

@DilumAluthge (Author)

You have to load the different environments with "modules".

So, for example, when I run module load cuda/10.0.130, it sets the LD_LIBRARY_PATH environment variable to include the /gpfs/runtime/opt/cuda/10.0.130/cuda/lib64 directory.

So presumably the cluster admins need to do something similar for their cuda/10.2 module.
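For what it's worth, a quick way to confirm what each module actually sets (a sketch, assuming the cluster's standard module show command is available):

# Compare the environment changes made by the working and broken modules;
# the 10.2 module presumably never adds .../10.2/src/lib64 to LD_LIBRARY_PATH.
module show cuda/10.0.130
module show cuda/10.2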

@DilumAluthge (Author)

While I wait for them, is there any way I can point Julia directly to the location of libcublas.so?


maleadt commented Feb 24, 2020

Define LD_LIBRARY_PATH yourself? CUDAapi doesn't support per-library overrides, but picks up whatever is loadable out of the box. Alternatively, if that src directory contains the entire toolkit, you can set CUDA_ROOT to force the prefix to that directory.
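A minimal sketch of the LD_LIBRARY_PATH route, using the directory found earlier in this thread (specific to this cluster, and untested here):

# Prepend the directory that actually contains libcublas.so, then start Julia
# from the same shell so dlopen can resolve the library.
export LD_LIBRARY_PATH="/gpfs/runtime/opt/cuda/10.2/src/lib64:${LD_LIBRARY_PATH}"
julia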

@DilumAluthge (Author)

Alternatively, if that src directory contains the entire toolkit, you can set CUDA_ROOT to force the prefix to that directory.

Unfortunately it does not. It only contains the libcublas libraries and some libnvblas libraries.

All the other libraries (e.g. libcudart, libcurand, libcufft) are in /gpfs/runtime/opt/cuda/10.2/cuda/lib64.

Seems like maybe a botched install of CUDA by the cluster admins? It's unclear to me why you would move those specific libraries to a separate location.

@DilumAluthge (Author)

I'll probably just use their CUDA 10.0 or 10.1 install (since those seem to be installed correctly) until they fix it.
