Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dot product of Array and CuArray fails with CPU address error. #586

Closed
dangirsh opened this issue Dec 1, 2020 · 1 comment
Closed

Dot product of Array and CuArray fails with CPU address error. #586

dangirsh opened this issue Dec 1, 2020 · 1 comment

Comments

@dangirsh
Copy link

dangirsh commented Dec 1, 2020

Describe the bug

Dot products of CuArrays and Arrays fail with: ERROR: LoadError: ArgumentError: cannot take the CPU address of a CuArray{Float64,1}. Same with Float64. Same with CUDA.dot.

It's hard to tell if this is related / the same as an existing open issue.

To reproduce

dot(ones(Float64, 3), CUDA.ones(Float64, 3))
ERROR: ArgumentError: cannot take the CPU address of a CuArray{Float64, 1}
Stacktrace:
 [1] unsafe_convert(#unused#::Type{Ptr{Float64}}, x::CuArray{Float64, 1})
   @ CUDA ~/.julia/packages/CUDA/OOtNT/src/array.jl:211
 [2] dot(n::Int64, DX::Vector{Float64}, incx::Int64, DY::CuArray{Float64, 1}, incy::Int64)
   @ LinearAlgebra.BLAS ~/julia-master/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/blas.jl:394
 [3] dot(DX::Vector{Float64}, DY::CuArray{Float64, 1})
   @ LinearAlgebra.BLAS ~/julia-master/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/blas.jl:443
 [4] dot(x::Vector{Float64}, y::CuArray{Float64, 1})
   @ LinearAlgebra ~/julia-master/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:9
 [5] top-level scope
   @ REPL[8]:1

Additional context

The generic fallback is hit when the float types are different. This succeeds, unless CuArrays.allowscalar(false) is set:

dot(ones(Float64, 3), CUDA.ones(Float32, 3))
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`
└ @ GPUArrays ~/.julia/packages/GPUArrays/jhRU7/src/host/indexing.jl:43
3.0
@which dot(ones(Float32, 3), CUDA.ones(Float64, 3))
dot(x::AbstractArray, y::AbstractArray) in LinearAlgebra at /home/dangirsh/julia-master/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/generic.jl:906
@which dot(ones(Float64, 3), CUDA.ones(Float64, 3))
dot(...) where T<:Union{Float32, Float64} in LinearAlgebra at /home/dangirsh/julia-master/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/matmul.jl:9
Manifest.toml

# This file is machine-generated - editing it directly is not advised

[[AbstractFFTs]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "051c95d6836228d120f5f4b984dd5aba1624f716"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "0.5.0"

[[Adapt]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "42c42f2221906892ceb765dbcb1a51deeffd86d7"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "2.3.0"

[[ArgTools]]
uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"

[[Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"

[[BFloat16s]]
deps = ["LinearAlgebra", "Test"]
git-tree-sha1 = "4af69e205efc343068dc8722b8dfec1ade89254a"
uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b"
version = "0.1.0"

[[Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[CEnum]]
git-tree-sha1 = "215a9aa4a1f23fbd05b92769fdd62559488d70e9"
uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
version = "0.4.1"

[[CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "DataStructures", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "Libdl", "LinearAlgebra", "Logging", "MacroTools", "NNlib", "Pkg", "Printf", "Random", "Reexport", "Requires", "SparseArrays", "Statistics", "TimerOutputs"]
git-tree-sha1 = "7663b61782b569b03fba91d330a5ed2f86cd4cb8"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "2.3.0"

[[Compat]]
deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
git-tree-sha1 = "a706ff10f1cd8dab94f59fd09c0e657db8e77ff0"
uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
version = "3.23.0"

[[DataStructures]]
deps = ["Compat", "InteractiveUtils", "OrderedCollections"]
git-tree-sha1 = "fb0aa371da91c1ff9dc7fbed6122d3e411420b9c"
uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
version = "0.18.8"

[[Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[DelimitedFiles]]
deps = ["Mmap"]
uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"

[[Distributed]]
deps = ["Random", "Serialization", "Sockets"]
uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"

[[Downloads]]
deps = ["ArgTools", "LibCURL", "NetworkOptions"]
uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"

[[ExprTools]]
git-tree-sha1 = "10407a39b87f29d47ebaca8edbc75d7c302ff93e"
uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
version = "0.1.3"

[[GPUArrays]]
deps = ["AbstractFFTs", "Adapt", "LinearAlgebra", "Printf", "Random", "Serialization"]
git-tree-sha1 = "2c1dd57bca7ba0b3b4bf81d9332aeb81b154ef4c"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "6.1.2"

[[GPUCompiler]]
deps = ["DataStructures", "InteractiveUtils", "LLVM", "Libdl", "Scratch", "Serialization", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "c853c810b52a80f9aad79ab109207889e57f41ef"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.8.3"

[[InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

[[LLVM]]
deps = ["CEnum", "Libdl", "Printf", "Unicode"]
git-tree-sha1 = "000a737732aa4eb996414c4685368f6a74b41d14"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "3.4.0"

[[LibCURL]]
deps = ["LibCURL_jll", "MozillaCACerts_jll"]
uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"

[[LibCURL_jll]]
deps = ["Libdl"]
uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"

[[LibGit2]]
deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"

[[Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[LinearAlgebra]]
deps = ["Libdl"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[[Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"

[[MacroTools]]
deps = ["Markdown", "Random"]
git-tree-sha1 = "6a8a2a625ab0dea913aba95c11370589e0239ff0"
uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
version = "0.5.6"

[[Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

[[Mmap]]
uuid = "a63ad114-7e13-5084-954f-fe012c677804"

[[MozillaCACerts_jll]]
uuid = "14a3606d-f60d-562e-9121-12d972cd8159"

[[NNlib]]
deps = ["Compat", "Libdl", "LinearAlgebra", "Pkg", "Requires", "Statistics"]
git-tree-sha1 = "1ae42464fea5258fd2ff49f1c4a40fc41cba3860"
uuid = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
version = "0.7.7"

[[NetworkOptions]]
uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"

[[OrderedCollections]]
git-tree-sha1 = "cf59cfed2e2c12e8a2ff0a4f1e9b2cd8650da6db"
uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
version = "1.3.2"

[[Pkg]]
deps = ["Artifacts", "Dates", "Downloads", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs"]
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"

[[Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"

[[REPL]]
deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"

[[Random]]
deps = ["Serialization"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[[Reexport]]
deps = ["Pkg"]
git-tree-sha1 = "7b1d07f411bc8ddb7977ec7f377b97b158514fe0"
uuid = "189a3867-3050-52da-a836-e630ba90ab69"
version = "0.2.0"

[[Requires]]
deps = ["UUIDs"]
git-tree-sha1 = "e05c53ebc86933601d36212a93b39144a2733493"
uuid = "ae029012-a4dd-5104-9daa-d747884805df"
version = "1.1.1"

[[SHA]]
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"

[[Scratch]]
deps = ["Dates"]
git-tree-sha1 = "ad4b278adb62d185bbcb6864dc24959ab0627bf6"
uuid = "6c6a2e73-6563-6170-7368-637461726353"
version = "1.0.3"

[[Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[[SharedArrays]]
deps = ["Distributed", "Mmap", "Random", "Serialization"]
uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"

[[Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"

[[SparseArrays]]
deps = ["LinearAlgebra", "Random"]
uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"

[[Statistics]]
deps = ["LinearAlgebra", "SparseArrays"]
uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[[TOML]]
deps = ["Dates"]
uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"

[[Tar]]
deps = ["ArgTools", "SHA"]
uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"

[[Test]]
deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[TimerOutputs]]
deps = ["Printf"]
git-tree-sha1 = "3318281dd4121ecf9713ce1383b9ace7d7476fdd"
uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
version = "0.5.7"

[[UUIDs]]
deps = ["Random", "SHA"]
uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[[Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

Expected behavior

Return 3.0

Version info

Details on Julia:

Julia Version 1.6.0-DEV.1620
Commit 8c35003541 (2020-12-01 12:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.0 (ORCJIT, broadwell)

Details on CUDA:

CUDA toolkit 11.1.1, artifact installation
CUDA driver 11.1.0
NVIDIA driver 455.23.5

Libraries: 
- CUBLAS: 11.3.0
- CURAND: 10.2.2
- CUFFT: 10.3.0
- CUSOLVER: 11.0.1
- CUSPARSE: 11.3.0
- CUPTI: 14.0.0
- NVML: 11.0.0+455.23.5
- CUDNN: 8.0.4 (for CUDA 11.1.0)
- CUTENSOR: 1.2.1 (for CUDA 11.1.0)

Toolchain:
- Julia: 1.6.0-DEV.1620
- LLVM: 11.0.0
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
- Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

9 devices:
  0: Tesla V100-PCIE-32GB (sm_70, 30.915 GiB / 31.749 GiB available)
  1: Tesla V100-PCIE-32GB (sm_70, 31.721 GiB / 31.749 GiB available)
  2: Tesla V100-PCIE-32GB (sm_70, 31.343 GiB / 31.749 GiB available)
  3: Tesla V100-PCIE-32GB (sm_70, 4.546 GiB / 31.749 GiB available)
  4: Tesla V100-PCIE-16GB (sm_70, 15.778 GiB / 15.782 GiB available)
  5: Tesla P100-PCIE-16GB (sm_60, 15.431 GiB / 15.899 GiB available)
  6: Tesla P100-PCIE-16GB (sm_60, 15.897 GiB / 15.899 GiB available)
  7: GeForce GTX 1080 Ti (sm_61, 10.913 GiB / 10.917 GiB available)
  8: GeForce GTX 1080 Ti (sm_61, 10.913 GiB / 10.917 GiB available)
@dangirsh dangirsh added the bug Something isn't working label Dec 1, 2020
@maleadt
Copy link
Member

maleadt commented Dec 1, 2020

This is expected behavior. You're dispatching to the CPU BLAS implementation, which attempts to take a CPU pointer of its arguments. The fact that this differs depending on the types is a property of the Base implementation. Mixing array types is generally not supported.

@maleadt maleadt closed this as completed Dec 1, 2020
@maleadt maleadt removed the bug Something isn't working label Dec 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants