Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing implementation of right multiply for QR decomposition #1738

Closed
evelyne-ringoot opened this issue Jan 20, 2023 · 1 comment · Fixed by #1739
Closed

Missing implementation of right multiply for QR decomposition #1738

evelyne-ringoot opened this issue Jan 20, 2023 · 1 comment · Fixed by #1739
Labels
bug Something isn't working cuda libraries Stuff about CUDA library wrappers.

Comments

@evelyne-ringoot
Copy link
Contributor

evelyne-ringoot commented Jan 20, 2023

Hi, There appears to be missing an implementation of right multiply for QR decomposition.

using LinearAlgebra, CUDA
a=CUDA.randn(3,3)
b=CUDA.randn(3,3)
q=qr(a).Q
q * b # works fine
b * q #results in an error

It results in this error:

ERROR: ArgumentError: cannot take the CPU address of a CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
Stacktrace:
 [1] unsafe_convert(#unused#::Type{Ptr{Float32}}, x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
 [2] ormqr!(side::Char, trans::Char, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, tau::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
 [3] rmul!(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
 [4] *(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Q::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})

Version info

Julia Version 1.8.0
CUDA toolkit 11.7, artifact installation
NVIDIA driver 516.54.0, for CUDA 11.7
CUDA driver 11.7

Libraries:
- CUBLAS: 11.10.1
- CURAND: 10.2.10
- CUFFT: 10.7.1
- CUSOLVER: 11.3.5
- CUSPARSE: 11.7.3
- CUPTI: 17.0.0
- NVML: 11.0.0+516.54
- CUDNN: 8.30.2 (for CUDA 11.5.0)
- CUTENSOR: 1.4.0 (for CUDA 11.5.0)

Toolchain:
- Julia: 1.8.0
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce RTX 3050 Laptop GPU (sm_86, 3.370 GiB / 4.000 GiB available)
@evelyne-ringoot evelyne-ringoot added the bug Something isn't working label Jan 20, 2023
@maleadt
Copy link
Member

maleadt commented Jan 20, 2023

Adding this seems easy enough, #1739, but this isn't my domain so I'm not sure how to test this (e.g. what are the properties of x * qr(y).Q). We already have a bunch of qr tests, maybe you can suggest how to cover the added rmul! functionality?

@testset "qr" begin
tol = min(m, n)*eps(real(elty))*(1 + (elty <: Complex))
A = rand(elty, m, n)
qra = qr(A)
d_A = CuArray(A)
d_F = qr(d_A)
d_RR = d_F.Q'*d_A
@test d_RR[1:n,:] d_F.R atol=tol*norm(A)
@test norm(d_RR[n+1:end,:]) < tol*norm(A)
@test size(d_F) == size(A)
@test size(d_F.Q, 1) == size(A, 1)
@test det(d_F.Q) det(collect(d_F.Q * CuMatrix{elty}(I, size(d_F.Q)))) atol=tol*norm(A)
CUDA.@allowscalar begin
qval = d_F.Q[1, 1]
@test qval qra.Q[1, 1]
qrstr = sprint(show, MIME"text/plain"(), d_F)
if VERSION >= v"1.8-"
@test qrstr == "$(typeof(d_F))\nQ factor:\n$(sprint(show, MIME"text/plain"(), d_F.Q))\nR factor:\n$(sprint(show, MIME"text/plain"(), d_F.R))"
else
@test qrstr == "$(typeof(d_F)) with factors Q and R:\n$(sprint(show, d_F.Q))\n$(sprint(show, d_F.R))"
end
end
dQ, dR = d_F
@test collect(dQ*dR) A
A = rand(elty, n, m)
d_A = CuArray(A)
d_F = qr(d_A)
@test d_F.Q'*d_A d_F.R atol=tol*norm(A)
@test det(d_F.Q) det(collect(d_F.Q * CuMatrix{elty}(I, size(d_F.Q)))) atol=tol*norm(A)
A = rand(elty, m, n)
d_A = CuArray(A)
h_q, h_r = qr(d_A)
q, r = qr(A)
@test Array(h_q) Array(q)
@test collect(CuArray(h_q)) Array(q)
@test Array(h_r) Array(r)
@test CuArray(h_q) convert(typeof(d_A), h_q)
A = rand(elty, n, m)
d_A = CuArray(A)
h_q, h_r = qr(d_A) # FixMe! Use iteration protocol when implemented
q, r = qr(A)
@test Array(h_q) Array(q)
@test Array(h_r) Array(r)
A = rand(elty, n) # A and B are vectors
d_A = CuArray(A)
M = qr(A)
h_M = qr(d_A)
B = rand(elty, n)
d_B = CuArray(B)
@test Array(M \ B) Array(h_M \ d_B)
A = rand(elty, m, n) # A is a matrix and B is a vector
d_A = CuArray(A)
M = qr(A)
h_M = qr(d_A)
B = rand(elty, m)
d_B = CuArray(B)
@test Array(M \ B) Array(h_M \ d_B)
A = rand(elty, m, n) # A and B are matrices
d_A = CuArray(A)
M = qr(A)
h_M = qr(d_A)
B = rand(elty, m, n)
d_B = CuArray(B)
@test Array(M \ B) Array(h_M \ d_B)
end

@kshyatt kshyatt added the cuda libraries Stuff about CUDA library wrappers. label Jan 22, 2023
evelyne-ringoot added a commit to evelyne-ringoot/CUDA.jl that referenced this issue Jan 24, 2023
maleadt pushed a commit that referenced this issue Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cuda libraries Stuff about CUDA library wrappers.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants