Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unsafe_copy3d!: srcPos and dstPos handling #27

Closed
samo-lin opened this issue May 25, 2020 · 5 comments · Fixed by #197
Closed

unsafe_copy3d!: srcPos and dstPos handling #27

samo-lin opened this issue May 25, 2020 · 5 comments · Fixed by #197
Labels
bug Something isn't working cuda libraries Stuff about CUDA library wrappers.

Comments

@samo-lin
Copy link
Contributor

The srcPos and dstPos taken as arguments by unsafe_copy3d! are modified as follows before being passed further to the underlying CUDA function:

        # source
        srcPos.x-1, srcPos.y-1, srcPos.z-1,
        # destination
        dstPos.x-1, dstPos.y-1, dstPos.z-1,

However, srcPos.x and dstPos.x should be multiplied by sizeof(T) in addition as showed my debugging.

Here is a successful copy of a YZ plane with the unmodified version of CUDAdrv (I had to set srcPos=((2-1)*sizeof(A[1])+1, 1, 1) instead of srcPos=(2, 1, 1) as the multiplication by sizeof(T) is missing in unsafe_copy3d!):

$> julia

julia> using CUDAdrv, CUDAnative, CuArrays, Test


julia> nx, ny, nz = 4, 4, 4
(4, 4, 4)

julia> custream = CuStream(flags=CUDAdrv.STREAM_NON_BLOCKING);

julia> ############
       ## Y-Z-plane

       A = zeros(Float64, ny, nz);

julia> B = CuArrays.rand(Float64, nx, ny, nz);

julia> # Copy from device to host.
       Mem.unsafe_copy3d!(
           pointer(A), Mem.Host, pointer(B), Mem.Device,
           1, ny, nz; 
           srcPos=(1*sizeof(A[1])+1,1,1),           # srcPos=(2,1,1), 
           srcPitch=nx*sizeof(A[1]), srcHeight=ny,
           dstPitch=1*sizeof(A[1]), dstHeight=ny,
           async=true, stream=custream
       )

julia> CUDAdrv.synchronize();

julia> # [inserted] Check copy
       @show A
A = [0.22951299386838836 0.734574221326824 0.05751589824146003 0.522626039000645; 0.4430254511045066 0.7971512170110366 0.7708275111938903 0.5534828193805892; 0.8789944224363411 0.6523963055855746 0.3520796694846117 0.9520430524852794; 0.5267335785181211 0.8813394638451009 0.20243530919636582 0.8570615213018375]
4×4 Array{Float64,2}:
 0.229513  0.734574  0.0575159  0.522626
 0.443025  0.797151  0.770828   0.553483
 0.878994  0.652396  0.35208    0.952043
 0.526734  0.881339  0.202435   0.857062

julia> @test all(B[2,:,:] .== CuArray(A))
Test Passed

julia> # [inserted] Set a different value for A
       A .= 7;

julia> # Copy from host to device.
       Mem.unsafe_copy3d!(
           pointer(B), Mem.Device, pointer(A), Mem.Host,
           1, ny, nz; 
           dstPos=(1*sizeof(A[1])+1,1,1),          # dstPos=(2,1,1),
           srcPitch=1*sizeof(A[1]), srcHeight=ny,
           dstPitch=nx*sizeof(A[1]), dstHeight=ny,
           async=true, stream=custream
       )

julia> CUDAdrv.synchronize();

julia> # [inserted] Check copy
       @show B[2,:,:]
B[2, :, :] = [7.0 7.0 7.0 7.0; 7.0 7.0 7.0 7.0; 7.0 7.0 7.0 7.0; 7.0 7.0 7.0 7.0]
4×4 CuArray{Float64,2,Nothing}:
 7.0  7.0  7.0  7.0
 7.0  7.0  7.0  7.0
 7.0  7.0  7.0  7.0
 7.0  7.0  7.0  7.0

julia> @test all(B[2,:,:] .== CuArray(A))
Test Passed

Besides, the test for copying a x-z plane in test/memory.jl does not seem quite correct even though it does not fail. But, I am not sure to fully understand its intention.

@samo-lin
Copy link
Contributor Author

@maleadt, I hope I was clear above: all that needs to be done in the code is to
replace

# source
srcPos.x-1, srcPos.y-1, srcPos.z-1,

with

# source
(srcPos.x-1)*sizeof(T), srcPos.y-1, srcPos.z-1,

and

# destination
dstPos.x-1, dstPos.y-1, dstPos.z-1,

with

# destination
(dstPos.x-1)*sizeof(T), dstPos.y-1, dstPos.z-1,

Then, maybe it is worthwhile to adapt also the tests.

Besides that everything works as it should...

Thanks!!

@maleadt maleadt transferred this issue from JuliaGPU/CUDAdrv.jl May 27, 2020
@maleadt maleadt added bug Something isn't working cuda libraries Stuff about CUDA library wrappers. labels May 27, 2020
@samo-lin
Copy link
Contributor Author

@maleadt , do I assume right that you will not release any new independent CUDAdrv.jl version, but just CUDA.jl instead?

@maleadt
Copy link
Member

maleadt commented May 27, 2020

@samo-lin
Copy link
Contributor Author

Thanks for the info. I will try CUDA.jl as soon a I get to it.

@samo-lin
Copy link
Contributor Author

samo-lin commented Jun 3, 2020

@maleadt, I will do the PR, I hope you can then merge and tag it quickly as it is the last missing thing to conclude our update to CUDA.jl - thanks!

bors bot added a commit that referenced this issue Jun 3, 2020
197: [bugfix] Handling of srcPos and dstPos in unsafe_copy3d! r=maleadt a=samo-lin

Closes #27 

Co-authored-by: samo-lin <omlin.samuel@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cuda libraries Stuff about CUDA library wrappers.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants