
Make Ref pass by-reference #267

Closed
maleadt opened this issue Jul 2, 2020 · 5 comments · Fixed by #2109
Labels: cuda kernels (Stuff about writing CUDA kernels), enhancement (New feature or request), speculative (Not sure about this one yet)

Comments

maleadt (Member) commented Jul 2, 2020

julia> a = CUDA.rand(1)
julia> kernel(a) = (@inbounds a[1] = 0; nothing)
kernel (generic function with 1 method)

julia> @device_code_ptx @cuda kernel(a)
// PTX CompilerJob of kernel kernel(CuDeviceArray{Float32,1,CUDA.AS.Global}) for sm_75

//
// Generated by LLVM NVPTX Back-End
//

.version 6.3
.target sm_75
.address_size 64

        // .globl       _Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE // -- Begin function _Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE
.weak .global .align 8 .u64 exception_flag;
                                        // @_Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE
.visible .entry _Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE(
        .param .align 8 .b8 _Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE_param_0[16]
)
{
        .reg .b32       %r<2>;
        .reg .b64       %rd<3>;

// %bb.0:                               // %top
        mov.b64         %rd1, _Z17julia_kernel_435613CuDeviceArrayI7Float32Li1E6GlobalE_param_0;
        ld.param.u64    %rd2, [%rd1+8];
        mov.u32         %r1, 0;
        st.global.u32   [%rd2], %r1;
        ret;
                                        // -- End function
}
julia> kernel(a) = (@inbounds a[][1] = 1; nothing)
kernel (generic function with 1 method)

julia> @device_code_ptx @cuda kernel(Ref(a))
// PTX CompilerJob of kernel kernel(CUDA.CuRefValue{CuDeviceArray{Float32,1,CUDA.AS.Global}}) for sm_75

//
// Generated by LLVM NVPTX Back-End
//

.version 6.3
.target sm_75
.address_size 64

        // .globl       _Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE // -- Begin function _Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE
.weak .global .align 8 .u64 exception_flag;
                                        // @_Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE
.visible .entry _Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE(
        .param .align 8 .b8 _Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE_param_0[16]
)
{
        .reg .b32       %r<2>;
        .reg .b64       %rd<3>;

// %bb.0:                               // %top
        mov.b64         %rd1, _Z17julia_kernel_433710CuRefValueI13CuDeviceArrayI7Float32Li1E6GlobalEE_param_0;
        ld.param.u64    %rd2, [%rd1+8];
        mov.u32         %r1, 1065353216;
        st.global.u32   [%rd2], %r1;
        ret;
                                        // -- End function
}

As the PTX above shows, a Ref argument is currently passed by value in the kernel's parameter space, just like the plain CuDeviceArray. Making Ref pass by-reference instead could be useful to work around parameter state space size restrictions like CliMA/Oceananigans.jl#746, but it might pessimize broadcast operations, where Ref is commonly used.
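
For context, this is the kind of broadcast idiom that could be affected (the names below are illustrative, not from this issue): Ref is commonly used to shield an argument from being broadcast over, and today the wrapped value simply ends up by value in the kernel's parameter space.

using CUDA

# Illustrative sketch: Ref(params) keeps the NamedTuple from being broadcast
# over; CUDA.jl converts it to a CuRefValue, which is currently passed by value.
scale(x, p) = p.a * x + p.b
xs = CUDA.rand(1024)
params = (a = 2.0f0, b = 1.0f0)
ys = scale.(xs, Ref(params))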

jagoosw commented Aug 17, 2022

Hi @maleadt,

I've come across an issue similar to CliMA/Oceananigans.jl#746 while trying to build a biogeochemical model on top of Oceananigans. I was wondering if you have any updated advice on solving it, or whether the suggestion above would now work as a workaround?

Thanks,
Jago

maleadt (Member, Author) commented Aug 17, 2022

Solving which issue?

If you want pass-by-reference behavior, use an array instead of a Ref for now.

jagoosw commented Aug 17, 2022

Sorry, basically my issue is that I'm trying to pass a massive parameter, like in the Oceananigans issue, so I think pass-by-reference behaviour is what I'm after.

What do you mean by using an array?

maleadt (Member, Author) commented Aug 17, 2022

Like the example here, use a single-element array.
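
For illustration, a minimal sketch of that single-element-array workaround (the names and sizes below are hypothetical, not from this thread): stash the large isbits parameter in a one-element CuArray so that only the small CuDeviceArray wrapper goes through the kernel's parameter space, and load the value inside the kernel.

using CUDA

# Hypothetical example: a large isbits parameter bundle that would not fit
# comfortably in the kernel parameter space.
big_params = ntuple(_ -> 1.0f0, 512)

# Box it in a one-element device array; the kernel only receives the array
# wrapper (a pointer plus size information), not the value itself.
params_box = CuArray([big_params])

function kernel(box)
    params = @inbounds box[1]   # load the parameters from global memory
    # ... use params here ...
    return nothing
end

@cuda kernel(params_box)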

jagoosw commented Aug 17, 2022

I see, thank you
