Description
Hello everyone!
We are trying to use KernelAbstractions.jl to parallelise some nested integrals inside a Julia code (GaPSE.jl, for those who are interested).
We encountered, however, an issue we are not able to solve: we need to pass a complicated struct into the kernel to do the computations, but we get stuck at the error `Argument XXX to your kernel function is of type YYY, which is not isbits`. The exception is caused by a `Vector{Float64}` field, which is not a bits type (even if I don't get why, because we don't use pointers and it is made of primitive, concrete `Float64` values).
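From what we could find, the reason seems to be that a Julia `Vector` is a mutable, heap-allocated object that internally stores a pointer to its buffer, so it fails `isbitstype` even when its elements are plain bits; this can be checked in plain Julia, no GPU involved:

```julia
julia> struct S
           b::Vector{Float64}
       end

julia> isbitstype(Float64)          # the elements are plain bits...
true

julia> isbitstype(Vector{Float64})  # ...but the vector itself holds a heap pointer
false

julia> isbitstype(S)                # so any struct wrapping one is not isbits either
false
```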
We created a minimal working example with oneAPI of what we are doing; I put it at the bottom of this post.
The same behaviour can be reproduced with Metal after the replacements `oneAPI.oneAPIBackend()` -> `Metal.MetalBackend()` and `Float64` -> `Float32` (because Metal does not support doubles).
The part which bugs us the most: it IS possible to pass a `Vector{Float64}`, but only if:
- it is first transformed into the GPU array type (e.g. `oneArray([1 2 3])`);
- that array is then given as input DIRECTLY to the kernel (see the sketch just below).
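For concreteness, here is a sketch of the variant that does compile (same kernel body as the MWE below). Our understanding, which may well be wrong, is that the backend adapts top-level array arguments to their device-side counterparts at launch, but does not look inside a plain struct:

```julia
using KernelAbstractions
using oneAPI

@kernel function direct_kernel(A, b)
    I = @index(Global)
    A[I] = 2 * A[I] + b[1]
end

backend = oneAPI.oneAPIBackend()
A = KernelAbstractions.ones(backend, Float64, 1024, 1024)
b = oneArray([2.0, 3.0])  # GPU array passed directly as an argument: this compiles
direct_kernel(backend, 64)(A, b, ndrange=size(A))
KernelAbstractions.synchronize(backend)
```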
My question is: are we doing something wrong/missing something, or is it really not possible to pass structs containing Vectors to the kernel?
Some notes:
- the input struct can be read-only, we don't want to modify it;
- we cannot use StaticArrays, because we only know the sizes at runtime, not at compile time;
- we tried to replace `Vector{Float64}` with `oneArray{Float64}`: no change;
- we tried to add `@Const` to the struct: no change;
- it would be very complicated and impractical for us to unpack the whole struct (have a look at `GaPSE.Cosmology` if you need). We thought about creating another struct from this one with the obvious non-bits types removed (e.g. `String`s), but we absolutely need to bring the vectors along (`GaPSE.MySpline` is just another struct containing `Vector{Float64}`). A sketch of the kind of Adapt-based pattern we are wondering about follows this list.
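Part of what we are asking is whether the intended pattern is an Adapt.jl rule on a parametric struct, so that the `Vector` field can be swapped for its device-side type at launch. A minimal sketch of what we mean, assuming Adapt.jl's documented `@adapt_structure` macro (`MyParamStruct` is just an illustrative name):

```julia
using Adapt

# Parametric, so the same struct type can hold either a host Vector
# or, after adaption, an isbits device array.
struct MyParamStruct{T,V<:AbstractVector{T}}
    a::T
    b::V
end

# Defines adapt_structure(to, s) = MyParamStruct(adapt(to, s.a), adapt(to, s.b)),
# letting the backend recurse into the struct at kernel launch.
Adapt.@adapt_structure MyParamStruct
```

With something like that, a struct built on device arrays, e.g. `MyParamStruct(1.0, oneArray([2.0, 3.0]))`, would, if we understand correctly, be converted field-by-field at launch into an isbits object holding a `oneDeviceVector`. If this is indeed the intended route, a pointer to where it is documented would already help a lot.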
Thank you in advance for the help; KernelAbstractions.jl is a great idea, and we would really like to make it work.
Here is the MWE:

```julia
using KernelAbstractions
using oneAPI
struct MyStruct
a::Float64
b::Vector{Float64}
end
MS = MyStruct(1.0, [2.0, 3.0])
@kernel function my_kernel(A, MS)
I = @index(Global)
A[I] = 2 * A[I] + MS.b[1]
end
#backend = CPU() # This works
backend = oneAPI.oneAPIBackend() # This doesn't
A = KernelAbstractions.ones(backend, Float64, 1024, 1024)
ev = my_kernel(backend, 64)(A, MS, ndrange=size(A))
KernelAbstractions.synchronize(backend)
println( "Test: ", all(A .== 4.0) ? "Passed" : "Failed" )which gives:
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.7 (2024-11-26)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> include("./KernelAbstraction_struct.jl") # with "backend = CPU()"
Test: Passed
julia> include("./KernelAbstraction_struct.jl") # with "backend = oneAPI.oneAPIBackend()"
ERROR: LoadError: GPU compilation of MethodInstance for gpu_my_kernel(::KernelAbstractions.CompilerMetadata{…}, ::oneDeviceMatrix{…}, ::MyStruct) failed
KernelError: passing and using non-bitstype argument
Argument 4 to your kernel function is of type MyStruct, which is not isbits:
.b is of type Vector{Float64} which is not isbits.
Stacktrace:
[1] check_invocation(job::GPUCompiler.CompilerJob)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/validation.jl:92
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:128 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/Lw5SP/src/TimerOutput.jl:253 [inlined]
[4] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:126
[5] codegen
@ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:115 [inlined]
[6] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:111
[7] compile
@ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:103 [inlined]
[8] #58
@ ~/.julia/packages/oneAPI/1GTs3/src/compiler/compilation.jl:81 [inlined]
[9] JuliaContext(f::oneAPI.var"#58#59"{GPUCompiler.CompilerJob{GPUCompiler.SPIRVCompilerTarget, oneAPI.oneAPICompilerParams}}; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:52
[10] JuliaContext(f::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:42
[11] compile(job::GPUCompiler.CompilerJob)
@ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/compilation.jl:80
[12] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(oneAPI.compile), linker::typeof(oneAPI.link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/execution.jl:128
[13] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/execution.jl:103
[14] macro expansion
@ ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:203 [inlined]
[15] macro expansion
@ ./lock.jl:267 [inlined]
[16] zefunction(f::typeof(gpu_my_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, oneDeviceMatrix{…}, MyStruct}}; kwargs::@Kwargs{})
@ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:198
[17] zefunction(f::typeof(gpu_my_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, oneDeviceMatrix{…}, MyStruct}})
@ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:195
[18] macro expansion
@ ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:66 [inlined]
[19] (::KernelAbstractions.Kernel{…})(::oneArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
@ oneAPI.oneAPIKernels ~/.julia/packages/oneAPI/1GTs3/src/oneAPIKernels.jl:89
```