Impossible to cast a float into an integer #441

Closed
albop opened this issue May 23, 2024 · 4 comments
albop commented May 23, 2024

It currently seems impossible (to me, at least) to convert a float to an integer inside a oneAPI kernel.
I tried several functions, such as round(Int, ...) and trunc(Int, ...), and also convert(Int, round(...)), but I always get a compilation error.

Here is a self contained example with the corresponding error log.

using oneAPI
using KernelAbstractions
using Adapt
const KA = KernelAbstractions

@kernel function round_kernel(A,B)
    I = @index(Global)
    # rounding operation
    A[I] = trunc(Int,B[I])
end

A = zeros(Int8, 1024, 1024)
B = rand(Float32, 1024, 1024)*10
A_gpu = adapt(oneArray, A)
B_gpu = adapt(oneArray, B)
dev = get_backend(A_gpu)
round_kernel(dev, 64)(A_gpu, B_gpu, ndrange=size(A_gpu))
KA.synchronize(dev)
ERROR: InvalidIRError: compiling MethodInstance for gpu_round_kernel(::KernelAbstractions.CompilerMetadata{…}, ::oneDeviceMatrix{…}, ::oneDeviceMatrix{…}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
 [1] malloc
   @ ~/.julia/packages/GPUCompiler/kqxyC/src/runtime.jl:88
 [2] macro expansion
   @ ~/.julia/packages/GPUCompiler/kqxyC/src/runtime.jl:183
 [3] macro expansion
   @ ./none:0
 [4] box
  @ ./none:0
 [5] box_float32
   @ ~/.julia/packages/GPUCompiler/kqxyC/src/runtime.jl:212
 [6] trunc
   @ ./float.jl:905
 [7] macro expansion
   @ ~/Econforge/Dolo.jl/misc/test_oneapi.jl:173
 [8] gpu_round_kernel
   @ ~/.julia/packages/KernelAbstractions/sZvJo/src/macros.jl:95
 [9] gpu_round_kernel
   @ ./none:0
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.SPIRVCompilerTarget, oneAPI.oneAPICompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/validation.jl:147
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:445 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/Lw5SP/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:444 [inlined]
  [5] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/utils.jl:92
  [6] emit_llvm
    @ ~/.julia/packages/GPUCompiler/kqxyC/src/utils.jl:86 [inlined]
  [7] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:134
  [8] codegen
    @ ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:115 [inlined]
  [9] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:111
 [10] compile
    @ ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:103 [inlined]
 [11] #58
    @ ~/.julia/dev/oneAPI/src/compiler/compilation.jl:81 [inlined]
 [12] JuliaContext(f::oneAPI.var"#58#59"{GPUCompiler.CompilerJob{…}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:52
 [13] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:42
 [14] compile(job::GPUCompiler.CompilerJob)
    @ oneAPI ~/.julia/dev/oneAPI/src/compiler/compilation.jl:80
 [15] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(oneAPI.compile), linker::typeof(oneAPI.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/execution.jl:128
 [16] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kqxyC/src/execution.jl:103
 [17] macro expansion
    @ ~/.julia/dev/oneAPI/src/compiler/execution.jl:203 [inlined]
 [18] macro expansion
    @ ./lock.jl:267 [inlined]
 [19] zefunction(f::typeof(gpu_round_kernel), tt::Type{Tuple{…}}; kwargs::@Kwargs{})
    @ oneAPI ~/.julia/dev/oneAPI/src/compiler/execution.jl:198
 [20] zefunction(f::typeof(gpu_round_kernel), tt::Type{Tuple{…}})
    @ oneAPI ~/.julia/dev/oneAPI/src/compiler/execution.jl:195
 [21] macro expansion
    @ ~/.julia/dev/oneAPI/src/compiler/execution.jl:66 [inlined]
 [22] (::KernelAbstractions.Kernel{…})(::oneArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
    @ oneAPI.oneAPIKernels ~/.julia/dev/oneAPI/src/oneAPIKernels.jl:89
 [23] top-level scope
    @ ~/Econforge/Dolo.jl/misc/test_oneapi.jl:182
Some type information was truncated. Use `show(err)` to see complete types.
albop commented May 23, 2024

Actually, I have just found a way:

@kernel function round_kernel(A,B)
  I = @index(Global)
  v = trunc(B[I])
  A[I] = (v::Int8)
end

Not sure what to do with this issue. I guess the GPU definitions of trunc, round, and ceil could potentially be enhanced.

albop commented May 23, 2024

The comment above is wrong: the kernel above doesn't raise an exception, but it doesn't do anything either. I guess the instruction v::Int8 is just a type assertion, which silently invalidates the whole kernel.
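For reference, the same construct on the CPU shows why the assertion cannot succeed: trunc without a target type keeps the floating-point type, so ::Int8 is a failing type assertion, not a conversion. A minimal host-side sketch (plain Julia, not GPU code):

```julia
v = trunc(3.7f0)   # 3.0f0 — trunc without a target type stays Float32
v isa Float32      # true; v is never an Int8
# (v::Int8) would therefore throw a TypeError on the CPU instead of
# converting; Int8(v) or trunc(Int8, v) is what performs a conversion.
```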

maleadt commented May 23, 2024

This is a known limitation, tracked in #65.
In your case, use unsafe_trunc instead, which doesn't generate an exception (and thus doesn't require an allocation). It's unfortunate, but adding overlay definitions for trunc would be much too invasive, so time is better spent on a device-side allocator to fix #65.
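Applied to the reproducer above, that suggestion would look like the following sketch (untested here; it assumes the same Int8 output array as in the original report, and note that unsafe_trunc gives an unspecified result for values outside the Int8 range):

```julia
@kernel function round_kernel(A, B)
    I = @index(Global)
    # unsafe_trunc skips the InexactError domain check, so no exception
    # machinery (and no GPU-side allocation) is required.
    A[I] = unsafe_trunc(Int8, B[I])
end
```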

maleadt closed this as completed May 23, 2024
albop commented May 23, 2024 via email
