
Opaque pointers support #511

Closed
pxl-th opened this issue Sep 11, 2023 · 5 comments
@pxl-th
Member

pxl-th commented Sep 11, 2023

ROCm 5.5+ uses LLVM 16 and opaque pointers, which leads to issues like:

julia> x = ROCArray{Float32}(undef, 16);

julia> fill!(x, 0f0)
error: Opaque pointers are only supported in -opaque-pointers mode (Producer: 'LLVM16.0.0git' Reader: 'LLVM 15.0.7jl')

And launching Julia (1.10 in this case) with JULIA_LLVM_ARGS="--opaque-pointers" results in:

julia> using AMDGPU

julia> x = ROCArray{Float32}(undef, 16);

julia> fill!(x, 0f0)
ERROR: Taking the type of an opaque pointer is illegal
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] eltype(typ::LLVM.PointerType)
    @ LLVM ~/.julia/packages/LLVM/lq6lJ/src/core/type.jl:167
  [3] classify_arguments(job::GPUCompiler.CompilerJob, codegen_ft::LLVM.FunctionType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/irgen.jl:384
  [4] macro expansion
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/irgen.jl:86 [inlined]
  [5] macro expansion
    @ GPUCompiler ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
  [6] irgen(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/irgen.jl:82
  [7] macro expansion
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:202 [inlined]
  [8] macro expansion
    @ GPUCompiler ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
  [9] macro expansion
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:201 [inlined]
 [10] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/utils.jl:89
 [11] emit_llvm
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/utils.jl:83 [inlined]
 [12] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:129
 [13] codegen
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:110 [inlined]
 [14] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:106
 [15] compile
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:98 [inlined]
 [16] #37
    @ GPUCompiler ~/.julia/dev/AMDGPU/src/compiler/codegen.jl:122 [inlined]
 [17] JuliaContext(f::AMDGPU.Compiler.var"#37#38"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/driver.jl:47
 [18] hipcompile(job::GPUCompiler.CompilerJob)
    @ AMDGPU.Compiler ~/.julia/dev/AMDGPU/src/compiler/codegen.jl:121
 [19] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/execution.jl:125
 [20] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/XwKB0/src/execution.jl:103
 [21] macro expansion
    @ AMDGPU.Compiler ~/.julia/dev/AMDGPU/src/compiler/codegen.jl:91 [inlined]
 [22] macro expansion
    @ AMDGPU.Compiler ./lock.jl:267 [inlined]
 [23] hipfunction(f::GPUArrays.var"#6#7", tt::Type{Tuple{AMDGPU.ROCKernelContext, AMDGPU.Device.ROCDeviceVector{Float32, 1}, Float32}}; kwargs::@Kwargs{name::Nothing})
    @ AMDGPU.Compiler ~/.julia/dev/AMDGPU/src/compiler/codegen.jl:85
 [24] hipfunction
    @ GPUArrays ~/.julia/dev/AMDGPU/src/compiler/codegen.jl:84 [inlined]
 [25] macro expansion
    @ GPUArrays ~/.julia/dev/AMDGPU/src/highlevel.jl:159 [inlined]
 [26] #gpu_call#58
    @ GPUArrays ~/.julia/dev/AMDGPU/src/gpuarrays.jl:9 [inlined]
 [27] gpu_call
    @ GPUArrays ~/.julia/dev/AMDGPU/src/gpuarrays.jl:5 [inlined]
 [28] gpu_call(::GPUArrays.var"#6#7", ::ROCArray{…}, ::Float32; target::ROCArray{…}, elements::Nothing, threads::Nothing, blocks::Nothing, name::Nothing)
    @ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/device/execution.jl:65
 [29] gpu_call
    @ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/device/execution.jl:34 [inlined]
 [30] fill!(A::ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, x::Float32)
    @ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/host/construction.jl:14
 [31] top-level scope
    @ REPL[3]:1

CC @jpsamaroo

@vchuravy
Member

Generally we won't be able to read LLVM IR produced by a newer version of LLVM. This is why I said earlier this summer that I believe, for a long-term solution, you will need to generate the bitcode archive with multiple LLVM versions.

You can use LLVM.supports_typed_pointers(ctx) == false to check whether a context uses opaque pointers, and run with JULIA_LLVM_ARGS="--opaque-pointers", but YMMV. We haven't turned opaque pointers on in Julia master yet, since we found performance regressions and no one has had time to investigate them.

I think the first step here would be to add a CI job that runs with --opaque-pointers, then go through the codebase and call eltype only conditionally.
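As a rough sketch of what that conditional use might look like (untested; the helper name and the opaque-pointer fallback are hypothetical, only LLVM.supports_typed_pointers and eltype(::LLVM.PointerType) are known LLVM.jl API):

```julia
using LLVM

# Hypothetical helper: with typed pointers we can still ask LLVM for the
# pointee type; in opaque-pointer mode that information is gone at the IR
# level, so it has to come from the Julia-level argument type instead.
function argument_eltype(ctx::LLVM.Context, typ::LLVM.LLVMType, julia_type::Type)
    if typ isa LLVM.PointerType && LLVM.supports_typed_pointers(ctx)
        return eltype(typ)  # legal only in typed-pointer mode
    else
        # Opaque-pointer mode: recover the element type from the Julia
        # signature instead of the (opaque) LLVM pointer type.
        return convert(LLVM.LLVMType, eltype(julia_type))
    end
end
```

The point being that every call site of eltype on an LLVM.PointerType (such as classify_arguments in GPUCompiler's irgen.jl from the stacktrace above) would need an equivalent branch.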

@gbaraldi
Contributor

LLVM.jl already tests with --opaque-pointers and it has the opaque pointer infrastructure in place, though I'm not sure we fixed everything downstream.

@pxl-th
Member Author

pxl-th commented Sep 15, 2023

Generally we won't be able to read LLVM IR produced by a newer version of LLVM.

I've tried a simpler case with a kernel without any arguments to avoid calling eltype:

f() = return
@roc f()

And it fails during linking of the device libraries with error: Unknown attribute kind (86) (Producer: 'LLVM16.0.0git' Reader: 'LLVM 15.0.7jl').
I take it this confirms it?

I was using Julia 1.9 (LLVM 14) with ROCm 5.4 (LLVM 15) without any issues, so I was hoping LLVM 15 & LLVM 16 would also work :/

This is why I said earlier this summer that I believe, for a long-term solution, you will need to generate the bitcode archive with multiple LLVM versions.

That might indeed help; I just had low motivation to build the ROCm libraries with BinaryBuilder again.
The only question is whether it will work for devices whose support was added in newer LLVM versions (and for features like FP16 atomics), since we would be building the device libraries with older LLVM versions.


Will Julia 1.11 use LLVM 16?
Currently the lack of ROCm 5.5+ support prevents us from supporting Navi 3 and Windows.

@vchuravy
Member

Will Julia 1.11 use LLVM 16?

Maybe? It heavily depends on the bandwidth folks have (I currently can't work on it), and @gbaraldi is busy with a lot of things.

@gbaraldi
Contributor

I would like that to be the case. To me, the most annoying part is getting the BinaryBuilder build working, and as always, 32-bit and Windows are the holdups. In the case of Windows I'm concerned, because we've hit the symbol cap and there doesn't seem to be a clear solution.
