Running example on AMD GPU fails #162
In a clean environment, it gives this somewhat similar message: ERROR: LoadError: GPUCompiler.InvalidIRError(GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}(MethodInstance for (::WaterLily.var"#gpu_##kern_#545#223"{WaterLily.var"#fill!#220"{2, Float32, Int64, AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}}})(::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{2, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(64, 1)}, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, Nothing}}, ::AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, ::AMDGPU.Device.ROCDeviceArray{Float32, 4, 1}, ::AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, ::AMDGPU.Device.ROCDeviceMatrix{Float32, 1}, ::CartesianIndex{2}), GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}(GPUCompiler.GCNCompilerTarget("gfx1030", "+wavefrontsize32,-wavefrontsize64"), AMDGPU.Compiler.HIPCompilerParams(false, true), true, nothing, :specfunc, true, 2), 0x0000000000007bc2), Tuple{String, Vector{Base.StackTraces.StackFrame}, Any}[("dynamic function invocation", [_cutdim at multidimensional.jl:863, to_indices at indices.jl:355, to_indices at indices.jl:344, view at subarray.jl:183, maybeview at views.jl:148, dotview at broadcast.jl:1244, fill! at Body.jl:42, macro expansion at util.jl:111, gpu_ at macros.jl:95, gpu_ at none:0], getindex(t::Tuple, i::Int64) @ Base tuple.jl:31)])
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/validation.jl:147
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:458 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/Lw5SP/src/TimerOutput.jl:253 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:457 [inlined]
[5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:103
[6] emit_llvm
@ ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:97 [inlined]
[7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:136
[8] codegen
@ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:115 [inlined]
[9] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:111
[10] compile
@ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:103 [inlined]
[11] #40
@ ~/.julia/packages/AMDGPU/a1v0k/src/compiler/codegen.jl:194 [inlined]
[12] JuliaContext(f::AMDGPU.Compiler.var"#40#41"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}}; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:52
[13] JuliaContext(f::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:42
[14] hipcompile(job::GPUCompiler.CompilerJob)
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/a1v0k/src/compiler/codegen.jl:193
[15] actual_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/execution.jl:237
[16] cached_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/execution.jl:151
[17] macro expansion
@ ~/.julia/packages/AMDGPU/a1v0k/src/compiler/codegen.jl:161 [inlined]
[18] macro expansion
@ ./lock.jl:267 [inlined]
[19] hipfunction(f::WaterLily.var"#gpu_##kern_#545#223"{WaterLily.var"#fill!#220"{2, Float32, Int64, AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}}}, tt::Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{2, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(64, 1)}, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, Nothing}}, AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, AMDGPU.Device.ROCDeviceArray{Float32, 4, 1}, AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, AMDGPU.Device.ROCDeviceMatrix{Float32, 1}, CartesianIndex{2}}}; kwargs::@Kwargs{})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/a1v0k/src/compiler/codegen.jl:155
[20] hipfunction(f::WaterLily.var"#gpu_##kern_#545#223"{WaterLily.var"#fill!#220"{2, Float32, Int64, AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}}}, tt::Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{2, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(64, 1)}, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, Nothing}}, AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, AMDGPU.Device.ROCDeviceArray{Float32, 4, 1}, AMDGPU.Device.ROCDeviceArray{Float32, 3, 1}, AMDGPU.Device.ROCDeviceMatrix{Float32, 1}, CartesianIndex{2}}})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/a1v0k/src/compiler/codegen.jl:154
[21] macro expansion
@ ~/.julia/packages/AMDGPU/a1v0k/src/highlevel.jl:172 [inlined]
[22] (::KernelAbstractions.Kernel{AMDGPU.ROCKernels.ROCBackend, KernelAbstractions.NDIteration.StaticSize{(64,)}, KernelAbstractions.NDIteration.DynamicSize, WaterLily.var"#gpu_##kern_#545#223"{WaterLily.var"#fill!#220"{2, Float32, Int64, AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}}}})(::AMDGPU.ROCArray{Float32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, ::Vararg{Any}; ndrange::Tuple{Int64, Int64}, workgroupsize::Nothing)
@ AMDGPU.ROCKernels ~/.julia/packages/AMDGPU/a1v0k/src/ROCKernels.jl:86
[23] (::WaterLily.var"##kern#544#221"{WaterLily.var"#fill!#220"{2, Float32, Int64, AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}}, Flow{2, Float32, AMDGPU.ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, AMDGPU.ROCArray{Float32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, AMDGPU.ROCArray{Float32, 4, AMDGPU.Runtime.Mem.HIPBuffer}}})(μ₀::AMDGPU.ROCArray{Float32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, μ₁::AMDGPU.ROCArray{Float32, 4, AMDGPU.Runtime.Mem.HIPBuffer}, V::AMDGPU.ROCArray{Float32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, σ::AMDGPU.ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, ::Val{12})
@ WaterLily ~/.julia/packages/WaterLily/raEyO/src/util.jl:114
[24] macro expansion
@ ~/.julia/packages/WaterLily/raEyO/src/util.jl:116 [inlined]
[25] measure!(a::Flow{2, Float32, AMDGPU.ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, AMDGPU.ROCArray{Float32, 3, AMDGPU.Runtime.Mem.HIPBuffer}, AMDGPU.ROCArray{Float32, 4, AMDGPU.Runtime.Mem.HIPBuffer}}, body::AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}; t::Float32, ϵ::Int64)
@ WaterLily ~/.julia/packages/WaterLily/raEyO/src/Body.jl:45
[26] Simulation(dims::Tuple{Int64, Int64}, u_BC::Tuple{Int64, Int64}, L::Float64; Δt::Float64, ν::Float64, g::Nothing, U::Nothing, ϵ::Int64, perdir::Tuple{}, uλ::Nothing, exitBC::Bool, body::AutoBody{WaterLily.var"#comp#234"{Bool, var"#10#14"{SVector{3, ComplexF64}, Float64}, WaterLily.var"#231#235"}, WaterLily.var"#231#235"}, T::Type, mem::Type)
@ WaterLily ~/.julia/packages/WaterLily/raEyO/src/WaterLily.jl:75
[27] Simulation
@ ~/.julia/packages/WaterLily/raEyO/src/WaterLily.jl:65 [inlined]
[28] run_TwoD_julia(; p::Int64, Re::Int64, stop::Float64)
@ Main ~/MEGA/Julia/WaterLily-examples/examples/TwoD_Julia.jl:24
[29] top-level scope
@ ~/MEGA/Julia/WaterLily-examples/examples/TwoD_Julia.jl:42
[30] include(fname::String)
@ Base.MainInclude ./client.jl:489
[31] run(debug_session::VSCodeDebugger.DebugAdapter.DebugSession, error_handler::VSCodeDebugger.var"#3#4"{String})
@ VSCodeDebugger.DebugAdapter ~/.vscode-oss/extensions/julialang.language-julia-1.120.2-universal/scripts/packages/DebugAdapter/src/packagedef.jl:122
[32] startdebugger()
@ VSCodeDebugger ~/.vscode-oss/extensions/julialang.language-julia-1.120.2-universal/scripts/packages/VSCodeDebugger/src/VSCodeDebugger.jl:45
[33] top-level scope
@ ~/.vscode-oss/extensions/julialang.language-julia-1.120.2-universal/scripts/debugger/run_debugger.jl:12
[34] include(mod::Module, _path::String)
@ Base ./Base.jl:495
[35] exec_options(opts::Base.JLOptions)
@ Base ./client.jl:318
[36] _start()
@ Base ./client.jl:552
in expression starting at /home/fra/MEGA/Julia/WaterLily-examples/examples/TwoD_Julia.jl:42
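The `InvalidIRError` above flags a "dynamic function invocation" inside WaterLily's `fill!` at Body.jl:42, and the kernel's type signature shows an `SVector{3, ComplexF64}` captured alongside `Float32` device arrays. A common trigger for this class of GPU compilation failure is a captured value of a wider element type than the simulation arrays, which makes the closure type-unstable. The following is a minimal CPU-side sketch of that promotion effect, not WaterLily's actual code; the names `sdf_bad`/`sdf_good` and the captured tuple are illustrative only:

```julia
# Illustrative only: a distance-like function capturing ComplexF64 values,
# echoing the SVector{3, ComplexF64} visible in the kernel signature above.
center = (1.0 + 0im, 2.0 + 0im, 3.0 + 0im)  # ComplexF64 capture

# Mixing the Float64-backed capture with Float32 input silently promotes
# every intermediate to Float64 — the kind of instability GPU kernels reject.
sdf_bad(x) = sqrt(sum(abs2(x[i] - real(center[i])) for i in 1:3))

# Converting the capture to the simulation's element type keeps the closure
# type-stable end to end.
center32 = map(c -> Float32(real(c)), center)
sdf_good(x) = sqrt(sum(abs2(x[i] - center32[i]) for i in 1:3))

x = (0f0, 0f0, 0f0)
println(typeof(sdf_bad(x)))   # Float64: silent promotion
println(typeof(sdf_good(x)))  # Float32: element type preserved
```

On the CPU this promotion only costs performance; under GPUCompiler it can surface as exactly this kind of `InvalidIRError`, so checking the body function's captures against the simulation's `T` is a reasonable first diagnostic step.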
Hi! I recently fixed a bug when creating a …
I am on master (I think: from Pkg> I did …
Could it be that ROCm 6 is too recent? If I try to use AMDGPU with their own artifacts (…
Hi again. It seems that the error that I posted at first was not the one stemming from …
Seems awfully close to that …
Hi Francesco, can you still reproduce this bug in the latest WaterLily? If I understand correctly, this happens when running the simple TwoD_Julia.jl case, right? If I have time later I will try again on an AMD GPU, but as far as I recall I could not reproduce this.
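For anyone trying to reproduce: the stack frames [26]–[28] above show the failing call path (`run_TwoD_julia` → `Simulation` with `mem` set to a ROC array type). A hedged sketch of that setup follows; the dimensions, boundary conditions, and signed-distance function are placeholders, not the actual contents of TwoD_Julia.jl:

```julia
# Reproduction sketch assembled from the stack trace; requires an AMD GPU.
# The concrete arguments are assumptions, not the example's real values.
using WaterLily, AMDGPU

# Placeholder circular body; TwoD_Julia.jl uses its own (complex-valued) map.
body = AutoBody((x, t) -> √sum(abs2, x .- 32f0) - 10f0)

# Frame [26] shows Simulation(dims, u_BC, L; ..., body, T, mem) — the error
# fires inside measure! while this constructor runs.
sim = Simulation((128, 128), (1, 0), 64.0;
                 body, T = Float32, mem = AMDGPU.ROCArray)
```

If this minimal placeholder body compiles while the tutorial's own body function does not, that would point at the body's captured values (e.g. the `ComplexF64` data in the trace) rather than at the AMDGPU/ROCm stack.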
Hi!
Arch/Manjaro user with an RX 6950 XT. I was trying to run the `TwoD_Julia` tutorial with an AMD GPU backend. I can successfully do `using AMDGPU` (I should say that some tests fail, though) with and without the bundled ROCm artifacts, and I can follow the basic AMDGPU tutorials (e.g. creating arrays on the GPU and summing them, and also offloading functions with the `@roc` macro). However, running the WaterLily tutorial with `mem=AMDGPU.ROCArray` throws the error above when building the `Simulation()` struct.
Julia version: 1.10.
Any suggestions?