New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates for CUDA v4, KernelAbstractions v0.9 #177
Conversation
@vchuravy as developer of KernelAbstractions, can you confirm that downstream packages are expected to remove |
@mcabbott can you take a look? |
Yes KA 0.9 removed the event handling system entirely, operations are now task-ordered. |
@mcabbott I'm getting some Zygote errors with this branch. Do you have any idea why that might be happening? using Tullio, Zygote
N = 8
p = rand(N)
function f(p)
@tullio y[i] := p[i]
sum(y)
end
@show f(p)
Zygote.gradient(f, p) # on v0.3.5 ([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],) On this branch, julia> include("examples/diffusion_fourier/gpu/a.jl") [41/1827]
f(p) = 5.1281264238049316
ERROR: LoadError: ArgumentError: tuple must be non-empty
Stacktrace:
[1] first(#unused#::Tuple{})
@ Base ./tuple.jl:192
[2] unzip(xs::Tuple{})
@ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/unzipped.jl:133
[3] unzip_broadcast(f::ChainRules.var"#1735#1737"{Zygote.ZygoteRuleConfig{Zygote.Context{false}}, typeof(length)}, args::Tuple{})
@ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/unzipped.jl:41
[4] split_bc_pullbacks(cfg::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, f::typeof(length), args::Tuple{})
@ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/rulesets/Base/broadcast.jl:128
[5] rrule(cfg::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, #unused#::typeof(Base.Broadcast.broadcasted), #unused#::Base.Broadcast.Style{Tuple}, f::typeof(length), args::Tuple{})
@ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/rulesets/Base/broadcast.jl:44
[6] chain_rrule
@ ~/.julia/packages/Zygote/JeHtr/src/compiler/chainrules.jl:223 [inlined]
[7] macro expansion
@ ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:101 [inlined]
[8] _pullback(::Zygote.Context{false}, ::typeof(Base.Broadcast.broadcasted), ::Base.Broadcast.Style{Tuple}, ::typeof(length), ::Tuple{})
@ Zygote ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:101
[9] _apply(::Function, ::Vararg{Any})
@ Core ./boot.jl:838
[10] adjoint
@ ~/.julia/packages/Zygote/JeHtr/src/lib/lib.jl:203 [inlined]
[11] _pullback
@ ~/.julia/packages/ZygoteRules/OgCVT/src/adjoint.jl:66 [inlined]
[12] _pullback
@ ./broadcast.jl:1311 [inlined]
[13] _pullback
@ ~/.julia/dev/Tullio/src/threads.jl:260 [inlined]
[14] _pullback
@ ~/.julia/dev/Tullio/src/threads.jl:59 [inlined]
[15] _pullback(::Zygote.Context{false}, ::typeof(Tullio.threader), ::var"#𝒜𝒸𝓉!#7", ::Type{Vector{Float64}}, ::Vector{Float64}, ::Tuple{Vector{Float64}}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{}, ::typeof(+), ::
Int64, ::Nothing)
@ Zygote ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:0
[16] _pullback
@ ~/.julia/dev/Tullio/src/macro.jl:807 [inlined] Is there anything I am doing wrong here? EditThis is erroring on |
uncomment tests
lets see if things break with cuda v5... |
Are they any news on this? I see that some tests are failing, but I would also like to use it with CUDA v4. Not sure exactly how I could contribute. |
Sorry I haven't had the bandwidth to look at this properly. If @vpuri3 thinks it done then perhaps we should just merge. If it would simplify things, can we just drop CUDA 3 support completely? |
@mcabbott do you think below error is due to this PR? This is the only Tullio error. https://github.com/mcabbott/Tullio.jl/actions/runs/6498985585/job/17651416329?pr=177#step:6:418 The errors on Julia nightly are due to some compile failure in CUDA. Lemme rerun again to make sure they are fixed. The buildkite failures can be resolved by removing CUDA 3 support. |
I think it would be quite cumbersome to maintain both, CUDA.jl 3 and CUDA.jl 4 (and 5) support. As far as I can see it, many other libraries also dropped CUDA.jl 3 support. |
@mcabbott , the only questionable test failures are below.
https://github.com/mcabbott/Tullio.jl/actions/runs/6534658486/job/17742367080?pr=177#step:6:613
Besides that, nightly is failing due to some CUDA precompile error unnrelated to this work. LMK if you want me to dig deeper into the tracker issue. Otherwise this is good to go |
Thanks! Re TensorOperations we should probably just remove all of that... it's probably broken by their latest re-write, which I believe also adds their own gradient rules. CUDA + Tracker I don't know, can investigate at some point. At present Tracker is unusable with Flux and nobody even made an issue so maybe not the top priority. |
KernelAbstractions v0.9
removes support for event handling (kernels are now implicitly ordered), I removed lines related towait(Device())
andEvent(Device())
.CUDADevice
->CUDABackend
close #172 , close #176, close #168