Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates for CUDA v4, KernelAbstractions v0.9 #177

Merged
merged 19 commits into from Oct 16, 2023
Merged

Conversation

vpuri3
Copy link
Contributor

@vpuri3 vpuri3 commented Jul 15, 2023

  • As KernelAbstractions v0.9 removes support for event handling (kernels are now implicitly ordered), I removed lines related to wait(Device()) and Event(Device()).
  • switch CUDADevice -> CUDABackend
  • add tests for Tullio with CUDA v4, KernelAbstractions v0.9

close #172 , close #176, close #168

@vpuri3
Copy link
Contributor Author

vpuri3 commented Jul 15, 2023

@vchuravy as developer of KernelAbstractions, can you confirm that downstream packages are expected to remove Event(backend) and wait(backend, event) statements?

@vpuri3
Copy link
Contributor Author

vpuri3 commented Jul 15, 2023

@mcabbott can you take a look?

@vchuravy
Copy link

Yes KA 0.9 removed the event handling system entirely, operations are now task-ordered.

@vpuri3
Copy link
Contributor Author

vpuri3 commented Jul 16, 2023

@mcabbott I'm getting some Zygote errors with this branch. Do you have any idea why that might be happening?

using Tullio, Zygote

N = 8
p = rand(N)

function f(p)
    @tullio y[i] := p[i]
    sum(y)
end

@show f(p)
Zygote.gradient(f, p) # on v0.3.5 ([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],)

On this branch,

julia> include("examples/diffusion_fourier/gpu/a.jl")                                                                                                                                                  [41/1827]
f(p) = 5.1281264238049316                                                                                                                                                                                       
ERROR: LoadError: ArgumentError: tuple must be non-empty                                                                                                                                                        
Stacktrace:                                                                                                                                                                                                     
  [1] first(#unused#::Tuple{})                                                                                                                                                                                  
    @ Base ./tuple.jl:192                                                                                                                                                                                       
  [2] unzip(xs::Tuple{})                                                                                                                                                                                        
    @ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/unzipped.jl:133                                                                                                                                         
  [3] unzip_broadcast(f::ChainRules.var"#1735#1737"{Zygote.ZygoteRuleConfig{Zygote.Context{false}}, typeof(length)}, args::Tuple{})                                                                             
    @ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/unzipped.jl:41                                                                                                                                          
  [4] split_bc_pullbacks(cfg::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, f::typeof(length), args::Tuple{})                                                                                                 
    @ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/rulesets/Base/broadcast.jl:128                                                                                                                          
  [5] rrule(cfg::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, #unused#::typeof(Base.Broadcast.broadcasted), #unused#::Base.Broadcast.Style{Tuple}, f::typeof(length), args::Tuple{})                         
    @ ChainRules ~/.julia/packages/ChainRules/9sNmB/src/rulesets/Base/broadcast.jl:44                                                                                                                           
  [6] chain_rrule                                                                                                                                                                                               
    @ ~/.julia/packages/Zygote/JeHtr/src/compiler/chainrules.jl:223 [inlined]                                                                                                                                   
  [7] macro expansion                                                                                                                                                                                           
    @ ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:101 [inlined]                                                                                                                                   
  [8] _pullback(::Zygote.Context{false}, ::typeof(Base.Broadcast.broadcasted), ::Base.Broadcast.Style{Tuple}, ::typeof(length), ::Tuple{})                                                                      
    @ Zygote ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:101                                                                                                                                      
  [9] _apply(::Function, ::Vararg{Any})                                                                                                                                                                         
    @ Core ./boot.jl:838                                                                                
 [10] adjoint                                                                                           
    @ ~/.julia/packages/Zygote/JeHtr/src/lib/lib.jl:203 [inlined]            
 [11] _pullback                                                                                                                                                                                                 
    @ ~/.julia/packages/ZygoteRules/OgCVT/src/adjoint.jl:66 [inlined]        
 [12] _pullback                                                                                                                                                                                                 
    @ ./broadcast.jl:1311 [inlined]
 [13] _pullback
    @ ~/.julia/dev/Tullio/src/threads.jl:260 [inlined]
 [14] _pullback
    @ ~/.julia/dev/Tullio/src/threads.jl:59 [inlined]
 [15] _pullback(::Zygote.Context{false}, ::typeof(Tullio.threader), ::var"#𝒜𝒸𝓉!#7", ::Type{Vector{Float64}}, ::Vector{Float64}, ::Tuple{Vector{Float64}}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{}, ::typeof(+), ::
Int64, ::Nothing)
    @ Zygote ~/.julia/packages/Zygote/JeHtr/src/compiler/interface2.jl:0
 [16] _pullback
    @ ~/.julia/dev/Tullio/src/macro.jl:807 [inlined]

Is there anything I am doing wrong here?

Edit

This is erroring on master as well as this branch, and working on v0.3.5

@vpuri3 vpuri3 mentioned this pull request Jul 16, 2023
3 tasks
uncomment tests
@vpuri3 vpuri3 mentioned this pull request Sep 8, 2023
@vpuri3
Copy link
Contributor Author

vpuri3 commented Sep 29, 2023

lets see if things break with cuda v5...

@maximilian-gelbrecht
Copy link
Contributor

Are they any news on this? I see that some tests are failing, but I would also like to use it with CUDA v4. Not sure exactly how I could contribute.

@mcabbott
Copy link
Owner

Sorry I haven't had the bandwidth to look at this properly. If @vpuri3 thinks it done then perhaps we should just merge.

If it would simplify things, can we just drop CUDA 3 support completely?

@mcabbott mcabbott reopened this Oct 12, 2023
@vpuri3
Copy link
Contributor Author

vpuri3 commented Oct 14, 2023

@mcabbott do you think below error is due to this PR? This is the only Tullio error.

https://github.com/mcabbott/Tullio.jl/actions/runs/6498985585/job/17651416329?pr=177#step:6:418

The errors on Julia nightly are due to some compile failure in CUDA. Lemme rerun again to make sure they are fixed. The buildkite failures can be resolved by removing CUDA 3 support.

@vpuri3 vpuri3 closed this Oct 14, 2023
@vpuri3 vpuri3 reopened this Oct 14, 2023
@maximilian-gelbrecht
Copy link
Contributor

I think it would be quite cumbersome to maintain both, CUDA.jl 3 and CUDA.jl 4 (and 5) support. As far as I can see it, many other libraries also dropped CUDA.jl 3 support.

@vpuri3
Copy link
Contributor Author

vpuri3 commented Oct 16, 2023

@mcabbott , the only questionable test failures are below.

  1. Some issue with TensorOperations.jl + Tracker.jl. I don't know the status of tracker support for TensorOperations.

https://github.com/mcabbott/Tullio.jl/actions/runs/6534658486/job/17742367080?pr=177#step:6:613
https://github.com/mcabbott/Tullio.jl/actions/runs/6534658486/job/17742367733?pr=177#step:6:568

  1. CUDA gradient test failure when comparing with Tracker. Fails for 1.10 beta, works for 1.8.

https://buildkite.com/julialang/tullio-dot-jl/builds/223#018b38c7-0732-4296-9013-7d49fc5a209e/233-626

Besides that, nightly is failing due to some CUDA precompile error unnrelated to this work. LMK if you want me to dig deeper into the tracker issue. Otherwise this is good to go

@mcabbott
Copy link
Owner

Thanks!

Re TensorOperations we should probably just remove all of that... it's probably broken by their latest re-write, which I believe also adds their own gradient rules.

CUDA + Tracker I don't know, can investigate at some point. At present Tracker is unusable with Flux and nobody even made an issue so maybe not the top priority.

@mcabbott mcabbott merged commit bcf544e into mcabbott:master Oct 16, 2023
12 of 22 checks passed
@vpuri3 vpuri3 deleted the cuda branch October 16, 2023 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scalar indexing with CUDA CUDA v4 support Upgrade to CUDA.CUDAKernels
4 participants