Use Atomix #1790

Draft: wants to merge 32 commits into base: master

vchuravy (Member) commented Mar 10, 2023

Just blindly using UnsafeAtomicsLLVM sadly doesn't work, since LLVM doesn't understand CUDA atomics.

Uses most of #1644 to implement Atomix.jl properly.

  • cas for sizeof(T) < 4
  • Optimize Atomix.swap! by using atomic_exchange
  • Optimize Atomix.modify! by using hardware implementations of common functions
  • ld.volatile.global.u16 %rs1, [%rd2]; take the address space (AS) into account, needed for shared-memory (shmem) correctness.
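
For the first item, one common way to get a compare-and-swap on a type narrower than 4 bytes is to run a CAS loop on the enclosing 32-bit word. The sketch below illustrates that idea on the CPU, with `Threads.Atomic{UInt32}` standing in for a device word. The helper name `cas_u16!` and its `half` argument are hypothetical, and this is not necessarily how the PR implements it.

```julia
using Base.Threads: Atomic, atomic_cas!

# Hypothetical helper: CAS on one 16-bit half of a 32-bit atomic word.
# `half` is 0 for the low 16 bits and 1 for the high 16 bits.
# Returns the 16-bit value that was observed; the swap succeeded
# if and only if the return value equals `expected`.
function cas_u16!(word::Atomic{UInt32}, half::Int, expected::UInt16, desired::UInt16)
    shift = 16 * half
    mask = UInt32(0xffff) << shift
    old = word[]
    while true
        current = (old >> shift) % UInt16
        # The 16-bit comparison failed: report what we saw, write nothing.
        current == expected || return current
        # Splice the desired value into the word, keeping the other half.
        new = (old & ~mask) | (UInt32(desired) << shift)
        seen = atomic_cas!(word, old, new)
        seen == old && return current
        # The word changed underneath us (possibly only the other half); retry.
        old = seen
    end
end

w = Atomic{UInt32}(0x00010002)
cas_u16!(w, 0, 0x0002, 0x0005)   # succeeds: low half becomes 0x0005
cas_u16!(w, 1, 0x0009, 0x0007)   # fails: high half is 0x0001, not 0x0009
```

The retry is needed because the 32-bit CAS can fail when only the neighbouring half changed, which is not a failure of the 16-bit comparison.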

vchuravy commented Mar 10, 2023

References:

Notes:

vchuravy commented

@tkf, if you have some time or interest, could you look over my implementation? I intend for this to replace AtomixCUDA.

vchuravy commented

One significant difference between CUDA.@atomic and Atomix.@atomic is that CUDA.@atomic folds a read-modify-write expression in a statement into a single atomic operation where possible.

julia> @macroexpand CUDA.@atomic a[1] = a[1] + 1
quote
    #= /home/vchuravy/.julia/packages/CUDA/ZdCxS/src/device/intrinsics/atomics.jl:439 =#
    (CUDA.atomic_arrayset)(a, (1,), +, 1)
end

Atomix, by contrast, lowers this to a set! and computes the right-hand side independently.

julia> @macroexpand Atomix.@atomic a[1] = a[1] + 1
:((Atomix.Internal.Atomix).set!((Atomix.Internal.referenceable(a))[1], a[1] + 1, UnsafeAtomics.seq_cst))

Base agrees with Atomix.

julia> @macroexpand @atomic a.x = a.x + 1
:(Base.setproperty!(a, :x, a.x + 1, :sequentially_consistent))

julia> mutable struct A
         @atomic x::Int
       end

julia> f(a) =  @atomic a.x = a.x + 1
f (generic function with 1 method)

julia> f(A(0))
1

julia> @code_llvm f(A(0))
;  @ REPL[5]:1 within `f`
define i64 @julia_f_253({}* noundef nonnull align 8 dereferenceable(8) %0) #0 {
top:
; ┌ @ Base.jl:37 within `getproperty`
   %1 = bitcast {}* %0 to i64*
   %2 = load atomic i64, i64* %1 unordered, align 8
; └
; ┌ @ int.jl:87 within `+`
   %3 = add i64 %2, 1
; └
; ┌ @ Base.jl:54 within `setproperty!`
   store atomic i64 %3, i64* %1 seq_cst, align 8
; └
  ret i64 %3
}
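
The separate atomic load and atomic store in the IR above mean the statement as a whole is not an atomic increment. The sketch below uses only Base to replay one unlucky interleaving by hand, with no real tasks involved, and then contrasts it with the `+=` form, which Base treats as a single read-modify-write. The `Counter` type and the variable names are made up for illustration.

```julia
mutable struct Counter
    @atomic x::Int
end

# Non-folded form: `@atomic c.x = c.x + 1` is a load followed by a store,
# so another update can land in between. Replayed here step by step:
c = Counter(0)
stale = c.x + 1        # "task 1" loads 0 and computes 1
@atomic c.x += 1       # "task 2" does a complete atomic increment (0 -> 1)
@atomic c.x = stale    # "task 1" stores its stale 1; task 2's update is lost
lost = @atomic c.x     # 1, although two increments were attempted

# Folded form: each `+=` is one atomic read-modify-write,
# so no update can slip in between the read and the write.
d = Counter(0)
@atomic d.x += 1
@atomic d.x += 1
kept = @atomic d.x     # 2
```

This is the behavioural gap between the two macro expansions: code that relied on CUDA.@atomic folding `a[1] = a[1] + 1` would need to be written as `a[1] += 1` under the Base and Atomix semantics to keep the update atomic.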
