
[WIP] Add support for ROCm #64

Open: jpsamaroo wants to merge 2 commits into master

Conversation

@jpsamaroo commented May 23, 2019

This is nowhere near ready to go yet, but I wanted to get this posted since things are progressing well for AMDGPU support overall 🙂

TODO:

@@ -33,7 +29,7 @@ include("context.jl")

 backend() = CPU()
 # FIXME: Get backend from Context or have Context per backend
-Cassette.overdub(ctx::Ctx, ::typeof(backend)) = CUDA()
+Cassette.overdub(ctx::Ctx, ::typeof(backend)) = ctx.metadata
@jpsamaroo (Author)
Reminder that I need to figure out what to do here.
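
For reference, a minimal sketch of how the metadata route could work, assuming the backend object is stored in the Cassette context at launch time. The launch helper and the backend structs below are placeholders for illustration, not GPUifyLoops' actual API:

using Cassette

Cassette.@context Ctx

# Placeholder backend singletons standing in for GPUifyLoops' backend types.
struct CPU end
struct CUDA end
struct ROC end

# Outside the context, fall back to plain CPU semantics.
backend() = CPU()

# Inside the context, report whatever backend was stored as metadata.
Cassette.overdub(ctx::Ctx, ::typeof(backend)) = ctx.metadata

# Hypothetical launch path: build the context with the chosen backend as
# metadata, then run the kernel under it.
launch(f, b, args...) = Cassette.overdub(Ctx(metadata = b), f, args...)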

@vchuravy (Owner)

Related #63

@jpsamaroo (Author)

OK, this is now working for me, albeit without synchronization (I still have to add those intrinsics to AMDGPUnative). Here is a slightly modified example from the GPUifyLoops docs:

using GPUifyLoops, AMDGPUnative, HSARuntime

function kernel(A)
    # On the CPU, @loop iterates over the range 1:size(A,1);
    # on the GPU, each work-item handles the index given by threadIdx().x.
    @loop for i in (1:size(A,1);
                    threadIdx().x)
        A[i] = 2*A[i]
    end
    # TODO: @synchronize (the intrinsic still needs to be added to AMDGPUnative)
    return nothing
end

kernel(A::HSAArray) = @launch ROC() kernel(A, groupsize=length(A))

data = HSAArray(rand(Float32, 1024))
kernel(data)

@vchuravy (Owner)

We should probably define threadIdx etc. in GPUifyLoops; users currently still have to do using CUDAnative manually to get them.
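
One hedged sketch of what that might look like, reusing the Ctx and backend types from the sketch above; the AMDGPUnative intrinsic name is an assumption and would need to be checked against its actual exports:

using Cassette, CUDAnative, AMDGPUnative

# CPU fallback so code calling threadIdx() still runs outside a GPU context.
threadIdx() = (x = 1, y = 1, z = 1)

# Under the context, forward to the vendor intrinsic matching the backend
# stored in ctx.metadata (the ROC branch uses an assumed AMDGPUnative name).
function Cassette.overdub(ctx::Ctx, ::typeof(threadIdx))
    ctx.metadata isa CUDA && return CUDAnative.threadIdx()
    ctx.metadata isa ROC  && return AMDGPUnative.workitemIdx()
    return threadIdx()
end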
