
[WIP] Add support for ROCm #64

Open: jpsamaroo wants to merge 2 commits into master

Conversation

@jpsamaroo commented May 23, 2019

This is nowhere near ready to go yet, but I wanted to get this posted since things are progressing well for AMDGPU support overall 🙂

TODO:

@@ -33,7 +29,7 @@ include("context.jl")

 backend() = CPU()
 # FIXME: Get backend from Context or have Context per backend
-Cassette.overdub(ctx::Ctx, ::typeof(backend)) = CUDA()
+Cassette.overdub(ctx::Ctx, ::typeof(backend)) = ctx.metadata
@jpsamaroo (Author)
Reminder that I need to figure out what to do here.
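
For reference, a minimal sketch of how the metadata route could work, assuming the backend object is stored in the Cassette context at launch time. The launch helper and the backend structs below are placeholders for illustration, not GPUifyLoops' actual API:

using Cassette

Cassette.@context Ctx

# Placeholder backend singletons standing in for GPUifyLoops' backend types.
struct CPU end
struct CUDA end
struct ROC end

# Outside the context, fall back to plain CPU semantics.
backend() = CPU()

# Inside the context, report whatever backend was stored as metadata.
Cassette.overdub(ctx::Ctx, ::typeof(backend)) = ctx.metadata

# Hypothetical launch path: build the context with the chosen backend as
# metadata, then run the kernel under it.
launch(f, b, args...) = Cassette.overdub(Ctx(metadata = b), f, args...)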

@vchuravy (Owner)

Related #63

@jpsamaroo (Author)

OK, this is now working for me, albeit without synchronization (I still have to add those intrinsics to AMDGPUnative). Here is a slightly modified example from the GPUifyLoops docs:

using GPUifyLoops, AMDGPUnative, HSARuntime

function kernel(A)
    # On the CPU, @loop iterates over the range 1:size(A,1);
    # on the GPU, each work-item handles the index given by threadIdx().x.
    @loop for i in (1:size(A,1);
                    threadIdx().x)
        A[i] = 2*A[i]
    end
    # TODO: @synchronize (the intrinsic still needs to be added to AMDGPUnative)
    return nothing
end

kernel(A::HSAArray) = @launch ROC() kernel(A, groupsize=length(A))

data = HSAArray(rand(Float32, 1024))
kernel(data)

@vchuravy (Owner)

We should probably define threadIdx etc. in GPUifyLoops; users currently still have to do using CUDAnative manually to get them.
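
One hedged sketch of what that might look like, reusing the Ctx and backend types from the sketch above; the AMDGPUnative intrinsic name is an assumption and would need to be checked against its actual exports:

using Cassette, CUDAnative, AMDGPUnative

# CPU fallback so code calling threadIdx() still runs outside a GPU context.
threadIdx() = (x = 1, y = 1, z = 1)

# Under the context, forward to the vendor intrinsic matching the backend
# stored in ctx.metadata (the ROC branch uses an assumed AMDGPUnative name).
function Cassette.overdub(ctx::Ctx, ::typeof(threadIdx))
    ctx.metadata isa CUDA && return CUDAnative.threadIdx()
    ctx.metadata isa ROC  && return AMDGPUnative.workitemIdx()
    return threadIdx()
end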
