In [1]:
GPU_PKG_NAME = "AMDGPU"; include("common_definitions.jl");

Arrays are fun! Let's see what sorts of things we can do with Julia's array interface.

Because machine learning is hot right now, let's implement some of the basics from scratch, starting with a simplified version of Flux's Dense layer.

In [5]:
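struct Dense
    W::GpuArray{Float32,2} # weights: one row per output, one column per input
    B::GpuArray{Float32,1} # bias: one entry per output
end
(d::Dense)(X) = d.W * X .+ d.B # calling the layer computes W*X + B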

X = GPUMOD.rand(8) # our input
D = Dense(GPUMOD.rand(32, 8), GPUMOD.rand(32)) # our dense layer
Y = D(X) # our result, or "prediction"

32-element ROCVector{Float32}:
 2.486434
 2.1763203
 3.0257607
 1.2786984
 2.6666918
 1.9567823
 2.9717352
 3.1911674
 2.710516
 3.4692364
 2.6519804
 2.2051744
 2.735538
 ⋮
 1.9065772
 2.959194
 1.9990909
 2.122009
 1.884681
 2.1215396
 1.55394
 2.6163301
 2.0110452
 3.3095946
 2.190668
 2.3430967
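
For readers without a GPU at hand, the arithmetic behind `d.W * X .+ d.B` can be checked on the CPU. The following is a minimal sketch, assuming plain `Matrix` and `Vector` as stand-ins for the GPU array types; `DenseCPU` and `dense_loops` are hypothetical names introduced only for this illustration and are not part of Flux or of `common_definitions.jl`.

```julia
# CPU-only sketch: plain Arrays stand in for the GPU arrays used above.
struct DenseCPU                 # hypothetical name, for illustration only
    W::Matrix{Float32}
    B::Vector{Float32}
end
(d::DenseCPU)(X) = d.W * X .+ d.B

# The same affine map written as explicit loops, for comparison.
function dense_loops(W, B, X)
    Y = similar(B)
    for i in axes(W, 1)
        acc = B[i]                  # start from the bias of output i
        for j in axes(W, 2)
            acc += W[i, j] * X[j]   # accumulate the weighted inputs
        end
        Y[i] = acc
    end
    return Y
end

X_cpu = rand(Float32, 8)
D_cpu = DenseCPU(rand(Float32, 32, 8), rand(Float32, 32))
Y_cpu = D_cpu(X_cpu)                              # callable-struct version
Y_ref = dense_loops(D_cpu.W, D_cpu.B, X_cpu)      # loop version
```

Both versions should agree up to floating-point rounding (`Y_cpu ≈ Y_ref`), since the matrix product may sum the terms in a different order than the loop.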

That's surprisingly easy, although that's more a testament to how simple the Dense layer is. The *real* Dense layer also stores an extra operation, the "activation function", which is applied to the result before it's returned. Let's add that.

In [7]:
struct DenseExtra
    W::GpuArray{Float32,2}
    B::GpuArray{Float32,1}
    op::Function # activation function, applied elementwise to the result
end
(d::DenseExtra)(X) = d.op.(d.W * X .+ d.B)

# a very strange activation function
D = DenseExtra(GPUMOD.rand(32, 8), GPUMOD.rand(32), Base.sin)

Y = D(X)

32-element ROCVector{Float32}:
  0.7117389
  0.65917706
  0.9962529
  0.9598583
  0.14423871
  0.79629743
  0.26667917
  0.33245322
  0.99117374
  0.3005133
  0.62377775
  0.2911119
  0.9891397
  ⋮
  0.41433883
  0.9339969
 -0.18869163
  0.7243287
  0.6254028
  0.68218803
  0.80426913
  0.18213595
  0.6043187
  0.9808556
  0.00019446213
  0.9997918
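
One design note on the struct above: `Function` is an abstract type in Julia, so a field declared as `op::Function` does not tell the compiler which function is stored. A common alternative is to make the activation a type parameter. The sketch below shows that variant on the CPU, assuming plain Arrays in place of the GPU arrays; `DenseParam` is a hypothetical name and this is not necessarily how Flux defines its layer.

```julia
# CPU-only sketch; Matrix/Vector stand in for the notebook's GPU array types.
# DenseParam is a hypothetical name used only for this illustration.
struct DenseParam{F}
    W::Matrix{Float32}
    B::Vector{Float32}
    op::F    # the concrete type of the activation becomes part of the layer type
end
(d::DenseParam)(X) = d.op.(d.W * X .+ d.B)

W = rand(Float32, 32, 8)
B = rand(Float32, 32)
X_in = rand(Float32, 8)

layer = DenseParam(W, B, sin)   # F is inferred as typeof(sin)
Y_out = layer(X_in)

# Broadcasting applies `sin` to each element of the affine result,
# which is the same as mapping `sin` over W * X_in .+ B.
Y_check = map(sin, W * X_in .+ B)
```

The call syntax is unchanged from `DenseExtra`; only the field declaration differs.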

Applying operations to GPU arrays is pretty simple: just use broadcasting! The GPUArrays package takes care of compiling your operation down into a GPU kernel and executing it for you, so the elementwise work runs on the GPU without any hand-written kernel code. Of course, we could have just hacked explicit support for `Base.sin` into GPUArrays for this workshop. Let's try to disprove that possibility.

In [9]:
relu(x) = ifelse(x > 0, x, zero(x))

# a more familiar activation function, the famous Rectified Linear Unit
D = DenseExtra(GPUMOD.rand(32, 8), GPUMOD.rand(32), relu)

Y = D(X)

32-element ROCVector{Float32}:
 1.7694025
 1.4593736
 1.5504706
 2.1202867
 1.7502114
 2.7722752
 2.5464807
 2.090945
 1.7716358
 2.8755188
 2.0656824
 1.919281
 1.9699565
 ⋮
 2.0603788
 3.0326064
 1.3484907
 2.19077
 2.591572
 3.4285855
 2.8133383
 1.8792555
 2.0648713
 1.6753672
 3.3227973
 2.1942797

Good luck doing *that* in Python! Julia and the GPU packages cooperate to compile custom functions down into code that runs fast on the GPU, so that you can get on to doing something awesome.
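
One caveat about the `relu` run above: every printed value is positive, so `relu` returned its input unchanged there. That is expected if `GPUMOD.rand` draws from [0, 1) like `Base.rand` (an assumption here), because the weights, inputs, and biases are then all non-negative. The sketch below uses a small hand-written CPU array with negative entries to make the effect visible, and adds `leaky_relu`, a hypothetical second custom activation that is not defined in the notebook.

```julia
# CPU-only sketch: the same broadcast syntax on a plain Array.
relu(x) = ifelse(x > 0, x, zero(x))

# Hypothetical custom activation, not defined in the notebook:
# negative inputs are scaled by 0.01 instead of being clamped to zero.
leaky_relu(x) = ifelse(x > 0, x, oftype(x, 0.01) * x)

Z = Float32[-2.0, -0.5, 0.0, 0.5, 2.0]   # includes negative entries on purpose

relu_out  = relu.(Z)         # negatives become zero, positives pass through
leaky_out = leaky_relu.(Z)   # negatives are shrunk, positives pass through
```

Both functions stay in `Float32` because `zero(x)` and `oftype(x, 0.01)` match the element type of the input, which avoids promoting the result to `Float64`.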