# GPU Computing in Julia 

Why should you write GPU code in Julia?

1. Simple interface
2. GPU kernels can be written in Julia itself
3. Comparable speed


## Using GPU Packages
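1. The JuliaGPU GitHub page lists the GPU packages available for CUDA and OpenCL.
2. CuArrays and CLArrays provide GPU array types for CUDA and OpenCL, respectively.
3. GPUArrays provides the array interface shared by both.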

In [None]:
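using CuArrays

# Data transfer: create the matrices on the host, then copy them to the GPU
a = rand(100, 100)
b = rand(100, 100)
d_a = CuArray(a)
d_b = CuArray(b)

# Multiple dispatch: `*` on CuArrays runs on the GPU; collect copies the result back to the host
result = collect(d_a * d_b)

# Calling the wrapped CUBLAS function explicitly computes the same product
result = collect(CuArrays.CUBLAS.gemm('N', 'N', d_a, d_b))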
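
Because the `*` call above is resolved by multiple dispatch, code written against `AbstractMatrix` does not need a separate GPU version. The following is a minimal sketch, assuming a hypothetical helper named `scaled_product` (it is not part of CuArrays or GPUArrays). It runs here on plain `Array`s, so no GPU is needed to try it.

```julia
# Hypothetical helper for illustration; not part of CuArrays or GPUArrays.
# It is written against AbstractMatrix, so dispatch selects the implementation
# of `*` and of broadcasting from the concrete array type that is passed in.
scaled_product(x::AbstractMatrix, y::AbstractMatrix, alpha) = alpha .* (x * y)

# CPU check with plain Arrays
x = Float32[1 2; 3 4]
y = Float32[1 0; 0 1]          # identity matrix, so x * y == x
cpu_result = scaled_product(x, y, 2f0)
```

With CuArrays loaded, `collect(scaled_product(d_a, d_b, 2.0))` would be the expected GPU counterpart, since both `*` and broadcasting have `CuArray` methods.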



## Writing your own GPU code
1. Use CUDAnative to compile kernels written in Julia and CUDAdrv to access the CUDA driver API.


In [None]:
using CUDAdrv, CUDAnative, CuArrays

# Data transfer (example input; each value must satisfy |r| < n so the log argument stays positive)
n = 2f0
r = rand(Float32, 100, 100)
d_r = CuArray(r)
ndrange = length(d_r)

# Thread configuration: one warp per block, enough blocks to cover every element
dev = device()
threads = attribute(dev, CUDAdrv.WARP_SIZE)
blocks = min(cld(ndrange, threads), attribute(dev, CUDAdrv.MAX_GRID_DIM_X))

# GPU kernel: each thread updates one element of `input` in place
function lod_kernel(input, ndrange, n)
    tid = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if tid <= ndrange
        r_square = (input[tid] / n)^2
        input[tid] = (-n / 2f0) * CUDAnative.log(1f0 - r_square)
    end
    return
end

# Launch the kernel, then copy the result back to the host
@cuda blocks=blocks threads=threads lod_kernel(d_r, ndrange, n)
result = collect(d_r)
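
The element-wise formula in `lod_kernel` and the block count used for the launch can both be checked on the CPU. The sketch below assumes two hypothetical helper names, `lod_reference` and `launch_blocks`, which are not part of any package. It mirrors the kernel body with broadcasting and the ceiling division used in the thread configuration.

```julia
# Hypothetical CPU reference for the kernel body: -n/2 * log(1 - (x/n)^2),
# applied to every element. Not part of CUDAnative or CuArrays.
function lod_reference(input::AbstractArray{Float32}, n::Float32)
    return @. (-n / 2f0) * log(1f0 - (input / n)^2)
end

# Hypothetical helper: smallest block count such that
# blocks * threads >= ndrange (ceiling division).
launch_blocks(ndrange::Integer, threads::Integer) = cld(ndrange, threads)

# Example values (assumptions for illustration)
r_host = fill(1f0, 4)               # every element is 1, n is 2, so (r/n)^2 == 0.25
reference = lod_reference(r_host, 2f0)
nblocks = launch_blocks(100 * 100, 32)
```

Comparing `reference` against the host copy of the GPU output with `isapprox` is one way to validate a kernel like this, since GPU and CPU `log` may differ in the last bits.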
