This repository has been archived by the owner on May 17, 2020. It is now read-only.

Example showing 3D stencil calculations with a divergence #18

Open
wants to merge 1 commit into
base: master

ali-ramadhan
Collaborator

This example might be useful to others. I get a speedup of ~130x with this small kernel (compared to a single CPU core, so it's not really a fair comparison), so I think I did it right.

Resolves #12

@vchuravy
Owner

vchuravy commented Feb 19, 2019

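    # Global 3D index of the current GPU thread. The z component comes straight
    # from blockIdx().z, which assumes one thread per block along z.
    gpuIndex3D() = CartesianIndex(
        blockIdx().z,
        (blockIdx().y - 1) * blockDim().y + threadIdx().y,
        (blockIdx().x - 1) * blockDim().x + threadIdx().x
    )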

    # Calculate the divergence of f at every point and store it in div_f.
    @loop for I in (eachindex(f); gpuIndex3D())
        @inbounds div_f[I] = div(f, Nx, Ny, Nz, Δx, Δy, Δz, I)
    end
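
For readers without the PR's diff at hand, here is a minimal CPU-only sketch of what a stencil function with the same argument order as the `div` call above could look like. Everything in it is an assumption made for illustration: the names `incmod1`, `δx`, `δy`, `δz`, `stencil_div`, and `divergence!` are hypothetical, and the choice of periodic forward differences is a guess, not the operator the PR actually uses.

```julia
# Hypothetical helper: next index with periodic wrap-around (n + 1 wraps to 1).
@inline incmod1(i, n) = i == n ? 1 : i + 1

# Hypothetical forward differences along each axis, periodic at the boundary.
@inline δx(f, Nx, i, j, k) = f[incmod1(i, Nx), j, k] - f[i, j, k]
@inline δy(f, Ny, i, j, k) = f[i, incmod1(j, Ny), k] - f[i, j, k]
@inline δz(f, Nz, i, j, k) = f[i, j, incmod1(k, Nz)] - f[i, j, k]

# Same argument order as the `div` call in the loop above.
# `I` is expected to be a CartesianIndex{3}.
@inline function stencil_div(f, Nx, Ny, Nz, Δx, Δy, Δz, I)
    i, j, k = Tuple(I)
    return δx(f, Nx, i, j, k) / Δx +
           δy(f, Ny, i, j, k) / Δy +
           δz(f, Nz, i, j, k) / Δz
end

# CPU reference loop: one stencil evaluation per grid point.
function divergence!(div_f, f, Δx, Δy, Δz)
    Nx, Ny, Nz = size(f)
    for I in CartesianIndices(f)
        @inbounds div_f[I] = stencil_div(f, Nx, Ny, Nz, Δx, Δy, Δz, I)
    end
    return div_f
end

# Linear test field: the differences are 1, 2 and 3 away from the wrap-around.
f = [Float64(i + 2j + 3k) for i in 1:4, j in 1:4, k in 1:4]
div_f = divergence!(similar(f), f, 1.0, 1.0, 1.0)
```

A CPU reference like this is mainly useful for checking the GPU result point by point. Because each output point only reads its neighbours and writes its own entry of `div_f`, the loop body can be evaluated independently per index, which is what makes the one-thread-per-point mapping possible.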

@vchuravy
Owner

Something like this for the index calculation:

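    maxThreads = 1024  # Upper limit on threads per block.
    Nx, Ny, Nz = size(f)

    # Fill the block along x first, then give y and z whatever is left.
    Tx = min(maxThreads, Nx)
    Ty = min(fld(maxThreads, Tx), Ny)
    Tz = min(fld(maxThreads, Tx * Ty), Nz)

    # Blocks in grid: round up so that every grid point is covered.
    Bx, By, Bz = cld(Nx, Tx), cld(Ny, Ty), cld(Nz, Tz)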
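
To make the arithmetic concrete, the sketch below wraps the same calculation in a plain function and evaluates it for a few grid sizes. The function name `launch_config` and its keyword `max_threads` are hypothetical and only serve the illustration. It runs on the CPU and needs no GPU packages.

```julia
# Hypothetical wrapper around the thread/block calculation above.
# Returns (threads per block, blocks in grid) as two 3-tuples.
function launch_config(dims::NTuple{3,Int}; max_threads::Int = 1024)
    Nx, Ny, Nz = dims
    Tx = min(max_threads, Nx)
    Ty = min(fld(max_threads, Tx), Ny)
    Tz = min(fld(max_threads, Tx * Ty), Nz)
    threads = (Tx, Ty, Tz)
    blocks = (cld(Nx, Tx), cld(Ny, Ty), cld(Nz, Tz))
    return threads, blocks
end

# 128^3 grid: x takes 128 threads, y gets the remaining factor of 8, z gets 1.
threads, blocks = launch_config((128, 128, 128))
# threads == (128, 8, 1), blocks == (1, 16, 128)

# Small 16^3 grid: there is room left for several threads along z.
threads_small, blocks_small = launch_config((16, 16, 16))
# threads_small == (16, 16, 4), blocks_small == (1, 1, 4)

# Nx larger than the block limit: cld rounds the block count up.
threads_wide, blocks_wide = launch_config((2000, 4, 4))
# threads_wide == (1024, 1, 1), blocks_wide == (2, 4, 4)
```

Two things stand out from these cases. In the 128^3 case `Tz == 1`, so taking the z index directly from `blockIdx().z`, as in `gpuIndex3D()` earlier in the thread, covers every level; in the 16^3 case `Tz == 4`, so the z index would presumably also need a `threadIdx().z` term. In the last case the grid covers 2 × 1024 = 2048 points along x for only 2000 array entries, so a kernel launched this way would need a bounds check before writing.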
