This repository was archived by the owner on May 27, 2021. It is now read-only.

Shared memory + multiple function exits cause invalid results #4

@maleadt

The cause seems to be an added checkbounds call, if that even makes sense.

Repro:

using CUDAdrv, CUDAnative

@target ptx function kernel(arr::Ptr{Int32})
    temp = @cuStaticSharedMem(Int32, (2, 1))
    tx = Int(threadIdx().x)

    if tx == 1
        for i = 1:2
            # THIS BREAKS STUFF: checkbounds(temp, i)
            Base.pointerset(temp.ptr, 1, i, 8)
        end
    end
    sync_threads()

    Base.pointerset(arr, Base.pointerref(temp.ptr, tx, 8), tx, 8)

    return nothing
end

dev = CuDevice(0)
ctx = CuContext(dev)

d_arr = CuArray(Int32, (2, 1))
@cuda (1,2) kernel(d_arr.ptr)
println(Array(d_arr))

destroy(ctx)

Result without the checkbounds call: [1; 1] (expected). With it: [1; 0], i.e. the second thread reads a 0 from shared memory.
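
For comparison, here is a rough CPU-only sketch of what the kernel in the repro is supposed to compute. It is not part of the original report: `kernel_reference` is a hypothetical helper, the shared buffer is modelled as an ordinary zero-initialized array, and the `sync_threads()` barrier is modelled by completing the write phase for all threads before starting the read phase. Under those assumptions, enabling checkbounds should not change the result.

```julia
# Hypothetical CPU model of the repro kernel (assumption: not CUDAnative code).
function kernel_reference(nthreads::Int)
    temp = zeros(Int32, 2, 1)        # stands in for the static shared memory
    arr = zeros(Int32, nthreads, 1)  # stands in for the device output array

    # Phase 1: everything each thread does before the barrier.
    for tx in 1:nthreads
        if tx == 1
            for i in 1:2
                checkbounds(temp, i)  # the call that triggers the bug on the GPU
                temp[i] = 1
            end
        end
    end

    # Phase 2: after the barrier, each thread copies its slot to the output.
    for tx in 1:nthreads
        arr[tx] = temp[tx]
    end

    return arr
end

println(kernel_reference(2))
```

With two threads this model fills both output slots with 1, matching the result seen on the GPU when checkbounds is left out.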

cc @cfoket