Using permute with 2D arrays leads to nvcc compile error #93

Closed
DanielSchuessler opened this issue May 20, 2013 · 1 comment
Labels
cuda backend [deprecated]

@DanielSchuessler

Hello,

I'm new to accelerate, but this appears to be a bug:

import Data.Array.Accelerate hiding(map,fst,snd)
import qualified Data.Array.Accelerate as Acc
import Data.Array.Accelerate.CUDA(run)

testArr :: Acc (Array DIM2 Int)
testArr = use (fromList (Z :. 1 :. 1) [5])

permuted :: Acc (Array DIM2 Int)
permuted =
  permute
    (\c _ -> c)
    (fill (shape testArr) (constant 0))
    id
    testArr

main = print . run $ permuted

> ghc -threaded bug.hs
[1 of 1] Compiling Main             ( bug.hs, bug.o )
Linking bug ...

> ./bug
/tmp/accelerate-cuda-7546/dragon7546.cu(16): error: no suitable conversion function from "DIM2" to "const int" exists

/tmp/accelerate-cuda-7546/dragon7546.cu(18): error: expression must have class type

/tmp/accelerate-cuda-7546/dragon7546.cu(19): error: expression must have class type

3 errors detected in the compilation of "/tmp/tmpxft_00001d7f_00000000-6_dragon7546.cpp1.ii".
bug: nvcc terminated abnormally (2)
bug: thread blocked indefinitely in an MVar operation

Versions:
accelerate: 0.13.0.1
accelerate-cuda: 0.13.0.1
nvcc release 5.0, V0.2.1221

Contents of /tmp/accelerate-cuda-7546/dragon7546.cu (indented):

#include <accelerate_cuda_extras.h>
typedef DIM2 DimOut;
typedef DIM2 DimIn;
static TexInt32 arrIn0_a0;
extern "C" __global__ void permute(const DIM2 shIn0, const DIM2 shOut, Int32* __restrict__ arrOut_a0)
{
    const Int32 sh1 = shIn0.a1;
    const Int32 sh0 = shIn0.a0;
    const DimIn shIn = shape(sh1, sh0);
    const int shapeSize = sh1 * sh0;
    const int gridSize = __umul24(blockDim.x, gridDim.x);
    int ix;

    for (ix = __umul24(blockDim.x, blockIdx.x) + threadIdx.x; ix < shapeSize; ix += gridSize) {
        DimOut dst;
        const int src = fromIndex(shIn, ix);

        dst.a1 = src.a1;
        dst.a0 = src.a0;
        if (!ignore(dst)) {
            Int32 y0;
            Int32 _y0;
            const int jx = toIndex(shOut, dst);
            const int v1 = ix;
            const Int32 x0 = indexArray(arrIn0_a0, v1);

            arrOut_a0[jx] = x0;
        }
    }
}
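
Note on the errors above: line 16 of the generated file is `const int src = fromIndex(shIn, ix);`, and lines 18 and 19 are the two `src.a1` / `src.a0` accesses. With `DimIn` being `DIM2`, `fromIndex` yields a 2D index, so declaring `src` as `int` would explain all three messages. The sketch below is a host-side C++ mock of the loop body with `src` declared as `DimIn`. The `DIM2` struct, `fromIndex`, `toIndex`, and the row-major layout are hypothetical stand-ins for whatever `accelerate_cuda_extras.h` actually defines, `permuteIdentity` is an illustrative name, and the `ignore` check is omitted. It is not the fix from the referenced commit, whose contents are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for definitions in accelerate_cuda_extras.h;
// the real header may define these differently.
typedef int32_t Int32;
struct DIM2 { Int32 a1; Int32 a0; };  // a1: outer extent, a0: inner extent (assumed)
typedef DIM2 DimIn;
typedef DIM2 DimOut;

// Linear index -> 2D index, assuming row-major layout.
static DIM2 fromIndex(const DIM2 sh, const int ix) {
    DIM2 r;
    r.a1 = ix / sh.a0;
    r.a0 = ix % sh.a0;
    return r;
}

// 2D index -> linear index, assuming row-major layout.
static int toIndex(const DIM2 sh, const DIM2 ix) {
    return ix.a1 * sh.a0 + ix.a0;
}

// Sequential host version of the kernel's loop for the identity permutation.
// The only intended difference from the generated code is the type of `src`.
static void permuteIdentity(const DIM2 shIn, const DIM2 shOut,
                            const Int32* arrIn, Int32* arrOut) {
    const int shapeSize = shIn.a1 * shIn.a0;
    for (int ix = 0; ix < shapeSize; ++ix) {
        const DimIn src = fromIndex(shIn, ix);  // generated code had: const int src
        DimOut dst;
        dst.a1 = src.a1;
        dst.a0 = src.a0;
        const int jx = toIndex(shOut, dst);
        arrOut[jx] = arrIn[ix];
    }
}
```

With `src` typed as `DimIn`, the member accesses are well-formed and the identity permutation copies each element to the same position in the output.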
tmcdonell added a commit to AccelerateHS/accelerate-cuda that referenced this issue May 21, 2013
@tmcdonell
Member

derp... Thanks for the comprehensive bug report, that made it really easy to find what was wrong (:

tmcdonell added a commit to tmcdonell/accelerate-cuda that referenced this issue Jun 7, 2013
tmcdonell added a commit to tmcdonell/accelerate-cuda that referenced this issue Jun 7, 2013
tmcdonell added a commit to AccelerateHS/accelerate-examples that referenced this issue Sep 8, 2014
tmcdonell added a commit to tmcdonell/accelerate that referenced this issue Nov 10, 2017