
Error message: "An item with the same key has been added" #51

Closed
Costigan opened this issue Dec 9, 2019 · 4 comments · Fixed by #354
Closed

Error message: "An item with the same key has been added" #51

Costigan opened this issue Dec 9, 2019 · 4 comments · Fixed by #354
Assignees
Labels

Comments


Costigan commented Dec 9, 2019

Hello. I'm using 0.7.0.0 and the Cuda accelerator. I'm getting a runtime error with the message "An item with the same key has been added". The error doesn't occur when using the CPU accelerator.

I don't think this is an ILGPU bug, but I'm uncertain how to debug my code given the lack of specificity in the error message.

The error is generated on this line, the first (and only) time an action is generated from my kernel.

var landing_probability_kernel = accelerator.LoadStreamKernel<Index2, ArrayView2D<byte>, ArrayView2D<short>, int, int>(EstimateLandingProbability);

My question is 'how should I debug this message'? What sorts of things should I look for?
(I'm including the code for completeness, but I'm not asking you to look at it if you can suggest a direction to look.)
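
One general technique (an editor's sketch, not a suggestion from the thread): this message is typically a .NET ArgumentException raised by a dictionary insert, so catching it around the load call and printing the exception shows from the stack trace whether the failure originates inside the ILGPU compiler or in user code:

    // Editor's sketch: localize the failure by catching the exception at load time.
    try
    {
        var kernel = accelerator.LoadStreamKernel<Index2, ArrayView2D<byte>, ArrayView2D<short>, int, int>(
            EstimateLandingProbability);
    }
    catch (ArgumentException e)
    {
        Console.WriteLine(e);  // prints the message plus the stack trace
        throw;
    }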

One possibility I'm unsure about is that this kernel allocates shared memory and creates a 2D view of it. I don't know for sure whether creating 2D views is allowed inside kernels.

    // landing_radius, patch_width, dem_width, and dem_height are constants
    // defined elsewhere in the enclosing class.
    static void EstimateLandingProbability(
        Index2 index,
        ArrayView2D<byte> global,
        ArrayView2D<short> counts,
        int start_row,
        int start_col)  // landing_radius
    {
        const int radius = landing_radius;
        var row = index.Y;
        var col = index.X;

        var shared_width = patch_width + radius * 2;
        var shared_length = shared_width * shared_width;
        Debug.Assert(shared_length <= 64000);
        var shared_linear = SharedMemory.Allocate<byte>(shared_length);
        var shared = shared_linear.As2DView(shared_width, shared_width);

        // Load the slope buffer.  This loads a square section of global memory into shared memory four times with offsets.
        var offset = radius + radius;
        var row0 = start_row - radius;
        var col0 = start_col - radius;
        LoadSquare(global, shared, index, 0, 0, row0, col0);                             // top left
        LoadSquare(global, shared, index, 0, offset, row0, col0 + offset);               // top right
        LoadSquare(global, shared, index, offset, 0, row0 + offset, col0);               // bottom left
        LoadSquare(global, shared, index, offset, offset, row0 + offset, col0 + offset); // bottom right

        // Count
        var sum = 0;
        var d2 = radius * radius;
        var high = radius + radius;
        for (var y = -radius; y <= high; y++)
            for (var x = -radius; x <= high; x++)
                sum += x * x + y * y <= d2 ? shared[row + y + radius, col + x + radius] : 0;

        // Write the resulting counts to global memory
        var r1 = start_row + row;
        var c1 = start_col + col;
        Debug.Assert(r1 >= 0);
        Debug.Assert(c1 >= 0);
        Debug.Assert(r1 < dem_width);
        Debug.Assert(c1 < dem_width);
        counts[start_row + row, start_col + col] = (short)sum;
    }

    static void LoadSquare(ArrayView2D<byte> global, ArrayView2D<byte> shared, Index2 index, int shared_row0, int shared_col0, int global_row0, int global_col0)
    {
        var row = index.Y;
        var col = index.X;
        var target_row = shared_row0 + row;
        var target_col = shared_col0 + col;
        var source_row = global_row0 + row;
        var source_col = global_col0 + col;
        shared[target_row, target_col] = source_row < 0 || source_row >= dem_height || source_col < 0 || source_col >= dem_width 
            ? (byte)0 
            : global[source_row, source_col];
    }
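
For context, an editor's sketch (not from the original report) of how the loaded action would be dispatched; demView and countsView are hypothetical device views, and the patch_width-squared extent is assumed from the indexing in the kernel:

    // Hypothetical launch of the action loaded above.
    landing_probability_kernel(
        new Index2(patch_width, patch_width),  // assumed: one thread per output cell
        demView, countsView, start_row, start_col);
    accelerator.Synchronize();  // block until the default stream has finished
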
m4rs-mt self-assigned this Dec 9, 2019
m4rs-mt added the bug label Dec 9, 2019
m4rs-mt (Owner) commented Dec 10, 2019

@Costigan Thank you for submitting this bug report. It seems to be an internal ILGPU compiler issue that should not occur; it is not caused by an error in your program. I have investigated the problem and suspect it was caused by a synchronization issue in the ABI class (fixed in d149410). After fixing this issue, I can load the kernel using the CudaAccelerator and CLAccelerator classes without any problems.

Can you test whether the latest commits solve the problem? Alternatively, rolling back to version 0.6 should circumvent it as well.
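
For reference, such a rollback could be expressed as a package pin in the project file; the exact 0.6.x version number below is illustrative:

    <!-- Illustrative pin to a 0.6 release of the ILGPU NuGet package. -->
    <ItemGroup>
      <PackageReference Include="ILGPU" Version="0.6.0" />
    </ItemGroup>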

m4rs-mt (Owner) commented Dec 10, 2019

Some additional remarks: It looks like you are not using any Group synchronization instructions after loading data into shared memory. Is this intended?

Note that using shared memory in combination with implicitly grouped kernels can lead to undefined and unintended behavior. We should add an error message to ILGPU that simply forbids such kernels. Please use an explicitly grouped kernel to get fine-grained control over the number of threads per group.
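
A minimal sketch of that pattern, assuming the 0.7-era explicitly grouped API (GroupedIndex, Group.Barrier); the group size and buffer names are placeholders:

    const int GroupSize = 256;  // placeholder compile-time group size

    static void ExplicitlyGroupedKernel(GroupedIndex index, ArrayView<byte> data)
    {
        var globalIndex = index.ComputeGlobalIndex();
        var shared = SharedMemory.Allocate<byte>(GroupSize);

        // Cooperative load: each thread writes one element of shared memory.
        shared[index.GroupIdx] = data[globalIndex];

        // Make the loads visible to every thread in the group before reading.
        Group.Barrier();

        // ... all threads may now safely read any element of 'shared' ...
    }

    // Loading and launching with an explicit group size (also assumed API):
    // var kernel = accelerator.LoadStreamKernel<GroupedIndex, ArrayView<byte>>(ExplicitlyGroupedKernel);
    // kernel(new GroupedIndex(numGroups, GroupSize), bufferView);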

m4rs-mt (Owner) commented Dec 23, 2019

I'll close this issue for now, since the above-mentioned commit solves this problem on all test machines. Should you encounter further problems, we can always take up this topic again.

m4rs-mt closed this as completed Dec 23, 2019
Costigan (Author) commented

@m4rs-mt I'm sorry for not responding more quickly. I ended up going with a multi-core solution, and its performance was good enough. However, getting this implementation working will help me in the future; I'll get it running and add another comment confirming that. Regarding the synchronization instructions, you're right: I should have included one after loading shared memory. Thank you.
