calling custom C ArrayFire functions in Julia #229

Closed
4lrdyD opened this issue Mar 18, 2019 · 5 comments

4lrdyD commented Mar 18, 2019

I needed to write some custom functions in C using the ArrayFire library. The functions basically use custom OpenCL kernels; for example, I have:

__kernel void
Gauss_Jordan_c_sp(__global float* C, __global float* A, __global float* B, int gsize);

I use this kernel on ArrayFire arrays, so I wrote the following function (ref: Interoperability with OpenCL):

void AFire::SELgj_c_sp(af::array &C, af::array &A, af::array &B) {
	cl_mem * d_A = A.device<cl_mem>();
	cl_mem * d_B = B.device<cl_mem>();
	cl_mem * d_C = C.device<cl_mem>();

	size_t order = (int)A.dims(0);

	// set arguments, here the kernel has been previously built
	int i = 0;
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_C);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_A);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_B);
	clSetKernelArg(kernel, i++, sizeof(cl_int), &order);

	size_t localWorkSize = BLOCK_SIZE * BLOCK_SIZE;
	size_t globalWorkSize = BLOCK_SIZE * BLOCK_SIZE;

	clEnqueueNDRangeKernel(af_queue, kernel, 1, 0, &globalWorkSize, &localWorkSize,
		0, NULL, NULL);
	A.unlock();
	B.unlock();
	C.unlock();
}

I call this function in Julia as follows:

ccall((:SELgj_c_sp,"path/to/dll")
            ,Cvoid,(Ref{af_array},Ref{af_array},Ref{af_array}),Cf.arr,Af.arr,Bf.arr)

It works well sometimes, but sometimes it fails (the kernel dies), depending on how the code is written, although the call from C works fine. I wanted to change the function's arguments from af::array to af_array so that I could call it from Julia the way some built-in functions in ArrayFire.jl are called, but my attempts did not go well, since the reference above is all I have.

Are there some general rules that I could use to write (in C) and call (in Julia) these functions? I think I'm not doing it the right way.

@4lrdyD 4lrdyD changed the title calling custom C ArrayFire functions in Julia: Kernel dies calling custom C ArrayFire functions in Julia Mar 18, 2019

ghost commented Mar 19, 2019

Not sure, but it seems that you are mixing up C++ and C pointers and calling conventions. I'm not an expert, but I'm surprised that it works at all. af_array is a plain C pointer (an opaque handle), while af::array is a C++ class, so an af::array& parameter is really a pointer to a C++ object.

You should use af_array both in Julia and in your C function definition.

Inside your function, either construct af::array objects out of the af_array handles or use the C versions of the ArrayFire functions throughout.
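
The advice above can be illustrated without ArrayFire at all. The following is only a sketch under stated assumptions: `Matrix` is a hypothetical stand-in for a C++ class such as af::array, `handle_t` is a hypothetical stand-in for an opaque C handle such as af_array, and the `handle_*` functions are invented for this example. It shows an exported interface that uses only C types (handles by value, results through out-pointers, no references) while the C++ object is used internally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C++ class such as af::array.
class Matrix {
public:
    explicit Matrix(std::size_t n, float fill) : data_(n, fill) {}
    void scale(float k) { for (float &x : data_) x *= k; }
    float at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<float> data_;
};

// Hypothetical stand-in for an opaque C handle such as af_array.
typedef void *handle_t;

// extern "C" disables C++ name mangling, so a foreign caller can look the
// symbols up by their plain names. Every parameter is a C type.
extern "C" {

// Creates an object and returns its handle through an out-pointer.
int handle_create(handle_t *out, std::size_t n, float fill) {
    if (out == nullptr) return 1;
    *out = new Matrix(n, fill);
    return 0;
}

// Takes the handle by value and recovers the C++ object internally.
int handle_scale(handle_t h, float k) {
    if (h == nullptr) return 1;
    static_cast<Matrix *>(h)->scale(k);
    return 0;
}

// Reads one element; returns 1 on a null argument or out-of-range index.
int handle_get(float *out, handle_t h, std::size_t i) {
    if (out == nullptr || h == nullptr) return 1;
    const Matrix *m = static_cast<const Matrix *>(h);
    if (i >= m->size()) return 1;
    *out = m->at(i);
    return 0;
}

// Destroys the object behind the handle.
int handle_release(handle_t h) {
    delete static_cast<Matrix *>(h);
    return 0;
}

}  // extern "C"
```

With an interface of this shape, the Julia side only has to pass pointer-sized handles and primitive values, which is the situation ccall is designed for.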

4lrdyD commented Mar 19, 2019

I tried something like:

void AFire::SELgj_c_sp(af_array &dC, af_array &dA, af_array &dB) {

        //to be able to use A.device to get the device pointer
       //This step seems to be wrong since it does not work in C or in Julia
        af::array A=array(dA);
        af::array B=array(dB);
        af::array C=array(dC);

	cl_mem * d_A = A.device<cl_mem>();
	cl_mem * d_B = B.device<cl_mem>();
	cl_mem * d_C = C.device<cl_mem>();

	size_t order = (int)A.dims(0);

	// set arguments, here the kernel has been previously built
	int i = 0;
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_C);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_A);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_B);
	clSetKernelArg(kernel, i++, sizeof(cl_int), &order);

	size_t localWorkSize = BLOCK_SIZE * BLOCK_SIZE;
	size_t globalWorkSize = BLOCK_SIZE * BLOCK_SIZE;

	clEnqueueNDRangeKernel(af_queue, kernel, 1, 0, &globalWorkSize, &localWorkSize,
		0, NULL, NULL);
	A.unlock();
	B.unlock();
	C.unlock();

        // if I want to use some built-in functions on any argument of the function,
        // it does not work either,
        // nor in the version with the af::array arguments
        seq s1(0, order - 1, 1);
        C = transpose(A(order, s1)) / diag(A);   
}

I didn't find another way to get the device pointer from an af_array object. I need that pointer to set the kernel arguments; without it I cannot use such kernels.
Thanks for answering. I know this is not exactly an issue.
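
One possible reason the wrapping step misbehaves is handle ownership. This is an assumption rather than something established in this thread: if the C++ wrapper adopts the handle without retaining it and releases it in its destructor, the caller's handle becomes invalid when the function returns. The sketch below uses hypothetical stand-ins only (`raw_array` for af_array, `Wrapper` for af::array, `retain_array` and `release_array` for a retain/release pair) to show the effect on a reference count.

```cpp
// Hypothetical stand-ins: raw_array plays the role of af_array and
// Wrapper the role of af::array. Nothing here is ArrayFire code.
struct raw_object { int refs; };
typedef raw_object *raw_array;

extern "C" {

// Adds one reference and hands back a handle the caller now owns.
int retain_array(raw_array *out, raw_array in) {
    if (out == nullptr || in == nullptr) return 1;
    in->refs += 1;
    *out = in;
    return 0;
}

// Drops one reference.
int release_array(raw_array a) {
    if (a == nullptr) return 1;
    a->refs -= 1;
    return 0;
}

}  // extern "C"

// Adopts a handle as-is (no retain) and releases it on destruction.
class Wrapper {
public:
    explicit Wrapper(raw_array h) : h_(h) {}
    ~Wrapper() { release_array(h_); }
    Wrapper(const Wrapper &) = delete;
    Wrapper &operator=(const Wrapper &) = delete;
private:
    raw_array h_;
};

// Wrapping the caller's handle directly: the caller's reference is consumed.
int refs_after_adopting(raw_object &obj) {
    { Wrapper w(&obj); }
    return obj.refs;
}

// Retaining first: the wrapper releases its own reference, not the caller's.
int refs_after_retaining(raw_object &obj) {
    {
        raw_array own = nullptr;
        retain_array(&own, &obj);
        Wrapper w(own);
    }
    return obj.refs;
}
```

Starting from one reference, the first function leaves zero references and the second leaves one, so only the second keeps the caller's handle usable afterwards.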

ghost commented Mar 19, 2019

I might need it someday :)

Have you looked into af_get_device_ptr?

http://arrayfire.org/docs/group__device__func__mem.htm#ga58fda2d491cd27f31108e699b5aef506

AFAPI af_err af_get_device_ptr ( void ** ptr, const af_array )

4lrdyD commented Mar 19, 2019

Oh, I did not see it. It seems to be what I need. I'll try it and come back with news.

4lrdyD commented Mar 22, 2019

I managed to write the code; however, I had to make some changes in order to handle af_array objects. Before using af_get_device_ptr, I had to initialize the memory using clCreateBuffer, i.e.:

void AFire::SELgj_c_sp(af_array* out, af_array dA, af_array dB) {

	//to store the result
	af_array dC;
	af_copy_array(&dC, dB);
      
	dim_t _order[4];
	af_get_dims(&_order[0], &_order[1], &_order[2], &_order[3], dA);
	size_t order = _order[0];

        int status = CL_SUCCESS;

        // here af_context has been previously obtained with afcl::getContext()

        cl_mem *d_A = (cl_mem*)clCreateBuffer(af_context,CL_MEM_READ_WRITE,
                     sizeof(float) * order*order, NULL, &status); 
        af_get_device_ptr((void**)d_A, dA); 

        cl_mem *d_B = (cl_mem*)clCreateBuffer(af_context,CL_MEM_READ_WRITE, 
                  sizeof(float) * order, NULL, &status); 
        af_get_device_ptr((void**)d_B, dB); 

        cl_mem *d_C = (cl_mem*)clCreateBuffer(af_context,CL_MEM_WRITE_ONLY, 
                    sizeof(float) * order, NULL, &status); 
        af_get_device_ptr((void**)d_C, dC); 

	// set arguments, here the kernel has been previously built
	int i = 0;
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_C);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_A);
	clSetKernelArg(kernel, i++, sizeof(cl_mem), d_B);
	clSetKernelArg(kernel, i++, sizeof(cl_int), &order);

	size_t localWorkSize = BLOCK_SIZE * BLOCK_SIZE;
	size_t globalWorkSize = BLOCK_SIZE * BLOCK_SIZE;

	clEnqueueNDRangeKernel(af_queue, kernel, 1, 0, &globalWorkSize, &localWorkSize,
		0, NULL, NULL);
	af_unlock_array(dA);
	af_unlock_array(dB);
	af_unlock_array(dC);
        
       //copy results to output argument
	af_copy_array(out, dC);
}

Then, in Julia, it is possible to use this call:

out = ArrayFire.RefValue{af_array}(0);
ccall((:SELgj_c_sp,"path/to/dll"),Cvoid,(Ptr{af_array},af_array,af_array),out,Af.arr,Bf.arr);
AFArray{Float32,1}(out[])

For now this works fine, and it seems to go better than the function with af::array arguments. Thanks so much @GAIKA
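
A possibly simpler variant of the pointer handling above, sketched with hypothetical stand-ins so it compiles without ArrayFire or OpenCL: `mem_t` plays the role of cl_mem, `array_t` of af_array, `get_device_ptr` mimics the shape of af_get_device_ptr, and `set_kernel_arg` mimics how clSetKernelArg copies an argument. It assumes the getter writes the buffer handle itself through its `void **` out-parameter; under that assumption a local variable is enough storage for the result, and no extra buffer needs to be created to receive it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins: mem_t plays the role of cl_mem and
// array_t the role of af_array.
struct mem_object { int id; };
typedef mem_object *mem_t;
struct array_object { mem_t buffer; int locked; };
typedef array_object *array_t;

extern "C" {

// Same shape as a getter with a void** out-parameter: the buffer handle
// is written into storage provided by the caller, and the array is locked.
int get_device_ptr(void **ptr, array_t arr) {
    if (ptr == nullptr || arr == nullptr) return 1;
    arr->locked = 1;
    *ptr = arr->buffer;
    return 0;
}

// Gives control of the buffer back to the owning library.
int unlock_array(array_t arr) {
    if (arr == nullptr) return 1;
    arr->locked = 0;
    return 0;
}

// Copies arg_size bytes from arg_value into the argument slot, which is
// why the caller passes the address of its handle variable.
int set_kernel_arg(mem_t *slot, std::size_t arg_size, const void *arg_value) {
    if (slot == nullptr || arg_value == nullptr) return 1;
    if (arg_size != sizeof(mem_t)) return 1;
    std::memcpy(slot, arg_value, arg_size);
    return 0;
}

}  // extern "C"
```

A caller would declare `void *raw = nullptr;`, call `get_device_ptr(&raw, arr)`, convert with `mem_t d_A = static_cast<mem_t>(raw);`, and pass `&d_A` when setting the kernel argument, unlocking the array once the kernel has been enqueued.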
