Cuda interop #49
#define ASSERT_SUCCESS(expr) \
if (auto re = expr; CUDA_SUCCESS != re) { \
	const char* name = 0, *str = 0; \
	cu.pcuGetErrorName(re, &name); \
	cu.pcuGetErrorString(re, &str); \
	printf("%s:%d %s:\n\t%s\n", __FILE__, __LINE__, name, str); \
	abort(); \
}

#define ASSERT_SUCCESS_NV(expr) \
if (auto re = expr; NVRTC_SUCCESS != re) { \
	const char* str = cudaHandler->getNVRTCFunctionTable().pnvrtcGetErrorString(re); \
	printf("%s:%d %s\n", __FILE__, __LINE__, str); \
	abort(); \
}
Functions or lambdas would have been better practice; macros like these are kinda meh when you're stepping through with a debugger.
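A minimal sketch of what the function-based alternative could look like. `DemoResult`, `demoErrorString`, `checkResult` and `CHECK_DEMO` are hypothetical names that are not part of this PR; the enum stands in for `CUresult` / `nvrtcResult` so the snippet builds without the CUDA headers. The macro shrinks to capturing `__FILE__` and `__LINE__`, and all logic lives in a function a debugger can step into.

```cpp
#include <cstdio>

// Hypothetical stand-in for CUresult / nvrtcResult (assumption, not the real enum).
enum class DemoResult { Success = 0, InvalidValue = 1 };

// Hypothetical stand-in for cu.pcuGetErrorString / pnvrtcGetErrorString.
inline const char* demoErrorString(DemoResult re)
{
    return re == DemoResult::Success ? "no error" : "invalid value";
}

// Plain function: breakpoints can be set on the failure path and it can be stepped into.
// Returns true on success. On failure it logs the call site and returns false,
// leaving the decision to abort() to the caller.
template <class Result, class ToString>
bool checkResult(Result re, Result success, ToString&& toString, const char* file, int line)
{
    if (re == success)
        return true;
    std::fprintf(stderr, "%s:%d %s\n", file, line, toString(re));
    return false;
}

// Thin macro that only captures the call site; everything else is in checkResult.
#define CHECK_DEMO(expr) \
    checkResult((expr), DemoResult::Success, demoErrorString, __FILE__, __LINE__)
```

A call site would then read something like `if (!CHECK_DEMO(someCall())) abort();`, with the real error-string lookup passed in place of `demoErrorString`.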
63.CUDAInterop/main.cpp
Outdated
ASSERT_SUCCESS(cudaDevice->importGPUBuffer(buf0.get(), &mem0));
ASSERT_SUCCESS(cudaDevice->importGPUBuffer(buf1.get(), &mem1));
ASSERT_SUCCESS(cudaDevice->importGPUBuffer(buf2.get(), &mem2));

void* parameters[] = { &mem0.ptr, &mem1.ptr, &mem2.ptr, &numElements };
ASSERT_SUCCESS(cu.pcuLaunchKernel(kernel, gridDim[0], gridDim[1], gridDim[2], blockDim[0], blockDim[1], blockDim[2], 0, stream, parameters, nullptr));
I have a question: how do you deal with ownership transfers?
By the time createFilledDeviceLocalBufferOnDedMem returns, the queue family of the Vulkan queue you used to transfer the data will own each buffer.
AFAIK, you need a pipeline barrier that releases each buffer to a FOREIGN or EXTERNAL queue family (does CUDA for the same device UUID count as external, though?) before you start using it in CUDA.
I'll need to figure out how to do ownership transfers properly. The validation layers do not complain, but I'll poke around with different queue families to figure it out.
does CUDA for the same device UUID count as external though?
I think it does.
From the spec:
The special queue family index VK_QUEUE_FAMILY_EXTERNAL represents any queue external to the resource’s current Vulkan instance, as long as the queue uses the same underlying device group or physical device, and the same driver version as the resource’s VkDevice, as indicated by VkPhysicalDeviceIDProperties::deviceUUID and VkPhysicalDeviceIDProperties::driverUUID.
system::ISystem::future_t<core::smart_refctd_ptr<system::IFile>> fut;
system->createFile(fut, "../vectorAdd_kernel.cu", system::IFileBase::ECF_READ);
auto [ptx_, res] = cudaHandler->compileDirectlyToPTX(fut.copy().get(), cudaDevice->geDefaultCompileOptions());
@AnastaZIuk can you show @atkurtul how to declare that file as a builtin resource? I think we should use them whenever possible going forward.
Very nice concise example
superseded
No description provided.