Is it acceptable to launch with kernel_t's based on the arguments' types? #236
Example, due to @codecircuit:

```cpp
#include <cuda/runtime_api.hpp>
#include <iostream>

struct OwningContainer {
    OwningContainer(size_t len)
        : ptr(cuda::memory::managed::make_unique<int[]>(len)), l(len) {}
    OwningContainer(const OwningContainer& other) {
        std::cout << "OwningContainer copy constructor is called" << std::endl;
        l = other.l;
        ptr = cuda::memory::managed::make_unique<int[]>(other.l);
        std::copy(other.ptr.get(), other.ptr.get() + l, ptr.get());
    }
    cuda::memory::managed::unique_ptr<int[]> ptr;
    size_t l;
};

struct View {
    View(OwningContainer& oc) : l(oc.l), ptr(oc.ptr.get()) {}
    size_t l;
    int* ptr;
};

__global__ void kernel(View v) {
    printf("%p\n", v.ptr);
    // process the data through the view
}

int main() {
    OwningContainer oc(10);
    auto device = cuda::device::current::get();

    std::cout << "Call wrapping kernel launch" << std::endl;
    auto k = cuda::kernel_t(device, kernel);
    cuda::enqueue_launch(k, device.default_stream(),
                         cuda::make_launch_config(1, 1), oc);
    device.synchronize();

    std::cout << "Call native kernel launch" << std::endl;
    kernel<<<1, 1>>>(oc);
    device.synchronize();
}
```
So, @codecircuit: What do you think would be the alternative? Something I considered: we can have a templated constructor. In that constructor, we could attach to the constructed object a pointer to a function: its type would be independent of the template, but its value would be specific to the template. Then, upon launch, we could have that function run. But that wouldn't be a solution either, since we can't adapt the launch arguments to that function's parameters, unless we used some run-time reflection mechanism. So I'm not exactly sure what I could do. Also remember that passing "heavy" objects around is risky, and I doubt people do that, in practice, with kernels.
I also think that this is a rather difficult problem. Maybe it is reasonable to have a specific signature if a kernel is launched with a `kernel_t`. Then it would be transparent to the user that the kernel argument types are erased and that they are responsible for using the correct types; but to be honest, I do not like this C-style signature. It would be very similar to cudaLaunch. We should keep in mind that this problem also exists if a raw CUDA kernel is launched cooperatively, if I am reading the code correctly. ...or we just keep this issue open and wait for other ideas/opinions to come up.
We're offering Modern C++ wrappers, and if there's one thing that says "old-school C", it's that kind of type-erased argument passing. No, people will just have to pass exactly what their kernel takes if they use a type-erased kernel. Well, unless there's another idea. I will emphasize this point in the documentation, though.
At the moment, when a kernel function is wrapped in a `kernel_t` object, its signature is (type-)erased. Thus, when you launch it with certain arguments, those arguments define which signature the kernel is assumed to have. It is cast to having exactly that signature (well, ignoring some decaying we do initially), then launched. This presents 3 (related) problems:
However, the alternative would be a somewhat-involved workaround (which I'm not even sure is feasible): keep some "partial reflection" of the original function, then have something run, on invocation, which converts the arguments to the original kernel's parameter types. That may not fit the idea of lightweight wrapping.
So, what should we do here?