Is it acceptable to launch with kernel_t's based on the arguments' types? #236

eyalroz · 2020-11-11T13:31:48Z

At the moment, when a kernel function is wrapped in a kernel_t object, its signature is (type-)erased. Thus, when you launch it with certain arguments - those arguments define which signature the kernel is assumed to have. It is cast to having exactly that signature (well, ignoring some decaying we do initially), then launched.

This presents 3 (related) problems:

One cannot utilize implicit conversions from arguments to the parameter types.
There will be a stronger tendency to copy large arguments (mostly due to problem 1)
The user encounters different behavior with the same kernel depending on whether they wrap it in a kernel_t or not.

However - the alternative would be a somewhat-involved workaround (which I'm not even sure is feasible) to keep some "partial-reflection" of the original function, then have something run, on invocation, which converts to the original kernel's arguments. That may not necessarily fit the idea of lightweight wrapping.

So, what should we do here?

The text was updated successfully, but these errors were encountered:

eyalroz · 2020-11-11T13:45:11Z

Example, due to @codecircuit :

#include <cuda/runtime_api.hpp>
#include <iostream>

struct OwningContainer {
    OwningContainer(size_t len)
	: ptr(cuda::memory::managed::make_unique<int[]>(len)), l(len) {}

    OwningContainer(const OwningContainer& other) {
	std::cout << "OwningContainer copy constructor is called" << std::endl;
	l = other.l;
	ptr = cuda::memory::managed::make_unique<int[]>(other.l);
	std::copy(other.ptr.get(), other.ptr.get() + l, ptr.get());
    }

    cuda::memory::managed::unique_ptr<int[]> ptr;
    size_t l;
};

struct View {
    View(OwningContainer& oc) : l(oc.l), ptr(oc.ptr.get()) {}
    size_t l;
    int* ptr;
};

__global__ void kernel(View v) {
    printf("%p\n", v.ptr);
    // process the data through the view
}

int main() {
    OwningContainer oc(10);
    auto device = cuda::device::current::get();
    std::cout << "Call wrapping kernel launch" << std::endl;
    auto k = cuda::kernel_t(device, kernel);
    cuda::enqueue_launch(k, device.default_stream(),
			 cuda::make_launch_config(1, 1), oc);
    device.synchronize();
    std::cout << "Call native kernel launch" << std::endl;
    kernel<<<1, 1>>>(oc);
    device.synchronize();
}

eyalroz · 2020-11-11T15:59:42Z

So, @codecircuit : What do you think would be the alternative?

Something I considered: We can have a templated constructor. In that constructor, we could attach to the constructed object a pointer to a function: It's type would be independent of the template, but its value will be specific to the template. Then, upon launch, we could have that function run. But that wouldn't be a solution either, since we can't adapt the launch arguments to this function's parameters, unless we used some run-time reflection mechanism. So I'm not exactly sure what I could do.

Also remember that passing "heavy" objects around is risky, and I doubt people do that, in practice, with kernels.

codecircuit · 2020-11-13T18:07:52Z

I also think that this is a rather difficult problem. Maybe it is reasonable to have a specific signature if a kernel is launched with a cuda::kernel_t object:

inline void enqueue_launch(
	bool                        thread_block_cooperation,
	kernel_t&                   kernel_function,
	stream::id_t                stream_id,
	launch_configuration_t      launch_configuration,
	void** parameters)

Now it would be transparent to the user that the kernel argument types are erased and that he is responsible to use the correct types, but to be honest, I do not like this C style signature. This would be very similar to cudaLaunch. We should keep in mind that this problem also exists if a raw CUDA kernel is launched cooperatively, if I see it correctly in the code.

...or we just keep this issue and wait for other ideas/opinions to come up.

eyalroz · 2020-11-13T18:51:50Z

We're offering Modern C++ wrappers, and in there's one thing that tells you old-school C it's passing void** anywhere :-(

No, people will just have to pass exactly what their kernel takes if they use a type-erased kernel. Well, unless there's another idea.

I will emphasize this point in the documentation though.

eyalroz added the question label Nov 11, 2020

eyalroz mentioned this issue Nov 11, 2020

Copy constructions instead of expected implicit conversion when launching a kernel #158

Closed

eyalroz changed the title ~~Find a way to use argument type info when wrapping in a kernel_t~~ Is it acceptable to launch with kernel_t's based on the arguments' types? Nov 11, 2020

codecircuit mentioned this issue Aug 23, 2021

Make "cooperativity" be part of the launch configuration #258

Closed

eyalroz closed this as completed Nov 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is it acceptable to launch with kernel_t's based on the arguments' types? #236

Is it acceptable to launch with kernel_t's based on the arguments' types? #236

eyalroz commented Nov 11, 2020 •

edited

eyalroz commented Nov 11, 2020 •

edited

eyalroz commented Nov 11, 2020 •

edited

codecircuit commented Nov 13, 2020

eyalroz commented Nov 13, 2020

Is it acceptable to launch with kernel_t's based on the arguments' types? #236

Is it acceptable to launch with kernel_t's based on the arguments' types? #236

Comments

eyalroz commented Nov 11, 2020 • edited

eyalroz commented Nov 11, 2020 • edited

eyalroz commented Nov 11, 2020 • edited

codecircuit commented Nov 13, 2020

eyalroz commented Nov 13, 2020

eyalroz commented Nov 11, 2020 •

edited

eyalroz commented Nov 11, 2020 •

edited

eyalroz commented Nov 11, 2020 •

edited