Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it acceptable to launch with kernel_t's based on the arguments' types? #236

Closed
eyalroz opened this issue Nov 11, 2020 · 4 comments
Closed
Labels

Comments

@eyalroz
Copy link
Owner

eyalroz commented Nov 11, 2020

At the moment, when a kernel function is wrapped in a kernel_t object, its signature is (type-)erased. Thus, when you launch it with certain arguments - those arguments define which signature the kernel is assumed to have. It is cast to having exactly that signature (well, ignoring some decaying we do initially), then launched.

This presents 3 (related) problems:

  1. One cannot utilize implicit conversions from arguments to the parameter types.
  2. There will be a stronger tendency to copy large arguments (mostly due to problem 1)
  3. The user encounters different behavior with the same kernel depending on whether they wrap it in a kernel_t or not.

However - the alternative would be a somewhat-involved workaround (which I'm not even sure is feasible) to keep some "partial-reflection" of the original function, then have something run, on invocation, which converts to the original kernel's arguments. That may not necessarily fit the idea of lightweight wrapping.

So, what should we do here?

@eyalroz eyalroz changed the title Find a way to use argument type info when wrapping in a kernel_t Is it acceptable to launch with kernel_t's based on the arguments' types? Nov 11, 2020
@eyalroz
Copy link
Owner Author

eyalroz commented Nov 11, 2020

Example, due to @codecircuit :

#include <cuda/runtime_api.hpp>
#include <iostream>

struct OwningContainer {
    OwningContainer(size_t len)
	: ptr(cuda::memory::managed::make_unique<int[]>(len)), l(len) {}

    OwningContainer(const OwningContainer& other) {
	std::cout << "OwningContainer copy constructor is called" << std::endl;
	l = other.l;
	ptr = cuda::memory::managed::make_unique<int[]>(other.l);
	std::copy(other.ptr.get(), other.ptr.get() + l, ptr.get());
    }

    cuda::memory::managed::unique_ptr<int[]> ptr;
    size_t l;
};

struct View {
    View(OwningContainer& oc) : l(oc.l), ptr(oc.ptr.get()) {}
    size_t l;
    int* ptr;
};

__global__ void kernel(View v) {
    printf("%p\n", v.ptr);
    // process the data through the view
}

int main() {
    OwningContainer oc(10);
    auto device = cuda::device::current::get();
    std::cout << "Call wrapping kernel launch" << std::endl;
    auto k = cuda::kernel_t(device, kernel);
    cuda::enqueue_launch(k, device.default_stream(),
			 cuda::make_launch_config(1, 1), oc);
    device.synchronize();
    std::cout << "Call native kernel launch" << std::endl;
    kernel<<<1, 1>>>(oc);
    device.synchronize();
}

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 11, 2020

So, @codecircuit : What do you think would be the alternative?

Something I considered: We can have a templated constructor. In that constructor, we could attach to the constructed object a pointer to a function: It's type would be independent of the template, but its value will be specific to the template. Then, upon launch, we could have that function run. But that wouldn't be a solution either, since we can't adapt the launch arguments to this function's parameters, unless we used some run-time reflection mechanism. So I'm not exactly sure what I could do.

Also remember that passing "heavy" objects around is risky, and I doubt people do that, in practice, with kernels.

@codecircuit
Copy link
Contributor

I also think that this is a rather difficult problem. Maybe it is reasonable to have a specific signature if a kernel is launched with a cuda::kernel_t object:

inline void enqueue_launch(
	bool                        thread_block_cooperation,
	kernel_t&                   kernel_function,
	stream::id_t                stream_id,
	launch_configuration_t      launch_configuration,
	void** parameters)

Now it would be transparent to the user that the kernel argument types are erased and that he is responsible to use the correct types, but to be honest, I do not like this C style signature. This would be very similar to cudaLaunch. We should keep in mind that this problem also exists if a raw CUDA kernel is launched cooperatively, if I see it correctly in the code.

...or we just keep this issue and wait for other ideas/opinions to come up.

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 13, 2020

We're offering Modern C++ wrappers, and in there's one thing that tells you old-school C it's passing void** anywhere :-(

No, people will just have to pass exactly what their kernel takes if they use a type-erased kernel. Well, unless there's another idea.

I will emphasize this point in the documentation though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants