Unsound lifetimes of kernel params #29

Closed
coreylowman opened this issue Dec 22, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@coreylowman
Owner

There is potential undefined behavior if any of the params passed to cuLaunchKernel are dropped before the kernel actually executes.

A simple case is a local variable that gets turned into a kernel parameter by taking a reference to it. The pointer handed to the launch then presumably points into the local stack frame, which may already be gone by the time the kernel runs.

Fixing this would require ensuring that these values are dropped only after the kernel executes. One potential way to do this is to use cuLaunchHostFunc, where the host function does nothing but hold ownership of all the params.
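To make the cuLaunchHostFunc idea concrete, here is a rough sketch that runs without a GPU. Everything in it is hypothetical: `MockStream`, `Param`, and `run_ownership_demo` are stand-ins I made up to model stream ordering, not cudarc or CUDA driver APIs. The only point is that a second work item, queued after the "kernel", can take ownership of the parameter so it is freed only once the kernel has run.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a CUDA stream: work items run in submission
/// order when `synchronize` is called. Not a real cudarc or CUDA type.
struct MockStream {
    queue: Vec<Box<dyn FnOnce()>>,
}

impl MockStream {
    fn new() -> Self {
        Self { queue: Vec::new() }
    }

    /// Enqueue a unit of work (models a kernel launch or a host function).
    fn enqueue(&mut self, work: impl FnOnce() + 'static) {
        self.queue.push(Box::new(work));
    }

    /// Run everything that was queued, in order.
    fn synchronize(&mut self) {
        for work in self.queue.drain(..) {
            work();
        }
    }
}

/// A parameter whose drop can be observed through a shared flag.
struct Param {
    value: i32,
    dropped: Rc<Cell<bool>>,
}

impl Drop for Param {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Returns (dropped before sync, dropped when the kernel ran, dropped after sync).
fn run_ownership_demo() -> (bool, bool, bool) {
    let dropped = Rc::new(Cell::new(false));
    // Heap-allocate the param and leak it to a raw pointer, similar to how a
    // `void* userData` would be handed to a host callback.
    let raw: *mut Param = Box::into_raw(Box::new(Param { value: 7, dropped: dropped.clone() }));
    let mut stream = MockStream::new();

    // "Kernel": reads the param through the raw pointer when it finally runs.
    let dropped_at_kernel = Rc::new(Cell::new(true));
    let (seen, flag) = (dropped_at_kernel.clone(), dropped.clone());
    stream.enqueue(move || {
        seen.set(flag.get());
        // SAFETY: the allocation is freed only by the work item queued below.
        let value = unsafe { (*raw).value };
        assert_eq!(value, 7);
    });

    // "Host function": does nothing except reclaim ownership and drop the param.
    // SAFETY: `raw` came from Box::into_raw and is reclaimed exactly once.
    stream.enqueue(move || unsafe { drop(Box::from_raw(raw)) });

    let dropped_before_sync = dropped.get();
    stream.synchronize();
    (dropped_before_sync, dropped_at_kernel.get(), dropped.get())
}

fn main() {
    let (before, at_kernel, after) = run_ownership_demo();
    println!("before sync: {before}, at kernel: {at_kernel}, after sync: {after}");
}
```

In this model the param is still alive when the "kernel" reads it and is dropped only by the trailing work item. A real version would pass the pointer as the userData argument of cuLaunchHostFunc on the same stream as the launch.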

Pin also seems very useful for this case.

@coreylowman coreylowman added the bug Something isn't working label Dec 22, 2022
@coreylowman
Owner Author

This may not actually be an issue. I ran some simple tests where a host stack variable is dropped before the kernel executes, and nothing went wrong.

Kernel params are passed by value and copied into constant memory, and that copy should happen before launch_async returns, so the host values are no longer needed afterwards.
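A toy model of that reasoning, again without a GPU. `PendingLaunch`, this `launch_async`, and `run_copy_demo` are hypothetical names for illustration only; this is not the real launch_async or the driver's implementation, just a sketch assuming the parameter bytes are snapshotted before the launch call returns.

```rust
/// Hypothetical model of a queued launch: it owns a copy of the parameter
/// bytes, taken at launch time. Not a real cudarc or CUDA type.
struct PendingLaunch {
    param_bytes: Vec<u8>,
}

/// Models a launch that reads the parameter through a pointer and copies it
/// by value before returning.
///
/// # Safety
/// `param` must be valid for reads for the duration of this call only.
unsafe fn launch_async(param: *const i32) -> PendingLaunch {
    // Copy the value out while the pointer is still guaranteed to be valid.
    let value = unsafe { *param };
    PendingLaunch {
        param_bytes: value.to_ne_bytes().to_vec(),
    }
}

impl PendingLaunch {
    /// Models the kernel running later: it only sees the copied bytes.
    fn execute(self) -> i32 {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.param_bytes);
        i32::from_ne_bytes(buf)
    }
}

/// Launches with a pointer to a local, lets the local go out of scope,
/// then runs the "kernel".
fn run_copy_demo() -> i32 {
    let pending = {
        let local = 42_i32;
        // SAFETY: `local` is alive for the whole call.
        unsafe { launch_async(&local) }
    }; // `local` is gone here, before the "kernel" has executed.
    pending.execute()
}

fn main() {
    println!("kernel saw {}", run_copy_demo());
}
```

Under that assumption the pointer only has to stay valid during the launch call itself. This covers plain values only: if the param is a device pointer, the memory it points to still has to outlive the kernel.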

See more: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#functions

Notable excerpts:

When lowering a __global__ function launch from host code, the compiler generates stub functions that copy the parameters one or more times by value, before eventually using memcpy to copy the arguments to the __global__ function’s parameter memory on the device. This occurs even if an argument was non-trivially-copyable, and therefore may break programs where the copy constructor has side effects.

Kernel launches are asynchronous with host execution. As a result, if a __global__ function argument has a non-trivial destructor, the destructor may execute in host code even before the __global__ function has finished execution. This may break programs where the destructor has side effects.
