
unique_ptrs throws an instance of 'cuda::runtime_error' in version v0.5.1 #336

Closed
DiamonDinoia opened this issue May 26, 2022 · 6 comments

DiamonDinoia commented May 26, 2022

The following code:

#include <cuda/api.hpp>
int main() {
    cuda::memory::device::unique_ptr<double[]> device_pointer(
        cuda::memory::device::make_unique<double[]>(cuda::device::current::get(), 16));
    return 0;
}

causes:

terminate called after throwing an instance of 'cuda::runtime_error'
  what():  Freeing device memory at 0x0x07faca2c00000: invalid argument

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I think this is a bug introduced in version v0.5.1, as it works fine with version v0.5.0.


eyalroz commented May 26, 2022

@DiamonDinoia Can you try this with the HEAD of the development branch? I think it's been fixed.


eyalroz commented May 26, 2022

Yeah, I think this is a dupe of #327 and should be resolved now.

@DiamonDinoia

Yes, it seems the same issue. It works with the HEAD development branch.


eyalroz commented May 27, 2022

@DiamonDinoia : Thank you for reporting this, though.

If you use the library in an interesting project, I'd love to hear about it... also if you have any suggestions / things you think are missing or existing-but-inconvenient.

Finally, the reason this went unnoticed is probably because people typically work with an explicit context proxy in a certain scope, in which case I do not need to jump through hoops to manage global state (the device primary context) :-)

@DiamonDinoia

@eyalroz Yes, I think the project is interesting; once complete, it will be released open source. I might contact you in the future for support regarding multi-threading combined with multiple GPUs (i.e. assigning a GPU to each thread to fully exploit the system).

> Finally, the reason this went unnoticed is probably because people typically work with an explicit context proxy in a certain scope, in which case I do not need to jump through hoops to manage global state (the device primary context) :-)

Is the context thread-local? When using the normal CUDA API, I can just set the device after creating the thread and I do not need to change anything. Or should I explicitly copy it into the thread? But maybe this is not the right place to discuss this; I should probably open a related issue so that the conversation is logged for others to use.


eyalroz commented May 27, 2022

@DiamonDinoia : Are contexts thread-local? Well... in some ways, at least. The context stack and the current context are thread-local for sure. But I don't know whether you can use a context created in one thread, in another thread of the same process.

As for "just setting the device" - that's because the CUDA Runtime API does some logistics for you behind the scenes: it creates a device-specific primary context and replaces the top of the context stack with it. I think that kind of global state is a bad idea to begin with, but that's what NVIDIA went with, so I couldn't just ditch it altogether, as much as I would have liked a purely RAII/CADRE library.

Anyway, you can ask such things on StackOverflow or the CUDA developer forums.
