Questions: what can be done asynchronously? #9

Closed
DavidPoliakoff opened this issue Mar 5, 2018 · 3 comments

Comments

@DavidPoliakoff

Hey folks,

This issue is really a question; feel free to close it once it's answered. No code changes needed.

I'm trying to use Jitify asynchronously. I have a pipeline that asynchronously creates program templates to hand off to Jitify (or Cling), launches a generic version of the code to be JIT-compiled while waiting, and then swaps in the NVRTC products when they're available. Jitify prints the PTX it asynchronously generates (yay) and then fails with CUDA_ERROR_INVALID_CONTEXT (aww).

I'm assuming this is because I'm creating a KernelLauncher on a different thread than the one on which I want to execute it. std::async can launch tasks in another thread, and if Jitify picks up the default CUDA context for that thread, I don't know what happens when the KernelLauncher is returned to a different thread with a different CUDA context. My questions are:

  1. Is there a way to pass a CUDA context to the KernelInstantiation so that the KernelLauncher is created under that context?
  2. If not, how far down the Jitify stack can I go in another thread before CUDA contexts become relevant? My intuition is that I can instantiate a kernel but can't configure it; let me know if I'm wrong there.

Thanks again for your help!

@DavidPoliakoff
Author

Answered question 2 for myself: if I can't pass a context around, I actually need to form the program itself; everything after kernel_cache.program(recipe,0) appears to be bound to a given CUDA context. This is workable, but not preferable.

@benbarsdell
Member

Thanks for the report, I can reproduce the error you're seeing.

A solution is to call a CUDA Runtime API function such as cudaSetDevice(gpu_index) (or even cudaFree(0)) in the new thread before calling Jitify functions (which internally call the CUDA Driver API). The first call to a Runtime function will automatically set the context in the thread. Let me know if that works for your application.
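For illustration only, here is a minimal sketch of that pattern with std::async. It is not Jitify or CUDA code: the per-thread current context is modelled with a thread_local flag so the sketch runs without a GPU, and `bind_context_for_this_thread`, `build_kernel`, and `async_with_context` are hypothetical names. In a real application the bind step would be the cudaSetDevice(gpu_index) call described above, and `build_kernel` would be whatever Jitify calls the worker thread makes.

```cpp
#include <future>
#include <stdexcept>
#include <utility>

// Stand-in for the per-thread "current CUDA context". Real CUDA tracks this
// inside the driver; this flag only models that behaviour (assumption).
thread_local bool context_is_current = false;

// Hypothetical stand-in for cudaSetDevice(gpu_index) or cudaFree(0):
// marks a context as current on the calling thread only.
void bind_context_for_this_thread() { context_is_current = true; }

// Hypothetical stand-in for Jitify work that reaches the Driver API.
// Fails when the calling thread has no current context.
int build_kernel() {
    if (!context_is_current) {
        throw std::runtime_error("CUDA_ERROR_INVALID_CONTEXT");
    }
    return 42;  // placeholder for a kernel handle
}

// Runs `task` on a new worker thread, binding the context on that
// thread first so the task's driver-level calls see a current context.
template <class Task>
auto async_with_context(Task task) {
    return std::async(std::launch::async, [task = std::move(task)]() mutable {
        bind_context_for_this_thread();
        return task();
    });
}
```

Under these assumptions, `async_with_context(build_kernel).get()` returns the placeholder handle, while a plain `std::async(std::launch::async, build_kernel).get()` rethrows the invalid-context error because nothing bound a context on the worker thread.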

@DavidPoliakoff
Author

@benbarsdell, that's really good work; it solved the problem. Thanks!
