This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Expected to work with RustGPU? #7
Hey @thedodd, your idea is mostly right. You annotate your code with NVTX tracers, then use Nsight Systems to run the program and produce a technical report. You will need an NVIDIA GPU and Nsight Systems (which also has a CLI); I recommend starting with the GUI. Nsight Systems is essentially a program executor: you tell it which command to run, e.g. It is important to make sure Nsight Systems is set to collect NVTX annotations; otherwise your report will not show any NVTX tracers. There are some specific instructions in examples/, and I also have an in-depth blog post if you want a quick read.
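To make the "augment your code with tracers" step concrete, here is a minimal sketch of scoping nested ranges with an RAII guard, so every push is matched by a pop even on early return. Note that `range_push`, `range_pop`, `Range`, and the event log are hypothetical stand-ins written for this sketch so it runs without a GPU; they are not this crate's API, and in real code the two stand-in functions would be replaced by the crate's actual range calls.

```rust
use std::cell::RefCell;

thread_local! {
    // Stand-in event log so the sketch runs without a GPU or the NVTX library.
    static EVENTS: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Hypothetical stand-in for a "range push" tracer call.
fn range_push(name: &str) {
    EVENTS.with(|e| e.borrow_mut().push(format!("push {name}")));
}

// Hypothetical stand-in for a "range pop" tracer call.
fn range_pop() {
    EVENTS.with(|e| e.borrow_mut().push("pop".to_string()));
}

/// RAII guard: opens a range on creation and closes it when dropped.
struct Range;

impl Range {
    fn new(name: &str) -> Self {
        range_push(name);
        Range
    }
}

impl Drop for Range {
    fn drop(&mut self) {
        range_pop();
    }
}

fn load_data() -> Vec<u32> {
    let _r = Range::new("load_data");
    (1..=4).collect()
}

fn process(data: &[u32]) -> u32 {
    let _r = Range::new("process");
    data.iter().sum()
}

fn run_frame() -> u32 {
    // The outer range encloses both inner ranges.
    let _r = Range::new("frame");
    let data = load_data();
    process(&data)
}

/// Returns a copy of the events recorded on the current thread.
fn recorded_events() -> Vec<String> {
    EVENTS.with(|e| e.borrow().clone())
}

fn main() {
    let total = run_frame();
    println!("total = {total}");
    for event in recorded_events() {
        println!("{event}");
    }
}
```

With real tracer calls in place of the stand-ins, the nested ranges would be what shows up on the profiler timeline once the program is run under the profiler with NVTX collection enabled.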
I'm going to close this, but feel free to re-open it if you need further assistance, or consider starting a discussion.
So, it looks like this crate is a std crate and therefore cannot actually run GPU-side. All of the examples are for CPU-side code as well. Is it possible to generate instrumentation for GPU-side code, so that one could profile GPU code in greater detail? As-is, one can only instrument the time a kernel invocation takes from the vantage point of the CPU, not the code running on the GPU itself. Does that make sense? Hopefully I am asking these questions clearly.
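For reference, the CPU-side vantage point described above amounts to something like the following sketch. `launch_and_sync` is a hypothetical stand-in that does the work on the CPU so the example runs anywhere; with a real kernel, the timed closure would need to include a synchronization step, since CUDA kernel launches return to the host before the kernel finishes.

```rust
use std::time::{Duration, Instant};

/// Times a host-side closure, returning its result and the elapsed wall-clock time.
fn time_host_side<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

// Hypothetical stand-in for "launch the kernel, then wait for it to finish".
fn launch_and_sync(input: &[f32]) -> Vec<f32> {
    input.iter().map(|x| x * 2.0).collect()
}

fn main() {
    let input = vec![1.0_f32, 2.0, 3.0];
    let (output, elapsed) = time_host_side(|| launch_and_sync(&input));
    println!("output = {output:?}, host-side time = {elapsed:?}");
}
```

This measures only the whole invocation as seen by the host; it says nothing about where time goes inside the kernel, which is the gap the question is about.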
I'll make it a no_std crate, and that should solve all issues. :)
Crate is now no_std as of |
Hello, I am interested in using this with a project of mine which is using https://github.com/Rust-GPU/Rust-CUDA/. Is it expected that this will work properly in that context? I'm assuming the idea is to instrument the kernel code itself with the nvtx code here, and then a profile will be made available ... somewhere? Perhaps NSight extracts it from the running GPU, or some well known location on the GPU host, is that correct?