This issue was moved to a discussion.

You can continue the conversation there.


Expected to work with RustGPU? #7

Closed
thedodd opened this issue Jul 2, 2022 · 5 comments

thedodd commented Jul 2, 2022

Hello, I am interested in using this with a project of mine which is using https://github.com/Rust-GPU/Rust-CUDA/. Is it expected that this will work properly in that context? I'm assuming the idea is to instrument the kernel code itself with the nvtx code here, and then a profile will be made available ... somewhere? Perhaps NSight extracts it from the running GPU, or some well known location on the GPU host, is that correct?

simbleau (Owner) commented Jul 2, 2022

Hey @thedodd ,

Your idea is mostly right. You augment your code with tracers, then use NSight Systems to execute your program and produce a technical report.

You will need an NVIDIA GPU and NSight Systems (NSight Systems also has a CLI). I recommend starting with the GUI.

NSight Systems is essentially a program executor: you tell it which command to run, e.g. cat /my/file, and after the run it returns a report of all the metrics you subscribed to.

It is important to configure NSight Systems to subscribe to NVTX annotations; otherwise your report will not show any NVTX tracers.

There are some specific instructions in examples/, but I also have an in-depth blog post if you want a quick read.
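To make the "augment your code with tracers" step concrete, here is a minimal sketch. The range push/pop functions are hypothetical no-op stand-ins so the example compiles on its own; with the real nvtx crate you would call its range push/pop equivalents instead, and NSight Systems would pick the ranges up for the report.

```rust
// Hypothetical stand-ins for NVTX range calls (assumption: the real crate
// exposes push/pop-style range annotations, as the underlying NVTX API does).
fn range_push(message: &str) {
    println!("nvtx range push: {message}");
}

fn range_pop() {
    println!("nvtx range pop");
}

fn expensive_step() -> u64 {
    (0..1_000u64).sum()
}

fn main() {
    // Wrap the region you care about in a named range; the profiler report
    // then shows how long this span took.
    range_push("expensive_step");
    let total = expensive_step();
    range_pop();
    println!("total = {total}");
}
```

The important part is only the push/pop bracketing around the region of interest; everything else is placeholder.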

simbleau (Owner) commented Jul 9, 2022

I'm going to close this, but feel free to re-open it if you need further assistance, or consider starting a discussion.

simbleau closed this as completed Jul 9, 2022
thedodd (Author) commented Jul 10, 2022

So, it looks like this crate is a std crate and therefore cannot actually run GPU-side. All of the examples are for CPU-side code as well.

Is it possible to generate instrumentation for GPU-side code, so that one could profile their GPU code in greater detail? As it stands, one can only measure the duration of a kernel invocation from the CPU's vantage point, not instrument the code actually running on the GPU.

Does that make sense? Hopefully I am asking these questions clearly :)
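The CPU-vantage limitation described above can be sketched with a self-contained example. The kernel launch here is a hypothetical CPU stand-in (a real Rust-CUDA project would launch an actual GPU kernel), but it shows what is measurable without GPU-side instrumentation: only the total wall time of the invocation, nothing about where time goes inside the kernel.

```rust
use std::time::Instant;

// Hypothetical stand-in for launching a GPU kernel; in a real Rust-CUDA
// project this would be a kernel launch whose internals the CPU cannot see.
fn launch_kernel() -> u64 {
    (0..10_000u64).map(|x| x * x).sum()
}

fn main() {
    // CPU-vantage timing: we only learn how long the whole invocation took,
    // not where time was spent inside the kernel itself.
    let start = Instant::now();
    let result = launch_kernel();
    let elapsed = start.elapsed();
    println!("kernel result = {result}, took {elapsed:?}");
}
```

GPU-side NVTX ranges would instead annotate regions inside the kernel, which is exactly what a std-only crate cannot provide.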

simbleau (Owner) commented
I'll make it a no-std crate and that should solve all issues. :)

simbleau reopened this Jul 10, 2022
simbleau (Owner) commented
Crate is now no_std as of 1.1.1. Let me know if there are any hiccups, but I tested all the functions, so we should be good.

Repository owner locked and limited conversation to collaborators Jul 14, 2022
simbleau converted this issue into discussion #12 Jul 14, 2022

