scoped_range does not work with domain::global #65

Open
zygfrydw opened this issue Dec 9, 2022 · 1 comment

zygfrydw commented Dec 9, 2022

I am trying to use NVTX3_FUNC_RANGE(), nvtx3::scoped_range, or nvtx3::scoped_range_in<nvtx3::domain::global>, but none of them works: the range does not show up in nvperf / nvvp.

I have tested nvtxRangePush and nvtx3::scoped_range_in<my_domain>, and these ranges show up correctly in the profiling tools.

Configuration

  • GPU CARD: GeForce GTX 1650 Mobile
  • driver version: 520.61.05
  • CUDA version: 11.8
  • OS version: Ubuntu 22.04

Reproduction docker

I have prepared a simple reproduction docker here.

Reproduction code:

#include <chrono>
#include <thread>
#include <nvtx3/nvtx3.hpp>

using namespace std::chrono_literals;  // enables the 1s literal

// Tag type that identifies a custom NVTX domain by its name
struct my_domain{ static constexpr char const* name{"my_domain"}; };

void function_my_domain(){
    // this range does show in profiling tools as expected
    nvtx3::scoped_range_in<my_domain> r(__FUNCTION__);
    std::this_thread::sleep_for(1s);
}

void function_global(){
    // this range does not show in profiling tools
    nvtx3::scoped_range r(__FUNCTION__);
    std::this_thread::sleep_for(1s);
}

zygfrydw commented Dec 9, 2022

I have done some experiments with the NVTX source code, and the following change seems to fix the issue.

I changed

// NVTX/include/nvtx3/nvtx3.hpp line 940
template <>
inline domain const& domain::get<domain::global>() noexcept
{
  static domain const d{};
  return d;
}

to:

template <>
inline domain const& domain::get<domain::global>() noexcept
{
  static domain const d{"global"};
  return d;
}

Can you verify whether this modification makes sense for other NVIDIA tools?
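
One point that may be worth checking when reviewing this change: a default-constructed domain and a domain created with the name "global" are not necessarily the same domain from a tool's point of view. The sketch below is a toy model only, not NVTX code. `ToyBackend`, its members, and `patched_global_shares_default_domain` are hypothetical names, and the assumption that handle 0 stands for the default (unnamed) domain used by nvtxRangePush is a simplification made for illustration. Under those assumptions, a range recorded through the patched "global" domain would land in a different bucket than a range recorded through the default domain.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a profiler backend. This is NOT NVTX code.
// Assumption: handle 0 plays the role of the default (unnamed) domain.
struct ToyBackend {
    std::map<int, std::vector<std::string>> ranges_by_domain;
    std::map<std::string, int> named_domains;
    int next_handle = 1;

    // Registering a name yields a non-zero handle, so a named domain is
    // always distinct from the default one in this model.
    int create_domain(const std::string& name) {
        auto it = named_domains.find(name);
        if (it != named_domains.end()) return it->second;
        const int handle = next_handle++;
        named_domains[name] = handle;
        return handle;
    }

    // Record a range label under the given domain handle.
    void push_range(int domain_handle, const std::string& label) {
        ranges_by_domain[domain_handle].push_back(label);
    }
};

// Hypothetical helper: records one range in the default domain and one in
// a domain named "global", then reports whether both used the same domain.
bool patched_global_shares_default_domain() {
    ToyBackend backend;
    const int default_handle = 0;                                // models `domain d{}`
    const int patched_handle = backend.create_domain("global");  // models `domain d{"global"}`

    backend.push_range(default_handle, "c_api_range");
    backend.push_range(patched_handle, "scoped_range");

    // The two ranges end up in two separate buckets in this model.
    assert(backend.ranges_by_domain.size() == 2);
    return default_handle == patched_handle;
}
```

If real tools behave like this model, the patch would make the ranges visible but would group them separately from events emitted through nvtxRangePush. Whether that matters for a given tool is something only the maintainers can confirm.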
