
Add missing calls to nvmlShutdown #5311

Merged 1 commit into NVIDIA:main on Feb 9, 2024

Conversation

@JanuszL (Contributor) commented Feb 6, 2024

  • In some places, DALI initializes NVML but fails to call nvmlShutdown to shut it down gracefully.

Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

  • In some places, DALI initializes NVML but fails to call nvmlShutdown to shut it down gracefully (a sketch of the pattern is shown below).
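A minimal sketch of the Init/Shutdown pairing the fix applies, assuming DALI's internal nvml wrappers (nvml::Init, nvml::Shutdown, nvml::GetDriverVersion, as used later in this PR); QueryDriverVersion is a hypothetical helper for illustration only, not code from the diff:

// Illustrative sketch only: every nvml::Init() gets a matching nvml::Shutdown(),
// including on error paths, so the NVML session is always closed gracefully.
float QueryDriverVersion() {  // hypothetical helper, not part of the PR
  nvml::Init();
  float version = 0.f;
  try {
    version = nvml::GetDriverVersion();
  } catch (...) {
    nvml::Shutdown();  // shut down even if the query throws
    throw;
  }
  nvml::Shutdown();
  return version;
}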

Additional information:

Affected modules and functionalities:

  • cuda malloc resource
  • nvjpeg tests
  • gds tests
  • optical flow
  • video loader_decoder

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

- In some places, DALI initializes NVML but fails to call nvmlShutdown to shut it down gracefully.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL (Contributor, Author) commented Feb 6, 2024

!build

@dali-automaton (Collaborator)

CI MESSAGE: [12619731]: BUILD STARTED

@mzient self-assigned this Feb 6, 2024
Comment on lines +207 to 211
nvml::Init();
static float driver_version = nvml::GetDriverVersion();
nvml::Shutdown();
static bool device_supports_hw_decoder = nvml::isHWDecoderSupported();
return device_supports_hw_decoder && driver_version >= 455;
@mzient (Contributor) commented Feb 6, 2024

This is a potential perf hit.

Suggested change:
-nvml::Init();
-static float driver_version = nvml::GetDriverVersion();
-nvml::Shutdown();
-static bool device_supports_hw_decoder = nvml::isHWDecoderSupported();
-return device_supports_hw_decoder && driver_version >= 455;
+static const float driver_version = []() {
+  nvml::Init();
+  float ver = nvml::GetDriverVersion();
+  nvml::Shutdown();
+  return ver;
+}();
+static const bool device_supports_hw_decoder = nvml::isHWDecoderSupported();
+return device_supports_hw_decoder && driver_version >= 455;
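The suggestion relies on function-local statics being initialized exactly once: with the immediately-invoked lambda, nvml::Init() and nvml::Shutdown() run only on the first call, whereas in the original snippet they execute on every invocation and only the initialization of the statics is skipped after the first call.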

@JanuszL (Contributor, Author)

In the hot path, I would go for something faster. This is just a test. I don't think it will affect it much.

@mzient (Contributor)

True.

@dali-automaton (Collaborator)

CI MESSAGE: [12619731]: BUILD PASSED

@JanuszL (Contributor, Author) commented Feb 7, 2024

Let me run with sanitizers ON now that it passes the basic CI.

@dali-automaton (Collaborator)

CI MESSAGE: [12655173]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [12655173]: BUILD FAILED

@JanuszL merged commit 1b60777 into NVIDIA:main on Feb 9, 2024
6 of 7 checks passed
@JanuszL deleted the add_nvml_shutdown branch on February 9, 2024 08:19
@JanuszL added the important-fix label (Fixes an important issue in the software or development environment) on Feb 13, 2024