Throw an exception in cms::cuda::chooseDevice() if CUDAService is disabled #32155

Merged: 1 commit, Nov 18, 2020

makortel
Contributor

@makortel makortel commented Nov 16, 2020

PR description:

It was noticed in #31719 (comment) that there is a viable call path in modules using CUDA that leads to a crash instead of an exception when run on a machine without a GPU. This PR changes cms::cuda::chooseDevice() to require that CUDAService is enabled, and throw an explanatory exception if it is disabled.

PR validation:

Edited HeterogeneousCore/CUDATest/test/testCUDASwitch_cfg.py to force CUDA to be enabled, and confirmed that the exception is thrown on a machine without a GPU.
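
The behavior described above can be sketched in isolation as follows. This is a minimal illustration, not the code in the PR: `MockCudaService`, `chooseDeviceSketch`, and `throwsOnChoose` are hypothetical names, `std::runtime_error` stands in for whatever exception type the framework uses, and the modulo mapping from stream to device is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for CUDAService, reduced to the two queries used here.
struct MockCudaService {
  bool enabledFlag;
  int deviceCount;
  bool enabled() const { return enabledFlag; }
  int numberOfDevices() const { return deviceCount; }
};

// Guard first, then pick a device. The modulo mapping from stream to device
// is an assumption for illustration only.
int chooseDeviceSketch(const MockCudaService& service, unsigned int streamId) {
  if (not service.enabled()) {
    // Fail with an explanatory exception instead of continuing without a usable GPU.
    throw std::runtime_error("Unable to choose a device because CUDAService is disabled");
  }
  const int devices = service.numberOfDevices();
  assert(devices > 0);  // this sketch assumes an enabled service reports at least one device
  return static_cast<int>(streamId % static_cast<unsigned int>(devices));
}

// Returns true if choosing a device throws for the given service state.
bool throwsOnChoose(const MockCudaService& service, unsigned int streamId) {
  try {
    chooseDeviceSketch(service, streamId);
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

With this sketch, `throwsOnChoose(MockCudaService{false, 0}, 0)` reports the disabled case as an exception, while an enabled service with two devices maps stream 3 to a valid device index.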

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32155/19850

  • This PR adds an extra 12KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @makortel (Matti Kortelainen) for master.

It involves the following packages:

HeterogeneousCore/CUDACore

@makortel, @cmsbuild, @fwyzard can you please review it and eventually sign? Thanks.
@rovere this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@makortel
Contributor Author

@cmsbuild, please test

@cmsbuild
Contributor

cmsbuild commented Nov 16, 2020

The tests are being triggered in jenkins.

@cmsbuild
Contributor

+1
Tested at: 00f2cdd
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a230c/10794/summary.html
CMSSW: CMSSW_11_2_X_2020-11-16-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

#include "HeterogeneousCore/CUDAServices/interface/CUDAService.h"

#include "chooseDevice.h"

namespace cms::cuda {
  int chooseDevice(edm::StreamID id) {
    edm::Service<CUDAService> cudaService;
    if (not cudaService->enabled()) {
Contributor

Should this be

Suggested change
- if (not cudaService->enabled()) {
+ if (not cudaService or not cudaService->enabled()) {

to catch the case where the CUDAService is not included in the process ?

Contributor Author

De-referencing the edm::Service<CUDAService> without CUDAService in the configuration gives the standard exception:

Service Request unable to find requested service with compiler type name ' 11CUDAService'.

I thought that would be sufficient to cover that case.
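
To illustrate the point made in this reply, here is a small self-contained sketch of a service handle whose dereference throws when the service is absent. All names (`MockCudaService`, `ServiceHandleSketch`, `Outcome`, `probe`, `selfCheck`) are hypothetical, and `std::logic_error` is only a stand-in for the framework's exception; the real `edm::Service` implementation is not reproduced here.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for CUDAService.
struct MockCudaService {
  bool enabledFlag;
  bool enabled() const { return enabledFlag; }
};

// Hypothetical handle: dereferencing it without a configured service throws,
// so a caller never reaches a null pointer access.
class ServiceHandleSketch {
public:
  explicit ServiceHandleSketch(const MockCudaService* service) : service_(service) {}
  const MockCudaService* operator->() const {
    if (service_ == nullptr) {
      throw std::logic_error("Service Request unable to find requested service");
    }
    return service_;
  }

private:
  const MockCudaService* service_;
};

enum class Outcome { Usable, Disabled, Missing };

// A single dereference covers both failure modes: the handle throws when the
// service is missing, and enabled() reports a disabled service.
Outcome probe(const ServiceHandleSketch& handle) {
  try {
    return handle->enabled() ? Outcome::Usable : Outcome::Disabled;
  } catch (const std::logic_error&) {
    return Outcome::Missing;
  }
}

// Exercises the three cases; aborts via assert if an expectation fails.
void selfCheck() {
  const MockCudaService on{true};
  const MockCudaService off{false};
  assert(probe(ServiceHandleSketch(&on)) == Outcome::Usable);
  assert(probe(ServiceHandleSketch(&off)) == Outcome::Disabled);
  assert(probe(ServiceHandleSketch(nullptr)) == Outcome::Missing);
}
```

Under these assumptions, an explicit null check before the dereference would not be needed to avoid a crash; it would only change which error message the user sees when the service is missing.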

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a230c/10794/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2529296
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2529273
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 148 log files, 22 edm output root files, 35 DQM output files

@silviodonato changed the title from "Throw an exception in cms::cuda::chooseDevice() is CUDAService is disabled" to "Throw an exception in cms::cuda::chooseDevice() if CUDAService is disabled" on Nov 17, 2020
@fwyzard
Contributor

fwyzard commented Nov 17, 2020

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Contributor

qliphy commented Nov 18, 2020

+1
