Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX #5095

Merged
merged 2 commits into from
Dec 9, 2021

Conversation

pgorlani
Copy link
Contributor

@pgorlani pgorlani commented Dec 6, 2021

This solves #4255

The fix expresses the number of threads per block in size_t in order to avoid an infinite loop for parallel_for ranges greater than INT_MAX.

@pgorlani pgorlani requested a review from a team as a code owner December 6, 2021 18:32
steffenlarsen
steffenlarsen previously approved these changes Dec 7, 2021
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

sycl/plugins/cuda/pi_cuda.cpp Outdated Show resolved Hide resolved
@bader bader added the cuda CUDA back-end label Dec 7, 2021
@npmiller
Copy link
Contributor

npmiller commented Dec 7, 2021

I believe this issue is also going to affect the HIP plugin:

So we should probably update that as well.

@romanovvlad romanovvlad merged commit 64720d8 into intel:sycl Dec 9, 2021
@pgorlani pgorlani deleted the int_max_range branch December 9, 2021 15:03
npmiller added a commit to npmiller/llvm that referenced this pull request Dec 9, 2021
This is the equivalent for HIP of the changes in intel#5095.

It also fixes intel#4255 for the HIP plugin.
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Dec 11, 2021
* upstream/sycl: (725 commits)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
  [SYCL] Fix the test to not depend on a specific line. (intel#5092)
  [CI] Provide libclc targets to build and test (intel#5091)
  Fix build of `check-llvm-spirv` target after 8f8001a
  Force opt to use new pass manager in pr52289 test after c34d157
  ...
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Dec 12, 2021
* upstream/sycl:
  [CI] Add container users to video group (intel#5101)
  [CI] More typo fixes in Nightly build (intel#5088)
  Revert "[CI] Disable pack and upload steps (intel#5119)" (intel#5125)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
bader pushed a commit that referenced this pull request Dec 17, 2021
…#5115)

This is the equivalent for HIP of the changes in #5095.

It also fixes #4255 for the HIP plugin.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda CUDA back-end
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants