[EXP][CUDA] Enable Large cluster sizes and fix cluster dimensions being set for dimensions less than 3 #1765
…luster launch is used
if (workDim == 3) {
  launch_attribute[i].value.clusterDim.x =
I could use some help here:
I was not able to figure out where this flipping of the order happens.
I see it is being set in setKernelParams, but I could not work out how it gets flipped.
|
@oneapi-src/unified-runtime-cuda-write gentle ping re: this PR.
lit.py: /home/test-user/actions-runners/01/_work/unified-runtime/unified-runtime/sycl-repo/sycl/test-e2e/lit.cfg.py:718: error: Cannot detect device aspect for cuda:gpu
stdout:
Platforms: 0
default_selector() : No device of requested type available. -1 (PI_ERRO...
accelerator_selector() : No device of requested type available. -1 (PI_ERRO...
cpu_selector() : No device of requested type available. -1 (PI_ERRO...
gpu_selector() : No device of requested type available. -1 (PI_ERRO...
custom_selector(gpu) : No device of requested type available. -1 (PI_ERRO...
custom_selector(cpu) : No device of requested type available. -1 (PI_ERRO...
custom_selector(acc) : No device of requested type available. -1 (PI_ERRO...
Maybe re-triggering the job might help? |
|
Yeah, CI seems to be broken currently, same here: https://github.com/oneapi-src/unified-runtime/actions/runs/9584385509/job/26430113727?pr=1774 |
|
Hi, thanks for your patch. |
…ABLE_CLUSTER_SIZE_ALLOWED flag being added
So the test cases that are being added are in this PR: intel/llvm#14113, which would test this PR fully. I do not suppose I can add a test that checks the ordering of the cluster dimensions in this UR PR; however, I can increase the cluster size in the tests added in #1643 so that they cover this change. Note, though, that this runs on SM_90 only, which I do not suppose is available on the CI. If you prefer, I can add a log of the test run on an H100? |
|
I've removed the ready-to-merge label since intel/llvm#14113 isn't passing CI and also has the abi-break label - it will need to wait for the ABI breaking window to open before it can be merged. |
Thanks, that would be great! |
|
This was included in #1804 which has now merged |
Fix the ordering of the cluster dimensions in accordance with the grid dimensions.
Also adds a call to set CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED to allow cluster sizes greater than 8.
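The two points in this description can be illustrated with a minimal, driver-free sketch. It is not the adapter's actual code: `ClusterDim`, `makeClusterDim`, `needsNonPortableClusterSize`, and `kMaxPortableClusterSize` are hypothetical names, and the mapping of index 0 to x, 1 to y, 2 to z is an assumption standing in for "the same order as the grid dimensions". The sketch pads unused dimensions with 1 when fewer than three are given, and reports when the total cluster size exceeds 8, the case in which a real launch would first need the non-portable attribute (for example via `cuFuncSetAttribute`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the x/y/z triple of a cluster launch attribute.
struct ClusterDim {
  std::uint32_t x = 1;
  std::uint32_t y = 1;
  std::uint32_t z = 1;
};

// Assumed portable limit on blocks per cluster (the "8" from the description).
constexpr std::uint32_t kMaxPortableClusterSize = 8;

// Build a 3-component cluster dimension from a workDim-sized array.
// Dimensions beyond workDim keep the default of 1, so 1D and 2D launches
// never read past the end of the input or leave y/z unset.
// Assumption: index 0 maps to x, 1 to y, 2 to z, matching the grid order.
ClusterDim makeClusterDim(const std::size_t *size, std::uint32_t workDim) {
  assert(workDim >= 1 && workDim <= 3);
  ClusterDim dim;
  dim.x = static_cast<std::uint32_t>(size[0]);
  if (workDim >= 2)
    dim.y = static_cast<std::uint32_t>(size[1]);
  if (workDim == 3)
    dim.z = static_cast<std::uint32_t>(size[2]);
  return dim;
}

// True when the total number of blocks per cluster exceeds the portable
// limit. In a real launch path this is where the kernel would need
// CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED set before launching.
bool needsNonPortableClusterSize(const ClusterDim &dim) {
  return dim.x * dim.y * dim.z > kMaxPortableClusterSize;
}
```

Under these assumptions, a 1D cluster of 4 yields (4, 1, 1) and stays within the portable limit, while a 3D cluster of 4 x 2 x 2 totals 16 blocks and would need the attribute.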