Skip to content

Conversation

@AD2605
Copy link
Contributor

@AD2605 AD2605 commented Jun 17, 2024

Fix the ordering of cluster dimension in accordance to the grid Dimensions.

Also adds a call to CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED to allow
cluster sizes greater than 8

@AD2605 AD2605 requested a review from a team as a code owner June 17, 2024 15:39
@AD2605 AD2605 requested a review from MartinWehking June 17, 2024 15:39
@github-actions github-actions bot added the cuda CUDA adapter specific issues label Jun 17, 2024
Comment on lines +546 to +547
if (workDim == 3) {
launch_attribute[i].value.clusterDim.x =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use some help Here -

I was not able to figure out where this flipping of order happens,
I see it's being set in setKernelParams but how it flips it I was not able to understand it

@JackAKirk JackAKirk added the ready to merge Added to PR's which are ready to merge label Jun 18, 2024
@AD2605
Copy link
Contributor Author

AD2605 commented Jun 20, 2024

@oneapi-src/unified-runtime-cuda-write gentle ping r.e. this PR.
E2E Cuda fails because It was unable to find a cuda GPU, seems unrelated to my changes -

lit.py: /home/test-user/actions-runners/01/_work/unified-runtime/unified-runtime/sycl-repo/sycl/test-e2e/lit.cfg.py:718: error: Cannot detect device aspect for cuda:gpu
stdout:

Platforms: 0
default_selector()      : No device of requested type available. -1 (PI_ERRO...
accelerator_selector()  : No device of requested type available. -1 (PI_ERRO...
cpu_selector()          : No device of requested type available. -1 (PI_ERRO...
gpu_selector()          : No device of requested type available. -1 (PI_ERRO...
custom_selector(gpu)    : No device of requested type available. -1 (PI_ERRO...
custom_selector(cpu)    : No device of requested type available. -1 (PI_ERRO...
custom_selector(acc)    : No device of requested type available. -1 (PI_ERRO...

Maybe re-triggering the job might help ?

@JackAKirk
Copy link
Contributor

yeh CI seems to be broken currently, same here: https://github.com/oneapi-src/unified-runtime/actions/runs/9584385509/job/26430113727?pr=1774

@MartinWehking
Copy link

Hi, thanks for your patch.
Are there any test cases that fail without your fix and if not, could you provide one?

@AD2605
Copy link
Contributor Author

AD2605 commented Jun 21, 2024

Hi, thanks for your patch. Are there any test cases that fail without your fix and if not, could you provide one?

So the test cases that are being added, are in this PR here - intel/llvm#14113, which would test this PR fully. I do not suppose I can add a test which will check the ordering of the cluster dimensions in this UR PR, however I can change increase the cluster size in the tests added in #1643 and increase the cluster size, such that it tests CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED flag being added

However, note that this runs on SM_90 only, which I do not suppose is on the CI. If you prefer, I can add like a log of the test run on H100 ?

@AD2605 AD2605 requested a review from a team as a code owner June 21, 2024 15:39
@github-actions github-actions bot added the conformance Conformance test suite issues. label Jun 21, 2024
@callumfare callumfare removed the ready to merge Added to PR's which are ready to merge label Jun 24, 2024
@callumfare
Copy link
Contributor

I've removed the ready-to-merge label since intel/llvm#14113 isn't passing CI and also has the abi-break label - it will need to wait for the ABI breaking window to open before it can be merged.

@MartinWehking
Copy link

Hi, thanks for your patch. Are there any test cases that fail without your fix and if not, could you provide one?

So the test cases that are being added, are in this PR here - intel/llvm#14113, which would test this PR fully. I do not suppose I can add a test which will check the ordering of the cluster dimensions in this UR PR, however I can change increase the cluster size in the tests added in #1643 and increase the cluster size, such that it tests CU_FUNC_ATTRIBUTE_NON_PORTABLE_CLUSTER_SIZE_ALLOWED flag being added

However, note that this runs on SM_90 only, which I do not suppose is on the CI. If you prefer, I can add like a log of the test run on H100 ?

Thanks, that would be great!

@JackAKirk JackAKirk added the ready to merge Added to PR's which are ready to merge label Jun 28, 2024
@hdelan hdelan mentioned this pull request Jun 28, 2024
@callumfare
Copy link
Contributor

This was included in #1804 which has now merged

@callumfare callumfare closed this Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conformance Conformance test suite issues. cuda CUDA adapter specific issues ready to merge Added to PR's which are ready to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants