Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LIBCLC][AMDGCN] Fix get_max_sub_group_size #5386

Closed
wants to merge 2 commits into from

Conversation

npmiller
Copy link
Contributor

Using defines to figure out the wavefront size there is incorrect
because libclc is not built for a specific amdgcn version, so it will
always default to 64.

Instead use the __oclc_wavefront64 global variable provided by ROCm,
which will be set to a different value depending on the architecture.

This may fix some of the discrepancies between tests being run on gfx908 and the tests running on the CI, as the CI hardware uses a wavefront of 32 which mismatches with what was returned by this function, and that this function is used in the implementation of the group collectives.

And so it may fix:

Using defines to figure out the wavefront size there is incorrect
because libclc is not built for a specific amdgcn version, so it will
always default to `64`.

Instead use the `__oclc_wavefront64` global variable provided by ROCm,
which will be set to a different value depending on the architecture.
@npmiller npmiller requested a review from bader as a code owner January 25, 2022 11:53
bader
bader previously approved these changes Jan 25, 2022
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
@npmiller
Copy link
Contributor Author

So the test results:

********************
Failed Tests (1):
  SYCL :: GroupAlgorithm/SYCL2020/shift_left_right.cpp

********************
Unexpectedly Passed Tests (3):
  SYCL :: SubGroup/broadcast.cpp
  SYCL :: SubGroup/broadcast_fp16.cpp
  SYCL :: SubGroup/broadcast_fp64.cpp


Testing Time: 190.11s
  Unsupported        : 400
  Passed             : 288
  Expectedly Failed  :  53
  Failed             :   1
  Unexpectedly Passed:   3

So it looks like this does fix the broadcast tests as I was hoping.

I'm not sure what's going on with the new failure, however the test is using shuffleUp and shuffleDown so it's possible that #5359 would fix it. I'm not able to make it fail on gfx908 so it is difficult to debug, maybe I should close this PR and bring the patch into the shuffles PR to see how the testing goes then.

@npmiller
Copy link
Contributor Author

Closing, will be merged as part of #5359

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants