Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIP: Fix concurrency #6479

Merged
merged 1 commit into from
Oct 10, 2023
Merged

HIP: Fix concurrency #6479

merged 1 commit into from
Oct 10, 2023

Conversation

maartenarnst
Copy link
Contributor

@maartenarnst maartenarnst commented Oct 3, 2023

This PR fixes

i.e., Kokkos hardodes in Kokkos_HIP.cpp the maximum number of waves per CU, and the goal of this PR is to make computations of the concurrency consistent with this hardcoded number.

Longer term, should we try to work towards removing this hardcoded value, and ping somebody from AMD to bring this issue to their attention and ask if there is a way for the hip function to ideally return directly the appropriate value for maxThreadsPerMultiProcessor?

@dalg24-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@masterleinad
Copy link
Contributor

OK to test.

@masterleinad
Copy link
Contributor

The Cuda backend has the same code. It probably makes sense for that part to be consistent across the two backends.

@Rombur
Copy link
Member

Rombur commented Oct 3, 2023

Longer term, should we try to work towards removing this hardcoded value, and ping somebody from AMD to bring this issue to their attention

It is someone from AMD that wrote that code...

@IanBogle do you know if now there is a way to get the sustainable number of wavefronts / CU instead of the maximum of WF / CU?

@IanBogle
Copy link
Contributor

IanBogle commented Oct 3, 2023

I don't know off the top of my head, but I'm asking around. What do you mean by "sustainable number of wavefronts/CU"? Is it sustainable in terms of occupancy or some other characteristic?

@Rombur
Copy link
Member

Rombur commented Oct 3, 2023

Yes sustainable in term of occupancy. @arghdos wrote that code a while back

// theoretically, we can get 40 WF's / CU, but only can sustain 32 see
// https://github.com/ROCm-Developer-Tools/HIP/blob/a0b5dfd625d99af7e288629747b40dd057183173/vdi/hip_platform.cpp#L742
Impl::HIPInternal::m_maxWavesPerCU = 32;

I wonder if there is a new property that would give us that number now.

@IanBogle
Copy link
Contributor

IanBogle commented Oct 4, 2023

I don't know of any property that exposes this number, but I'll check with @arghdos when he's back in the office. I have checked recent versions of the file that is referenced in the code snippet, and it is still looks accurate.

@Rombur
Copy link
Member

Rombur commented Oct 4, 2023

Thanks for looking @IanBogle!

@dalg24
Copy link
Member

dalg24 commented Oct 10, 2023

Unrelated OpenMPTarget failure

4: [ RUN      ] openmptarget.partitioning_by_args
4: /var/jenkins/workspace/Kokkos/core/unit_test/TestExecSpacePartitioning.hpp:100: Failure
4: Expected equality of these values:
4:   sum1
4:     Which is: 7642095
4:   sum2
4:     Which is: 0
4: [  FAILED  ] openmptarget.partitioning_by_args (0 ms)

@dalg24 dalg24 merged commit d97f16f into kokkos:develop Oct 10, 2023
27 of 28 checks passed
@masterleinad masterleinad mentioned this pull request Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants