Initial implementation of gfx942 #6358

Merged: 6 commits, Aug 16, 2023

Conversation

skyreflectedinmirrors
Contributor

No description provided.

Change-Id: Id31ca3ba5356d021cade2abc3e3f51f9f3b4d211
Change-Id: I1454bb0b91518bfcf7a04506e40b98387cdf8ed9
Change-Id: Id9c03fe451d1d28a3c23a77f161a2600f016c7e4
Review thread on cmake/kokkos_arch.cmake (outdated, resolved)
Nick Curtis and others added 2 commits August 14, 2023 15:45
Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
Co-authored-by: Damien L-G <dalg24+github@gmail.com>
@dalg24
Member

dalg24 commented Aug 15, 2023

Can you say a word about the thread fences that are being added?

@skyreflectedinmirrors
Contributor Author

> Can you say a word about the thread fences that are being added?

Technically, the previous code violated the memory model: nothing guaranteed that the writes of the intermediate reduction values became visible before the last block read them for the second stage. This never bit us because it was quite unlikely on the hardware we're running on, but... shall we say that may not always hold.

I've tested correctness and performance of several LAMMPS benchmarks, a regular dot product, and the yAx tutorial example on MI-250, and saw essentially no impact from unconditionally including it.
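
The ordering issue described in this reply can be illustrated with a small host-side sketch. This is not the Kokkos HIP code from the PR; it is an analogue in standard C++ where threads stand in for GPU blocks and `std::atomic_thread_fence` stands in for a device-wide fence such as `__threadfence()`. The function name `two_stage_sum` and the variables `scratch` and `blocks_done` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical host-side analogue of a two-stage reduction.
// Each "block" (a thread here) writes a partial sum to plain memory,
// then bumps a counter; the last block to arrive combines all partials.
long two_stage_sum(const std::vector<int>& data, std::size_t num_blocks) {
  assert(num_blocks > 0);
  std::vector<long> scratch(num_blocks, 0);  // intermediate values (non-atomic)
  std::atomic<std::size_t> blocks_done{0};   // arrival counter
  long result = 0;

  auto block = [&](std::size_t b) {
    // Stage 1: reduce this block's slice of the input.
    const auto begin = static_cast<std::ptrdiff_t>(data.size() * b / num_blocks);
    const auto end = static_cast<std::ptrdiff_t>(data.size() * (b + 1) / num_blocks);
    scratch[b] = std::accumulate(data.begin() + begin, data.begin() + end, 0L);

    // Fence BEFORE signalling arrival: without it, nothing orders the
    // scratch write ahead of the counter increment for other threads.
    std::atomic_thread_fence(std::memory_order_release);
    const std::size_t ticket = blocks_done.fetch_add(1, std::memory_order_relaxed);

    if (ticket + 1 == num_blocks) {
      // Last block: fence AFTER observing the counter, then read all partials.
      std::atomic_thread_fence(std::memory_order_acquire);
      // Stage 2: combine the intermediate values.
      result = std::accumulate(scratch.begin(), scratch.end(), 0L);
    }
  };

  std::vector<std::thread> workers;
  for (std::size_t b = 0; b < num_blocks; ++b) workers.emplace_back(block, b);
  for (auto& w : workers) w.join();
  return result;
}
```

Removing the two fences from this sketch leaves the counter update as the only communication between threads, and a relaxed counter by itself does not make the `scratch` writes visible to the last arriver. That is the same class of problem the reply describes for the intermediate reduction values.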

Review thread on cmake/kokkos_arch.cmake (outdated, resolved)
Change-Id: Ibd028fddeedf8e0fdda50b72625ab62cee6fa71e
@dalg24
Member

dalg24 commented Aug 16, 2023

CUDA failure unrelated

@dalg24 dalg24 merged commit 04d5c55 into kokkos:develop Aug 16, 2023
25 of 28 checks passed
@Rombur mentioned this pull request Aug 11, 2023
4 participants