
Initial HIP backend support #3

Merged
merged 20 commits into develop on Jan 3, 2024

Conversation


@rcarson3 rcarson3 commented Oct 1, 2020

Draft proposal to add HIP support. This PR is currently just a way to easily keep track of what's changed while adding the new HIP backend.

@rcarson3 rcarson3 added the WIP label Oct 1, 2020

rcarson3 commented Jan 4, 2022

Just so I don't forget: before merging, address #10 in here as well.

Comment on lines 51 to 54
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0)) || defined(__HIP_DEVICE_COMPILE__)
#define __cuda_device_only__
#else
#define __cuda_host_only__
#endif
Member Author

So, we should probably move to something more generic here, like __gpu_device_only__ and __gpu_host_only__, and maybe add the snls name somewhere in there as well, just to avoid name clashing with other people's macros.

Member Author

Along similar lines, it would also be good to update the device forall portion of the code.

@rcarson3 rcarson3 marked this pull request as ready for review June 1, 2022 21:07
Lots of different parts of the library referenced either CUDA or HIP.
Since we're seeing more and more GPU vendors come online, it made more sense to generalize things to be called gpu where possible.
As part of this work, I renamed a number of the macros so that they would use SNLS in the name to avoid name clashing with other codes.
@rcarson3 rcarson3 removed the WIP label Aug 17, 2023

rcarson3 commented Dec 6, 2023

Note: the GPU single point test still has some CUDA-specific stuff in it. I'm leaving it in there for now, given it's being completely ripped out in #14.

cmake/thirdpartylibraries/FindRAJA.cmake (outdated, resolved)
src/SNLS_TrDLDenseG_Batch.h (resolved)
src/SNLS_gpu_portability.h (resolved)

rcarson3 commented Jan 3, 2024

@gberg617 made a few small updates:

1. Include Alan's bug fixes related to the device class, so they wouldn't sit in the other branch/PR for too long.
2. Simplify some of the CUDA/HIP kernel logic in places, so we only have one forall call rather than two different ones for the two execution types.
3. Some minor fixes to tests that I noticed when testing this on vernal with the RAJA Portability Suite enabled, which surfaced compilation cases I hadn't run across before.

@rcarson3 rcarson3 merged commit 1064940 into develop Jan 3, 2024
3 participants