Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Refactor GPU reductions and add unsafe atomic tunings #247

Closed
wants to merge 32 commits into from

Conversation

MrBurmark
Copy link
Member

@MrBurmark MrBurmark commented Jun 3, 2022

Refactor GPU reductions and add unsafe atomic tunings

This adds a "unsafeAtomic" tuning of each of the kernels with a hip variant using atomics.
This also refactors gpu reductions so the implementation is not duplicated in each kernel with a reduction.
This also adds lambda variants of gpu reduction kernels.

  • This PR is a refactoring and feature
  • It does the following (modify list as needed):
    • refactors gpu reduction code to avoid duplication
    • Adds "unsafeAtomic" hip atomic tunings at the request of myself

@MrBurmark MrBurmark marked this pull request as ready for review June 3, 2022 05:32
Now gpu kernels using atomics have a default tuning
called atomic and tunings with unsafe atomics have
a tuning called unsafeAtomics.
This makes it clear where atomics are being used
as part of a reduction or not
Copy link
Member

@rhornung67 rhornung67 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of changes in this PR. I think I understand them all.

@MrBurmark
Copy link
Member Author

I kept finding things to change...
I have more in some more in mind for a future PR too.

#endif
template < size_t block_size >
__launch_bounds__(block_size)
__global__ void reduce_sum_unsafe(Real_ptr x, Real_ptr dsum, Real_type sum_init,
Copy link
Member

@CRobeck CRobeck Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we start to add more tests with atomics we might not want to split these up into separate kernels for compile time concerns but should be OK for now.

return devProp.gcnArchName;
}

#if defined(__gfx90a__)
Copy link
Member

@CRobeck CRobeck Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be able to do away with gfx90a arch check in future Rocm release.

@rhornung67 rhornung67 mentioned this pull request Jul 10, 2023
24 tasks
@MrBurmark MrBurmark changed the title Add support for hip unsafe atomics [Draft] Refactor GPU reductions and add unsafe atomic tunings Jul 18, 2023
@MrBurmark
Copy link
Member Author

I'm closing this PR for a number of reasons. Its worth differentiating between safe and unsafe atomics as that is a temporary issue. The reducer implementation is difficult to put in a macro and not identically duplicated, so its not a huge gain to abstract it. The lambda variants are not necessary.

@MrBurmark MrBurmark closed this Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants