Develop Stream 2023-05-12 #243

Closed
wants to merge 26 commits into from

Conversation

Beanavil
Contributor

This PR adds improvements to existing algorithms/tests and solves several small bugs.

  • BlockRadixRank and BlockRadixRankMatch are now implemented in terms of rocPRIM.
  • hipCUB now matches CUB's 2.0.0 and 2.0.1 releases, except for some breaking changes left out of this PR.
  • Extended floats are now present in more tests.
  • Documentation is now updated to the current state of hipCUB.
  • Fixed DeviceSegmentedReduce::ArgMin and DeviceSegmentedReduce::ArgMax for INF input.
  • References to hcc have been removed.
  • A new overload of StableSortKeysCopy has been added to address part of CUB's 2.1.0 release.
  • Several bug fixes and optimizations.

Snektron and others added 26 commits May 10, 2023 13:33
load direct warp striped cannot load into warp striped if the
block size does not divide the hardware warp size.
The HipcubDeviceRadixSort.SortKeysOver4G test requires more than 8 GB of
global memory to complete and fails on devices with 8 GB or less. This
commit disables the test for devices that have 8 GB or less.
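
A minimal sketch of the kind of memory guard the commit message above describes, not the commit's actual code. The helper name and the constant are hypothetical, and "8 GB" is read here as 8 GiB, which is an assumption. The HIP and GoogleTest wiring appears only in a comment so the snippet compiles without either dependency.

```cpp
#include <cstdint>

// Assumed threshold: 8 GiB. The commit message says "8 GB"; the exact
// byte count used by the real test is not given in the PR.
constexpr std::uint64_t kOver4GThresholdBytes = 8ull << 30;

// Hypothetical helper: returns true only when the device reports strictly
// more global memory than the threshold, matching "8 GB or less" being
// excluded.
constexpr bool has_enough_memory_for_over4g(std::uint64_t total_global_mem_bytes)
{
    return total_global_mem_bytes > kOver4GThresholdBytes;
}

// In a HIP + GoogleTest build, the guard could be wired up roughly as:
//
//   hipDeviceProp_t props;
//   hipGetDeviceProperties(&props, device_id);
//   if (!has_enough_memory_for_over4g(props.totalGlobalMem))
//   {
//       GTEST_SKIP() << "Test needs more than 8 GB of global memory";
//   }
```

Keeping the comparison in a small pure function makes the boundary case (exactly 8 GB is skipped) easy to check without a GPU.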
@Beanavil Beanavil closed this May 12, 2023

6 participants