Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test large arrays in in device radix sort #1349

Merged
merged 12 commits into from
Feb 21, 2024

Conversation

alliepiper
Copy link
Collaborator

@alliepiper alliepiper commented Feb 7, 2024

Description

closes #1268
closes #1348

Adds a new test utility that generates large (>2^32 elements) test inputs for sorting algorithms and verifies them, optimizing for runtime and minimizing host/device memory overhead.

Also adds large num_item test cases to radix sort tests in catch2 and removes the old legacy radix sort tests.

A new c2h::cpu_timer utility is also added that is useful for breaking down execution times of test cases per-step.

Looking at CI logs, this appears to increase the time needed to run CUB tests in CI by around 1 minutes (39m -> 40 min for gcc12/cuda12.3/c++17). The new large-input tests take longer than one minute, but their cost is offset by cleaning up some redundant small-input testscases.

@alliepiper alliepiper requested review from a team as code owners February 7, 2024 19:21
@alliepiper alliepiper marked this pull request as draft February 7, 2024 19:21
Copy link
Collaborator

@elstehle elstehle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick PR! I like the approach a lot for sanity checking large number of items. I'd be fine just being able to test keys for now as a best-effort approach to catching indexing issues, while halving memory requirements.

cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
@alliepiper alliepiper changed the title [DRAFT] Test large arrays in in device radix sort Test large arrays in in device radix sort Feb 15, 2024
@alliepiper alliepiper marked this pull request as ready for review February 15, 2024 14:30
@alliepiper
Copy link
Collaborator Author

Looking at CI logs, this appears to increase the time needed to run CUB tests in CI by around 9 minutes (39m -> 48 min for gcc12/cuda12.3/c++17).

@alliepiper
Copy link
Collaborator Author

Time cost update after removing redundant tests using signed NumItemsT (cub::detail::ChooseOffsetsT automatically converts signed to unsigned):

Looking at CI logs, this appears to increase the time needed to run CUB tests in CI by around 1 minutes (39m -> 40 min for gcc12/cuda12.3/c++17).

Thanks @elstehle for pointing this out!

The new large-input tests do take longer than 1 minute to run, but this is offset by reducing the number of small-input tests.

Copy link
Collaborator

@gevtushenko gevtushenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
cub/test/c2h/checked_allocator.cuh Outdated Show resolved Hide resolved
cub/test/c2h/checked_allocator.cuh Show resolved Hide resolved
cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
cub/test/catch2_large_array_sort_helper.cuh Outdated Show resolved Hide resolved
@alliepiper
Copy link
Collaborator Author

Thanks for the excellent reviews! Should be all set to merge now once CI is green (current failure looks like an infra issue).

@elstehle I've updated the large array initialization methods to take a seed as suggested -- be sure to update your merge sort PR.

@alliepiper alliepiper merged commit 77431b8 into NVIDIA:main Feb 21, 2024
538 checks passed
@alliepiper alliepiper deleted the radix_sort_large branch February 21, 2024 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Design and implement solution for large num_items Port test/test_device_radix_sort.cu to Catch2
3 participants