[cub] Replace arch_id with compute_capability #8728

Merged
davebayer merged 1 commit into NVIDIA:main from davebayer:replace_arch_id_with_cc
Apr 30, 2026
@davebayer
Contributor

No description provided.

@davebayer davebayer requested review from a team as code owners April 29, 2026 12:19
@davebayer davebayer requested a review from gevtushenko April 29, 2026 12:19
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Apr 29, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Apr 29, 2026
Comment thread c/parallel/src/merge_sort.cu
@davebayer davebayer force-pushed the replace_arch_id_with_cc branch from b2fba9f to f4249f9 Compare April 29, 2026 12:34
Comment thread cub/benchmarks/bench/transform/common.h Outdated
Comment thread cub/benchmarks/bench/transform/common.h Outdated
Comment thread cub/cub/detail/launcher/cuda_runtime.cuh Outdated
Comment thread cub/cub/detail/arch_dispatch.cuh
Comment thread cub/cub/detail/arch_dispatch.cuh Outdated
Comment thread cub/cub/detail/arch_dispatch.cuh Outdated
@davebayer davebayer requested a review from a team as a code owner April 29, 2026 12:48
@davebayer davebayer requested a review from ericniebler April 29, 2026 12:48
Comment thread cub/cub/device/dispatch/kernels/kernel_reduce.cuh
Comment thread cub/cub/device/dispatch/kernels/kernel_reduce_deterministic.cuh
Comment thread cub/cub/device/dispatch/kernels/kernel_scan_warpspeed.cuh
Comment thread cub/cub/device/dispatch/kernels/kernel_segmented_radix_sort.cuh
Comment thread cub/cub/device/dispatch/kernels/kernel_segmented_reduce.cuh
Comment thread cub/cub/device/dispatch/kernels/kernel_transform.cuh
Comment thread cub/cub/device/dispatch/tuning/tuning_histogram.cuh Outdated
Comment thread cub/cub/device/dispatch/tuning/tuning_radix_sort.cuh Outdated
Comment thread cub/cub/device/dispatch/dispatch_adjacent_difference.cuh Outdated
Comment thread cub/benchmarks/bench/reduce/arg_extrema.cu Outdated
Comment thread cub/benchmarks/bench/reduce/base.cuh Outdated
Comment thread cub/benchmarks/bench/reduce/nondeterministic.cu Outdated
Comment thread cub/benchmarks/bench/scan/exclusive/by_key.cu Outdated
Comment thread cub/benchmarks/bench/run_length_encode/non_trivial_runs.cu Outdated
@davebayer davebayer requested a review from a team as a code owner April 30, 2026 07:10
@davebayer davebayer force-pushed the replace_arch_id_with_cc branch from 082e9c5 to ae0976f Compare April 30, 2026 07:17
@davebayer davebayer force-pushed the replace_arch_id_with_cc branch from 22481d8 to 3e9c4f6 Compare April 30, 2026 10:23
@davebayer
Contributor Author

pre-commit.ci autofix

@copy-pr-bot
Contributor

copy-pr-bot Bot commented Apr 30, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@davebayer
Contributor Author

/ok to test c3570ee

@davebayer davebayer force-pushed the replace_arch_id_with_cc branch from c3570ee to 1884bd5 Compare April 30, 2026 15:34
@jrhemstad
Collaborator

/ok to test 1884bd5

Comment on lines 84 to 88
  ::cudaError_t PtxVersion(int& version) const
  {
-   version = cc * 10;
+   version = cc_ * 10;
    return cudaSuccess;
  }
Contributor

Suggestion: this PR should make that function obsolete, so it can be removed in a follow-up PR.
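
For readers following the thread, the arithmetic in the snippet above can be sketched in isolation. This is a minimal sketch and not the launcher from the PR: `error_t`, `launcher_stub`, and `ptx_version_of` are hypothetical names, and the encoding of `cc_` as major * 10 + minor (for example 86 for SM 8.6) is an assumption inferred from the `* 10` scaling.

```cpp
// Hypothetical stand-in for ::cudaError_t so the sketch compiles without CUDA headers.
enum class error_t
{
  success = 0
};

// Hypothetical launcher stub; only PtxVersion mirrors the snippet under review.
struct launcher_stub
{
  // Assumed encoding: major * 10 + minor, e.g. 86 for SM 8.6.
  int cc_;

  // Scales the stored compute capability by 10, e.g. 86 -> 860.
  constexpr error_t PtxVersion(int& version) const
  {
    version = cc_ * 10;
    return error_t::success;
  }
};

// Hypothetical helper that returns the scaled value directly,
// which makes the mapping easy to check at compile time.
constexpr int ptx_version_of(int cc)
{
  int version = 0;
  launcher_stub{cc}.PtxVersion(version);
  return version;
}
```

Under these assumptions, `ptx_version_of(86)` evaluates to 860. If callers already hold the compute capability directly, the scaled value adds no information, which seems to be the point behind the suggestion to remove the function.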

struct a_policy
{
-  arch_id value;
+  int value;
Contributor

Suggestion: Please add a comment to turn the int into a cuda::compute_capability once #8730 is fixed. This can be done after this PR, or fixed right away when solving #8730.
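
One way to picture the suggested follow-up is sketched below. It deliberately avoids the real `cuda::compute_capability` type, whose interface is not shown in this thread. `compute_capability_sketch`, `a_policy_typed`, and `to_typed` are hypothetical names, and the major * 10 + minor integer encoding is an assumption.

```cpp
// Hypothetical strongly typed stand-in; NOT the real cuda::compute_capability API.
struct compute_capability_sketch
{
  int major_;
  int minor_;

  // Assumed integer encoding carried by the plain int today: major * 10 + minor.
  constexpr int as_int() const
  {
    return major_ * 10 + minor_;
  }
};

constexpr bool operator==(compute_capability_sketch a, compute_capability_sketch b)
{
  return a.major_ == b.major_ && a.minor_ == b.minor_;
}

// The test policy as it appears in the diff after this PR: a plain int.
struct a_policy
{
  int value; // TODO: use a compute-capability type once #8730 is resolved
};

// Hypothetical shape of the policy after the follow-up.
struct a_policy_typed
{
  compute_capability_sketch value;
};

// Hypothetical migration helper: splits the assumed integer encoding
// into its major and minor parts.
constexpr a_policy_typed to_typed(a_policy p)
{
  return a_policy_typed{compute_capability_sketch{p.value / 10, p.value % 10}};
}
```

With a typed member, mixing up a compute capability with an unrelated integer becomes a compile error rather than a silent bug, which is presumably the motivation for the requested comment.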

Comment thread cub/test/catch2_test_arch_dispatch.cu
@github-actions
Contributor

🥳 CI Workflow Results

🟩 Finished in 2h 27m: Pass: 100%/381 | Total: 10d 05h | Max: 2h 00m | Hits: 87%/483871

See results here.

@davebayer davebayer merged commit 46e67f2 into NVIDIA:main Apr 30, 2026
400 of 401 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in CCCL Apr 30, 2026
rapids-bot Bot pushed a commit to rapidsai/rapids-cmake that referenced this pull request May 1, 2026
This PR updates CCCL to the latest 3.4.0 pre-release commit.

Comparison of CCCL commits: NVIDIA/cccl@a4fd978...2b7188d

We specifically need these features/fixes for RAPIDS projects:
- NVIDIA/cccl#8728

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1015
@davebayer davebayer deleted the replace_arch_id_with_cc branch May 7, 2026 05:58

4 participants