[backport/3.0] Replace CUB util_arch.cuh macros with inline constexpr variables #4165 (#4202)

Merged
fbusato merged 1 commit into NVIDIA:branch/3.0.x from fbusato:backport-3.0-cub-macro-deprecation on Mar 20, 2025
Conversation

@fbusato fbusato commented Mar 19, 2025

Description

backport of #4165
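
For readers unfamiliar with the pattern named in the title, here is a minimal sketch of replacing a preprocessor macro with a C++17 `inline constexpr` variable. All names below (`EXAMPLE_LOG_WARP_THREADS`, `example::log_warp_threads`, `example::warp_threads`, `example::lanes_in_warps`) are hypothetical and are not the identifiers from `util_arch.cuh`; see #4165 for the actual change.

```cpp
// Hypothetical names for illustration only; these are not CUB's identifiers.

// Before: a preprocessor macro. It has no type and ignores namespaces.
#define EXAMPLE_LOG_WARP_THREADS 5

namespace example
{
// After: typed, namespaced constants. `inline` (C++17) lets the definition
// live in a header while remaining a single entity across translation units.
inline constexpr int log_warp_threads = 5;
inline constexpr int warp_threads     = 1 << log_warp_threads;

// The variable works anywhere a constant expression is required.
constexpr int lanes_in_warps(int warps)
{
  return warps * warp_threads;
}
} // namespace example

// Both forms yield the same compile-time value.
static_assert(example::warp_threads == (1 << EXAMPLE_LOG_WARP_THREADS), "values must agree");
```

Unlike the macro, the variable is scoped to its namespace and carries a type, so mistakes tend to surface as ordinary compiler diagnostics rather than as text-substitution surprises.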

@fbusato fbusato added the 3.0 label Mar 19, 2025
@fbusato fbusato requested a review from bernhardmgruber March 19, 2025 17:53
@fbusato fbusato self-assigned this Mar 19, 2025
@fbusato fbusato requested a review from a team as a code owner March 19, 2025 17:53
@fbusato fbusato added this to CCCL Mar 19, 2025
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Mar 19, 2025
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Mar 19, 2025
@fbusato fbusato changed the base branch from main to branch/3.0.x March 19, 2025 17:53
@fbusato fbusato enabled auto-merge (squash) March 19, 2025 17:54
@github-actions

🟩 CI finished in 2h 53m: Pass: 100%/97 | Total: 2d 17h | Avg: 40m 21s | Max: 1h 58m | Hits: 69%/134281
  • 🟩 cub: Pass: 100%/45 | Total: 1d 16h | Avg: 54m 18s | Max: 1h 58m | Hits: 53%/53780

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 14h | Avg: 54m 09s | Max:  1h 58m | Hits:  54%/51336 
      🟩 arm64              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 18s | Max:  1h 07m | Hits:  49%/2444  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 53m | Avg: 58m 37s | Max:  1h 03m | Hits:  43%/5940  
      🟩 12.6               Pass: 100%/2   | Total:  2h 37m | Avg:  1h 18m | Max:  1h 19m | Hits:  12%/2260  
      🟩 12.8               Pass: 100%/38  | Total:  1d 09h | Avg: 52m 27s | Max:  1h 58m | Hits:  57%/45580 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 08m | Hits:  53%/2108  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 53m | Avg: 58m 37s | Max:  1h 03m | Hits:  43%/5940  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  2h 37m | Avg:  1h 18m | Max:  1h 19m | Hits:  12%/2260  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 07h | Avg: 51m 40s | Max:  1h 58m | Hits:  57%/43472 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 08m | Hits:  53%/2108  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 14h | Avg: 53m 44s | Max:  1h 58m | Hits:  53%/51672 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 17m | Avg:  1h 04m | Max:  1h 08m | Hits:  49%/4896  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 02m | Hits:  49%/2444  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  49%/2444  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  49%/2444  
      🟩 Clang18            Pass: 100%/7   | Total:  6h 11m | Avg: 53m 07s | Max:  1h 08m | Hits:  65%/8218  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 44m | Avg: 52m 11s | Max: 53m 25s | Hits:  49%/2448  
      🟩 GCC8               Pass: 100%/1   | Total: 49m 38s | Avg: 49m 38s | Max: 49m 38s | Hits:  49%/1224  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 44m | Avg: 52m 08s | Max: 52m 14s | Hits:  49%/2448  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 46s | Max: 51m 07s | Hits:  49%/2448  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 50s | Max: 50m 08s | Hits:  49%/2444  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 44m | Avg: 52m 28s | Max: 54m 33s | Hits:  49%/2444  
      🟩 GCC13              Pass: 100%/11  | Total:  7h 41m | Avg: 41m 59s | Max:  1h 58m | Hits:  74%/13442 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  14%/2088  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 05m | Hits:  13%/2088  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  2h 37m | Avg:  1h 18m | Max:  1h 19m | Hits:  12%/2260  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 16h 48m | Avg: 59m 21s | Max:  1h 08m | Hits:  56%/20446 
      🟩 GCC                Pass: 100%/22  | Total: 17h 04m | Avg: 46m 33s | Max:  1h 58m | Hits:  61%/26898 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 13m | Avg:  1h 03m | Max:  1h 05m | Hits:  14%/4176  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 37m | Avg:  1h 18m | Max:  1h 19m | Hits:  12%/2260  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 13m | Avg: 24m 25s | Max: 25m 31s | Hits:  82%/3666  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 09h | Avg: 59m 19s | Max:  1h 19m | Hits:  44%/40338 
      🟩 rtxa6000           Pass: 100%/8   | Total:  5h 53m | Avg: 44m 11s | Max:  1h 58m | Hits:  83%/9776  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 58m 14s | Max:  1h 19m | Hits:  44%/44004 
      🟩 DeviceLaunch       Pass: 100%/1   | Total:  1h 58m | Avg:  1h 58m | Max:  1h 58m | Hits:  69%/1222  
      🟩 GraphCapture       Pass: 100%/1   | Total: 21m 03s | Avg: 21m 03s | Max: 21m 03s | Hits:  99%/1222  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 18m | Avg: 26m 01s | Max: 27m 31s | Hits:  99%/3666  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 11m | Avg: 23m 40s | Max: 26m 57s | Hits:  99%/3666  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 13m | Avg: 24m 25s | Max: 25m 31s | Hits:  82%/3666  
      🟩 90;90a;100         Pass: 100%/1   | Total: 47m 05s | Avg: 47m 05s | Max: 47m 05s | Hits:  49%/1222  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 39m | Avg: 58m 58s | Max:  1h 19m | Hits:  43%/23662 
      🟩 20                 Pass: 100%/25  | Total: 21h 04m | Avg: 50m 34s | Max:  1h 58m | Hits:  62%/30118 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 46m | Avg: 30m 21s | Max: 57m 28s | Hits: 79%/80181

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 36m 52s | Avg: 18m 26s | Max: 25m 27s | Hits:  89%/3566  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 52m | Avg: 30m 31s | Max: 57m 28s | Hits:  79%/76616 
      🟩 arm64              Pass: 100%/2   | Total: 53m 25s | Avg: 26m 42s | Max: 28m 14s | Hits:  79%/3565  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 52m | Avg: 34m 26s | Max: 48m 58s | Hits:  78%/8906  
      🟩 12.6               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max: 55m 10s | Hits:  58%/3564  
      🟩 12.8               Pass: 100%/38  | Total: 18h 03m | Avg: 28m 31s | Max: 57m 28s | Hits:  81%/67711 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 51m 27s | Avg: 25m 43s | Max: 25m 58s | Hits:  79%/3564  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 52m | Avg: 34m 26s | Max: 48m 58s | Hits:  78%/8906  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max: 55m 10s | Hits:  58%/3564  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 17h 12m | Avg: 28m 40s | Max: 57m 28s | Hits:  81%/64147 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 51m 27s | Avg: 25m 43s | Max: 25m 58s | Hits:  79%/3564  
      🟩 nvcc               Pass: 100%/43  | Total: 21h 54m | Avg: 30m 34s | Max: 57m 28s | Hits:  79%/76617 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 58m | Avg: 29m 32s | Max: 30m 28s | Hits:  79%/7128  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 03m | Avg: 31m 37s | Max: 32m 09s | Hits:  79%/3564  
      🟩 Clang16            Pass: 100%/2   | Total: 57m 42s | Avg: 28m 51s | Max: 28m 54s | Hits:  79%/3564  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 04s | Max: 31m 05s | Hits:  79%/3564  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 34m | Avg: 22m 01s | Max: 31m 51s | Hits:  85%/12474 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 54s | Max: 32m 14s | Hits:  79%/3566  
      🟩 GCC8               Pass: 100%/1   | Total: 31m 47s | Avg: 31m 47s | Max: 31m 47s | Hits:  79%/1783  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 09s | Max: 33m 03s | Hits:  79%/3566  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 51s | Max: 31m 42s | Hits:  79%/3566  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 30s | Max: 33m 42s | Hits:  79%/3566  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 53s | Max: 34m 13s | Hits:  79%/3566  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 29m | Avg: 20m 56s | Max: 32m 23s | Hits:  87%/17830 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 42m | Avg: 51m 18s | Max: 53m 38s | Hits:  65%/3552  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 18m | Avg: 46m 02s | Max: 57m 28s | Hits:  71%/5328  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max: 55m 10s | Hits:  58%/3564  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 35m | Avg: 26m 47s | Max: 32m 09s | Hits:  81%/30294 
      🟩 GCC                Pass: 100%/21  | Total:  9h 19m | Avg: 26m 39s | Max: 34m 13s | Hits:  82%/37443 
      🟩 MSVC               Pass: 100%/5   | Total:  4h 00m | Avg: 48m 08s | Max: 57m 28s | Hits:  69%/8880  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 08s | Max: 55m 10s | Hits:  58%/3564  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 17m 08s | Hits:  89%/3566  
      🟩 rtx2080            Pass: 100%/33  | Total: 18h 37m | Avg: 33m 51s | Max: 55m 10s | Hits:  76%/58802 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 40m | Avg: 22m 01s | Max: 57m 28s | Hits:  89%/17813 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 21h 17m | Avg: 33m 36s | Max: 57m 28s | Hits:  76%/67709 
      🟩 TestCPU            Pass: 100%/3   | Total: 43m 30s | Avg: 14m 30s | Max: 27m 43s | Hits:  99%/5341  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 31s | Avg: 11m 22s | Max: 11m 52s | Hits:  99%/7131  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 17m 08s | Hits:  89%/3566  
      🟩 90;90a;100         Pass: 100%/1   | Total: 32m 00s | Avg: 32m 00s | Max: 32m 00s | Hits:  79%/1783  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 48m | Avg: 35m 25s | Max: 55m 10s | Hits:  75%/35631 
      🟩 20                 Pass: 100%/23  | Total: 10h 20m | Avg: 26m 59s | Max: 57m 28s | Hits:  82%/40984 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 16m 50s | Avg: 4m 12s | Max: 5m 05s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 09s | Avg:  5m 04s | Max:  5m 05s
      🟩 arm64              Pass: 100%/2   | Total:  6m 41s | Avg:  3m 20s | Max:  3m 23s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 16m 50s | Avg:  4m 12s | Max:  5m 05s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  8m 28s | Avg:  4m 14s | Max:  5m 05s
      🟩 20                 Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  5m 04s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 17m 43s | Avg: 8m 51s | Max: 15m 00s | Hits: 97%/320

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 17m 43s | Avg:  8m 51s | Max: 15m 00s | Hits:  97%/320   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 43s | Avg:  2m 43s | Max:  2m 43s | Hits:  95%/160   
      🟩 Test               Pass: 100%/1   | Total: 15m 00s | Avg: 15m 00s | Max: 15m 00s | Hits:  98%/160   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 11m | Avg: 1h 11m | Max: 1h 11m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 97)

# Runner
68 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@fbusato fbusato merged commit 72ff92c into NVIDIA:branch/3.0.x Mar 20, 2025
110 of 113 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in CCCL Mar 20, 2025
@fbusato fbusato deleted the backport-3.0-cub-macro-deprecation branch March 20, 2025 22:06

2 participants