[BUG]: Validate tuning refactoring on all affected SM versions #3235

@gevtushenko

Is this a duplicate?

Type of Bug

Performance

Component

CUB

Describe the bug

#3138 introduced a 120% performance regression in cub::DeviceScan::Sum on Hopper.
Looking at the PR description:

No changes in SASS for cub.test.device_scan.lid_0.types_0 except kernel symbol names

I assume that we checked SASS only on the default set of architectures from our preset, "60;70;80":

diff old.sass new.sass | wc -l 
88

The refactoring also touches the SM90 policies, which results in the following diff:

diff old.90.sass new.90.sass | wc -l 
83618

We should verify the SM90 SASS differences for the other algorithms we refactored recently.
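One way to widen the check beyond the default preset is to count diff lines per architecture instead of for a single dump. The following is a minimal sketch, assuming the dumps were already produced with cuobjdump and saved as old.<arch>.sass and new.<arch>.sass (the naming used above). The function names sass_diff_lines and report_sass_diffs are hypothetical helpers, not part of the repository tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the CCCL repository.

# Print the number of lines `diff` reports between two SASS dumps.
sass_diff_lines() {
  # diff exits with status 1 when the files differ, so swallow it.
  { diff "$1" "$2" || true; } | wc -l | tr -d '[:space:]'
}

# For each architecture, print "<arch> <diff line count>".
# Expects <dir>/old.<arch>.sass and <dir>/new.<arch>.sass to exist.
report_sass_diffs() {
  local dir=$1
  shift
  local arch
  for arch in "$@"; do
    printf '%s %s\n' "$arch" \
      "$(sass_diff_lines "$dir/old.$arch.sass" "$dir/new.$arch.sass")"
  done
}
```

For example, `report_sass_diffs . 60 70 80 90` would print one count per architecture, so an architecture that falls outside the preset shows up in the same report.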

How to Reproduce

git checkout 52b4b671c111fc289b2f2247a16eff21d5a5a1a8
./ci/build_cub.sh -arch 90
cuobjdump --dump-sass build/cuda12.6-gcc13/cub-cpp17/bin/cub.bench.scan.exclusive.sum.base > new.90.sass
git checkout HEAD~
./ci/build_cub.sh -arch 90
cuobjdump --dump-sass build/cuda12.6-gcc13/cub-cpp17/bin/cub.bench.scan.exclusive.sum.base > old.90.sass
diff old.90.sass new.90.sass | wc -l 
83618

Expected behavior

No SASS difference on any architecture after the tuning refactoring.
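Since the referenced PR reports differences in kernel symbol names only, a pass/fail check probably needs to ignore those lines before comparing. The sketch below is one possible shape for such a check, not an existing tool. The helper name sass_equal_ignoring is hypothetical, and the caller supplies the pattern for lines to ignore, because which lines of a cuobjdump dump carry symbol names is an assumption to confirm on a real dump.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of the CCCL repository.

# Succeed only if two SASS dumps are identical once lines matching an
# extended regex (for example, lines carrying kernel symbol names) are
# dropped. Usage: sass_equal_ignoring <pattern> <old.sass> <new.sass>
sass_equal_ignoring() {
  local pattern=$1 old=$2 new=$3
  local a b status=0
  a=$(mktemp) || return 2
  b=$(mktemp) || { rm -f "$a"; return 2; }
  # grep exits with 1 when it selects no lines; that is fine here.
  grep -E -v -e "$pattern" "$old" > "$a" || true
  grep -E -v -e "$pattern" "$new" > "$b" || true
  # cmp -s is silent and exits with 1 when the filtered dumps differ.
  cmp -s "$a" "$b" || status=$?
  rm -f "$a" "$b"
  return "$status"
}
```

A call such as `sass_equal_ignoring 'Function :' old.90.sass new.90.sass` would then return a non-zero status if anything other than the ignored lines changed. The 'Function :' pattern here is only an illustrative guess at the lines to skip.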

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

No response
