@sstamenk (Contributor) commented Nov 3, 2025

This PR ports the changes from #1784 to kernels.hip and ops.hip so that the newly added test_dynamic_blockwise_quantization_large test passes on AMD GPUs. Without this change, the test aborts.

Tested on both W7900 (gfx1100) and R9700 (gfx1201); all 3 unit tests passed.
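
For context, the branch name (blockwise-quant-index-overflow-amd) points to an index overflow in the blockwise quantization path. The sketch below is not the code from #1784 or from kernels.hip. It is a host-side C++ illustration, assuming the failure comes from computing an element offset as block index times block size in 32-bit arithmetic. The names block_base_offset and fits_in_int32 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from bitsandbytes): element offset of a
// quantization block, computed in 64-bit so large tensors do not overflow.
inline std::int64_t block_base_offset(std::int64_t block_idx, std::int64_t block_size) {
    assert(block_idx >= 0 && block_size > 0);
    return block_idx * block_size;
}

// Reports whether an offset is still representable as a 32-bit signed int,
// i.e. whether code that indexes with a plain `int` could address it.
inline bool fits_in_int32(std::int64_t offset) {
    return offset <= std::numeric_limits<std::int32_t>::max();
}
```

With an assumed block size of 4096, block 524287 starts at offset 2147479552, which still fits in a 32-bit int, while block 524288 starts at 2147483648 (2^31), which does not. A large-tensor test is the kind of input that would cross this boundary.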

@matthewdouglas added this to the v0.49.0 milestone on Nov 3, 2025
github-actions bot commented Nov 3, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@matthewdouglas (Member) left a comment

Thanks, LGTM!

@matthewdouglas merged commit 1920972 into bitsandbytes-foundation:main on Nov 3, 2025
53 checks passed
@sstamenk deleted the blockwise-quant-index-overflow-amd branch on November 3, 2025 22:00