
[VitisAI] AMD NPU LLM Quantization - Add Windows + CUDA support for Quark Quantizer #2269

Merged
jambayk merged 4 commits into microsoft:main from poganesh:win_cuda
Nov 21, 2025

Conversation

@poganesh
Contributor

Describe your changes

This PR extends Quark quantization support to Windows + CUDA workflows and improves the stability of Quark quantization.
Key changes:

  • Added Windows + CUDA platform support for the Quark quantization pass
  • Added bfloat16 dtype support in the Quark quantization pass for improved AWQ quantization stability
  • Tested on Windows 11 Pro with NVIDIA RTX 3090 + CUDA 13.0
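
The PR does not say why bfloat16 helps AWQ stability. One plausible reason, stated here as an assumption rather than the author's explanation, is dynamic range: float16 overflows above 65504 and flushes very small values to zero, while bfloat16 keeps the float32 exponent range at the cost of mantissa precision. The sketch below illustrates that difference using only the Python standard library. The helper names `to_float16` and `to_bfloat16` are hypothetical and are not part of Quark or Olive; the bfloat16 emulation truncates rather than rounds to nearest.

```python
import math
import struct


def to_float16(x: float) -> float:
    """Round-trip x through IEEE 754 half precision; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the float16 range.
        return math.copysign(math.inf, x)


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    This truncates the mantissa (real hardware usually rounds to nearest),
    which is enough to illustrate the range/precision trade-off.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    # Illustrative magnitudes only; not measurements from the PR.
    for value in (70000.0, 1e-8, 3.14159):
        print(value, to_float16(value), to_bfloat16(value))
```

Running the script shows a large value becoming infinite and a tiny one becoming zero in float16, while the emulated bfloat16 keeps both finite and nonzero with coarser precision.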

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@jambayk jambayk enabled auto-merge (squash) November 20, 2025 18:07
@jambayk jambayk merged commit 8b44cf4 into microsoft:main Nov 21, 2025
15 checks passed

4 participants