[AMD] Update Quark Quantization Pass for Quark 0.11 and VitisAI LLM Fusion Model Support #2364

Merged
jambayk merged 2 commits into microsoft:main from poganesh:npu_fusion_use_ep_v2
Mar 26, 2026

Conversation

@poganesh
Contributor

Describe your changes

  • Updates the QuarkQuantization (torch) pass from the Quark 0.10 API to the Quark 0.11 API
  • Adds full fusion optimization for LLM models where supported
  • Adds token fusion support for models where full fusion is not yet available
  • Adds GPT-OSS pre-quantized model support
  • Aligns with the MS-AMD 3D release (2/17/26)

@poganesh poganesh changed the title [AMD] Update QuarkQuantization Pass (torch) for Quark 0.11 and VitisAI LLM Fusion Model Support [AMD] Update Quark Quantization Pass for Quark 0.11 and VitisAI LLM Fusion Model Support Mar 22, 2026
@poganesh
Contributor Author

@devang-ml, @xieofxie, could you please help review this PR?

@jambayk jambayk merged commit 6c1a869 into microsoft:main Mar 26, 2026
15 checks passed