Omni: Fix GGUF workflow on Windows#252

Merged
xiangyuT merged 2 commits into intel:main from xiangyuT:fix_gguf_win_0120 on Jan 21, 2026
Conversation

@xiangyuT
Contributor

No description provided.

@xiangyuT xiangyuT marked this pull request as ready for review January 20, 2026 06:56
@xiangyuT xiangyuT merged commit 24e2e64 into intel:main Jan 21, 2026
MegaStood pushed a commit to MegaStood/llm-scaler that referenced this pull request on Apr 16, 2026:
…, cross-platform bugs

- IPEX #838 had zero PRs; fix was never released, repo archived March 30 2026
- vllm-xpu-kernels CUTLASS XE PRs intel#88/intel#98/intel#114 replace GatedMLPMOE
- Expert scaling fixes in progress: PRs intel#252 (1024 experts), intel#253 (128-256)
- Cross-platform: 128-expert Qwen3-30B-A3B crashes on NVIDIA too (vLLM #35922, SGLang #9872)
- Added Llama-4-Scout-17B-16E (16 experts, works) to threshold table
- Other unfixed IPEX issues: #864 (GPT-OSS-20B-Int4), #869 (CPU offload)
- Updated path forward: vLLM v0.16+ with vllm-xpu-kernels is recommended migration

https://claude.ai/code/session_01JyMJU94Dq32vYBGMoMJM34
