Omni: Fix GGUF workflow on Windows#252

Merged
xiangyuT merged 2 commits into intel:main from xiangyuT:fix_gguf_win_0120 on Jan 21, 2026
Conversation

@xiangyuT
Contributor

No description provided.

@xiangyuT xiangyuT marked this pull request as ready for review January 20, 2026 06:56
@xiangyuT xiangyuT merged commit 24e2e64 into intel:main Jan 21, 2026
MegaStood pushed a commit to MegaStood/llm-scaler that referenced this pull request on Apr 16, 2026:
…, cross-platform bugs

- IPEX #838 had zero PRs; fix was never released, repo archived March 30 2026
- vllm-xpu-kernels CUTLASS XE PRs intel#88/intel#98/intel#114 replace GatedMLPMOE
- Expert scaling fixes in progress: PRs intel#252 (1024 experts), intel#253 (128-256)
- Cross-platform: 128-expert Qwen3-30B-A3B crashes on NVIDIA too (vLLM #35922, SGLang #9872)
- Added Llama-4-Scout-17B-16E (16 experts, works) to threshold table
- Other unfixed IPEX issues: #864 (GPT-OSS-20B-Int4), #869 (CPU offload)
- Updated path forward: vLLM v0.16+ with vllm-xpu-kernels is recommended migration

https://claude.ai/code/session_01JyMJU94Dq32vYBGMoMJM34
