
fix: VLLM_SKIP_PROFILE_RUN patch for Lunar Lake iGPU profile_run() hang#340

Open
MegaStood wants to merge 1 commit into intel:main from MegaStood:claude/upstream-skip-profile-patch-CB5w6

Conversation

@MegaStood

Bug

vLLM's XPU worker calls profile_run() during startup — a dummy forward pass
to measure peak GPU memory for KV cache sizing. On Lunar Lake Xe2 iGPU (Arc 140V),
this hangs indefinitely for MoE models (gpt-oss-20b, GLM-4.7-flash), blocking
server startup entirely.

Related upstream issue: vllm-project/vllm#30359

Fix

Adds a vllm_xpu_worker_skip_profile.patch for vllm/v1/worker/xpu_worker.py that
introduces VLLM_SKIP_PROFILE_RUN=1 environment variable support:

  • Skips profile_run() entirely when set
  • Estimates peak memory as memory_allocated() × 1.2 (conservative)
  • Prints memory profiling analysis for debugging

Also updates lunar_lake_serve.sh to set the env var automatically.
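The serve-script change amounts to exporting the variable before launching the server. A minimal sketch (the actual model name and launch flags in lunar_lake_serve.sh may differ):

```shell
# Enable the profile_run() skip before starting vLLM, as
# lunar_lake_serve.sh now does automatically.
export VLLM_SKIP_PROFILE_RUN=1

# Illustrative launch command; real flags depend on the script.
# vllm serve gpt-oss-20b --device xpu

echo "VLLM_SKIP_PROFILE_RUN=${VLLM_SKIP_PROFILE_RUN}"
```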

Impact

  • Without patch: Server hangs at startup for MoE models on iGPU — unusable
  • With patch: the KV cache is sized from the conservative 1.2× estimate
    (slightly less cache than optimal), but the server starts and runs correctly
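The skip path and its 1.2× estimate can be sketched as below. This is a hypothetical illustration, not the patch itself: the function name, parameters, and print format are illustrative and do not match vLLM's actual `_determine_available_memory_default()` signature; only the `VLLM_SKIP_PROFILE_RUN` variable and the `memory_allocated() * 1.2` rule come from the PR.

```python
import os


def determine_available_memory(total_gpu_memory: int,
                               allocated_bytes: int,
                               gpu_memory_utilization: float,
                               run_profile) -> int:
    """Return the bytes left for the KV cache (illustrative sketch).

    allocated_bytes stands in for torch.xpu.memory_allocated() after
    model load; run_profile stands in for the dummy forward pass.
    """
    if os.environ.get("VLLM_SKIP_PROFILE_RUN") == "1":
        # Skip the dummy forward pass that hangs on Lunar Lake iGPUs;
        # assume peak usage is 1.2x what the loaded weights occupy.
        peak = int(allocated_bytes * 1.2)
        print(f"[skip-profile] allocated={allocated_bytes} est_peak={peak}")
    else:
        # Normal path: measure the real peak via a dummy forward pass.
        peak = run_profile()
    budget = int(total_gpu_memory * gpu_memory_utilization)
    return max(budget - peak, 0)
```

With the skip enabled, any overestimate in the 1.2× factor comes straight out of the KV cache budget, which is why the resulting cache is slightly smaller than a measured profile would allow.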

Tested On

  • Device: MSI Claw 8 AI+ (Core Ultra 7 258V, Arc 140V, 32GB LPDDR5x)
  • Models verified: gpt-oss-20b (MXFP4), Qwen3.5-4B (INT4), Qwen3-8B (INT4)
  • vLLM version: 0.14.0 with XPU backend
  • Dense models (Qwen3.5-4B, Qwen3-8B) also work with the patch — the 1.2× estimate
    matches actual peak closely

@MegaStood MegaStood closed this Apr 2, 2026
@MegaStood MegaStood force-pushed the claude/upstream-skip-profile-patch-CB5w6 branch from 16aa9cc to e874953 on April 2, 2026 11:54
…Lake iGPU

vLLM's XPU worker runs a dummy forward pass (profile_run()) during startup
to measure peak GPU memory for KV cache sizing. On Lunar Lake's Xe2 iGPU,
this forward pass hangs indefinitely for MoE models (gpt-oss-20b, GLM-4.7).

This patch adds VLLM_SKIP_PROFILE_RUN=1 environment variable support to
_determine_available_memory_default() in xpu_worker.py. When set:
- Skips profile_run() entirely
- Estimates peak memory as memory_allocated() * 1.2
- Prints memory profiling analysis for debugging

Tested on: MSI Claw 8 AI+ (Core Ultra 7 258V, Arc 140V, 32GB LPDDR5x)
Models verified: gpt-oss-20b (MXFP4), Qwen3.5-4B, Qwen3-8B

Related: vllm-project/vllm#30359
@MegaStood
Author

Please check and review.

