Pull requests: vllm-project/vllm

Pull requests list

[XPU] Fix AWQ skipped layer detection in IPEX quantization
Labels: ready (ONLY add when PR is ready to merge/full CI is needed)
#29774 opened Dec 1, 2025 by faaany
3 of 5 tasks
[ROCm] [Fused Moe] Use binary expert mask for aiter fused moe kernel
Labels: rocm (Related to AMD ROCm)
#29773 opened Dec 1, 2025 by ZhiweiYan-96
5 tasks
[Draft] AFD basic implention
Labels: deepseek (Related to DeepSeek models), documentation (Improvements or additions to documentation), frontend, v1
#29772 opened Dec 1, 2025 by jiangkuaixue123 Draft
11 tasks
[Bugfix][MM] Move grid_thw tensor to cpu before directly converting to numpy
Labels: qwen (Related to Qwen models), ready
#29770 opened Dec 1, 2025 by shen-shanshan
5 tasks
Refactor MLA attention: move prefill logic to layer.py
Labels: rocm, v1
#29769 opened Dec 1, 2025 by therealnaveenkamal
1 of 5 tasks
Bump actions/setup-python from 6.0.0 to 6.1.0
Labels: ci/build, dependencies (Pull requests that update a dependency file), github_actions (Pull requests that update GitHub Actions code)
#29768 opened Dec 1, 2025 by dependabot bot
[Misc] Unify tokenizer registration
Labels: frontend, ready, structured-output, v1
#29767 opened Dec 1, 2025 by DarkLight1337
5 tasks
[Bugfix]: Fix missing SPLIT_K in GPTQ/AWQ MoE Triton config
#29766 opened Dec 1, 2025 by aaarkai
3 of 5 tasks
[Bugfix] fix --scheduling-policy=priority & n>1 crashes engine
Labels: ready, v1
#29764 opened Dec 1, 2025 by chaunceyjiang
5 tasks
[BugFix] avoid debug log on hot path
Labels: v1
#29761 opened Dec 1, 2025 by BoyuanFeng
[BugFix] Preserve spec decoding uniform decode when scheduling
Labels: ready, v1
#29759 opened Dec 1, 2025 by njhill
Add Mistral Large 3
Labels: deepseek, new-model (Requests to new models), ready, speculative-decoding, v1
#29757 opened Nov 30, 2025 by juliendenize
1 of 5 tasks
docs: add guide to reduce PyTorch profiler overhead via env vars (#29564)
Labels: documentation
#29753 opened Nov 30, 2025 by kbp4154
[Feature]Add EVS (Efficient Video Sampling) Support for Qwen3-VL
Labels: qwen
#29752 opened Nov 30, 2025 by skyloevil
[crashfix] Eagle + multimodal can crash on mm cache miss
Labels: ready, v1
#29750 opened Nov 30, 2025 by mickaelseznec
5 tasks
Add OpenVLA model support
#29738 opened Nov 30, 2025 by yongming-qin Draft
Add KV Cache Memory Estimator Example Script
Labels: documentation
#29736 opened Nov 29, 2025 by ksenthilnathan02
[Frontend] Add streaming tool-call support to Responses API (non-Harmony)
Labels: frontend, gpt-oss (Related to GPT-OSS models)
#29726 opened Nov 29, 2025 by sumitaryal
5 tasks
[WIP][Feat][Sched] Support Balance Scheduling
Labels: v1
#29721 opened Nov 29, 2025 by GDzhu01
5 tasks