Pull requests: vllm-project/vllm

Pull requests list

Enable multi-image support benchmarking for serving
Labels: performance
#21145 opened Jul 17, 2025 by leopck
[Perf] Using mul instead of div for int8 quant
#21136 opened Jul 17, 2025 by yewentao256
[Bugfix] Fix the tensor non-contiguous issue for Flashinfer TRT-LLM backend attention kernel
Labels: bug, ready, v1
#21133 opened Jul 17, 2025 by elvischenv
3 of 4 tasks
[V0 deprecation] Remove V0 HPU backend
Labels: ci/build, ready
#21131 opened Jul 17, 2025 by WoosukKwon
Convert tests to ruff-format
Labels: deepseek, llama, multi-modality, performance, qwen, rocm, speculative-decoding, structured-output, tool-calling, v1
#21129 opened Jul 17, 2025 by hmellor
[Core] Set pooling params based on task and model
Labels: frontend, ready, tpu, v1
#21128 opened Jul 17, 2025 by DarkLight1337
2 of 4 tasks
docker: docker-aware precompiled wheel support
Labels: ci/build
#21127 opened Jul 17, 2025 by dougbtv
4 tasks done
[WIP] Use FlashInfer RoPE
#21126 opened Jul 17, 2025 by mgoin
4 tasks
[Refactor] Remove Unused Naive Moe Kernels
Labels: performance
#21125 opened Jul 17, 2025 by yewentao256
[UPDATED] - Large Block_size solution
Labels: v1
#21123 opened Jul 17, 2025 by nadathurv
[Bugfix] Allocate less memory in non-batched CUTLASS MoE
Labels: bug, ready
#21121 opened Jul 17, 2025 by ElizaWszola
security policy: take 1
Labels: documentation
#21119 opened Jul 17, 2025 by sidhpurwala-huzaifa
feat: add fused MLA QKV + strided layernorm
Labels: deepseek
#21116 opened Jul 17, 2025 by mickaelseznec
3 of 4 tasks
[benchmark] Sending request strictly follows the random intervals
Labels: performance, ready
#21108 opened Jul 17, 2025 by Jialin
3 of 4 tasks
Add fused_moe_gate kernel and integrate to DeepSeek MoE layer
Labels: ci/build, deepseek, performance
#21107 opened Jul 17, 2025 by zhangxy9999 (Draft)
[Model] Re-add the implicit conversion feature for as_seq_cls_model
Labels: llama, new-model, qwen
#21103 opened Jul 17, 2025 by noooop
3 of 4 tasks
[misc][eplb] add valida ep or tp or dp
#21102 opened Jul 17, 2025 by lengrongfu
1 of 4 tasks