Pull requests: vllm-project/vllm
fix: Properly set engine_id when using multi connector in dynamo
#19487
opened Jun 11, 2025 by
Missmiaom
[Misc] Fix misleading ROCm warning
rocm
Related to AMD ROCm
#19486
opened Jun 11, 2025 by
jeejeelee
3 of 4 tasks
[Core] Do not copy array during hashing
multi-modality
Related to multi-modality (#4194)
v1
#19484
opened Jun 11, 2025 by
lgeiger
[V1][Metrics] Add instance_id (hostname) label for prometheus metrics
v1
#19469
opened Jun 11, 2025 by
reidliu41
4 tasks
[BugFix] Destroy nccl Comm to fix cuda memory leak of destroy_model_parallel
#19465
opened Jun 11, 2025 by
wcsjtu
3 of 4 tasks
[WIP][Perf] Improve/Fix-regression for FA3 in QPS regimes
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#19463
opened Jun 11, 2025 by
LucasWilkinson
2 of 4 tasks
[doc] fix TPU getting started page
documentation
Improvements or additions to documentation
#19457
opened Jun 11, 2025 by
davidxia
[Bugfix] Update the example code, make it work with the latest lmcache
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#19453
opened Jun 11, 2025 by
runzhen
Enforce contiguous input for dynamic_per_token FP8/INT8 quant
#19452
opened Jun 11, 2025 by
mgoin
Draft: WIP NixlConnector drop ZMQ in favor of HTTP metadata exchanges
frontend
needs-rebase
v1
#19447
opened Jun 10, 2025 by
wseaton
2 of 7 tasks
[Kernel] Integrate IBM/Applied-AI fused moe kernels
#19443
opened Jun 10, 2025 by
varun-sundar-rabindranath
Draft
[Misc] Update lmcache connector with the latest connector apis
#19441
opened Jun 10, 2025 by
YaoJiayi
[Chore] debloat some initial logs
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#19438
opened Jun 10, 2025 by
aarnphm
[BugFix] Honor enable_caching in connector-delayed kvcache load case
bug
Something isn't working
v1
#19435
opened Jun 10, 2025 by
njhill
[WIP][FP8] ScaledMM refactor
needs-rebase
#19434
opened Jun 10, 2025 by
ProExpertProg
Draft
4 tasks
Fixed power build by building numba from source
ci/build
#19433
opened Jun 10, 2025 by
npanpaliya
4 tasks
[Bugfix]: fix JSON decode error when tool call argument is empty
frontend
#19428
opened Jun 10, 2025 by
my-git9
Mistral tool parser streaming update
frontend
tool-calling
#19425
opened Jun 10, 2025 by
avigny
[Misc][Benchmarking] Add variable request-rate ("ramp-up") to the benchmarking client.
#19423
opened Jun 10, 2025 by
dtransposed
Added FP8 quantization support to DualChunkFlashAttentionBackend
#19420
opened Jun 10, 2025 by
ExtReMLapin