Pull requests: vllm-project/vllm
[Bugfix] Fix DeepSeek MTP crash when using TP1ModelRunner with CUDA graph due to shape mismatch
speculative-decoding
#14237
opened Mar 4, 2025 by
pyc96
[V1][Bugfix] Do not reset prefix caching metrics
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#14235
opened Mar 4, 2025 by
comaniac
Adjusting invalid k values in top-k selection beyond vocabulary limits
v1
#14234
opened Mar 4, 2025 by
dashanji
Use getattr for hidden_act and hidden_activation in Gemma models
#14230
opened Mar 4, 2025 by
richardsliu
[CI] Make UT cases in test_comm_ops.py compatible with more devices
#14229
opened Mar 4, 2025 by
wwfu109
Serialize using safetensors for KV caches
ready
#14228
opened Mar 4, 2025 by
KuntaiDu
[V1][TPU] Support V1 Sampler for ragged attention
ci/build
v1
#14227
opened Mar 4, 2025 by
NickLucche
[V1] Do not detokenize if sampling param detokenize is False
v1
#14224
opened Mar 4, 2025 by
hj-mistral
[Doc] Update nginx guide: remove privileged from vllm container run and add target GPU ID
documentation
Improvements or additions to documentation
#14217
opened Mar 4, 2025 by
iacolippo
[Model] Add Reasoning Parser for Granite Models
documentation
frontend
#14202
opened Mar 4, 2025 by
alex-jw-brooks
Moved numba from common requirements to cuda/rocm specific requirements
ci/build
force-merge
ready
#14199
opened Mar 4, 2025 by
npanpaliya
docs: Add documentation for s390x cpu implementation
documentation
#14198
opened Mar 4, 2025 by
dilipgb
[misc] Update blog link in README
documentation
#14194
opened Mar 4, 2025 by
khluu
[RLHF] use worker_adapter_cls for compatibility with V0 and V1
ci/build
documentation
#14185
opened Mar 4, 2025 by
youkaichao
[Bugfix] Correctly call cudaProfilerStop in benchmarks script
#14183
opened Mar 4, 2025 by
b8zhong
Fix WorkerWrapperBase initialization: defer vllm_config setup
#14179
opened Mar 4, 2025 by
vincent-4