Pull requests: vllm-project/vllm
[Feature] The Qwen3 reasoning parser supports guided decoding
ready
ONLY add when PR is ready to merge/full CI is needed
#17466
opened Apr 30, 2025 by
chaunceyjiang
[Model] Add GraniteMoeHybrid 4.0 model
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
needs-rebase
tool-calling
v1
[Misc] refactor example - cpu_offload_lmcache
documentation
Improvements or additions to documentation
#17460
opened Apr 30, 2025 by
reidliu41
[CI/Build] Reorganize models tests
ci/build
multi-modality
Related to multi-modality (#4194)
#17459
opened Apr 30, 2025 by
DarkLight1337
[Bugfix] Fixed mistral tokenizer path when pointing to file
#17457
opened Apr 30, 2025 by
psav
[Feature][Frontend]: Deprecate --enable-reasoning
documentation
Improvements or additions to documentation
frontend
structured-output
tool-calling
#17452
opened Apr 30, 2025 by
chaunceyjiang
Fix more broken speculative decode tests
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
#17450
opened Apr 30, 2025 by
huydhn
[Bugfix] Fix TritonPlaceholder conflicts with torch.compile
v1
#17446
opened Apr 30, 2025 by
MengqingCao
Draft
[benchmark][structured output] Add offline benchmark script for structured output
structured-output
#17440
opened Apr 30, 2025 by
lk-chen
fix missing _num_cached_tokens in subtract_num_batched_tokens
#17436
opened Apr 30, 2025 by
initzhang
[V1] Allow turning off pickle fallback in vllm.v1.serial_utils
v1
#17427
opened Apr 30, 2025 by
russellb
[Fix] Support passing args to logger
multi-modality
Related to multi-modality (#4194)
structured-output
#17425
opened Apr 29, 2025 by
aarnphm
[Misc][AMD] Add query_platform method to interface.py
#17424
opened Apr 29, 2025 by
rasmith
[Chore] import as annotations on config
needs-rebase
ready
ONLY add when PR is ready to merge/full CI is needed
#17423
opened Apr 29, 2025 by
aarnphm
[Feature][CLI] Unify configuration for structured outputs via --structured-output-config
documentation
Improvements or additions to documentation
needs-rebase
structured-output
tool-calling
v1
#17420
opened Apr 29, 2025 by
aarnphm
Avoid overwriting vllm_compile_cache.py
ready
ONLY add when PR is ready to merge/full CI is needed
#17418
opened Apr 29, 2025 by
youngkent
[Bugfix] Temporarily disable gptq_bitblas on ROCm
documentation
Improvements or additions to documentation
#17411
opened Apr 29, 2025 by
nlzy
[Frontend] Fix tool_call handling in llama3.1 and llama3.2 chat template to allow zero tool_calls
documentation
Improvements or additions to documentation
tool-calling
#17409
opened Apr 29, 2025 by
CatherineSue