Pull requests: vllm-project/vllm
Pull requests list
Use xla flag to improve the quantized model performance
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#19303
opened Jun 6, 2025 by
vanbasten23
3 tasks done
[Misc] Change tests/compile to use VLLM_V1 by default
ready
Add optional token-level progress bar to LLM.beam_search using tqdm
frontend
#19301
opened Jun 6, 2025 by
NekoMimiUnagi
3 tasks done
[Bugfix] Re-enable use_cudagraph in vLLM v1
ready
#19299
opened Jun 6, 2025 by
zou3519
[Misc] Fixes and Optimizations for DeepEP + DeepGEMM combination.
ready
v1
#19298
opened Jun 6, 2025 by
varun-sundar-rabindranath
[Quantization] Bump compressed-tensors version
ci/build
ready
#19295
opened Jun 6, 2025 by
kylesayrs
[V1] Add API docs for EncoderCacheManager
ready
v1
#19294
opened Jun 6, 2025 by
russellb
[TPU] support fp8 kv cache quantization
tpu
v1
#19292
opened Jun 6, 2025 by
yaochengji
[Metrics] Compute and log the serving FLOPs
documentation
Improvements or additions to documentation
#19290
opened Jun 6, 2025 by
sysradium
[Misc] Add documentation update reminder to PR template
ci/build
#19289
opened Jun 6, 2025 by
Isotr0py
1 of 3 tasks
[Frontend] Remove unreachable code from llm.py
frontend
#19288
opened Jun 6, 2025 by
KsuParkhamchuk
[CI/Build] Improve Llama GGUF test robustness
ready
#19287
opened Jun 6, 2025 by
Isotr0py
1 of 3 tasks
[Core] Update error message for Whisper + num-scheduler-steps > 1
ready
#19286
opened Jun 6, 2025 by
russellb
[Bugfix]: Fix TypeError: 'float' object cannot be interpreted as an integer
ready
v1
#19283
opened Jun 6, 2025 by
chaunceyjiang
Update compatible packaging version
ci/build
ready
#19279
opened Jun 6, 2025 by
pramenku
Convert kv_transfer_config from dict to KVTransferConfig to fix #19259
frontend
#19262
opened Jun 6, 2025 by
maobaolong
[New Model]: Support Qwen3 Embedding & Reranker
frontend
#19260
opened Jun 6, 2025 by
noooop
[CPU] Fix torch version in x86 CPU backend and refine default configurations
ci/build
multi-modality
Related to multi-modality (#4194)
v1
#19258
opened Jun 6, 2025 by
bigPYJ1151
3 tasks done
[CI][PowerPC] Use a more appropriate way to select testcase in tests/models/language/pooling/test_embedding.py
ci/build
ready
#19253
opened Jun 6, 2025 by
AaruniAggarwal
[Misc] refactor context extension
documentation
#19246
opened Jun 6, 2025 by
reidliu41
3 tasks