vllm-project / vllm Public

Pull requests: vllm-project/vllm

805 Open 10,086 Closed

Pull requests list

[Misc] Make MM embedding merge interface explicit in model runner v1

#21147 opened Jul 17, 2025 by ywang96

4 tasks

[Core] disable gc during cuda graph capture

Labels: codex, startup-ux, v1

#21146 opened Jul 17, 2025 by mgoin

Enable multi-image support benchmarking for serving

Labels: performance

#21145 opened Jul 17, 2025 by leopck

[Misc] allow pulling vllm in Ray runtime environment

#21143 opened Jul 17, 2025 by eric-higgins-ai

Add request preprocess counter in vllm v1

#21139 opened Jul 17, 2025 by vladmihailescu • Draft

[Attention] Optimize FlashInfer MetadataBuilder Build call

Labels: needs-rebase, rocm, speculative-decoding, v1

#21137 opened Jul 17, 2025 by LucasWilkinson • Draft

4 tasks

[Perf] Using mul instead of div for int8 quant

#21136 opened Jul 17, 2025 by yewentao256

[Misc] change default request logging behavior to off

Labels: codex

#21135 opened Jul 17, 2025 by simon-mo

[Bugfix] Fix the tensor non-contiguous issue for Flashinfer TRT-LLM backend attention kernel

Labels: bug, ready

#21133 opened Jul 17, 2025 by elvischenv

3 of 4 tasks

[V0 deprecation] Remove V0 HPU backend

Labels: ci/build, ready

#21131 opened Jul 17, 2025 by WoosukKwon

Convert tests to ruff-format

Labels: deepseek, llama, multi-modality, performance, qwen, rocm, speculative-decoding, structured-output, tool-calling, v1

#21129 opened Jul 17, 2025 by hmellor

[Core] Set pooling params based on task and model

Labels: frontend, ready, tpu

#21128 opened Jul 17, 2025 by DarkLight1337

2 of 4 tasks

docker: docker-aware precompiled wheel support

Labels: ci/build

#21127 opened Jul 17, 2025 by dougbtv

4 tasks done

[WIP] Use FlashInfer RoPE

#21126 opened Jul 17, 2025 by mgoin

4 tasks

[Refactor] Remove Unused Naive Moe Kernels

Labels: performance

#21125 opened Jul 17, 2025 by yewentao256

[UPDATED] - Large Block_size solution v1

#21123 opened Jul 17, 2025 by nadathurv

[Bugfix] Allocate less memory in non-batched CUTLASS MoE

Labels: bug, ready

#21121 opened Jul 17, 2025 by ElizaWszola

security policy: take 1

Labels: documentation

#21119 opened Jul 17, 2025 by sidhpurwala-huzaifa

feat: add fused MLA QKV + strided layernorm

Labels: deepseek

#21116 opened Jul 17, 2025 by mickaelseznec

3 of 4 tasks

[benchmark] Sending request strictly follows the random intervals

Labels: performance, ready

#21108 opened Jul 17, 2025 by Jialin

3 of 4 tasks

Add fused_moe_gate kernel and integrate to DeepSeek MoE layer

Labels: ci/build, deepseek, performance

#21107 opened Jul 17, 2025 by zhangxy9999 • Draft

[V1][Metrics][Frontend] Add support for custom stat loggers via CLI --stat-loggers

Labels: frontend

#21105 opened Jul 17, 2025 by ptovam

[Model] Re-add the implicit conversion feature for as_seq_cls_model

Labels: llama, new-model, qwen

#21103 opened Jul 17, 2025 by noooop

3 of 4 tasks

[misc][eplb] add valida ep or tp or dp

#21102 opened Jul 17, 2025 by lengrongfu

1 of 4 tasks

[Quantization] Enable BNB support for more MoE models

Labels: ci/build

#21100 opened Jul 17, 2025 by jeejeelee • Draft

4 tasks