Issues: vllm-project/vllm

- #15333 [Bug]: Can't deserialize object: ObjectRef, DeepSeek R1, H20*16, pp2, tp8, v1 engine (bug; opened Mar 22, 2025 by markluofd)
- #15332 [Bug]: RuntimeError: The size of tensor a (1059) must match the size of tensor b (376) at non-singleton dimension, DeepSeek R1 H20x16 pp2, v1 engine (bug; opened Mar 22, 2025 by markluofd)
- #15330 [Performance]: poor performance in pipeline parallelism when batch size is large (performance; opened Mar 22, 2025 by nannaer)
- #15327 [Bug]: ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?) (bug; opened Mar 22, 2025 by johnturner108)
- #15315 [Feature]: looking into adding a generation algorithm (feature request; opened Mar 22, 2025 by tiger241)
- #15314 [Usage]: Async engine batch request (usage; opened Mar 21, 2025 by tiger241)
- #15313 [Bug]: vLLM declares itself healthy before it can serve requests (bug; opened Mar 21, 2025 by kiratp)
- #15312 [Bug]: Crashing on unsupported Sampling params (bug; opened Mar 21, 2025 by kiratp)
- #15304 [Usage]: Generating multiple completions with Qwen QwQ 32B (usage; opened Mar 21, 2025 by madhavaggar)
- #15300 [Bug]: OPEA/Mistral-Small-3.1-24B-Instruct-2503-int4-AutoRound-awq-sym error (bug; opened Mar 21, 2025 by moshilangzi)
- #15295 [Bug]: Worker VllmWorkerProcess pid 000000 died, exit code: -15 (bug; opened Mar 21, 2025 by a7mad911)
- #15294 [Bug]: Critical Memory Leak in vLLM V1 Engine: 200+ GB RAM Usage from Image Inference (bug; opened Mar 21, 2025 by oyerli)
- #15291 [Bug]: Qwen2.5 VL online service cannot accept video and image input simultaneously (bug; opened Mar 21, 2025 by Thyme-git)
- #15287 [Feature]: Dynamic Memory Release for GPU after idle time (feature request; opened Mar 21, 2025 by kmamine)
- #15284 [Usage]: why no ray command in my docker image (usage; opened Mar 21, 2025 by yanzhichao)
- #15277 [Bug]: GGUF model with architecture deepseek2 is not supported yet while vllm version is 0.8.1 (bug; opened Mar 21, 2025 by Zeppelinpp)
- #15275 [Bug]: int8 2:4 sparse takes more time than fp8 (bug; opened Mar 21, 2025 by zhink)
- #15274 [Bug]: streaming is lost in arguments in tool_calls (bug; opened Mar 21, 2025 by xiaodizi)
- #15272 [Bug]: Inconsistent Output Based on Presence of chat_template Parameter (bug; opened Mar 21, 2025 by SmartManoj)
- #15266 [Feature]: Can vLLM support CPU inference with a Ray cluster? (feature request; opened Mar 21, 2025 by MaoJianwei)
- #15264 [Bug]: qwen2.5vl cannot use fp8 quantization (bug; opened Mar 21, 2025 by lessmore991)
- #15263 [Bug]: oracle for device checking raises an exception unexpectedly (bug; opened Mar 21, 2025 by Selkh)
- #15258 [Bug]: OOM with QwQ-32B (bug; opened Mar 21, 2025 by vmajor)