Issues: vllm-project/vllm
[Bug]: Non-AVX512 CPUs cannot use tensor parallelism (bug) #16800, opened Apr 17, 2025 by zhinianqin
[Bug]: Trying to load a llama4 model trained from llama4 scout, but it is a text-only fine-tune with the following config (bug) #16798, opened Apr 17, 2025 by yeshsurya
[Bug]: Qwen/Qwen2.5-VL-3B-Instruct doesn't identify tools (bug) #16797, opened Apr 17, 2025 by PedroMiolaSilva
[Bug]: Unable to deploy Qwen2.5-VL-3B-Instruct after updating vLLM to the latest version (bug) #16791, opened Apr 17, 2025 by Zhiyuan-Fan
[Feature]: Absolute gpu_memory_utilization (feature request) #16786, opened Apr 17, 2025 by Podidiving
[Installation]: 'pip install -e .' on WSL2 got OSError: [Errno 5] Input/output error: 'hl-smi' (installation) #16785, opened Apr 17, 2025 by francis1992
[Bug]: InternVL3-9B call is hanging (bug) #16782, opened Apr 17, 2025 by xxzhang0927
[Bug]: vLLM 0.8.4 started with Ray, and Ray's dashboard fails to start (bug, ray) #16779, opened Apr 17, 2025 by ying2025
[Bug]: Couldn't deploy c4ai-command-a-03-2025 with the vLLM Docker image (bug) #16777, opened Apr 17, 2025 by mru4913
[Usage]: [offline inference] How to get a streamed response with tools, and how to implement the "parallel_tool_calls" parameter (usage) #16775, opened Apr 17, 2025 by konglykly
[Bug]: Invalid Mistral ChatCompletionRequest Body Exception (bug) #16774, opened Apr 17, 2025 by JasmondL
[Bug]: vLLM stalls at "vLLM is using nccl==2.21.5" (bug) #16772, opened Apr 17, 2025 by WanianXO
[Usage]: How to configure the server parameters for THUDM/GLM-4-32B-0414 to support function calling using vllm-0.8.4? (usage) #16771, opened Apr 17, 2025 by jifa513
[Bug]: vllm-v0.7.3 V0 engine with TP=16 serving DeepSeek-R1 crashes during inference (bug) #16766, opened Apr 17, 2025 by handsome-chips
[Feature]: vLLM does not support inference for the DoRA-fine-tuned R1-distill-qwen model (feature request) #16764, opened Apr 17, 2025 by HelloWorldMan-git
[Bug]: qwen2.5-vl inference truncated (bug) #16763, opened Apr 17, 2025 by vivian-chen010
[Usage]: vLLM fails with NCCL invalid usage error when serving a model on multiple GPUs (usage) #16761, opened Apr 17, 2025 by whfeLingYu
[Usage]: Wrong context length for Qwen2.5-7B-Instruct? (usage) #16757, opened Apr 17, 2025 by tjoymeed
[Bug]: InternVL3-78B OOM on 4x A100 40G in 0.8.4 (bug) #16749, opened Apr 17, 2025 by hanggun
[Feature]: AMD Ryzen AI NPU support (feature request) #16742, opened Apr 16, 2025 by InspiringCode
[Bug]: GuidedDecodingParams choice: request-level structured output backend must match engine-level backend (bug) #16738, opened Apr 16, 2025 by nrv
[Feature]: Return graceful inference text-input validation errors as part of the output (without throwing an exception), to enable skipping or handling bad examples after the good ones are processed (feature request) #16732, opened Apr 16, 2025 by vadimkantorov
[Bug]: With --cpu-offload-gb, deepseek-moe-16b-chat gives different responses, even when the temperature is zero (bug) #16731, opened Apr 16, 2025 by YenFuLin
[Usage]: vLLM > 0.8 also hits "No platform detected, vLLM is running on UnspecifiedPlatform" (usage) #16724, opened Apr 16, 2025 by rainays
[Bug]: Remove fallback to outlines for int/number range and pattern constraints in guided_json (bug) #16723, opened Apr 16, 2025 by csy1204