Issues: vllm-project/vllm
#15625 [bug] DeciLMConfig object has no attribute 'num_key_value_heads_per_layer' For Nemotron (opened Mar 27, 2025 by manitadayon)
#15622 [bug] vllm 0.8.2 have severe quality problem (opened Mar 27, 2025 by aabbccddwasd)
#15619 [bug] Triton JIT Compile Regression from PR 15511 (opened Mar 27, 2025 by Qubitium)
#15618 [usage] How to make the Reasoning of deepseek output normally and the final content structured output (opened Mar 27, 2025 by du0L)
#15614 [installation] ValueError: size must contain 'shortest_edge' and 'longest_edge' keys. (opened Mar 27, 2025 by PSH-1997)
#15613 [bug] running vllm image in k3s helm chart gives ValueError: invalid literal for int() with base 10: 'tcp://10.43.1.39:8000' (opened Mar 27, 2025 by Chennakesavulu5)
#15612 [new model] Please support Babel series model ASAP (opened Mar 27, 2025 by ifyoulovexxz)
#15610 [bug] The content is empty after gemma3 is deployed on the T4 graphics card to send request inference (opened Mar 27, 2025 by sujunze)
#15609 [usage] I don't know how to set the maximum number of simultaneous API requests to be processed when calling an API (opened Mar 27, 2025 by Tu1231)
#15608 [feature request] In multimodal inference, is it possible to cache textual content and only load images each time to optimize inference efficiency (opened Mar 27, 2025 by Eduiskss)
#15607 [bug] Failed to run deepseek v2 lite model with tp = 4 (opened Mar 27, 2025 by jiangjiadi)
#15606 [usage] Will dynamo be on vllm main branch? (opened Mar 27, 2025 by johnnynunez)
#15604 [bug] Failed to run deepseek v2 lite model with tp = 8 when enabling expert parallel (opened Mar 27, 2025 by jiangjiadi)
#15602 [performance] How to install and use vLLM to serve multiple large language models (opened Mar 27, 2025 by moshilangzi)
#15601 [bug] Qwen2-VL-2B quantization model has no improvement in reasoning speed compared to the original model (opened Mar 27, 2025 by Eduiskss)
#15600 [V1] [Performance Benchmark] Benchmark the performance of Speculative Decoding (opened Mar 27, 2025 by LiuXiaoxuanPKU)
#15599 [bug] Gemma3 GPU memory usage is always oom (opened Mar 27, 2025 by lyj157175)
#15598 [bug] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 6 has a total capacity of 44.53 GiB of which 448.00 KiB is free. (opened Mar 27, 2025 by g-zhangpp)
#15596 [bug] Model Reasoning Warning (opened Mar 27, 2025 by Eduiskss)
#15592 [bug] ModuleNotFoundError: No module named 'vllm._C' (opened Mar 27, 2025 by lastlastsummer)
#15590 [bug] DeepSeek R1 with V1+FLASHMLA on L40S (opened Mar 27, 2025 by longqu)
#15577 [bug] guided_json not working correctly with (quantized) mistral-small model (opened Mar 26, 2025 by VMinB12)
#15571 [feature request, good first issue] Output the JSON for the response payload when VLLM_LOGGING_LEVEL=DEBUG (opened Mar 26, 2025 by amfred)
#15569 [bug] Vllm 0.8.2 + Ray 2.44 (Ray serve deployment) fallbacks to V0 Engine (opened Mar 26, 2025 by Qasimk555)