Issues: vllm-project/vllm
#15625 [bug] DeciLMConfig object has no attribute 'num_key_value_heads_per_layer' For Nemotron (opened Mar 27, 2025 by manitadayon)
#15622 [bug] vllm 0.8.2 have severe quality problem (opened Mar 27, 2025 by aabbccddwasd)
#15619 [bug] Triton JIT Compile Regression from PR 15511 (opened Mar 27, 2025 by Qubitium)
#15618 [usage] How to make the Reasoning of deepseek output normally and the final content structured output (opened Mar 27, 2025 by du0L)
#15614 [installation] ValueError: size must contain 'shortest_edge' and 'longest_edge' keys. (opened Mar 27, 2025 by PSH-1997)
#15613 [bug] running vllm image in k3s helm chart gives ValueError: invalid literal for int() with base 10: 'tcp://10.43.1.39:8000' (opened Mar 27, 2025 by Chennakesavulu5)
#15612 [new model] Please support Babel series model ASAP (opened Mar 27, 2025 by ifyoulovexxz)
#15610 [bug] The content is empty after gemma3 is deployed on the T4 graphics card to send request inference (opened Mar 27, 2025 by sujunze)
#15609 [usage] I don't know how to set the maximum number of simultaneous API requests to be processed when calling an API (opened Mar 27, 2025 by Tu1231)
#15608 [feature request] In multimodal inference, is it possible to cache textual content and only load images each time to optimize inference efficiency (opened Mar 27, 2025 by Eduiskss)
#15607 [bug] Failed to run deepseek v2 lite model with tp = 4 (opened Mar 27, 2025 by jiangjiadi)
#15606 [usage] Will dynamo be on vllm main branch? (opened Mar 27, 2025 by johnnynunez)
#15604 [bug] Failed to run deepseek v2 lite model with tp = 8 when enabling expert parallel (opened Mar 27, 2025 by jiangjiadi)
#15602 [performance] How to install and use vLLM to serve multiple large language models (opened Mar 27, 2025 by moshilangzi)
#15601 [bug] Qwen2-VL-2B quantization model has no improvement in reasoning speed compared to the original model (opened Mar 27, 2025 by Eduiskss)
#15600 [V1] [Performance Benchmark] Benchmark the performance of Speculative Decoding (opened Mar 27, 2025 by LiuXiaoxuanPKU)
#15599 [bug] Gemma3 GPU memory usage is always oom (opened Mar 27, 2025 by lyj157175)
#15598 [bug] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 6 has a total capacity of 44.53 GiB of which 448.00 KiB is free. (opened Mar 27, 2025 by g-zhangpp)
#15596 [bug] Model Reasoning Warning (opened Mar 27, 2025 by Eduiskss)
#15592 [bug] ModuleNotFoundError: No module named 'vllm._C' (opened Mar 27, 2025 by lastlastsummer)
#15590 [bug] DeepSeek R1 with V1+FLASHMLA on L40S (opened Mar 27, 2025 by longqu)
#15577 [bug] guided_json not working correctly with (quantized) mistral-small model (opened Mar 26, 2025 by VMinB12)
#15571 [feature request, good first issue] Output the JSON for the response payload when VLLM_LOGGING_LEVEL=DEBUG (opened Mar 26, 2025 by amfred)
#15569 [bug] Vllm 0.8.2 + Ray 2.44 (Ray serve deployment) fallbacks to V0 Engine (opened Mar 26, 2025 by Qasimk555)