Issues: vllm-project/vllm

- #15333 [Bug]: Can't deserialize object: ObjectRef, DeepSeek R1, H20*16, pp2, tp8, v1 engine (bug; opened Mar 22, 2025 by markluofd)
- #15332 [Bug]: RuntimeError: The size of tensor a (1059) must match the size of tensor b (376) at non-singleton dimension, DeepSeek R1 H20x16 pp2, v1 engine (bug; opened Mar 22, 2025 by markluofd)
- #15330 [Performance]: poor performance in pipeline parallelism when batch size is large (performance; opened Mar 22, 2025 by nannaer)
- #15327 [Bug]: ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?) (bug; opened Mar 22, 2025 by johnturner108)
- #15315 [Feature]: looking into adding a generation algorithm (feature request; opened Mar 22, 2025 by tiger241)
- #15314 [Usage]: Async engine batch request (usage; opened Mar 21, 2025 by tiger241)
- #15313 [Bug]: vLLM declares itself healthy before it can serve requests (bug; opened Mar 21, 2025 by kiratp)
- #15312 [Bug]: Crashing on unsupported Sampling params (bug; opened Mar 21, 2025 by kiratp)
- #15304 [Usage]: Generating multiple completions with Qwen QwQ 32B (usage; opened Mar 21, 2025 by madhavaggar)
- #15300 [Bug]: OPEA/Mistral-Small-3.1-24B-Instruct-2503-int4-AutoRound-awq-sym error (bug; opened Mar 21, 2025 by moshilangzi)
- #15295 [Bug]: Worker VllmWorkerProcess pid 000000 died, exit code: -15 (bug; opened Mar 21, 2025 by a7mad911)
- #15294 [Bug]: Critical Memory Leak in vLLM V1 Engine: 200+ GB RAM Usage from Image Inference (bug; opened Mar 21, 2025 by oyerli)
- #15291 [Bug]: Qwen2.5 VL online service cannot accept video and image input simultaneously (bug; opened Mar 21, 2025 by Thyme-git)
- #15287 [Feature]: Dynamic Memory Release for GPU after idle time (feature request; opened Mar 21, 2025 by kmamine)
- #15284 [Usage]: why no ray command in my docker image (usage; opened Mar 21, 2025 by yanzhichao)
- #15277 [Bug]: GGUF model with architecture deepseek2 is not supported yet while vllm version is 0.8.1 (bug; opened Mar 21, 2025 by Zeppelinpp)
- #15275 [Bug]: int8 2:4 sparse takes more time than fp8 (bug; opened Mar 21, 2025 by zhink)
- #15274 [Bug]: streaming is lost in arguments in tool_calls (bug; opened Mar 21, 2025 by xiaodizi)
- #15272 [Bug]: Inconsistent Output Based on Presence of chat_template Parameter (bug; opened Mar 21, 2025 by SmartManoj)
- #15266 [Feature]: Can vLLM support CPU inference with a Ray cluster? (feature request; opened Mar 21, 2025 by MaoJianwei)
- #15264 [Bug]: qwen2.5vl cannot use fp8 quantization (bug; opened Mar 21, 2025 by lessmore991)
- #15263 [Bug]: oracle for device checking raises an exception unexpectedly (bug; opened Mar 21, 2025 by Selkh)
- #15258 [Bug]: OOM with QwQ-32B (bug; opened Mar 21, 2025 by vmajor)