I'm seeing empty prompts when benchmarking deepseek-ai/DeepSeek-V3 with a prompt length of 1. I could not reproduce this with either Qwen/Qwen3-30B-A3B-FP8 or deepseek-ai/DeepSeek-V2-Lite.
This is the error reported by guidellm:
25-10-13 21:00:46|ERROR |guidellm.backend.openai:text_completions:307 - OpenAIHTTPBackend request with headers: {'Content-Type': 'application/json'} and params: {} and payload: {'prompt': '', 'model': 'deepseek-ai/DeepSeek-V3', 'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 100, 'stop': None, 'ignore_eos': True} failed: Client error '400 Bad Request' for url 'http://127.0.0.1:8192/v1/completions'
The vLLM server then logs:
(APIServer pid=3766874) INFO: 127.0.0.1:45320 - "POST /v1/completions HTTP/1.1" 400 Bad Request
To reproduce:
vllm serve deepseek-ai/DeepSeek-V3 --hf-overrides.num_hidden_layers=4 --load-format=dummy --port 8192 -tp 2
guidellm benchmark \
--target http://127.0.0.1:8192 \
--rate-type constant \
--rate 2048 \
--max-requests 10000 \
--data '{"prompt_tokens": 1, "output_tokens": 100}'
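One possible explanation (an assumption on my part, not confirmed in the guidellm source): with `prompt_tokens: 1`, the synthetic prompt is a single token, and if that token decodes to an empty string for the DeepSeek-V3 tokenizer (as special/control tokens often do), the request payload would carry `'prompt': ''` and the server would reject it with a 400. A toy sketch of that failure mode, using a hypothetical stand-in vocabulary rather than the real tokenizer:

```python
# Toy illustration (hypothetical vocabulary, not the real DeepSeek-V3
# tokenizer): a single sampled token whose decoded text is empty yields
# an empty prompt string, matching the 'prompt': '' in the failing payload.

# Stand-in id -> text mapping; real tokenizers decode some special
# tokens to the empty string.
TOY_VOCAB = {0: "", 1: "Hello", 2: " world"}

def decode(token_ids):
    """Decode token ids to text by concatenating their surface forms."""
    return "".join(TOY_VOCAB[t] for t in token_ids)

# One "prompt token" that happens to decode to nothing:
prompt = decode([0])
payload = {"prompt": prompt, "max_tokens": 100}
print(repr(payload["prompt"]))  # -> ''
```

If this is the cause, a quick check would be to decode whatever single token the prompt synthesizer picks for DeepSeek-V3 and see whether it comes back empty, which would also explain why models with different vocabularies (Qwen3, DeepSeek-V2-Lite) don't hit it.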