Description
Your current environment
The output of `python collect_env.py`
Your output of `python collect_env.py` here
🐛 Describe the bug
Device: 8 × H100
python3 -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 23333 --max-model-len 60000 --trust-remote-code --tensor-parallel-size 8 --quantization moe_wna16 --gpu-memory-utilization 0.92 --kv-cache-dtype fp8_e5m2 --calculate-kv-scales --served-model-name deepseek-reasoner --model cognitivecomputations/DeepSeek-R1-AWQ
# The prompt "你是谁" means "Who are you?"
curl http://localhost:23333/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-reasoner",
"messages": [
{"role": "user", "content": "你是谁"}
],
"stream": true,
"temperature": 1.2
}'
Response (every streamed delta shown has an empty `content`):
data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593","object":"chat.completion.chunk","created":1739440234,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593","object":"chat.completion.chunk","created":1739440234,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593","object":"chat.completion.chunk","created":1739440234,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593","object":"chat.completion.chunk","created":1739440234,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593","object":"chat.completion.chunk","created":1739440234,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":null}]}
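To check a longer stream without reading it by eye, the chunks can be parsed and their `delta.content` fields concatenated. The following is a minimal sketch, not part of vLLM: the helper name `collect_stream_content` is hypothetical, and it assumes the server emits OpenAI-style `data: {...}` lines that may end with a `data: [DONE]` sentinel. The sample line is copied from the output above.

```python
import json


def collect_stream_content(sse_lines):
    """Concatenate delta.content from OpenAI-style chat.completion.chunk lines.

    Hypothetical helper for illustration; it only parses text and makes no
    network calls.
    """
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not an SSE data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel (assumed to follow the OpenAI convention)
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# First chunk from the output above; its content is the empty string.
sample = [
    'data: {"id":"chatcmpl-c7e88282efa547cfba27b429df7df593",'
    '"object":"chat.completion.chunk","created":1739440234,'
    '"model":"deepseek-reasoner","choices":[{"index":0,'
    '"delta":{"role":"assistant","content":""},'
    '"logprobs":null,"finish_reason":null}]}',
]

text = collect_stream_content(sample)
print("empty response" if text == "" else text)
```

Saving the curl output to a file and passing its lines to the helper would show whether any chunk in the full stream carries text.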
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.