Description
Your current environment
vllm 0.7.2
torch 2.4
cuda 12.1
🐛 Describe the bug
When I call the OpenAI-compatible server's chat completions API through base_url="http://localhost:8000/v1", the first request returns 200 OK, but every subsequent request returns 400 Bad Request. Why?
The log is as follows:
INFO: 127.0.0.1:59042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 02-15 22:09:59 engine.py:275] Added request chatcmpl-803293759b1e415caefd7845b3fa8352.
INFO 02-15 22:10:03 metrics.py:455] Avg prompt throughput: 33.4 tokens/s, Avg generation throughput: 37.8 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.1%, CPU KV cache usage: 0.0%.
INFO 02-15 22:10:08 metrics.py:455] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 43.2 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.1%, CPU KV cache usage: 0.0%.
INFO: 127.0.0.1:59042 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request
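A minimal stdlib-only sketch to reproduce and inspect the pattern above. The model name and prompt are placeholders (not from the issue); the key point is that an HTTP 400 from vLLM carries a JSON error body explaining the rejection, so catching it instead of letting it raise makes the reason visible:

```python
import json
import urllib.request
import urllib.error

def build_payload(model, messages):
    """Serialize a /v1/chat/completions request body."""
    return json.dumps({"model": model, "messages": messages}).encode("utf-8")

def chat_turn(base_url, model, messages):
    """Send one chat request; return (status_code, response_text).

    A 400 is caught and returned rather than raised, so vLLM's JSON
    error body (which states why the request was rejected) is visible.
    """
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=build_payload(model, messages),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        return exc.code, exc.read().decode("utf-8")

# Example (against a live server):
#   status, body = chat_turn("http://localhost:8000/v1", "your-model-name",
#                            [{"role": "user", "content": "Hello"}])
#   print(status, body)
```

Calling `chat_turn` repeatedly with the accumulated message history and printing the body of the 400 response should show the server's stated reason for rejecting the follow-up requests.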