[Bug]: After receiving the request, the service froze #19800

@wmj9346464543

Your current environment

Python 3.9.21
pydantic-ai 0.3.0
vllm 0.9.0.1

deploy code:

CUDA_VISIBLE_DEVICES=3 nohup python -m vllm.entrypoints.openai.api_server \
    --model /data/ckpt/Qwen/Qwen2.5-14B-Instruct \
    --tensor-parallel-size 1 \
    --max-model-len 16384 \
    --port 7509 \
    --gpu-memory-utilization 0.95 \
    --disable-log-stats \
    --served-model-name qwen2.5-14b-instruct \
    --max-num-batched-tokens 100000 \
    --max-num-seqs 1500 \
    --enable-prefix-caching \
    --tokenizer-pool-size=32 \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --trust-remote-code
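
To tell a frozen server apart from a merely slow request, it can help to probe the server with a hard timeout while the request is stuck. The sketch below assumes the server exposes `GET /health`, as vLLM's OpenAI-compatible `api_server` normally does; the function name `probe_health` and the status labels are made up for illustration.

```python
import urllib.error
import urllib.request


def probe_health(base_url: str, timeout_s: float = 5.0) -> str:
    """Return 'ok', 'unhealthy', or 'unreachable' for a server at base_url.

    Assumption: the server answers GET /health with HTTP 200 when healthy.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return "ok" if resp.status == 200 else "unhealthy"
    except urllib.error.HTTPError:
        # The server answered, but with an error status (4xx/5xx).
        # Must be caught before URLError, which is its base class.
        return "unhealthy"
    except (urllib.error.URLError, OSError):
        # Connection refused, or no answer within timeout_s.
        return "unreachable"
```

Calling `probe_health("http://127.0.0.1:7509")` while the request from this report is hanging would show whether the HTTP layer still responds; an `unreachable` result after the timeout would suggest the whole process is blocked rather than a single request.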

🐛 Describe the bug

After receiving the request below, the service froze: the request gets no response for a long time, and new requests sent afterwards are never answered either.

server log:

INFO 06-18 16:48:22 [logger.py:42] Received request chatcmpl-b2fba3c87bfb4896bf83656515c8ecca: prompt: '<|im_start|>system\nplease extract the user profile information from the following text. The output should be a JSON object with the keys "name", "dob" (date of birth), and "bio" (a short biography). If any information is not available, leave that key out of the JSON object.\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within XML tags:\n\n{"type": "function", "function": {"name": "final_result", "description": "The final response which ends this conversation", "parameters": {"additionalProperties": false, "properties": {"name": {"type": "string"}, "dob": {"format": "date", "type": "string"}, "bio": {"type": "string"}}, "type": "object"}}}\n\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{"name": , "arguments": }\n</tool_call><|im_end|>\n<|im_start|>user\nMy name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=16126, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=GuidedDecodingParams(json={'type': 'array', 'minItems': 1, 'items': {'type': 'object', 'anyOf': [{'properties': {'name': {'type': 'string', 'enum': ['final_result']}, 'parameters': {'additionalProperties': False, 'properties': {'name': {'type': 'string'}, 'dob': {'format': 'date', 'type': 'string'}, 'bio': {'type': 'string'}}, 'type': 'object'}}, 'required': ['name', 'parameters']}]}}, regex=None, 
choice=None, grammar=None, json_object=None, backend=None, backend_was_auto=False, disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, whitespace_pattern=None, structural_tag=None), extra_args=None), prompt_token_ids: None, prompt_embeds shape: None, lora_request: None, prompt_adapter_request: None.

INFO: 10.70.110.222:56702 - "POST /v1/chat/completions HTTP/1.1" 200 OK

INFO 06-18 16:48:22 [async_llm.py:261] Added request chatcmpl-b2fba3c87bfb4896bf83656515c8ecca.
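
The `guided_decoding` schema in the log (an array with `minItems: 1` whose items are `{name, parameters}` objects) looks like what vLLM builds when a request carries `tool_choice="required"`. That is an inference from the log, not something stated in this report. Under that assumption, the request can be rebuilt without pydantic-ai to check whether the hang reproduces with a raw HTTP call. The helper name `build_repro_payload` and the optional `max_tokens` cap are illustrative additions.

```python
import json
from typing import Optional

# Prompts copied from the server log above.
SYSTEM_PROMPT = (
    "please extract the user profile information from the following text. "
    'The output should be a JSON object with the keys "name", "dob" '
    '(date of birth), and "bio" (a short biography). If any information is '
    "not available, leave that key out of the JSON object."
)
USER_PROMPT = (
    "My name is Ben, I was born on January 28th 1990, "
    "I like the chain the dog and the pyramid."
)


def build_repro_payload(stream: bool = True,
                        max_tokens: Optional[int] = None) -> dict:
    """Build a /v1/chat/completions body mirroring the logged request."""
    final_result_tool = {
        "type": "function",
        "function": {
            "name": "final_result",
            "description": "The final response which ends this conversation",
            "parameters": {
                "additionalProperties": False,
                "properties": {
                    "name": {"type": "string"},
                    "dob": {"format": "date", "type": "string"},
                    "bio": {"type": "string"},
                },
                "type": "object",
            },
        },
    }
    payload = {
        "model": "qwen2.5-14b-instruct",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT},
        ],
        "tools": [final_result_tool],
        # Assumption: inferred from the guided_decoding schema in the log.
        "tool_choice": "required",
        "stream": stream,
    }
    if max_tokens is not None:
        # Optional cap; the logged request allowed max_tokens=16126.
        payload["max_tokens"] = max_tokens
    return payload


if __name__ == "__main__":
    print(json.dumps(build_repro_payload(max_tokens=256), indent=2))
```

The printed JSON can be posted to `http://0.0.0.0:7509/v1/chat/completions`. If the request completes with a small `max_tokens` but hangs without one, the freeze may be a very long guided generation rather than a deadlock; that distinction is a hypothesis worth testing, not a confirmed diagnosis.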

test-code:
`import asyncio
from datetime import date

from pydantic import ValidationError
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from typing_extensions import TypedDict


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


class Chatbot:
    def __init__(self):
        # Point pydantic-ai at the local vLLM OpenAI-compatible server.
        self.model = OpenAIModel(
            model_name="qwen2.5-14b-instruct",
            provider=OpenAIProvider(
                base_url="http://0.0.0.0:7509/v1/",
                api_key="password",
            ),
        )
        self.agent = Agent(
            model=self.model,
            system_prompt='please extract the user profile information from the following text. The output should be a JSON object with the keys "name", "dob" (date of birth), and "bio" (a short biography). If any information is not available, leave that key out of the JSON object.',
            output_type=UserProfile,
        )


async def main():
    chatbot = Chatbot()
    user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'

    # async with chatbot.agent.run_stream(user_prompt=user_input) as response:
    #     async for chunk in response.stream_structured():
    #         print(chunk, end='', flush=True)
    async with chatbot.agent.run_stream(user_input) as result:
        async for message, last in result.stream_structured():
            # print(last, message)
            try:
                # Validate each partial message; only the last one must be complete.
                profile = await result.validate_structured_output(
                    message,
                    allow_partial=not last,
                )
            except ValidationError:
                continue
            print(profile)
            #> {'name': 'Ben'}
            #> {'name': 'Ben'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}


if __name__ == "__main__":
    asyncio.run(main())`
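
Because the stream in the test code never finishes when the server freezes, the client blocks indefinitely. One way to make the hang visible on the client side is an idle timeout around each chunk. The sketch below uses only `asyncio`; `iter_with_timeout` and `stalled_stream` are hypothetical names, and `stalled_stream` is a stand-in for a server that stops sending, not part of pydantic-ai.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


async def iter_with_timeout(source: AsyncIterator[T],
                            idle_timeout_s: float) -> AsyncIterator[T]:
    """Yield items from source; raise asyncio.TimeoutError if the gap
    before the next item exceeds idle_timeout_s."""
    iterator = source.__aiter__()
    while True:
        try:
            item = await asyncio.wait_for(iterator.__anext__(), idle_timeout_s)
        except StopAsyncIteration:
            return  # source finished normally
        yield item


async def stalled_stream() -> AsyncIterator[dict]:
    """Stand-in for a stream that sends one chunk and then stalls."""
    yield {"name": "Ben"}
    await asyncio.sleep(3600)  # simulates a server that stops responding
    yield {"name": "never delivered"}


async def demo():
    received = []
    try:
        async for chunk in iter_with_timeout(stalled_stream(), idle_timeout_s=0.1):
            received.append(chunk)
    except asyncio.TimeoutError:
        return received, True   # stream stalled
    return received, False      # stream completed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the test code above, the same wrapper could be placed around `result.stream_structured()` so that a stalled response surfaces as an exception with a timestamp instead of a silent hang. This only bounds the client; it does not unblock the server.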

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working)
