Why is ChatCompletionService#createStreaming blocking the thread for a significant amount of time #646

@DeidaraMC

When streaming a chat completion with the client like so:
try (var stream = client.chat().completions().createStreaming(params))
the thread blocks on this line for 5-15 seconds, after which the response arrives almost instantly.
It feels as though the data is buffered until the entire response is available, and only then does createStreaming return.
The model is ChatModel.GPT_5_NANO.
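
To narrow down where the 5-15 seconds go, it may help to time three points separately: when the stream is opened, when the first item arrives, and when the last item arrives. The following is only a sketch using the standard library. `StreamTiming`, `Result`, and `measure` are hypothetical names and not part of any SDK, and the slow source is simulated with `Thread.sleep` rather than a real API call.

```java
import java.util.Iterator;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical diagnostic helper (not part of any SDK).
final class StreamTiming {

    // Elapsed milliseconds, all measured from just before the stream is opened.
    static final class Result {
        final long openMs;       // time until the opener returned a stream
        final long firstItemMs;  // time until the first item was received
        final long totalMs;      // time until the stream was exhausted
        final int items;         // number of items received

        Result(long openMs, long firstItemMs, long totalMs, int items) {
            this.openMs = openMs;
            this.firstItemMs = firstItemMs;
            this.totalMs = totalMs;
            this.items = items;
        }

        @Override
        public String toString() {
            return "open=" + openMs + "ms first=" + firstItemMs
                    + "ms total=" + totalMs + "ms items=" + items;
        }
    }

    static <T> Result measure(Supplier<Stream<T>> opener) {
        long start = System.nanoTime();
        try (Stream<T> stream = opener.get()) {
            long openNanos = System.nanoTime() - start;
            long firstNanos = -1; // elapsed time, so -1 is a safe "not yet seen" marker
            int count = 0;
            Iterator<T> it = stream.iterator();
            while (it.hasNext()) {
                it.next();
                if (count == 0) {
                    firstNanos = System.nanoTime() - start;
                }
                count++;
            }
            long totalNanos = System.nanoTime() - start;
            if (firstNanos < 0) {
                firstNanos = totalNanos; // empty stream: no first item was seen
            }
            return new Result(toMs(openNanos), toMs(firstNanos), toMs(totalNanos), count);
        }
    }

    private static long toMs(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    // Simulates the delay before each item becomes available.
    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Simulated source: opening is immediate, each item takes about 50 ms.
        Result r = measure(() -> Stream.of("a", "b", "c").peek(x -> sleep(50)));
        System.out.println(r);
    }
}
```

If `openMs` accounts for nearly all of `totalMs`, the wait happens before the stream is handed back, which matches the buffering impression described above. If `openMs` is small and `firstItemMs` is large, the wait is for the first item instead. To apply this to the call in this issue, the supplier would have to open the stream from `createStreaming(params)`; that adapter is not shown here.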
