This repository was archived by the owner on Aug 7, 2025. It is now read-only.

Issue when sending parallel requests #3361

@lschaupp

Description

Hello,

I am getting the following error when I send multiple requests in parallel to the inference endpoint:

ERROR: 503
{
  "code": 503,
  "type": "ServiceUnavailableException",
  "message": "Model \"restorer\" has no worker to serve inference request. Please use scale workers API to add workers. If this is a sequence inference, please check if it is closed, or expired; or exceeds maxSequenceJobQueueSize"
}

I have two separate processes that can access the inference API.
Any ideas?
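For reference, the error message points at the scale workers API. Assuming this is a TorchServe deployment (the ServiceUnavailableException format matches TorchServe) listening on the default management port 8081, a minimal sketch of checking and scaling the workers for the restorer model could look like the following; the host, port, and worker count are assumptions to adapt to your setup:

```python
# Sketch: query and scale workers for the "restorer" model via the
# TorchServe management API. Assumes the default management address
# http://localhost:8081; adjust for your deployment.
import requests

MANAGEMENT_URL = "http://localhost:8081"  # assumption: default management port

# Check how many workers are currently assigned to the model.
status = requests.get(f"{MANAGEMENT_URL}/models/restorer")
print(status.json())

# Request at least 2 workers so two parallel requests can be served.
resp = requests.put(
    f"{MANAGEMENT_URL}/models/restorer",
    params={"min_worker": 2, "synchronous": "true"},
)
print(resp.status_code, resp.text)
```

With at least as many workers as concurrent callers, both processes should be able to obtain a worker instead of hitting the 503.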
