Description
When I use two clients to send /v2/repository/models/MODEL/load requests to the same server at the same time, the model is loaded twice.

Triton Information
What version of Triton are you using?
23.11

Are you using the Triton container or did you build it yourself?
Container nvcr.io/nvidia/tritonserver:23.11-py3

To Reproduce
Start a server in explicit mode, with no model loaded.
Open two terminals and run
curl -X POST "http://localhost:8000/v2/repository/models/MODEL/load" -d "{}"
in both at the same time. You can see logs like
successfully loaded MODEL
loading: MODEL
successfully loaded MODEL
successfully unloaded MODEL

Expected behavior
The model should only be loaded once, and the log successfully unloaded MODEL should appear before successfully loaded MODEL.
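The two-terminal curl race above can also be triggered from a single script. A minimal Python sketch, assuming a server at http://localhost:8000 and a model named MODEL (both placeholders to adjust for your deployment), that fires concurrent load requests:

```python
# Reproduction sketch: fire two concurrent load requests at a Triton server
# running in explicit model-control mode. The base URL and model name are
# assumptions -- substitute your own deployment's values.
import concurrent.futures
import json
import urllib.request


def load_url(base_url: str, model: str) -> str:
    """Build the repository load endpoint for a model."""
    return f"{base_url}/v2/repository/models/{model}/load"


def load_model(base_url: str, model: str) -> int:
    """POST an empty JSON body to the load endpoint; return the HTTP status."""
    req = urllib.request.Request(
        load_url(base_url, model),
        data=json.dumps({}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def concurrent_loads(base_url: str, model: str, n: int = 2) -> list:
    """Issue n load requests at roughly the same time and collect statuses."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(load_model, base_url, model) for _ in range(n)]
        return [f.result() for f in futures]


# Example (requires a running server):
# concurrent_loads("http://localhost:8000", "MODEL")
```

With the bug present, watching the server log while this runs should show the duplicated "successfully loaded" lines described above.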
Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details?
I am unable to reproduce this. When I try to load a model simultaneously, it just gets loaded once.
> Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details? What type of model/backend?

Ensemble pipeline with Python & ONNX backends

> Can you reproduce this behavior with other types of models/backends? Or is it specific to this one?

Sorry, I didn't get a chance to test more.

> Not sure how you are getting the unloaded log. Are you making an unload request?

No, I just made load requests simultaneously from two clients, and I saw the unloaded logs.