Sending two "load" requests to server makes it load twice #7018

Open
ShuaiShao93 opened this issue Mar 21, 2024 · 2 comments
Labels: question (Further information is requested)

@ShuaiShao93

Description
When I use two clients to send `/v2/repository/models/MODEL/load` requests to the same server at the same time, the model is loaded twice.

Triton Information
What version of Triton are you using?
23.11

Are you using the Triton container or did you build it yourself?
Container nvcr.io/nvidia/tritonserver:23.11-py3

To Reproduce
Start a server in explicit model-control mode (`--model-control-mode=explicit`) without loading any model.

Open two terminals and run `curl -X POST "http://localhost:8000/v2/repository/models/MODEL/load" -d "{}"` in both at the same time. You will see logs like:

```
successfully loaded MODEL
loading: MODEL
successfully loaded MODEL
successfully unloaded MODEL
```
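
For convenience, here is a minimal Python sketch of the same repro, firing both load requests concurrently from a single script instead of two terminals (`MODEL` is a placeholder model name, and the server is assumed to be listening on localhost:8000, as in the curl command):

```python
import concurrent.futures

import requests

MODEL = "MODEL"  # placeholder: replace with a model in your repository
URL = f"http://localhost:8000/v2/repository/models/{MODEL}/load"

def load_model():
    # Same call as the curl command above: POST with an empty JSON body.
    return requests.post(URL, json={})

# Fire both load requests at (roughly) the same time.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(load_model) for _ in range(2)]
    for future in concurrent.futures.as_completed(futures):
        response = future.result()
        print(response.status_code, response.text)
```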

Expected behavior
The model should be loaded only once. Also, the `successfully unloaded MODEL` log line should appear before `successfully loaded MODEL`, not after it.

@indrajit96
Contributor

Hi @ShuaiShao93, thanks a lot for reaching out.
Can you provide the following details:

  1. What type of model/backend?
  2. Can you reproduce this behavior with other types of models/backends? Or is it specific to this one?
  3. I am not sure how you are getting the unloaded log. Are you making an unload request?

I am unable to reproduce this.

When I make simultaneous load requests for a model, it just gets loaded once.

indrajit96 added the question (Further information is requested) label on Mar 25, 2024
@ShuaiShao93
Author

> Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details:
>
> 1. What type of model/backend?

An ensemble pipeline with the Python and ONNX backends.

> 2. Can you reproduce this behavior with other types of models/backends? Or is it specific to this one?

Sorry, I didn't get a chance to test more.

> 3. I am not sure how you are getting the unloaded log. Are you making an unload request?

No, I just made load requests simultaneously from two clients, and I saw the unloaded logs.
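
For reference, this is roughly what the two-client setup looked like; a minimal sketch using the Python `tritonclient` package (assuming `tritonclient[http]` is installed, the server is on localhost:8000, and `MODEL` is a placeholder for the actual model name):

```python
import concurrent.futures

import tritonclient.http as httpclient

MODEL = "MODEL"  # placeholder: replace with the actual model name

def load_from_fresh_client():
    # A separate client per thread, mimicking two independent client processes.
    client = httpclient.InferenceServerClient(url="localhost:8000")
    client.load_model(MODEL)
    return client.is_model_ready(MODEL)

# Submit both load requests before waiting on either result,
# so the two requests are in flight simultaneously.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(load_from_fresh_client) for _ in range(2)]
    ready = [f.result() for f in futures]

print(ready)  # both should report True; the duplicate load shows up in the server log
```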

> I am unable to reproduce this.
>
> When I make simultaneous load requests for a model, it just gets loaded once.
