Thread safety question about python grpcclient and server #2616
I use
Yes, it does contain the fix from that PR. Can you share a minimal example of your client so we can reproduce the issue?
@CoderHam I use the following code to extract the 512-d face feature:
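The original snippet was not captured in this thread. A minimal sketch of what such a call might look like with the `tritonclient` gRPC API is below; the model name (`face_ensemble`) and tensor names (`input`, `feature`) are assumptions, not taken from the issue:

```python
# Hypothetical sketch only: model and tensor names are invented placeholders.
import numpy as np

try:
    import tritonclient.grpc as grpcclient  # pip install tritonclient[grpc]
except ImportError:                          # keep the sketch importable
    grpcclient = None

def extract_feature(client, image: np.ndarray) -> np.ndarray:
    """Send one image through the model and return its 512-d feature."""
    inp = grpcclient.InferInput("input", list(image.shape), "FP32")
    inp.set_data_from_numpy(image.astype(np.float32))
    out = grpcclient.InferRequestedOutput("feature")
    result = client.infer("face_ensemble", inputs=[inp], outputs=[out])
    return result.as_numpy("feature")  # shape (batch, 512)
```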
Then I use a thread pool to simulate a high-concurrency situation:
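That snippet is also missing; the driver pattern can be sketched with the standard library. The `extract` function below is a deterministic stand-in for the real `client.infer()` call, not part of the issue's code. Note that each task builds its own input, so no request state is shared between threads:

```python
# Sketch of a thread-pool driver. `extract` stands in for Triton inference.
from concurrent.futures import ThreadPoolExecutor
import hashlib

def extract(image_bytes: bytes) -> str:
    # stand-in for the 512-d feature: a hash that is unique per input
    return hashlib.sha256(image_bytes).hexdigest()

images = [bytes([i]) * 16 for i in range(100)]  # 100 distinct fake images

with ThreadPoolExecutor(max_workers=16) as pool:
    features = list(pool.map(extract, images))

# Distinct inputs must produce distinct features; duplicates here would
# indicate shared mutable state between the submitted tasks.
assert len(set(features)) == len(images)
```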
When I compare each face feature against all the others, I get many repeated 512-d vectors. I think this is caused by thread-unsafety somewhere in Triton, because the results are correct when running single-threaded.
I replaced the thread pool with a process pool and got results like:
Does that mean the duplicated return values come from the server rather than the client? @tanmayv25
This is my config.pbtxt; it uses the DALI, TensorRT, and ONNX backends for pre-processing, the network, and post-processing respectively.
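The actual config was not captured. For reference, a Triton ensemble wiring a DALI pre-processor into a TensorRT network and an ONNX post-processor typically looks like the sketch below; every model and tensor name here is an invented placeholder:

```protobuf
# Hypothetical ensemble config.pbtxt -- all names are placeholders.
name: "face_ensemble"
platform: "ensemble"
max_batch_size: 8
input [ { name: "raw_image", data_type: TYPE_UINT8, dims: [ -1 ] } ]
output [ { name: "feature", data_type: TYPE_FP32, dims: [ 512 ] } ]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess_dali"
      model_version: -1
      input_map { key: "dali_in" value: "raw_image" }
      output_map { key: "dali_out" value: "preprocessed" }
    },
    {
      model_name: "face_trt"
      model_version: -1
      input_map { key: "net_in" value: "preprocessed" }
      output_map { key: "net_out" value: "embedding" }
    },
    {
      model_name: "postprocess_onnx"
      model_version: -1
      input_map { key: "post_in" value: "embedding" }
      output_map { key: "post_out" value: "feature" }
    }
  ]
}
```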
Hello @LightToYang, you mentioned that you sometimes get a segmentation fault. Does it happen on the client side or the server side? Also, could you try creating a separate Triton client instance for each process/thread, to make sure that the thread-safety of the gRPC client isn't the problem here?
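The one-client-per-thread suggestion can be sketched with `threading.local`. Here `make_client` is a stand-in for constructing a real `grpcclient.InferenceServerClient`:

```python
# Sketch: give each worker thread its own client instance via threading.local.
import threading
from concurrent.futures import ThreadPoolExecutor

_tls = threading.local()

def make_client():
    # in real code: return grpcclient.InferenceServerClient("localhost:8001")
    return object()

def get_client():
    # lazily create one client per thread, then reuse it
    if not hasattr(_tls, "client"):
        _tls.client = make_client()
    return _tls.client

def worker(_):
    # record which client object this thread used
    return id(get_client()), threading.get_ident()

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, range(40)))

# each thread reuses exactly one client instance across all its tasks
by_thread = {}
for client_id, tid in results:
    by_thread.setdefault(tid, set()).add(client_id)
assert all(len(ids) == 1 for ids in by_thread.values())
```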
Closing. Reopen with additional information if the issue is not resolved.
I ran the grpcclient infer() method in a multi-threaded application (FastAPI), and sometimes the output results are identical for different input images.
The mistake always occurs between adjacent inputs.
For example:
Since #1856 says the Python grpcclient infer() is thread-safe, what is wrong with my application?