How to run two engines on one GPU? #3169
Comments
Please use multi-threading instead: multi-processing will create multiple CUDA contexts, and CUDA context switching on the GPU is time-sliced.
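The advice above boils down to: keep both engines in one process (one CUDA context) and drive each from its own thread. A minimal sketch of that concurrency pattern is below; the `run_engine` body is a hypothetical placeholder for the real TensorRT calls (each thread would hold its own `IExecutionContext` and CUDA stream, while sharing the process's single CUDA context).

```python
import threading

def run_engine(engine_name, results, index):
    # Placeholder for real per-thread work: in TensorRT this would be
    # engine.create_execution_context() followed by execute_async_v2()
    # on a thread-local CUDA stream. All threads share one CUDA context.
    results[index] = f"{engine_name}: done"

results = [None, None]
threads = [
    threading.Thread(target=run_engine, args=(name, results, i))
    for i, name in enumerate(["engine_a", "engine_b"])
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # → ['engine_a: done', 'engine_b: done']
```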
But what if I start two Docker containers using one GPU? It still cannot fully utilize the GPU. I also tried the torch2trt repo with multiprocessing: its running time does not double, but the GPU utilization does. As far as I know, torch2trt is based on TensorRT, so a TensorRT engine should work the same way.
Let me make it simpler: two processes sharing a GPU run in time slices; that is expected. You can explore MPS as an option: https://docs.nvidia.com/deploy/mps/index.html
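For reference, enabling MPS is a matter of starting the control daemon before launching the client processes; a minimal sketch for a single-GPU machine (run as the user who owns the GPU) might look like:

```shell
# Pin the daemon to the GPU the engines will share
export CUDA_VISIBLE_DEVICES=0
# Start the MPS control daemon; subsequent CUDA processes attach to
# a shared server instead of creating separate time-sliced contexts
nvidia-cuda-mps-control -d

# ... run the two TensorRT processes ...

# Shut the daemon down when finished
echo quit | nvidia-cuda-mps-control
```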
Thanks, I found a simple way in Python: use torch.nn.Module to encapsulate the TRT engine, following the practice of torch2trt.
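The commenter does not post their code, but the torch2trt-style encapsulation they describe amounts to: build one execution context when the module is constructed, then reuse it on every forward call, so all engines live under PyTorch's single CUDA context. A structural sketch is below; the class and engine names are hypothetical, and a stand-in `FakeEngine` replaces `tensorrt.ICudaEngine` so the sketch runs without a GPU (the real version would subclass `torch.nn.Module` and call `execute_async_v2` in `forward`).

```python
class FakeEngine:
    """Stand-in for a deserialized tensorrt.ICudaEngine (hypothetical)."""
    def create_execution_context(self):
        return lambda x: [v * 2 for v in x]  # pretend inference: double inputs

class TRTModule:
    """Module-style wrapper: the execution context is built once and reused."""
    def __init__(self, engine):
        self.engine = engine
        # Real TensorRT: self.context = engine.create_execution_context(),
        # sharing PyTorch's existing CUDA context instead of creating a new one
        self.context = engine.create_execution_context()

    def __call__(self, inputs):
        # Real TensorRT: bind input/output buffers, run execute_async_v2 on
        # torch's current CUDA stream, and return torch tensors
        return self.context(inputs)

module = TRTModule(FakeEngine())
print(module([1, 2, 3]))  # → [2, 4, 6]
```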
Hi @ArtemisZGL, if possible, could you please share the solution?
Description
I tried to run two engines on one GPU. When running one engine, the running time is 10 ms with 10-20% GPU utilization, but when I start another Docker container and run a second engine at the same time, the running time for each becomes 20 ms, with GPU utilization still at 10-20%.
I also tried using multiprocessing in Python, creating a different CUDA context in each process; the result is the same as above.
However, when I did the same operation in PyTorch, the running time stays at 10 ms and the GPU utilization doubles.
Environment
TensorRT Version: 8.6
NVIDIA GPU: 4070
NVIDIA Driver Version: 525.125.06
CUDA Version: 11.3
CUDNN Version:
Operating System: ubuntu 20.04
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Steps To Reproduce
common.py (https://github.com/NVIDIA/TensorRT/blob/96e23978cd6e4a8fe869696d3d8ec2b47120629b/samples/python/common.py)
infer code