slow with batch size > 1 #56
Does your processing time include image preprocessing? (Do you get a different result with trt_ssd_async.py?)
The time is only for trt_ssd.detect(img, 0.3). That call includes the preprocessing step, but for simplicity I concatenate the same image three times before feeding it to np.copyto(self.host_inputs[0], img_resized.ravel()). Notice that when I comment out self.stream.synchronize() in ssd.py, the first few results take 0.002 s and then the time grows to 0.06 s; when self.stream.synchronize() is left uncommented, I get 0.06 s for every result. Why?
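For reference, a minimal sketch of the replication step described above, assuming the preprocessed image img_resized has shape (3, 300, 300) as in this SSD model and that host_inputs[0] is the pinned host buffer from ssd.py, allocated for max_batch_size = 3 (all names here mirror the repo's code but the setup is assumed, not shown):

```python
import numpy as np

# img_resized: preprocessed image, assumed shape (3, 300, 300)
# host_inputs[0]: pinned host buffer, assumed sized for a batch of 3
batch = np.concatenate([img_resized[np.newaxis]] * 3, axis=0)  # (3, 3, 300, 300)
np.copyto(host_inputs[0], batch.ravel())  # flatten the whole batch into the buffer
```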
Instead of timing the whole trt_ssd.detect() function, I think it makes more sense for you to time only the "cuda.memcpy_xxx"s, "context.execute_async" and "cuda.stream.synchronize" calls in that function. By the way, the "self.stream.synchronize" call cannot be commented out. Otherwise, you cannot be sure the GPU has finished processing the image.
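A minimal sketch of that per-stage timing, assuming buffers and context are set up as in ssd.py (variable names here follow that file but are assumptions); synchronizing after each stage makes the printed times attributable to that stage alone:

```python
import time
import pycuda.driver as cuda

def timed_infer(context, stream, host_inputs, cuda_inputs,
                host_outputs, cuda_outputs, bindings, batch_size=1):
    t0 = time.time()
    cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
    stream.synchronize()                      # wait for host-to-device copy
    t1 = time.time()
    context.execute_async(batch_size=batch_size, bindings=bindings,
                          stream_handle=stream.handle)
    stream.synchronize()                      # wait for inference to finish
    t2 = time.time()
    cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
    stream.synchronize()                      # wait for device-to-host copy
    t3 = time.time()
    print('H2D copy : %.4f s' % (t1 - t0))
    print('execute  : %.4f s' % (t2 - t1))
    print('D2H copy : %.4f s' % (t3 - t2))
    return host_outputs[0]
```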
This is my custom TensorRT OCR model. When I use batch_size = 1 I get 0.02 s, and when I use batch_size = 10 I get 0.2 s, which means the batched input images are being processed serially, not in parallel. Why?
[Screenshots of timing output for batch_size = 1 and batch_size = 10 omitted]
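One way to make this comparison systematic is to measure per-image latency across batch sizes; if the GPU truly processes the batch in parallel, the per-image time should drop as the batch grows. A sketch with a hypothetical detect_fn that accepts a list of images (not a function from this repo):

```python
import time

def benchmark(detect_fn, img, batch_sizes=(1, 3, 10), runs=20):
    # detect_fn: assumed to take a list of images and run one batched inference
    for bs in batch_sizes:
        batch = [img] * bs
        start = time.time()
        for _ in range(runs):
            detect_fn(batch)
        per_image = (time.time() - start) / (runs * bs)
        print('batch=%2d  %.4f s/image' % (bs, per_image))
```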
Duplicate issue: #106
Hi,
I set max_batch_size = 3 and want to speed up the model by processing 3 input images in parallel instead of serially. I converted the model with batch_size = 3 correctly, but when I run trt_ssd.py I get these results:
for batch_size = 1: processing time is 0.002 s on a 1080 Ti,
for batch_size = 3: processing time is 0.006 s on a 1080 Ti.
That means the system processes the images serially, not in parallel (0.006 s / 3 images = 0.002 s per image, the same throughput as batch_size = 1). Why? A sketch of the batched call is shown below.
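For context, a minimal sketch of how a batch is submitted in one call under TensorRT's implicit-batch Python API, as used in this repo. The names (context, bindings, stream, buffers) are assumed to be set up as in ssd.py, and the engine must have been built with max_batch_size >= 3; note that the batch_size argument to execute_async must match the number of images packed into the input buffer:

```python
import numpy as np
import pycuda.driver as cuda

def detect_batch(context, bindings, stream,
                 host_input, cuda_input, host_output, cuda_output,
                 batch, batch_size=3):
    # batch: numpy array of shape (batch_size, C, H, W), already preprocessed
    np.copyto(host_input, batch.ravel())
    cuda.memcpy_htod_async(cuda_input, host_input, stream)
    # batch_size tells TensorRT how many images to run in this single call
    context.execute_async(batch_size=batch_size, bindings=bindings,
                          stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(host_output, cuda_output, stream)
    stream.synchronize()  # make sure the GPU has finished before reading results
    return host_output
```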