Add batch inference #8721

Closed · 1 of 5 tasks
Burhan-Q opened this issue Mar 6, 2024 · 11 comments · Fixed by #8817
Labels: enhancement (New feature or request), TODO (Items that need completing)

Comments

@Burhan-Q
Member

Burhan-Q commented Mar 6, 2024

Search before asking

  • I have searched the YOLOv8 issues and found no similar feature requests.

Description

Related to PR #8058

Related comment from @glenn-jocher

...yes some predictions sources will run in batches, but I think a main one that's missing is the glob or directory inference, though txt list of sources may also be missing. Yes please open an issue and tag us in it along with @adrianboguszewski from Intel. Thanks!

Use case

Note

Batch inference is currently supported for certain types of input sources; this issue tracks adding support for the additional sources listed below.

Example of existing batch inference support

import cv2 as cv
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
im1 = "ultralytics/assets/bus.jpg"
im2 = "ultralytics/assets/zidane.jpg"

img1 = cv.imread(im1)
img2 = cv.imread(im2)

# Multi-image
results = model.predict(source=[img1, img2, img1, img1, img2, img1, img2, img2],)

>>> 0: 640x640 4 persons, 1 bus, 
1: 640x640 2 persons, 1 tie, 
2: 640x640 4 persons, 1 bus, 
3: 640x640 4 persons, 1 bus, 
4: 640x640 2 persons, 1 tie, 
5: 640x640 4 persons, 1 bus, 
6: 640x640 2 persons, 1 tie, 
7: 640x640 2 persons, 1 tie, 
155.0ms

[r.speed for r in results]

>>> [
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}, 
{'preprocess': 2.5001466274261475, 'inference': 19.374817609786987, 'postprocess': 10.336160659790039}
]

# Single image
results2 = model.predict(source=[img1,])

>>> 0: 640x480 4 persons, 1 bus, 1 stop sign, 192.1ms

results2[0].speed
>>> {'preprocess': 2.998828887939453, 'inference': 192.11935997009277, 'postprocess': 8.776426315307617}

Add batch inference for (at least) the following sources:

  • glob or directories
  • list of text sources
  • ...
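
For reference, these source types are already accepted by predict; the request is for them to be processed in batches rather than one image at a time. A hypothetical invocation (the paths below are placeholders):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Sources this issue asks to run as batches instead of one image at a time
results_dir = model.predict(source="path/to/images/")        # directory
results_glob = model.predict(source="path/to/images/*.jpg")  # glob pattern
results_txt = model.predict(source="path/to/sources.txt")    # txt list of sources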

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@Burhan-Q Burhan-Q added the TODO (Items that need completing) and enhancement (New feature or request) labels Mar 6, 2024
@tienhoang1994

[screenshot: inference output from my test]
Batch predict has no effect in my test (see screenshot). I am testing with the same video input, so the input frames are already the same shape, but the documentation says the batch size is set automatically when a list of images is passed as input. @glenn-jocher

@tienhoang1994

I also tried the list.streams option YOLOv8 provides; it showed no improvement in either speed or hardware utilization.

@Burhan-Q
Member Author

Burhan-Q commented Mar 7, 2024

@tienhoang1994 you have 9 frames that complete processing in ~90 ms. Now you should time inference on a single frame to compare, as in the example from the issue description.

This issue was opened to track work toward supporting additional batch inference sources. You can follow it to check progress, but currently not all sources support batch inference.
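
For illustration, a minimal way to make that single-frame vs batch comparison (a sketch: the frame source and timing harness here are assumptions, and absolute timings depend on hardware):

import time

import cv2 as cv
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
frame = cv.imread("ultralytics/assets/bus.jpg")  # stand-in for a single video frame

# Warm-up call so model load / CUDA init is not counted in the timings
model.predict(frame, verbose=False)

# Single frame
t0 = time.perf_counter()
model.predict(frame, verbose=False)
single_ms = (time.perf_counter() - t0) * 1000

# Batch of 9 identical frames
t0 = time.perf_counter()
model.predict([frame] * 9, verbose=False)
batch_ms = (time.perf_counter() - t0) * 1000

print(f"single frame: {single_ms:.1f} ms | 9-frame batch: {batch_ms:.1f} ms "
      f"({batch_ms / 9:.1f} ms per frame in the batch)")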

@tienhoang1994

tienhoang1994 commented Mar 7, 2024 via email

@glenn-jocher
Member

Thanks for the update, @tienhoang1994! 🚀 If you're seeing ~10ms for a single frame and ~90ms for 9 frames, it seems like the batch processing is indeed working as expected, offering a more efficient throughput compared to single-frame processing. The slight overhead might be due to initial setup or IO delays which can be amortized over larger batch sizes. If you have specific performance targets or further questions, feel free to share!

@Burhan-Q
Member Author

Burhan-Q commented Mar 7, 2024

@tienhoang1994 the screenshot you shared shows inference for all 9 frames input as a batch.
[screenshot: batch inference timings]

The ~9-10 ms inference time is per image in the batch. What I was saying, and showed in my initial comment, is that the total batch inference time was 155.0 ms, whereas when passing only a single image the inference time was 192.1 ms. See my initial comment and expand the section ► Example of existing batch inference support for the full details of the results I'm referring to.

@Laughing-q Laughing-q self-assigned this Mar 8, 2024
@tienhoang1994

tienhoang1994 commented Mar 8, 2024

@Burhan-Q the ► Example of existing batch inference support you mentioned above is exactly what I expect: the total inference time for a batch of 8 images is roughly the same as for 1 image (155 ms vs 192 ms).
[screenshot: batch inference timings]
I have tested again so you can see my result. It seems like the batch does not run all frames in parallel and behaves the same as looping single-image predict 9 times (~90 ms for a batch of 9 images vs ~10 ms for a single image). Thank you for your support; please correct me if I have misunderstood.
[screenshot: looped single-image inference timings]

@Burhan-Q
Member Author

Burhan-Q commented Mar 8, 2024

@tienhoang1994 I follow your logic and results. I think there is some subjectivity in what counts as an appreciable difference. Comparing the initial batch results (your first comment) with a total of ~95 ms inference against the looping result (your most recent comment) at ~115 ms, that is a difference of 20 ms, or ~17% faster. It's not a lot, but to me it is measurably faster.

On my system the difference is more noticeable: inference on one (1) image was 192 ms, yet the 8-image batch inference was 155 ms, i.e. 8x the images in ~20% less time than a single image. Rechecking, here are my results using timeit.repeat:

Code

import timeit
import cv2 as cv
from ultralytics import YOLO
from functools import partial

model = YOLO("yolov8n.pt")

im1 = "ultralytics/assets/bus.jpg"
im2 = "ultralytics/assets/zidane.jpg"

img1 = cv.imread(im1)
img2 = cv.imread(im2)

p1 = partial(model.predict, (im1,))  # 1-image batch
p2 = partial(model.predict, [img1, img2, img1, img1, img2, img1, img2, img2]) # 8-image batch

timeit.repeat(p1, repeat=3, number=3)
timeit.repeat(p2, repeat=3, number=3)

1 image batch inference

[screenshot: 1-image batch timeit output]

  • first iteration inference time is 141ms
  • average of remaining 7 repeats is ~8ms

8 image batch inference

[screenshot: 8-image batch timeit output]

  • first iteration inference time is 152ms
  • average of remaining 7 repeats is ~15ms

Ignoring the initial "slow" (warmup) result, the 8-image batch takes just under 2x longer than the 1-image batch in my recent test, but it is also processing 8x the input data. To calculate the per-image inference speed in the 8-image batch, take the final 8-image batch inference time (14.6 ms) and divide by the number of images (8), which gives ~1.8 ms per image. Compared with the last 1-image batch inference time (8.6 ms), the per-image inference time in the batch of 8 is much faster.
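
As a quick sanity check on that arithmetic (numbers taken from the timings above):

# Per-image inference time, using the last measurements quoted above
batch_total_ms = 14.6  # final 8-image batch inference time
batch_size = 8
single_ms = 8.6        # last 1-image batch inference time

per_image_batch_ms = batch_total_ms / batch_size  # ~1.8 ms per image
speedup = single_ms / per_image_batch_ms          # ~4.7x faster per image

print(f"{per_image_batch_ms:.1f} ms/image in the batch vs {single_ms:.1f} ms for a single image "
      f"(~{speedup:.1f}x faster per image)")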

@tienhoang1994

tienhoang1994 commented Mar 12, 2024

Sorry for taking your time. I think I found the problem: it is the GPU.
[screenshot: side-by-side timings on a GTX 1060 and an RTX 3090]
On the left, the weak GTX 1060 I have been using in this thread gets 7 ms per image vs 5.6 ms per image in a batch of 8 (so I don't feel the power of batch inference).
On the right, a strong RTX 3090 gets 5.1 ms per image vs 1.1 ms per image in a batch of 8 (much better).

@glenn-jocher
Member

@tienhoang1994 hey there! No worries at all, you're not wasting our time. We're here to help! 😊 It looks like you've made an interesting observation regarding the impact of GPU capabilities on batch inference performance. Indeed, the difference in processing power between a GTX 1060 and an RTX 3090 can significantly affect the efficiency gains from batch processing.

The RTX 3090, with its higher compute capability and memory bandwidth, can better leverage parallel processing, making the advantages of batch inference more pronounced. On the other hand, the GTX 1060, while still a capable GPU, might not exhibit as dramatic improvements due to its hardware limitations.

Here's a quick example of passing a list of sources for batch inference; how many you pass at once can be adjusted based on your GPU's capabilities:

from ultralytics import YOLO

# Load your model
model = YOLO('yolov8n.pt')

# Define your source
source = ['path/to/image1.jpg', 'path/to/image2.jpg']  # and so on...

# Predict with batch inference
results = model.predict(source=source)

Remember, finding the optimal batch size for your specific hardware setup can maximize your inference efficiency. Keep experimenting, and thanks for sharing your findings! 👍
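
For a larger set of images, one way to control the effective batch size per call is to chunk the source list yourself. A minimal sketch, assuming a hypothetical image directory and a chunk size tuned to your GPU memory:

from pathlib import Path

from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Hypothetical image directory and chunk size; tune chunk_size for your GPU
image_paths = sorted(Path("path/to/images").glob("*.jpg"))
chunk_size = 8

all_results = []
for i in range(0, len(image_paths), chunk_size):
    chunk = [str(p) for p in image_paths[i : i + chunk_size]]
    all_results.extend(model.predict(source=chunk, verbose=False))

print(f"Processed {len(all_results)} images in chunks of {chunk_size}")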

@Burhan-Q Burhan-Q linked a pull request Mar 12, 2024 that will close this issue
@Burhan-Q
Member Author

Additional batch inference sources are included with #8817.
