Add batch inference #8721
I also tried the `list.streams` source that YOLOv8 provides; it also showed no improvement in either speed or hardware consumption.
@tienhoang1994 you have 9 frames that complete processing in ~90 ms. Now you should test the time for inference on a single frame to compare, like the example from the issue comment. This issue was opened to track the work towards supporting additional batch inference sources. You can follow here to check progress, but currently not all sources will support batch inference.
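A minimal sketch of that comparison, assuming a local `yolov8n.pt` and using `frame.jpg` as a placeholder for one of the 9 frames:

```python
import time

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.predict("frame.jpg")  # warmup call so one-time setup cost is excluded

t0 = time.perf_counter()
model.predict("frame.jpg")  # single frame
single_ms = (time.perf_counter() - t0) * 1000

t0 = time.perf_counter()
model.predict(["frame.jpg"] * 9)  # 9-frame batch, matching the report above
batch_ms = (time.perf_counter() - t0) * 1000

print(f"1 frame: {single_ms:.1f} ms | batch of 9: {batch_ms:.1f} ms ({batch_ms / 9:.1f} ms/frame)")
```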
I already tested with a single frame, and you can also see in my screenshot the YOLO log for a single frame; it takes ~10 ms.
Thanks for the update, @tienhoang1994! 🚀 If you're seeing ~10 ms for a single frame and ~90 ms for 9 frames, it seems like batch processing is indeed working as expected, offering more efficient throughput than single-frame processing. The slight overhead might be due to initial setup or I/O delays, which can be amortized over larger batch sizes. If you have specific performance targets or further questions, feel free to share!
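One way to see that amortization directly is to time a few batch sizes and watch the per-image cost fall. A rough sketch; the asset path assumes an ultralytics repo checkout, and any local image works:

```python
import time

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
im = "ultralytics/assets/bus.jpg"  # placeholder image path
model.predict(im)  # warmup

for n in (1, 2, 4, 8, 16):
    t0 = time.perf_counter()
    model.predict([im] * n)  # n-image batch
    total_ms = (time.perf_counter() - t0) * 1000
    print(f"batch={n:2d}: {total_ms:7.1f} ms total, {total_ms / n:5.1f} ms/image")
```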
@tienhoang1994 the screenshot you shared shows inference for all 9 frames input as a batch. The ~9-10 ms inference is for each image in the batch. What I was saying, and showed in my initial comment, is that the total batch inference time was ~90 ms for the whole batch.
@Burhan-Q the "Example of existing batch inference support" you mentioned above is exactly what I expect: the total inference time for a batch of 8 images is mostly the same as inference on 1 image (155-192 ms).
@tienhoang1994 I follow your logic and results. I think maybe there is some subjectivity in the assessment of an appreciable difference. If I look at the initial batch results (your first comment), the total for 9 frames was ~90 ms. On my system, I think the difference is more noticeable, as inference on one (1) image was 192 ms, yet the 8-image batch inference was in the same 155-192 ms range quoted above.

```python
import timeit
from functools import partial

import cv2 as cv
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
im1 = "ultralytics/assets/bus.jpg"
im2 = "ultralytics/assets/zidane.jpg"
img1 = cv.imread(im1)
img2 = cv.imread(im2)

p1 = partial(model.predict, (im1,))  # 1-image batch
p2 = partial(model.predict, [img1, img2, img1, img1, img2, img1, img2, img2])  # 8-image batch

timeit.repeat(p1, repeat=3, number=3)  # 1-image batch inference
timeit.repeat(p2, repeat=3, number=3)  # 8-image batch inference
```

Ignoring the initial "slow" (warmup) result, the 8-image batch takes just under 2x longer than the 1-image batch in my recent test, but it is also processing 8x the input data. To calculate the per-image inference speed within the 8-image batch, I take the final 8-image batch inference time and divide the total (14.6 ms) by the number of images (8), which gives ~1.8 ms per image. Comparing that against the final 1-image batch inference time (8.6 ms), you can see that the per-image inference time for the batch of 8 images is much faster.
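Each `Results` object also carries a `speed` attribute with per-image timings in milliseconds, which can cross-check the manual division above. A small sketch, again assuming a repo checkout for the asset path:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict(["ultralytics/assets/bus.jpg"] * 8)  # 8-image batch

# speed reports preprocess/inference/postprocess times per image, in ms
for i, r in enumerate(results):
    print(f"image {i}: {r.speed['inference']:.1f} ms inference")
```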
@tienhoang1994 hey there! No worries at all, you're not wasting our time. We're here to help! 😊 It looks like you've made an interesting observation regarding the impact of GPU capabilities on batch inference performance. Indeed, the difference in processing power between a GTX 1060 and an RTX 3090 can significantly affect the efficiency gains from batch processing. The RTX 3090, with its higher compute capability and memory bandwidth, can better leverage parallel processing, making the advantages of batch inference more pronounced. On the other hand, the GTX 1060, while still a capable GPU, might not exhibit as dramatic improvements due to its hardware limitations.

Here's a quick example to illustrate how you might run batch inference:

```python
from ultralytics import YOLO

# Load your model
model = YOLO("yolov8n.pt")

# Define your source
source = ["path/to/image1.jpg", "path/to/image2.jpg"]  # and so on...

# Predict with batch inference
results = model.predict(source=source)
```

Remember, finding the optimal batch size for your specific hardware setup can maximize your inference efficiency. Keep experimenting, and thanks for sharing your findings! 👍
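As a starting point for that experimentation, one rough heuristic is to scale the batch size to free GPU memory. A sketch; the per-image budget below is an arbitrary placeholder, not a measured value:

```python
import torch

if torch.cuda.is_available():
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    per_image_budget = 512 * 1024**2  # assumed ~512 MB per image; tune for your model and input size
    batch_size = max(1, int(free_bytes // per_image_budget))
else:
    batch_size = 1  # CPU fallback

print(f"Trying batch size {batch_size}")
```

Under this heuristic, an RTX 3090's 24 GB would allow far larger batches than a GTX 1060's 6 GB, consistent with the behavior discussed above.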
Additional batch inference sources were included with #8817.
Search before asking
Description
Related to PR #8058
Related comment from @glenn-jocher
Use case
Note
Currently, batch inference is supported only for certain types of input sources; this issue is to include the additional sources listed below.
Example of existing batch inference support
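For reference, list inputs already support batching; a minimal sketch with placeholder paths:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# A list of sources is run through the model as a single batch
results = model.predict(["img1.jpg", "img2.jpg", "img3.jpg"])
```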
Add batch inference for (at least) the following sources:
- glob patterns or directories
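Until these land, a workaround sketch that gathers paths with `glob` and feeds fixed-size chunks to `predict()`; the pattern and chunk size are placeholders:

```python
import glob

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
paths = sorted(glob.glob("path/to/images/*.jpg"))  # placeholder pattern

chunk = 8  # arbitrary batch size
for i in range(0, len(paths), chunk):
    results = model.predict(paths[i : i + chunk])  # one batched call per chunk
```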
Additional
No response
Are you willing to submit a PR?