Add concurrent processing for sv.InferenceSlicer #361

Merged
SkalskiP merged 7 commits into develop from add-concurrent-slicer on Oct 4, 2023

Conversation

@capjamesg (Collaborator)

Description

This PR adds concurrent processing to sv.InferenceSlicer using concurrent.futures (part of the Python standard library).

This PR drastically reduces inference times with the slicer.

Without concurrent processing:

Detections w/o SAHI: 127
--- 1.460580825805664 seconds ---
Detections w/ SAHI: 146
--- 13.294042110443115 seconds ---

With concurrent processing slicer:

Detections w/o SAHI: 127
--- 1.4807708263397217 seconds ---
Detections w/ SAHI: 146
--- 2.305016040802002 seconds ---
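At a high level, the approach is to fan slice inference out over a thread pool. A minimal sketch of the idea, assuming a run_slices helper (names here are illustrative, not the exact implementation):

from concurrent.futures import ThreadPoolExecutor

def run_slices(slices, callback, thread_workers=1):
    # Submit each slice to the pool; each callback call runs inference
    # on one slice, so network-bound (web) inference calls overlap
    # instead of running back to back.
    with ThreadPoolExecutor(max_workers=thread_workers) as executor:
        futures = [executor.submit(callback, s) for s in slices]
        return [future.result() for future in futures]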

Type of change

  • New feature (non-breaking change which adds functionality)

How has this change been tested? Please provide a test case or example of how you tested the change.

This change has been tested using the following code, which validates that the number of predictions is the same with and without concurrent processing.

import supervision as sv
import numpy as np
import cv2
import time
import roboflow

roboflow.login()

rf = roboflow.Roboflow()
workspace = rf.workspace()
project = workspace.project("vehicle-count-in-drone-video")
version = project.version(6)
model = version.model

def callback(x: np.ndarray) -> sv.Detections:
    # Run the model on a single image (or slice) and convert the
    # hosted-inference JSON response into an sv.Detections object.
    result = model.predict(x).json()
    return sv.Detections.from_roboflow(result, class_list=list(project.classes.keys()))

image = cv2.imread("./original.jpg")
image_width = image.shape[1]
image_height = image.shape[0]

print("starting test...")

start_time = time.time()

detections = callback(image)

print("Detections w/o SAHI:", len(detections.xyxy))
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()

slicer = sv.InferenceSlicer(callback=callback)
sliced_detections = slicer(image=image)

prediction_num = len(sliced_detections.xyxy)

# sv.plot_image(sliced_image)
print("Detections w/ SAHI:", prediction_num)

print("--- %s seconds ---" % (time.time() - start_time))

Any specific deployment considerations

N/A

Docs

N/A

@capjamesg capjamesg self-assigned this Sep 8, 2023
onuralpszr and others added 3 commits September 8, 2023 15:03
@capjamesg (Collaborator Author)

Good call changing the default number of workers to 1. The use case for which this is built is web inference; on-device inference is unlikely to benefit from concurrent processing, so a default of 1 makes sense.
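As a usage sketch (thread_workers is the parameter name used later in this thread; callback as defined in the PR description):

slicer = sv.InferenceSlicer(callback=callback)                    # thread_workers defaults to 1 (sequential)
slicer = sv.InferenceSlicer(callback=callback, thread_workers=8)  # opt in for web inference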

@hardikdava (Collaborator)

Hey @capjamesg, I took a look at the PR. I am not getting the same results as the old implementation when thread_workers is more than 1. See the logs:

Image Shape: (1174, 1920, 3)
Testing Old Implementation:
Numbers of detections: 365
Taken time: 23.60136890411377
***********************************************
Testing New Implementation:
------------------ Number of Threads 1 -----------------------
Numbers of detections: 365
Taken time: 17.404013872146606
Result is same as SAHI v1
------------------ Number of Threads 2 -----------------------
Numbers of detections: 365
Taken time: 14.097673654556274
Result is not same as SAHI v1
------------------ Number of Threads 3 -----------------------
Numbers of detections: 376
Taken time: 14.455281257629395
Result is not same as SAHI v1

Testing code:

import supervision as sv
import numpy as np
import cv2
import time
from ultralytics import YOLO

model = YOLO(model="yolov8m.pt")


def callback(x: np.ndarray) -> sv.Detections:
    # Run YOLOv8 on a single slice and convert to sv.Detections.
    result = model.predict(x, verbose=False)[0]
    return sv.Detections.from_ultralytics(result)


image = cv2.imread("../data/bird.jpg")
print("Image Shape:", image.shape)

print("Testing Old Implementation:")
start_time = time.time()
slicer = sv.InferenceSlicerOld(callback=callback)
sliced_detections_v1 = slicer(image=image.copy())

print("Numbers of detections:", len(sliced_detections_v1))
print("Taken time:", (time.time() - start_time))

print("***********************************************")

print("Testing New Implementation:")

for n in range(1, 4):
    print(f"------------------ Number of Threads {n} -----------------------")
    start_time = time.time()
    slicer = sv.InferenceSlicer(callback=callback, thread_workers=n)
    sliced_detections_v2 = slicer(image=image.copy())
    print("Numbers of detections:", len(sliced_detections_v2))
    print("Taken time:", (time.time() - start_time))
    if sliced_detections_v1 == sliced_detections_v2:
        print("Result is same as SAHI v1")
    else:
        print("Result is not same as SAHI v1")

P.S.: sv.InferenceSlicerOld is a copy of the original sv.InferenceSlicer, renamed to allow a side-by-side comparison.

@capjamesg (Collaborator Author)

@hardikdava Can you try with 8 threads?

@SkalskiP (Collaborator) commented Oct 4, 2023

@hardikdava, shouldn't we use batch inference rather than multithreading?
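For context, a batched Ultralytics callback might look roughly like this (a sketch only; callback_batch is hypothetical, and the slicer would need to pass a list of slices rather than one image at a time):

def callback_batch(slices: list) -> list:
    # Ultralytics accepts a list of images, so all slices can be run
    # through one predict() call instead of one call per slice.
    results = model.predict(slices, verbose=False)
    return [sv.Detections.from_ultralytics(r) for r in results]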

@capjamesg (Collaborator Author)

@SkalskiP Can you say more?

@onuralpszr (Collaborator)

> @hardikdava Can you try with 8 threads?

@hardikdava, based on this code, try using multiple images.

@capjamesg (Collaborator Author)

@hardikdava Is this the right way to compare the detections?

    if sliced_detections_v1 == sliced_detections_v2:
        print("Result is same as SAHI v1")
    else:
        print("Result is not same as SAHI v1")

Are two inference runs guaranteed to be identical in terms of confidence, etc.?
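If ordering is the only difference, an order-insensitive check would settle it. A sketch (detections_match is a hypothetical helper):

import numpy as np

def detections_match(a: sv.Detections, b: sv.Detections, atol: float = 1e-5) -> bool:
    # Sort both box sets lexicographically by (x1, y1, x2, y2) so that
    # thread completion order cannot affect the comparison.
    if len(a) != len(b):
        return False
    order_a = np.lexsort(a.xyxy.T[::-1])
    order_b = np.lexsort(b.xyxy.T[::-1])
    return np.allclose(a.xyxy[order_a], b.xyxy[order_b], atol=atol)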

@hardikdava (Collaborator) commented Oct 4, 2023

@capjamesg We are not changing any parameters, so technically the results should be the same. The result is the same for thread_workers=1. The number of detections is also different at higher thread counts.

@capjamesg (Collaborator Author)

The number of detections should be different, right? SAHI leads to more predictions in aggregate.

I slightly modified my test:

import supervision as sv
import numpy as np
import cv2
import time
import roboflow

roboflow.login()

rf = roboflow.Roboflow()
workspace = rf.workspace()
project = workspace.project("vehicle-count-in-drone-video")
version = project.version(6)
model = version.model

def callback(x: np.ndarray) -> sv.Detections:
    result = model.predict(x).json()
    return sv.Detections.from_roboflow(result, class_list=list(project.classes.keys()))

image = cv2.imread("./example.jpg")
image_width = image.shape[1]
image_height = image.shape[0]

print("starting test...")

start_time = time.time()

detections = callback(image)

print("Detections w/o SAHI:", len(detections.xyxy))
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()

slicer = sv.InferenceSlicer(callback=callback, thread_workers=8)
sliced_detections = slicer(image=image)

prediction_num = len(sliced_detections.xyxy)

# sv.plot_image(sliced_image)
print("Detections w/ SAHI:", prediction_num)

print("--- %s seconds ---" % (time.time() - start_time))

Here were the results:

You are already logged into Roboflow. To make a different login, run roboflow.login(force=True).
loading Roboflow workspace...
loading Roboflow project...
starting test...
Detections w/o SAHI: 126
--- 1.898620367050171 seconds ---
Detections w/ SAHI: 145
--- 3.1943469047546387 seconds ---

@capjamesg (Collaborator Author)

Disregard my last message 🙃

I was context-switching from something else and mistakenly assumed the number of predictions should be the same.

I have this updated code:

import supervision as sv
import numpy as np
import cv2
import time
import roboflow

roboflow.login()

rf = roboflow.Roboflow()
workspace = rf.workspace()
project = workspace.project("vehicle-count-in-drone-video")
version = project.version(6)
model = version.model

def callback(x: np.ndarray) -> sv.Detections:
    result = model.predict(x).json()
    return sv.Detections.from_roboflow(result, class_list=list(project.classes.keys()))

image = cv2.imread("./example.jpg")
image_width = image.shape[1]
image_height = image.shape[0]

print("starting test...")

start_time = time.time()

slicer = sv.InferenceSlicer(callback=callback, thread_workers=1)
sliced_detections = slicer(image=image)

print("Detections w/ SAHI (1 thread):", len(sliced_detections.xyxy))
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()

slicer = sv.InferenceSlicer(callback=callback, thread_workers=8)
sliced_detections = slicer(image=image)

prediction_num = len(sliced_detections.xyxy)

# sv.plot_image(sliced_image)
print("Detections w/ SAHI:", prediction_num)

print("--- %s seconds ---" % (time.time() - start_time))

Changing the number of threads from 1 to 8 results in the same number of predictions:

loading Roboflow workspace...
loading Roboflow project...
starting test...
Detections w/ SAHI (1 thread): 145
--- 19.341336011886597 seconds ---
Detections w/ SAHI (8 threads): 145
--- 2.5259499549865723 seconds ---

With that said, this number is higher than what the current SAHI implementation produces, even with the number of workers set to 1, which is not expected.

@capjamesg (Collaborator Author)

One potential reason for the higher number of predictions is that the concurrent predictions are not ordered, unlike the current linear processing. Do we need a re-ordering post-processing step?
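If ordering did turn out to matter, results could be collected in submission order with executor.map rather than a separate re-ordering step. A sketch (run_slices_ordered is illustrative):

from concurrent.futures import ThreadPoolExecutor

def run_slices_ordered(slices, callback, thread_workers=8):
    # executor.map yields results in the order the slices were submitted,
    # regardless of which worker thread finishes first.
    with ThreadPoolExecutor(max_workers=thread_workers) as executor:
        return list(executor.map(callback, slices))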

@SkalskiP (Collaborator) commented Oct 4, 2023

@hardikdava, any chance you could review it once again?

@SkalskiP (Collaborator) commented Oct 4, 2023

@capjamesg I think there is only a small chance that this is the reason. After inference, all detections go into one bag anyway.
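Conceptually, the merge step is order-independent. A sketch, assuming per-slice box offsets are already applied (merge_slice_detections is illustrative, not the library's actual code):

def merge_slice_detections(detections_per_slice, iou_threshold=0.5):
    # Concatenate detections from every slice into one "bag", then
    # suppress duplicate boxes that straddle slice boundaries with NMS.
    merged = sv.Detections.merge(detections_per_slice)
    return merged.with_nms(threshold=iou_threshold)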

@hardikdava (Collaborator)

@capjamesg Technically, when you optimize an algorithm for speed, the results should not differ.

@capjamesg (Collaborator Author)

@hardikdava Mind taking a look at the code to see what could be the problem?

@SkalskiP (Collaborator) commented Oct 4, 2023

@capjamesg, have you made any visualizations? I would love to see a Google Colab where we compare performance and results.

@hardikdava (Collaborator)

@capjamesg, just for clarity: we cannot directly compare results with and without SAHI. We can compare SAHI results across different numbers of threads.

@capjamesg (Collaborator Author)

One-thread results:

(image: example_one_thread)

Multi-thread results:

(image: example_multi_thread)

@SkalskiP SkalskiP self-requested a review October 4, 2023 16:15
@SkalskiP SkalskiP merged commit 6110804 into develop Oct 4, 2023
4 checks passed
@SkalskiP SkalskiP deleted the add-concurrent-slicer branch January 2, 2024 18:51