Add concurrent processing for sv.InferenceSlicer
#361
Conversation
Good call changing the default number of workers to 1. The use case for which this is built is web inference; on-device inference is unlikely to benefit from concurrent processing, so a default of 1 makes sense.
Hey @capjamesg, I took a look at the PR. I am not getting the same results as the old implementation if `thread_workers` > 1.

Testing code:

```python
import supervision as sv
import numpy as np
import cv2
import time
from ultralytics import YOLO

model = YOLO(model="yolov8m.pt")

def callback(x: np.ndarray) -> sv.Detections:
    result = model.predict(x, verbose=False)[0]
    return sv.Detections.from_ultralytics(result)

image = cv2.imread("../data/bird.jpg")
print("Image Shape:", image.shape)

print("Testing Old Implementation:")
start_time = time.time()
slicer = sv.InferenceSlicerOld(callback=callback)
sliced_detections_v1 = slicer(image=image.copy())
print("Numbers of detections:", len(sliced_detections_v1))
print("Taken time:", (time.time() - start_time))

print("***********************************************")
print("Testing New Implementation:")
for n in range(1, 4):
    print(f"------------------ Number of Threads {n} -----------------------")
    start_time = time.time()
    slicer = sv.InferenceSlicer(callback=callback, thread_workers=n)
    sliced_detections_v2 = slicer(image=image.copy())
    print("Numbers of detections:", len(sliced_detections_v2))
    print("Taken time:", (time.time() - start_time))
    if sliced_detections_v1 == sliced_detections_v2:
        print("Result is same as SAHI v1")
    else:
        print("Result is not same as SAHI v1")
```

P.S.: `InferenceSlicerOld` is a copy of the old implementation, renamed for side-by-side testing.
@hardikdava Can you try with
@hardikdava, shouldn't we use batch inference rather than multithreading?
@SkalskiP Can you say more?
@hardikdava based on this code, use "multiple image" (batch) inference.
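A minimal sketch of the batch-inference idea under discussion, assuming the slicer were extended to hand the callback a list of crops at once; `batch_callback` is a hypothetical name, not part of this PR. Ultralytics models already accept a list of images and return one result per image:

```python
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8m.pt")

def batch_callback(slices: list) -> list:
    # One forward pass over all slices instead of one model call per slice.
    results = model.predict(slices, verbose=False)
    return [sv.Detections.from_ultralytics(result) for result in results]
```

Batching amortizes per-call overhead on a local GPU, whereas threading mainly pays off when the callback spends its time waiting on network I/O, such as a hosted inference API.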
@hardikdava Is this the right way to compare the detections?

```python
if sliced_detections_v1 == sliced_detections_v2:
    print("Result is same as SAHI v1")
else:
    print("Result is not same as SAHI v1")
```

Are two inferences guaranteed to be identical in terms of confidence, etc.?
@capjamesg We are not changing any parameters, so technically the results should be the same. The result is the same for `thread_workers=1`.
The number of detections should be different, right? SAHI leads to more predictions in aggregate. I slightly modified my test:

```python
import supervision as sv
import numpy as np
import cv2
import time
import roboflow

roboflow.login()

rf = roboflow.Roboflow()
workspace = rf.workspace()
project = workspace.project("vehicle-count-in-drone-video")
version = project.version(6)
model = version.model

def callback(x: np.ndarray) -> sv.Detections:
    result = model.predict(x).json()
    return sv.Detections.from_roboflow(result, class_list=list(project.classes.keys()))

image = cv2.imread("./example.jpg")
image_width = image.shape[1]
image_height = image.shape[0]

print("starting test...")

start_time = time.time()
detections = callback(image)
print("Detections w/o SAHI:", len(detections.xyxy))
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()
slicer = sv.InferenceSlicer(callback=callback, thread_workers=8)
sliced_detections = slicer(image=image)
prediction_num = len(sliced_detections.xyxy)
# sv.plot_image(sliced_image)
print("Detections w/ SAHI:", prediction_num)
print("--- %s seconds ---" % (time.time() - start_time))
```

Here were the results:
Disregard my last message 🙃 I was shifting contexts from something else and misinterpreted that the number of predictions should be the same. I have this updated code:

```python
import supervision as sv
import numpy as np
import cv2
import time
import roboflow

roboflow.login()

rf = roboflow.Roboflow()
workspace = rf.workspace()
project = workspace.project("vehicle-count-in-drone-video")
version = project.version(6)
model = version.model

def callback(x: np.ndarray) -> sv.Detections:
    result = model.predict(x).json()
    return sv.Detections.from_roboflow(result, class_list=list(project.classes.keys()))

image = cv2.imread("./example.jpg")
image_width = image.shape[1]
image_height = image.shape[0]

print("starting test...")

start_time = time.time()
slicer = sv.InferenceSlicer(callback=callback, thread_workers=1)
sliced_detections = slicer(image=image)
print("Detections w/o SAHI:", len(sliced_detections.xyxy))
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()
slicer = sv.InferenceSlicer(callback=callback, thread_workers=8)
sliced_detections = slicer(image=image)
prediction_num = len(sliced_detections.xyxy)
# sv.plot_image(sliced_image)
print("Detections w/ SAHI:", prediction_num)
print("--- %s seconds ---" % (time.time() - start_time)) Changing the number of threads from
With that said this number is higher than the current SAHI implementation, even with the number of workers set to |
One potential reason for the higher number of predictions is that the concurrent predictions are not ordered, unlike when the predictions are processed linearly as they are now. Do we need a re-arranging post-processing step?
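If ordering does turn out to matter, a separate re-arranging step may be unnecessary: `ThreadPoolExecutor.map` yields results in submission order even though the work runs concurrently. A minimal sketch, with `run_ordered` as a hypothetical helper rather than the PR's code:

```python
from concurrent.futures import ThreadPoolExecutor

def run_ordered(callback, slices, workers: int = 4) -> list:
    # map() yields results in the order the slices were submitted,
    # regardless of which thread finishes first.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(callback, slices))
```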
@hardikdava, any chance you could review it once again?
@capjamesg I think there is a small chance that this is the reason. After inference, all detections go into one bag anyway.
@capjamesg technically, when you optimize an algorithm for speed, the results should not differ.
@hardikdava Mind taking a look at the code to see what could be the problem?
@capjamesg, have you made any visualizations? I would love to look at a Google Colab where we compare performance and results.
@capjamesg, just for clear understanding, we cannot directly compare results with and without SAHI. We can compare the results from SAHI with different numbers of threads.
Description
This PR adds concurrent processing with the `concurrent.futures` threading method (part of the Python standard library) to `sv.InferenceSlicer` processing. This PR drastically reduces inference times with the slicer.
Without concurrent processing:
With the concurrent processing slicer:
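For readers skimming the thread, a rough sketch of the approach described above, assuming slice offsets in `(x_min, y_min, x_max, y_max)` form; the function and variable names are illustrative, not the PR's actual internals:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np
import supervision as sv

def slice_and_infer(image, callback, offsets, thread_workers=1):
    def infer_slice(offset):
        x_min, y_min, x_max, y_max = offset
        detections = callback(image[y_min:y_max, x_min:x_max])
        # Shift boxes from slice coordinates back to full-image coordinates.
        detections.xyxy += np.array([x_min, y_min, x_min, y_min])
        return detections

    with ThreadPoolExecutor(max_workers=thread_workers) as executor:
        futures = [executor.submit(infer_slice, offset) for offset in offsets]
        results = [future.result() for future in as_completed(futures)]
    return sv.Detections.merge(results)
```

Because `as_completed` yields results in completion order, the merged detections can come back ordered differently than in a sequential loop, which is what the ordering discussion above refers to.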
Type of change
How has this change been tested, please provide a testcase or example of how you tested the change?
This change has been tested using the following code to validate that the number of predictions with and without concurrent processing is the same.
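A minimal sketch of the described check, assuming a YOLOv8 callback like the one in the test scripts above (the original snippet is not reproduced here, and the image path is hypothetical):

```python
import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8m.pt")

def callback(x: np.ndarray) -> sv.Detections:
    return sv.Detections.from_ultralytics(model.predict(x, verbose=False)[0])

image = cv2.imread("./example.jpg")  # hypothetical path

# The prediction count should match regardless of the number of workers.
count_seq = len(sv.InferenceSlicer(callback=callback, thread_workers=1)(image=image))
count_con = len(sv.InferenceSlicer(callback=callback, thread_workers=8)(image=image))
assert count_seq == count_con, (count_seq, count_con)
```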
Any specific deployment considerations
N/A
Docs
N/A