Made inference faster (this is especially useful when using Yolo9000) #8009
The custom_get_region_detections function now keeps track of the class index with the highest probability.
Added best_class_idx to the detection struct.
Added Python code for faster negative removal and faster non-max suppression.
I am adding this comment to prove that my work account (this) and the personal account that made this commit are the same.
@sarimmehdi @iAmJuan550 Hi, Thanks!
I will answer all of these tomorrow when I am at my workplace.
@AlexeyAB I have provided the benchmarks below (NVIDIA Corporation TU104 [GeForce RTX 2080 SUPER], Intel® Core™ i9-10900F CPU @ 2.80GHz × 20, 31.2 GB Memory):
Results on 6 images:
Here is the same code, but now with remove_negatives() and do_nms_sort():
Results:
…AlexeyAB#8009)
* Update network.c: custom_get_region_detections function now keeps track of class index with the highest probability.
* Update darknet.h: Added best_class_idx to detection struct
* Update darknet.py: added python code for faster negative removal and also faster non-max suppression
If you look at remove_negatives in darknet.py, it produces the final detections for an image by running a nested loop. The outer loop iterates over all detection objects output by the neural net, and the inner loop iterates over all class names (each detection carries an array of probabilities whose length equals the number of classes). This is not much of an issue with only 80 class names (the original COCO). With YOLO9000, however, there are 9418 class names, so iterating over more than 9k names for each detection object causes a significant slowdown.
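To make the cost concrete, here is a minimal sketch of the nested-loop pattern described above. It is not the code from darknet.py: `Det` is a hypothetical stand-in for the detection struct, and `remove_negatives_nested` and `count_probability_checks` are illustrative names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Det:
    """Hypothetical stand-in for darknet's detection struct."""
    bbox: Tuple[float, float, float, float]  # (x, y, w, h)
    prob: List[float]                        # one probability per class


def remove_negatives_nested(detections, class_names):
    """Keep every (class, probability, box) triple with probability > 0.

    Cost is len(detections) * len(class_names) probability checks.
    """
    predictions = []
    for det in detections:                        # outer loop: detections
        for idx, name in enumerate(class_names):  # inner loop: all classes
            if det.prob[idx] > 0:
                predictions.append((name, det.prob[idx], det.bbox))
    return predictions


def count_probability_checks(num_detections, num_classes):
    """Number of inner-loop iterations the nested version performs."""
    return num_detections * num_classes
```

With an assumed 100 detections, this works out to 8,000 checks for 80 classes but 941,800 checks for the 9418 classes of YOLO9000.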
In network.c, I keep track of the class index with the highest probability. Then, in remove_negatives_faster, instead of iterating over all class indices (9418 in the case of YOLO9000), I simply read that index (stored as best_class_idx). Furthermore, the provided NMS method is quite slow, so I added an extra function that performs NMS in Python much faster.
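The idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code from this PR: `Det` is a hypothetical stand-in for the detection struct, `set_best_class_idx` imitates in Python what the C side is described as doing, the `-1` sentinel for "no class above threshold" is an assumption, and `greedy_nms` is a generic per-class greedy NMS rather than the function added to darknet.py.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


@dataclass
class Det:
    """Hypothetical stand-in for darknet's detection struct."""
    bbox: Box
    prob: List[float]
    best_class_idx: int = -1  # assumed sentinel: no class above threshold


def set_best_class_idx(det, thresh=0.0):
    """Record the arg-max class once, while the probabilities are decoded."""
    best_idx, best_prob = -1, thresh
    for idx, p in enumerate(det.prob):
        if p > best_prob:
            best_idx, best_prob = idx, p
    det.best_class_idx = best_idx


def remove_negatives_by_best_idx(detections, class_names):
    """One lookup per detection instead of a loop over every class."""
    predictions = []
    for det in detections:
        idx = det.best_class_idx
        if idx == -1:
            continue  # nothing above threshold for this box
        predictions.append((class_names[idx], det.prob[idx], det.bbox))
    return predictions


def iou(a, b):
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(predictions, iou_thresh=0.45):
    """Generic greedy NMS: keep the highest-scoring box, drop same-class overlaps."""
    kept = []
    for pred in sorted(predictions, key=lambda p: p[1], reverse=True):
        if all(pred[0] != k[0] or iou(pred[2], k[2]) <= iou_thresh for k in kept):
            kept.append(pred)
    return kept
```

Note that this sketch emits at most one entry per box (the top class), whereas a loop over all classes would emit one entry for every class with nonzero probability.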
You will notice that all the changes in this fork were already posted in this issue. I am the person who opened that issue, but I was using my work account at the time. I will post a comment from that account to prove it. I always advertise my personal GitHub account on my public LinkedIn, which is why I am using it, and not my work account, to make this commit.