⚡️ Speed up function `select_rightmost_detection` by 70% #1094
+3
−3
Saurabh's comments - For most practical purposes this should behave exactly as before, but there is one edge case where the behavior can differ if we are being strict. If multiple detections share exactly the same x-coordinate center point, the previous code returned the one that appears last (rightmost) in the detection array. The new code returns, among the detections that share the rightmost x-coordinate center, the one that appears first (leftmost) in the array.
Both results still satisfy the "rightmost detection" logic, and such ties should rarely, if ever, happen. You can decide whether this is acceptable. If it is not, you can close this PR.
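The tie-breaking difference described above can be reproduced with plain NumPy. This is only a sketch: the `centers_x` values are made up, and the "previous" expression is an assumption based on the `max()` plus `argwhere()` combination mentioned in this PR, not a copy of the original code.

```python
import numpy as np

# Hypothetical center x-coordinates; indices 1 and 3 tie for the maximum.
centers_x = np.array([10.0, 42.0, 7.0, 42.0])

# Assumed shape of the previous logic: collect every index equal to the
# maximum, then keep the last one in array order.
previous_index = int(np.argwhere(centers_x == centers_x.max())[-1][0])

# New logic: argmax returns the first occurrence of the maximum.
new_index = int(np.argmax(centers_x))

print(previous_index, new_index)  # 3 1
```

Both indices point at a detection whose center x equals the maximum, so either choice is a valid "rightmost" detection. Only the position within the array differs.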
📄 70% (0.70x) speedup for `select_rightmost_detection` in `inference/core/workflows/core_steps/common/query_language/operations/detections/base.py`
⏱️ Runtime: 1.04 milliseconds → 610 microseconds (best of 473 runs)
📝 Explanation and details
You can optimize the function by removing redundant operations and using NumPy's efficient methods. Instead of calling `deepcopy` when detections are empty, return the input directly, and use NumPy indexing to avoid unnecessary data copying. Here is the optimized version.
Explanation.
- Instead of `deepcopy`, the original detections object is returned to handle the empty detections case.
- `argmax()` is called directly on `centers_x` to get the index of the maximum value, eliminating the combination of `max()` and `argwhere()`, which is redundant and slower than `argmax()`.
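The two points above can be sketched without the detections class itself. The helper below is hypothetical: `rightmost_index` and the `(N, 4)` `xyxy` box layout are assumptions made for illustration, and the real function operates on a detections object rather than a raw array.

```python
from typing import Optional

import numpy as np


def rightmost_index(xyxy: np.ndarray) -> Optional[int]:
    """Hypothetical helper: index of the box with the largest center x, or None if empty."""
    if len(xyxy) == 0:
        # Empty input: nothing to select, and no copy is made.
        return None
    # Center x of each box is the midpoint of its left and right edges.
    centers_x = (xyxy[:, 0] + xyxy[:, 2]) / 2.0
    # A single argmax pass replaces a max() followed by an argwhere().
    return int(np.argmax(centers_x))


boxes = np.array(
    [[0, 0, 10, 10], [50, 0, 70, 10], [20, 0, 30, 10]], dtype=float
)
print(rightmost_index(boxes))             # 1
print(rightmost_index(np.empty((0, 4))))  # None
```

The single `argmax` call scans the array once, whereas `max()` followed by an equality comparison and `argwhere()` scans it several times and allocates intermediate arrays.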
✅ Correctness verification report:
🌀 Generated Regression Tests Details