⚡️ Speed up function select_rightmost_detection by 70% #1094

Open
wants to merge 2 commits into
base: main

misrasaurabh1

Saurabh's comments: For most practical purposes this should behave exactly as before, but there is one edge case where the behavior differs if we are strict. If multiple detections share exactly the same center x-coordinate, the previous code returned the one that comes last in the detections array. The new code returns the one that comes first among the detections sharing the rightmost center x-coordinate.

Both results still satisfy the "rightmost detection" logic, and such ties should rarely, if ever, happen. You can decide whether this is acceptable; if it is not, you can close this PR.
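The tie-breaking difference can be illustrated with plain NumPy. This is a minimal sketch: the array values are made up, and the `max()` plus `argwhere()` expression is an assumed reconstruction of the previous approach based on the description in this PR, not a copy of the original code.

```python
import numpy as np

# Hypothetical center x-coordinates; indices 1 and 3 tie for the maximum.
centers_x = np.array([10.0, 40.0, 25.0, 40.0])

# Assumed previous approach: collect every index equal to the max, keep the last one.
previous_index = int(np.argwhere(centers_x == centers_x.max())[-1][0])

# New approach: argmax() returns the first occurrence of the maximum.
new_index = int(np.argmax(centers_x))

print(previous_index, new_index)
```

Both indices point at a detection whose center x-coordinate equals the maximum, so either result qualifies as "rightmost"; only the choice among tied detections changes.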

📄 70% (0.70x) speedup for select_rightmost_detection in inference/core/workflows/core_steps/common/query_language/operations/detections/base.py

⏱️ Runtime : 1.04 milliseconds → 610 microseconds (best of 473 runs)
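A rough way to compare the two indexing strategies locally is a `timeit` micro-benchmark. This is only a sketch under assumptions: the array size, the random data, and the two helper functions are hypothetical, and it does not reproduce the benchmark harness behind the numbers above.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
centers_x = rng.random(1000)  # hypothetical workload: 1000 center x-coordinates


def with_max_and_argwhere():
    # Assumed shape of the previous logic: two passes plus a boolean mask.
    return int(np.argwhere(centers_x == centers_x.max())[-1][0])


def with_argmax():
    # Optimized logic: a single pass over the array.
    return int(np.argmax(centers_x))


runs = 2000
t_old = timeit.timeit(with_max_and_argwhere, number=runs)
t_new = timeit.timeit(with_argmax, number=runs)
print(f"max+argwhere: {t_old / runs * 1e6:.2f} us per call")
print(f"argmax:       {t_new / runs * 1e6:.2f} us per call")
```

Absolute timings depend on hardware, NumPy version, and input size, so the ratio may differ from the reported speedup.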

📝 Explanation and details

You can optimize the function by removing redundant operations and using numpy's efficient methods. Instead of calling deepcopy when detections are empty, return the object as-is to avoid unnecessary data copying, and use numpy indexing to pick the result. Here is the optimized version.

Explanation.

  1. Edge Case Optimization: When detections are empty, the original detections object is returned instead of a deepcopy of it.
  2. Numpy Optimization: argmax() is called directly on centers_x to get the index of the maximum value. This replaces the combination of max() and argwhere(), which is redundant and slower than argmax().
  3. Code clarity is preserved while performance improves.
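The points above can be sketched as follows. The PR text does not include the function bodies, so both versions are assumed reconstructions from the explanation, and `FakeDetections` is a hypothetical stand-in for `sv.Detections` (modeled on the mock used in the generated tests) so the example runs without the `supervision` package.

```python
from copy import deepcopy

import numpy as np


class FakeDetections:
    """Hypothetical stand-in for sv.Detections that stores only center points."""

    def __init__(self, centers):
        self.centers = np.asarray(centers, dtype=float)

    def __len__(self):
        return len(self.centers)

    def __getitem__(self, index):
        return self.centers[index]

    def get_anchors_coordinates(self, anchor=None):
        # The anchor argument is ignored here; centers are stored directly.
        return self.centers


def select_rightmost_before(detections):
    # Assumed earlier logic: deepcopy on empty input, then max() + argwhere().
    if len(detections) == 0:
        return deepcopy(detections)
    centers_x = detections.get_anchors_coordinates()[:, 0]
    index = np.argwhere(centers_x == centers_x.max())[-1][0]
    return detections[index]


def select_rightmost_after(detections):
    # Optimized logic as described: return empty input as-is, single argmax().
    if len(detections) == 0:
        return detections
    centers_x = detections.get_anchors_coordinates()[:, 0]
    return detections[int(np.argmax(centers_x))]
```

One consequence of the first point is that, for empty input, the caller now receives the same object it passed in rather than a copy, so later mutation of the result would also affect the input.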

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 29 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests Details
from copy import deepcopy

import numpy as np
# imports
import pytest  # used for our unit tests
import supervision as sv
from inference.core.workflows.core_steps.common.query_language.operations.detections.base import \
    select_rightmost_detection
from supervision import Position


# Mocking sv.Detections and Position for testing
class MockDetections:
    def __init__(self, coordinates):
        self.coordinates = coordinates

    def get_anchors_coordinates(self, anchor):
        return np.array(self.coordinates)

    def __len__(self):
        return len(self.coordinates)

    def __getitem__(self, index):
        return self.coordinates[index]

@pytest.fixture
def mock_detections():
    return MockDetections

# unit tests
def test_single_detection(mock_detections):
    # Single detection
    detections = mock_detections([[5, 5]])
    codeflash_output = select_rightmost_detection(detections)

def test_multiple_distinct_x(mock_detections):
    # Multiple detections with distinct x-coordinates
    detections = mock_detections([[1, 1], [2, 2], [3, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_multiple_identical_x(mock_detections):
    # Multiple detections with some identical x-coordinates
    detections = mock_detections([[1, 1], [2, 2], [2, 2], [3, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_empty_detections(mock_detections):
    # Empty detections list
    detections = mock_detections([])
    codeflash_output = select_rightmost_detection(detections)

def test_all_same_x(mock_detections):
    # All detections with the same x-coordinate
    detections = mock_detections([[2, 2], [2, 2], [2, 2]])
    codeflash_output = select_rightmost_detection(detections)

def test_negative_x(mock_detections):
    # Detections with negative x-coordinates
    detections = mock_detections([[-1, -1], [-2, -2], [-3, -3]])
    codeflash_output = select_rightmost_detection(detections)

def test_zero_x(mock_detections):
    # Detections with zero x-coordinates
    detections = mock_detections([[0, 0], [0, 0], [0, 0]])
    codeflash_output = select_rightmost_detection(detections)

def test_large_x(mock_detections):
    # Detections with very large x-coordinates
    detections = mock_detections([[1e6, 1], [1e7, 2], [1e8, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_small_x(mock_detections):
    # Detections with very small (close to zero) x-coordinates
    detections = mock_detections([[1e-6, 1], [1e-7, 2], [1e-8, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_mixed_x(mock_detections):
    # Mixed positive and negative x-coordinates
    detections = mock_detections([[-1, -1], [0, 0], [1, 1]])
    codeflash_output = select_rightmost_detection(detections)

def test_floating_point_x(mock_detections):
    # Detections with floating-point x-coordinates
    detections = mock_detections([[1.1, 1], [2.2, 2], [3.3, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_large_scale(mock_detections):
    # Large number of detections
    detections = mock_detections([[i, i] for i in range(1000)])
    codeflash_output = select_rightmost_detection(detections)

def test_very_large_scale(mock_detections):
    # Very large number of detections
    detections = mock_detections([[i, i] for i in range(1000)])
    codeflash_output = select_rightmost_detection(detections)

def test_non_integer_x(mock_detections):
    # Non-integer x-coordinates
    detections = mock_detections([[1.5, 1], [2.5, 2], [3.5, 3]])
    codeflash_output = select_rightmost_detection(detections)


def test_inf_x(mock_detections):
    # Detections with infinite x-coordinates
    detections = mock_detections([[1, 1], [np.inf, 2], [3, 3]])
    codeflash_output = select_rightmost_detection(detections)

def test_negative_inf_x(mock_detections):
    # Detections with negative infinite x-coordinates
    detections = mock_detections([[-np.inf, -1], [-1, -2], [0, 0]])
    codeflash_output = select_rightmost_detection(detections)

def test_mixed_valid_invalid_x(mock_detections):
    # Mixed valid and invalid x-coordinates
    detections = mock_detections([[1, 1], ['invalid', 2], [3, 3], [np.nan, 4]])
    with pytest.raises(TypeError):
        select_rightmost_detection(detections)

def test_none_x(mock_detections):
    # Detections with some x-coordinates as None
    detections = mock_detections([[1, 1], [None, 2], [3, 3]])
    with pytest.raises(TypeError):
        select_rightmost_detection(detections)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from copy import deepcopy
from unittest.mock import MagicMock

import numpy as np
# imports
import pytest  # used for our unit tests
import supervision as sv
from inference.core.workflows.core_steps.common.query_language.operations.detections.base import \
    select_rightmost_detection
from supervision import Position


# Mocking sv.Detections and Position for testing purposes
class MockDetections:
    def __init__(self, detections):
        self.detections = detections

    def __len__(self):
        return len(self.detections)

    def get_anchors_coordinates(self, anchor):
        return np.array(self.detections)

    def __getitem__(self, index):
        return self.detections[index]

# unit tests
def test_empty_detections():
    detections = MockDetections([])
    codeflash_output = select_rightmost_detection(detections)

def test_single_detection():
    detections = MockDetections([[10, 20]])
    codeflash_output = select_rightmost_detection(detections)

def test_multiple_detections_distinct_x():
    detections = MockDetections([[10, 20], [20, 30], [30, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_multiple_detections_identical_x():
    detections = MockDetections([[10, 20], [10, 30], [10, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_multiple_detections_some_identical_x():
    detections = MockDetections([[10, 20], [10, 30], [20, 40], [20, 50]])
    codeflash_output = select_rightmost_detection(detections)

def test_detections_negative_x():
    detections = MockDetections([[-10, 20], [-5, 30], [-1, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_detections_mixed_x():
    detections = MockDetections([[-10, 20], [0, 30], [10, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_large_number_of_detections():
    detections = MockDetections([[i, i+10] for i in range(-500, 500)])
    codeflash_output = select_rightmost_detection(detections)

def test_floating_point_x():
    detections = MockDetections([[10.5, 20], [20.25, 30], [30.75, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_very_large_x():
    detections = MockDetections([[1e10, 20], [2e10, 30], [3e10, 40]])
    codeflash_output = select_rightmost_detection(detections)

def test_very_small_x():
    detections = MockDetections([[1e-10, 20], [2e-10, 30], [3e-10, 40]])
    codeflash_output = select_rightmost_detection(detections)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.


CLAassistant commented Mar 21, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ misrasaurabh1
❌ codeflash-ai[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

@PawelPeczek-Roboflow
Collaborator

Could you please sign the CLA?
