Add area_measurement workflow block #2013

Open
jeku46 wants to merge 2 commits into main from feat/mask-measurement-block

jeku46 (Contributor) commented Feb 17, 2026

What does this PR do?

Add area_measurement workflow block

New block that calculates the area of detected objects in square pixels.
It uses the polygon area (cv2.contourArea) for instance segmentation masks
and falls back to bounding box width * height for object detection predictions.
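
The two measures described above can be illustrated without OpenCV. This is a minimal sketch under stated assumptions, not the block's implementation: `polygon_area`, `bbox_area`, and `detection_area` are hypothetical helpers, and the shoelace formula stands in for cv2.contourArea, which the OpenCV documentation describes as using the Green formula. The example also shows why a contour-based area can be smaller than the pixel count of the mask, assuming the contour runs through the coordinates of the boundary pixels.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Shoelace formula: absolute area enclosed by ordered (x, y) vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def bbox_area(xyxy: Sequence[float]) -> float:
    """Width * height of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = xyxy
    return float((x2 - x1) * (y2 - y1))


def detection_area(xyxy: Sequence[float], polygon: Optional[Sequence[Point]] = None) -> float:
    """Hypothetical helper: polygon area when a polygon exists, bbox area otherwise."""
    if polygon:
        return polygon_area(polygon)
    return bbox_area(xyxy)


# A filled 4x4 pixel block occupying cols 3..6 and rows 2..5 (16 pixels).
# Its boundary pixels have these corner coordinates:
square = [(3, 2), (6, 2), (6, 5), (3, 5)]

contour_based = detection_area((0, 0, 100, 100), square)  # 3 * 3 = 9.0, not 16
bbox_based = detection_area((0, 0, 10, 5))  # 10 * 5 = 50.0
```

If pixel-exact areas matter, counting non-zero mask pixels is a different measure from a contour area and the two should not be mixed in one output.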

Related Issue(s):

Type of Change

  • New feature (non-breaking change that adds functionality)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:
I trained a model to detect rubber ducks. Then I ran a workflow with this block and verified that the calculated area was reasonable.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

jeku46 force-pushed the feat/mask-measurement-block branch 6 times, most recently from 5c8f528 to 0cfcffe (February 18, 2026 16:29)
jeku46 marked this pull request as ready for review (February 18, 2026 17:22)
Comment on lines 180 to 182
areas = []
for i in range(len(predictions)):
    areas.append(get_detection_area(predictions, i))
Contributor commented:

⚡️ Codeflash found a 65% (0.65x) speedup for MaskAreaMeasurementBlockV1.run in inference/core/workflows/core_steps/classical_cv/mask_area_measurement/v1.py

⏱️ Runtime: 3.31 milliseconds → 2.00 milliseconds (best of 250 runs)

📝 Explanation and details

The optimized code achieves a 65% speedup by adding a fast path for detections without masks in the run method. Here's why it's faster:

Key Optimization:
The code adds a conditional check at the start of run() to detect when predictions.mask is None. When masks are absent, it bypasses the per-detection loop and uses vectorized NumPy operations instead:

# Fast path for bbox-only detections
xyxy = predictions.xyxy
widths = xyxy[:, 2] - xyxy[:, 0]
heights = xyxy[:, 3] - xyxy[:, 1]
areas = (widths * heights).tolist()

Why This is Faster:

  1. Vectorization vs Iteration: NumPy's vectorized operations process all bounding boxes simultaneously using optimized C code, while the original version iterates through each detection individually with Python loops and function calls.

  2. Eliminates Function Call Overhead: The original code calls get_detection_area() for every detection (2,484 times in the profiler), each incurring Python function call overhead (~7.5μs per call). The optimized version eliminates this entirely for the no-mask case.

  3. Memory Access Patterns: Vectorized NumPy operations read the coordinate columns in a single pass over the array, whereas repeated per-row indexing (bbox = detection.xyxy[index]) creates a new array view on every iteration.
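
The loop and the vectorized fast path described above can be compared side by side. This is a sketch rather than the block's code: `bbox_areas_loop` and `bbox_areas_vectorized` are hypothetical names, and the loop only approximates what a per-detection helper would do for bbox-only input.

```python
import numpy as np


def bbox_areas_loop(xyxy: np.ndarray) -> list:
    """Per-detection version: one Python iteration per box."""
    areas = []
    for i in range(len(xyxy)):
        x1, y1, x2, y2 = xyxy[i]
        areas.append(float((x2 - x1) * (y2 - y1)))
    return areas


def bbox_areas_vectorized(xyxy: np.ndarray) -> list:
    """Fast-path version: column arithmetic over all boxes at once."""
    widths = xyxy[:, 2] - xyxy[:, 0]
    heights = xyxy[:, 3] - xyxy[:, 1]
    return (widths * heights).tolist()


boxes = np.array(
    [
        [0.0, 0.0, 10.0, 5.0],  # 10 * 5
        [5.0, 5.0, 15.0, 20.0],  # 10 * 15
        [50.0, 50.0, 10.0, 100.0],  # inverted x coordinates: negative width
    ]
)

loop_areas = bbox_areas_loop(boxes)
vector_areas = bbox_areas_vectorized(boxes)
empty_areas = bbox_areas_vectorized(np.empty((0, 4), dtype=np.float32))
```

Both versions return the same values, including a negative area for the inverted box, so the fast path keeps the loop's behavior for malformed input rather than correcting it.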

Performance Impact:

From the line profiler, the critical improvement is visible in test cases with many bbox-only detections:

  • 1000 detections (bbox only): 652μs → 25.3μs (2,475% faster)
  • 100 detections (bbox only): 67.7μs → 11.2μs (506% faster)
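
The percentages in this report appear to follow the convention sketched below. The convention is inferred from the reported numbers, not taken from Codeflash documentation, and `percent_faster` and `percent_slower` are hypothetical helper names. Small differences from the reported figures are expected because the displayed timings are rounded.

```python
def percent_faster(original: float, optimized: float) -> float:
    """Speedup relative to the optimized runtime, as a percentage."""
    return (original / optimized - 1.0) * 100.0


def percent_slower(original: float, optimized: float) -> float:
    """Slowdown expressed as the share of the new runtime that was added."""
    return (1.0 - original / optimized) * 100.0


overall = percent_faster(3.31, 2.00)  # about 65.5, reported as 65%
large_batch = percent_faster(652.0, 25.3)  # about 2477, reported as 2,475%
single_box = percent_slower(4.86, 9.20)  # about 47.2, reported as 47.2% slower
```

The last line uses a timing from the generated regression tests, where a single bbox-only detection runs slower on the optimized path.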

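For mask-based detections, performance remains similar because the code still uses the sequential get_detection_area() path (masks require cv.findContours(), which cannot easily be vectorized). For very small bbox-only inputs, the generated tests below report the optimized path as slower in absolute terms (for example, 4.86μs → 9.20μs for a single detection).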

What Makes This Beneficial:

This optimization is particularly effective when:

  • Processing batch predictions from object detection models that output only bounding boxes (no segmentation masks)
  • The function is called in tight loops or high-throughput pipelines
  • Working with large numbers of detections per frame/image

The speedup grows with the number of detections (roughly 6x at 100 detections and roughly 26x at 1,000 in the timings above), making it increasingly valuable in production scenarios with many objects per image.

Correctness verification report:

Test | Status
⏪ Replay Tests | 🔘 None Found
⚙️ Existing Unit Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
🌀 Generated Regression Tests | 68 Passed
📊 Tests Coverage | 100.0%

🌀 Generated Regression Tests
import cv2 as cv  # used implicitly by the code under test and for creating masks
import numpy as np  # used to construct numeric arrays for detections and masks

# imports
import pytest  # used for our unit tests

# import the real Detections class from the supervision package (used by the implementation)
import supervision as sv

# import the actual implementation under test and helpers from the original module path
from inference.core.workflows.core_steps.classical_cv.mask_area_measurement.v1 import (
    OUTPUT_KEY,
    MaskAreaMeasurementBlockV1,
    get_detection_area,
)


def _make_bbox_array(list_of_boxes):
    """Helper to convert a list of [x1,y1,x2,y2] boxes into an np.ndarray shape (N,4)."""
    return np.array(list_of_boxes, dtype=float)


def _make_mask_stack(list_of_2d_masks):
    """Helper to stack a list of 2D masks (H x W) into a (N, H, W) array as expected by the code."""
    # Ensure masks are uint8 (values 0 or 255) which cv.findContours expects
    stack = np.stack([np.array(m, dtype=np.uint8) for m in list_of_2d_masks], axis=0)
    return stack


def test_single_bbox_without_mask_uses_bbox_area():
    # Create a single bbox: (0,0) to (10,5) -> width=10 height=5 => area 50
    xyxy = _make_bbox_array([[0, 0, 10, 5]])
    detections = sv.Detections(xyxy=xyxy)  # construct a real Detections instance
    block = MaskAreaMeasurementBlockV1()  # create the real block instance

    # Run the block and assert the returned dict contains the expected areas list
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.86μs -> 9.20μs (47.2% slower)


def test_single_mask_with_contour_uses_contour_area():
    # Create a 10x10 mask with a filled rectangle covering rows 2..5 and cols 3..6 (4x4 => 16 pixels)
    mask = np.zeros((10, 10), dtype=np.uint8)
    mask[2:6, 3:7] = 255  # filled rectangle, area of 16 pixels
    stacked = _make_mask_stack([mask])

    # Provide a bbox that would have larger area if used, to ensure mask contour takes precedence
    xyxy = _make_bbox_array([[0, 0, 100, 100]])
    detections = sv.Detections(xyxy=xyxy, mask=stacked)
    # Directly use the helper function too to exercise both code paths
    area_via_helper = get_detection_area(detections, 0)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 12.2μs -> 12.6μs (3.88% slower)


def test_mask_without_contours_falls_back_to_bbox():
    # A mask with all zeros yields no contours; should fall back to bbox area
    mask = np.zeros((8, 8), dtype=np.uint8)  # empty mask
    stacked = _make_mask_stack([mask])

    xyxy = _make_bbox_array([[1, 2, 6, 7]])  # width=5 height=5 => area=25
    detections = sv.Detections(xyxy=xyxy, mask=stacked)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 13.9μs -> 14.3μs (2.73% slower)


def test_zero_sized_bbox_returns_zero_area():
    # width is zero (x1 == x2)
    xyxy = _make_bbox_array([[5, 5, 5, 10]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.33μs -> 9.01μs (51.9% slower)


def test_reversed_coordinates_can_produce_negative_area():
    # Construct a bbox where x2 < x1 but y2 > y1 so width negative, height positive -> negative area
    xyxy = _make_bbox_array(
        [[10, 0, 5, 10]]
    )  # width = 5-10 = -5, height = 10 -> area = -50
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.07μs -> 7.93μs (48.7% slower)


def test_multiple_mixed_detections_combined_behavior():
    # Prepare three detections:
    # 1) Has a mask with a 2x2 square -> area 4
    # 2) Has an empty mask -> fallback to bbox area 6*6=36
    # 3) Empty mask (stands in for "no mask") -> fallback to bbox area 3*7=21
    mask1 = np.zeros((6, 6), dtype=np.uint8)
    mask1[1:3, 1:3] = 255  # 2x2 square -> area 4
    mask2 = np.zeros((6, 6), dtype=np.uint8)  # empty, fallback expected

    masks = _make_mask_stack([mask1, mask2])  # two masks for first two detections
    # The third detection gets an empty mask further down, so it also falls back to its bbox area.
    xyxy_all = _make_bbox_array(
        [
            [
                0,
                0,
                10,
                10,
            ],  # large bbox but masked -> contour area should be used for index 0
            [1, 1, 7, 7],  # empty mask -> fallback to bbox area 36
            [2, 2, 5, 9],  # empty mask -> fallback to bbox area 3*7=21
        ]
    )

    # supervision.Detections generally expects mask to have the same length as xyxy, so a
    # single Detections object cannot mix masked and mask-less entries. To stay robust across
    # supervision versions, provide a mask for every entry and leave the third one empty.
    mask3 = np.zeros((6, 6), dtype=np.uint8)  # empty mask for third detection
    combined_masks = _make_mask_stack([mask1, mask2, mask3])

    detections = sv.Detections(xyxy=xyxy_all, mask=combined_masks)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 29.8μs -> 31.0μs (3.85% slower)

    # Expected areas: index0 -> contour area 4, index1 -> fallback bbox (7-1)*(7-1)=36, index2 -> fallback bbox (5-2)*(9-2)=3*7=21
    expected = [pytest.approx(4.0, rel=1e-6), 36.0, 21.0]


def test_large_scale_bbox_only_many_detections():
    # Create 1000 detections with increasing bbox sizes: box i -> (0,0,i,i) -> area = i*i
    n = 1000
    boxes = [[0, 0, float(i), float(i)] for i in range(n)]
    xyxy = _make_bbox_array(boxes)
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()

    codeflash_output = block.run(detections)
    result = codeflash_output  # 640μs -> 24.6μs (2506% faster)
    mid_index = n // 2


def test_large_scale_masks_many_detections_performance_and_correctness():
    # Create 200 detections (reasonable for test environments) each with a small mask of 8x8
    # This exercises the contour-finding code path many times.
    n = 200
    masks = []
    boxes = []
    for i in range(n):
        # Create a small filled square whose size varies with i but clipped to [1,7]
        size = 1 + (i % 7)
        m = np.zeros((8, 8), dtype=np.uint8)
        # place square at top-left corner
        m[0:size, 0:size] = 255
        masks.append(m)
        # bbox large enough so contour area will be used instead of bbox area
        boxes.append([0, 0, 100.0, 100.0])

    xyxy = _make_bbox_array(boxes)
    stacked_masks = _make_mask_stack(masks)

    detections = sv.Detections(xyxy=xyxy, mask=stacked_masks)
    block = MaskAreaMeasurementBlockV1()

    codeflash_output = block.run(detections)
    result = codeflash_output  # 1.13ms -> 1.12ms (1.20% faster)
    # Check first few and last few values match expected sizes squared
    for i in [0, 1, 6, 7, 13, n - 1]:
        expected_size = 1 + (i % 7)
        expected_area = float(expected_size * expected_size)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import cv2 as cv
import numpy as np

# imports
import pytest
import supervision as sv
from inference.core.workflows.core_steps.classical_cv.mask_area_measurement.v1 import (
    OUTPUT_KEY,
    MaskAreaMeasurementBlockV1,
    get_detection_area,
)


def test_run_single_detection_with_bbox_only():
    """Test run method with a single detection without segmentation mask."""
    # Create a detection with a bounding box but no mask
    xyxy = np.array([[10.0, 20.0, 50.0, 80.0]])  # x1, y1, x2, y2
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 5.20μs -> 9.83μs (47.1% slower)


def test_run_multiple_detections_bbox_only():
    """Test run method with multiple detections without segmentation masks."""
    # Create detections with bounding boxes only
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],  # area = 10 * 10 = 100
            [5.0, 5.0, 15.0, 20.0],  # area = 10 * 15 = 150
            [100.0, 100.0, 110.0, 110.0],  # area = 10 * 10 = 100
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 6.28μs -> 8.82μs (28.7% slower)


def test_run_single_detection_with_mask():
    """Test run method with a single detection including a segmentation mask."""
    # Create a simple rectangular mask
    mask = np.zeros((100, 100), dtype=bool)
    mask[20:50, 30:70] = True  # 30x40 rectangle = 1200 pixels

    xyxy = np.array([[20.0, 20.0, 70.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 27.1μs -> 27.6μs (1.85% slower)


def test_run_multiple_detections_with_masks():
    """Test run method with multiple detections with segmentation masks."""
    # Create two empty masks
    mask1 = np.zeros((100, 100), dtype=bool)
    mask2 = np.zeros((100, 100), dtype=bool)

    # Simple square masks for predictable areas
    mask1[10:40, 10:40] = True  # 30x30 = 900
    mask2[50:80, 50:80] = True  # 30x30 = 900

    xyxy = np.array([[10.0, 10.0, 40.0, 40.0], [50.0, 50.0, 80.0, 80.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask1, mask2]))

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 37.9μs -> 38.2μs (0.733% slower)


def test_get_detection_area_with_bbox_only():
    """Test get_detection_area helper function with bbox-only detection."""
    xyxy = np.array([[10.0, 10.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy)

    area = get_detection_area(detections, 0)


def test_get_detection_area_with_mask():
    """Test get_detection_area helper function with mask."""
    mask = np.zeros((100, 100), dtype=bool)
    mask[20:60, 30:70] = True  # 40x40 = 1600

    xyxy = np.array([[20.0, 20.0, 70.0, 60.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_run_returns_dict_with_correct_key():
    """Test that run method returns a dictionary with the correct key."""
    xyxy = np.array([[0.0, 0.0, 5.0, 5.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.95μs -> 9.54μs (48.1% slower)


def test_run_empty_detections():
    """Test run method with empty detections object."""
    # Create empty detections
    xyxy = np.empty((0, 4), dtype=np.float32)
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 2.13μs -> 7.73μs (72.4% slower)


def test_run_zero_area_bbox():
    """Test with detection having zero area bbox (single point or line)."""
    # Point detection: same coordinates
    xyxy = np.array([[10.0, 10.0, 10.0, 10.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.63μs -> 8.49μs (45.5% slower)


def test_run_very_small_bbox():
    """Test with very small bounding box (fractional coordinates)."""
    xyxy = np.array([[0.0, 0.0, 0.5, 0.5]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.49μs -> 8.15μs (45.0% slower)


def test_run_large_bbox():
    """Test with very large bounding box."""
    xyxy = np.array([[0.0, 0.0, 10000.0, 10000.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.43μs -> 7.99μs (44.5% slower)


def test_get_detection_area_mask_with_empty_contours():
    """Test get_detection_area when mask has no contours (empty mask)."""
    # Create an empty mask (all zeros)
    mask = np.zeros((100, 100), dtype=bool)

    xyxy = np.array([[10.0, 10.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_get_detection_area_mask_with_zero_area_contour():
    """Test get_detection_area with a mask that has zero area contours."""
    # Create a mask with only a line (contour with zero area)
    mask = np.zeros((100, 100), dtype=np.uint8)
    mask[50, 10:90] = 1  # Horizontal line

    xyxy = np.array([[10.0, 40.0, 90.0, 60.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask.astype(bool)]))

    area = get_detection_area(detections, 0)


def test_run_negative_coordinates():
    """Test with negative bounding box coordinates."""
    xyxy = np.array([[-50.0, -50.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.82μs -> 9.10μs (47.0% slower)


def test_run_inverted_bbox_coordinates():
    """Test with inverted bbox coordinates (x2 < x1)."""
    # This tests robustness; result may be negative area
    xyxy = np.array([[50.0, 50.0, 10.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.43μs -> 8.20μs (46.0% slower)


def test_get_detection_area_mask_single_pixel():
    """Test get_detection_area with mask containing single pixel."""
    mask = np.zeros((100, 100), dtype=bool)
    mask[50, 50] = True  # Single pixel

    xyxy = np.array([[45.0, 45.0, 55.0, 55.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_run_detection_at_image_boundary():
    """Test detection positioned at image boundaries."""
    # Detection at top-left corner
    xyxy = np.array([[0.0, 0.0, 100.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.81μs -> 8.98μs (46.4% slower)


def test_run_rectangular_mask_vs_bbox_fallback():
    """Test that mask area is used when mask exists, not bbox fallback."""
    # Create a small mask within a larger bbox
    mask = np.zeros((200, 200), dtype=bool)
    mask[50:100, 50:100] = True  # 50x50 = 2500

    # Larger bbox
    xyxy = np.array([[0.0, 0.0, 200.0, 200.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_run_many_detections_bbox_only():
    """Test run method with 100 detections."""
    # Create 100 detections at varying positions (each box is 50x50)
    xyxy = np.array(
        [
            [float(i * 10), float(i * 10), float(i * 10 + 50), float(i * 10 + 50)]
            for i in range(100)
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 67.7μs -> 11.2μs (506% faster)
    # All should have area 50*50 = 2500
    for area in result[OUTPUT_KEY]:
        pass


def test_run_many_detections_with_masks():
    """Test run method with 50 detections with masks."""
    masks = []
    xyxy_list = []

    for i in range(50):
        # Create a mask for each detection
        mask = np.zeros((100, 100), dtype=bool)
        y_start = (i % 10) * 5
        x_start = (i // 10) * 10
        mask[y_start : y_start + 20, x_start : x_start + 20] = True  # 20x20 = 400
        masks.append(mask)
        xyxy_list.append(
            [float(x_start), float(y_start), float(x_start + 30), float(y_start + 30)]
        )

    xyxy = np.array(xyxy_list)
    detections = sv.Detections(xyxy=xyxy, mask=np.array(masks))

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 483μs -> 476μs (1.51% faster)
    # Each mask has area 400
    for area in result[OUTPUT_KEY]:
        pass


def test_run_1000_detections_bbox_only():
    """Test run method with 1000 detections for performance."""
    xyxy = np.array(
        [[float(i), float(i), float(i + 10), float(i + 10)] for i in range(1000)]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 652μs -> 25.3μs (2475% faster)


def test_run_mixed_mask_and_bbox_detections():
    """Test with some detections having masks and others not."""
    # Create detections: first 3 with masks, last 2 without
    masks_list = [
        np.zeros((100, 100), dtype=bool),
        np.zeros((100, 100), dtype=bool),
        np.zeros((100, 100), dtype=bool),
    ]
    # Add masks to specific regions
    masks_list[0][20:40, 20:40] = True  # 20x20 = 400
    masks_list[1][30:60, 30:60] = True  # 30x30 = 900
    masks_list[2][10:50, 10:50] = True  # 40x40 = 1600

    # Pad with False masks for the other detections
    all_masks = masks_list + [
        np.zeros((100, 100), dtype=bool),
        np.zeros((100, 100), dtype=bool),
    ]

    xyxy = np.array(
        [
            [10.0, 10.0, 50.0, 50.0],  # with mask: 400
            [20.0, 20.0, 80.0, 80.0],  # with mask: 900
            [0.0, 0.0, 100.0, 100.0],  # with mask: 1600
            [0.0, 0.0, 100.0, 100.0],  # empty mask, bbox area: 10000
            [50.0, 50.0, 100.0, 100.0],  # empty mask, bbox area: 2500
        ]
    )

    detections = sv.Detections(xyxy=xyxy, mask=np.array(all_masks))

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 64.1μs -> 64.4μs (0.512% slower)


def test_run_large_masks():
    """Test with large resolution masks (1000x1000)."""
    mask = np.zeros((1000, 1000), dtype=bool)
    mask[100:900, 100:900] = True  # 800x800 = 640000

    xyxy = np.array([[0.0, 0.0, 1000.0, 1000.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_run_many_detections_with_various_shapes():
    """Test 100 detections with different aspect ratios."""
    xyxy = np.array(
        [
            (
                [float(i * 5), float(i * 2), float(i * 5 + 100), float(i * 2 + 10)]
                if i % 2 == 0
                else [float(i * 5), float(i * 2), float(i * 5 + 10), float(i * 2 + 100)]
            )
            for i in range(100)
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 67.5μs -> 11.9μs (467% faster)


def test_get_detection_area_index_boundaries():
    """Test get_detection_area with different index values."""
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],
            [20.0, 20.0, 40.0, 40.0],
            [50.0, 50.0, 80.0, 80.0],
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    # Test first, middle, and last indices
    area0 = get_detection_area(detections, 0)
    area1 = get_detection_area(detections, 1)
    area2 = get_detection_area(detections, 2)


def test_run_preserves_order_of_areas():
    """Test that areas are returned in the same order as input detections."""
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],  # area = 100
            [0.0, 0.0, 5.0, 5.0],  # area = 25
            [0.0, 0.0, 20.0, 20.0],  # area = 400
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 6.31μs -> 9.32μs (32.3% slower)


def test_run_returns_float_values():
    """Test that all returned areas are float type."""
    xyxy = np.array(
        [
            [0.0, 0.0, 5.0, 5.0],
            [10.0, 10.0, 20.0, 20.0],
        ]
    )
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 5.34μs -> 8.52μs (37.3% slower)

    for area in result[OUTPUT_KEY]:
        pass


def test_run_with_integer_coordinates():
    """Test run with integer coordinates in detection bboxes."""
    xyxy = np.array([[10, 10, 30, 40]], dtype=np.int32)
    detections = sv.Detections(xyxy=xyxy.astype(np.float32))

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.52μs -> 8.39μs (46.1% slower)


def test_run_with_float32_coordinates():
    """Test run with float32 precision coordinates."""
    xyxy = np.array([[10.5, 20.5, 50.7, 80.3]], dtype=np.float32)
    detections = sv.Detections(xyxy=xyxy)

    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.50μs -> 8.34μs (46.0% slower)

    # Expected: (50.7-10.5) * (80.3-20.5) = 40.2 * 59.8 ≈ 2403.96
    expected = (50.7 - 10.5) * (80.3 - 20.5)


def test_run_complex_polygon_mask():
    """Test with a complex polygon mask (L-shaped)."""
    mask = np.zeros((100, 100), dtype=bool)
    # Create L-shaped mask
    mask[10:60, 10:40] = True  # vertical part: 50x30 = 1500
    mask[40:60, 10:60] = True  # horizontal part: 20x50 = 1000
    # Overlapping region: 20x30 = 600
    # Total unique area: 1500 + 1000 - 600 = 1900

    xyxy = np.array([[0.0, 0.0, 100.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


def test_run_circular_mask_approximation():
    """Test with approximately circular mask."""
    mask = np.zeros((100, 100), dtype=bool)
    # Create a circular-ish region using distance
    center_x, center_y = 50, 50
    radius = 25
    for i in range(100):
        for j in range(100):
            if (i - center_y) ** 2 + (j - center_x) ** 2 <= radius**2:
                mask[i, j] = True

    xyxy = np.array([[25.0, 25.0, 75.0, 75.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))

    area = get_detection_area(detections, 0)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally, run: git merge codeflash/optimize-pr2013-2026-02-18T17.59.10

Suggested change
- areas = []
- for i in range(len(predictions)):
-     areas.append(get_detection_area(predictions, i))
+ if predictions.mask is not None:
+     areas = []
+     for i in range(len(predictions)):
+         areas.append(get_detection_area(predictions, i))
+ else:
+     xyxy = predictions.xyxy
+     widths = xyxy[:, 2] - xyxy[:, 0]
+     heights = xyxy[:, 3] - xyxy[:, 1]
+     areas = (widths * heights).tolist()

    Area in square pixels.
    """
    if detection.mask is not None:
        mask = detection.mask[index].astype(np.uint8)
Collaborator commented:

Maybe we can do this in a vectorized way, handling all detections at once rather than iterating over NumPy arrays?
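
One way to read this suggestion is to count mask pixels for all detections in a single NumPy call. The sketch below assumes masks arrive as an (N, H, W) boolean or uint8 array; `mask_pixel_areas` is a hypothetical name. Note that a pixel count is a different measure from cv2.contourArea, so adopting it would change the block's output values rather than only its speed.

```python
import numpy as np


def mask_pixel_areas(masks: np.ndarray) -> list:
    """Count non-zero pixels per mask for an (N, H, W) array, without a Python loop."""
    counts = np.count_nonzero(masks, axis=(1, 2))
    return counts.astype(float).tolist()


masks = np.zeros((2, 10, 10), dtype=bool)
masks[0, 2:6, 3:7] = True  # filled 4x4 block: 16 pixels
masks[1, 5, 1:9] = True  # horizontal line: 8 pixels

areas = mask_pixel_areas(masks)
no_detections = mask_pixel_areas(np.zeros((0, 10, 10), dtype=bool))
```

A one-pixel-wide line has a non-zero pixel count but encloses no polygon area, which is one place where the two measures diverge.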

codeflash-ai bot (Contributor) commented Feb 19, 2026

⚡️ Codeflash found optimizations for this PR

📄 19% (0.19x) speedup for serialise_sv_detections in inference/core/workflows/core_steps/common/serializers.py

⏱️ Runtime: 32.7 milliseconds → 27.5 milliseconds (best of 7 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/mask-measurement-block).
