5c8f528 to 0cfcffe
areas = []
for i in range(len(predictions)):
    areas.append(get_detection_area(predictions, i))
⚡️Codeflash found 65% (0.65x) speedup for MaskAreaMeasurementBlockV1.run in inference/core/workflows/core_steps/classical_cv/mask_area_measurement/v1.py
⏱️ Runtime : 3.31 milliseconds → 2.00 milliseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 65% speedup by adding a fast path for detections without masks in the run method. Here's why it's faster:
Key Optimization:
The code adds a conditional check at the start of run() to detect when predictions.mask is None. When masks are absent, it bypasses the per-detection loop and uses vectorized NumPy operations instead:
# Fast path for bbox-only detections
xyxy = predictions.xyxy
widths = xyxy[:, 2] - xyxy[:, 0]
heights = xyxy[:, 3] - xyxy[:, 1]
areas = (widths * heights).tolist()
Why This is Faster:
- Vectorization vs Iteration: NumPy's vectorized operations process all bounding boxes simultaneously using optimized C code, while the original version iterates through each detection individually with Python loops and function calls.
- Eliminates Function Call Overhead: The original code calls get_detection_area() for every detection (2,484 times in the profiler), each incurring Python function call overhead (~7.5μs per call). The optimized version eliminates this entirely for the no-mask case.
- Memory Access Patterns: Vectorized NumPy operations benefit from better CPU cache utilization through contiguous memory access, whereas repeated indexing (bbox = detection.xyxy[index]) causes cache misses.
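The loop-versus-vectorized comparison above can be reproduced outside the block with a small sketch. It assumes only NumPy; the names areas_loop and areas_vectorized are hypothetical stand-ins, not functions from the inference codebase, and the timings it prints will differ from the profiler figures quoted in this comment.

```python
import timeit

import numpy as np


def areas_loop(xyxy):
    """Per-box Python loop, similar in spirit to the original per-detection path."""
    areas = []
    for i in range(len(xyxy)):
        x1, y1, x2, y2 = xyxy[i]
        areas.append(float((x2 - x1) * (y2 - y1)))
    return areas


def areas_vectorized(xyxy):
    """Whole-array arithmetic, as in the proposed fast path."""
    widths = xyxy[:, 2] - xyxy[:, 0]
    heights = xyxy[:, 3] - xyxy[:, 1]
    return (widths * heights).tolist()


# Build 1000 random boxes in (x1, y1, x2, y2) order with positive width and height.
rng = np.random.default_rng(0)
top_left = rng.uniform(0, 100, size=(1000, 2))
sizes = rng.uniform(1, 50, size=(1000, 2))
boxes = np.hstack([top_left, top_left + sizes])

# Both paths should agree before their speed is compared.
assert np.allclose(areas_loop(boxes), areas_vectorized(boxes))

loop_s = timeit.timeit(lambda: areas_loop(boxes), number=20)
vec_s = timeit.timeit(lambda: areas_vectorized(boxes), number=20)
print(f"loop: {loop_s:.4f}s  vectorized: {vec_s:.4f}s")
```

Checking equality first matters because the fast path is only safe if it returns the same values as the loop for bbox-only input.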
Performance Impact:
From the line profiler, the critical improvement is visible in test cases with many bbox-only detections:
- 1000 detections (bbox only): 652μs → 25.3μs (2,475% faster)
- 100 detections (bbox only): 67.7μs → 11.2μs (506% faster)
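The percentages quoted here and in the test annotations appear consistent with a simple convention: "faster" is original/optimized - 1 and "slower" is 1 - original/optimized. The sketch below is an inference from the reported numbers, not a documented Codeflash formula; percent_change is a hypothetical helper, and small mismatches are expected because the displayed timings are rounded.

```python
def percent_change(original, optimized):
    """Return ("faster" | "slower", percent) under the assumed convention.

    Both arguments are runtimes in the same unit (for example microseconds).
    """
    if optimized <= original:
        return "faster", (original / optimized - 1.0) * 100.0
    return "slower", (1.0 - original / optimized) * 100.0


# Timings copied from this comment: overall run, the two bbox-only cases,
# and one single-detection case from the generated tests.
for original, optimized in [(3.31, 2.00), (652.0, 25.3), (67.7, 11.2), (4.86, 9.20)]:
    label, pct = percent_change(original, optimized)
    print(f"{original} -> {optimized}: {pct:.1f}% {label}")
```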
For mask-based detections, performance remains similar since the code still uses the sequential get_detection_area() path (masks require cv.findContours() which cannot be easily vectorized).
What Makes This Beneficial:
This optimization is particularly effective when:
- Processing batch predictions from object detection models that output only bounding boxes (no segmentation masks)
- The function is called in tight loops or high-throughput pipelines
- Working with large numbers of detections per frame/image
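The speedup grows with the number of bbox-only detections (roughly 6x at 100 detections and 26x at 1,000 in the timings above), making it increasingly valuable in production scenarios with many objects per image. For very small bbox-only inputs the timing annotations in the generated tests point the other way: single-detection calls took about 4-5μs before the change and 8-9μs after it.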
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⏪ Replay Tests | 🔘 None Found |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 68 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import cv2 as cv  # used implicitly by the code under test and for creating masks
import numpy as np  # used to construct numeric arrays for detections and masks

# imports
import pytest  # used for our unit tests

# import the real Detections class from the supervision package (used by the implementation)
import supervision as sv

# import the actual implementation under test and helpers from the original module path
from inference.core.workflows.core_steps.classical_cv.mask_area_measurement.v1 import (
    OUTPUT_KEY,
    MaskAreaMeasurementBlockV1,
    get_detection_area,
)

def _make_bbox_array(list_of_boxes):
    """Helper to convert a list of [x1,y1,x2,y2] boxes into an np.ndarray shape (N,4)."""
    return np.array(list_of_boxes, dtype=float)

def _make_mask_stack(list_of_2d_masks):
    """Helper to stack a list of 2D masks (H x W) into a (N, H, W) array as expected by the code."""
    # Ensure masks are uint8 (values 0 or 255) which cv.findContours expects
    stack = np.stack([np.array(m, dtype=np.uint8) for m in list_of_2d_masks], axis=0)
    return stack

def test_single_bbox_without_mask_uses_bbox_area():
    # Create a single bbox: (0,0) to (10,5) -> width=10 height=5 => area 50
    xyxy = _make_bbox_array([[0, 0, 10, 5]])
    detections = sv.Detections(xyxy=xyxy)  # construct a real Detections instance
    block = MaskAreaMeasurementBlockV1()  # create the real block instance
    # Run the block and assert the returned dict contains the expected areas list
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.86μs -> 9.20μs (47.2% slower)

def test_single_mask_with_contour_uses_contour_area():
    # Create a 10x10 mask with a filled rectangle from rows 2..6 and cols 3..7 (4x4 => area 16)
    mask = np.zeros((10, 10), dtype=np.uint8)
    mask[2:6, 3:7] = 255  # filled rectangle, area of 16 pixels
    stacked = _make_mask_stack([mask])
    # Provide a bbox that would have larger area if used, to ensure mask contour takes precedence
    xyxy = _make_bbox_array([[0, 0, 100, 100]])
    detections = sv.Detections(xyxy=xyxy, mask=stacked)
    # Directly use the helper function too to exercise both code paths
    area_via_helper = get_detection_area(detections, 0)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 12.2μs -> 12.6μs (3.88% slower)

def test_mask_without_contours_falls_back_to_bbox():
    # A mask with all zeros yields no contours; should fall back to bbox area
    mask = np.zeros((8, 8), dtype=np.uint8)  # empty mask
    stacked = _make_mask_stack([mask])
    xyxy = _make_bbox_array([[1, 2, 6, 7]])  # width=5 height=5 => area=25
    detections = sv.Detections(xyxy=xyxy, mask=stacked)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 13.9μs -> 14.3μs (2.73% slower)

def test_zero_sized_bbox_returns_zero_area():
    # width is zero (x1 == x2)
    xyxy = _make_bbox_array([[5, 5, 5, 10]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.33μs -> 9.01μs (51.9% slower)

def test_reversed_coordinates_can_produce_negative_area():
    # Construct a bbox where x2 < x1 but y2 > y1 so width negative, height positive -> negative area
    xyxy = _make_bbox_array(
        [[10, 0, 5, 10]]
    )  # width = 5-10 = -5, height = 10 -> area = -50
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 4.07μs -> 7.93μs (48.7% slower)

def test_multiple_mixed_detections_combined_behavior():
    # Prepare three detections:
    # 1) Has a mask with a 2x2 square -> area 4
    # 2) Has an empty mask -> fallback to bbox area 6*6=36
    # 3) No mask -> bbox area 3*7=21
    mask1 = np.zeros((6, 6), dtype=np.uint8)
    mask1[1:3, 1:3] = 255  # 2x2 square -> area 4
    mask2 = np.zeros((6, 6), dtype=np.uint8)  # empty, fallback expected
    masks = _make_mask_stack([mask1, mask2])  # two masks for first two detections
    # For the third detection, we'll set mask to None by using a Detections constructed without mask.
    xyxy_all = _make_bbox_array(
        [
            [
                0,
                0,
                10,
                10,
            ],  # large bbox but masked -> contour area should be used for index 0
            [1, 1, 7, 7],  # empty mask -> fallback to bbox area 36
            [2, 2, 5, 9],  # no mask supplied -> bbox area 3*7=21
        ]
    )
    # Many supervision.Detections implementations expect mask to have same length as xyxy.
    # To combine mixed presence, create two separate Detections and then combine their attributes in a single Detections:
    # The implementation accesses detection.mask and detection.xyxy directly; to create a single Detections with mixed mask presence:
    # we'll construct mask array where the third mask entry is all zeros and then set the Detections.mask attribute to None for a different object
    # However, to keep things simple and robust across supervision versions, construct a Detections where mask is provided for all
    # entries (third mask empty), and ensure behavior matches expected for all three.
    mask3 = np.zeros((6, 6), dtype=np.uint8)  # empty mask for third detection
    combined_masks = _make_mask_stack([mask1, mask2, mask3])
    detections = sv.Detections(xyxy=xyxy_all, mask=combined_masks)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 29.8μs -> 31.0μs (3.85% slower)
    # Expected areas: index0 -> contour area 4, index1 -> fallback bbox (7-1)*(7-1)=36, index2 -> fallback bbox (5-2)*(9-2)=3*7=21
    expected = [pytest.approx(4.0, rel=1e-6), 36.0, 21.0]

def test_large_scale_bbox_only_many_detections():
    # Create 1000 detections with increasing bbox sizes: box i -> (0,0,i,i) -> area = i*i
    n = 1000
    boxes = [[0, 0, float(i), float(i)] for i in range(n)]
    xyxy = _make_bbox_array(boxes)
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 640μs -> 24.6μs (2506% faster)
    mid_index = n // 2

def test_large_scale_masks_many_detections_performance_and_correctness():
    # Create 200 detections (reasonable for test environments) each with a small mask of 8x8
    # This exercises the contour-finding code path many times.
    n = 200
    masks = []
    boxes = []
    for i in range(n):
        # Create a small filled square whose size varies with i but clipped to [1,7]
        size = 1 + (i % 7)
        m = np.zeros((8, 8), dtype=np.uint8)
        # place square at top-left corner
        m[0:size, 0:size] = 255
        masks.append(m)
        # bbox large enough so contour area will be used instead of bbox area
        boxes.append([0, 0, 100.0, 100.0])
    xyxy = _make_bbox_array(boxes)
    stacked_masks = _make_mask_stack(masks)
    detections = sv.Detections(xyxy=xyxy, mask=stacked_masks)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(detections)
    result = codeflash_output  # 1.13ms -> 1.12ms (1.20% faster)
    # Check first few and last few values match expected sizes squared
    for i in [0, 1, 6, 7, 13, n - 1]:
        expected_size = 1 + (i % 7)
        expected_area = float(expected_size * expected_size)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import cv2 as cv
import numpy as np

# imports
import pytest
import supervision as sv
from inference.core.workflows.core_steps.classical_cv.mask_area_measurement.v1 import (
    OUTPUT_KEY,
    MaskAreaMeasurementBlockV1,
    get_detection_area,
)

def test_run_single_detection_with_bbox_only():
    """Test run method with a single detection without segmentation mask."""
    # Create a detection with a bounding box but no mask
    xyxy = np.array([[10.0, 20.0, 50.0, 80.0]])  # x1, y1, x2, y2
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 5.20μs -> 9.83μs (47.1% slower)

def test_run_multiple_detections_bbox_only():
    """Test run method with multiple detections without segmentation masks."""
    # Create detections with bounding boxes only
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],  # area = 10 * 10 = 100
            [5.0, 5.0, 15.0, 20.0],  # area = 10 * 15 = 150
            [100.0, 100.0, 110.0, 110.0],  # area = 10 * 10 = 100
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 6.28μs -> 8.82μs (28.7% slower)

def test_run_single_detection_with_mask():
    """Test run method with a single detection including a segmentation mask."""
    # Create a simple rectangular mask
    mask = np.zeros((100, 100), dtype=bool)
    mask[20:50, 30:70] = True  # 30x40 rectangle = 1200 pixels
    xyxy = np.array([[20.0, 20.0, 70.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 27.1μs -> 27.6μs (1.85% slower)

def test_run_multiple_detections_with_masks():
    """Test run method with multiple detections with segmentation masks."""
    # Create two masks
    mask1 = np.zeros((100, 100), dtype=bool)
    mask2 = np.zeros((100, 100), dtype=bool)
    # Simple square masks for predictable areas
    mask1[10:40, 10:40] = True  # 30x30 = 900
    mask2[50:80, 50:80] = True  # 30x30 = 900
    xyxy = np.array([[10.0, 10.0, 40.0, 40.0], [50.0, 50.0, 80.0, 80.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask1, mask2]))
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 37.9μs -> 38.2μs (0.733% slower)

def test_get_detection_area_with_bbox_only():
    """Test get_detection_area helper function with bbox-only detection."""
    xyxy = np.array([[10.0, 10.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy)
    area = get_detection_area(detections, 0)

def test_get_detection_area_with_mask():
    """Test get_detection_area helper function with mask."""
    mask = np.zeros((100, 100), dtype=bool)
    mask[20:60, 30:70] = True  # 40x40 = 1600
    xyxy = np.array([[20.0, 20.0, 70.0, 60.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_run_returns_dict_with_correct_key():
    """Test that run method returns a dictionary with the correct key."""
    xyxy = np.array([[0.0, 0.0, 5.0, 5.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.95μs -> 9.54μs (48.1% slower)

def test_run_empty_detections():
    """Test run method with empty detections object."""
    # Create empty detections
    xyxy = np.empty((0, 4), dtype=np.float32)
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 2.13μs -> 7.73μs (72.4% slower)

def test_run_zero_area_bbox():
    """Test with detection having zero area bbox (single point or line)."""
    # Point detection: same coordinates
    xyxy = np.array([[10.0, 10.0, 10.0, 10.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.63μs -> 8.49μs (45.5% slower)

def test_run_very_small_bbox():
    """Test with very small bounding box (fractional coordinates)."""
    xyxy = np.array([[0.0, 0.0, 0.5, 0.5]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.49μs -> 8.15μs (45.0% slower)

def test_run_large_bbox():
    """Test with very large bounding box."""
    xyxy = np.array([[0.0, 0.0, 10000.0, 10000.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.43μs -> 7.99μs (44.5% slower)

def test_get_detection_area_mask_with_empty_contours():
    """Test get_detection_area when mask has no contours (empty mask)."""
    # Create an empty mask (all zeros)
    mask = np.zeros((100, 100), dtype=bool)
    xyxy = np.array([[10.0, 10.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_get_detection_area_mask_with_zero_area_contour():
    """Test get_detection_area with a mask that has zero area contours."""
    # Create a mask with only a line (contour with zero area)
    mask = np.zeros((100, 100), dtype=np.uint8)
    mask[50, 10:90] = 1  # Horizontal line
    xyxy = np.array([[10.0, 40.0, 90.0, 60.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask.astype(bool)]))
    area = get_detection_area(detections, 0)

def test_run_negative_coordinates():
    """Test with negative bounding box coordinates."""
    xyxy = np.array([[-50.0, -50.0, 50.0, 50.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.82μs -> 9.10μs (47.0% slower)

def test_run_inverted_bbox_coordinates():
    """Test with inverted bbox coordinates (x2 < x1)."""
    # This tests robustness; result may be negative area
    xyxy = np.array([[50.0, 50.0, 10.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.43μs -> 8.20μs (46.0% slower)

def test_get_detection_area_mask_single_pixel():
    """Test get_detection_area with mask containing single pixel."""
    mask = np.zeros((100, 100), dtype=bool)
    mask[50, 50] = True  # Single pixel
    xyxy = np.array([[45.0, 45.0, 55.0, 55.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_run_detection_at_image_boundary():
    """Test detection positioned at image boundaries."""
    # Detection at top-left corner
    xyxy = np.array([[0.0, 0.0, 100.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.81μs -> 8.98μs (46.4% slower)

def test_run_rectangular_mask_vs_bbox_fallback():
    """Test that mask area is used when mask exists, not bbox fallback."""
    # Create a small mask within a larger bbox
    mask = np.zeros((200, 200), dtype=bool)
    mask[50:100, 50:100] = True  # 50x50 = 2500
    # Larger bbox
    xyxy = np.array([[0.0, 0.0, 200.0, 200.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_run_many_detections_bbox_only():
    """Test run method with 100 detections."""
    # Create 100 detections with varying sizes
    xyxy = np.array(
        [
            [float(i * 10), float(i * 10), float(i * 10 + 50), float(i * 10 + 50)]
            for i in range(100)
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 67.7μs -> 11.2μs (506% faster)
    # All should have area 50*50 = 2500
    for area in result[OUTPUT_KEY]:
        pass

def test_run_many_detections_with_masks():
    """Test run method with 50 detections with masks."""
    masks = []
    xyxy_list = []
    for i in range(50):
        # Create a mask for each detection
        mask = np.zeros((100, 100), dtype=bool)
        y_start = (i % 10) * 5
        x_start = (i // 10) * 10
        mask[y_start : y_start + 20, x_start : x_start + 20] = True  # 20x20 = 400
        masks.append(mask)
        xyxy_list.append(
            [float(x_start), float(y_start), float(x_start + 30), float(y_start + 30)]
        )
    xyxy = np.array(xyxy_list)
    detections = sv.Detections(xyxy=xyxy, mask=np.array(masks))
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 483μs -> 476μs (1.51% faster)
    # Each mask has area 400
    for area in result[OUTPUT_KEY]:
        pass

def test_run_1000_detections_bbox_only():
    """Test run method with 1000 detections for performance."""
    xyxy = np.array(
        [[float(i), float(i), float(i + 10), float(i + 10)] for i in range(1000)]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 652μs -> 25.3μs (2475% faster)

def test_run_mixed_mask_and_bbox_detections():
    """Test with some detections having masks and others not."""
    # Create detections: first 3 with masks, last 2 without
    masks_list = [
        np.ones((100, 100), dtype=bool) * False,
        np.ones((100, 100), dtype=bool) * False,
        np.ones((100, 100), dtype=bool) * False,
    ]
    # Add masks to specific regions
    masks_list[0][20:40, 20:40] = True  # 20x20 = 400
    masks_list[1][30:60, 30:60] = True  # 30x30 = 900
    masks_list[2][10:50, 10:50] = True  # 40x40 = 1600
    # Pad with False masks for the other detections
    all_masks = masks_list + [
        np.zeros((100, 100), dtype=bool),
        np.zeros((100, 100), dtype=bool),
    ]
    xyxy = np.array(
        [
            [10.0, 10.0, 50.0, 50.0],  # with mask: 400
            [20.0, 20.0, 80.0, 80.0],  # with mask: 900
            [0.0, 0.0, 100.0, 100.0],  # with mask: 1600
            [0.0, 0.0, 100.0, 100.0],  # without mask: 10000
            [50.0, 50.0, 100.0, 100.0],  # without mask: 2500
        ]
    )
    detections = sv.Detections(xyxy=xyxy, mask=np.array(all_masks))
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 64.1μs -> 64.4μs (0.512% slower)

def test_run_large_masks():
    """Test with large resolution masks (1000x1000)."""
    mask = np.zeros((1000, 1000), dtype=bool)
    mask[100:900, 100:900] = True  # 800x800 = 640000
    xyxy = np.array([[0.0, 0.0, 1000.0, 1000.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_run_many_detections_with_various_shapes():
    """Test 100 detections with different aspect ratios."""
    xyxy = np.array(
        [
            (
                [float(i * 5), float(i * 2), float(i * 5 + 100), float(i * 2 + 10)]
                if i % 2 == 0
                else [float(i * 5), float(i * 2), float(i * 5 + 10), float(i * 2 + 100)]
            )
            for i in range(100)
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 67.5μs -> 11.9μs (467% faster)

def test_get_detection_area_index_boundaries():
    """Test get_detection_area with different index values."""
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],
            [20.0, 20.0, 40.0, 40.0],
            [50.0, 50.0, 80.0, 80.0],
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    # Test first, middle, and last indices
    area0 = get_detection_area(detections, 0)
    area1 = get_detection_area(detections, 1)
    area2 = get_detection_area(detections, 2)

def test_run_preserves_order_of_areas():
    """Test that areas are returned in the same order as input detections."""
    xyxy = np.array(
        [
            [0.0, 0.0, 10.0, 10.0],  # area = 100
            [0.0, 0.0, 5.0, 5.0],  # area = 25
            [0.0, 0.0, 20.0, 20.0],  # area = 400
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 6.31μs -> 9.32μs (32.3% slower)

def test_run_returns_float_values():
    """Test that all returned areas are float type."""
    xyxy = np.array(
        [
            [0.0, 0.0, 5.0, 5.0],
            [10.0, 10.0, 20.0, 20.0],
        ]
    )
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 5.34μs -> 8.52μs (37.3% slower)
    for area in result[OUTPUT_KEY]:
        pass

def test_run_with_integer_coordinates():
    """Test run with integer coordinates in detection bboxes."""
    xyxy = np.array([[10, 10, 30, 40]], dtype=np.int32)
    detections = sv.Detections(xyxy=xyxy.astype(np.float32))
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.52μs -> 8.39μs (46.1% slower)

def test_run_with_float32_coordinates():
    """Test run with float32 precision coordinates."""
    xyxy = np.array([[10.5, 20.5, 50.7, 80.3]], dtype=np.float32)
    detections = sv.Detections(xyxy=xyxy)
    block = MaskAreaMeasurementBlockV1()
    codeflash_output = block.run(predictions=detections)
    result = codeflash_output  # 4.50μs -> 8.34μs (46.0% slower)
    # Expected: (50.7-10.5) * (80.3-20.5) = 40.2 * 59.8 ≈ 2403.96
    expected = (50.7 - 10.5) * (80.3 - 20.5)

def test_run_complex_polygon_mask():
    """Test with a complex polygon mask (L-shaped)."""
    mask = np.zeros((100, 100), dtype=bool)
    # Create L-shaped mask
    mask[10:60, 10:40] = True  # vertical part: 50x30 = 1500
    mask[40:60, 10:60] = True  # horizontal part: 20x50 = 1000
    # Overlapping region: 20x30 = 600
    # Total unique area: 1500 + 1000 - 600 = 1900
    xyxy = np.array([[0.0, 0.0, 100.0, 100.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

def test_run_circular_mask_approximation():
    """Test with approximately circular mask."""
    mask = np.zeros((100, 100), dtype=bool)
    # Create a circular-ish region using distance
    center_x, center_y = 50, 50
    radius = 25
    for i in range(100):
        for j in range(100):
            if (i - center_y) ** 2 + (j - center_x) ** 2 <= radius**2:
                mask[i, j] = True
    xyxy = np.array([[25.0, 25.0, 75.0, 75.0]])
    detections = sv.Detections(xyxy=xyxy, mask=np.array([mask]))
    area = get_detection_area(detections, 0)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally: git merge codeflash/optimize-pr2013-2026-02-18T17.59.10
- areas = []
- for i in range(len(predictions)):
-     areas.append(get_detection_area(predictions, i))
+ if predictions.mask is not None:
+     areas = []
+     for i in range(len(predictions)):
+         areas.append(get_detection_area(predictions, i))
+ else:
+     xyxy = predictions.xyxy
+     widths = xyxy[:, 2] - xyxy[:, 0]
+     heights = xyxy[:, 3] - xyxy[:, 1]
+     areas = (widths * heights).tolist()
        Area in square pixels.
    """
    if detection.mask is not None:
        mask = detection.mask[index].astype(np.uint8)
Maybe we can do this in a vectorized way, handling all detections at once rather than iterating over the NumPy arrays?
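One possible reading of this suggestion is to count foreground pixels for every mask in a single NumPy reduction. The sketch below is only an illustration under that assumption: areas_with_bbox_fallback is a hypothetical helper, not code from this PR, and pixel counting is a different measure from cv2.contourArea, which computes a polygon area from contour vertices, so the two generally return different numbers for the same mask.

```python
import numpy as np


def areas_with_bbox_fallback(xyxy, masks):
    """Vectorized per-detection area for an (N, 4) box array and (N, H, W) mask stack.

    Uses the foreground pixel count of each mask, and falls back to the
    bounding-box area for detections whose mask is empty.
    NOTE: pixel count is an assumption made for this sketch; the block in this
    PR uses cv2.contourArea instead.
    """
    if masks.ndim != 3 or len(masks) != len(xyxy):
        raise ValueError("expected masks of shape (N, H, W) matching N boxes")
    # One reduction over the spatial axes gives all N pixel counts at once.
    pixel_counts = np.count_nonzero(masks, axis=(1, 2)).astype(float)
    bbox_areas = (xyxy[:, 2] - xyxy[:, 0]) * (xyxy[:, 3] - xyxy[:, 1])
    return np.where(pixel_counts > 0, pixel_counts, bbox_areas).tolist()


# Two detections: a 2x2 mask and an empty mask that falls back to its bbox.
masks = np.zeros((2, 6, 6), dtype=bool)
masks[0, 1:3, 1:3] = True
boxes = np.array([[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 7.0, 7.0]])
print(areas_with_bbox_fallback(boxes, masks))
```

Whether this is acceptable depends on whether downstream consumers need polygon area specifically; if they do, the per-mask contour step still has to run for each detection.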
New block that calculates the area of detected objects in square pixels. Uses polygon area (cv2.contourArea) for instance segmentation masks, falls back to bounding box width * height for object detection predictions.
680c506 to 23b5022
⚡️ Codeflash found optimizations for this PR: 📄 19% (0.19x) speedup for
What does this PR do?
Add area_measurement workflow block
New block that calculates the area of detected objects in square pixels.
Uses polygon area (cv2.contourArea) for instance segmentation masks,
falls back to bounding box width * height for object detection predictions.
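To make the two measures in this description concrete, here is a minimal NumPy-only sketch. polygon_area applies the shoelace (Green's) formula, which is the kind of computation cv2.contourArea performs on a contour's vertices; both function names are hypothetical and this is not the block's implementation, which calls OpenCV directly.

```python
import numpy as np


def polygon_area(points):
    """Area of a simple polygon given as an (N, 2) array of (x, y) vertices."""
    x = points[:, 0].astype(float)
    y = points[:, 1].astype(float)
    # Shoelace formula: half the absolute value of the summed cross products.
    cross = np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))
    return 0.5 * abs(float(cross))


def bbox_area(box):
    """Width times height for a box in (x1, y1, x2, y2) order."""
    x1, y1, x2, y2 = box
    return float((x2 - x1) * (y2 - y1))


# Vertices of a square whose corners sit on pixel centres 3 units apart,
# assuming the contour is traced through boundary pixel centres.
square = np.array([[3, 2], [6, 2], [6, 5], [3, 5]])
print(polygon_area(square))          # polygon area of the traced outline
print(bbox_area([0, 0, 10, 5]))      # bounding-box fallback
```

Under that assumption, a filled 4x4-pixel square has a polygon area of 9 even though it covers 16 pixels, so polygon area and pixel count should not be expected to match.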
Related Issue(s):
Type of Change
Testing
Test details:
I trained a model on detecting rubber ducks. Then I ran a workflow with this block and verified that the area calculated was reasonable.
Checklist
Additional Context