Fix extreme memory usage when loading OBB datasets #2187
Merged
…ory usage Agent-Logs-Url: https://github.com/roboflow/supervision/sessions/173888c9-f492-4c33-9f7d-4ee58bac4f90 Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix extreme memory usage loading OBB dataset" to "Fix extreme memory usage when loading OBB datasets" on Mar 30, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff, develop vs. #2187:

| Metric | develop | #2187 | +/- |
|---|---|---|---|
| Coverage | 77% | 77% | |
| Files | 62 | 62 | |
| Lines | 7637 | 7637 | |
| Hits | 5903 | 5908 | +5 |
| Misses | 1734 | 1729 | -5 |
[resolve #6] /review finding by doc-scribe: add note to force_masks docstring that it has no effect when is_obb=True, preventing caller confusion. --- Co-authored-by: Claude Code <noreply@anthropic.com>
Pull request overview
This PR prevents extreme memory usage when loading YOLO OBB datasets by ensuring OBB annotation lines (which have >5 tokens) do not trigger segmentation-mask generation in load_yolo_annotations.
Changes:
- Disable `with_masks` when `is_obb=True` to avoid allocating per-image `(N, H, W)` boolean mask arrays for OBB data.
- Add regression tests confirming OBB loading never produces masks (even with `force_masks=True`) and that non-OBB segmentation still produces masks.
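The guard described in the first bullet can be sketched in isolation. This is a minimal illustration, not the library's code: `looks_like_segmentation` and `resolve_with_masks` are hypothetical names, and the token-count heuristic is only assumed to mirror the "more than 5 values" check mentioned in this PR.

```python
def looks_like_segmentation(lines: list[str]) -> bool:
    """Hypothetical stand-in for the heuristic: True if any line has > 5 values."""
    return any(len(line.split()) > 5 for line in lines)


def resolve_with_masks(lines: list[str], force_masks: bool, is_obb: bool) -> bool:
    """Decide whether masks should be generated (assumed logic, for illustration)."""
    if is_obb:
        # OBB lines carry 9 values, so the token-count heuristic alone would
        # misfire; force_masks is ignored for OBB data as well.
        return False
    return force_masks or looks_like_segmentation(lines)


# Illustrative annotation lines (made-up coordinates).
obb_line = "0 0.1 0.1 0.9 0.1 0.9 0.9 0.1 0.9"           # class_id + 4 corner points
seg_line = "0 0.1 0.1 0.9 0.1 0.9 0.9 0.5 0.95 0.1 0.9"  # class_id + 5 polygon points
box_line = "0 0.5 0.5 0.2 0.2"                           # class_id + x, y, w, h

print(resolve_with_masks([obb_line], force_masks=True, is_obb=True))
print(resolve_with_masks([seg_line], force_masks=False, is_obb=False))
```

The early return is what keeps an OBB line from being mistaken for a polygon: without it, the heuristic alone cannot tell the two formats apart.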
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/supervision/dataset/formats/yolo.py` | Prevents OBB datasets from enabling mask generation based on token count. |
| `tests/dataset/formats/test_yolo.py` | Adds coverage to prevent the OBB mask-memory regression and verify segmentation behavior remains intact. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…e-memory-obb-dataset
…ub.com/roboflow/supervision into copilot/fix-extreme-memory-obb-dataset
Borda
approved these changes
Mar 31, 2026
Borda
added a commit
that referenced
this pull request
Mar 31, 2026
* fix: prevent mask generation for OBB annotations to avoid extreme memory usage
* docs: clarify force_masks is ignored when is_obb=True
* test: pin force_masks=True is ignored for OBB and segmentation mask regression

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Merged
OBB annotation lines have 9 values (`class_id x1 y1 x2 y2 x3 y3 x4 y4`), which caused `_with_seg_mask` to return `True` (it checks for `> 5` values). This incorrectly set `with_masks=True` for OBB data, generating an `(N, H, W)` boolean mask array per image, which adds up to tens of GBs for high-resolution datasets.

Changes
- `src/supervision/dataset/formats/yolo.py`: Guard `with_masks` with `not is_obb` in `load_yolo_annotations`, so OBB annotations never trigger mask generation.
- `tests/dataset/formats/test_yolo.py`: Added `test_load_yolo_annotations_obb_does_not_generate_masks`, which creates a minimal OBB dataset on disk, loads it with `is_obb=True`, and asserts `detection.mask is None`.
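To put the "tens of GBs" figure in perspective, here is a rough back-of-the-envelope estimate. The object count, image size, and dataset size are made-up numbers chosen for illustration, not measurements from this PR; the only fact relied on is that a NumPy boolean array stores one byte per element. `mask_bytes` is a hypothetical helper.

```python
def mask_bytes(num_objects: int, height: int, width: int) -> int:
    """Size in bytes of an (N, H, W) boolean array at one byte per element."""
    return num_objects * height * width


# Hypothetical high-resolution image: 100 annotated objects at 4096 x 4096.
per_image = mask_bytes(num_objects=100, height=4096, width=4096)

# Hypothetical dataset: 20 such images held in memory at once.
total = per_image * 20

print(f"per image: {per_image / 2**30:.2f} GiB")
print(f"20 images: {total / 2**30:.2f} GiB")
```

Under these assumptions a single image already needs more than 1.5 GiB for masks that OBB data never uses, so a few dozen images are enough to exhaust memory on a typical workstation.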