feat(annotate_subtasks): 448x448 frames; select camera0 from standard mapping #257
Merged
…tandard mapping

- Replace --target-width (640) with --target-size (448) and resize each frame by scaling the shorter side to target_size + center-cropping to a square, matching the standard VLM preprocessing pipeline.
- Resolve the per-dataset video feature from camera0 of the standard data format mapping (DatasetConfig.data_features_name_mapping inline override in the mixture config first, then DATA_FEATURES_NAME_MAPPING globally). Fall back to the first dtype=='video' feature in info.json when no mapping exists; skip datasets whose camera0 resolves to a non-video feature (not a LeRobot video dataset).
- Add tests for _resize_and_center_crop and _resolve_camera0_video_key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
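The camera0 lookup order described in the commit message can be sketched roughly as below. This is an illustration, not the PR's `_resolve_camera0_video_key`: the function name, its parameters, and the choice to return `None` for skipped datasets are all assumptions, and the mappings are passed in as plain dicts rather than read from `DatasetConfig` or `DATA_FEATURES_NAME_MAPPING`.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_camera0_video_key(
    repo_id: Optional[str],
    info_features: Mapping[str, Mapping[str, Any]],
    inline_mapping: Optional[Mapping[str, str]] = None,
    global_mapping: Optional[Mapping[str, Mapping[str, str]]] = None,
) -> Optional[str]:
    """Hypothetical stand-in: pick the video feature to use as camera0.

    Returns the feature name, or None when the dataset should be skipped
    (returning None is an assumption made for this sketch).
    """
    candidate = None
    if inline_mapping and "camera0" in inline_mapping:
        # 1. Inline override on the mixture config wins.
        candidate = inline_mapping["camera0"]
    elif global_mapping and repo_id in global_mapping:
        # 2. Otherwise consult the global per-repo mapping.
        candidate = global_mapping[repo_id].get("camera0")

    if candidate is None:
        # 3. No mapping at all: first feature in info.json with dtype == "video".
        for name, spec in info_features.items():
            if spec.get("dtype") == "video":
                return name
        return None

    spec = info_features.get(candidate)
    if spec is None or spec.get("dtype") != "video":
        # A mapped camera0 that is missing or not a video means
        # this is not a LeRobot video dataset, so skip it.
        logger.warning("Skipping %s: camera0 -> %r is not a video feature", repo_id, candidate)
        return None
    return candidate
```

The feature names used when calling it (for example `observation.images.top`) are up to the dataset; a mapped `camera0` of `"image"` with a non-video dtype returns `None`, which corresponds to the "skip with a warning" case.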
[claude-review] summary for commit 6e0380c: No blocking issues found.
shuheng-liu approved these changes May 4, 2026
What this does
Two changes to `src/opentau/scripts/annotate_subtasks.py` (🗃️ Feature):

- Default frame resolution → 448 × 448 square. Replaces `--target-width 640` with `--target-size 448`. Frames are now scaled so the shorter side equals `target_size`, then center-cropped to `target_size × target_size`, matching standard VLM preprocessing. Already-square inputs at the target resolution pass through untouched.
- `camera0` selection from the standard data format mapping. The dataset's `camera0` video feature is resolved with this priority:
  1. `DatasetConfig.data_features_name_mapping["camera0"]` inline on the mixture config (`src/opentau/configs/default.py:101`),
  2. `DATA_FEATURES_NAME_MAPPING[repo_id]["camera0"]` from `src/opentau/datasets/standard_data_format_mapping.py`,
  3. the first feature in `info.json` with `dtype=='video'` (the script's pre-existing behavior).

Datasets whose `camera0` resolves to a feature that is missing from `info.json` or has `dtype != 'video'` (e.g. the `dummy` / `vsr` / `cocoqa` rows that map `camera0 → "image"`) are skipped with a warning, since they are not LeRobot video datasets.

The module docstring and CLI `--help` text are updated to match.

How it was tested
Added `TestResizeAndCenterCrop` (5 cases: landscape / portrait / already-square / upscale / non-default size) and `TestResolveCamera0VideoKey` (8 cases: inline-vs-global precedence, fallback paths, non-video feature, missing feature, missing repo_id, no-video-features) in `tests/scripts/test_annotate_subtasks.py`.

How to checkout & try? (for the reviewer)
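To get a feel for the new resize rule without running the script, here is a dependency-free sketch of the geometry only: scale so the shorter side equals `target_size`, then take a centered `target_size × target_size` crop. It is not the PR's `_resize_and_center_crop`; the names `plan_resize_and_center_crop` and `CropPlan` are hypothetical, and rounding the longer side to the nearest pixel is an assumption.

```python
from typing import NamedTuple


class CropPlan(NamedTuple):
    """Hypothetical container: resized dimensions plus the crop box."""
    resized_width: int
    resized_height: int
    left: int
    top: int
    right: int
    bottom: int


def plan_resize_and_center_crop(width: int, height: int, target_size: int = 448) -> CropPlan:
    """Compute the resize dimensions and centered square crop box for a frame."""
    if width <= 0 or height <= 0 or target_size <= 0:
        raise ValueError("width, height and target_size must be positive")

    # Scale factor that maps the shorter side onto target_size.
    scale = target_size / min(width, height)

    # Pin the shorter side exactly; round the longer side (assumed rounding),
    # never letting it fall below target_size.
    if width <= height:
        resized_width = target_size
        resized_height = max(target_size, round(height * scale))
    else:
        resized_height = target_size
        resized_width = max(target_size, round(width * scale))

    # Center the square crop inside the resized frame.
    left = (resized_width - target_size) // 2
    top = (resized_height - target_size) // 2
    return CropPlan(
        resized_width, resized_height, left, top, left + target_size, top + target_size
    )


print(plan_resize_and_center_crop(640, 480))
# CropPlan(resized_width=597, resized_height=448, left=74, top=0, right=522, bottom=448)
```

A 448 × 448 input yields a plan with no scaling and a crop box covering the full frame, which corresponds to the pass-through case described above. The resulting sizes and box can be handed to whichever image library the caller uses for the actual resize and crop.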
Checklist
🤖 Generated with Claude Code