Enhance VideoReader initialization with path validation #1845
Signed-off-by: Ao Tang <aot@nvidia.com>
/ok to test 799c7a2
Greptile Summary
This PR adds path validation to
Confidence Score: 4/5
Safe to merge after addressing the case-sensitivity inconsistency in the directory rglob check.
One remaining P2 finding (case-sensitive rglob vs. case-insensitive suffix check) describes a real inconsistency that could cause false negatives on Linux. Prior P1/P2 concerns from earlier review rounds are mostly addressed (directory guard added, test iterator fixed with side_effect). The change is otherwise well-structured with good test coverage.
nemo_curator/stages/video/io/video_reader.py, specifically the rglob extension patterns on line 269
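One possible way to resolve the case-sensitivity finding is sketched below. It is not the PR's code: `has_video_files` is a hypothetical helper name, and the extension tuple is copied from the diff. The sketch walks the directory once with `rglob("*")` and applies the same `suffix.lower()` test used for single files, so `clip.MP4` is accepted on case-sensitive filesystems as well.

```python
from pathlib import Path

# Extension list copied from the diff under review.
VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".mkv", ".webm")


def has_video_files(directory: Path) -> bool:
    """Hypothetical helper: True if any file under `directory` has a
    supported video extension, compared case-insensitively."""
    return any(
        entry.is_file() and entry.suffix.lower() in VIDEO_EXTENSIONS
        for entry in directory.rglob("*")
    )
```

The trade-off is that `rglob("*")` visits every entry rather than only pattern matches. `any` stops at the first hit, so the cost is mainly paid on directories that contain no videos at all.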
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[VideoReader.__post_init__] --> B{is_remote_url?}
B -->|Yes| Z[Skip validation - OK]
B -->|No| C[path = Path input_video_path]
C --> D{path.exists?}
D -->|No| E[raise FileNotFoundError\nVideo directory does not exist]
D -->|Yes| F{path.is_file?}
F -->|Yes| G{suffix.lower in\nvideo_extensions?}
G -->|No| H[raise FileNotFoundError\nNot a supported video file]
G -->|Yes| Z
F -->|No - is dir| I{any rglob match\nfor lowercase ext?}
I -->|No| J[raise FileNotFoundError\nNo video files found]
I -->|Yes| Z
Reviews (5): Last reviewed commit: "Merge branch 'main' into aot/video-pipel..."
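For readers who prefer code to the diagram, the flowchart above can be read as the following minimal sketch. It is an illustration under assumptions, not the merged implementation: `validate_video_path` is a hypothetical free function (the PR performs the check inside `VideoReader.__post_init__`), `is_remote_url` here is a simplified stand-in for whatever helper the project uses, and the exception types and messages follow the flowchart as reviewed at this point.

```python
from pathlib import Path

# Extension list copied from the diff under review.
VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".mkv", ".webm")


def is_remote_url(path: str) -> bool:
    # Assumption: a simplified stand-in for the project's own remote-path check.
    return "://" in path


def validate_video_path(input_video_path: str) -> None:
    """Hypothetical function mirroring the flowchart's branches."""
    if is_remote_url(input_video_path):
        return  # remote paths skip local validation

    path = Path(input_video_path)
    if not path.exists():
        raise FileNotFoundError(f"Video directory does not exist: {input_video_path}")

    if path.is_file():
        if path.suffix.lower() not in VIDEO_EXTENSIONS:
            raise FileNotFoundError(f"Not a supported video file: {input_video_path}")
        return

    # Directory case: one rglob per lowercase extension, as in the flowchart.
    # The patterns are lowercase, so matching depends on the filesystem's case rules.
    found = any(True for ext in VIDEO_EXTENSIONS for _ in path.rglob(f"*{ext}"))
    if not found:
        raise FileNotFoundError(f"No video files found: {input_video_path}")
```

Validating at construction time means a bad path fails immediately with a specific message, rather than surfacing later as an empty pipeline run.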
video_extensions = (".mp4", ".mov", ".avi", ".mkv", ".webm")
if path.is_file():
    if path.suffix.lower() not in video_extensions:
        msg = f"Not a supported video file: {self.input_video_path}"
Can we extend this message to list the supported formats? For example: "Supported formats: ....."
Signed-off-by: Ao Tang <aot@nvidia.com>
        msg = f"Not a supported video file: {self.input_video_path}. Supported formats: {', '.join(video_extensions)}"
        raise FileNotFoundError(msg)
FileNotFoundError is semantically wrong for a format mismatch
FileNotFoundError signals that the file doesn't exist, but here the file was found — it just has an unsupported extension. Any caller that catches FileNotFoundError to handle "path not found" cases (e.g., to create a path or retry) will silently swallow a format-mismatch error. ValueError (or a custom exception) is the appropriate type here.
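To make the concern concrete, here is a small sketch of how the two cases could be separated. The names `check_video_file` and `classify` are hypothetical and exist only for illustration; the message text follows the diff. Because `ValueError` is not a subclass of `FileNotFoundError`, a caller handling "path not found" no longer catches the format mismatch by accident.

```python
from pathlib import Path

# Extension list copied from the diff under review.
VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".mkv", ".webm")


def check_video_file(path: Path) -> None:
    """Hypothetical check: a missing path and an unsupported format raise different types."""
    if not path.exists():
        raise FileNotFoundError(f"Video path does not exist: {path}")
    if path.is_file() and path.suffix.lower() not in VIDEO_EXTENSIONS:
        msg = (
            f"Not a supported video file: {path}. "
            f"Supported formats: {', '.join(VIDEO_EXTENSIONS)}"
        )
        raise ValueError(msg)


def classify(path: Path) -> str:
    """Hypothetical caller that reacts differently to each failure mode."""
    try:
        check_video_file(path)
    except FileNotFoundError:
        return "missing"      # e.g. create the path or retry
    except ValueError:
        return "unsupported"  # file exists, wrong format
    return "ok"
```

A custom exception would work equally well. The point is only that the format error should not share a type with the "does not exist" error.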
/ok to test 98defc5
/ok to test 567bc1c
Description
Usage
# Add snippet demonstrating usage
Checklist