Limit CPU video decoder codec support #6352
NVIDIA/DALI_extra#135 and NVIDIA/DALI_deps#162 are related to this change.
CI MESSAGE: [51666287]: BUILD STARTED
| Filename | Overview |
|---|---|
| dali/operators/video/frames_decoder_cpu.cc | Codec allow-list narrowed to 3 entries (VP8, VP9, MJPEG) with correct array size; EOS guard added in ReadRegularFrame; AVERROR_EOF tolerated in flush-mode send_packet. The unconditional NumFrames() call in the new EOS guard can exhaust the demuxer when no index is built (flagged previously). |
| dali/operators/video/frames_decoder_base.cc | SeekFrame now resets the decoder when next_frame_idx_ >= NumFrames() (with HasIndex() guard), avoiding reuse of an invalid decoder position. Reset() sets next_frame_idx_=0, so the post-reset assert holds. |
| dali/operators/video/video_test.cc | CompareFrame updated to count bad subpixels per thread (fixing a pre-existing data race on frames_match) and tolerates up to 16 deviations exceeding eps; applies globally to all callers. Reference frame paths switched to VP9 directories. |
| dali/operators/video/frames_decoder_test.cc | Removed CPU HEVC frames-decoder tests (ConstantFrameRateHevc, VariableFrameRateHevc, VariableFrameRateHevcNoIndex, InMemoryVfrHevcVideo) consistent with HEVC being dropped from CPU allow-list. |
| dali/operators/video/input/video_input_test.cc | Test file paths updated from H264 test_{1,2}.mp4 to VP9 test_{1,2}_vp9.mp4 for VideoInputNextOutputDataIdTest. |
| dali/test/python/decoder/test_video.py | Significant test rework: VP9 fixtures, device_id=None for CPU, unsupported-codec assertion path, cfr_test.mp4 now only appended for mixed device. Several pre-flagged issues remain: unconditional NumFrames() call, test_video_index_reuse index-reuse invariant broken, device_id=0 hardcoded in test_multichannel_fill_value. |
| dali/test/python/input/test_video.py | Adds module-level filter for H264 test_{1,2}.mp4 (using os.path.basename for exact match); restricts test_video_input_audio_stream to mixed backend only. |
| dali/test/python/test_dali_cpu_only.py | video_files updated from H264 vfr/test_{1,2}.mp4 to VP9 vfr/test_{1,2}_vp9.mp4. |
| dali/test/python/test_dali_variable_batch_size.py | test_video_decoder file path updated to VP9 variant. |
| dali/test/python/test_video_pipeline.py | check_corrupted_videos updated to use VP9 test_2_vp9.mp4 as the good reference video. |
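
The allow-list narrowing summarized in the first table row can be pictured with a small model. This is only a sketch: the names `CPU_ALLOWED_CODECS`, `is_cpu_decodable`, and `split_by_cpu_support` are hypothetical and do not mirror the C++ code in `frames_decoder_cpu.cc`; the only fact taken from the PR is that VP8, VP9, and MJPEG remain enabled on CPU.

```python
# Hypothetical model of the CPU codec allow-list; names are illustrative only.
CPU_ALLOWED_CODECS = frozenset({"vp8", "vp9", "mjpeg"})


def is_cpu_decodable(codec_name: str) -> bool:
    """Return True if the codec is on the (assumed) CPU allow-list."""
    return codec_name.strip().lower() in CPU_ALLOWED_CODECS


def split_by_cpu_support(codecs):
    """Partition codec names into (supported, unsupported) lists."""
    supported = [c for c in codecs if is_cpu_decodable(c)]
    unsupported = [c for c in codecs if not is_cpu_decodable(c)]
    return supported, unsupported
```

A test harness could use such a partition to decide which fixtures should decode and which should raise on the CPU backend.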
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[ReadFrame called] --> B{flush_state_?}
B -- yes --> C[ReadFlushFrame]
B -- no --> D[ReadRegularFrame]
D --> E{av_read_frame ok and video stream?}
E -- no --> F[send null packet]
F --> G[flush_state_ = true, return false]
E -- yes --> H[avcodec_receive_frame]
H -- EAGAIN --> E
H -- EOF --> G
H -- ok --> I{copy_to_output?}
I -- yes --> J[CopyToOutput]
I -- no --> K[skip]
J & K --> L[++next_frame_idx_]
L --> M{next_frame_idx_ >= NumFrames?}
M -- yes --> N[next_frame_idx_ = -1 NEW EOS signal]
M -- no --> O[return true]
N --> O
C --> P{avcodec_receive_frame ok?}
P -- fail --> Q[flush_state_=false, next_frame_idx_=-1, return false]
P -- ok --> R[CopyToOutput if needed, ++next_frame_idx_]
R --> S{next_frame_idx_ >= NumFrames?}
S -- yes --> T[next_frame_idx_ = -1]
S -- no --> U[return true]
T --> U
| @@ -1 +1 @@ | |||
| b270f29e9d7655512e7e8eaf055cca4d19b55f55 | |||
| ToDo | |||
[Bug] Version pin files set to literal `ToDo` placeholder.
Both DALI_DEPS_VERSION and DALI_EXTRA_VERSION now contain the string ToDo instead of a commit SHA. Any CI job that reads these files to fetch the matching dali_deps / dali_extra artefacts will either fail outright or pick up an incorrect/stale revision, breaking reproducibility for the entire build. These files should be updated to the real commit SHAs before this PR merges (or the dependent PRs should land first).
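A pre-merge check along these lines could catch a placeholder pin. This is a minimal sketch, assuming the pin files are expected to hold a full 40-character commit SHA (as the previous value in the diff did); `is_valid_pin` is a hypothetical helper, not part of the DALI build scripts.

```python
import re

# A full Git SHA-1 commit id is 40 hexadecimal characters.
_SHA_RE = re.compile(r"[0-9a-f]{40}")


def is_valid_pin(content: str) -> bool:
    """Return True if the pin file content is a full commit SHA."""
    return _SHA_RE.fullmatch(content.strip()) is not None
```

Running it over the contents of `DALI_DEPS_VERSION` and `DALI_EXTRA_VERSION` would reject `ToDo` while accepting the old pin.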
13fd2c1 to 8375cb2
@greptile review
8375cb2 to 9389ee9
@greptile review
9389ee9 to d364075
@greptile review
| ++next_frame_idx_; | ||
| if (next_frame_idx_ >= NumFrames()) { | ||
| next_frame_idx_ = -1; | ||
| LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl; | ||
| } | ||
| return true; |
The new EOS guard calls `NumFrames()` unconditionally on every decoded frame, which can invoke `ParseNumFrames()` when no index is built and `nb_frames` is zero in the container. `ParseNumFrames()` reads all remaining demuxer packets to completion, so the very first frame's increment will exhaust the packet stream and cause all subsequent `av_read_frame` calls to return EOF, silently dropping every frame after the first. The existing guard in `ReadFlushFrame` has the same limitation (documented with a TODO), but that function runs only after the demuxer is already exhausted. The fix is to guard the check with `HasIndex()`, mirroring the `SeekFrame` condition added in this same PR.
| ++next_frame_idx_; | |
| if (next_frame_idx_ >= NumFrames()) { | |
| next_frame_idx_ = -1; | |
| LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl; | |
| } | |
| return true; | |
| ++next_frame_idx_; | |
| // TODO(awolant): Figure out how to handle this during index building | |
| // Or when NumFrames is unavailable | |
| if (HasIndex() && next_frame_idx_ >= NumFrames()) { | |
| next_frame_idx_ = -1; | |
| LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl; | |
| } | |
| return true; |
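The failure mode described in this comment can be reproduced with a toy model. The sketch below is not the DALI implementation: `ToyDecoder`, `num_frames`, and `decode_all` are hypothetical names, and the model assumes (as the comment states) that counting frames without an index drains the remaining packets.

```python
class ToyDecoder:
    """Toy stand-in for a frames decoder; not the DALI implementation."""

    def __init__(self, num_packets, has_index, guard_with_index):
        self.pending = num_packets            # packets left in the demuxer
        self.has_index = has_index
        self.guard_with_index = guard_with_index
        # Assumption: the frame count is only known up front with an index.
        self.known_count = num_packets if has_index else None
        self.next_frame_idx = 0

    def num_frames(self):
        """Assumed behavior: with no known count, drain the demuxer to count."""
        if self.known_count is None:
            self.known_count = self.next_frame_idx + self.pending
            self.pending = 0                  # side effect: stream exhausted
        return self.known_count

    def read_frame(self):
        if self.pending == 0:
            return False
        self.pending -= 1
        self.next_frame_idx += 1
        # The guarded variant only consults num_frames() when an index exists.
        may_check = self.has_index or not self.guard_with_index
        if may_check and self.next_frame_idx >= self.num_frames():
            self.next_frame_idx = -1
        return True


def decode_all(decoder):
    """Count the frames the toy decoder yields before it reports EOF."""
    count = 0
    while decoder.read_frame():
        count += 1
    return count
```

With five packets and no index, the unguarded variant yields a single frame because the first count drains the stream, while the variant guarded by the index check yields all five.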
| batch_size = 3 | ||
| pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=0) |
test_multichannel_fill_value hard-codes device_id=0 even though the test body uses fn.experimental.decoders.video which is a CPU/mixed operator; on a device-less CI machine this will fail at pipeline construction. Other tests in this PR were correctly updated to derive device_id from device, so this one was apparently missed.
| batch_size = 3 | |
| pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=0) | |
| batch_size = 3 | |
| device_id = None if device == "cpu" else 0 | |
| pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=device_id) |
CI MESSAGE: [51735694]: BUILD STARTED
CI MESSAGE: [51735694]: BUILD FAILED
a8f64cc to fba22e3
CI MESSAGE: [51770028]: BUILD STARTED
| # test overflow of frame_buffer_ | ||
| filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4") | ||
| filenames = filter(lambda filename: "mpeg4" not in filename, filenames) | ||
| filenames = filter(lambda filename: "hevc" not in filename, filenames) | ||
| filenames = filter(lambda filename: "av1" not in filename, filenames) | ||
| if device == "cpu": | ||
| # some formats are not yet supported in the CPU operator itself | ||
| filenames = filter(lambda filename: "mpeg4" not in filename, filenames) | ||
| filenames = filter( | ||
| lambda filename: "test_1.mp4" not in filename and "test_2.mp4" not in filename, | ||
| filenames, | ||
| ) | ||
| filenames = cycle(filenames) |
cfr_test.mp4 is H264 and is not filtered for CPU.
The DALI_extra README.rst shows cfr_test.mp4 is generated with -c:v libx264, so it's H264. This file is appended to filenames before the CPU-conditional filters run. The CPU block filters mpeg4, test_1.mp4, and test_2.mp4, but cfr_test.mp4 slips through. When the CPU decoder encounters it, SelectVideoStream issues a DALI_WARN and returns false, causing a RuntimeError that fails the test.
The fix is to guard the filenames.append call (or add "cfr_test" not in filename to the CPU filter) so that the H264-specific overflow fixture is only used when the device can decode H264.
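One way to express the second option (adding the overflow fixture to the CPU filter) is sketched below. `select_fixtures` and `CPU_UNSUPPORTED_MARKERS` are hypothetical names; the sketch also assumes the `hevc` and `av1` filters apply to every device, which the diff suggests but does not state outright.

```python
# Hypothetical name fragments for fixtures the CPU decoder cannot handle.
CPU_UNSUPPORTED_MARKERS = ("mpeg4", "test_1.mp4", "test_2.mp4", "cfr_test")


def select_fixtures(filenames, device, overflow_fixture=None):
    """Return the fixtures usable on `device`.

    `overflow_fixture` (for example the H264 cfr_test.mp4) is appended first
    and is then subject to the same CPU filter as every other file.
    """
    selected = list(filenames)
    if overflow_fixture is not None:
        selected.append(overflow_fixture)
    # Assumed to apply to every device, as in the reviewed snippet.
    selected = [f for f in selected if "hevc" not in f and "av1" not in f]
    if device == "cpu":
        selected = [
            f for f in selected
            if not any(marker in f for marker in CPU_UNSUPPORTED_MARKERS)
        ]
    return selected
```

Because the overflow fixture goes through the same filter, the order of the append and the filter no longer matters.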
fba22e3 to c87a527
CI MESSAGE: [51774937]: BUILD STARTED
| if device == "gpu": | ||
| filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4") |
[Bug] Dead condition: `cfr_test.mp4` is never appended for the "mixed" device.
test_multi_gpu_video is decorated with @params("cpu", "mixed"), so device is never "gpu". The frame-buffer overflow fixture (cfr_test.mp4) is silently skipped for the "mixed" case, which was the very path the file was added to stress. The condition should be device == "mixed" (or device != "cpu").
| if device == "gpu": | |
| filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4") | |
| if device == "mixed": | |
| filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4") |
Restrict the CPU frames decoder to codecs supported by the currently
compiled libavcodec configuration. H264 and HEVC are no longer
advertised for the CPU variant while VP8, VP9, and MJPEG remain
enabled.
Make `ReadRegularFrame` mark end-of-stream by setting `next_frame_idx_`
to -1 when the index reaches `NumFrames()`, mirroring the existing
guard in `ReadFlushFrame`. Without this, codecs with no decoder latency
(VP9 on the new test inputs) deliver the final frame via the regular
path, leaving `next_frame_idx_` at `NumFrames()` and causing
`VideoInput` depletion to be reported one batch late.
Reset the decoder when an indexed next frame falls outside the valid
range, avoiding reuse of an invalid decoder position.
Update video decoder tests to expect CPU failures for unsupported
codecs instead of skipping only MPEG4. Use VP9 CFR/VFR test inputs
and device-less CPU pipelines where appropriate. Point the CFR/VFR
reference frame folders at `frames_{1,2}_vp9/` so CPU decode of the
new VP9 fixtures matches at the existing eps=10 tolerance. Drop the
CPU HEVC frames-decoder tests (`ConstantFrameRateHevc`,
`VariableFrameRateHevc`, `VariableFrameRateHevcNoIndex`) — HEVC is no
longer in the CPU codec allow-list.
Tolerate up to 16 isolated subpixel deviations exceeding eps in
`TestVideo::CompareFrame` (out of ~2.7M subpixels per frame). The CPU
VP9 decode path occasionally produces a single byte that differs by
~32 — a SIMD glitch inside libavcodec/sws_scale that Valgrind cannot
instrument. The budget is orders of magnitude below what any genuine
regression would produce, so test sensitivity is preserved.
In `dali/test/python/input/test_video.py`, filter out h264 from the
round-robin fixture (the unsuffixed `test_{1,2}.mp4` in `cfr/`/`vfr/`
are h264) and restrict `test_video_input_audio_stream` to the mixed
backend — the only DALI_extra video with an audio stream is h264.
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
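The off-by-one described in this commit message can be illustrated with a toy model of decoder latency. This is a sketch under stated assumptions, not the DALI code: `final_index` is a hypothetical function, and "latency" here simply means the number of frames the decoder holds back until flush.

```python
def final_index(num_frames, latency, regular_guard):
    """Toy model: return next_frame_idx after the last frame is delivered.

    `latency` is the number of frames the decoder holds back; those frames
    are delivered by the flush path, which already wraps the index to -1.
    """
    next_frame_idx = 0
    buffered = 0
    # Regular path: one packet in, one frame out once the latency is filled.
    for _ in range(num_frames):
        buffered += 1
        if buffered > latency:
            buffered -= 1
            next_frame_idx += 1
            if regular_guard and next_frame_idx >= num_frames:
                next_frame_idx = -1
    # Flush path: drain held-back frames, with the pre-existing guard.
    for _ in range(buffered):
        next_frame_idx += 1
        if next_frame_idx >= num_frames:
            next_frame_idx = -1
    return next_frame_idx
```

With zero latency and no guard on the regular path, the index stays at `num_frames` instead of -1; any nonzero latency routes the tail through the flush path, which is how H264 could hide the problem.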
c87a527 to 4e1a23c
CI MESSAGE: [51777473]: BUILD STARTED
CI MESSAGE: [51777473]: BUILD FAILED
Limit CPU video decoder codec support
Category:
Bug fix (non-breaking change which fixes an issue)
Description:
Restricts the CPU video frames decoder to the codecs supported by the
currently compiled libavcodec configuration. H264 and HEVC are no longer
advertised for the CPU variant, while VP8, VP9, and MJPEG remain enabled.
`ReadRegularFrame` now mirrors `ReadFlushFrame` and signals end-of-stream
by setting `next_frame_idx_` to -1 once `NumFrames()` is reached, so
codecs with no decoder latency report depletion immediately instead of
one batch late.
Resets the decoder when an indexed next frame falls outside the valid
range, avoiding reuse of an invalid decoder position.
The video decoder tests now expect CPU failures for unsupported codecs
instead of skipping only MPEG4. The affected CFR/VFR test inputs are
switched to VP9 variants, and CPU pipelines use `device_id=None` where
appropriate. The CFR/VFR reference frame folders are repointed at the
new VP9-derived `frames_{1,2}_vp9/` so CPU decode matches at the
existing eps=10 tolerance. CPU HEVC frames-decoder tests are removed.
`TestVideo::CompareFrame` now tolerates up to 16 isolated subpixel
deviations exceeding eps per frame (out of ~2.7M). The CPU VP9 decode
path occasionally produces a single byte that differs by ~32 — a SIMD
glitch inside libavcodec/sws_scale that Valgrind cannot instrument.
The budget is orders of magnitude below any genuine regression, so
test sensitivity is preserved.
`dali/test/python/input/test_video.py` filters out h264 from the
round-robin fixture and restricts `test_video_input_audio_stream` to
the mixed backend: the only DALI_extra video with an audio stream is
h264, which CPU can no longer decode.
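The deviation budget described above for `CompareFrame` can be sketched as follows. This is a serial simplification, assuming frames are flat sequences of 0..255 subpixel values; the actual test counts bad subpixels per thread, and `frames_match` is a hypothetical name.

```python
def frames_match(frame, reference, eps=10, max_bad=16):
    """Return True if at most `max_bad` subpixels differ by more than `eps`.

    `frame` and `reference` are equal-length sequences of 0..255 values.
    """
    if len(frame) != len(reference):
        return False
    bad = sum(1 for a, b in zip(frame, reference) if abs(a - b) > eps)
    return bad <= max_bad
```

A single byte off by 32 stays within the budget, while a uniform shift beyond eps exceeds it immediately, which is the distinction the description relies on.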
Additional information:
Affected modules and functionalities:
`ReadRegularFrame` end-of-stream signalling.
range).
relaxed `CompareFrame` tolerance.
`dali/test/python/input/test_video.py`: h264 fixture filter and
audio-stream test backend restriction.
Key points relevant for the review:
`DALI_DEPS_VERSION` and `DALI_EXTRA_VERSION` are temporary `ToDo`
placeholders until the corresponding `dali_deps` and `dali_extra`
repository changes merge.
instead of being skipped for a subset of codecs.
`ReadRegularFrame` EOS guard is required for `VideoInput` depletion to
fire on the right batch with VP9 inputs (h264 hid the off-by-one
through its decoder latency, which routed the tail through
`ReadFlushFrame`).
`CompareFrame` is a flake mitigation, not a tolerance loosening: a real
codec/colorspace bug would touch thousands of subpixels.
Tests:
Not run locally.
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A