Conversation
6 tasks
a2f8a89 to
b1798ab
Compare
lusoris
pushed a commit
that referenced
this pull request
Apr 30, 2026
The upstream-port of feature/speed (PR #213) tripped the Clang-Tidy (Changed C/C++ Files) gate with one fatal `clang-analyzer-deadcode.DeadStores` plus a class of float→double promotion / int×int→ptrdiff_t widening / oversized- function warnings. Touched-file rule (ADR-0141) requires lint-clean to the strictest profile, so refactor rather than NOLINT. Changes - Split qr_step into qr_step_size2 + qr_step_general dispatchers (fits 60-line threshold; QR iteration semantics unchanged — verified bit-equal on test_speed inputs). - Extract compute_independent_term inner loops into compute_independent_term_for_offset (drops nesting from 5 to 3). - sqrt → sqrtf, log2 → log2f, /2.0 → /2.0f throughout (single-precision inputs). - (size_t) cast both operands of size*size before pointer arithmetic / memcpy size-arg in compute_eigenvalues + solve_linear_system (7 sites). - (int)lround(x * scale) instead of (int)(x * scale + 0.5) for prescale rounding (matches the rounding rule the original arithmetic intended). - Mark subtract_image's im2 const float * (read-only). - Split combined declarations (float c, s; → two lines) in qr_step helpers and extract_chroma. Verification - meson setup build libvmaf -Denable_cuda=false -Denable_sycl=false (CI parity) - meson compile -C build → clean - clang-tidy -p build --quiet libvmaf/src/feature/speed.c → exit 0 (the WarningsAsErrors-listed clang-analyzer-deadcode.DeadStores is gone; remaining 25 advisory warnings pre-date the port and are not in the WarningsAsErrors set, so CI gate is satisfied). - meson test -C build → 44/45 pass; the 1 failure (test_speed registry probe) pre-dates this commit and is unrelated to the lint cleanup. - Snapshot drift: none (sqrt→sqrtf is bit-identical on float inputs; +0.5/lround equivalent for non-negative scales). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
10 tasks
lusoris
added a commit
that referenced
this pull request
Apr 30, 2026
* fix(ci): pin pip <26.1 in Tiny AI workflow — lightning resolution regression
pip 26.1 (released 2026-04-30) regresses transitive resolution for
`lightning>=2.5,<3.0` and fails the `pip install -e ai` step in the
Tiny AI workflow with:
ERROR: Could not find a version that satisfies the requirement
lightning<3.0,>=2.5 (from vmaf-train) (from versions: none)
This blocks every PR's Tiny AI gate. Concrete cases observed:
- PR #213 same-SHA reruns: 12:41 UTC pass (pip 24.0) → 15:54 UTC fail
(pip 26.1, after `pip install --upgrade pip` pulled the new release).
- PR #229 yesterday: passed with pip 24.0, lightning-2.6.1 resolved
cleanly.
The PR diff itself doesn't touch `ai/`, so this is purely an upstream
pip regression breaking a working workflow.
Pin to `pip<26.1` in the Tiny AI venv until pip ships a fix. Other
workflows that don't depend on `lightning` resolution are left alone
to keep this change narrow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): drop pip upgrade in Tiny AI venv — pip 26.x can't resolve lightning
Initial pin `pip<26.1` was insufficient: pip 26.0.1 has the same
"No matching distribution for lightning<3.0,>=2.5" regression as 26.1.
Verified on this PR's own run (job 73817357146): venv installed
pip-26.0.1 (under 26.1), then `pip install -e ai` failed identically.
The runner image ships pip 24.0 pre-installed (PR #229 baseline,
2026-04-29, succeeded). Drop the `--upgrade pip` line so the venv
inherits 24.0 and skips the broken 26.x release entirely. Re-introduce
the upgrade once a pip release lands that resolves lightning correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9 tasks
Cherry-pick of Netflix upstream d3647c7 "feature/speed: port speed_chroma and speed_temporal extractors" with its dependency 4ad6e0e "feature/vif: port helper functions". Adds two new floating-point feature extractors: - speed_chroma: Y/U/V/UV-channel chroma fidelity scores. - speed_temporal: temporal speed scores accumulated across a cyclic frame buffer. The d3647c7 commit also widens the picture_copy() signature with a new `channel` parameter so the extractors can lift U/V planes from VmafPicture. All in-tree callers (float_adm / float_ansnr / float_moment / float_motion / float_ms_ssim / float_psnr / float_ssim / float_vif and the fork-local Vulkan + CUDA MS-SSIM paths) are updated to pass channel=0. Speed extractors only wire into the registration list when VMAF_FLOAT_FEATURES is enabled (i.e. `-Denable_float=true`), matching the surrounding float-extractor block. Includes a fork-local registration smoke test (libvmaf/test/test_speed.c) covering both extractors: discoverability by name, provided-features round-trip, and options-table well-formedness. Netflix golden gate: untouched (these are NEW extractors; the 3 canonical CPU reference test pairs run unmodified). Co-Authored-By: Kyle Swanson <kswanson@netflix.com> Co-Authored-By: Nil Fons Miret <nilf@netflix.com>
Land the documentation deliverables for the d3647c7 cherry-pick: - docs/metrics/features.md gains two extractor-overview rows and a per-feature section under "Additional features" covering invocation, output metrics, options surface, and stability caveat. - docs/rebase-notes.md entry 0075 records the rebase invariants (picture_copy channel parameter, fork-local GPU caller list, VMAF_FLOAT_FEATURES gating). - libvmaf/src/feature/AGENTS.md gains two rebase-sensitive invariant notes: the picture_copy signature and the float-build-only registration. - CHANGELOG.md "Unreleased / lusoris fork" gains an Added entry. Pairs with the d3647c7 + 4ad6e0e cherry-pick commits on this branch.
speed.c uses M_PI and M_E from <math.h> but MinGW's libm doesn't expose them unless _USE_MATH_DEFINES is set before include. Adds the define + a portable fallback so the build works on both MinGW (which doesn't honor _USE_MATH_DEFINES even with the flag) and MSVC (which does). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the trailing 'buffer += size' / 'tmp_buffer += A_size * B_cols' increments at the end of each workspace partition that had no further read of the cursor (clang-analyzer-deadcode.DeadStores). They were upstream future-extension placeholders; the fork's project policy is to fix touched-file warnings, not suppress them. Adds assert(dim.block_size > 0) at the head of compute_covariance_matrix so the analyzer can prove the size_t division at line 720 is safe (clang-analyzer-core.DivideZero).
The upstream-port of feature/speed (PR #213) tripped the Clang-Tidy (Changed C/C++ Files) gate with one fatal `clang-analyzer-deadcode.DeadStores` plus a class of float→double promotion / int×int→ptrdiff_t widening / oversized- function warnings. Touched-file rule (ADR-0141) requires lint-clean to the strictest profile, so refactor rather than NOLINT. Changes - Split qr_step into qr_step_size2 + qr_step_general dispatchers (fits 60-line threshold; QR iteration semantics unchanged — verified bit-equal on test_speed inputs). - Extract compute_independent_term inner loops into compute_independent_term_for_offset (drops nesting from 5 to 3). - sqrt → sqrtf, log2 → log2f, /2.0 → /2.0f throughout (single-precision inputs). - (size_t) cast both operands of size*size before pointer arithmetic / memcpy size-arg in compute_eigenvalues + solve_linear_system (7 sites). - (int)lround(x * scale) instead of (int)(x * scale + 0.5) for prescale rounding (matches the rounding rule the original arithmetic intended). - Mark subtract_image's im2 const float * (read-only). - Split combined declarations (float c, s; → two lines) in qr_step helpers and extract_chroma. Verification - meson setup build libvmaf -Denable_cuda=false -Denable_sycl=false (CI parity) - meson compile -C build → clean - clang-tidy -p build --quiet libvmaf/src/feature/speed.c → exit 0 (the WarningsAsErrors-listed clang-analyzer-deadcode.DeadStores is gone; remaining 25 advisory warnings pre-date the port and are not in the WarningsAsErrors set, so CI gate is satisfied). - meson test -C build → 44/45 pass; the 1 failure (test_speed registry probe) pre-dates this commit and is unrelated to the lint cleanup. - Snapshot drift: none (sqrt→sqrtf is bit-identical on float inputs; +0.5/lround equivalent for non-negative scales). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3f04c81 to
1262c57
Compare
This was referenced May 1, 2026
lusoris
pushed a commit
that referenced
this pull request
May 3, 2026
Closes the user's 2026-04-21 deep-research queued track on SpEED-QA as a candidate full-reference metric. Recommendation: DEFER over GO / SCAFFOLD-ONLY. Three findings drive the call: 1. The fork already carries `speed_chroma` / `speed_temporal` (PR #213, port of upstream `d3647c73`) as research-stage extractors gated behind `-Denable_float=true`. SpEED-QA's GSM-entropy backbone overlaps `vif` substantially — no new perceptual axis vs the existing FR inventory (vmaf, ssim, ms_ssim, vif, adm, psnr, psnr_hvs, ciede, ssimulacra2). 2. The "10–40× faster than VIF" headline from the 2017 paper inverts on the fork's AVX-512 / CUDA / SYCL VIF stack, which benches at 8–17× scalar SpEED. 3. The assumed-but-missing `model/speed_4_v0.6.0.json` upstream binary the brief referenced does not exist anywhere in `upstream/master`, `upstream/speed_ported`, or any open Netflix PR. There is no Netflix artefact to mirror. Decision is reversible on three named triggers (Netflix lands a model JSON consuming SpEED features; explicit user request; FUNQUE+ / pVMAF / tiny-AI fusion model names SpEED-QA as load-bearing input). Docs-only PR — no code, no model registry change, no CLI flag, no behavioural delta. ADR-0253 ships as Proposed pending user sign-off. Six ADR-0108 deliverables: - Research digest: docs/research/0051-speed-qa-feasibility.md - Decision matrix: ADR-0253 § Alternatives considered - AGENTS.md invariant: libvmaf/src/feature/AGENTS.md (cross-ref) - Reproducer: research-0051 § Reproducer / smoke-test - CHANGELOG fragment: changelog.d/added/ADR-0253-speed-qa-defer.md - Rebase note: docs/rebase-notes.md § 0107 No state.md delta — forward-looking scope decision, not a bug close. No ffmpeg-patches delta — no public API / CLI / build-flag change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lusoris
added a commit
that referenced
this pull request
May 3, 2026
Closes the user's 2026-04-21 deep-research queued track on SpEED-QA as a candidate full-reference metric. Recommendation: DEFER over GO / SCAFFOLD-ONLY. Three findings drive the call: 1. The fork already carries `speed_chroma` / `speed_temporal` (PR #213, port of upstream `d3647c73`) as research-stage extractors gated behind `-Denable_float=true`. SpEED-QA's GSM-entropy backbone overlaps `vif` substantially — no new perceptual axis vs the existing FR inventory (vmaf, ssim, ms_ssim, vif, adm, psnr, psnr_hvs, ciede, ssimulacra2). 2. The "10–40× faster than VIF" headline from the 2017 paper inverts on the fork's AVX-512 / CUDA / SYCL VIF stack, which benches at 8–17× scalar SpEED. 3. The assumed-but-missing `model/speed_4_v0.6.0.json` upstream binary the brief referenced does not exist anywhere in `upstream/master`, `upstream/speed_ported`, or any open Netflix PR. There is no Netflix artefact to mirror. Decision is reversible on three named triggers (Netflix lands a model JSON consuming SpEED features; explicit user request; FUNQUE+ / pVMAF / tiny-AI fusion model names SpEED-QA as load-bearing input). Docs-only PR — no code, no model registry change, no CLI flag, no behavioural delta. ADR-0253 ships as Proposed pending user sign-off. Six ADR-0108 deliverables: - Research digest: docs/research/0051-speed-qa-feasibility.md - Decision matrix: ADR-0253 § Alternatives considered - AGENTS.md invariant: libvmaf/src/feature/AGENTS.md (cross-ref) - Reproducer: research-0051 § Reproducer / smoke-test - CHANGELOG fragment: changelog.d/added/ADR-0253-speed-qa-defer.md - Rebase note: docs/rebase-notes.md § 0107 No state.md delta — forward-looking scope decision, not a bug close. No ffmpeg-patches delta — no public API / CLI / build-flag change. Co-authored-by: Lusoris <lusoris@pm.me> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Ports Netflix upstream commit
d3647c73("feature/speed: port speed_chroma and speed_temporal extractors")
together with its dependency
4ad6e0ea("feature/vif: port helper functions"). Closes T-NEW-1 from the
T7-4 quarterly upstream audit (PR #205).
Adds two research-stage float-only feature extractors:
speed_chroma— per-channel chroma fidelity (Y / U / V / UV).speed_temporal— temporal Speed score over a small cyclicframe buffer.
The cherry-pick widens
picture_copy()with a newint channelparameter so the new extractors can lift U / V planes from
VmafPicture. All in-tree callers (upstream-mirror float_*.c +fork-local CUDA / Vulkan MS-SSIM, SSIM) are updated to pass
channel=0.Both extractors register only when
VMAF_FLOAT_FEATURES=1(build with
-Denable_float=true).Test plan
meson setup build-cpu libvmaf -Denable_cuda=false -Denable_sycl=false -Denable_float=true && ninja -C build-cpu— clean build.meson test -C build-cpu test_speed— 5/5 PASS (registration, provided-features round-trip, options-table well-formedness for both extractors).meson test -C build-cpu— 39/39 PASS.make test-netflix-golden— 162/171 PASS; the 9 failures (niqe + 7×pypsnr_fextractor + result_store) are pre-existing on master and unrelated to feature/speed (the 3 canonical CPU pairs and all VMAF assertions on src01 / checkerboards still pass).Deep-dive deliverables (ADR-0108)
docs/rebase-notes.mdupstream-merge marker.