port(libvmaf/feature/x86): AVX2/AVX-512 portability fixes (Netflix PR #1475) #48

Closed
lusoris wants to merge 3 commits into master from port/upstream-pr-1475-avx2-portability

lusoris commented Apr 18, 2026

Summary

Cherry-picks Netflix PR #1475
("AVX2: replace non-portable vector indexing & SIMD casts with intrinsic
extraction/reinterpretation") onto fork master.

Two upstream commits land:

  • c437f43d — replaces GCC vector subscripting (rN[0] + rN[1]) with
    _mm_extract_epi64 / _mm256_extract_epi64 across adm_avx2.c,
    adm_avx512.c, motion_avx2.c, motion_avx512.c. The vector-subscript
    syntax is a GCC extension; MSVC and stricter compilers reject it.
  • b4fcd21b — replaces a stray C-style (__m256i) cast on a __m256
    operand in adm_avx2.c with _mm256_castps_si256.
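The two substitutions can be sketched in isolation as follows. This is a minimal illustration, not code from adm_avx2.c or motion_avx2.c: the helper names `sum_lanes_128` and `select_below` and the `TARGET_AVX2` macro are hypothetical, and the comparison predicate is chosen only to produce a mask to reinterpret.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* GCC/Clang need the ISA enabled per function when the file is not built
 * with -mavx2; other compilers get an empty macro. (Illustrative only.) */
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX2 __attribute__((target("avx2")))
#else
#define TARGET_AVX2
#endif

/* Hypothetical helper: sum the two 64-bit lanes of a 128-bit vector. */
TARGET_AVX2
static int64_t sum_lanes_128(const int64_t lanes[2])
{
    __m128i r = _mm_loadu_si128((const __m128i *)lanes);
    /* GNU vector extension, rejected by MSVC:  return r[0] + r[1]; */
    return (int64_t)_mm_extract_epi64(r, 0) +
           (int64_t)_mm_extract_epi64(r, 1);
}

/* Hypothetical helper: keep vals[i] where keys[i] < limit, else write 0. */
TARGET_AVX2
static void select_below(const float keys[8], const int32_t vals[8],
                         float limit, int32_t out[8])
{
    __m256 k = _mm256_loadu_ps(keys);
    __m256 mask_ps = _mm256_cmp_ps(k, _mm256_set1_ps(limit), _CMP_LT_OS);
    /* C-style vector cast, rejected by MSVC:  (__m256i)mask_ps */
    __m256i mask = _mm256_castps_si256(mask_ps); /* bit reinterpretation */
    __m256i v = _mm256_loadu_si256((const __m256i *)vals);
    _mm256_storeu_si256((__m256i *)out, _mm256_and_si256(v, mask));
}
```

Both helpers take plain arrays rather than vector arguments so that callers compiled without AVX2 flags never pass `__m256` values across a function boundary.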

The third upstream commit (50e772a1, cosmetic indentation fix)
collapsed to a no-op against our existing clang-format-22 wrapped
formatting and was skipped.

Ten conflicts arose from the fork's own modifications to these same files
(rebase-notes entries 0001 and 0013); all were resolved by keeping our
wrapped formatting and adopting upstream's portable intrinsic
substitutions.

Type

  • port — cherry-pick from upstream Netflix/vmaf

Checklist

  • Commits follow Conventional Commits (PR will squash to a single
    port(...) subject; the two cherry-picks preserve upstream attribution
    to David C. Manuelda).
  • Unit tests pass: meson test -C libvmaf/build → 27/27 OK.
  • SIMD-only port; no GPU paths touched. Cross-backend ULP unchanged.
  • No new files; no license-header concerns.

Netflix golden-data gate

  • Did not modify any assertAlmostEqual(...) score in the
    Netflix golden Python tests.
  • Netflix golden VMAF mean unchanged from entry 0013 baseline
    (76.66783).

Deep-dive deliverables

  • Research digest — no digest needed: trivial portability
    substitution; intrinsics-equivalence is documented by Intel.
  • Decision matrix — no alternatives: only-one-way fix
    (substitute the documented portable intrinsic for the GCC extension).
  • AGENTS.md invariant note — covered by the existing
    libvmaf/src/feature/x86/AGENTS.md direction "match upstream byte-for-byte
    modulo our wrapped formatting"; the new invariant is captured in
    rebase-note 0016.
  • Reproducer / smoke-test command — see below.
  • CHANGELOG.md "lusoris fork" entry — added under
    ### Changed ("Upstream port -- AVX2/AVX-512 portability").
  • Rebase note: docs/rebase-notes.md entry 0016 (numbering may
    shift to 0014/0015/0017 depending on the merge order of PR #46
    "fix(ci): coverage gate lcov→gcovr + ORT + lint upstream tests in-tree"
    and PR #47 "fix(libvmaf/feature): free VIF init base pointer on fail
    path (Netflix PR #1476, leak half)"; will rebase if needed).

Reproducer

```bash
# Rebuild and run the unit tests.
ninja -C libvmaf/build && meson test -C libvmaf/build

# Score the reference/distorted pair.
libvmaf/build/tools/vmaf -r python/test/resource/yuv/src01_hrc00_576x324.yuv \
    -d python/test/resource/yuv/src01_hrc01_576x324.yuv \
    -w 576 -h 324 -p 420 -b 8 \
    --model version=vmaf_v0.6.1 -o /tmp/vmaf-1475-port.json

# Print the pooled VMAF metric line from the report.
grep -E '<metric name="vmaf"' /tmp/vmaf-1475-port.json
```

Expected: VMAF mean unchanged from entry 0013 (~76.66783).

Known follow-ups

  • The cosmetic upstream commit 50e772a1 was skipped as a no-op against
    our clang-format-22 pass; re-running upstream sync should not surface
    it again.

🤖 Generated with Claude Code

StormBytePP and others added 3 commits April 18, 2026 16:24
Adds the deep-dive deliverables for the Netflix PR #1475 port (cherry-picks
c51a73d and 456eaf1 on this branch):

- docs/rebase-notes.md entry 0016 documents the invariant (don't revert
  _mm_extract_epi64 to vector subscripting; don't revert
  _mm256_castps_si256 to a C-style cast on __m256), notes the
  collision-free coexistence with entries 0001 (ADM double-precision
  reduction) and 0013 (motion files mirror upstream), and provides a
  reproducer.
- CHANGELOG.md "Upstream port -- AVX2/AVX-512 portability" entry under
  the Unreleased section.
- Re-runs clang-format on adm_avx2.c after the upstream cast-fix
  commit landed (the upstream patch added a single long line that the
  fork's clang-format-22 wraps).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

lusoris commented Apr 19, 2026

Closing as fully superseded by the MSVC Windows portability sweep already on master (rebase-notes 0022, commits ddb58bdb, 555de409, and siblings).

Verified by git diff origin/master..port/upstream-pr-1475-avx2-portability -- libvmaf/src/feature/x86/:

  • Master already has the _mm256_castps_si256(_mm256_cmp_ps(...)) migration in adm_avx2.c, adm_avx512.c, motion_avx2.c, motion_avx512.c.
  • Master already has _mm_extract_epi64() replacing GCC vector-indexing in the same files.
  • Master additionally has refinements this PR does not carry: explicit (int64_t) / (uint64_t) casts on the extract-and-sum expressions, #include "feature/compat_builtin.h", and an explanatory comment about the GNU vector-extension cast that MSVC rejects.
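The explicit-cast refinement can be illustrated with a small extract-and-sum sketch. This is a hedged example under stated assumptions, not the code on master: `hsum_epu64` and `TARGET_AVX2` are hypothetical names, the contents of feature/compat_builtin.h are not reproduced here, and the reading of why the casts help (the extract intrinsic yields a signed 64-bit value, so the cast states the intended signedness of the sum) is an interpretation rather than something the comment above spells out.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Per-function ISA enablement for GCC/Clang builds without -mavx2;
 * empty elsewhere. (Illustrative only.) */
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX2 __attribute__((target("avx2")))
#else
#define TARGET_AVX2
#endif

/* Hypothetical helper: horizontal sum of four 64-bit lanes, accumulated
 * as unsigned so that wrap-around is well defined in C. */
TARGET_AVX2
static uint64_t hsum_epu64(const uint64_t lanes[4])
{
    __m256i r = _mm256_loadu_si256((const __m256i *)lanes);
    /* GNU vector extension form:  r[0] + r[1] + r[2] + r[3]
     * Portable form: extract each lane, cast explicitly, then add. */
    return (uint64_t)_mm256_extract_epi64(r, 0) +
           (uint64_t)_mm256_extract_epi64(r, 1) +
           (uint64_t)_mm256_extract_epi64(r, 2) +
           (uint64_t)_mm256_extract_epi64(r, 3);
}
```

Swapping `uint64_t` for `int64_t` gives the signed variant; the lane indices must be compile-time constants in either case.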

Merging this PR would revert those refinements. Closing in favour of the master state. No code intent from upstream Netflix PR #1475 is lost.

lusoris closed this Apr 19, 2026