docs(research): T7-9 — Intel AI-PC NPU/EP applicability digest by lusoris · Pull Request #194 · lusoris/vmaf

lusoris · 2026-04-29T09:03:45Z

Summary

Backlog item T7-9: research digest evaluating whether the tiny-AI surface should add first-class support for Intel AI-PC platforms (Meteor / Lunar / Arrow Lake — NPU + integrated Xe/Xe2 GPU).

Verdict: defer the NPU path until a maintainer has hardware to validate int8 + fp16 accuracy gates against Research-0006's PTQ pipeline. The integrated Xe GPU portion of an AI-PC platform is already reachable today through the existing --tiny-device openvino path (same code path the Arc A380 uses), so the iGPU surface costs the fork zero additional code; only the NPU device type is genuinely new surface and is the part deferred.

Re-evaluation triggers documented in the digest §5: hardware acquisition, explicit user request, or a dedicated ORT NPU EP shipping.

Note: the Intel developer overview URL was unreachable from this session's WebFetch sandbox; the digest is therefore explicitly flagged as a training-context summary plus in-tree fork-doc anchors, not freshly fetched citations. All vendor claims should be re-verified before any code lands on the back of the digest.
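As a rough illustration of the verdict above, the sketch below maps an AI-PC compute surface to an ONNX Runtime provider list. It assumes the fork's openvino path goes through ONNX Runtime's OpenVINO execution provider, which this PR does not confirm. The function name `providers_for_surface` and the surface labels `"igpu"` / `"npu"` are hypothetical, and the `device_type` string should be re-verified against current vendor documentation, in line with the note above.

```python
def providers_for_surface(surface):
    """Return an ONNX Runtime provider list for a (hypothetical) surface label."""
    if surface == "igpu":
        # Integrated Xe / Xe2 GPU: assumed to reuse the OpenVINO GPU device,
        # i.e. the same route a discrete Arc card would take.
        return [
            ("OpenVINOExecutionProvider", {"device_type": "GPU"}),
            "CPUExecutionProvider",  # fallback if the OpenVINO EP is unavailable
        ]
    if surface == "npu":
        # The digest defers this path until accuracy gates can be validated
        # on real hardware, so the sketch refuses rather than guessing.
        raise NotImplementedError("NPU path deferred (Research-0031 verdict)")
    raise ValueError(f"unknown surface: {surface!r}")
```

The returned list has the shape accepted by the `providers` argument of `onnxruntime.InferenceSession`; whether the fork's CLI builds it this way is an assumption.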

Files

docs/research/0031-intel-ai-pc-applicability.md (new, 5-section digest per the BACKLOG row spec)
docs/research/README.md (index row)
docs/ai/inference.md (one-line forward-pointer in the EP matrix so readers find the digest)
CHANGELOG.md (lusoris fork entry)

ADR-0108 deep-dive 6-key checklist

research digest — docs/research/0031-intel-ai-pc-applicability.md
decision matrix in ADR ## Alternatives considered — no ADR needed: doc-only, verdict is defer. The digest's own ## Alternatives explored section carries the equivalent decision matrix.
AGENTS.md invariant note — no rebase-sensitive invariants: doc-only, no source code touched.
reproducer / smoke-test command — make format-check && pre-commit run --files docs/research/0031-intel-ai-pc-applicability.md docs/research/README.md docs/ai/inference.md CHANGELOG.md (already passes locally).
CHANGELOG.md lusoris-fork entry — under ## [Unreleased] — lusoris fork.
docs/rebase-notes.md entry — no rebase impact: doc-only changes under docs/, no upstream-mirrored code touched.

Test plan

Pre-commit hooks pass on touched files (trim whitespace, EOF, merge-conflict, mixed line ending, secrets, conventional-commit).
No source files modified — Netflix golden gate not exercised, but cannot regress.
Reviewer eyeballs the verdict (defer) and confirms the re-evaluation triggers in §5 are reasonable.

Research-0031 evaluates whether the tiny-AI surface should add first-class NPU support for Intel Meteor / Lunar / Arrow Lake AI-PC platforms.

Verdict: defer the NPU path — no maintainer hardware available to validate int8 + fp16 accuracy gates against Research-0006's PTQ pipeline. The integrated Xe / Xe2 GPU portion of an AI-PC platform is already reachable today through the existing --tiny-device openvino path (same code path the Arc A380 uses), so the iGPU surface costs zero new code; only the NPU device type is genuinely new surface and is the part that's deferred.

Closes backlog T7-9. Doc-only — no C/Python source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…h T7-9

T7-9 (#194, just merged) shipped Research-0031 (Intel AI-PC NPU applicability digest). This PR's cambi-vulkan-integration digest was independently numbered 0031 by the agent that drafted it. Renumber to 0032 to keep the one-number-per-digest invariant.

References updated: filename, in-body title, ADR-0210 cross-link, ADR-0210 README index row, CHANGELOG.md, docs/rebase-notes.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
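The one-number-per-digest invariant mentioned in this commit message could be checked mechanically. The following is a minimal sketch, assuming digest files follow an `NNNN-slug.md` naming pattern as the two filenames in this thread suggest; the function name `find_number_collisions` is hypothetical and nothing here is confirmed to exist in the repository.

```python
import re
from collections import defaultdict

# Assumed naming scheme: four-digit number, hyphen, slug, ".md".
DIGEST_RE = re.compile(r"^(\d{4})-.+\.md$")


def find_number_collisions(filenames):
    """Group digest filenames by numeric prefix and return duplicated numbers."""
    by_number = defaultdict(list)
    for name in filenames:
        match = DIGEST_RE.match(name)
        if match:  # files such as README.md are skipped
            by_number[match.group(1)].append(name)
    return {
        number: sorted(names)
        for number, names in by_number.items()
        if len(names) > 1
    }
```

Fed the two digest filenames from this thread before the renumber, the function reports `0031` as a collision; after the cambi digest moves to `0032` it returns an empty dict.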

* feat(vulkan): T7-36 — cambi Vulkan integration (Strategy II)

Closes the GPU long-tail matrix terminus (per ADR-0192 + ADR-0205). Replaces the spike scaffold's `init_stub`/`extract_stub`/`close_stub` triple in `libvmaf/src/feature/vulkan/cambi_vulkan.c` with the full Vulkan-aware lifecycle. After this PR every registered feature extractor in the fork has at least one GPU twin (lpips remains via ORT EPs per ADR-0022).

Strategy II hybrid (per ADR-0205 §Decision):

- GPU runs the integer phases — preprocess (forward-compatible scaffold; v1 wires the CPU bilinear-resize for bit-exactness on resolution mismatches), per-pixel derivative, the 7×7 spatial mask SAT, 2× decimate, and the separable 3-tap mode filter.
- Host runs the precision-sensitive sliding-histogram `calculate_c_values` + top-K spatial pooling + scale-weighted final score on byte-identical readback buffers.
- Bit-exact w.r.t. CPU by construction (every GPU phase is integer arithmetic; host residual runs the unmodified CPU code on byte-identical buffers); cross-backend gate runs at `places=4` from day one with no per-metric tolerance carve-out.

New shaders + 1 unified TU for the 3 SAT phases:

- `cambi_preprocess.comp` (new) — per-pixel decimate + bit-shift + optional anti-dither, exact-resolution fast path.
- `cambi_mask_dp.comp` (new) — single TU with `PASS=0/1/2` spec const for row-SAT / col-SAT / threshold-compare.
- Existing `cambi_derivative.comp`, `cambi_filter_mode.comp`, `cambi_decimate.comp` shaders wired into the dispatch chain unchanged (renamed `min3` → `cambi_min3` / `mode3` → `cambi_mode3` to avoid the GLSL precision-overload conflict).

`cambi_internal.h` (new) exposes cambi.c's file-static helpers (`vmaf_cambi_calculate_c_values`, `vmaf_cambi_get_spatial_mask`, `vmaf_cambi_decimate`, `vmaf_cambi_filter_mode`, `vmaf_cambi_spatial_pooling`, `vmaf_cambi_weight_scores_per_scale`, `vmaf_cambi_get_pixels_in_window`, `vmaf_cambi_preprocessing`, `vmaf_cambi_default_callbacks`) to the GPU twin via a thin trampoline block at the bottom of `cambi.c` — no upstream-mirror function-static code is renamed or moved, keeping Netflix sync clean. Picked over the buffer-pair refactor ADR-0205 sketched because the latter would ripple through CPU AVX2 / AVX-512 / NEON callsites for ~200 LOC of churn (vs the trampoline's <70).

Wires:

- Registers 5 cambi shaders in `vulkan_shader_sources[]` and `cambi_vulkan.c` in `vulkan_sources` in `libvmaf/src/vulkan/meson.build`.
- Registers `vmaf_fex_cambi_vulkan` in `feature_extractor_list[]` under `#if HAVE_VULKAN`.
- Adds a `cambi` row to `scripts/ci/cross_backend_vif_diff.py`'s `FEATURE_METRICS` so the cross-backend gate at `places=4` runs against the CPU baseline.

Documentation (six deep-dive deliverables per ADR-0108):

- ADR-0210 (`docs/adr/0210-cambi-vulkan-integration.md`)
- Research-0031 (`docs/research/0031-cambi-vulkan-integration.md`)
- `docs/rebase-notes.md` entry 0090
- `docs/backends/vulkan/overview.md` extractor row
- `libvmaf/src/feature/AGENTS.md` rebase-sensitive invariant note (lock-step CPU residual + cambi_internal.h signature contract)
- `CHANGELOG.md` Unreleased / lusoris fork entry

Smoke verified: 38/38 meson tests pass on the Vulkan-enabled build including `test_cambi`, `test_vulkan_smoke`, `test_feature_extractor`. Pre-commit (clang-format + ruff + ADR-0105 copyright header gate) clean on every touched file.

Closes backlog item T7-36.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docs): renumber Research-0031 cambi → 0032 to avoid collision with T7-9

T7-9 (#194, just merged) shipped Research-0031 (Intel AI-PC NPU applicability digest). This PR's cambi-vulkan-integration digest was independently numbered 0031 by the agent that drafted it. Renumber to 0032 to keep the one-number-per-digest invariant.

References updated: filename, in-body title, ADR-0210 cross-link, ADR-0210 README index row, CHANGELOG.md, docs/rebase-notes.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(metrics): add cambi Vulkan-backend section

Resolves PR #196 Doc-Substance Gate (ADR-0167) failure. The cambi feature extractor gained a Vulkan backend in this PR (T7-36 / ADR-0210), making `feature_extractor.c` a touched "feature extractor" surface per ADR-0100/0167 — which requires a matching `docs/metrics/` edit.

Adds a "## GPU support" section to docs/metrics/cambi.md with the integer-phase / host-residual split summary, the meson flag recipe, and pointers to ADR-0210 + Research-0032.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
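To make the row-SAT / col-SAT / threshold-compare split in the commit message above more concrete, here is a minimal pure-Python sketch of a summed-area-table mask using only integer arithmetic. It is an illustration, not the shader's actual logic: the function names, the clamped-window edge handling, and the `threshold` parameter are all assumptions, and the real cambi mask derives its threshold differently.

```python
def row_sat(img):
    """Pass 0: inclusive prefix sum along each row."""
    out = []
    for row in img:
        acc, sums = 0, []
        for value in row:
            acc += value
            sums.append(acc)
        out.append(sums)
    return out


def col_sat(rows):
    """Pass 1: inclusive prefix sum down each column of the row sums."""
    out = [row[:] for row in rows]
    for y in range(1, len(out)):
        for x in range(len(out[0])):
            out[y][x] += out[y - 1][x]
    return out


def window_sum(sat, y, x, pad):
    """Sum of the source image over a (2*pad+1)^2 window clamped to the frame."""
    h, w = len(sat), len(sat[0])
    y0, x0 = max(y - pad, 0), max(x - pad, 0)
    y1, x1 = min(y + pad, h - 1), min(x + pad, w - 1)
    total = sat[y1][x1]
    if y0 > 0:
        total -= sat[y0 - 1][x1]
    if x0 > 0:
        total -= sat[y1][x0 - 1]
    if y0 > 0 and x0 > 0:
        total += sat[y0 - 1][x0 - 1]
    return total


def threshold_mask(sat, window=7, threshold=24):
    """Pass 2: mark pixels whose window sum exceeds the (hypothetical) threshold."""
    pad = window // 2
    return [
        [1 if window_sum(sat, y, x, pad) > threshold else 0
         for x in range(len(sat[0]))]
        for y in range(len(sat))
    ]
```

Because every step is an integer add, subtract, or compare, two implementations that run the same passes on the same input have no rounding to disagree on, which is the property the bit-exactness claim relies on.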

…-1a Netflix Public dataset row)

Update docs/state.md `_Updated:` stamp to 2026-04-29 and rewrite the "Tiny-AI C1 baseline `fr_regressor_v1.onnx`" deferral row's reopen-trigger to TRIGGERED — the Netflix Public training corpus that gated C1 is now locally available at `.workingdir2/netflix/` (9 ref + 70 dis YUVs, ~37 GB, gitignored; provided by lawrence 2026-04-27), unblocking BACKLOG T6-1a.

Verified the rest of state.md against the 2026-04-29-session merged PR set (#193–#205, #209). Every merged PR was feature / chore / docs / perf with no bug-status delta to record per CLAUDE §12 rule 13:

- #193 chore(dnn) T7-12 env override removal — chore.
- #194 docs(research) T7-9 NPU digest — research.
- #195 feat(mcp) T5-2 embedded scaffold — feature.
- #196 feat(vulkan) T7-36 cambi integration — feature.
- #197 feat(motion) Netflix b949ceb port — upstream port.
- #198 chore(backlog) T7-32 micro-investigations — verify-only.
- #199 feat(ai) T6-9 model registry — feature.
- #200 feat(hip) T7-10 HIP scaffold — feature.
- #201 feat(simd) T7-38 SVE2 ports — feature.
- #202 feat(ci) T6-8 parity matrix — feature.
- #203 feat(ai) T6-7 FastDVDnet — feature.
- #205 docs(audit) T7-4 quarterly audit — explicitly notes "no state.md changes (no upstream commit ruled in/out a fork bug)".
- #209 perf(sycl) T7-17 fp64-less device — perf.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-1a Netflix Public dataset row) (#245)

Update docs/state.md `_Updated:` stamp to 2026-04-29 and rewrite the "Tiny-AI C1 baseline `fr_regressor_v1.onnx`" deferral row's reopen-trigger to TRIGGERED — the Netflix Public training corpus that gated C1 is now locally available at `.workingdir2/netflix/` (9 ref + 70 dis YUVs, ~37 GB, gitignored; provided by lawrence 2026-04-27), unblocking BACKLOG T6-1a.

Verified the rest of state.md against the 2026-04-29-session merged PR set (#193–#205, #209). Every merged PR was feature / chore / docs / perf with no bug-status delta to record per CLAUDE §12 rule 13:

- #193 chore(dnn) T7-12 env override removal — chore.
- #194 docs(research) T7-9 NPU digest — research.
- #195 feat(mcp) T5-2 embedded scaffold — feature.
- #196 feat(vulkan) T7-36 cambi integration — feature.
- #197 feat(motion) Netflix b949ceb port — upstream port.
- #198 chore(backlog) T7-32 micro-investigations — verify-only.
- #199 feat(ai) T6-9 model registry — feature.
- #200 feat(hip) T7-10 HIP scaffold — feature.
- #201 feat(simd) T7-38 SVE2 ports — feature.
- #202 feat(ci) T6-8 parity matrix — feature.
- #203 feat(ai) T6-7 FastDVDnet — feature.
- #205 docs(audit) T7-4 quarterly audit — explicitly notes "no state.md changes (no upstream commit ruled in/out a fork bug)".
- #209 perf(sycl) T7-17 fp64-less device — perf.

Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lusoris merged commit e1244aa into master Apr 29, 2026
49 checks passed

lusoris deleted the docs/t7-9-intel-ai-pc-digest branch April 29, 2026 09:46

github-actions Bot mentioned this pull request Apr 29, 2026

chore: release master #1

Open

lusoris mentioned this pull request Apr 29, 2026

chore(release): introduce CHANGELOG + ADR-index fragment files (drop merge-conflict pain) #228

Closed

14 tasks

lusoris mentioned this pull request May 1, 2026

chore(release): introduce CHANGELOG + ADR-index fragment files (drop merge-conflict pain) #253

Merged

14 tasks

lusoris merged 1 commit into master from docs/t7-9-intel-ai-pc-digest
