docs: project-wide doc-substance sweep (ADR-0100 batches 1-4) by lusoris · Pull Request #25 · lusoris/vmaf

lusoris · 2026-04-17T21:58:47Z

Summary

Implements ADR-0100's per-surface doc bar across the four user-discoverable surface areas that had thin or missing documentation. Four batches, one commit each.

Batches

Batch 1 — docs/usage/ CLI surface + docs/backends/index.md fix
- New docs/usage/cli.md (full vmaf reference: flags, models, tiny-AI, presets, exit codes, Netflix-golden worked example)
- New docs/usage/bench.md (vmaf_bench harness)
- New docs/usage/precision.md (--precision deep dive, round-trip rationale)
- Fixed docs/backends/index.md — removed phantom VMAF_FORCE_BACKEND env var claim (grep showed nothing in C / Python reads it); documented real --no_cuda / --no_sycl / --sycl_device mechanism instead.
Batch 2 — docs/api/ public C API reference
- docs/api/index.md — core (libvmaf.h + picture.h + model.h + feature.h): lifecycle, VmafConfiguration, picture ownership, feature dictionary, model collections, ABI tiers, thread-safety, end-to-end runnable C example.
- docs/api/dnn.md — tiny-AI session API (ADR-0040 named binding, -ENOSPC retry, runnable luma-filter example).
- docs/api/gpu.md — CUDA (libvmaf_cuda.h) + SYCL (libvmaf_sycl.h): zero-copy frame buffers, dmabuf / VA / D3D11 import, profiling, state_free vs list_devices asymmetry.
Batch 3 — docs/metrics/features.md expansion
- Rewrote from 1–2-sentence entries to the full ADR-0100 feature-extractor bar (what / range / invocation / input formats / options / backends / limitations) for every extractor: VIF, Motion2, ADM, CAMBI, CIEDE2000, PSNR, PSNR-HVS, SSIM, MS-SSIM, ANSNR. Options table with names / aliases / ranges per extractor, grounded in a direct survey of libvmaf/src/feature/.
Batch 4 — docs/mcp/ per-tool reference
- docs/mcp/index.md — MCP server overview: install, Claude Desktop config, env vars, security model (argv exec + Path.resolve() allowlist), when-not-to-use.
- docs/mcp/tools.md — per-tool reference for all 6 tools exposed by mcp-server/vmaf-mcp/ (README only listed 4; live server.py has 6 — added eval_model_on_split + compare_models). Input schema, behaviour, JSON example call + response, enumerated error codes per tool.

All new pages wired into docs/index.md + mkdocs.yml nav. mkdocs build --strict passes clean on every commit.

Test plan

- `./.venv/bin/mkdocs build --strict` clean on each commit
- Grep confirmed VMAF_FORCE_BACKEND is not referenced by any runtime code (only stale docs + vestigial CI coverage export)
- API docs cross-checked against public headers under libvmaf/include/libvmaf/
- Metrics docs cross-checked against libvmaf/src/feature/ extractor .options[] arrays and provided_features
- MCP tool reference cross-checked against mcp-server/vmaf-mcp/src/vmaf_mcp/server.py list_tools() handler
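
The VMAF_FORCE_BACKEND check in the test plan can be approximated with a short script. This is a minimal sketch, not the command that was actually run for this PR: `find_references`, `split_docs_from_code`, and the suffix list are hypothetical names chosen for illustration, and the search is a plain substring match.

```python
from pathlib import Path


def find_references(root, needle, suffixes=(".c", ".h", ".cpp", ".py", ".md", ".yml")):
    """Hypothetical helper: list files under `root` whose text mentions `needle`.

    The suffix filter is an assumption; adjust it to the tree being audited.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text:
            hits.append(path)
    return hits


def split_docs_from_code(hits, doc_suffixes=(".md",)):
    """Separate documentation mentions from everything else."""
    docs = [p for p in hits if p.suffix in doc_suffixes]
    code = [p for p in hits if p.suffix not in doc_suffixes]
    return docs, code
```

If `code` comes back empty while `docs` does not, the name is documented but nothing in the scanned sources reads it, which is the "phantom" pattern this sweep looks for.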

🤖 Generated with Claude Code

…sion, backends (ADR-0100)

First batch under ADR-0100's project-wide doc-substance rule. Adds the three highest-discoverability pages that were missing (every CLI flag every user hits) and fixes one phantom claim in the backends page.

New pages
- docs/usage/cli.md: complete `vmaf` flag reference covering required input, `--model` grammar, built-in model versions, `--feature` syntax, output formats, `--precision`, backend-selection flags (`--no_cuda`, `--no_sycl`, `--sycl_device`, `--cpumask`, `--gpumask`), frame range (`--frame_cnt`, `--frame_skip_*`, `--subsample`), `--aom_ctc` / `--nflx_ctc` preset bundles, `--tiny-*` tiny-AI flags, exit codes, worked Netflix-golden example, and interaction pitfalls. Supersedes the abbreviated help in libvmaf/tools/README.md as the canonical user-facing reference.
- docs/usage/bench.md: `vmaf_bench` harness (build, test-data layout, performance + validation modes, per-flag table, runnable 1080p-benchmark + validation examples, limitations).
- docs/usage/precision.md: `--precision` deep-dive (grammar, mode selection table, `%.17g` round-trip rationale, uniformity across XML/JSON/CSV/SUB/stderr, legacy-mode cautions, interaction with Netflix goldens).

Fixes
- docs/backends/index.md: `VMAF_FORCE_BACKEND` env var was *not* wired to anything in libvmaf (grep confirms: only docs/backends/index.md and one vestigial CI coverage-job export mention it; no getenv in C or Python). Replaced with the real CLI-flag mechanism (--no_cuda/--no_sycl/--sycl_device) and documented the dispatch precedence rules. Adds ARM NEON + HIP rows, links to per-backend pages, and clarifies the build-time-opt-in vs runtime-per-invocation split.

Navigation
- docs/index.md Usage section lists the three new pages.
- mkdocs.yml nav adds them under the Usage tab.

All four pages meet the ADR-0100 per-surface minimum bar (CLI flags: what / args+defaults / runnable example / how output surfaces / interactions + limitations). ADR-0100 itself is unchanged; this batch is the first instance of the rule producing actual docs. No code changes. CI should be unaffected (coverage lcov + link validation are the only docs-touching gates).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
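
The `%.17g` round-trip rationale mentioned for docs/usage/precision.md rests on a general IEEE-754 property: 17 significant decimal digits are enough to reconstruct any double exactly. The sketch below illustrates that property only; `format_score` and the sample value are made up for the example and do not mirror the `vmaf` CLI's output code.

```python
def format_score(value, spec="%.17g"):
    """Format a score with a printf-style spec (illustrative helper, not a vmaf API)."""
    return spec % value


score = 76.66890519623612  # arbitrary example value, not a real VMAF result

short = format_score(score, "%.6f")   # fixed six decimals
full = format_score(score, "%.17g")   # 17 significant digits

# Parsing the 17-digit form recovers the exact same double.
round_trips_full = float(full) == score
# Parsing the 6-decimal form generally yields a nearby but different double.
round_trips_short = float(short) == score

print(full, round_trips_full)
print(short, round_trips_short)
```

A shorter format is easier to read, but a score written that way cannot be compared bit-for-bit against a stored reference after being parsed back.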

Second batch of the project-wide doc-substance sweep. Adds the missing public C API reference tree under docs/api/ to satisfy ADR-0100's Public C API bar (what / inputs + outputs + ownership + lifetime / thread-safety / ABI-stability tier / runnable C snippet / error semantics) for every header shipped under libvmaf/include/libvmaf/.

- docs/api/index.md: core reference for libvmaf.h, picture.h, model.h, feature.h. Covers context lifecycle, VmafConfiguration, VmafPicture ownership rules, VmafFeatureDictionary, built-in model versions, model collections (bootstrap + bagging CI), ABI tiers, thread-safety, negative-errno convention, runnable end-to-end C example reading YUV and pooling scores.
- docs/api/dnn.md: tiny-AI session API per ADR-0040. Covers VmafDnnConfig (AUTO/CPU/CUDA/OPENVINO/ROCM), vmaf_use_tiny_model attached mode, standalone sessions, luma-only convenience call, general multi-input/output named-binding call, -ENOSPC retry semantics, runnable filter example.
- docs/api/gpu.md: CUDA (libvmaf_cuda.h) and SYCL (libvmaf_sycl.h) state + picture preallocation + zero-copy frame buffers + dmabuf / VA / D3D11 import + profiling helpers. Documents state_free vs list_devices asymmetry and the HOST/HOST_PINNED/DEVICE preallocation tiers.

Wired into docs/index.md (new C API section) and mkdocs.yml nav (new C API tab with DNN + GPU sub-pages). mkdocs build --strict passes.

Refs ADR-0100, ADR-0022, ADR-0023, ADR-0039, ADR-0040, ADR-0041.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Third batch of the project-wide doc-substance sweep. The previous features.md had thin 1–2-sentence entries per extractor; ADR-0100's feature-extractor bar requires: what measures / output range / invocation string / input pixel formats / options / SIMD+GPU backends / limitations. Every extractor shipped by libvmaf is now documented to that bar, grounded in a source survey of libvmaf/src/feature/.

Coverage:
- VIF (integer + float): per-scale [0,1], debug/egl/kernelscale options, backend matrix
- Motion2 (integer + float): [0,∞), temporal state, force_zero test hook, flush-callback note
- ADM (integer + float): [0,1], egl/nvd/rdf/csf options with ranges, full backend matrix
- CAMBI: quick facts + pointer to dedicated cambi.md
- CIEDE2000: typical-ΔE bands (<1 / 1–5 / >15), BT.709 assumption, 4:0:0 rejection
- PSNR (integer + float): dB ceiling formula (6·bpc+12), enable_chroma/mse/apsnr/min_sse
- PSNR-HVS: 8-bit-only rejection, Xiph reference
- SSIM / MS-SSIM: enable_lcs / enable_db / clip_db / scale option semantics, dB conversion formula
- ANSNR: kept for back-compat, no shipped model consumes it

Also added CLI / C API invocation examples and a cross-reference block linking to backends/, models/, cambi.md, confidence-interval.md, ADR-0100.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
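
As a worked instance of the PSNR ceiling formula quoted above (6·bpc+12), the sketch below evaluates it for common bit depths and applies it as an upper clamp. The clamp in `psnr_db` and the choice of peak value (2^bpc - 1) are assumptions made for illustration, not a transcription of the libvmaf extractor.

```python
import math


def psnr_ceiling_db(bpc):
    """Upper bound for reported PSNR in dB, using the 6*bpc + 12 rule."""
    return 6 * bpc + 12


def psnr_db(mse, bpc):
    """PSNR from a mean squared error, clamped to the ceiling (assumed behaviour)."""
    peak = (1 << bpc) - 1          # assumed peak sample value for the bit depth
    ceiling = psnr_ceiling_db(bpc)
    if mse <= 0.0:
        return float(ceiling)      # identical frames: report the ceiling, not infinity
    return min(10.0 * math.log10(peak * peak / mse), float(ceiling))


for bpc in (8, 10, 12, 16):
    print(bpc, psnr_ceiling_db(bpc))
```

For 8-bit content the ceiling works out to 60 dB and for 10-bit content to 72 dB, so identical inputs report a finite number rather than infinity.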

Fourth batch of the project-wide doc-substance sweep. Adds docs/mcp/ tree to satisfy ADR-0100's MCP-tool bar (what / input schema / allowed paths / example / error codes) for every tool exposed by mcp-server/vmaf-mcp/.

- docs/mcp/index.md: server overview. Install path (Meson build first, then pip install -e), Claude Desktop config (with optional Docker variant pointer), env vars (VMAF_BIN / VMAF_MCP_ALLOW / VMAF_MCP_ASYNC), security model (argv-only exec, Path.resolve() allowlist with built-in roots testdata/ / python/test/resource/ / model/), when-not-to-use section pointing at CLI / C API / Docker.
- docs/mcp/tools.md: per-tool reference for all 6 tools (vmaf_score, list_models, list_backends, run_benchmark, eval_model_on_split, compare_models). Each entry lists input-schema table, behaviour, concrete JSON example call + response body, and enumerated error conditions. eval_model_on_split documents the feature-column contract (adm2 + vif_scale0..3 + motion2, mos target) and the 'eval' extra install. Cross-tool error-convention table at the bottom.

Discoveries vs prior README:
- Live tool count is 6 (vmaf-mcp/README.md and prior batch plan only listed 4); eval_model_on_split and compare_models shipped alongside the tiny-AI work.
- VMAF_MCP_ASYNC env var was undocumented; server.py reads it to pick the anyio backend.

Wired into docs/index.md (new MCP section) and mkdocs.yml nav (new MCP tab above Tiny AI). mkdocs build --strict passes.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
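
The "Path.resolve() allowlist" idea in the security model can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the code in server.py: `is_allowed` is a hypothetical name, and the roots passed in would be whatever the server is configured with.

```python
from pathlib import Path


def is_allowed(candidate, allowed_roots):
    """Return True if `candidate` resolves to a location inside any allowed root.

    Resolving first collapses `..` segments and symlinks, so the comparison
    is made on the real target rather than on the string the caller supplied.
    """
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        # Same idea as Path.is_relative_to (Python 3.9+), spelled out explicitly.
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Comparing resolved path components, rather than string prefixes, is the design point: a sibling directory such as `testdata_evil/` shares a string prefix with `testdata/` but is not inside it.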

…h 5)

Fifth batch of the project-wide doc-substance sweep. Closes the remaining ADR-0100 surface bars:

- Build flag bar: docs/development/build-flags.md
  - Every option in libvmaf/meson_options.txt with type, default, and effect: enable_tests, enable_docs, enable_tools, enable_asm, enable_avx512, built_in_models, enable_float, enable_cuda, enable_nvtx, enable_nvcc, enable_sycl, sycl_compiler, enable_dnn.
  - Flag-interaction notes (enable_avx512 auto-downgrade on nasm<2.14, enable_cuda + enable_sycl coexistence, enable_float as additive not replacement, enable_dnn=auto silent-skip pitfall).
  - Standard Meson options that change the artifact (buildtype, default_library, b_ndebug, b_sanitize, b_lto, c_args with the VMAF_PICTURE_POOL gate).
  - Recommended configurations block (fast-iteration CPU, release+CUDA+NVTX, CI golden-gate matching, tiny-AI tests, sanitizer run).
  - Documents the undefined enable_hip option referenced by the build-vmaf skill so future users don't pass it and wonder why configure fails.
- GPU backend bar deepening (known gaps + ULP tolerance):
  - docs/backends/cuda/overview.md: explicit bit-identity statement vs CPU scalar (no fp slack), known-gaps list (no CUDA kernel for CAMBI, CIEDE, SSIM, MS-SSIM, PSNR, PSNR-HVS, ANSNR, float_*).
  - docs/backends/sycl/overview.md: same treatment, plus fp16-forced-off note, dmabuf Linux-only caveat, HIP-via-DPC++ availability note, programmatic profiling pointer.
  - docs/backends/index.md: corrected the HIP row, which claimed 'experimental' with a -Denable_hip flag, but the option does not exist in meson_options.txt. Marked as 'not yet scaffolded' and pointed at /add-gpu-backend hip.

Wired docs/development/build-flags.md into docs/index.md Development section and mkdocs.yml nav.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…lag fix (ADR-0100 batch 6)

Sixth batch of the project-wide doc-substance sweep. Closes two gaps:

- docs/usage/ffmpeg.md: the libvmaf filter (shipped by upstream FFmpeg, wraps this repo's C API) had examples but no options reference. Added a table covering the user-tunable options (model, log_path, log_fmt, feature, pool, n_threads, n_subsample) with types + defaults, plus multi-feature / multi-model examples (pipe-separated `name=` / `version=` grammar) and a 'when to use the CLI instead' section for the surfaces the filter doesn't expose (--precision, tiny-AI flags, explicit --no_cuda / --no_sycl). Points at the FFmpeg source (libavfilter/vf_libvmaf.c) as authoritative.
- docs/usage/docker.md: claimed the SYCL backend is 'selected at CLI level with --sycl'. No such flag exists: cli_parse.c only defines --no_sycl (opt-out) and --sycl_device N (pin). Fixed to describe the actual auto-select + opt-out behaviour and pointed at cli.md#backend-selection.

This is the same class of phantom-flag fix as the VMAF_FORCE_BACKEND correction in batch 1 and the enable_hip row in batch 5; users following the doc verbatim would get 'unrecognized option' errors. Surface those claims as they turn up.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Seventh batch of the project-wide doc-substance sweep. Continues the phantom-flag sweep started in batch 1 (VMAF_FORCE_BACKEND), batch 5 (enable_hip), batch 6 (docker --sycl).

docs/reference/faq.md claimed 'To enable a backend at call time, use --cuda or --sycl on the vmaf CLI'; neither flag exists. cli_parse.c defines only --no_cuda (opt-out), --no_sycl (opt-out), and --sycl_device N (pin). GPU backends are auto-selected when compiled in.

Also normalised the meson setup command to match CLAUDE.md §2 and the build-vmaf skill (cd libvmaf first, then meson setup build) instead of the <builddir> <sourcedir> form nothing else in the docs uses. Pointed at docs/development/build-flags.md (added in batch 5) and the updated docs/backends/index.md (HIP row corrected in batch 5) so the FAQ's backend answer now matches the per-surface pages.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Eighth and final phantom-flag fix in this sweep. Grepped docs/ for --cuda / --sycl (as backend selectors, not the real --sycl_device) and cleaned the last two occurrences:

- docs/usage/ffmpeg.md: claimed the SYCL backend is reached through 'vmaf ... via --sycl'. No such flag; SYCL auto-selects when built with -Denable_sycl=true. Replaced with the accurate auto-select + --sycl_device pin + --no_sycl opt-out description.
- docs/models/overview.md: claimed models run on 'CUDA, SYCL, and HIP backends (select via --cuda / --sycl / build flags)'. HIP isn't scaffolded yet (batch 5 corrected that in backends/index.md) and --cuda / --sycl don't exist. Rewrote to: CUDA + SYCL only, enabled at build time, auto-selected at runtime, opt-out per invocation.

After this commit `rg -- '--cuda[^_=]|--sycl[^_=]' docs/` returns only the three 'there is no --cuda/--sycl selector' disambiguation sentences we intentionally added in batches 6-8. No phantom-flag claims remain in the docs tree.

Refs ADR-0100.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
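
To see why the `rg` pattern above flags the phantom selectors but not the real flags, here is a small Python sketch of the same regular expression. The sample strings are invented for illustration; only the pattern itself comes from the commit message.

```python
import re

# Same pattern as the rg invocation: `--cuda` or `--sycl` followed by any
# character other than `_` or `=`.
PHANTOM_SELECTOR = re.compile(r"--cuda[^_=]|--sycl[^_=]")

samples = [
    "use --cuda or --sycl on the vmaf CLI",   # phantom selectors
    "pin a device with --sycl_device 0",      # real flag, `_` follows
    "opt out with --no_cuda / --no_sycl",     # real flags, no `--cuda` substring
    "there is no --cuda/--sycl selector",     # intentional disambiguation text
]

flagged = [s for s in samples if PHANTOM_SELECTOR.search(s)]
for line in flagged:
    print(line)
```

The character class is what keeps `--sycl_device` out of the results, and `--no_cuda` / `--no_sycl` never match because they do not contain the literal `--cuda` or `--sycl`. The disambiguation sentences still match, which is why the commit expects them in the output.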

- docs/api/gpu.md: the `-Denable_hip=true` meson option does not exist (phantom flag). Normalise the HIP "Limitations" bullet to match backends/index.md's "planned, meson option does not exist yet" phrasing.
- docs/mcp/tools.md: anchor `cli.md#output-modes` → `#output` (the real `## Output` heading in cli.md).
- docs/metrics/features.md: anchor `api/index.md#vmaffeaturedictionary--tuning-extractors` → `#vmaffeaturedictionary` (the real heading slug).
- docs/reference/faq.md: anchor `principles.md#netflix-golden-data-gate` → `#31-netflix-golden-data-gate` (the real H3 slug under "3. Quality gates").

`mkdocs build --strict` now reports zero "contains an anchor" errors against the three fixed links.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
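
The anchor fixes above come down to how heading text is turned into a slug. The sketch below approximates the default slug rule of Python-Markdown's `toc` extension; it is an assumption that the site uses that default (a different `slugify` may be configured in mkdocs.yml), and the heading strings are guesses reconstructed from the slugs in the commit message.

```python
import re
import unicodedata


def heading_slug(heading):
    """Approximate the default Python-Markdown `toc` slug for a heading."""
    # Fold to ASCII, dropping characters that have no ASCII equivalent.
    value = unicodedata.normalize("NFKD", heading)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation (including the dot in "3.1"), then lowercase.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


# Assumed heading texts, for illustration only.
for heading in ("Output", "VmafFeatureDictionary", "3.1 Netflix golden-data gate"):
    print(heading_slug(heading))
```

Under this rule a numbered heading keeps its digits but loses the dot, which would explain why the working anchor starts with `31-`.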

Verified against libvmaf/include/libvmaf/libvmaf_sycl.h and libvmaf/src/sycl/. The earlier page was accurate for CUDA but carried over assumptions that do not match SYCL's actual implementation:

1. "Simple path same semantics as CUDA": wrong. On SYCL, `vmaf_sycl_preallocate_pictures` is a no-op stub and `vmaf_sycl_picture_fetch` just calls `vmaf_picture_alloc()` (regular host buffer). The `DEVICE`/`HOST` enum values are declared for CUDA-symmetry but are not acted upon (libvmaf/src/libvmaf.c:398-415). Document the stub behaviour and point users at the frame-buffer API as the real GPU-resident path.
2. "OpenCL-backend SYCL builds fall back silently to upload_plane": wrong. `vmaf_sycl_dmabuf_import` and `vmaf_sycl_import_va_surface` call `sycl::get_native<ext_oneapi_level_zero>` directly (libvmaf/src/sycl/dmabuf_import.cpp:79). On a non-Level-Zero backend the SYCL exception is caught and returned as `-EIO` with an error log. No implicit fallback: the caller must choose `vmaf_sycl_upload_plane` themselves.
3. "D3D11 import is DMA-free on the source side": wrong. The header comment says "creates staging texture → copies decoded surface → maps for CPU read → uploads via H2D memcpy", that is GPU→CPU→GPU. Re-describe as a full host roundtrip; keep the NT-handle future-work note.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Continuing the ADR-0100 doc-substance audit surfaced two implementation bugs behind the SYCL public API:

1. [issue #26] `vmaf_sycl_preallocate_pictures` is a no-op stub that silently ignores the `VmafSyclPicturePreallocationMethod` enum. Callers requesting USM-device allocation get a regular host picture and no way to detect the mismatch.
2. [issue #27] `vmaf_sycl_import_d3d11_surface` is declared in the public header under `#ifdef _WIN32` but has zero implementations in-tree (ghost symbol). Any Windows caller would fail to link.

Doc now labels both as known bugs with GitHub issue cross-refs so readers are not misled while the implementation questions are resolved. Re-wrote the D3D11 bullet list entry + Limitations entry to stop describing the (imaginary) staging-texture path as if it existed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Per user correction: the Netflix golden-data gate applies to the CPU scalar + fixed-point path only, not to CUDA, SYCL, or SIMD backends. GPU paths have never been bit-exact with CPU (upstream or fork), only close (~6 decimals). Different reduction orders, parallel-prefix scans, and FMA contractions on parallel hardware introduce small ULP-level deltas that cannot be eliminated without giving up the throughput the GPU is there for in the first place.

Earlier batches had repeated "scores are bit-identical across backends" / "CUDA integer kernels are bit-identical by design" / "golden-data gate on all backends", all wrong. docs/principles.md §3.1 was already correct ("CPU only") so the source of truth didn't drift; the derivative pages did.

Fixed pages:
- docs/backends/cuda/overview.md §"Numerical tolerance": rewrite to "close agreement, not bit-exact"; remove the "no fp slack on VIF/ADM/Motion2" claim.
- docs/backends/sycl/overview.md §"Numerical tolerance": same rewrite; keep the fp16/fixed-iteration-order notes as "reduces but does not eliminate" the deviation.
- docs/models/overview.md §"GPU and SIMD acceleration": fix the one-liner that claimed goldens enforce cross-backend identity.
- docs/development/release.md: gate runs CPU only; GPU/SIMD covered by per-backend snapshot tests (testdata/scores_cpu_*.json + netflix_benchmark_results.json) at ULP tolerance.
- docs/reference/faq.md: rewrite the golden-contract answer with explicit CPU/GPU split.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
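
The point about reduction order can be reproduced with plain floating-point arithmetic. The sketch below is a generic illustration, not libvmaf code: it sums the same three values left-to-right and as a tree, then compares the results under a tolerance. The `1e-6` tolerance is an assumption loosely based on the "~6 decimals" figure above, not a project-defined threshold.

```python
import math


def sequential_sum(values):
    """Left-to-right accumulation, as a scalar loop would do."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values):
    """Tree-shaped reduction, the shape parallel reductions typically take."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


def scores_agree(a, b, abs_tol=1e-6):
    """Tolerance-based comparison (the tolerance value is an assumption)."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)


values = [0.1, 0.2, 0.3]
seq = sequential_sum(values)   # (0.1 + 0.2) + 0.3
tree = pairwise_sum(values)    # 0.1 + (0.2 + 0.3)

print(seq == tree, scores_agree(seq, tree))
```

The two sums differ in the last bit even though the inputs are identical, so an exact-equality gate is only meaningful for a path with a fixed evaluation order, while a tolerance check is the appropriate test across backends.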

- docs/ai/security.md: VMAF_TINY_MODEL_DIR is not implemented in vmaf_dnn_validate_onnx (no getenv call in libvmaf/src/dnn/model_loader.c). Replaced the claim with a "Planned (not yet implemented)" callout that points at tracking issue #28. Today the loader trusts caller-supplied paths once realpath + S_ISREG + readable checks pass; MCP callers retain their separate path allowlist.
- docs/api/index.md: VmafConfiguration.gpumask is described as "disable CUDA" but libvmaf.c:694-698 treats any non-zero value as a boolean that disables BOTH CUDA and SYCL. Corrected the inline comment and added a caveat blockquote noting the field is not actually per-bit despite the uint64_t type. Per-backend opt-out lives on the CLI (--no_cuda / --no_sycl).

Follow-up issues: #28 (tiny-model path-jail), #29 (picture_pool ghost).

- docs/api/gpu.md: vmaf_sycl_profiling_enable only flips a bool (libvmaf/src/sycl/common.cpp:1053-1059). The SYCL queue's enable_profiling property is fixed at vmaf_sycl_state_init() time, so calling the enable-fn on a state that was inited without profiling succeeds silently but triggers sycl::exception on later get_profiling_info calls. Old wording ("use on a state that was inited with the flag set") buried the lede; new wording states the requirement plus the failure mode explicitly.
- docs/api/dnn.md: vmaf_dnn_session_run_luma8 also returns -EINVAL for NULL args, -ENOTSUP when ORT under-fills the output buffer, and the whole DNN surface degrades to -ENOSYS when built without ONNX Runtime (enable_onnx=false). Documented all four cases.

No code changes.

Two low-severity audit findings from round-2 verification against libvmaf/src/sycl/common.cpp + libvmaf/src/sycl/dmabuf_import.cpp:

- vmaf_sycl_list_devices enumerates device_type::gpu only and prints platform, vendor, driver version, and fp64 flag per device, not just the ordinal. Returns -EIO on sycl::exception. Previous one-line description understated the output.
- Zero-copy ingest paths catch sycl::exception and convert to -EIO, but the log message is the generic exception text; callers can't string-match for "Level Zero required". Documented the correct graceful-degradation pattern: sniff queue::get_backend() up front rather than parsing the log.

No behaviour changes; docs-only.

Round-3 audit against libvmaf/src/dnn/ort_backend.c:85-98 found that:

- VmafDnnDevice AUTO / OPENVINO / ROCM all fall through to the default CPU EP. The switch only special-cases CUDA (under ORT_API_HAS_CUDA). No SessionOptionsAppendExecutionProvider_OpenVINO / _MIGraphX calls exist anywhere.
- VmafDnnConfig.fp16_io is declared in the public header but never read anywhere in libvmaf/src/dnn/. It is a pure ghost field.

Doc updates:

- Device-config section: AUTO = CPU today; OPENVINO / ROCM / fp16_io explicitly labelled "accepted but ignored" with issue #30 backref.
- Return-code tables: added -ENOMEM to both vmaf_use_tiny_model and vmaf_dnn_session_run; both are reachable (session open / tensor allocation failures).
- ENOSPC semantics clarified: outputs[i].written = produced is set BEFORE the capacity check, so the required-size hint is always available on -ENOSPC.
- Known-limitations: added fp16_io + OPENVINO/ROCM rows (previously claimed OPENVINO "requires an ORT built with the OpenVINO EP" which is only true *once* the EP wire-up lands).

Follow-up code-fix: #30.
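
The -ENOSPC semantics described above (the produced size is recorded before the capacity check, so the caller always gets a size hint) suggest a grow-and-retry pattern. The sketch below models that pattern in Python with a stand-in function. Everything here is hypothetical: `fake_session_run`, `run_with_retry`, and the element count are invented for illustration and are not the libvmaf C API.

```python
import errno


def fake_session_run(capacity):
    """Stand-in for a native call; returns (status, written).

    Mirrors the documented ordering: the produced size is recorded first,
    then the capacity check decides between success and -ENOSPC.
    """
    produced = 1920 * 1080          # pretend output element count
    written = produced              # size hint is set before the check
    if capacity < produced:
        return -errno.ENOSPC, written
    return 0, written


def run_with_retry(run, initial_capacity):
    """Call `run`, and on -ENOSPC retry once with the hinted capacity."""
    status, written = run(initial_capacity)
    if status == -errno.ENOSPC:
        status, written = run(written)
    if status < 0:
        raise OSError(-status, "session run failed")
    return written
```

Because the hint is available on the first failure, one retry is enough in this model; a caller does not need to guess or double the buffer repeatedly.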

lusoris changed the title ~~docs(usage): CLI / bench / precision + backends fix (ADR-0100 batch 1)~~ docs: project-wide doc-substance sweep (ADR-0100 batches 1-4) Apr 17, 2026

Lusoris and others added 16 commits April 18, 2026 02:43

lusoris force-pushed the docs/project-wide-substance-sweep-0100 branch from 3823797 to fb6da65 Compare April 18, 2026 00:43

lusoris merged commit 4f3f992 into master Apr 18, 2026
22 of 23 checks passed

lusoris deleted the docs/project-wide-substance-sweep-0100 branch April 18, 2026 00:55

github-actions Bot mentioned this pull request Apr 18, 2026

chore: release master #1

Open

lusoris merged 16 commits into master from docs/project-wide-substance-sweep-0100
