docs: project-wide doc-substance sweep (ADR-0100 batches 1-4) #25
Merged
This was referenced Apr 17, 2026
Closed
…sion, backends (ADR-0100)
First batch under ADR-0100's project-wide doc-substance rule. Adds the three highest-discoverability pages that were missing (every CLI flag every user hits) and fixes one phantom claim in the backends page.
New pages:
- docs/usage/cli.md — complete `vmaf` flag reference: required input, `--model` grammar, built-in model versions, `--feature` syntax, output formats, `--precision`, backend-selection flags (`--no_cuda`, `--no_sycl`, `--sycl_device`, `--cpumask`, `--gpumask`), frame range (`--frame_cnt`, `--frame_skip_*`, `--subsample`), `--aom_ctc` / `--nflx_ctc` preset bundles, `--tiny-*` tiny-AI flags, exit codes, worked Netflix-golden example, and interaction pitfalls. Supersedes the abbreviated help in libvmaf/tools/README.md as the canonical user-facing reference.
- docs/usage/bench.md — `vmaf_bench` harness: build, test-data layout, performance + validation modes, per-flag table, runnable 1080p-benchmark + validation examples, limitations.
- docs/usage/precision.md — `--precision` deep-dive: grammar, mode selection table, `%.17g` round-trip rationale, uniformity across XML/JSON/CSV/SUB/stderr, legacy-mode cautions, interaction with Netflix goldens.
Fixes:
- docs/backends/index.md — `VMAF_FORCE_BACKEND` env var was *not* wired to anything in libvmaf (grep confirms: only docs/backends/index.md and one vestigial CI coverage-job export mention it; no getenv in C or Python). Replaced with the real CLI-flag mechanism (`--no_cuda` / `--no_sycl` / `--sycl_device`) and documented the dispatch precedence rules. Adds ARM NEON + HIP rows, links to per-backend pages, and clarifies the build-time-opt-in vs runtime-per-invocation split.
Navigation:
- docs/index.md Usage section lists the three new pages.
- mkdocs.yml nav adds them under the Usage tab.
All four pages meet the ADR-0100 per-surface minimum bar (CLI flags: what / args+defaults / runnable example / how output surfaces / interactions + limitations).
ADR-0100 itself is unchanged; this batch is the first instance of the rule producing actual docs.
No code changes. CI should be unaffected (coverage lcov + link validation are the only docs-touching gates).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
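The `%.17g` round-trip rationale mentioned for docs/usage/precision.md can be illustrated with a minimal sketch. It relies only on the general fact that 17 significant decimal digits are enough to reproduce any IEEE 754 double exactly; the `%.6f` format is used purely as a shorter contrast case and is an assumption, not a statement about what the CLI's legacy mode prints.

```python
def roundtrips(x: float, fmt: str) -> bool:
    """Return True if formatting x with fmt and parsing it back yields the same double."""
    return float(fmt % x) == x


# 0.1 + 0.2 is a double that is close to, but not equal to, 0.3.
score = 0.1 + 0.2

# Six fixed decimals lose the low-order bits, so the parsed value differs.
print("%.6f" % score, roundtrips(score, "%.6f"))

# Seventeen significant digits identify the double uniquely.
print("%.17g" % score, roundtrips(score, "%.17g"))
```

A score written with `%.17g` can therefore be compared bit-for-bit after being read back from a report, which is the property a golden-data comparison would want.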
Second batch of the project-wide doc-substance sweep. Adds the missing public C API reference tree under docs/api/ to satisfy ADR-0100's Public C API bar (what / inputs + outputs + ownership + lifetime / thread-safety / ABI-stability tier / runnable C snippet / error semantics) for every header shipped under libvmaf/include/libvmaf/.
- docs/api/index.md — core reference covering libvmaf.h, picture.h, model.h, feature.h: context lifecycle, VmafConfiguration, VmafPicture ownership rules, VmafFeatureDictionary, built-in model versions, model collections (bootstrap + bagging CI), ABI tiers, thread-safety, negative-errno convention, runnable end-to-end C example reading YUV and pooling scores.
- docs/api/dnn.md — tiny-AI session API per ADR-0040. Covers VmafDnnConfig (AUTO/CPU/CUDA/OPENVINO/ROCM), vmaf_use_tiny_model attached mode, standalone sessions, luma-only convenience call, general multi-input/output named-binding call, -ENOSPC retry semantics, runnable filter example.
- docs/api/gpu.md — CUDA (libvmaf_cuda.h) and SYCL (libvmaf_sycl.h) state + picture preallocation + zero-copy frame buffers + dmabuf / VA / D3D11 import + profiling helpers. Documents state_free vs list_devices asymmetry and the HOST/HOST_PINNED/DEVICE preallocation tiers.
Wired into docs/index.md (new C API section) and mkdocs.yml nav (new C API tab with DNN + GPU sub-pages). mkdocs build --strict passes.
Refs ADR-0100, ADR-0022, ADR-0023, ADR-0039, ADR-0040, ADR-0041.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Third batch of the project-wide doc-substance sweep. The previous
features.md had thin 1–2-sentence entries per extractor — ADR-0100's
feature-extractor bar requires: what measures / output range /
invocation string / input pixel formats / options / SIMD+GPU backends /
limitations. Every extractor shipped by libvmaf is now documented to
that bar, grounded in a source survey of libvmaf/src/feature/.
Coverage:
- VIF (integer + float) — per-scale [0,1], debug/egl/kernelscale
options, backend matrix
- Motion2 (integer + float) — [0,∞), temporal state, force_zero
test hook, flush-callback note
- ADM (integer + float) — [0,1], egl/nvd/rdf/csf options with
ranges, full backend matrix
- CAMBI — quick facts + pointer to dedicated cambi.md
- CIEDE2000 — typical-ΔE bands (<1 / 1–5 / >15), BT.709 assumption,
4:0:0 rejection
- PSNR (integer + float) — dB ceiling formula (6·bpc+12),
enable_chroma/mse/apsnr/min_sse
- PSNR-HVS — 8-bit-only rejection, Xiph reference
- SSIM / MS-SSIM — enable_lcs / enable_db / clip_db / scale option
semantics, dB conversion formula
- ANSNR — kept for back-compat, no shipped model consumes it
Also added CLI / C API invocation examples and a cross-reference block
linking to backends/, models/, cambi.md, confidence-interval.md,
ADR-0100.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
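As a quick worked instance of the PSNR dB ceiling formula quoted above (6·bpc+12), the sketch below evaluates it for a few common bit depths. The function name is hypothetical and the code only restates the arithmetic from the commit message; it is not taken from libvmaf.

```python
def psnr_ceiling_db(bpc: int) -> int:
    """Ceiling in dB for a given bits-per-component, per the 6*bpc + 12 formula."""
    return 6 * bpc + 12


for bpc in (8, 10, 12, 16):
    print(f"{bpc}-bit: {psnr_ceiling_db(bpc)} dB")
```

Under this formula an 8-bit input caps at 60 dB and a 10-bit input at 72 dB.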
Fourth batch of the project-wide doc-substance sweep. Adds docs/mcp/
tree to satisfy ADR-0100's MCP-tool bar (what / input schema / allowed
paths / example / error codes) for every tool exposed by
mcp-server/vmaf-mcp/.
- docs/mcp/index.md — server overview. Install path (Meson build first,
then pip install -e), Claude Desktop config (with optional Docker
variant pointer), env vars (VMAF_BIN / VMAF_MCP_ALLOW /
VMAF_MCP_ASYNC), security model (argv-only exec, Path.resolve()
allowlist with built-in roots testdata/ / python/test/resource/ /
model/), when-not-to-use section pointing at CLI / C API / Docker.
- docs/mcp/tools.md — per-tool reference for all 6 tools: vmaf_score,
list_models, list_backends, run_benchmark, eval_model_on_split,
compare_models. Each entry lists input-schema table, behaviour,
concrete JSON example call + response body, and enumerated error
conditions. eval_model_on_split documents the feature-column
contract (adm2 + vif_scale0..3 + motion2, mos target) and the
'eval' extra install. Cross-tool error-convention table at the
bottom.
Discoveries vs prior README:
- Live tool count is 6 (vmaf-mcp/README.md and prior batch plan only
listed 4) — eval_model_on_split and compare_models shipped
alongside the tiny-AI work.
- VMAF_MCP_ASYNC env var was undocumented; server.py reads it to
pick the anyio backend.
Wired into docs/index.md (new MCP section) and mkdocs.yml nav
(new MCP tab above Tiny AI). mkdocs build --strict passes.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
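The `Path.resolve()` allowlist described in the security model can be sketched roughly as follows. This is an illustration of the general technique only: the function name and the relative root list are assumptions, and the real check in server.py may differ in detail.

```python
from pathlib import Path


def is_allowed(candidate: str, roots: list[Path]) -> bool:
    """Return True if candidate, once resolved, lies inside one of the allowed roots.

    Resolving first collapses '..' segments and follows symlinks, so a path
    cannot escape a root by traversal. Requires Python 3.9+ (is_relative_to).
    """
    resolved = Path(candidate).resolve()
    return any(resolved.is_relative_to(root.resolve()) for root in roots)


# Hypothetical roots, named after the built-in roots listed above.
roots = [Path("testdata"), Path("python/test/resource"), Path("model")]

print(is_allowed("testdata/ref.yuv", roots))           # inside testdata/
print(is_allowed("testdata/../../etc/passwd", roots))  # escapes via '..'
```

Comparing resolved paths, rather than checking string prefixes, is what makes the traversal case fail.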
…h 5)
Fifth batch of the project-wide doc-substance sweep. Closes the
remaining ADR-0100 surface bars:
- Build flag bar: docs/development/build-flags.md
- Every option in libvmaf/meson_options.txt with type, default, and
effect: enable_tests, enable_docs, enable_tools, enable_asm,
enable_avx512, built_in_models, enable_float, enable_cuda,
enable_nvtx, enable_nvcc, enable_sycl, sycl_compiler, enable_dnn.
- Flag-interaction notes (enable_avx512 auto-downgrade on nasm<2.14,
enable_cuda + enable_sycl coexistence, enable_float as additive
not replacement, enable_dnn=auto silent-skip pitfall).
- Standard Meson options that change the artifact (buildtype,
default_library, b_ndebug, b_sanitize, b_lto, c_args with the
VMAF_PICTURE_POOL gate).
- Recommended configurations block (fast-iteration CPU,
release+CUDA+NVTX, CI golden-gate matching, tiny-AI tests,
sanitizer run).
- Documents the undefined enable_hip option referenced by the
build-vmaf skill so future users don't pass it and wonder why
configure fails.
- GPU backend bar deepening (known gaps + ULP tolerance):
- docs/backends/cuda/overview.md — explicit bit-identity statement
vs CPU scalar (no fp slack), known-gaps list (no CUDA kernel for
CAMBI, CIEDE, SSIM, MS-SSIM, PSNR, PSNR-HVS, ANSNR, float_*).
- docs/backends/sycl/overview.md — same treatment, plus fp16-forced-
off note, dmabuf Linux-only caveat, HIP-via-DPC++ availability
note, programmatic profiling pointer.
- docs/backends/index.md — corrected the HIP row: claimed
'experimental' with a -Denable_hip flag, but the option does not
exist in meson_options.txt. Marked as 'not yet scaffolded' and
pointed at /add-gpu-backend hip.
Wired docs/development/build-flags.md into docs/index.md Development
section and mkdocs.yml nav.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lag fix (ADR-0100 batch 6)
Sixth batch of the project-wide doc-substance sweep. Closes two gaps:
- docs/usage/ffmpeg.md: the libvmaf filter (shipped by upstream FFmpeg, wraps this repo's C API) had examples but no options reference. Added a table covering the user-tunable options (model, log_path, log_fmt, feature, pool, n_threads, n_subsample) with types + defaults, plus multi-feature / multi-model examples (pipe-separated `name=` / `version=` grammar) and a 'when to use the CLI instead' section for the surfaces the filter doesn't expose (--precision, tiny-AI flags, explicit --no_cuda / --no_sycl). Points at the FFmpeg source (libavfilter/vf_libvmaf.c) as authoritative.
- docs/usage/docker.md: claimed the SYCL backend is 'selected at CLI level with --sycl'. No such flag exists — cli_parse.c only defines --no_sycl (opt-out) and --sycl_device N (pin). Fixed to describe the actual auto-select + opt-out behaviour and pointed at cli.md#backend-selection.
This is the same class of phantom-flag fix as the VMAF_FORCE_BACKEND correction in batch 1 and the enable_hip row in batch 5 — users following the doc verbatim would get 'unrecognized option' errors. Surface those claims as they turn up.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Seventh batch of the project-wide doc-substance sweep. Continues the phantom-flag sweep started in batch 1 (VMAF_FORCE_BACKEND), batch 5 (enable_hip), batch 6 (docker --sycl).
docs/reference/faq.md claimed 'To enable a backend at call time, use --cuda or --sycl on the vmaf CLI' — neither flag exists. cli_parse.c defines only --no_cuda (opt-out), --no_sycl (opt-out), and --sycl_device N (pin). GPU backends are auto-selected when compiled in.
Also normalised the meson setup command to match CLAUDE.md §2 and the build-vmaf skill (cd libvmaf first, then meson setup build) instead of the <builddir> <sourcedir> form nothing else in the docs uses.
Pointed at docs/development/build-flags.md (added in batch 5) and the updated docs/backends/index.md (HIP row corrected in batch 5) so the FAQ's backend answer now matches the per-surface pages.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eighth and final phantom-flag fix in this sweep. Grepped docs/ for --cuda / --sycl (as backend selectors, not the real --sycl_device) and cleaned the last two occurrences:
- docs/usage/ffmpeg.md: claimed the SYCL backend is reached through 'vmaf ... via --sycl'. No such flag; SYCL auto-selects when built with -Denable_sycl=true. Replaced with the accurate auto-select + --sycl_device pin + --no_sycl opt-out description.
- docs/models/overview.md: claimed models run on 'CUDA, SYCL, and HIP backends (select via --cuda / --sycl / build flags)'. HIP isn't scaffolded yet (batch 5 corrected that in backends/index.md) and --cuda / --sycl don't exist. Rewrote to: CUDA + SYCL only, enabled at build time, auto-selected at runtime, opt-out per invocation.
After this commit `rg -- '--cuda[^_=]|--sycl[^_=]' docs/` returns only the three 'there is no --cuda/--sycl selector' disambiguation sentences we intentionally added in batches 6-8. No phantom-flag claims remain in the docs tree.
Refs ADR-0100.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
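To see why the pattern in the `rg` command separates phantom selectors from real flags, the sketch below applies the same regular expression with Python's `re` module. The sample lines are made up for illustration, and Python's regex engine stands in for ripgrep's, so edge-case behaviour may not be identical.

```python
import re

# Same pattern as the rg invocation: --cuda or --sycl followed by any
# character other than '_' or '='.
PHANTOM = re.compile(r"--cuda[^_=]|--sycl[^_=]")

lines = [
    "use --cuda or --sycl on the vmaf CLI",  # phantom selectors
    "pin a device with --sycl_device 0",     # real flag, '_' follows --sycl
    "opt out with --no_cuda or --no_sycl",   # real flags, no '--cuda' substring
]

hits = [line for line in lines if PHANTOM.search(line)]
for line in hits:
    print(line)

# Caveat: [^_=] needs a following character, so in this Python sketch a
# flag at the very end of a string is not reported.
print(PHANTOM.search("ends with --sycl"))
```

Only the first sample line is reported, which matches the intent of excluding `--sycl_device`, `--no_cuda` and `--no_sycl` from the sweep.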
- docs/api/gpu.md: the `-Denable_hip=true` meson option does not exist (phantom flag). Normalise the HIP "Limitations" bullet to match backends/index.md's "planned — meson option does not exist yet" phrasing.
- docs/mcp/tools.md: anchor `cli.md#output-modes` → `#output` (the real `## Output` heading in cli.md).
- docs/metrics/features.md: anchor `api/index.md#vmaffeaturedictionary--tuning-extractors` → `#vmaffeaturedictionary` (the real heading slug).
- docs/reference/faq.md: anchor `principles.md#netflix-golden-data-gate` → `#31-netflix-golden-data-gate` (the real H3 slug under "3. Quality gates").
`mkdocs build --strict` now reports zero "contains an anchor" errors against the three fixed links.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
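The anchor fixes above follow from how heading text is turned into a slug. The sketch below is a simplified approximation of the default Python-Markdown `toc` slug rule that MkDocs uses (it omits the Unicode normalisation step), and the heading text "3.1 Netflix golden-data gate" is an assumption reconstructed from the slug, not copied from principles.md.

```python
import re


def slugify(heading: str) -> str:
    """Approximate heading-to-anchor conversion: drop punctuation, lowercase, hyphenate."""
    # Remove everything except word characters, whitespace and hyphens.
    text = re.sub(r"[^\w\s-]", "", heading).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", text)


print(slugify("Output"))
print(slugify("VmafFeatureDictionary"))
print(slugify("3.1 Netflix golden-data gate"))
```

The dot in a numbered heading is dropped rather than replaced, which is why the section number ends up fused into the slug as `31-`.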
Verified against libvmaf/include/libvmaf/libvmaf_sycl.h and libvmaf/src/sycl/. The earlier page was accurate for CUDA but carried over assumptions that do not match SYCL's actual implementation:
1. "Simple path same semantics as CUDA" — wrong. On SYCL, `vmaf_sycl_preallocate_pictures` is a no-op stub and `vmaf_sycl_picture_fetch` just calls `vmaf_picture_alloc()` (regular host buffer). The `DEVICE`/`HOST` enum values are declared for CUDA-symmetry but are not acted upon (libvmaf/src/libvmaf.c:398-415). Document the stub behaviour and point users at the frame-buffer API as the real GPU-resident path.
2. "OpenCL-backend SYCL builds fall back silently to upload_plane" — wrong. `vmaf_sycl_dmabuf_import` and `vmaf_sycl_import_va_surface` call `sycl::get_native<ext_oneapi_level_zero>` directly (libvmaf/src/sycl/dmabuf_import.cpp:79). On a non-Level-Zero backend the SYCL exception is caught and returned as `-EIO` with an error log. No implicit fallback — the caller must choose `vmaf_sycl_upload_plane` themselves.
3. "D3D11 import is DMA-free on the source side" — wrong. The header comment says "creates staging texture → copies decoded surface → maps for CPU read → uploads via H2D memcpy" — that is GPU→CPU→GPU. Re-describe as a full host roundtrip; keep the NT-handle future-work note.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Continuing the ADR-0100 doc-substance audit surfaced two implementation bugs behind the SYCL public API:
1. [issue #26] `vmaf_sycl_preallocate_pictures` is a no-op stub that silently ignores the `VmafSyclPicturePreallocationMethod` enum. Callers requesting USM-device allocation get a regular host picture and no way to detect the mismatch.
2. [issue #27] `vmaf_sycl_import_d3d11_surface` is declared in the public header under `#ifdef _WIN32` but has zero implementations in-tree — ghost symbol. Any Windows caller would fail to link.
Doc now labels both as known bugs with GitHub issue cross-refs so readers are not misled while the implementation questions are resolved. Re-wrote the D3D11 bullet list entry + Limitations entry to stop describing the (imaginary) staging-texture path as if it existed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per user correction: the Netflix golden-data gate applies to the CPU
scalar + fixed-point path only — not to CUDA, SYCL, or SIMD backends.
GPU paths have never been bit-exact with CPU (upstream or fork),
only close (~6 decimals). Different reduction orders, parallel-prefix
scans, and FMA contractions on parallel hardware introduce small
ULP-level deltas that cannot be eliminated without giving up the
throughput the GPU is there for in the first place.
Earlier batches had repeated "scores are bit-identical across
backends" / "CUDA integer kernels are bit-identical by design" /
"golden-data gate on all backends" — all wrong. docs/principles.md
§3.1 was already correct ("CPU only") so the source of truth didn't
drift; the derivative pages did.
Fixed pages:
- docs/backends/cuda/overview.md §"Numerical tolerance" — rewrite
to "close agreement, not bit-exact"; remove the "no fp slack on
VIF/ADM/Motion2" claim.
- docs/backends/sycl/overview.md §"Numerical tolerance" — same
rewrite; keep the fp16/fixed-iteration-order notes as
"reduces but does not eliminate" the deviation.
- docs/models/overview.md §"GPU and SIMD acceleration" — fix the
one-liner that claimed goldens enforce cross-backend identity.
- docs/development/release.md — gate runs CPU only; GPU/SIMD covered
by per-backend snapshot tests (testdata/scores_cpu_*.json +
netflix_benchmark_results.json) at ULP tolerance.
- docs/reference/faq.md — rewrite the golden-contract answer with
explicit CPU/GPU split.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
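The point about reduction order can be shown with a minimal sketch: floating-point addition is not associative, so summing the same values in a different grouping can change the last bits of the result. The three values and the 1e-6 tolerance are illustrative only (the tolerance loosely mirrors the "~6 decimals" figure above) and are not taken from the snapshot tests.

```python
import math

values = [0.1, 0.2, 0.3]

# Left-to-right accumulation, as a serial CPU loop would do.
serial = (values[0] + values[1]) + values[2]

# A different grouping, as a parallel tree reduction might produce.
tree = values[0] + (values[1] + values[2])

print(serial, tree, serial == tree)

# Bit-exact comparison fails, but a tolerance-based comparison passes.
print(math.isclose(serial, tree, abs_tol=1e-6))
```

This is why a gate that demands identical output suits a single fixed evaluation order, while backends with different reduction orders need a tolerance-based check.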
- docs/ai/security.md: VMAF_TINY_MODEL_DIR is not implemented in vmaf_dnn_validate_onnx (no getenv call in libvmaf/src/dnn/model_loader.c). Replaced the claim with a "Planned (not yet implemented)" callout that points at tracking issue #28. Today the loader trusts caller-supplied paths once realpath + S_ISREG + readable checks pass; MCP callers retain their separate path allowlist.
- docs/api/index.md: VmafConfiguration.gpumask is described as "disable CUDA" but libvmaf.c:694-698 treats any non-zero value as a boolean that disables BOTH CUDA and SYCL. Corrected the inline comment and added a caveat blockquote noting the field is not actually per-bit despite the uint64_t type. Per-backend opt-out lives on the CLI (--no_cuda / --no_sycl).
Follow-up issues: #28 (tiny-model path-jail), #29 (picture_pool ghost).
- docs/api/gpu.md: vmaf_sycl_profiling_enable only flips a bool
(libvmaf/src/sycl/common.cpp:1053-1059). The SYCL queue's
enable_profiling property is fixed at vmaf_sycl_state_init() time, so
calling the enable-fn on a state that was inited without profiling
succeeds silently but triggers sycl::exception on later
get_profiling_info calls. Old wording ("use on a state that was inited
with the flag set") buried the lede; new wording states the requirement
plus the failure mode explicitly.
- docs/api/dnn.md: vmaf_dnn_session_run_luma8 also returns -EINVAL for
NULL args, -ENOTSUP when ORT under-fills the output buffer, and the
whole DNN surface degrades to -ENOSYS when built without ONNX Runtime
(enable_onnx=false). Documented all four cases.
No code changes.
Two low-severity audit findings from round-2 verification against libvmaf/src/sycl/common.cpp + libvmaf/src/sycl/dmabuf_import.cpp:
- vmaf_sycl_list_devices enumerates device_type::gpu only and prints platform, vendor, driver version, and fp64 flag per device — not just the ordinal. Returns -EIO on sycl::exception. Previous one-line description understated the output.
- Zero-copy ingest paths catch sycl::exception and convert to -EIO, but the log message is the generic exception text — callers can't string-match for "Level Zero required". Documented the correct graceful-degradation pattern: sniff queue::get_backend() up front rather than parsing the log.
No behaviour changes — docs-only.
Round-3 audit against libvmaf/src/dnn/ort_backend.c:85-98 found that:
- VmafDnnDevice AUTO / OPENVINO / ROCM all fall through to the default CPU EP. The switch only special-cases CUDA (under ORT_API_HAS_CUDA). No SessionOptionsAppendExecutionProvider_OpenVINO / _MIGraphX calls exist anywhere.
- VmafDnnConfig.fp16_io is declared in the public header but never read anywhere in libvmaf/src/dnn/. It is a pure ghost field.
Doc updates:
- Device-config section: AUTO = CPU today; OPENVINO / ROCM / fp16_io explicitly labelled "accepted but ignored" with issue #30 backref.
- Return-code tables: added -ENOMEM to both vmaf_use_tiny_model and vmaf_dnn_session_run — both are reachable (session open / tensor allocation failures).
- ENOSPC semantics clarified: outputs[i].written = produced is set BEFORE the capacity check, so the required-size hint is always available on -ENOSPC.
- Known-limitations: added fp16_io + OPENVINO/ROCM rows (previously claimed OPENVINO "requires an ORT built with the OpenVINO EP" which is only true *once* the EP wire-up lands).
Follow-up code-fix: #30.
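The -ENOSPC semantics described above (the produced size is recorded before the capacity check, so the caller always learns the required size) can be modelled with a toy Python simulation. Everything here is hypothetical: `Output`, `run` and the element counts are stand-ins for illustration and do not reproduce the C API's types or signatures.

```python
import errno
from dataclasses import dataclass


@dataclass
class Output:
    """Toy stand-in for an output binding with a caller-provided capacity."""
    capacity: int
    written: int = 0


def run(out: Output, produced: int) -> int:
    """Simulated call: record the produced size first, then check capacity."""
    out.written = produced  # set BEFORE the capacity check
    if produced > out.capacity:
        return -errno.ENOSPC
    return 0


# First attempt with a buffer that is too small.
out = Output(capacity=4)
rc = run(out, produced=10)

# On -ENOSPC, 'written' holds the required size, so one retry suffices.
if rc == -errno.ENOSPC:
    out = Output(capacity=out.written)
    rc = run(out, produced=10)

print(rc, out.capacity, out.written)
```

Because the size hint is available on the failure path, a caller needs at most one reallocation instead of guessing and doubling.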
3823797 to
fb6da65
Summary
Implements ADR-0100's per-surface doc bar across the four user-discoverable surface areas that had thin or missing documentation. Four batches, one commit each.
Batches
Batch 1 — `docs/usage/` CLI surface + `docs/backends/index.md` fix
- `docs/usage/cli.md` (full `vmaf` reference: flags, models, tiny-AI, presets, exit codes, Netflix-golden worked example)
- `docs/usage/bench.md` (`vmaf_bench` harness)
- `docs/usage/precision.md` (`--precision` deep dive, round-trip rationale)
- `docs/backends/index.md` — removed phantom `VMAF_FORCE_BACKEND` env var claim (grep showed nothing in C / Python reads it); documented real `--no_cuda` / `--no_sycl` / `--sycl_device` mechanism instead.
Batch 2 — `docs/api/` public C API reference
- `docs/api/index.md` — core (`libvmaf.h` + `picture.h` + `model.h` + `feature.h`): lifecycle, VmafConfiguration, picture ownership, feature dictionary, model collections, ABI tiers, thread-safety, end-to-end runnable C example.
- `docs/api/dnn.md` — tiny-AI session API (ADR-0040 named binding, `-ENOSPC` retry, runnable luma-filter example).
- `docs/api/gpu.md` — CUDA (`libvmaf_cuda.h`) + SYCL (`libvmaf_sycl.h`): zero-copy frame buffers, dmabuf / VA / D3D11 import, profiling, `state_free` vs `list_devices` asymmetry.
Batch 3 — `docs/metrics/features.md` expansion
- … `libvmaf/src/feature/`.
Batch 4 — `docs/mcp/` per-tool reference
- `docs/mcp/index.md` — MCP server overview: install, Claude Desktop config, env vars, security model (argv exec + `Path.resolve()` allowlist), when-not-to-use.
- `docs/mcp/tools.md` — per-tool reference for all 6 tools exposed by `mcp-server/vmaf-mcp/` (README only listed 4; live server.py has 6 — added `eval_model_on_split` + `compare_models`). Input schema, behaviour, JSON example call + response, enumerated error codes per tool.
All new pages wired into `docs/index.md` + `mkdocs.yml` nav. `mkdocs build --strict` passes clean on every commit.
Test plan
- `./.venv/bin/mkdocs build --strict` clean on each commit
- `VMAF_FORCE_BACKEND` is not referenced by any runtime code (only stale docs + vestigial CI coverage export)
- … `libvmaf/include/libvmaf/`
- … `libvmaf/src/feature/` extractor `.options[]` arrays and `provided_features`
- … `mcp-server/vmaf-mcp/src/vmaf_mcp/server.py` `list_tools()` handler
🤖 Generated with Claude Code