docs(state): record 2026-05-02 session bug-status changes (7 closes) #307
Merged
Per CLAUDE.md §12 r13, every PR that closes a bug, opens a bug, or rules a Netflix upstream report not-affecting-the-fork updates docs/state.md in the same PR. Today's session shipped 7 such PRs without state.md updates (most were drafted in subagent-driven runs where the bookkeeping slipped). This is the catch-up.

Recently closed (added):
- vmaf_tiny_v1.onnx external-data ref fix (PR #296)
- kernel_template SSBO cap 8->16 (PR #288 + named constant in #292)
- deliverables-check.sh backslash strip (PR #292)
- CI workflows draft-skip guard (PR #300)
- CLAUDE.md §12 r14 ffmpeg-patches gate fix (PR #297)
- ADR slug-drift cleanup (PR #304)
- 1.07e-3 CPU vmaf_v0.6.1 score drift bisect (PR #305): confirmed inherited from upstream a44e5e6 motion edge-mirror fix; not a fork regression. Snapshot regen in separate PR aligns with fork.

No code changes. State.md only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
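The state.md row for the drift entry notes that 1.07e-3 sits well within the Netflix golden `places=2` tolerance, which is why the gate did not fire. A minimal sketch of that arithmetic follows, assuming the tolerance behaves like Python's `unittest` `assertAlmostEqual` (the rounded absolute difference must equal zero). The baseline score is hypothetical and only the drift magnitude comes from this PR.

```python
DRIFT = 1.07e-3   # score drift reported in this PR
BASELINE = 76.0   # hypothetical vmaf_v0.6.1 score, for illustration only


def within_places(first: float, second: float, places: int) -> bool:
    """Mirror assertAlmostEqual: pass when round(|first - second|, places) == 0."""
    return round(abs(first - second), places) == 0


if __name__ == "__main__":
    # At places=2 the drift rounds to 0.00, so a golden-style check passes.
    print(within_places(BASELINE + DRIFT, BASELINE, places=2))
    # At places=3 the same drift rounds to 0.001 and would be flagged.
    print(within_places(BASELINE + DRIFT, BASELINE, places=3))
```

Under this assumption a `places=2` check tolerates any difference below 0.005, so a stable drift of this size can only be caught by a tighter comparison such as the benchmark snapshot.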
Pull request overview
This PR is a bookkeeping catch-up to keep the fork’s bug-status register (docs/state.md) consistent with the “every bug close/open/rule-out updates state.md” protocol, by adding 7 entries under Recently closed for PRs shipped on 2026-05-02.
Changes:
- Added 7 new “Recently closed” table rows documenting the bug, closing PR, and verification notes.
- Recorded session outcomes across tiny-AI, Vulkan kernel-template limits, CI workflow behavior, and doc hygiene items.
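The CI workflow row describes a draft-skip guard: `pull_request` events run only when the PR is not a draft, while push-to-master and `workflow_dispatch` are unaffected. The decision logic can be modelled as a small predicate. This is an illustrative sketch only; the function name and parameters are hypothetical and the actual workflow `if:` expression is not reproduced here.

```python
def should_run(event_name: str, is_draft: bool = False) -> bool:
    """Model of the guard: only pull_request events are filtered on the draft flag."""
    if event_name != "pull_request":
        # push, workflow_dispatch and other triggers bypass the draft check.
        return True
    # For pull_request events, jobs run only once the PR is out of draft.
    return not is_draft


if __name__ == "__main__":
    for event, draft in [
        ("pull_request", True),       # draft PR: skipped
        ("pull_request", False),      # ready PR (e.g. after ready_for_review): runs
        ("push", False),              # push to master: runs
        ("workflow_dispatch", False), # manual trigger: runs
    ]:
        print(event, draft, should_run(event, draft))
```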
Comment on lines +44 to +50
| Bug | Closed by | ADR | Verification |
|---|---|---|---|
| **`vmaf_tiny_v1.onnx` external-data filename ref broken on load**: ONNXRuntime fails with "External data path validation failed for initializer: 0.weight" because the v1 ONNX referenced `mlp_small_final.onnx.data` while only `vmaf_tiny_v1.onnx.data` was committed | PR #296 (draft, 2026-05-02) | none (artifact-only fix, no ADR) | `python3 -c 'import onnxruntime; onnxruntime.InferenceSession("model/tiny/vmaf_tiny_v1.onnx")'` loads cleanly; the v2-vs-v1 diff path in `validate_vmaf_tiny_v2.py` runs end-to-end (was erroring on v1 load before) |
| **`kernel_template.h` 8-SSBO binding cap blocked `float_adm_vulkan` (9 bindings)**: `vmaf_vulkan_kernel_pipeline_create()` returned `-EINVAL` at init, surfaced as "problem reading pictures" / "problem flushing context" in the cross-backend gate run | PR #288 / commit `bb9d772e` (2026-05-02); bundled with the float_adm_vulkan migration (T-GPU-DEDUP-22) | none (template-extension fix; ADR-0221 covers the template proper) | Cap raised 8 → 16, named `VMAF_VULKAN_KERNEL_MAX_SSBO_BINDINGS` constant introduced in PR #292 (draft); float_adm_vulkan smoke run reports `adm2 mean = 0.934515` (was failing to extract) |
| **`scripts/ci/deliverables-check.sh` mis-stripped backslashes from heredoc-quoted PR bodies**: `gh pr create` heredocs add escaped-backtick sequences that survive `tr -d` (which only strips backticks/asterisks/underscores), breaking the `- [x].*ITEM` regex (~18 PRs affected this session before the diagnosis landed) | PR #292 (draft, 2026-05-02) | none (CI script hardening, no ADR) | Extended `tr -d` to also strip backslashes; a test fixture with literal escaped-backticks around AGENTS.md now prints `OK (ticked)` |
| **CI workflows ran on draft PRs, burning runner-minutes**: none of the 7 `pull_request`-triggered workflows filtered on the draft flag, silently violating single-active-CI policy whenever a subagent pushed a branch as draft | PR #300 (draft, 2026-05-02) | none (CI-infrastructure fix, no ADR) | 33 jobs across 7 workflows now carry a draft-skip guard (`if:` clause that allows `pull_request` events only when `pull_request.draft == false`). The `ready_for_review` event re-triggers CI on un-draft; push-to-master and `workflow_dispatch` are unaffected |
| **CLAUDE.md §12 r14 ffmpeg-patches reviewer command was wrong**: `for p in ffmpeg-patches/000*-*.patch; do git apply --check "$p"; done` only succeeds for patch 0001 because patches 0002–0006 build on each other; correct gate is `git am --3way` series replay against pristine `n8.1` | PR #297 (draft, 2026-05-02) | none (rule wording fix, no ADR) | 2026-05-02 `/refresh-ffmpeg-patches` skill run: per-patch `apply --check` failed on 4/6 patches; `git am --3way` series replay succeeded for all 6 |
| **`docs/state.md` + `CHANGELOG.md` carried 15 stale ADR slug refs** (slug renames where NNNN stayed but filename evolved, e.g. `0152-monotonic-index-rejection.md` → `0152-vmaf-read-pictures-monotonic-index.md`) | PR #304 (draft, 2026-05-02) | none (doc cleanup, no ADR) | mkdocs `--strict` build clean; spot-check verifies each rewritten ref points at the actual on-disk filename for that NNNN. 11 wrong-NNNN refs (different concept under same NNNN, e.g. `0221-gpu-kernel-template.md` while disk-0221 is `vmaf-roi-tool.md`) split into a separate per-ADR-review PR |
| **1.07e-3 CPU `vmaf_v0.6.1` score drift between `/usr/local/bin/vmaf` v3.0.0 and master tip**: surfaced by 2026-05-02 `/run-netflix-bench` subagent run; well within Netflix golden's `places=2` tolerance, so the gate did NOT fire, but the drift was stable + reproducible | PR #305 (draft, 2026-05-02); bisect identifies upstream Netflix `a44e5e61` (motion edge-mirror bugfix, Kyle Swanson 2026-04-17) inherited at fork root. Per-feature isolation: drift is entirely `integer_motion` (-1.005e-3) + `integer_motion2` (-0.985e-3); ADM and VIF are bit-identical. Snapshot regen via separate PR aligns `testdata/netflix_benchmark_results.json` with the fork's actual behavior. | none (bisect triage, no ADR) | `/bisect-regression` predicate against `vmaf_v0.6.1.json` brackets fork root `41301496` ↔ master `4cd3a8d8`; "first bad" = fork root means drift was inherited, not introduced. Doc at `docs/development/cpu-score-drift-bisect-2026-05-02.md` |
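The `deliverables-check.sh` row is easier to follow with a concrete case. Below is a minimal Python model of the strip-then-match step, assuming the script deletes a fixed character set (as `tr -d` does) and then looks for `- [x].*ITEM`. The checklist line and item label are hypothetical, and the real script's item labels and exact regex may differ.

```python
import re

STRIP_BEFORE_FIX = "`*_"     # backticks, asterisks, underscores
STRIP_AFTER_FIX = "`*_\\"    # the fix additionally strips backslashes


def normalise(text: str, strip_chars: str) -> str:
    """Delete every occurrence of the given characters, like `tr -d`."""
    return text.translate({ord(ch): None for ch in strip_chars})


def is_ticked(pr_body: str, item: str, strip_chars: str) -> bool:
    """Return True if a ticked checkbox line in the body names the item."""
    pattern = re.compile(r"- \[x\].*" + re.escape(normalise(item, strip_chars)))
    return any(
        pattern.search(normalise(line, strip_chars))
        for line in pr_body.splitlines()
    )


# Hypothetical checklist line as it might arrive from a quoted heredoc,
# with the backticks around AGENTS.md escaped by backslashes.
BODY = "- [x] \\`AGENTS.md\\` invariant note"
ITEM = "AGENTS.md invariant note"

if __name__ == "__main__":
    # Before the fix a stray backslash stays between "AGENTS.md" and the
    # rest of the label, so the item text no longer matches.
    print(is_ticked(BODY, ITEM, STRIP_BEFORE_FIX))
    # After the fix the backslashes are removed and the checkbox is found.
    print(is_ticked(BODY, ITEM, STRIP_AFTER_FIX))
```

In this model the failure depends on the item label extending past the escaped span; a single-word label would still match through the `.*`, which may explain why the problem went undiagnosed for a while.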
Summary
no docs needed: bookkeeping-only state.md update.
Per CLAUDE.md §12 r13, every PR that closes / opens / rules out a bug updates docs/state.md in the same PR. Today's session shipped 7 such PRs without state.md updates (most slipped during subagent-driven runs). This is the catch-up.
Rows added under "Recently closed"
a44e5e61 (motion edge-mirror bugfix, Kyle Swanson 2026-04-17). NOT a fork regression. Snapshot regen aligns testdata with fork's actual behavior in a separate PR.
Test plan
Six deep-dive deliverables (ADR-0108)