BC5 readiness: pairwise BC4↔BC5 + check-bc5-compat orchestrator + scheduled CI#308
BC5 readiness: pairwise BC4↔BC5 + check-bc5-compat orchestrator + scheduled CI#308jeremy wants to merge 7 commits into
Conversation
Extends `conformance/schema.json` with `pairwiseAssertions[]` and four rule types: pairwiseSupersetArray, pairwiseSupersetKeys, pairwiseEqual, pairwiseDeltaAllowed. The first three enforce the additive-only invariant between BC4 and BC5 snapshots of the same operation; the fourth is a per-test allowlist with a required `reason` so a future public compatibility report can index accepted divergences. Per-backend schema validation alone can't catch this class of regression: with every new BC5 field marked optional, a hypothetical case where BC4 emits `memories: [items]` and BC5 emits `memories: []` passes each backend's schema independently. The pairwise rules close that loop. Path syntax: dotted identifiers relative to each snapshot. The empty string addresses the body root; `foo.bar` defaults to `pages[0].body.foo.bar` for single-page snapshots; explicit `pages[N].body.X` or aggregate `pages[*].body.X` are available when needed. Adds the canonical canary rule on `GetMyNotifications`: BC3 commit 64acf34 aliases BC5 memories[] to bubble_ups[] so BC4 API consumers keep receiving the same population on BC5; the pairwiseSupersetArray rule fires immediately if that alias ever drops. A pairwiseSupersetKeys rule on the body root catches any other key removal between backends.
Implements `scripts/compare-canary-runs.sh`, the executable side of the pairwise rules added in the prior commit. Reads BC4 and BC5 snapshot directories (one wire snapshot per operation, format established by the TS live-runner), looks up each test's `pairwiseAssertions` in `live-my-surface.json`, and applies the four rule types in turn. Path resolution: `""` → body root, `foo.bar` → `pages[0].body.foo.bar`, `pages[N].body.X` → explicit page, `pages[*].body.X` → aggregate across pages (collected into a list, so SupersetArray on the aggregate checks "number of pages emitting this", not a sum of inner lengths; that's an intentional semantic — multi-page aggregation is the escape hatch, single-page page-relative addressing is the common case). Exit codes: 0 clean, 1 violations, 2 operator error. Violation output groups by operation and includes the rule + display path so the specific invariant that fired is obvious without re-running. Identical account state across the two runs is mandatory for pairwise results to be meaningful; CONTRIBUTING.md documents this in the forthcoming docs commit.
Adds two opt-in targets that thread the live canary pipeline end-to-end. `make conformance-live` runs one backend's pass: TS captures canonical wire snapshots, then Ruby/Python/Go/Kotlin replay-decode them. Requires LIVE_RECORD_DIR + BASECAMP_BACKEND in the env; not invoked by `make check`. `make check-bc5-compat` is the top-level orchestrator: it wipes LIVE_RECORD_DIR (default `tmp/live-canary`), runs `conformance-live` once with BASECAMP_BACKEND=bc4, again with BASECAMP_BACKEND=bc5 against BC5_HOST, then invokes `scripts/compare-canary-runs.sh` on the two captured snapshot directories. Failure at any stage fails the whole target. The TS runner writes to `$LIVE_RECORD_DIR/$BASECAMP_BACKEND/wire/`, so one LIVE_RECORD_DIR root is reused across both passes and the per- backend distinction lives in BASECAMP_BACKEND. The plan documents this explicitly to avoid a separate-dir mental model that would diverge from what the TS runner actually does.
Adds `.github/workflows/live-canary.yml` — a nightly cron + manual workflow_dispatch trigger that runs `make check-bc5-compat`. Gated on repo secrets (BASECAMP_TOKEN, BASECAMP_ACCOUNT_ID, BC5_HOST); the job no-ops with a clear log message when the secrets aren't configured rather than failing the run, so forks and clones without canary access don't break. Sets up the full polyglot toolchain mirroring the existing Conformance Tests job — Go, Node, Ruby, Python, Java/Gradle — since the orchestrator invokes one capture + four replays. Uploads the entire `tmp/live-canary` tree as a workflow artifact with 14-day retention so failures can be inspected post-hoc without rerunning a costly live capture. CONTRIBUTING.md "Live canary" section gains a "Pairwise BC4↔BC5 comparison" subsection (the four rule types, the path syntax, the canonical `memories` example), an "Orchestrator" subsection (the make target wiring with the BC4-then-BC5-then-compare flow), and a "Scheduled CI" subsection (secret requirements + opt-in behavior). The stale "PR 4 lands the orchestrator" reference removed; the live infra is now the documented state, not a forward promise. Identical account state across the two runs is the load-bearing operational requirement and is called out explicitly in the docs — the pairwise rules assume the BC4 and BC5 backends see equivalent fixtures; without that, `unreads`-style natural drift will false-fail rules.
PR 4 is now shipping the orchestrator + pairwise comparison, so the forward-looking "PR 3 / PR 4" markers in CONTRIBUTING and COORDINATION no longer describe pending work — they describe what's in this commit's parent series. Rewrite the references to name the artifacts directly (`make conformance-*-replay`, `scripts/compare-canary-runs.sh`, `make check-bc5-compat`) so the docs stay accurate after PR sequencing shifts.
Sensitive Change Detection (shadow mode)This PR modifies control-plane files:
|
There was a problem hiding this comment.
Pull request overview
Adds the missing “BC4 ↔ BC5 additive-only invariant” layer to the live canary by introducing pairwise assertions, a snapshot comparison script, an end-to-end Makefile orchestrator, and a nightly GitHub Actions workflow to run the pipeline.
Changes:
- Extend the conformance test schema + live surface test fixture with
pairwiseAssertionsrules (superset arrays/keys, equality, and allowlisted deltas). - Add
scripts/compare-canary-runs.shto apply pairwise rules across two backend snapshot directories. - Add Makefile orchestration targets (
conformance-live,check-bc5-compat) and a scheduled workflow that runs the canary and uploads artifacts.
Tip
If you aren't ready for review, convert to a draft PR.
Click "Convert to draft" or run gh pr ready --undo.
Click "Ready for review" or run gh pr ready to reengage.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
scripts/compare-canary-runs.sh |
New pairwise snapshot comparator that enforces additive-only invariants between BC4 and BC5. |
Makefile |
Adds orchestration targets to run BC4 pass, BC5 pass, then pairwise compare. |
conformance/schema.json |
Adds schema support for pairwiseAssertions rule definitions. |
conformance/tests/live-my-surface.json |
Adds canonical pairwise rules for GetMyNotifications (memories array + top-level keys). |
.github/workflows/live-canary.yml |
Nightly + manual workflow to run make check-bc5-compat and upload snapshots as artifacts. |
CONTRIBUTING.md |
Documents pairwise comparison semantics, orchestrator usage, and scheduled CI behavior. |
COORDINATION.md |
Minor update reflecting that PR4 plan item is now landed here. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a663ad9301
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
5 issues found across 7 files
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic
… rule Post-launch reconciliation for the pairwise BC4<->BC5 canary. Clears the 14 review threads on #308 and corrects the memories rule to match what BC5 master actually shipped. scripts/compare-canary-runs.sh: - P0 (codex/cubic): match the TS live runner's snapshot filename scheme exactly -- sanitize the test name with [^A-Za-z0-9_-]+ -> "_" (case preserved), not lowercase/hyphen/operation-first. The old form matched no files, so check-bc5-compat could report clean without comparing anything. - P1 (cubic): a declared pairwise test missing a snapshot on either backend is now a hard error (exit 2), not a silent skip-and-pass. - empty `paths` loop (copilot): guard ${#RULE_PATHS[@]} instead of "${RULE_PATHS[@]:-}" so an empty array can't run against the body root. - pairwiseEqual (cubic): compare via jq deep-equality so object key-order differences don't false-fail. - exit-code contract (copilot): snapshot-read jq failures remap to the documented exit 2 instead of leaking jq's status under set -e. conformance/schema.json: pairwiseAssertions[].paths gains minItems:1. Makefile: - rm -rf "$(LIVE_RECORD_DIR)" guarded against empty / "/" / "..". - conformance-live validates LIVE_RECORD_DIR/BASECAMP_BACKEND before the TS live pass (invoked from the recipe after the guards) rather than after it. - check-bc5-compat comment: "reports all violations", not "fails on first". - new check-compare-canary target, wired into `make check`. .github/workflows/live-canary.yml: default BASECAMP_HOST in the workflow expression so an unset optional secret can't clobber the Makefile default. CONTRIBUTING.md: align the "all violations" wording; reframe the memories example to the waiver pattern. memories canary rule (conformance/tests/live-my-surface.json): the prior "BC3 commit 64acf34 aliases memories[] to bubble_ups[]" claim is counterfactual -- that commit does not exist, and BC5 master shipped `json.memories []` while BC4 four still populates it. The pairwiseSupersetArray rule now states the intended invariant and is waived by a temporary pairwiseDeltaAllowed entry pointing at spec/api-gaps/memories-emptied-regression.md (added on the reconciliation branch); remove the waiver once BC3 #10947 (which carries `json.memories @bubble_ups`) merges. scripts/test-compare-canary-runs.sh: network-free regression coverage for the filename scheme (incl. proof the old form finds nothing), missing-snapshot hard-fail, memories-waiver scoping, pairwiseEqual key-order, and the empty-paths guard.
The waiver `reason` strings read "BC4 four still populates it", which scans as a typo to anyone without the post-launch topology in hand (BC4 is the `four` branch). Reword to "BC4 (the four branch)" in the live-my-surface.json description + pairwiseDeltaAllowed reason and the CONTRIBUTING.md example. Addresses the Copilot re-review on PR #308.
There was a problem hiding this comment.
3 issues found across 7 files (changes from recent commits).
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
…s branch The brief's SDK-absorption runbook read as if the pairwiseDeltaAllowed waiver and conformance/tests/live-my-surface.json were present here, but those canary files are on PR #308's branch (this branch is the registry record only). Reword to state #308 introduces the waiver and that it gets removed from live-my-surface.json once #308 and BC3 #10947 both land. Also reframe the "no Smithy change" bullet to "no structural change" — this PR does realign the memories doc comment (documentation, not a shape change).
COORDINATION.md said the pairwiseSupersetArray rule "is waived via pairwiseDeltaAllowed", which reads as if the canary surface exists on this branch — it does not (conformance/tests/live-my-surface.json is on PR #308). Reword to attribute the rule + waiver to #308's branch, matching the same fix already made in the memories-emptied-regression brief.
Summary
Closes the BC5-readiness canary loop: PRs 1–3 built the per-backend live capture (TS) and cross-language decode (Ruby/Python/Go/Kotlin) infrastructure. PR 4 adds the pairwise BC4↔BC5 comparison that catches the additive-only invariant — and the orchestrator + scheduled CI that runs the whole pipeline end-to-end on a daily cron.
Stacked on #301 (PR 3 — cross-language wire-replay decoders). Once #301 merges, this PR retargets
main.Why pairwise
Per-backend schema validation alone can't detect a class of regression: with every new BC5 field marked optional, a case where BC4 emits
memories: [items]and BC5 emitsmemories: []passes each backend's schema independently — yet that's exactly the additive-only invariant the canary should catch. This is not hypothetical: BC5mastershipped precisely this regression (see below).The canonical canary rule lives on
GetMyNotifications. The intended invariant: BC5memories[]should remain a superset of BC4's. But BC5mastershippedjson.memories []while BC4 (thefourbranch) still populates it — a confirmed live regression, not the once-assumed alias (there is no commit 64acf34). The fix is written but unmerged: BC3 #10947 carriesjson.memories @bubble_ups(head9dc63e2e). So thepairwiseSupersetArray: ["memories"]rule is currently waived by a temporarypairwiseDeltaAllowed: ["memories"]entry pointing atspec/api-gaps/memories-emptied-regression.md; remove the waiver (restoring the hard-fail) once #10947 merges and provenance is repinned.What's in this PR
conformance/schema.json—pairwiseAssertions[]with four rule types:pairwiseSupersetArray,pairwiseSupersetKeys,pairwiseEqual,pairwiseDeltaAllowed(with requiredreasonso a future public compatibility report can index accepted divergences).conformance/tests/live-my-surface.json— thememoriesSupersetArray rule + a top-level SupersetKeys rule onGetMyNotifications.scripts/compare-canary-runs.sh— applies pairwise rules to a pair of snapshot directories. Exit 0 clean / 1 violation / 2 operator error.Makefile—make conformance-live(one backend's capture + 4 replays) andmake check-bc5-compat(BC4 pass + BC5 pass + pairwise). Both opt-in; not inmake check..github/workflows/live-canary.yml— nightly cron +workflow_dispatch, gated on repo secrets (no-ops with a log message when missing rather than failing the run). Uploads snapshots as a 14-day artifact so failures can be inspected post-hoc without rerunning.CONTRIBUTING.md— new "Pairwise BC4↔BC5 comparison", "Orchestrator", and "Scheduled CI" subsections under "Live canary".Path syntax (dotted identifiers on each snapshot)
""(empty string) addresses the body root.foo.bardefaults topages[0].body.foo.bar— the common single-page case.pages[N].body.Xaddresses a specific page.pages[*].body.Xaggregates across pages into a list (each page's value becomes one element; useful when you want a per-page count, not a sum of inner lengths).Test plan
jq empty conformance/schema.json— clean JSONjq empty conformance/tests/live-my-surface.json— clean JSONbash -n scripts/compare-canary-runs.sh— shell syntax cleanpairwiseSupersetArrayflags BC5 memories shorter than BC4pairwiseSupersetKeysflags BC5 missing a top-level key BC4 haspairwiseEqualflags value divergencepairwiseDeltaAllowedwaives a flagged pathpages[*].bodyaggregation collects per-page valuesmake -n check-bc5-compatresolves correctlyactionlint .github/workflows/live-canary.ymlcleanzizmor .github/workflows/live-canary.ymlcleanmake validate-api-gapscleanmake check-bucket-flat-paritycleanOperational requirement
Pairwise rules assume identical account state across the BC4 and BC5 runs. The same project list, same notifications, etc. Without that, naturally-drifting collections (
unreads, etc.) will false-fail rules. CONTRIBUTING.md calls this out explicitly; the canary needs a dedicated test account with stable, equivalent fixtures replicated to each backend before scheduled CI can pass meaningfully.Sequencing
After this lands, the BC5-readiness canary infrastructure is feature-complete (PRs 1–4). Remaining BC5 readiness work:
scripts/detect-api-gaps.sh+ weekly cron workflow (api-gap detection tooling; validation already shipped in PR 1).main).Summary by cubic
Adds pairwise BC4↔BC5 comparison to the live canary, a
check-bc5-compatorchestrator, and a nightly workflow that runs the full pipeline. This flags additive-only regressions (arrays shrinking, keys disappearing, or value drift) between backends.New Features
conformance/schema.json:pairwiseAssertions[]withpairwiseSupersetArray,pairwiseSupersetKeys,pairwiseEqual, andpairwiseDeltaAllowed(requiresreason).scripts/compare-canary-runs.sh(exits 0 clean / 1 violation / 2 operator error),Makefiletargetsconformance-liveandcheck-bc5-compat, and.github/workflows/live-canary.yml(nightly + manual, gated on secrets, uploads 14‑day artifacts).conformance/tests/live-my-surface.json: adds thememoriessuperset rule and a top-level superset-keys check onGetMyNotifications; thememoriesrule is currently waived viapairwiseDeltaAlloweddue to a confirmed BC5 regression, tracked in the API gaps doc.Bug Fixes
scripts/compare-canary-runs.sh: matches the TS snapshot filename scheme exactly, treats missing snapshots as a hard error (exit 2), uses jq deep-equality forpairwiseEqual(key-order agnostic), rejects emptypaths, and normalizes jq read failures to exit 2.scripts/test-compare-canary-runs.shandmake check-compare-canary(now part ofmake check); guardsrm -rfinMakefile; workflow defaultsBASECAMP_HOSTtohttps://3.basecampapi.com.live-my-surface.jsonandCONTRIBUTING.md.Written for commit 45ca130. Summary will update on new commits.
Review in cubic