Skip to content

BC5 readiness: pairwise BC4↔BC5 + check-bc5-compat orchestrator + scheduled CI#308

Open
jeremy wants to merge 7 commits into
bc5-readiness-wire-replay-decodersfrom
bc5-readiness-pairwise-orchestrator
Open

BC5 readiness: pairwise BC4↔BC5 + check-bc5-compat orchestrator + scheduled CI#308
jeremy wants to merge 7 commits into
bc5-readiness-wire-replay-decodersfrom
bc5-readiness-pairwise-orchestrator

Conversation

@jeremy
Copy link
Copy Markdown
Member

@jeremy jeremy commented May 16, 2026

Summary

Closes the BC5-readiness canary loop: PRs 1–3 built the per-backend live capture (TS) and cross-language decode (Ruby/Python/Go/Kotlin) infrastructure. PR 4 adds the pairwise BC4↔BC5 comparison that catches the additive-only invariant — and the orchestrator + scheduled CI that runs the whole pipeline end-to-end on a daily cron.

Stacked on #301 (PR 3 — cross-language wire-replay decoders). Once #301 merges, this PR retargets main.

Why pairwise

Per-backend schema validation alone can't detect a class of regression: with every new BC5 field marked optional, a case where BC4 emits memories: [items] and BC5 emits memories: [] passes each backend's schema independently — yet that's exactly the additive-only invariant the canary should catch. This is not hypothetical: BC5 master shipped precisely this regression (see below).

The canonical canary rule lives on GetMyNotifications. The intended invariant: BC5 memories[] should remain a superset of BC4's. But BC5 master shipped json.memories [] while BC4 (the four branch) still populates it — a confirmed live regression, not the once-assumed alias (there is no commit 64acf34). The fix is written but unmerged: BC3 #10947 carries json.memories @bubble_ups (head 9dc63e2e). So the pairwiseSupersetArray: ["memories"] rule is currently waived by a temporary pairwiseDeltaAllowed: ["memories"] entry pointing at spec/api-gaps/memories-emptied-regression.md; remove the waiver (restoring the hard-fail) once #10947 merges and provenance is repinned.

What's in this PR

  • conformance/schema.jsonpairwiseAssertions[] with four rule types: pairwiseSupersetArray, pairwiseSupersetKeys, pairwiseEqual, pairwiseDeltaAllowed (with required reason so a future public compatibility report can index accepted divergences).
  • conformance/tests/live-my-surface.json — the memories SupersetArray rule + a top-level SupersetKeys rule on GetMyNotifications.
  • scripts/compare-canary-runs.sh — applies pairwise rules to a pair of snapshot directories. Exit 0 clean / 1 violation / 2 operator error.
  • Makefilemake conformance-live (one backend's capture + 4 replays) and make check-bc5-compat (BC4 pass + BC5 pass + pairwise). Both opt-in; not in make check.
  • .github/workflows/live-canary.yml — nightly cron + workflow_dispatch, gated on repo secrets (no-ops with a log message when missing rather than failing the run). Uploads snapshots as a 14-day artifact so failures can be inspected post-hoc without rerunning.
  • CONTRIBUTING.md — new "Pairwise BC4↔BC5 comparison", "Orchestrator", and "Scheduled CI" subsections under "Live canary".

Path syntax (dotted identifiers on each snapshot)

  • "" (empty string) addresses the body root.
  • foo.bar defaults to pages[0].body.foo.bar — the common single-page case.
  • pages[N].body.X addresses a specific page.
  • pages[*].body.X aggregates across pages into a list (each page's value becomes one element; useful when you want a per-page count, not a sum of inner lengths).

Test plan

  • jq empty conformance/schema.json — clean JSON
  • jq empty conformance/tests/live-my-surface.json — clean JSON
  • bash -n scripts/compare-canary-runs.sh — shell syntax clean
  • Synthetic-fixture test: pairwiseSupersetArray flags BC5 memories shorter than BC4
  • Synthetic-fixture test: pairwiseSupersetKeys flags BC5 missing a top-level key BC4 has
  • Synthetic-fixture test: pairwiseEqual flags value divergence
  • Synthetic-fixture test: pairwiseDeltaAllowed waives a flagged path
  • Synthetic-fixture test: pages[*].body aggregation collects per-page values
  • make -n check-bc5-compat resolves correctly
  • actionlint .github/workflows/live-canary.yml clean
  • zizmor .github/workflows/live-canary.yml clean
  • make validate-api-gaps clean
  • make check-bucket-flat-parity clean
  • Real-backend run — requires BASECAMP_TOKEN/ACCOUNT_ID/BC5_HOST; CI will be the first opportunity once secrets are configured.

Operational requirement

Pairwise rules assume identical account state across the BC4 and BC5 runs. The same project list, same notifications, etc. Without that, naturally-drifting collections (unreads, etc.) will false-fail rules. CONTRIBUTING.md calls this out explicitly; the canary needs a dedicated test account with stable, equivalent fixtures replicated to each backend before scheduled CI can pass meaningfully.

Sequencing

After this lands, the BC5-readiness canary infrastructure is feature-complete (PRs 1–4). Remaining BC5 readiness work:

  • PR 5scripts/detect-api-gaps.sh + weekly cron workflow (api-gap detection tooling; validation already shipped in PR 1).
  • PR 6 — Go wrapper forward-compat audit + drift check (filed separately; unblocks basecamp-cli Phase 1).
  • PR 7+ — BC3 contract-drift handoff passes (5 passes targeting main).

Summary by cubic

Adds pairwise BC4↔BC5 comparison to the live canary, a check-bc5-compat orchestrator, and a nightly workflow that runs the full pipeline. This flags additive-only regressions (arrays shrinking, keys disappearing, or value drift) between backends.

  • New Features

    • conformance/schema.json: pairwiseAssertions[] with pairwiseSupersetArray, pairwiseSupersetKeys, pairwiseEqual, and pairwiseDeltaAllowed (requires reason).
    • Canary wiring: scripts/compare-canary-runs.sh (exits 0 clean / 1 violation / 2 operator error), Makefile targets conformance-live and check-bc5-compat, and .github/workflows/live-canary.yml (nightly + manual, gated on secrets, uploads 14‑day artifacts).
    • conformance/tests/live-my-surface.json: adds the memories superset rule and a top-level superset-keys check on GetMyNotifications; the memories rule is currently waived via pairwiseDeltaAllowed due to a confirmed BC5 regression, tracked in the API gaps doc.
  • Bug Fixes

    • Hardens scripts/compare-canary-runs.sh: matches the TS snapshot filename scheme exactly, treats missing snapshots as a hard error (exit 2), uses jq deep-equality for pairwiseEqual (key-order agnostic), rejects empty paths, and normalizes jq read failures to exit 2.
    • Adds scripts/test-compare-canary-runs.sh and make check-compare-canary (now part of make check); guards rm -rf in Makefile; workflow defaults BASECAMP_HOST to https://3.basecampapi.com.
    • Clarifies waiver reasons to reference "BC4 (the four branch)" in live-my-surface.json and CONTRIBUTING.md.

Written for commit 45ca130. Summary will update on new commits.

Review in cubic

jeremy added 5 commits May 16, 2026 10:15
Extends `conformance/schema.json` with `pairwiseAssertions[]` and four
rule types: pairwiseSupersetArray, pairwiseSupersetKeys, pairwiseEqual,
pairwiseDeltaAllowed. The first three enforce the additive-only
invariant between BC4 and BC5 snapshots of the same operation; the
fourth is a per-test allowlist with a required `reason` so a future
public compatibility report can index accepted divergences.

Per-backend schema validation alone can't catch this class of
regression: with every new BC5 field marked optional, a hypothetical
case where BC4 emits `memories: [items]` and BC5 emits `memories: []`
passes each backend's schema independently. The pairwise rules close
that loop.

Path syntax: dotted identifiers relative to each snapshot. The empty
string addresses the body root; `foo.bar` defaults to
`pages[0].body.foo.bar` for single-page snapshots; explicit
`pages[N].body.X` or aggregate `pages[*].body.X` are available when
needed.

Adds the canonical canary rule on `GetMyNotifications`: BC3 commit
64acf34 aliases BC5 memories[] to bubble_ups[] so BC4 API consumers
keep receiving the same population on BC5; the pairwiseSupersetArray
rule fires immediately if that alias ever drops. A pairwiseSupersetKeys
rule on the body root catches any other key removal between backends.
Implements `scripts/compare-canary-runs.sh`, the executable side of the
pairwise rules added in the prior commit. Reads BC4 and BC5 snapshot
directories (one wire snapshot per operation, format established by the
TS live-runner), looks up each test's `pairwiseAssertions` in
`live-my-surface.json`, and applies the four rule types in turn.

Path resolution: `""` → body root, `foo.bar` → `pages[0].body.foo.bar`,
`pages[N].body.X` → explicit page, `pages[*].body.X` → aggregate
across pages (collected into a list, so SupersetArray on the aggregate
checks "number of pages emitting this", not a sum of inner lengths;
that's an intentional semantic — multi-page aggregation is the escape
hatch, single-page page-relative addressing is the common case).

Exit codes: 0 clean, 1 violations, 2 operator error. Violation output
groups by operation and includes the rule + display path so the
specific invariant that fired is obvious without re-running.

Identical account state across the two runs is mandatory for pairwise
results to be meaningful; CONTRIBUTING.md documents this in the
forthcoming docs commit.
Adds two opt-in targets that thread the live canary pipeline end-to-end.

`make conformance-live` runs one backend's pass: TS captures canonical
wire snapshots, then Ruby/Python/Go/Kotlin replay-decode them. Requires
LIVE_RECORD_DIR + BASECAMP_BACKEND in the env; not invoked by
`make check`.

`make check-bc5-compat` is the top-level orchestrator: it wipes
LIVE_RECORD_DIR (default `tmp/live-canary`), runs `conformance-live`
once with BASECAMP_BACKEND=bc4, again with BASECAMP_BACKEND=bc5 against
BC5_HOST, then invokes `scripts/compare-canary-runs.sh` on the two
captured snapshot directories. Failure at any stage fails the whole
target.

The TS runner writes to `$LIVE_RECORD_DIR/$BASECAMP_BACKEND/wire/`, so
one LIVE_RECORD_DIR root is reused across both passes and the per-
backend distinction lives in BASECAMP_BACKEND. The plan documents this
explicitly to avoid a separate-dir mental model that would diverge from
what the TS runner actually does.
Adds `.github/workflows/live-canary.yml` — a nightly cron + manual
workflow_dispatch trigger that runs `make check-bc5-compat`. Gated on
repo secrets (BASECAMP_TOKEN, BASECAMP_ACCOUNT_ID, BC5_HOST); the job
no-ops with a clear log message when the secrets aren't configured
rather than failing the run, so forks and clones without canary access
don't break.

Sets up the full polyglot toolchain mirroring the existing Conformance
Tests job — Go, Node, Ruby, Python, Java/Gradle — since the orchestrator
invokes one capture + four replays. Uploads the entire `tmp/live-canary`
tree as a workflow artifact with 14-day retention so failures can be
inspected post-hoc without rerunning a costly live capture.

CONTRIBUTING.md "Live canary" section gains a "Pairwise BC4↔BC5
comparison" subsection (the four rule types, the path syntax, the
canonical `memories` example), an "Orchestrator" subsection (the make
target wiring with the BC4-then-BC5-then-compare flow), and a
"Scheduled CI" subsection (secret requirements + opt-in behavior). The
stale "PR 4 lands the orchestrator" reference removed; the live infra
is now the documented state, not a forward promise.

Identical account state across the two runs is the load-bearing
operational requirement and is called out explicitly in the docs — the
pairwise rules assume the BC4 and BC5 backends see equivalent fixtures;
without that, `unreads`-style natural drift will false-fail rules.
PR 4 is now shipping the orchestrator + pairwise comparison, so the
forward-looking "PR 3 / PR 4" markers in CONTRIBUTING and COORDINATION
no longer describe pending work — they describe what's in this commit's
parent series. Rewrite the references to name the artifacts directly
(`make conformance-*-replay`, `scripts/compare-canary-runs.sh`,
`make check-bc5-compat`) so the docs stay accurate after PR sequencing
shifts.
Copilot AI review requested due to automatic review settings May 16, 2026 17:17
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 16, 2026

Sensitive Change Detection (shadow mode)

This PR modifies control-plane files:

  • .github/workflows/live-canary.yml

Shadow mode — this check is informational only. When activated, changes to these paths will require approval from a maintainer.

@github-actions github-actions Bot added documentation Improvements or additions to documentation github-actions Pull requests that update GitHub Actions conformance Conformance test suite labels May 16, 2026
@github-actions github-actions Bot added enhancement New feature or request and removed documentation Improvements or additions to documentation labels May 16, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds the missing “BC4 ↔ BC5 additive-only invariant” layer to the live canary by introducing pairwise assertions, a snapshot comparison script, an end-to-end Makefile orchestrator, and a nightly GitHub Actions workflow to run the pipeline.

Changes:

  • Extend the conformance test schema + live surface test fixture with pairwiseAssertions rules (superset arrays/keys, equality, and allowlisted deltas).
  • Add scripts/compare-canary-runs.sh to apply pairwise rules across two backend snapshot directories.
  • Add Makefile orchestration targets (conformance-live, check-bc5-compat) and a scheduled workflow that runs the canary and uploads artifacts.

Tip

If you aren't ready for review, convert to a draft PR.
Click "Convert to draft" or run gh pr ready --undo.
Click "Ready for review" or run gh pr ready to reengage.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
scripts/compare-canary-runs.sh New pairwise snapshot comparator that enforces additive-only invariants between BC4 and BC5.
Makefile Adds orchestration targets to run BC4 pass, BC5 pass, then pairwise compare.
conformance/schema.json Adds schema support for pairwiseAssertions rule definitions.
conformance/tests/live-my-surface.json Adds canonical pairwise rules for GetMyNotifications (memories array + top-level keys).
.github/workflows/live-canary.yml Nightly + manual workflow to run make check-bc5-compat and upload snapshots as artifacts.
CONTRIBUTING.md Documents pairwise comparison semantics, orchestrator usage, and scheduled CI behavior.
COORDINATION.md Minor update reflecting that PR4 plan item is now landed here.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/compare-canary-runs.sh Outdated
Comment thread scripts/compare-canary-runs.sh Outdated
Comment thread scripts/compare-canary-runs.sh
Comment thread conformance/schema.json
Comment thread Makefile
Comment thread .github/workflows/live-canary.yml Outdated
Comment thread CONTRIBUTING.md Outdated
Comment thread Makefile Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a663ad9301

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/compare-canary-runs.sh Outdated
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 issues found across 7 files

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic

Comment thread scripts/compare-canary-runs.sh Outdated
Comment thread scripts/compare-canary-runs.sh Outdated
Comment thread .github/workflows/live-canary.yml Outdated
Comment thread Makefile Outdated
Comment thread scripts/compare-canary-runs.sh Outdated
… rule

Post-launch reconciliation for the pairwise BC4<->BC5 canary. Clears the 14
review threads on #308 and corrects the memories rule to match what BC5
master actually shipped.

scripts/compare-canary-runs.sh:
- P0 (codex/cubic): match the TS live runner's snapshot filename scheme
  exactly -- sanitize the test name with [^A-Za-z0-9_-]+ -> "_" (case
  preserved), not lowercase/hyphen/operation-first. The old form matched no
  files, so check-bc5-compat could report clean without comparing anything.
- P1 (cubic): a declared pairwise test missing a snapshot on either backend
  is now a hard error (exit 2), not a silent skip-and-pass.
- empty `paths` loop (copilot): guard ${#RULE_PATHS[@]} instead of
  "${RULE_PATHS[@]:-}" so an empty array can't run against the body root.
- pairwiseEqual (cubic): compare via jq deep-equality so object key-order
  differences don't false-fail.
- exit-code contract (copilot): snapshot-read jq failures remap to the
  documented exit 2 instead of leaking jq's status under set -e.

conformance/schema.json: pairwiseAssertions[].paths gains minItems:1.

Makefile:
- rm -rf "$(LIVE_RECORD_DIR)" guarded against empty / "/" / "..".
- conformance-live validates LIVE_RECORD_DIR/BASECAMP_BACKEND before the TS
  live pass (invoked from the recipe after the guards) rather than after it.
- check-bc5-compat comment: "reports all violations", not "fails on first".
- new check-compare-canary target, wired into `make check`.

.github/workflows/live-canary.yml: default BASECAMP_HOST in the workflow
expression so an unset optional secret can't clobber the Makefile default.

CONTRIBUTING.md: align the "all violations" wording; reframe the memories
example to the waiver pattern.

memories canary rule (conformance/tests/live-my-surface.json): the prior
"BC3 commit 64acf34 aliases memories[] to bubble_ups[]" claim is
counterfactual -- that commit does not exist, and BC5 master shipped
`json.memories []` while BC4 four still populates it. The
pairwiseSupersetArray rule now states the intended invariant and is waived by
a temporary pairwiseDeltaAllowed entry pointing at
spec/api-gaps/memories-emptied-regression.md (added on the reconciliation
branch); remove the waiver once BC3 #10947 (which carries
`json.memories @bubble_ups`) merges.

scripts/test-compare-canary-runs.sh: network-free regression coverage for the
filename scheme (incl. proof the old form finds nothing), missing-snapshot
hard-fail, memories-waiver scoping, pairwiseEqual key-order, and the
empty-paths guard.
@jeremy jeremy requested a review from Copilot May 27, 2026 23:14
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 27, 2026
@github-actions github-actions Bot removed the documentation Improvements or additions to documentation label May 27, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread conformance/tests/live-my-surface.json Outdated
Comment thread CONTRIBUTING.md Outdated
The waiver `reason` strings read "BC4 four still populates it", which scans as
a typo to anyone without the post-launch topology in hand (BC4 is the `four`
branch). Reword to "BC4 (the four branch)" in the live-my-surface.json
description + pairwiseDeltaAllowed reason and the CONTRIBUTING.md example.
Addresses the Copilot re-review on PR #308.
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 7 files (changes from recent commits).

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread CONTRIBUTING.md
Comment thread CONTRIBUTING.md Outdated
Comment thread conformance/tests/live-my-surface.json Outdated
@jeremy jeremy requested a review from Copilot May 27, 2026 23:20
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 27, 2026
@github-actions github-actions Bot removed the documentation Improvements or additions to documentation label May 27, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

jeremy added a commit that referenced this pull request May 28, 2026
…s branch

The brief's SDK-absorption runbook read as if the pairwiseDeltaAllowed waiver
and conformance/tests/live-my-surface.json were present here, but those canary
files are on PR #308's branch (this branch is the registry record only).
Reword to state #308 introduces the waiver and that it gets removed from
live-my-surface.json once #308 and BC3 #10947 both land. Also reframe the
"no Smithy change" bullet to "no structural change" — this PR does realign the
memories doc comment (documentation, not a shape change).
jeremy added a commit that referenced this pull request May 28, 2026
COORDINATION.md said the pairwiseSupersetArray rule "is waived via
pairwiseDeltaAllowed", which reads as if the canary surface exists on this
branch — it does not (conformance/tests/live-my-surface.json is on PR #308).
Reword to attribute the rule + waiver to #308's branch, matching the same fix
already made in the memories-emptied-regression brief.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conformance Conformance test suite enhancement New feature or request github-actions Pull requests that update GitHub Actions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants