From 16170674dc2fd18674a97c201c025d8b397fda79 Mon Sep 17 00:00:00 2001
From: Mike Stankavich
Date: Sat, 16 May 2026 10:12:56 -0500
Subject: [PATCH] =?UTF-8?q?fix(bb):=20TRA-743=20=E2=80=94=20check-deploy-l?=
 =?UTF-8?q?ag.sh=20compares=20against=20`preview`,=20not=20`main`?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two bugs in the script as shipped in PR #158:

1. **Wrong branch.** Cloudflare Pages deploys `docs.preview.trakrf.id`
   from the `preview` branch (force-pushed by
   `.github/workflows/sync-preview.yml` = main + all open non-draft PRs).
   The script was comparing the deployed docs commit against
   `origin/main`, which is the merge target, not the ref preview actually
   serves. With work in flight on PR branches synced to `preview`, the
   deployed commit normally differs from `origin/main`, so the comparison
   reflected the wrong ref and could not tell whether preview had caught
   up with newer PR work.

2. **Required a local docs checkout.** Used `git fetch origin main` +
   `git rev-parse origin/main`. `bb_cycle` copies `tests/blackbox/` to
   `/tmp/bb-N/` for the BB session, and `/tmp` has no git context, so the
   script would exit 2 ("could not resolve origin/main commit") if ever
   run from there. The retired `check-spec-sync.sh` queried the GitHub
   public API for the branch tip instead; that pattern is
   location-independent, and `trakrf/docs` is public, so no auth is
   needed.

Switched to the GitHub-API-based check from the retired predecessor.
Kept the graceful WARN-and-skip fallback for GitHub rate limits /
network blips (exit 0 with a warning rather than fail-closed: the
script's job is to fail fast on real deploy lag, not on its own
observability gap).

Verified against live preview: `4d213d3` (deployed) matches
`preview@4d213d3` (GitHub branch tip).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 tests/blackbox/check-deploy-lag.sh | 78 ++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 25 deletions(-)

diff --git a/tests/blackbox/check-deploy-lag.sh b/tests/blackbox/check-deploy-lag.sh
index 05219eb..bfaa468 100755
--- a/tests/blackbox/check-deploy-lag.sh
+++ b/tests/blackbox/check-deploy-lag.sh
@@ -2,25 +2,37 @@
 # Verify the preview docs deploy has caught up to the published tip.
 #
 # Why this exists: BB cycles run against the integrator-visible preview
-# environment. If `main` was just pushed but Cloudflare Pages is still
+# environment. If `preview` was just pushed but Cloudflare Pages is still
 # building the new deploy, BB would exercise the previous deploy and any
 # fix shipped in the latest commit would appear absent. This check fails
 # fast (with a transient exit code) so the wrapper in justfile can retry
 # until the deploy catches up.
 #
-# Pre-mirror-retirement (TRA-743) this script did a three-way diff across
-# platform-source, docs-mirror, and app-live spec bodies. With the mirror
-# gone the docs origin serves no spec body of its own — `/api/openapi.yaml`
-# 302s to the platform. The only docs-side lag question that remains is
-# "is the preview docs deploy on the current published commit?", which is
-# what this version checks. Platform-side `spec_refreshed_at` lag is
-# observable independently at `$API_TEST_APP_URL/health.json` and is not
-# part of this preflight.
+# Pre-mirror-retirement (TRA-743) this script's predecessor did a three-way
+# diff across platform-source, docs-mirror, and app-live spec bodies. With
+# the mirror gone the docs origin serves no spec body of its own —
+# `/api/openapi.yaml` 302s to the platform. The only docs-side lag question
+# that remains is "is the preview docs deploy on the current `preview`
+# branch tip?", which is what this script checks. Platform-side
+# `spec_refreshed_at` lag is observable independently at
+# `$API_TEST_APP_URL/health.json` and is not part of this preflight.
+#
+# Why query GitHub API for the branch tip (rather than local `git
+# rev-parse origin/preview`): the script gets copied into `/tmp/bb-N/`
+# alongside the rest of `tests/blackbox/` for the BB session itself, where
+# there is no git context. Querying the public GitHub API works regardless
+# of caller location, and `trakrf/docs` is a public repo so no auth needed.
+#
+# Why `preview` not `main`: Cloudflare Pages deploys `docs.preview.trakrf.id`
+# from the `preview` branch (force-pushed by .github/workflows/sync-preview.yml
+# = main + all open non-draft PRs). `main` is the merge target;
+# `docs.trakrf.id` (production) tracks it. For preview lag, the right ref
+# is `preview`.
 #
 # Exit codes:
 #   0  preview docs deploy is current
 #   2  required env var missing (API_TEST_DOCS_URL)
-#   3  could not fetch /health.json
+#   3  could not fetch /health.json or GitHub API
 #   4  preview deploy is behind expected commit (transient — retry)
 
 set -euo pipefail
@@ -30,18 +42,27 @@ if [ -z "$DOCS" ]; then
   echo "ERROR: API_TEST_DOCS_URL not set (load tests/blackbox/.env.local)" >&2
   exit 2
 fi
+DOCS="${DOCS%/}"
 
-# Expected commit: tip of the docs branch that feeds preview. Cloudflare
-# Pages publishes `main` to the preview environment, so the expected SHA
-# is the remote `origin/main` HEAD.
-git fetch --quiet origin main 2>/dev/null || true
-expected="$(git rev-parse --short=7 origin/main 2>/dev/null || echo "")"
-if [ -z "$expected" ]; then
-  echo "ERROR: could not resolve origin/main commit. Is this a checkout of trakrf/docs?" >&2
-  exit 2
+DOCS_REPO="${TRAKRF_DOCS_REPO:-trakrf/docs}"
+DOCS_BRANCH="${TRAKRF_DOCS_BRANCH:-preview}"
+
+fetch() {
+  curl -fsSL --max-time 15 "$1"
+}
+
+# Branch tip: full SHA from the GitHub public API.
+preview_tip="$(fetch "https://api.github.com/repos/${DOCS_REPO}/branches/${DOCS_BRANCH}" 2>/dev/null \
+  | jq -r '.commit.sha // empty' 2>/dev/null)" || preview_tip=""
+
+if [ -z "$preview_tip" ]; then
+  echo "WARN: could not query GitHub API for ${DOCS_REPO}@${DOCS_BRANCH} tip;" >&2
+  echo " skipping preview-deploy-lag check (rate limit or network)." >&2
+  exit 0
 fi
 
-health_json="$(curl -fsS "$DOCS/health.json" 2>/dev/null)" || {
+# Deployed commit: short SHA from the docs site's own /health.json.
+health_json="$(fetch "$DOCS/health.json")" || {
   echo "FAIL: could not fetch $DOCS/health.json" >&2
   exit 3
 }
@@ -53,9 +74,16 @@ if [ -z "$deployed" ]; then
   exit 3
 fi
 
-if [ "$deployed" != "$expected" ]; then
-  echo "Preview docs on $deployed; origin/main is $expected. Deploy still catching up." >&2
-  exit 4
-fi
-
-echo "OK: preview docs on $deployed (matches origin/main)."
+# health.json carries the short (7-char) SHA; GitHub API returns the full
+# SHA. Compare as prefix.
+case "$preview_tip" in
+  "$deployed"*)
+    echo "OK: preview docs on ${deployed} (matches ${DOCS_BRANCH}@${preview_tip:0:7})."
+    ;;
+  *)
+    echo "Preview deploy still catching up:" >&2
+    echo " ${DOCS_BRANCH} branch tip: ${preview_tip}" >&2
+    echo " deployed docs.commit: ${deployed}" >&2
+    exit 4
+    ;;
+esac
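
The script header says the wrapper in the justfile retries on the transient exit code, but that wrapper is not part of this patch. A minimal sketch of how such a retry loop could look in bash follows, assuming exit 4 is the only retryable status. The function name `retry_on_lag` and its `max_attempts` / `delay` parameters are hypothetical and are not taken from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this patch: re-run a command while it
# exits 4 (deploy still catching up). Any other status is returned as-is.
#
# Usage: retry_on_lag MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_on_lag() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1 status
  while :; do
    status=0
    # Capture the status without tripping `set -e` in the caller.
    "$@" || status=$?
    # 0 (current), 2 (config), 3 (fetch failure): stop immediately.
    if [ "$status" -ne 4 ]; then
      return "$status"
    fi
    # Still lagging after the final attempt: report the lag to the caller.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 4
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

A caller might run `retry_on_lag 10 30 tests/blackbox/check-deploy-lag.sh`. The attempt count and delay here are placeholders, not values from the justfile. Passing every status other than 4 straight through keeps configuration and fetch errors (exit 2 and 3) from being masked by retries.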