Skip to content

fix(cd): repair file change detection and switch CWS to draft-only upload#75

Merged
heznpc merged 1 commit intomainfrom
fix/cws-deploy-reliability
Apr 10, 2026
Merged

fix(cd): repair file change detection and switch CWS to draft-only upload#75
heznpc merged 1 commit intomainfrom
fix/cws-deploy-reliability

Conversation

@heznpc
Copy link
Copy Markdown
Owner

@heznpc heznpc commented Apr 10, 2026

Summary

Fixes two independent CD reliability issues that have caused chronic CD — Chrome Web Store Deploy failures since at least 2026-04-03 (15+ consecutive failed runs as of v3.5.4). PR #71's graceful skip is unrelated and works correctly — the failures here happen after the secrets check passes.

Layer 1 — fetch-depth fix (cd.yml AND cd-firefox.yml)

The "Check if deploy-relevant files changed" step uses:

CHANGED=$(git diff --name-only HEAD~1 HEAD 2>/dev/null || echo "manifest.json")

But actions/checkout@v4 defaults to fetch-depth: 1, so HEAD~1 is never fetched. git diff fails silently, the fallback fires, CHANGED is always set to "manifest.json", and the regex always matches. Result: should_deploy is always true regardless of what files actually changed.

This means CWS deploy attempts have been firing on every CI success on main, including PRs that only touched .github/, docs/, README.md, .claude/, or .gitignore. None of those should trigger a CWS deploy.

Fix: set fetch-depth: 0 so HEAD~1 actually exists. Same fix in cd-firefox.yml even though Firefox AMO is currently in graceful-skip mode (no secrets configured) — keeps the two workflows symmetric and correct for when AMO is eventually enabled.

Impact analysis — this session's PRs:

PR Touched Pre-fix CD CWS triggers? Post-fix?
#70 README/docs sync README.md, docs/ ✅ (false trigger)
#71 cd workflow skip .github/workflows/ ✅ (false trigger)
#72 store-update skill .claude/, store-assets/, .gitignore ✅ (false trigger)
#73 docs.yml permissions .github/workflows/ ✅ (false trigger)
#74 unicode labels src/lib/constants.js, docs/ ✅ (correct) ✅ (correct)

4 of 5 false triggers eliminated.

Layer 2 — switch CWS upload to draft-only (cd.yml only)

Even when the deploy correctly fires for a real release, the upload returns:

HTTPError: Response code 400 (Bad Request)
##[error]upload error - You will need to go to the Chrome Web Store
Developer Dashboard and upload it manually.

This 400 from CWS is notoriously unhelpful, but the most common causes all relate to publish state conflicts (existing version, review pending, listing not in publishable state). The maintainer is already publishing manually via the dashboard (per project memory: "215 installs, local code ahead of published version"), so auto-publish was never load-bearing.

Fix: change publish: truepublish: false. CD will continue to upload the zip on real releases, but it will land as a draft version on CWS, ready for the maintainer to review and publish from the dashboard. The upload itself is generally allowed even when publish would be blocked, so this should sidestep the 400 entirely.

If the 400 was actually from the upload step (not publish), a separate dashboard investigation will be needed — but Layer 1 alone already removes most of the noise.

Out of scope

  • The actual CWS dashboard state (review queue, OAuth refresh token validity, current published version) needs maintainer action and is not something this PR can address from CI alone.
  • node-version: 20 in cd-firefox.yml setup-node — left for the future repo-wide actions version sweep.

Test plan

  • CI green (test + validate)
  • After merge, this PR's own merge commit only touches .github/workflows/ so it should NOT trigger CD CWS at all → that itself validates Layer 1 in production
  • Next time someone changes src/, _locales/, assets/icons/, or manifest.json, CD CWS will run, attempt upload as draft, and succeed (validates Layer 2)
  • If draft upload still 400s, maintainer needs to check CWS Dashboard for review state

🤖 Generated with Claude Code

…load

Fixes two independent CD reliability issues that have been causing chronic
"CD — Chrome Web Store Deploy" failures since at least 2026-04-03 (15+
consecutive failed runs as of v3.5.4).

# Layer 1 — fetch-depth fix (cd.yml AND cd-firefox.yml)

The "Check if deploy-relevant files changed" step uses:

  CHANGED=$(git diff --name-only HEAD~1 HEAD 2>/dev/null || echo "manifest.json")

But actions/checkout@v4 defaults to fetch-depth: 1, so HEAD~1 is never
fetched. git diff fails silently, the fallback fires, CHANGED is always
set to "manifest.json", and the regex always matches. Result: should_deploy
is ALWAYS true regardless of what files actually changed.

This means CWS deploy attempts have been firing on every CI success on
main, including PRs that only touched .github/, docs/, README, .claude/,
or .gitignore. None of those should trigger a CWS deploy.

Fix: set fetch-depth: 0 in checkout so HEAD~1 actually exists. Same fix
in cd-firefox.yml even though Firefox AMO is currently in graceful-skip
mode (no secrets configured) — keeps the two workflows symmetric and
correct for when AMO is eventually enabled.

Impact: with this fix, only PRs that actually change manifest.json, src/,
_locales/, or assets/icons/ will trigger CWS deploy. Reduces failure
noise dramatically.

# Layer 2 — switch CWS upload to draft-only (cd.yml only)

Even when the deploy correctly fires for a real release, the upload step
returns:

  HTTPError: Response code 400 (Bad Request)
  upload error - You will need to go to the Chrome Web Store Developer
  Dashboard and upload it manually.

This 400 from CWS is notoriously unhelpful, but the most common causes
all relate to publish state conflicts (existing version, review pending,
listing not in publishable state). The maintainer is already publishing
manually via the dashboard (per project memory: "215 installs, local code
ahead of published version"), so auto-publish was never load-bearing.

Fix: change publish: true -> publish: false. CD will continue to upload
the zip on real releases, but it will land as a draft version on CWS,
ready for the maintainer to review and publish from the dashboard. The
upload itself is generally allowed even when publish would be blocked,
so this should sidestep the 400 entirely.

If the 400 was actually from the upload step (not publish), a separate
dashboard investigation will be needed — but Layer 1 alone already
removes most of the noise, so the only remaining failure surface is at
real release time.

Out of scope:
- The actual CWS dashboard state (review queue, OAuth refresh token
  validity, current published version) needs maintainer action.
- node-version: 20 in cd-firefox.yml setup-node — left for the future
  repo-wide actions version sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@heznpc heznpc merged commit 3f052fc into main Apr 10, 2026
2 checks passed
@heznpc heznpc deleted the fix/cws-deploy-reliability branch April 10, 2026 11:56
heznpc added a commit that referenced this pull request Apr 10, 2026
Six small simplifications surfaced by a code-quality review pass over
the workflow changes from #71 / #73 / #75. Each fix is independent and
small; bundled here because they all touch the same three workflow files
and have the same trivial risk profile (lint + tests still 309/309 green).

cd.yml and cd-firefox.yml:

1. Reorder steps so the file-change check runs BEFORE the secrets check.
   The change check is near-free (one git diff); the secrets check now
   only fires when there is actually something to deploy. This means
   the misleading "secrets not configured" notice no longer fires on
   every unrelated push to main — only at real release time.

2. With the reorder above, every downstream step's `if:` collapses from
     steps.secrets.outputs.configured == 'true' && steps.changes.outputs.should_deploy == 'true'
   to just
     steps.secrets.outputs.configured == 'true'
   because secrets is now itself gated on should_deploy. Six condition
   chains shortened per workflow.

3. fetch-depth: 0 -> fetch-depth: 2. The change-detection step only
   needs HEAD and HEAD~1; cloning the entire repo history every CD run
   was wasteful (PR #75 set it to 0 to fix a different bug; 2 is the
   minimum sufficient value).

4. Drop the dead `2>/dev/null || echo "manifest.json"` fallback on the
   git-diff line. With fetch-depth: 2, git diff always succeeds on this
   long-history repo, so the fallback branch is unreachable. Keeping it
   would silently let the script continue with bogus data on a future
   regression — fail-loud is safer.

5. Tighten a half-stale comment: "Only deploy when CI succeeded and
   relevant files changed" -> "Only run when the CI workflow succeeded".
   The "relevant files changed" part moved into a step in #71, so the
   comment was already half-lying.

cd-firefox.yml only:

6. node-version: 20 -> 22. The recent CI upgrade (5862443) and #73's
   docs.yml fix both standardized on Node 22; cd-firefox.yml was the
   only laggard.

docs.yml:

7. Wrap `git push` inside the same `||` brace block as `git commit` so
   it only runs when there is actually a new commit. Previously the push
   ran unconditionally and just printed "Everything up-to-date" on every
   no-op idempotent run, costing one network roundtrip per docs trigger.

Skipped findings (intentional):
- Composite-action / reusable-workflow extraction for the deploy-paths
  detection. With only two callers, the DRY benefit doesn't outweigh
  the extra file + indirection.
- Migration to `dorny/paths-filter@v3`. Cleaner than the hand-rolled
  git-diff but adds a third-party dependency and changes semantics on
  workflow_run events.
- Switching the CD trigger from `workflow_run: ["CI"]` to `push: paths:
  [...]`. Eliminates job startup cost on irrelevant pushes but loses
  the "only deploy after CI passes" guarantee.
- Native `if: ${{ secrets.X != '' }}` on the job. Loses the informative
  ::notice:: message that PR #71 added.

All four are tracked as future considerations rather than this PR.

Verification:
- npm run lint   — clean
- npm test       — 309/309 passing
- Mental dry-run on a no-op push (only .github/workflows/ touched):
  checkout -> changes-check sets should_deploy=false -> secrets-check
  gated and skipped -> all downstream gated and skipped -> exit 0.
- Mental dry-run on a real release (manifest.json bump):
  checkout -> changes-check matches -> secrets-check fires ->
  configured=true -> all downstream run -> deploy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant