ci: fix recurring Verify Examples workflow failures #745

Merged
bnusunny merged 1 commit into aws:main from JamBalaya56562:fix-verify-examples-ci on Jun 1, 2026

@JamBalaya56562
Contributor

Summary

The Verify Examples workflow (.github/workflows/examples.yaml) has been failing on essentially every run (push and Dependabot PRs alike). Analysis across many recent runs showed the failures are not a single flake but three independent root causes, addressed here:

1. validate job — EOL runtime lint failure (deterministic)

examples/nextjs-zip and examples/remix-zip pinned Runtime: nodejs20.x, which reached end-of-life on 2026-04-30. sam validate --lint therefore matched cfn-lint rule W2531 and failed on every run.

  • Bumped both templates to nodejs22.x.
  • EOL deprecations are intentionally left to fail CI so example runtimes get updated promptly as they age out (no lint suppression added).
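Because the EOL check is deliberately left to fail CI, a contributor may want to spot affected templates before pushing. The following is a minimal sketch of such a local pre-check, not how cfn-lint implements W2531. The function name `find_eol_runtimes` and the hand-maintained `EOL_RUNTIMES` set are hypothetical; only `nodejs20.x` comes from this PR.

```python
import re

# Assumption: a hand-maintained set of runtimes treated as end-of-life.
# Only nodejs20.x is taken from this PR; extend it as runtimes age out.
EOL_RUNTIMES = {"nodejs20.x"}

# Matches a YAML line such as "      Runtime: nodejs20.x" (optionally quoted,
# optionally followed by a comment).
RUNTIME_RE = re.compile(r"^\s*Runtime:\s*['\"]?([A-Za-z0-9_.]+)['\"]?\s*(?:#.*)?$")


def find_eol_runtimes(template_text, eol=EOL_RUNTIMES):
    """Return (line_number, runtime) pairs for every EOL runtime in the text."""
    hits = []
    for lineno, line in enumerate(template_text.splitlines(), start=1):
        match = RUNTIME_RE.match(line)
        if match and match.group(1) in eol:
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "Properties:\n  Runtime: nodejs20.x\n"
    for lineno, runtime in find_eol_runtimes(sample):
        print(f"line {lineno}: {runtime} is end-of-life")
```

This is a plain text scan, so it ignores templates that set the runtime through `Globals` indirection or parameters; `sam validate --lint` remains the authoritative check.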

2. test-image(fasthtml) / test-zip(fasthtml-zip): NameError: Card (deterministic)

examples/fasthtml/app/requirements.txt and examples/fasthtml-zip/app/requirements.txt left python-fasthtml unpinned. Releases >=0.14.0 dropped Card from fasthtml.common, so the app raised NameError: name 'Card' is not defined and returned HTTP 500, failing the verify step.

  • Pinned to python-fasthtml==0.13.4 — the latest release that still exports Card and installs cleanly on python:3.12-slim.
  • Verified end-to-end: the real app's index route returns HTTP 200 with 0.13.4.
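The root cause here was an unpinned dependency, so a small guard that lists requirements without an exact `==` pin could catch the same class of breakage earlier. The sketch below is illustrative only and is not part of this PR; `unpinned_requirements` is a hypothetical helper, and it handles only simple one-requirement-per-line files.

```python
def unpinned_requirements(requirements_text):
    """Return requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    before = "python-fasthtml\n"
    after = "python-fasthtml==0.13.4\n"
    print(unpinned_requirements(before))
    print(unpinned_requirements(after))
```

Run against the pre-fix file contents, the helper reports `python-fasthtml`; with the `==0.13.4` pin from this PR it reports nothing.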

3. test-image / test-zip build steps — toomanyrequests: Rate exceeded (intermittent)

sam build pulls base images from public.ecr.aws, which throttles unauthenticated pulls at ~1 req/s per source IP. The matrix runs many jobs from shared runner IPs, so concurrent pulls burst past the limit and are rejected with HTTP 429 (toomanyrequests: Rate exceeded).

  • Wrapped both sam build invocations in a retry loop with exponential backoff and random jitter (~10/20/40/80s + 0–15s), so retries across jobs desynchronize and avoid a thundering herd re-colliding on the same boundary.
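The backoff arithmetic described above can be sketched as follows. This is an illustration in Python, not the workflow's actual retry code. The names `backoff_delays` and `run_with_retry` are hypothetical, and the default of four retries is inferred from the ~10/20/40/80s figures (the commit message mentions a 3-attempt loop, so the count is left as a parameter).

```python
import random
import subprocess
import time


def backoff_delays(retries=4, base=10.0, max_jitter=15.0, rng=random.random):
    """Yield base * 2**i seconds plus uniform jitter in [0, max_jitter)."""
    for i in range(retries):
        yield base * (2 ** i) + rng() * max_jitter


def run_with_retry(cmd, retries=4, base=10.0, max_jitter=15.0,
                   rng=random.random, sleep=time.sleep, run=subprocess.run):
    """Run cmd, retrying on a non-zero exit code with backoff and jitter.

    Returns the result of the last attempt; the caller checks returncode.
    """
    result = run(cmd)
    for delay in backoff_delays(retries, base, max_jitter, rng):
        if result.returncode == 0:
            break
        # Jitter spreads retries from parallel jobs so they do not
        # all hit the registry again at the same moment.
        sleep(delay)
        result = run(cmd)
    return result


if __name__ == "__main__":
    # With jitter disabled, the delays reduce to the plain exponential series.
    print(list(backoff_delays(rng=lambda: 0.0)))  # [10.0, 20.0, 40.0, 80.0]
```

A call such as `run_with_retry(["sam", "build"])` would make up to five attempts with these defaults. The `sleep`, `run`, and `rng` parameters exist only so the logic can be exercised without real waits or a real build.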

Verification

  • python-fasthtml==0.13.4: clean install on python:3.12-slim; the actual fasthtml app returns HTTP 200 (Card renders as <article>).
  • nodejs22.x: no other example template currently triggers W2531, so validate goes green.
  • Backoff/jitter arithmetic validated locally.

🤖 Generated with Claude Code

The "Verify Examples" workflow had been failing on every run. Log
analysis across many runs surfaced three independent root causes:

1. validate job: nextjs-zip and remix-zip pinned the EOL runtime
   nodejs20.x (deprecated 2026-04-30), so `sam validate --lint` matched
   cfn-lint rule W2531 and failed every run. Bumped both to nodejs22.x.
   EOL deprecations are intentionally left to fail CI so example runtimes
   get updated promptly when they age out.

2. test-image(fasthtml) / test-zip(fasthtml-zip): requirements.txt left
   python-fasthtml unpinned. Newer releases (>=0.14.0) dropped `Card` from
   fasthtml.common, so the app raised `NameError: name 'Card' is not
   defined` and returned HTTP 500. Pinned to ==0.13.4 (latest release that
   still exports Card and installs cleanly on python:3.12-slim). Verified
   end-to-end: the real app's index route now returns HTTP 200.

3. test-image/test-zip build steps: intermittent `toomanyrequests: Rate
   exceeded` while sam build pulled base images from public.ecr.aws.
   Wrapped both `sam build` invocations in a 3-attempt retry loop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@JamBalaya56562
Contributor Author

✅ Verified on a fork via workflow_dispatch

I ran the patched Verify Examples workflow on my fork (same ubuntu-24.04 runners) to confirm the fix end-to-end. All 18 jobs passed — including the three that had been failing on every run on main:

| Job | Before | After |
| --- | --- | --- |
| validate | ❌ every run (W2531: nodejs20.x EOL on nextjs-zip/remix-zip) | ✅ pass (nodejs22.x) |
| test-image (fasthtml) | ❌ every run (NameError: name 'Card' is not defined → HTTP 500) | ✅ pass (python-fasthtml==0.13.4) |
| test-zip (fasthtml-zip) | ❌ every run (same Card error) | ✅ pass |
| build-layer + remaining 14 matrix jobs | | ✅ all pass |

Run: https://github.com/JamBalaya56562/aws-lambda-web-adapter/actions/runs/26710902404 (conclusion: success)

Notes:

  • python-fasthtml==0.13.4 is the latest release that still exports Card from fasthtml.common and installs cleanly on python:3.12-slim; verified locally that the example app's index route returns HTTP 200.
  • No lint suppression was added — EOL runtimes are intentionally left to fail CI so example runtimes get bumped promptly when they age out.
  • The sam build retry uses exponential backoff + jitter so the matrix jobs (which share runner egress IPs) don't re-collide on public.ecr.aws's ~1 req/s unauthenticated pull limit.

@bnusunny
Contributor

bnusunny commented Jun 1, 2026

Thanks for the help!

@bnusunny bnusunny merged commit 50a7c10 into aws:main Jun 1, 2026
21 checks passed
@JamBalaya56562 JamBalaya56562 deleted the fix-verify-examples-ci branch June 1, 2026 23:36