Skip to content

ci: skip docker image build when unused, harden e2e changes checkout#59780

Merged
arnohillen merged 9 commits into
masterfrom
ahillen/ci-skip-docker-and-harden-e2e-changes-checkout
May 28, 2026
Merged

ci: skip docker image build when unused, harden e2e changes checkout#59780
arnohillen merged 9 commits into
masterfrom
ahillen/ci-skip-docker-and-harden-e2e-changes-checkout

Conversation

@arnohillen
Copy link
Copy Markdown
Contributor

Summary

Two unrelated CI fixes I noticed while debugging slow CI on a 1-line test PR (#59779). Both touch only .github/workflows/, both are easy to revert independently.

1. Skip Build Docker image when no consumer waits for it

container-images-ci.yml's posthog_build runs ~15–25 min on depot-ubuntu-latest-4 and is gated by:

docker_files:
  - '**'
  - '!rust/**'
  - '!livestream/**'

i.e. anything except rust/livestream — basically every PR. But the only consumer of the resulting image is ci-hobby.yml's wait-for-docker-image job (it polls this workflow by check name "Build Docker image" via fountainhead/action-wait-for-check). When ci-hobby would skip — the common case for FE/BE PRs — the build is pure waste.

This PR mirrors ci-hobby.yml's hobby: paths-filter directly into container-images-ci.yml and gates posthog_build on the union of:

  • hobby: (matches ci-hobby.yml's hobby: filter — must stay in sync)
  • build_self: (changes to the build workflow / build-n-cache-image action)
  • hobby-preview PR label
  • merge_group (preserved — image still built before merge)

Also adds labeled/unlabeled to the pull_request trigger so adding hobby-preview retroactively kicks off a build (matches ci-hobby.yml's trigger types — without this, the label would arm hobby but never trigger the image build that hobby waits for).

I confirmed the only in-repo consumer is hobby — ci-e2e-playwright.yml and the other "Wait for Docker services" steps use bin/ci-wait-for-docker launch (docker-compose for ephemeral services), not the posthog_build image. Hobby already handles BUILD_CONCLUSION = "skipped" gracefully (ci-hobby.yml line 291–293, exit 0).

Expected impact: ~15–25 min off most FE / BE / docs / test-only PRs.

2. Harden Determine need to run E2E checks against the v6.0.2 promisor-fetch flake

The changes job in ci-e2e-playwright.yml intermittently fails its actions/checkout@v6.0.2 step:

[command]/usr/bin/git checkout --progress --force refs/remotes/pull/59779/merge
##[error]fatal: could not read Username for 'https://github.com': terminal prompts disabled
##[error]fatal: could not fetch <SHA> from promisor remote
##[error]The process '/usr/bin/git' failed with exit code 128

Root cause: filter: blob:none makes the post-fetch git checkout lazy-fetch every blob in HEAD from the promisor remote, and v6.0.2's per-blob credential helper lookup occasionally races with the includeIf-scoped auth file write. Result: any PR can randomly hit a hard BLOCKED merge state until the job is retried (this exact failure on #59779 earlier today).

The job only reads .github/clickhouse-versions.json plus a few git ref operations. Adding sparse-checkout: .github/clickhouse-versions.json collapses the lazy fetch to a single blob — no longer enough surface for the credential issue to manifest, with no impact on the existing fast path.

This is a targeted, surgical fix for the one failing checkout. The same pattern exists in ci-backend.yml and ci-dagster.yml's changes jobs and may want similar treatment if they ever flake — leaving that as a follow-up to keep this PR small.

Test plan

  • CI green on this PR. The Build Docker image job should appear as skipped on this PR (workflow-only change, no hobby files touched, no hobby-preview label) — that's the whole point.
  • Once merged, on the next FE-only PR (e.g. anything touching frontend/), confirm Build Docker image skips.
  • Add the hobby-preview label to a test PR and confirm Build Docker image re-runs.
  • Push a Dockerfile change in a follow-up PR and confirm Build Docker image runs.

Out of scope (for follow-up PRs)

  • Jest test selection (--findRelatedTests) for ci-frontend.yml — biggest remaining FE win.
  • Flipping shadow-story-selection from advisory to authoritative on ci-storybook.yml.
  • Tightening frontend_code filter to exclude **/*.test.{ts,tsx} so test-only edits don't trigger bundle-size / typecheck / VR.
  • semgrep-* per-language gating (semgrep-js is 3m+ on every PR regardless of touched languages).

Made with Cursor

The "Build Docker image" job in container-images-ci.yml was running on every
non-rust/livestream PR (~15-25 min on depot-ubuntu-latest-4) but is only
consumed by ci-hobby.yml's `wait-for-docker-image`. When ci-hobby would
skip, the build is wasted work — most FE/BE PRs never trigger it. Restrict
the build to the same trigger conditions ci-hobby uses (mirror of its
`hobby:` filter + `hobby-preview` label), plus changes to the build workflow
itself so workflow edits are still exercised pre-merge. Add `labeled`/
`unlabeled` to the workflow trigger so adding `hobby-preview` retroactively
still kicks off a build (matches ci-hobby.yml's trigger types).

Separately, the `Determine need to run E2E checks` job intermittently fails
its actions/checkout@v6.0.2 step with `fatal: could not read Username for
'https://github.com'` from the post-fetch `git checkout` (e.g. PR #59779
on 2026-05-23, which blocked merge until the job was retried). The cause is
v6.0.2 lazy-fetching every blob in HEAD from the promisor remote when
`filter: blob:none` is set, and the per-blob credential helper lookup
occasionally racing with the includeIf-scoped auth file write. The job only
reads `.github/clickhouse-versions.json` plus a few git ref operations, so a
sparse-checkout pinned to that one file collapses the lazy fetch to a
single blob — small enough to no longer trip the credential issue, with no
impact on the existing fast path.
@assign-reviewers-posthog assign-reviewers-posthog Bot requested a review from a team May 23, 2026 13:13
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 23, 2026

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
.github/workflows/container-images-ci.yml:5-8
The inline comment claims the two workflows "use the same trigger types" and "stay in lock-step", but `ci-hobby.yml` uses `[opened, synchronize, labeled, unlabeled]` — it does not include `reopened`. As a result, re-opening a PR that touches hobby files will trigger an image build here while ci-hobby never wakes up to consume it, producing a ~15–25 min build that the PR description explicitly aims to eliminate.

```suggestion
        # `labeled`/`unlabeled` lets `hobby-preview` retroactively trigger
        # an image build. Types are kept in sync with ci-hobby.yml; note that
        # ci-hobby.yml does not include `reopened`, so neither does this list.
        types: [opened, synchronize, labeled, unlabeled]
```

Reviews (1): Last reviewed commit: "ci: skip docker image build when unused,..." | Re-trigger Greptile

Comment thread .github/workflows/container-images-ci.yml Outdated
@arnohillen arnohillen enabled auto-merge (squash) May 23, 2026 13:16
arnohillen and others added 4 commits May 23, 2026 15:17
ci-hobby.yml uses [opened, synchronize, labeled, unlabeled] (no `reopened`),
so mirroring it means dropping `reopened` here too. Otherwise reopening a PR
that touches hobby paths would build an image that ci-hobby never wakes up
to consume — exactly the wasted ~20 min build this PR aims to eliminate.

Caught by Greptile review.

Co-authored-by: Cursor <cursoragent@cursor.com>
Renames the secondary filter `build_self` -> `build_inputs` and adds
`pnpm-lock.yaml` to it. ci-hobby's filter intentionally ignores frontend
dep changes (they're irrelevant to the smoke test), but they can still
break the production Docker build (different prod-only install flags,
multi-arch). With just the `hobby:` filter, Renovate-style lockfile bumps
would skip the build on the PR and only fail at merge_group / master push.

Two extra paths is a small price for catching these on the PR rather than
turning master red.

Co-authored-by: Cursor <cursoragent@cursor.com>
@arnohillen arnohillen requested review from rnegron and webjunkie and removed request for rnegron and webjunkie May 24, 2026 13:23
@gantoine
Copy link
Copy Markdown
Member

gantoine commented May 26, 2026

But the only consumer of the resulting image is ci-hobby.yml's wait-for-docker-image job

Not sure this is true, ci-e2e-playwright has this comment at the top: This workflow runs CI E2E tests with Playwright. It relies on the container image built by 'container-images-ci.yml'..

@arnohillen
Copy link
Copy Markdown
Contributor Author

I see, I'm not an expert in this so happy to close this PR if it doesn't make sense. I made it while waiting for CI to finish :D

Copy link
Copy Markdown
Member

@rnegron rnegron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arnaudhillen in this PR, can we also update ci-e2e-playwright.yml? i dont believe

#
# This workflow runs CI E2E tests with Playwright.
#
# It relies on the container image built by 'container-images-ci.yml'.
#

is correct anymore

# Keep `hobby:` in sync with the same-named filter in ci-hobby.yml —
# drift means hobby waits for an image we never build (or vice versa).
hobby:
- Dockerfile
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we add .Dockerignore?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — added .dockerignore to build_inputs: rather than hobby:. .dockerignore changes the build context but doesn't affect hobby's runtime, and hobby: is supposed to mirror ci-hobby.yml's filter byte-for-byte (which doesn't list .dockerignore either).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry Cursor replied, but does this make sense?

arnohillen and others added 3 commits May 26, 2026 22:19
The top-of-file comment claimed `container-images-ci.yml` produced the
image consumed here, but the only image this workflow pulls is
`posthog-node` from `ci-nodejs-container.yml` (see the
`Resolve Node.js container image tag` step). Fix the comment so the
dependency is searchable.

Co-authored-by: Cursor <cursoragent@cursor.com>
`.dockerignore` silently changes the build context — a broken change
should fail on the PR, not at merge_group. Putting it under
`build_inputs:` (not `hobby:`) keeps the `hobby:` filter byte-for-byte
in sync with `ci-hobby.yml`, which doesn't watch `.dockerignore` either.

Co-authored-by: Cursor <cursoragent@cursor.com>
@webjunkie
Copy link
Copy Markdown
Contributor

tagging @gantoine since we looked at e2e this week

Copy link
Copy Markdown
Member

@gantoine gantoine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assuming these last CI checks pass this looks good

@arnohillen arnohillen merged commit 4fc0639 into master May 28, 2026
150 checks passed
@arnohillen arnohillen deleted the ahillen/ci-skip-docker-and-harden-e2e-changes-checkout branch May 28, 2026 15:56
@deployment-status-posthog
Copy link
Copy Markdown

deployment-status-posthog Bot commented May 28, 2026

Deploy status

Environment Status Deployed At Workflow
dev ✅ Deployed 2026-05-28 16:34 UTC Run
prod-us ✅ Deployed 2026-05-28 17:00 UTC Run
prod-eu ✅ Deployed 2026-05-28 17:02 UTC Run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants