perf(docker): cache CGO dep compile via go.sum-keyed prewarm layer #654
Merged
Docker image builds recompile the heavy CGO dependencies — duckdb-go
(libduckdb) and pg_query_go (libpg_query) — from scratch on every PR.
The Go build cache that makes the non-Docker CI jobs fast (restored by
actions/setup-go, keyed on go.sum) does not persist inside Docker
builds, and `COPY . .` busts the build layer on any source edit, so
the multi-minute C/C++ compile re-runs every time.
Observed CD build times (GHA, native arm64 runners):
    worker image   ~367s  ×4 matrix
    main image     ~288s  ×2
    controlplane   ~241s  ×2
Add a dep-prewarm RUN layer after `go mod download`, keyed only on
go.mod/go.sum, that compiles duckdb-go + pg_query_go into the build
cache:
    RUN CGO_ENABLED=1 go build github.com/duckdb/duckdb-go/v2 \
        github.com/pganalyze/pg_query_go/v6 2>/dev/null || true
On a source-only PR (the common case) this layer is a GHA layer-cache
hit (cache-to: type=gha,mode=max is already configured on all image
workflows), so the libduckdb/libpg_query compile is skipped and the
final `go build` recompiles only changed first-party packages.
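
A minimal sketch of the layer order described above, assuming a single builder stage. The base image tag, working directory, and output path are hypothetical and not taken from the repository's Dockerfiles; only the prewarm RUN line comes from this commit message.

```dockerfile
# Hypothetical builder stage: image tag, WORKDIR, and /out/app are assumptions.
FROM golang:1.24 AS builder
WORKDIR /src

# 1. Module files only: this layer is invalidated only by go.mod/go.sum edits.
COPY go.mod go.sum ./
RUN go mod download

# 2. Dep prewarm: compile the CGO deps into the Go build cache.
#    Best effort: stderr is discarded and a failure does not fail the build.
RUN CGO_ENABLED=1 go build github.com/duckdb/duckdb-go/v2 \
    github.com/pganalyze/pg_query_go/v6 2>/dev/null || true

# 3. First-party source: any source edit invalidates this layer and those below.
COPY . .
RUN CGO_ENABLED=1 go build -o /out/app .
```

Because steps 1 and 2 sit above `COPY . .`, a source-only change reuses their cached layers and only step 3 re-runs.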
- Dockerfile: prewarm duckdb-go + pg_query_go.
- Dockerfile.controlplane: prewarm pg_query_go only (CP doesn't link
libduckdb).
- Dockerfile.worker: restructure to copy go.mod/go.sum first → pin →
download → prewarm → COPY source, so the prewarm is cacheable. The
per-DuckDB-version pin correctness the old COPY-source-first layout
guarded is preserved by stashing the pinned go.mod/go.sum and
restoring them after COPY . . (local copy, no network). The existing
bindings-version verify layers still fail the build on any drift.
cache-proxy Dockerfile unchanged — it's CGO_ENABLED=0, no C deps.
Follow-ups (separate PRs): make PR-triggered image builds arm64-only
(mw-dev is arm64); trim the worker bindings download from 5 OS/arch
combinations to the target arch.
The 5 bundled-extension downloads (httpfs, ducklake, json,
postgres_scanner, iceberg) ran AFTER `COPY . .` and the app build, so
`COPY . .` busting on any source edit re-ran all 5 curls every build.
They depend only on the extension version/tag args, not on source.

Move the download block (and, in the worker, the arg-only
extension/bindings version cross-check) ahead of `COPY . .` so the
layer is keyed on the version args alone and stays a GHA layer-cache
hit across source-only PRs. The binary-dependent bindings-pin verify
stays after the build (it inspects the compiled binary).

Pairs with the CGO dep-prewarm in the previous commit: now every
expensive builder layer (module download, CGO dep compile, extension
fetch) sits before the source COPY and caches independently of
first-party code changes.
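
A rough sketch of an extension-download layer keyed on version args alone. The ARG names (`DUCKDB_EXT_VERSION`, `EXT_BASE_URL`), the URL layout, the `/ext` directory, and the base image tag are all hypothetical placeholders; the real Dockerfiles may fetch the extensions differently.

```dockerfile
# Hypothetical builder stage: every name below except the extension list
# is an assumption made for illustration.
FROM golang:1.24 AS builder
WORKDIR /src

# Pass both values with --build-arg; a changed value is a cache miss
# at the first instruction that uses it.
ARG DUCKDB_EXT_VERSION
ARG EXT_BASE_URL

# Depends only on the ARG values above, not on any source file,
# so source-only edits leave this layer cached.
RUN mkdir -p /ext && \
    for ext in httpfs ducklake json postgres_scanner iceberg; do \
      curl -fsSL -o "/ext/${ext}.duckdb_extension.gz" \
        "${EXT_BASE_URL}/${DUCKDB_EXT_VERSION}/${ext}.duckdb_extension.gz" \
        || exit 1; \
    done

# Source arrives only after the source-independent layers.
COPY . .
```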
benben added a commit that referenced this pull request Jun 2, 2026
…rce (#655)

The prewarm restructure in #654 moved the per-DuckDB-version pin into
a module-files-only stage (COPY go.mod go.sum; go get; go mod tidy)
ahead of `COPY . .`. With no .go files present, `go mod tidy` sees
zero imports and prunes EVERY require directive, emptying go.mod. The
stashed go.mod.pinned was therefore empty too, and restoring it after
`COPY . .` gave the final build a go.mod with no dependencies:

    duckdbservice/appender_init.go:7:2: no required module provides package github.com/duckdb/duckdb-go/v2

All four worker matrix builds failed on main
(container-image-worker-cd). Main + control-plane images were
unaffected; they never run tidy.

Fix: pin with `go get pkg@ver` (which records the require regardless
of imports) but do NOT tidy in the source-less stage; drop the
stash/restore. Re-run the pin + `go mod tidy` AFTER `COPY . .`, once
the real import set is visible, so tidy keeps every needed require.
The prewarm + module caches from the pre-COPY layers stay valid (same
versions → go get is a fast metadata no-op).

Verified locally: `go mod tidy` with only go.mod/go.sum prunes
duckdb-go to zero requires; `go get duckdb-go/v2@v2.10502.0` without
tidy keeps and pins it. Default-path worker binary builds clean.
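
A minimal sketch of the pin-without-tidy ordering this fix describes. The ARG name `DUCKDB_GO_VERSION`, the base image tag, and the output path are hypothetical; the default version is the one quoted in the commit message, and the real worker Dockerfile has additional verify layers not shown here.

```dockerfile
# Hypothetical builder stage: image tag, ARG name, and /out/worker are assumptions.
FROM golang:1.24 AS builder
WORKDIR /src
ARG DUCKDB_GO_VERSION=v2.10502.0

# Source-less stage: `go get pkg@ver` records the require even though no
# .go file imports the package yet. Do NOT run `go mod tidy` here: with
# zero imports visible it would prune every require.
COPY go.mod go.sum ./
RUN go get github.com/duckdb/duckdb-go/v2@${DUCKDB_GO_VERSION} && \
    go mod download

# COPY overwrites go.mod/go.sum with the repository copies, so re-apply
# the pin, then tidy now that the real import set is visible.
COPY . .
RUN go get github.com/duckdb/duckdb-go/v2@${DUCKDB_GO_VERSION} && \
    go mod tidy
RUN CGO_ENABLED=1 go build -o /out/worker .
```

The second `go get` resolves the same version the first one already downloaded, which is why the commit message expects it to be fast.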
benben added a commit that referenced this pull request Jun 2, 2026
The prewarm added in #654 was premised on a multi-minute CGO compile
of duckdb-go. That premise was wrong: duckdb-go-bindings ships
PREBUILT static libs (libduckdb_*.a etc, ~1.7GB in the module cache),
so there is no DuckDB C++ compile to cache. The other prewarmed dep,
pg_query_go, cold-compiles in ~4s. So the prewarm layer saved
single-digit seconds while adding a confusing extra build step.

Measured on warm main builds after #654:
    main           288s -> 264s
    controlplane   241s -> 229s

The saving (~8% on main, ~5% on controlplane) came from the LAYER
REORDERING (module + extension downloads moved ahead of `COPY . .`, so
they cache on source-only PRs), not from the prewarm. The dominant
remaining cost is the final CGO *link* of the prebuilt static libs
(~68s), which caching can't touch because it changes with every
first-party source edit.

Remove the prewarm RUN from all three Dockerfiles. Keep the valuable
reordering and the worker pin-without-tidy fix from #655.

Net build-time wins still live here:
- module download + extension fetch cached before `COPY . .`

Real PR-latency lever (separate change): build only linux/arm64 in PR
CI (mw-dev is arm64), halving the worker matrix.
Problem
Docker image builds redo two expensive things on every PR that the
non-Docker CI jobs already cache:
1. The CGO dependency compile of duckdb-go (libduckdb) + pg_query_go
   (libpg_query). actions/setup-go caches this for the test jobs (keyed
   on go.sum), but the cache doesn't reach Docker, and `COPY . .` busts
   the build layer on any source edit, so the multi-minute C/C++
   compile re-runs each time.
2. The 5 bundled-extension downloads (httpfs, ducklake, json,
   postgres_scanner, iceberg) that ran after `COPY . .` + build, so
   they re-fetched on every source edit despite depending only on the
   version args.

Measured CD build times (GHA, native arm64 runners):
    worker image   ~367s  ×4 matrix
    main image     ~288s  ×2
    controlplane   ~241s  ×2

`--mount=type=cache` would not help on ephemeral GHA runners: the GHA
cache backend doesn't persist BuildKit cache mounts. Layer caching
(`cache-to: type=gha,mode=max`, already configured on all image
workflows) does, so the fix is to put every expensive step in a layer
keyed on go.sum / version args, ahead of the source COPY.

Fix
1. CGO dep-prewarm after `go mod download`:
       RUN CGO_ENABLED=1 go build github.com/duckdb/duckdb-go/v2 \
           github.com/pganalyze/pg_query_go/v6 2>/dev/null || true
2. Move extension downloads before `COPY . .` (keyed on version args).

Net: every builder layer that costs time (module download, CGO dep
compile, extension fetch) now sits before the source COPY and caches
independently of first-party code. A source-only PR (the common case)
hits cache for all of them; the final `go build` recompiles only
changed packages.

Per-file
- Dockerfile.worker: go.mod/go.sum first → pin → download → prewarm →
  extension download → COPY source. Per-DuckDB-version pin correctness
  (which the old copy-source-first layout guarded) is preserved by
  stashing the pinned go.mod/go.sum and restoring them after
  `COPY . .` (local copy, no network). Binary-dependent bindings-verify
  stays after the build; arg-only version cross-check moved up with the
  downloads.
- cache-proxy Dockerfile: unchanged. CGO_ENABLED=0, no C deps, no
  extensions.

Expected: warm-cache source-only PRs drop CGO compile (~3-5 min) + 5
extension fetches to near-zero; the first build after a go.sum /
extension-version bump pays it once.

Not included (deferred, separate PRs)
- Trimming the worker bindings download to $TARGETARCH: now low-value
  since the pin+download layer is itself cached, and it risks breaking
  `go mod tidy` (which is exhaustive across GOOS/GOARCH).

🤖 Generated with Claude Code