ci: cache FetchContent deps + boost mirror swap #508
Two narrow, locally-verifiable changes to reduce FetchContent flakiness:

1. boost URL: sourceforge → archives.boost.io (URL_HASH unchanged). Addresses the most flaky cold-start dep: sourceforge has been a recurring 504 source (e.g. PR DTVMStack#499 run 25897803413 fell over on a FetchContent download). Boost.org's official archive returns HTTP 200 with the canonical 1.67.0 tarball (content-length matches; URL_HASH validates byte-identity). Drop DOWNLOAD_NAME since the new URL already ends in the canonical filename.

2. Top-level CMakeLists.txt honors FETCHCONTENT_BASE_DIR from env when no `-D` is passed on the cmake command line. Mirrors the pattern already at .worktrees/feat-gas-check-placement/CMakeLists.txt:8-15. Lets local developers share a populated cache across worktrees and clean builds via `export FETCHCONTENT_BASE_DIR=~/.cache/cmake-fetchcontent`.

docs/start.md adds a "Build dependency cache" section documenting the local convention and the SGX local-cache caveat (asmjit gets patched under SGX; mixing patched/unpatched in one cache breaks).

A complementary follow-up to pre-bake the deps into `dtvmdev1/dtvm-dev-x64:main` was scoped out of this PR because Docker is not available in the implementation environment for end-to-end verification. Design preserved in the change-doc's "Deferred" section for a future PR.

Validation (local):
- curl HTTP 200 + correct content-length on archives.boost.io URL.
- env-hook test: `FETCHCONTENT_BASE_DIR=/tmp/fc cmake -S . -B build` populates /tmp/fc/<name>-src/ (note: BASE_DIR layout has no _deps/ segment; that's only the default).
- Populated boost-src/boost/version.hpp contains BOOST_VERSION 106700.
- tools/format.sh check pass.

Refs: docs/changes/2026-05-15-fetchcontent-cache/README.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
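The precedence described in this commit (a `-D` value wins, then the environment variable, then CMake's default of `<build>/_deps`) can be modelled in a few lines. This is only a Python sketch of the lookup order, not the CMake code from the PR; `resolve_base_dir` and `source_dir` are hypothetical helper names, and the `<name>-src` layout is taken from the validation note above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_base_dir(cli_value: Optional[str],
                     env: Mapping[str, str],
                     build_dir: str) -> Path:
    """Model of the lookup order only; the real logic lives in CMakeLists.txt."""
    if cli_value:
        # -DFETCHCONTENT_BASE_DIR=... on the command line takes priority.
        return Path(cli_value)
    env_value = env.get("FETCHCONTENT_BASE_DIR")
    if env_value:
        # Env hook: consulted only when no -D value was given.
        return Path(os.path.expanduser(env_value))
    # CMake's default location, the only case with a _deps/ segment.
    return Path(build_dir) / "_deps"


def source_dir(base_dir: Path, dep_name: str) -> Path:
    # FetchContent populates <base>/<lowercased name>-src.
    return base_dir / f"{dep_name.lower()}-src"


base = resolve_base_dir(None, {"FETCHCONTENT_BASE_DIR": "/tmp/fc"}, "build")
print(source_dir(base, "boost"))
```

With the environment variable set and no `-D`, the boost sources resolve under `/tmp/fc/boost-src`, which matches the env-hook test in the validation list.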
Add `actions/cache@v4` step to each container-image job in EVM (10) and WASM (4) workflows, plus workflow-level `FETCHCONTENT_BASE_DIR` env, so populated FetchContent sources persist across CI runs. First run is a miss (full downloads, ~820MB save); subsequent runs with unchanged `third_party/AddDeps.cmake` hit the cache and skip all 8 downloads.

Builds on commit 96707a2 (CMakeLists env hook + boost URL swap):
- The env hook (CMakeLists.txt:8-18) reads `FETCHCONTENT_BASE_DIR` from the workflow-level env when no `-D` is passed on cmake cmd line, so `.ci/run_test_suite.sh` and inline-cmake jobs both pick it up without modification.
- boost URL swap already addresses the most flaky cold-start dep (sourceforge → archives.boost.io).

Cache key composition: `${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }}`. The `github.workflow` segment is necessary: EVM runs `SINGLEPASS_JIT=OFF` (no asmjit), WASM runs `SINGLEPASS_JIT=ON` (needs asmjit); sharing one key across workflows would cause WASM to re-download asmjit every run because actions/cache@v4 skips same-key saves (the "first writer wins" behavior). Workflow-prefixed keys avoid this.

No `restore-keys`: partial cache hits across different dep versions can silently yield stale source.

Coverage: 14 container jobs total. The replace_all on the standard `submodules: "true"` → `Code Format Check` boundary handled 12 of them; two manual inserts handled the Hunter-cache job (after Hunter cache step) and the perf-regression-check job (after its `fetch-depth: 0` checkout `with:` clause).

Validation:
- Format check pass.
- YAML lint pass (python yaml.safe_load) on both workflows.
- Cache step count: 10 (EVM) + 4 (WASM) = 14.
- Cannot test cache behavior locally: actions/cache is GH-runner-side. PR CI will exercise it (AC-A first-run miss → save; AC-B re-run hit).

Spec went through Phase 0 → Phase 0.5 (both REFINE absorbed) → Phase 1 → Phase 2 R1 (Opus PASS + Codex REVISE wording fixes absorbed). Hard cap not hit; iter=2 of Phase 0.5 skipped because refinements were spec-text only.

Refs: docs/changes/2026-05-15-fetchcontent-cache/README.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
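The "first writer wins" argument for workflow-prefixed keys can be illustrated with a toy model. The sketch below is not the `actions/cache` implementation: `CacheStore` and `run_job` are hypothetical names, and the dep sets are illustrative subsets, assuming only what the commit message states (EVM does not need asmjit, WASM does, and a save is skipped when the key already exists).

```python
class CacheStore:
    """Toy key-immutable cache: the first save for a key wins."""

    def __init__(self):
        self._entries = {}

    def restore(self, key):
        # Return a copy of the cached dep set, or an empty set on a miss.
        return set(self._entries.get(key, ()))

    def save(self, key, contents):
        if key in self._entries:
            return False  # same-key save is skipped
        self._entries[key] = frozenset(contents)
        return True


def run_job(store, key, needed):
    """Restore, 'download' whatever is missing, attempt a save.

    Returns the set of deps that had to be downloaded in this run.
    """
    have = store.restore(key)
    downloaded = set(needed) - have
    store.save(key, have | downloaded)
    return downloaded


# Illustrative subsets only; the real dep list lives in AddDeps.cmake.
EVM_DEPS = {"boost", "spdlog"}
WASM_DEPS = EVM_DEPS | {"asmjit"}

shared = CacheStore()
run_job(shared, "shared-key", EVM_DEPS)          # EVM writes first
print(run_job(shared, "shared-key", WASM_DEPS))  # WASM downloads asmjit
print(run_job(shared, "shared-key", WASM_DEPS))  # ...and again next run

split = CacheStore()
run_job(split, "evm-key", EVM_DEPS)
run_job(split, "wasm-key", WASM_DEPS)            # cold start, full save
print(run_job(split, "wasm-key", WASM_DEPS))     # nothing left to download
```

In the shared-key case the WASM job never gets asmjit into the cache because its save is skipped; with separate keys the second WASM run downloads nothing.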
Pull request overview
CI infrastructure change that caches the populated CMake FetchContent dependency tree across runs to eliminate per-run downloads (and to harden against upstream 504s), plus a swap of the boost tarball mirror to a more reliable host. A small CMake hook allows the workflow-level env var to flow through to cmake -S . without modifying .ci/run_test_suite.sh, and the local convention is documented in docs/start.md.
Changes:
- Add an `actions/cache@v4` step (path `/github/home/.fetchcontent`, key includes `runner.os`, `github.workflow`, a `v1` namespace, and `hashFiles('third_party/AddDeps.cmake')`) plus a workflow-level `FETCHCONTENT_BASE_DIR` env to all 10 EVM and 4 WASM container jobs.
- Swap the boost 1.67 download URL from `sourceforge.net` to `archives.boost.io` (URL_HASH unchanged) and add a top-level `CMakeLists.txt` block that honors `FETCHCONTENT_BASE_DIR` from the environment when not already set in cache.
- New design doc under `docs/changes/2026-05-15-fetchcontent-cache/` and a new "Build dependency cache" section in `docs/start.md`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `CMakeLists.txt` | Honor FETCHCONTENT_BASE_DIR from env when not passed via -D. |
| `third_party/AddDeps.cmake` | Note about the new env hook + boost URL swap to archives.boost.io. |
| `.github/workflows/dtvm_evm_test_x86.yml` | Workflow env FETCHCONTENT_BASE_DIR + 10 actions/cache steps. |
| `.github/workflows/dtvm_wasm_test_x86.yml` | Workflow env FETCHCONTENT_BASE_DIR + 4 actions/cache steps. |
| `docs/start.md` | New section documenting local cache convention and SGX caveat. |
| `docs/changes/2026-05-15-fetchcontent-cache/README.md` | Design doc, risks, acceptance criteria. |
Comments suppressed due to low confidence (2)
.github/workflows/dtvm_evm_test_x86.yml:42
- The cache key uses `${{ github.workflow }}`, which expands to the workflow's `name:` field, here `"DTVM-EVM test CI in x86-64"` (and `"DTVM-WASM test CI in x86-64"` in the WASM file). That embeds spaces into the cache key (e.g. `Linux-fc-DTVM-EVM test CI in x86-64-v1-<hash>`), which is technically accepted by `actions/cache@v4` but is awkward to grep for, and any future rename of the workflow's display name will silently invalidate every cached entry. Acceptance criterion AC-E in `docs/changes/2026-05-15-fetchcontent-cache/README.md` also asserts the key contains `DTVM-EVM` / `DTVM-WASM` "with hyphen", which won't match the actual key string. Consider using a stable short literal per workflow (e.g. `evm` / `wasm`) instead of `github.workflow`, or update the AC wording. Same applies to every cache step in both workflows.
key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }}
.github/workflows/dtvm_evm_test_x86.yml:42
- The cache key only hashes `third_party/AddDeps.cmake`, but `CMakeLists.txt` now also affects FetchContent behavior (the new env-honoring block, and any future option toggles such as `SINGLEPASS_JIT` or `ZEN_ENABLE_SGX` that gate conditional `FetchContent_Declare` calls). Risk R4 in the change doc already acknowledges this. If a future commit moves a `FetchContent_Declare` out of `AddDeps.cmake` (e.g. into `CMakeLists.txt` or a subproject) the cache will silently serve stale sources. Consider including `'CMakeLists.txt'` in the `hashFiles(...)` glob, or extracting all FetchContent declarations into a single file enforced by convention.
key: ${{ runner.os }}-fc-${{ github.workflow }}-v1-${{ hashFiles('third_party/AddDeps.cmake') }}
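To make the first suppressed comment concrete, the key string can be assembled by hand. This is a minimal sketch of the string template only; `cache_key` is a hypothetical helper, `<hash>` stands in for whatever `hashFiles(...)` yields, and the workflow names are the ones quoted in the comment.

```python
def cache_key(runner_os: str, workflow_segment: str, deps_hash: str,
              namespace: str = "v1") -> str:
    # Mirrors the template: <os>-fc-<workflow>-<namespace>-<hash>
    return f"{runner_os}-fc-{workflow_segment}-{namespace}-{deps_hash}"


# github.workflow expands to the display name, spaces included.
by_name = cache_key("Linux", "DTVM-EVM test CI in x86-64", "<hash>")
# A stable short literal, as the reviewer suggests.
by_literal = cache_key("Linux", "evm", "<hash>")

print(by_name)
print(by_literal)
```

The first key carries the full display name (and changes whenever the workflow is renamed); the second depends only on the literal chosen in the workflow file.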
- CMakeLists.txt env-hook comment: fix path ref. The comment previously cited `/opt/cmake-fetchcontent` (a vestige from the abandoned image-bake design); the workflows in this PR use `/github/home/.fetchcontent`, and local convention uses `~/.cache/cmake-fetchcontent`. Reword to describe the actual intended consumers.
- docs/start.md: "8 deps" → "up to 8" with explicit list of which deps are conditional on which `ZEN_ENABLE_*` flag, so readers can reconcile with `AddDeps.cmake` content.
- docs/start.md SGX caveat: clarify that no current CI job builds with `ZEN_ENABLE_SGX=ON`, so the workflow-level cache key (which does not distinguish SGX) is sound today; flag it as a contract to revisit if SGX is ever added to CI.

Deferred (with rationale on the PR review thread):
- Composite action for the 14 duplicated cache steps. Acknowledged duplication; sed-replace of a single block on a key bump (e.g. `v1` → `v2`) is still 1 line per workflow, while a composite action adds a new file + indirection. Will reconsider if the cache step grows beyond 4 lines.
- Adding `CMakeLists.txt` to `hashFiles(...)`. Bumping the cache key on every CMakeLists.txt edit (frequent and dep-unrelated) outweighs the rare case where a new `FetchContent_Declare` is added outside `AddDeps.cmake`. R4 in the change-doc already calls this out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)
Performance Benchmark Results (threshold: 25%)
Summary: 194 benchmarks, 0 regressions

✅ Performance Check Passed (multipass)
Performance Benchmark Results (threshold: 25%)
Summary: 194 benchmarks, 0 regressions
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit 7a4d2a5)
Summary
- Cache the FetchContent dependency sources (`spdlog`/`asmjit`/`CLI11`/`intx`/`boost`/`rapidjson` + conditional `googletest`/`yaml-cpp`) across CI runs via `actions/cache@v4`, keyed on `hashFiles('third_party/AddDeps.cmake')`. First run: full download + save (~820MB). Subsequent runs with unchanged deps: zero downloads.
- Swap the `boost` URL from sourceforge (recurring 504 source; e.g., PR #499, "perf(compiler): thread ValueRange through OR/XOR/SHR_U + narrow U64/U128 compare paths", run 25897803413 died on a FetchContent download) to the official `archives.boost.io` archive. `URL_HASH` unchanged so byte-identity preserved.
- Add a `FETCHCONTENT_BASE_DIR` env hook in top-level `CMakeLists.txt` so the workflow-level env (`/github/home/.fetchcontent`) flows through `.ci/run_test_suite.sh` to `cmake -S .` without touching the script.
- Document the local convention (`export FETCHCONTENT_BASE_DIR=~/.cache/cmake-fetchcontent`) in `docs/start.md`.

Cache key uses `${{ github.workflow }}` so EVM and WASM caches are independent: EVM runs `SINGLEPASS_JIT=OFF` (no `asmjit`), WASM runs `SINGLEPASS_JIT=ON` (needs `asmjit`). Sharing one key would cause WASM to re-download `asmjit` every run because `actions/cache@v4` skips same-key saves.

No `restore-keys`: partial cache hits across different dep versions can silently yield stale source. Cold start (cache miss) is the lesser evil.

Coverage: 14 container jobs total (10 EVM + 4 WASM), all using `dtvmdev1/dtvm-dev-x64:main`. Non-container jobs (`perf_pr_comment`, `commit-lint`, image-release) don't run cmake and are not modified.

Stale image / image churn note: if `dtvmdev1/dtvm-dev-x64:main` is rebuilt with materially different CMake/compiler/Ninja, manually bump the `v1` namespace in the cache key (becomes `v2`, etc.). The mutable `:main` tag does not auto-invalidate the cache. Documented in the change-doc.

Design details and risk discussion: `docs/changes/2026-05-15-fetchcontent-cache/README.md`.

Test plan
- `tools/format.sh check` pass.
- YAML lint (`python -c "import yaml; yaml.safe_load(...)"`) pass on both workflows.
- `libdtvmapi.so` build succeeds (127/127 compile targets) with `LLVM_SYS_150_PREFIX=/opt/llvm15` + `FETCHCONTENT_BASE_DIR=~/.cache/cmake-fetchcontent`.
- First CI run: `Cache not found for input keys` followed by `Cache saved with key: Linux-fc-DTVM-EVM ... -v1-...` and `... DTVM-WASM ... -v1-...`.
- Re-run: `Cache restored from key: ...`, zero `^-- Downloading` lines in build output.

🤖 Generated with Claude Code
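The re-run check in the test plan (cache restored, no download lines) could be automated against a saved job log. The following is a rough sketch under stated assumptions: the cache-step output and the build output are taken to be concatenated into one string, `looks_like_cache_hit` is a hypothetical helper, and the sample strings are made-up log text rather than real CI output.

```python
import re

# Pattern from the test plan: lines that start with "-- Downloading".
DOWNLOAD_LINE = re.compile(r"^-- Downloading", re.MULTILINE)


def count_download_lines(build_log: str) -> int:
    """Number of log lines that begin with '-- Downloading'."""
    return len(DOWNLOAD_LINE.findall(build_log))


def looks_like_cache_hit(job_log: str) -> bool:
    """True when the log reports a restore and shows no download lines."""
    restored = "Cache restored from key:" in job_log
    return restored and count_download_lines(job_log) == 0


# Made-up log text for illustration only; real output will differ.
sample_hit = "Cache restored from key: <key>\n-- Configuring done\n"
sample_miss = "Cache not found for input keys: <key>\n-- Downloading <dep>\n"

print(looks_like_cache_hit(sample_hit))
print(looks_like_cache_hit(sample_miss))
```

Anchoring the pattern at line start with `re.MULTILINE` mirrors the `^` in the test-plan wording, so a mention of "Downloading" in the middle of another line is not counted.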