CNTRLPLANE-3531: Remove EFS-backed Go build cache from CI runners #8637
Benchmarks show reading cache entries from the EFS-backed PVC over NFS takes 4m03s, roughly 2x slower than compiling from scratch (2m05s). The Go build cache consists of thousands of small files, and each lookup requires NFS stat + read round-trips that dominate wall-clock time. The cp -a fallback was also timing out because the cache grew too large.

Remove all EFS cache infrastructure:
- Delete cache-warming CronJob and warm-go-cache composite action
- Remove EFS PVC volume mount from runner pod spec
- Remove warm-go-cache step from all reusable workflows

gocacheprog stays in the runner image for use with the planned node-local DaemonSet cache (CNTRLPLANE-3530).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
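To get a feel for why per-file round-trips dominate, a rough probe along these lines can be pointed at a cache directory on local disk and then at the same tree on an NFS mount. This is only a sketch and not the benchmark behind the numbers above: the function name `time_cache_reads` is made up for illustration, and it measures whole seconds, so it is only meaningful on trees with many files.

```shell
# Hypothetical helper: perform one stat and one full read per file under DIR,
# then report how many files were touched and how many seconds it took.
time_cache_reads() {
  dir=$1
  start=$(date +%s)
  count=$(find "$dir" -type f -exec sh -c '
    for f in "$@"; do
      stat "$f" >/dev/null   # metadata lookup (one round-trip on NFS)
      cat "$f" >/dev/null    # read the whole entry
      echo .                 # one marker line per file, counted below
    done' sh {} + | wc -l)
  end=$(date +%s)
  echo "files=$((count + 0)) seconds=$((end - start))"
}
```

For example, `time_cache_reads "$(go env GOCACHE)"` probes the local Go build cache; running the same function against a copy of that tree on the network mount gives a comparable figure.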
|
@celebdor: This pull request references CNTRLPLANE-3531, which is a valid Jira issue.

Warning: The referenced Jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.
|
📝 Walkthrough

This PR removes Go build cache warming infrastructure from the Hypershift CI/CD pipeline. The warm-go-cache composite GitHub Action that previously used EFS-backed read-only caching with local overlay is no longer invoked by five reusable workflow files (envtest-kube, envtest-ocp, lint, test, verify). The lint workflow replaces this step with direct provisioning of pre-built lint tools. The associated Kubernetes CronJob responsible for periodic cache population is deleted. The runner container configuration in the Helm values file is simplified by relocating the resources block.
🚥 Pre-merge checks: ✅ 11 passed
|
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox, celebdor
|
Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

Summary

All 7 CI jobs fail instantly at workflow setup, before any Go code compiles or any test runs. The PR deletes

Root Cause

This is a chicken-and-egg problem caused by how GitHub Actions resolves reusable workflows for PRs in the same repository. The mechanism:

In short: The reusable workflow from

This is not a flaky failure or infrastructure issue: it will fail deterministically on every run of this PR as long as

Recommendations

Option A: Two-phase merge (recommended):

Option B: Self-referencing reusable workflow ref:

Option C: Make the action a no-op first:

Option A is the cleanest approach. Split this PR into two: one that removes the references, and a follow-up that deletes the files.

Evidence
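Going back to Option A above, the second PR (the one that deletes the files) is only safe once nothing still references the action on the branch the workflows are resolved from. A minimal sketch of that check follows, assuming the workflows live under `.github/workflows` and that searching for the literal string `warm-go-cache` is enough to find every reference. The function names are hypothetical.

```shell
# Hypothetical helpers for the two-phase merge.
# list_action_refs NAME [DIR]: print every line under DIR that mentions NAME.
list_action_refs() {
  grep -rn -F -- "$1" "${2:-.github/workflows}"
}

# safe_to_delete NAME [DIR]: succeed only if grep ran and found no match.
# grep exits 1 for "no match" and 2 for errors such as a missing directory,
# so an unreadable tree is never reported as safe.
safe_to_delete() {
  status=0
  list_action_refs "$@" >/dev/null || status=$?
  [ "$status" -eq 1 ]
}
```

Run from a checkout of the target branch after the first PR has merged, `safe_to_delete warm-go-cache && echo "ok to delete"` prints the message only when no workflow still mentions the action; `list_action_refs warm-go-cache` shows the offending lines otherwise.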
|
Summary
Benchmark data: https://gist.github.com/celebdor/7c73e9e3aee02d77f8879f251b354606
Cluster cleanup (after merge)
oc delete cronjob -n arc-runners go-cache-warmer
oc delete pvc -n arc-runners go-cache-pvc

Test plan
.github/actions/warm-go-cache/cache/go-build

🤖 Generated with Claude Code