ci: rfq: migrate linux CI jobs to namespace #35144
willcl-ark wants to merge 8 commits into bitcoin:master
Conversation
willcl-ark
commented
Apr 23, 2026
- Migrate linux CI jobs over to namespace.so
- Configure docker to use namespace's remote docker builders (and the implicit shared docker buildkit cache)
- Configure caches to use the namespace cache volumes
- Fix up the kernel headers needed in the ASAN job for the USDT tests.
  - The namespace hypervisor/host runs a newer kernel whose headers are not packaged for the Ubuntu version in the container, but the host does provide in-kernel headers, which we can mount into the container (sketched below).
- Ensure CI still works on GHA for forks
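For reviewers unfamiliar with in-kernel headers, here is a minimal sketch of the workaround, assuming the host kernel is built with `CONFIG_IKHEADERS`; the step, path and variable names are illustrative, not the exact change in this PR:

```yaml
# Illustrative workflow step, not the exact change from this PR: extract
# the host's in-kernel headers (requires CONFIG_IKHEADERS) and prepare a
# read-only bind mount for the CI container.
- name: Provide host kernel headers to the container
  run: |
    sudo modprobe kheaders                # exposes /sys/kernel/kheaders.tar.xz
    mkdir -p "$RUNNER_TEMP/kheaders"
    sudo tar -xf /sys/kernel/kheaders.tar.xz -C "$RUNNER_TEMP/kheaders"
    # Hypothetical variable consumed by the container start-up script:
    echo "KHEADERS_MOUNT=-v $RUNNER_TEMP/kheaders:/usr/src/linux-headers-$(uname -r):ro" >> "$GITHUB_ENV"
```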
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
With the announcement that Cirrus-app is closing down after being acquired by OpenAI, we sadly need to take action on our runners (again). As previously, there are two main options: self-hosting, or a hosted runner provider.
I have investigated self-hosting twice now, and don't think it's right for us. If we use a single (xl) machine, we can natively share docker buildkit, ccache, depends, etc. caches. With more than one machine, you start to need additional servers spun up to host these services, or you have sub-optimal caching. On the security side, the github runner service needs to be isolated in a VM, with all the plumbing that entails. I don't think this is worth it unless we want to commit a few engineers to manage it with some decent chunk of their time. (I do have a nix configuration for …)

On the hosted side, there are many options still about. I looked at Runs-on, WarpBuild and Namespace, as these 3 had the most-acceptable github app permission requirements, which rules many others out (they usually wanted …).

Runs-on

Runs-on will be the cheapest. It (probably) requires the most hands-on maintenance, and has the least-likeable permissions. Runs-on uses your own AWS account to provision runners from AWS instances, in combination with a CloudFormation configuration they supply, effectively running the CI on "your own (AWS)" machines, which you pay cost price for. You can't really get any cheaper without self-hosting. The downsides are that:
WarpBuild

Decent outfit. Runners are good, fast and highly concurrent. Drop-in caching and docker builder solutions. Simply pay as you go; I don't think there is a limit to concurrency. No dangerous org permissions needed, and the management UI is nice, but they are expensive. I did not contact them for volume pricing, because they already came out at more than double namespace's rates, at which point I moved my focus on.

Namespace

Seems to be roughly equivalent to WarpBuild feature-wise, offering drop-in caching (*see below), docker builders and concurrent runners. The concurrency is limited by the contract, but can be adjusted. You can pay as you go, or enter contracted annual minute/cache allowances.

namespace caching

This is a little different to the other two; instead of saving and restoring cache blobs, you get volume mounts (a sketch of the workflow-side usage follows at the end of this comment). I have found the cache hitrate of these to be mixed, which is apparently a known property: there is a volume per runner (or group?) which is then mounted, and as these gradually warm up for all jobs, the hitrate should improve. I think there are perhaps other approaches we could use to try and improve this, but it is worth noting, especially if we are being billed per minute! It sounds like the cache hitrate will never be 100% on, say, a doc-change PR, unless you luck out and all jobs land on the same runners, or something. TBD. Cache mounts (and saves) are ~instant though, so you could save 30-60 seconds per job there. And perhaps in long-running operation these hitrates all rise to about 90-something % and this is not an issue.

Overall, my conclusions are:
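To make the volume-mount model concrete, here is a rough sketch of workflow-side usage, assuming Namespace's `nscloud-cache-action`; the profile label, cache paths and test script are placeholders rather than this PR's exact configuration:

```yaml
jobs:
  build:
    # Placeholder profile label; real profiles are configured in the
    # Namespace dashboard.
    runs-on: namespace-profile-linux-xl
    steps:
      - uses: actions/checkout@v4
      # Volume-mount caching: the listed paths are backed by a cache
      # volume attached to this runner, so there is no separate save
      # step or blob upload at the end of the job...
      - uses: namespacelabs/nscloud-cache-action@v1
        with:
          path: |
            /home/runner/.ccache
            /home/runner/.cache/depends
      # ...but the hit rate depends on which (warmed) volume the runner
      # received, per the discussion above.
      - run: ./ci/test_run_all.sh
```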
Shameless plug, but given the criteria you've laid out, I think it's worth adding Ubicloud to the list. I am a software engineer there, so take this with the appropriate grain of salt.

On price. Pricing is public and multiple times cheaper than GitHub-hosted runners at equivalent specs, including on ARM64. Happy to talk volume/committed-use for a project at Bitcoin's scale.

On caching. We offer a drop-in GitHub Actions cache backend (swap the action, keep your workflow) that stores blobs rather than volume-mounting, so the hit-rate pathology you described for Namespace doesn't apply. A doc-only PR will get the same hits any other runner would.

On usage. We run CI for a bunch of OSS projects with similar shapes to Bitcoin Core, so the heavy-C++-build / depends-cache / ccache workflow isn't unfamiliar territory. Usage is the usual …

Feel free to reach out to me at furkan[at]ubicloud[dot]com or support[at]ubicloud[dot]com
Thank you for the detailed analysis of the available options. Could you additionally provide details on the supported architectures for them?
I did not test Ubicloud as I saw in your https://www.ubicloud.com/docs/github-actions-integration/quickstart that the github app needed … Is there a way around that requirement?
@hebasto I reviewed them manually previously, but had Claude put together a table, which I fixed up a little myself:
Notes:
If you want to know more details, let me know. There are even more providers about, but these are the ones I looked at due to their app permission requirements.
The action now also chooses runner-specific Docker setup rather than just a cache backend. Rename the interface so the workflow can describe whether a job runs on GitHub-hosted or Namespace infrastructure.
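For illustration only, a caller-side excerpt of what such an interface could look like; the action path and input names are assumptions, not the identifiers used in this PR:

```yaml
# Hypothetical caller-side excerpt; action and input names are
# illustrative.
- uses: ./.github/actions/configure-runner
  with:
    # One input now describes the infrastructure; the action derives both
    # the Docker setup and the cache backend from it.
    runner-environment: namespace   # or: github-hosted (e.g. fork CI)
```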
Force-pushed from 06890e4 to 859a6c7
Replace Cirrus-specific provider names and runner labels with Namespace profiles.
Namespace cannot persist the old cache layout under runner.temp because caches are now volume mounts: https://namespace.so/docs/solutions/github-actions/caching. Put reusable state under stable cache paths while leaving the working tree and build directory in the temporary workspace, and mount those exact paths into the container so the build layout stays unchanged.
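A minimal sketch of that layout, with assumed variable, image and script names (the PR's actual identifiers will differ):

```yaml
# Illustrative job excerpt: reusable state lives at stable absolute paths
# that Namespace can back with cache volumes; the work tree and build
# directory stay in the ephemeral per-job workspace.
env:
  CCACHE_DIR: /home/runner/.cache/ccache      # stable, volume-backed
  DEPENDS_DIR: /home/runner/.cache/depends    # stable, volume-backed
  BASE_BUILD_DIR: ${{ runner.temp }}/build    # ephemeral, per-job

steps:
  - name: Run CI in a container with caches mounted at identical paths
    run: |
      docker run \
        -v "$CCACHE_DIR:$CCACHE_DIR" \
        -v "$DEPENDS_DIR:$DEPENDS_DIR" \
        -v "$BASE_BUILD_DIR:$BASE_BUILD_DIR" \
        ci_image ./run-ci.sh   # image/script names illustrative
```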
Keep the local restore and save cache actions, but switch them to Namespace mounts on Namespace runners and GitHub path caches on GitHub-hosted runners. This preserves Namespace-native cache volumes in the main repository while restoring ccache, depends, and previous-release reuse for fallback Linux jobs such as 32-bit ARM and forks.
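A hedged sketch of how that switch could look inside a composite action; the input names are hypothetical, while `namespacelabs/nscloud-cache-action` and `actions/cache` are the real backends being selected between:

```yaml
# Hypothetical composite-action excerpt: pick the cache backend from the
# runner environment, so fallback jobs (forks, 32-bit ARM) on
# GitHub-hosted runners keep blob-based path caches.
- if: ${{ inputs.runner-environment == 'namespace' }}
  uses: namespacelabs/nscloud-cache-action@v1
  with:
    path: ${{ inputs.path }}
- if: ${{ inputs.runner-environment != 'namespace' }}
  uses: actions/cache/restore@v4
  with:
    path: ${{ inputs.path }}
    key: ${{ inputs.key }}
```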
Update the CI README to match the Namespace-based workflow: required runner profiles, required actions, cache setup, and the remaining jobs that still run on GitHub-hosted infrastructure. Importantly, record that the runner profile's cache allow-list should restrict updates to the default branch when the desired behavior is restore on pull requests but persist only from the main branch.
Force-pushed from b8ff5ba to 52ee88b