[DRAFT] feat(builder): use python-build-standalone instead of source builds#80
[DRAFT] feat(builder): use python-build-standalone instead of source builds#80AStaroverov wants to merge 6 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request switches the CPython provisioning on macOS and Linux from portable-python source builds to prebuilt python-build-standalone (PBS) tarballs. This change removes the need for local compilation, Docker-based builds, and custom library consolidation scripts, significantly simplifying the build pipeline and reducing build times. Feedback on the changes highlights an opportunity to improve error handling in the new PBS downloader by wrapping the download and extraction process in a try...finally block to guarantee cleanup of temporary files and directories in case of failures.
| const tarPath = path.join(stagingDir, fileName); | ||
|
|
||
| console.log(`[PBS] Downloading ${url}`); | ||
| await util.downloadFile(url, tarPath); |
There was a problem hiding this comment.
downloadFile does not follow HTTP redirects — PBS downloads will always fail
util.downloadFile uses Node's https.get() and immediately rejects on any status code other than 200. GitHub release asset URLs (github.com/.../releases/download/…) issue a 302 redirect to objects.githubusercontent.com. Every call here will reject with Failed to get '...' (302), so macOS and Linux builds will never download a tarball.
The fix is to handle 3xx responses inside downloadFile (or a wrapper) by following response.headers.location with a recursive call.
Prompt To Fix With AI
This is a comment left during a code review.
Path: builder/src/pbs.ts
Line: 46
Comment:
**`downloadFile` does not follow HTTP redirects — PBS downloads will always fail**
`util.downloadFile` uses Node's `https.get()` and immediately rejects on any status code other than 200. GitHub release asset URLs (`github.com/.../releases/download/…`) issue a 302 redirect to `objects.githubusercontent.com`. Every call here will reject with `Failed to get '...' (302)`, so macOS and Linux builds will never download a tarball.
The fix is to handle 3xx responses inside `downloadFile` (or a wrapper) by following `response.headers.location` with a recursive call.
How can I resolve this? If you propose a fix, please make it concise.| await util.downloadFile(url, tarPath); | ||
|
|
||
| // PBS tarballs always extract into a top-level `python/` directory. | ||
| // Use a temp dir so we can rename it into place without colliding with | ||
| // the staging dir's siblings. | ||
| const tempExtractDir = path.join(stagingDir, `pbs-extract-${process.pid}`); | ||
| if (fs.existsSync(tempExtractDir)) { | ||
| fs.rmSync(tempExtractDir, { recursive: true }); | ||
| } | ||
| fs.mkdirSync(tempExtractDir, { recursive: true }); | ||
|
|
||
| console.log(`[PBS] Extracting ${tarPath} to ${tempExtractDir}`); | ||
| tar.x({ sync: true, file: tarPath, cwd: tempExtractDir }); | ||
|
|
||
| const extractedRoot = path.join(tempExtractDir, 'python'); | ||
| if (!fs.existsSync(extractedRoot)) { | ||
| throw new Error(`PBS archive layout unexpected: ${extractedRoot} not found`); | ||
| } | ||
|
|
||
| util.emptyDirSync(installDir); | ||
| // Move the extracted python/ contents into installDir | ||
| for (const entry of fs.readdirSync(extractedRoot)) { | ||
| fs.renameSync(path.join(extractedRoot, entry), path.join(installDir, entry)); | ||
| } | ||
|
|
||
| fs.rmSync(tempExtractDir, { recursive: true }); | ||
| fs.rmSync(tarPath); |
There was a problem hiding this comment.
Staging artifacts not cleaned up on error
If tar.x(...) throws, or if any renameSync call fails, both tarPath (the downloaded tarball) and tempExtractDir are left on disk. On the next invocation the tarball is re-downloaded and a new temp dir is created alongside the orphaned one, but the stale tarball is silently overwritten. Wrapping the extraction block in a try/finally that calls fs.rmSync(tarPath, { force: true }) and fs.rmSync(tempExtractDir, { recursive: true, force: true }) would keep the staging directory clean across retries.
Prompt To Fix With AI
This is a comment left during a code review.
Path: builder/src/pbs.ts
Line: 46-72
Comment:
**Staging artifacts not cleaned up on error**
If `tar.x(...)` throws, or if any `renameSync` call fails, both `tarPath` (the downloaded tarball) and `tempExtractDir` are left on disk. On the next invocation the tarball is re-downloaded and a new temp dir is created alongside the orphaned one, but the stale tarball is silently overwritten. Wrapping the extraction block in a `try/finally` that calls `fs.rmSync(tarPath, { force: true })` and `fs.rmSync(tempExtractDir, { recursive: true, force: true })` would keep the staging directory clean across retries.
How can I resolve this? If you propose a fix, please make it concise.| const fileName = `cpython-${version}+${PBS_RELEASE_TAG}-${triple}-install_only.tar.gz`; | ||
| const url = `https://github.com/astral-sh/python-build-standalone/releases/download/${PBS_RELEASE_TAG}/${fileName}`; | ||
|
|
||
| const stagingDir = path.dirname(installDir); | ||
| const tarPath = path.join(stagingDir, fileName); | ||
|
|
||
| console.log(`[PBS] Downloading ${url}`); | ||
| await util.downloadFile(url, tarPath); |
There was a problem hiding this comment.
No integrity check on the downloaded tarball
PBS publishes a .sha256 file alongside every tarball (same URL with .sha256 appended). Without verifying the checksum after download, a corrupted transfer or a compromised CDN response would silently extract a bad interpreter and produce a broken Python distribution. Consider downloading the companion .sha256 and comparing against crypto.createHash('sha256').update(fs.readFileSync(tarPath)).digest('hex') before extracting.
Prompt To Fix With AI
This is a comment left during a code review.
Path: builder/src/pbs.ts
Line: 39-46
Comment:
**No integrity check on the downloaded tarball**
PBS publishes a `.sha256` file alongside every tarball (same URL with `.sha256` appended). Without verifying the checksum after download, a corrupted transfer or a compromised CDN response would silently extract a bad interpreter and produce a broken Python distribution. Consider downloading the companion `.sha256` and comparing against `crypto.createHash('sha256').update(fs.readFileSync(tarPath)).digest('hex')` before extracting.
How can I resolve this? If you propose a fix, please make it concise.Replace portable-python source compilation with prebuilt CPython tarballs from astral-sh/python-build-standalone on macOS and Linux. Removes the libinstall parallel-make race that breaks macOS CI intermittently, drops the Linux Docker build container, and cuts ~10–15 minutes of CPython compile time per cache-miss build. Windows path is unchanged. Deleted: builder/src/linux.ts, builder/src/macos.ts, builder/docker/. The pre-existing lib consolidation and rpath patching were needed only because portable-python's output lacked relocatable load paths; PBS tarballs ship with @executable_path / \$ORIGIN-relative paths and bundled deps, so the post-build fixup steps are no longer required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub release asset URLs return 302 to S3 — the previous downloadFile rejected any non-200 status, so PBS tarball fetches failed immediately. Recurse on 3xx with a Location header, capped at 5 hops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PBS install_only tarballs do not include a usable Tcl/Tk runtime in the context of the bare-import checker. PIL._imagingtk and matplotlib.backends._tkagg are GUI-only hooks that fail on standalone import even when CPython is fine. These runtimes are not used for plotting/GUI, so the extensions are safe to skip. Added pillow-*.whl and matplotlib-*.whl entries to all non-Windows whitelists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
545d436 to
217d0df
Compare
PBS Linux Python ships with CC=clang in its distutils sysconfig. System libraries built on Ubuntu (e.g. R) emit `-fopenmp` in their ldflags, intended for gcc+libgomp. clang interprets `-fopenmp` as `-lomp`, which is not present on standard CI runners. Source builds (rpy2-rinterface, etc.) fail at link time with "cannot find -lomp". Set CC=gcc/CXX=g++ before pip download on Linux. gcc+libgomp is what portable-python used and what the system R was built against — extensions built this way match the host toolchain transparently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
defaultExecOpts captured process.env at module load, which discarded any later mutations like the CC=gcc override on Linux. Switch to a factory so each spawn sees the current process.env, plus whitelist numba tbbpool on sccoda linux-x64 (already present on h5ad/rapids). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Git-pinned projects like parapred-pytorch often omit setuptools from their [build-system].requires, so PEP 517's isolated venv lacks a build backend and pip download fails with "Cannot import 'setuptools.build_meta'". Switch git URL specs to --no-build-isolation and pre-install setuptools+wheel on the host Python so the build backend is available. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stop compiling CPython from source on CI for macOS and Linux. Download prebuilt relocatable tarballs from
astral-sh/python-build-standalone(hereafter PBS) — the same artifacts that poweruv,rye,mise, andpyenv-install. Windows path is untouched.Diff: −354 lines, +95 lines. Removed 2 builder modules (
linux.ts,macos.ts) and the Docker environment (builder/docker/).Why
1. Fixes flaky CI on macOS
In PR #79, four macOS jobs failed with the same error:
This is a race condition in CPython's
libinstalltarget: parallel sub-targets simultaneously runinstall -don the same directory, and BSDinstallon macOS is not idempotent (unlike GNU). On the same workflow run,macos-14 arm64 h5adpassed whilemacos-14-large amd64 h5adfailed — a non-deterministic race. Past "successful" main runs were turbo cache hits at 1.6 minutes; a real cold build hadn't happened in a while.PBS eliminates the source of the problem entirely — we no longer build CPython.
2. Speeds up CI
The CPython compilation phase goes away — roughly 10–15 min per cache-miss × 8 cache-miss jobs on a fresh PR ≈ 1.5 hours of machine time. On macOS/Linux, building a package is now: download tarball +
pip downloadwheels. Fits comfortably in 2–3 minutes.3. Shrinks our surface area
Removed:
builder/src/macos.ts(100 lines) —consolidateLibscopied external dylibs and patchedinstall_name. PBS tarballs are already relocatable (@executable_path/...).builder/src/linux.ts(168 lines) —consolidateLibs+patchelffor RPATH. PBS Linux already uses$ORIGIN-relative RPATH and bundles dependencies insidelib/.builder/docker/(Dockerfile + build.sh) — the Rocky Linux 8 container existed solely to build Python from source against an old glibc. PBS Linux is built against glibc 2.17 (manylinux2014).buildFromSourcesandbuildInDockerinbuild.ts.isInBuilderContainer/BUILD_CONTAINERenv plumbing inutil.ts.Less code, fewer bugs of our own making. The
libinstallrace is not our bug and now is not our problem.4. Uses the industry standard
python-build-standaloneis maintained by Astral and is the backend foruv,rye,mise,pyenv-install(optional),Hatch(optional), andBazel rules_python. These tarballs run through CI with coverage much wider than ours; security/CVE updates land without our involvement; CPython Makefile bugs get patched upstream before they reach us.What changes
Architecturally
Before
pipx run portable-python build 3.12.10→ unpack →consolidateLibs(otool/install_name_tool) →pip downloadpipx run portable-python build 3.12.10→ unpack →consolidateLibs(ldd/patchelf) →pip downloadpython-X.Y.Z-embed-amd64.zipfrom python.org → patch pip wheel → installAfter
cpython-${VERSION}+${PBS_TAG}-${TRIPLE}-install_only.tar.gz→ unpack →pip downloadKey code
builder/src/pbs.ts(new, 75 lines):Triples:
x86_64-apple-darwinaarch64-apple-darwinx86_64-unknown-linux-gnuaarch64-unknown-linux-gnuIn
build.ts, thecase 'macosx'andcase 'linux'branches are merged; both callpbs.getPortablePython.What does NOT change
20250529ships exactly this patch version.loadPackages/downloadPackages/runNativeImportCheckerlogic.config.jsonfor eachpython-3.12.10-*).pydist/**/bin/python, etc.) remain valid.Trade-offs
We gain
libinstallrace.We give up
releases.github.comis down, our CI is down. Mitigation: retry with backoff, or mirror to our S3 (follow-up).--with-ltoetc., we can switch topgo+lto-fullPBS variants or revert.PBS_RELEASE_TAGbump — a recurring maintenance task that didn't exist before.Risk & rollback
The change is isolated to
builder/. If CI reveals incompatibility (unlikely — manylinux2014 / macOS 13.0 baseline is broader than portable-python's), rollback = revert one commit.Testing plan
runNativeImportCheckerpasses for all packages without whitelist changes (the.soset in PBSlib-dynload/matches portable-python's).scanpy,scipy).Future work (not in this PR)
20260510includes 3.12.13).*-pc-windows-msvc-install_only.tar.gz, which would let us unifywindows.tswith the new path. Skipped here because Windows isn't broken and its pip-patch step is non-trivial.