Fix Android mobile-forge support artifact relocation#5
Conversation
… normalize_mobile_forge_install.py
The published python-android-mobile-forge-*.tar.gz bakes /home/runner/ndk/<letter>/... paths into sysconfigdata and ships libpython3.so as an ELF stub. We hold the line in two places until flet-dev/python-build#5 ships a new tarball that fixes both upstream: 1. install_ndk.sh: also symlink the sdkmanager install at $HOME/ndk/<letter>/ so the baked sysconfigdata paths resolve to a real toolchain. Only fires when input is letter-form (the workflow's FORGE_NDK_VERSION=r27d path); component-form callers opt out. 2. build-wheels.yml: keep the inline libpython3.so → linker-script loop since the published tarball still ships the ELF stub. Both blocks are wrapped in `CLEANUP-AFTER: flet-dev/python-build#5` markers — grep that string post-merge and delete the bracketed sections in one pass. They're guarded by idempotent checks so they no-op when the new tarball lands, but the dead code is worth removing for cleanliness.
CPython's Makefile rule for libpython skips `-Wl,-soname` when INSTSONAME == LDLIBRARY, which is the case on Android (both are `libpython3.X.so`). The shipped libpython3.X.so therefore has no DT_SONAME entry. The visible consequence is that consumer wheels' DT_NEEDED entries get whatever name the linker was asked for — notably, any link path that opens `libpython3.so` via a real-lib symlink writes `libpython3.so` into DT_NEEDED instead of `libpython3.12.so`, and the device fails to dlopen at runtime. Setting SONAME via patchelf right after `make install` makes consumer wheels' DT_NEEDED entries correct regardless of how the link was reached (stub, symlink, direct -lpython3.12, etc.). 3.13+ uses CPython's official Android tooling, which sets SONAME natively — this fix only applies to the legacy 3.12 build path.
|
Pushed WhyWhile investigating an actual CPython's When SONAME is missing,
Interaction with the existing fix in this PRThe
Scope notes
|
flet-dev/python-build#5 has landed. The new v3.12 tarball relocates sysconfigdata's baked NDK paths at runtime AND ships libpython3.so as a proper linker script (with libpython3.12.so now carrying its DT_SONAME, per Feodor's follow-up patchelf fix). Both consumer-side band-aids in this repo become no-ops against that tarball — remove them. Removed: 1. `.ci/install_ndk.sh`: the `$HOME/ndk/<letter>` compat-symlink block. Existed because sysconfigdata baked `/home/runner/ndk/<letter>/...` compiler paths, so crossenv looked for clang there. PR flet-dev#5's tarball-side relocation consults `$NDK_HOME` (and a fallback list) at sysconfig import time, so the symlink is no longer needed. 2. `.github/workflows/build-wheels.yml`: the `for libpy in ...` loop that replaced libpython3.so with a GNU ld linker script `INPUT ( -lpython3.12 )`. PR flet-dev#5's `replace_libpython_stub` writes the same linker script in-place at tarball-assembly time, and the new SONAME on libpython3.12.so closes the link-name-fallback path that originally made the workaround necessary. The loop is now a no-op against any extracted tarball — delete it. bash -n + YAML validation both pass.
CrossVEnv.cross_kwargs() builds the env dict it later passes to
subprocess.run(env=…). It used to start from venv_kwargs.get("env", {})
— i.e. {} when the caller didn't supply env — and add PATH and
VIRTUAL_ENV on top. Because subprocess.run(env=<dict>) REPLACES (not
augments) the child env, the child process saw a near-empty environment
(PATH + VIRTUAL_ENV only). NDK_HOME, HOME, LANG, and everything else
the parent had were silently dropped.
That broke any downstream tool that reads os.environ from a forge
cross subprocess. Most recently: python-build's sysconfigdata
relocator (flet-dev/python-build#5), which crossenv exec_module()'s
when forge spawns it (forge/cross.py:334 → crossenv/__init__.py:217).
The relocator wants NDK_HOME to rewrite the build-time path
'/home/runner/ndk/r27d/…' baked into _sysconfigdata__linux_.py to the
local NDK on the runner. With the wipe in place the lookup returned
None, fell through to ~/ndk/<letter>/ legacy compat (which our
.ci/install_ndk.sh used to symlink), and only worked because of that
symlink. Once we dropped the symlink (e09628a "CLEANUP-AFTER:
flet-dev/python-build#5"), every Android wheel build failed with
ERROR: Cannot find cross-compiler (
'/home/runner/ndk/r27d/toolchains/llvm/prebuilt/linux-x86_64/
bin/i686-linux-android24-clang')!
at crossenv/__init__.py:407 because the relocator's NDK_HOME branch
never fired.
# Options considered
Option A — Revert e09628a, restore the ~/ndk/<letter>/ legacy symlink
in install_ndk.sh.
+ One-line, immediate.
- Keeps the band-aid; the relocator's NDK_HOME branch stays dead
in CI.
- Doesn't fix any future tool that wants HOME / LANG / etc.
Rejected: just hides the bug.
Option B — Surgical allowlist in cross_kwargs. Add only the env vars
downstream tooling needs:
for var in ("NDK_HOME", "ANDROID_NDK_HOME", "HOME"):
if var in os.environ:
env.setdefault(var, os.environ[var])
+ Minimal blast radius — three vars, every existing strip stays.
+ Direct, easy to defend in an upstream PR ("these three are
required by python-build relocation").
+ setdefault preserves caller-supplied env semantics.
- Partial: next tool that needs USER / XDG_CACHE_HOME / LANG /
whatever brings us right back here.
- Couples mobile-forge to the relocator's contract.
- Doesn't answer "why is env scrubbed at all?".
Option C — Inherit-by-default (this commit). Change the default branch
of `venv_kwargs.get("env", …)` from {} to os.environ.copy(), so the
env-replace becomes opt-in (pass env=<dict> to wipe) rather than the
accidental default. PATH / VIRTUAL_ENV / PYTHONHOME overrides on top
unchanged.
+ Fixes the root cause: subprocess sees a sensible env, matching
ordinary subprocess.run semantics.
+ Future-proofs every tool that reads os.environ — no more
whack-a-mole.
+ Removes a foot-gun (the {} default was almost certainly an
oversight: PATH has a justification comment, the wipe doesn't).
+ Net-shorter diff than B (the try/except PYTHONHOME also
collapses to .pop()).
- Broader behavior change: every cross subprocess now inherits the
runner env. A build that accidentally relied on a var being
ABSENT would now break.
- Harder to land upstream — reviewers will want a regression
survey.
# Why C now
Picked C because:
* It actually fixes the mechanism instead of patching its symptom.
* The wipe was an accident (the PATH override comment justifies PATH,
nothing in the code justifies stripping the rest).
* If a recipe build was relying on accidental env scrubbing, that's
a latent bug we want surfaced now, not later.
* If C ever needs to be partially reverted, B's allowlist remains
the documented migration path — re-add the {} default and
re-introduce the surgical allowlist.
Fix flet-dev/mobile-forge#60