Skip to content

fix(git-env): strip LD_LIBRARY_PATH from git subprocess env to prevent shared lib conflicts#1536

Merged
danielmeppiel merged 4 commits into
microsoft:mainfrom
pngdeity:fix/ld-library-path-leak
May 30, 2026
Merged

fix(git-env): strip LD_LIBRARY_PATH from git subprocess env to prevent shared lib conflicts#1536
danielmeppiel merged 4 commits into
microsoft:mainfrom
pngdeity:fix/ld-library-path-leak

Conversation

@pngdeity
Copy link
Copy Markdown
Contributor

Description

Strips LD_LIBRARY_PATH from the git subprocess environment to prevent PyInstaller-bundled shared libraries from leaking into child processes.

PyInstaller's bootloader sets LD_LIBRARY_PATH so the bundled Python can find libpython3.12.so. This leaks into every subprocess.run() call, including materialize_from_bare() in bare_cache.py. Bundled shared libs (libreadline.so.8, libz.so.1, libsqlite3.so.0, etc.) shadow system libraries, breaking /bin/sh:

/bin/sh: symbol lookup error: /bin/sh: undefined symbol: rl_print_keybinding
fatal: Could not read from remote repository.

This causes git clone --local --shared --no-checkout to exit 128 in the shared clone cache path (SharedCloneCache).

Fix 1 (primary): Add LD_LIBRARY_PATH to _STRIP_GIT_VARS in git_env.py. This sanitizer exists precisely to clean ambient variables from git subprocess environments — LD_LIBRARY_PATH leaking is a sanitization failure.

Fix 2 (defense-in-depth): Expand the apm.spec exclusion list to remove all non-Python shared libs from the bundle. The original exclusion only covered OpenSSL (issue #462). This extends it to libreadline, libz, liblzma, libsqlite3, libffi, libbz2, libuuid, and libtinfo — preventing any future env-propagation regression.

Fixes #1534

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally (built patched binary, validated in real APM project with 4 virtual subdirectory deps from the same repo — previously failed with exit 128, now installs clean)
  • All existing tests pass (15,668 passed; 4 files skipped due to pre-existing git config issues in test environment unrelated to this PR)
  • Added tests for new functionality (test_strips_ld_library_path in test_git_env.py)

Spec conformance (OpenAPM v0.1)

  • N/A -- this PR does not change OpenAPM-observable behaviour.

…ared lib conflicts

PyInstaller's bootloader sets LD_LIBRARY_PATH to _internal/ so
libpython3.12.so is discoverable. This leaks into every subprocess
spawned by APM, including git clone in materialize_from_bare().
Bundled shared libs (libreadline.so.8, libz.so.1, libsqlite.so.0,
etc.) shadow system libraries, breaking /bin/sh with undefined
symbol errors (e.g. rl_print_keybinding), which causes git clone
--local --shared to exit 128.

Fix 1: Add LD_LIBRARY_PATH to _STRIP_GIT_VARS in git_env.py so
git_subprocess_env() strips it before spawning git.

Fix 2: Expand the apm.spec exclusion list to remove all
non-Python shared libs from the bundle (originally only OpenSSL
was excluded for issue microsoft#462). Defense-in-depth against future
env-propagation regressions.

Closes: similar to microsoft#462 but broader
Copilot AI review requested due to automatic review settings May 28, 2026 18:56
@pngdeity
Copy link
Copy Markdown
Contributor Author

@microsoft-github-policy-service agree

1 similar comment
@pngdeity
Copy link
Copy Markdown
Contributor Author

@microsoft-github-policy-service agree

@danielmeppiel
Copy link
Copy Markdown
Collaborator

APM Review Panel: ship_with_followups

This community fix addresses a real Linux packaged-binary failure by preventing PyInstaller library paths from leaking into git subprocesses.

cc @pngdeity @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

The panel agrees the bug is real and the proposed direction is valuable: packaged Linux users should not see git clone fail because bundled shared libraries shadow system libraries. The strongest signal is architectural, not about the report itself: this repository already has external_process_env() as the central PyInstaller child-process environment sanitizer, including _ORIG restoration semantics. Adding LD_LIBRARY_PATH directly to git_env.py fixes the immediate failure, but creates a second sanitizer with different behavior.

No panelist raised a blocking-severity finding. The main recommendation is to keep the fix, but route git subprocess environments through the central helper so the PyInstaller cleanup semantics stay in one place. The CHANGELOG item is also worth folding because this closes a visible Linux reliability issue from a community contributor.

Aligned with: secure by default: child processes should not inherit bundled shared libraries accidentally; pragmatic as npm: packaged installs should behave like normal system git; oss community driven: the PR turns a community-reported platform failure into a visible reliability improvement.

Growth signal. Once merged, this is a useful Linux reliability note for the next release: it shows platform parity and responsiveness to packaged-binary users.

Panel summary

Persona B R N Takeaway
Python Architect 0 1 2 LD_LIBRARY_PATH strip in git_env.py duplicates and contradicts the _ORIG-restoration pattern already in subprocess_env.py; recommend composing the two.
CLI Logging Expert 0 0 0 No CLI output, logging, or diagnostic rendering surfaces touched; ship.
DevX UX Expert 0 0 0 No CLI command surface, help text, error wording, or user-facing flow changed; ship.
Supply Chain Security Expert 0 0 2 No exploitable regression; two defense-in-depth hardening notes on soname matching and env-var sanitization.
OSS Growth Hacker 0 1 1 Solid community fix that removes a Linux adoption blocker; recommend a CHANGELOG entry to surface it as a trust signal for packaged-binary users.
Test Coverage Expert 0 0 1 LD_LIBRARY_PATH stripping has unit coverage; _exclude_libs membership in build/apm.spec lacks a guard test (nit).

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 4 follow-ups

  1. [Python Architect] Compose git_subprocess_env() from external_process_env(), then filter git ambient state -- This keeps PyInstaller _ORIG restoration and git-specific cleanup in their existing layers, avoiding two LD_LIBRARY_PATH sanitizers with different semantics.
  2. [OSS Growth Hacker] Add a CHANGELOG entry for the Linux packaged-binary git clone fix -- This turns a closed adoption blocker into a discoverable reliability signal for users who do not read commit logs.
  3. [Test Coverage Expert] Consider a unit guard for the expanded build/apm.spec exclusion list -- The primary LD_LIBRARY_PATH behavior is tested, but the defense-in-depth exclusion list can silently shrink without a CI signal.
  4. [Supply Chain Security Expert] Track broader dynamic-loader env hardening as a follow-up -- A future hardening pass can decide centrally whether variables such as LD_PRELOAD or platform equivalents should be sanitized for subprocesses.

Architecture

classDiagram
    direction LR
    class external_process_env {
        <<Pure Function>>
        +external_process_env(base) dict
    }
    class git_subprocess_env {
        <<Pure Function>>
        +git_subprocess_env() dict
    }
    class _STRIP_GIT_VARS {
        <<FrozenSet>>
        GIT_DIR, GIT_WORK_TREE, ...
    }
    class GitCache {
        +clone_repo()
        +fetch_ref()
    }
    git_subprocess_env ..> _STRIP_GIT_VARS : filters git state
    git_subprocess_env ..> external_process_env : should compose
    GitCache ..> git_subprocess_env : calls
Loading
flowchart TD
    A[apm install] --> B[Git clone/cache path]
    B --> C[git_subprocess_env]
    C --> D[external_process_env should restore LD_LIBRARY_PATH_ORIG]
    D --> E[strip ambient GIT_* state]
    E --> F[subprocess.run git]
    G[PyInstaller bootloader] --> H[sets LD_LIBRARY_PATH]
    H --> D
Loading

Recommendation

Ship with follow-ups. The immediate bug fix is valuable, especially for Linux packaged-binary users, but the safest shape is to fold the central-helper composition and the CHANGELOG entry before merge if maintainer bandwidth allows. The remaining items can be tracked as normal hardening/test polish.


Full per-persona findings

Python Architect

  • [recommended] git_subprocess_env unconditionally strips LD_LIBRARY_PATH instead of delegating to external_process_env which correctly restores _ORIG values at src/apm_cli/utils/git_env.py:53
    src/apm_cli/utils/subprocess_env.py already centralizes PyInstaller child-process environment restoration: when frozen, it restores LD_LIBRARY_PATH from LD_LIBRARY_PATH_ORIG or removes the bootloader value, preserving legitimate user-set values. Adding LD_LIBRARY_PATH to _STRIP_GIT_VARS creates a second sanitizer with different semantics and can discard valid user library paths for git subprocesses. The correct architecture is external_process_env first, then git ambient-state stripping.
    Suggested: Build git_subprocess_env() from external_process_env(), then filter _STRIP_GIT_VARS. Remove LD_LIBRARY_PATH from _STRIP_GIT_VARS.
  • [nit] Module docstring mixes PyInstaller library-path cleanup with ambient git-state cleanup at src/apm_cli/utils/git_env.py:12
    LD_LIBRARY_PATH is not git ambient state. Keeping PyInstaller restoration in subprocess_env.py avoids category drift in git_env.py.
  • [nit] libz.so.1 exclusion in apm.spec could use a short note about Python zlib safety at build/apm.spec:306
    Future maintainers may worry that excluding libz breaks Python's zlib support. A short comment would reduce revert risk.

CLI Logging Expert

No findings.

DevX UX Expert

No findings.

Supply Chain Security Expert

  • [nit] Exact soname exclusion may miss future or older library sonames at build/apm.spec
    The apm.spec filter uses exact string matches for sonames. If a future platform uses a different soname version suffix, the defense-in-depth exclusion can silently stop covering that library family.
    Suggested: Consider prefix or fnmatch matching for library families where soname churn is expected.
  • [nit] Consider whether LD_PRELOAD and platform dynamic-loader injection vars need sanitizer coverage at src/apm_cli/utils/subprocess_env.py
    LD_PRELOAD and similar platform variables can inject libraries into child processes. This PR fixes the PyInstaller LD_LIBRARY_PATH failure; a follow-up threat-model pass could decide whether broader dynamic-loader sanitization belongs in subprocess_env.py.
    Suggested: If in scope, add a narrowly documented denylist for loader injection variables in the central subprocess env helper, not only git_env.py.

OSS Growth Hacker

  • [recommended] PR body is release-note-ready but CHANGELOG.md is not updated -- mine this fix for a story beat at CHANGELOG.md
    This fix resolves a hard blocker for Linux packaged-binary users (git clone exit 128). A one-liner in CHANGELOG under [Unreleased] converts it into a discoverable trust signal for future Linux adopters.
    Suggested: Add under [Unreleased] > Fixed: Fixed git clone failures on Linux when running the packaged binary (bundled shared libraries no longer shadow system libs) ([BUG] LD_LIBRARY_PATH leaks bundled shared libs into git subprocesses, breaking shared-clone materialization #1534).
  • [nit] Issue references in apm.spec comment should use full https URLs at build/apm.spec:294
    Clickable URLs help future maintainers reach context quickly.
    Suggested: Use full URLs for issues 462 and 1534.

Auth Expert -- inactive

PR does not touch token management, credential resolution, AuthResolver, HostInfo, AuthContext, or any remote host authentication surface.

Doc Writer -- inactive

PR does not touch docs/src/content/docs/ or any user-facing documentation surface; no documentation work required.

Test Coverage Expert

  • [nit] No unit guard asserts that newly excluded libs in build/apm.spec stay in the _exclude_libs list at tests/unit/test_build_spec.py
    tests/unit/test_build_spec.py compiles and inspects build/apm.spec helper behavior, but grep found no assertions for _exclude_libs or the new sonames. If future edits drop an exclusion entry, no unit test calls it out. Severity is nit because git_env.py carries the primary fix and has a direct unit test.
    Suggested: Add an AST-based unit test that extracts _exclude_libs from build/apm.spec and asserts the expected sonames remain present.
    Proof (missing at): tests/unit/test_build_spec.py::test_linux_exclude_libs_contains_expected_libraries -- proves: The defense-in-depth packaged-binary exclusion list is not accidentally shrunk by future edits. [secure-by-default,devx]
    assert expected_libs <= exclude_libs

Performance Expert -- inactive

Touched files affect packaging and subprocess env setup, not cache layout, transport, parallelism, resolution, materialization, or wall-time instrumentation.

This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.

danielmeppiel and others added 3 commits May 30, 2026 17:47
Compose the git subprocess environment from the shared external-process sanitizer before stripping ambient git state. This keeps pngdeity's packaged-binary fix while preserving legitimate user library paths restored from PyInstaller *_ORIG variables, and adds the missing changelog note for microsoft#1534.

Addresses CEO follow-ups from the Phase 4 panel prior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rewrite the microsoft#1534 changelog entry around the user-visible standalone-binary clone failure while preserving the technical cause. This folds the growth/doc polish from the second panel pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel merged commit 68718de into microsoft:main May 30, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] LD_LIBRARY_PATH leaks bundled shared libs into git subprocesses, breaking shared-clone materialization

2 participants