Security concern: runtime binary download with self-signed integrity check #46

@omeridel

Summary

The bootstrap introduced in a3s-code v3.2.1 fetches a compiled native extension from GitHub Releases at import time and verifies it against a SHA256 manifest hosted on the same GitHub account. We flagged this pattern during security monitoring and wanted to share the findings constructively — no malicious code was found, but the trust model has some gaps worth addressing.

Details

_bootstrap.py downloads _native.<abi>.so from github.com/AI45Lab/Code/releases on first import a3s_code, then validates the SHA256 against python-native-manifest.json — also hosted on the same account and release tag.

Three specific concerns:

1. Circular integrity check
The manifest and the binary are controlled by the same party. An attacker who compromises the AI45Lab GitHub account can swap the binary and update the hash simultaneously. The check provides no protection against account compromise.

2. Silent skip on manifest failure

except BootstrapError as exc:
    sys.stderr.write("...skipping hash check\n")
    return None  # proceeds without verification

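If the manifest is unreachable (a 404 or a network error), the bootstrap loads the binary it downloaded without any integrity check. The check therefore fails open: whenever the manifest request fails, for whatever reason, verification is silently disabled.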
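A fail-closed variant of this step could look like the sketch below. It is an illustration only, not the project's code: `verify_or_abort` and `fetch_manifest` are hypothetical names, `BootstrapError` is redefined here as a stand-in, and the manifest is assumed to be a mapping from file name to hex SHA256 digest, which may not match the real python-native-manifest.json layout.

```python
import hashlib
from typing import Callable, Mapping


class BootstrapError(Exception):
    """Stand-in for the exception type used by _bootstrap.py."""


def verify_or_abort(
    payload: bytes,
    filename: str,
    fetch_manifest: Callable[[], Mapping[str, str]],
) -> None:
    """Raise BootstrapError unless payload matches the manifest entry.

    fetch_manifest is a hypothetical callable returning {filename: sha256_hex}.
    """
    try:
        manifest = fetch_manifest()
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        # Fail closed: an unreachable manifest must not disable the check.
        raise BootstrapError("manifest unavailable; refusing to load binary") from exc

    expected = manifest.get(filename)
    if not expected:
        raise BootstrapError(f"no hash listed for {filename}")

    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected.strip().lower():
        raise BootstrapError(f"SHA256 mismatch for {filename}")
```

The caller would let `BootstrapError` propagate and abort the import rather than catch it and continue.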

3. Import-time download
Downloading at import time rather than at pip install time means the fetch bypasses lockfile pinning and dependency scanners, and it surfaces as an unexpected runtime network call in air-gapped CI environments, which would otherwise have caught it at install time.
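One way a downstream project could surface this kind of import-time network access in its own test suite is sketched below. The names `forbid_network` and `NetworkAccessBlocked` are hypothetical, and the guard only covers code paths that go through Python's `socket` module, so it is a rough tripwire rather than a sandbox.

```python
import contextlib
import socket
from unittest import mock


class NetworkAccessBlocked(RuntimeError):
    """Raised when code inside forbid_network() tries to use the network."""


@contextlib.contextmanager
def forbid_network():
    """Temporarily make socket connections and DNS lookups raise."""

    def _deny(*args, **kwargs):
        raise NetworkAccessBlocked("network access attempted")

    # Patch both the connect method and name resolution; both are restored
    # when the context manager exits.
    with mock.patch.object(socket.socket, "connect", _deny), \
            mock.patch.object(socket, "getaddrinfo", _deny):
        yield
```

Wrapping the import in a test, for example `with forbid_network(): import a3s_code`, would then fail loudly if the bootstrap attempts a download.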

Risk scenario

If the AI45Lab GitHub account is ever compromised, an attacker can replace the release binary and update the manifest hash in a single operation. Every machine that runs import a3s_code afterward fetches and executes the new binary with full Python process access — with no change to the PyPI artifact, no diff in requirements.txt, and nothing for scanners to detect.

Suggested mitigations

  • Validate against a pinned hash in the PyPI wheel itself — the expected hash lives in the published, immutable wheel rather than a mutable GitHub file
  • Move the download to install time rather than import time (for example by publishing platform-specific wheels that bundle the extension, or via a build backend hook for source installs), so it's visible to dependency scanners and respects air-gapped environments
  • Restore the A3S_CODE_OFFLINE env var from v1.x to allow sandboxed environments to block the network call
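The first and third mitigations could be combined roughly as in the sketch below. Everything here is an assumption for illustration: `load_native_bytes`, `offline_requested`, the `pinned` mapping, and the `fetch` callable are hypothetical, and the values accepted for A3S_CODE_OFFLINE are guesses rather than the v1.x semantics.

```python
import hashlib
import os
from typing import Callable, Mapping


def offline_requested(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True if A3S_CODE_OFFLINE is set to a truthy value (assumed set)."""
    return environ.get("A3S_CODE_OFFLINE", "").strip().lower() in {"1", "true", "yes"}


def load_native_bytes(
    filename: str,
    pinned: Mapping[str, str],
    fetch: Callable[[str], bytes],
    environ: Mapping[str, str] = os.environ,
) -> bytes:
    """Download filename via fetch and check it against a hash shipped in the wheel.

    pinned is a hypothetical {filename: sha256_hex} table baked into the package
    at release time, so it cannot change without a new PyPI upload.
    """
    if offline_requested(environ):
        raise RuntimeError("A3S_CODE_OFFLINE is set; refusing to download")

    expected = pinned.get(filename)
    if expected is None:
        raise RuntimeError(f"no pinned hash for {filename}")

    payload = fetch(filename)
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected.lower():
        raise RuntimeError(f"SHA256 mismatch for {filename}")
    return payload
```

Because the expected hash ships with the package, replacing the release binary on GitHub would produce a mismatch instead of a silent update.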
