feat: replace picklescan with Rust-native engine#990
Merged
Conversation
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces the old in-repo Python pickle opcode scanner with the Rust-native
modelaudit-picklescanpackage and wires ModelAudit to use that Rust scanner as the only pickle engine. The rootPickleScanneris now a thin integration layer for file/container handling, bounded ModelAudit compatibility checks, andScanResultadaptation.This PR also removes the Python picklescan engine-selection path, deletes the old Python opcode-support module, adds Rust CI/build/smoke coverage, and keeps the public Python API/report models for users of the standalone package.
What Changed
modelaudit_picklescan._rust.MODELAUDIT_PICKLESCAN_ENGINE=pythonselection path; the Python package now calls Rust directly.eval/exec,__import__, dangerous protocol-0 globals, copyreg extensionREDUCE, non-allowlisted__main__references,GLOBAL __dict__import-only references, encoded execution strings, and malformed/truncated pickles.importlib# ...text as critical and tightens the CVESETITEMcompatibility check so plainsystembytes do not look like aSETITEMopcode.Equivalence Evidence
Before removing temporary comparison harnesses from the branch, I ran a historical equivalence suite against
origin/mainacross standalone package bytes, standalone package stream, root stream, and root file surfaces.Final full-corpus run:
The two operational improvements are the negative stream-size cases: old main treated
size=-1as empty input or failed closed, while the Rust-backed path now treats negative size as unknown size and scans the actual stream. That directly addresses the review finding around negative stream sizes.The strengthened results are additional detections or stricter fail-closed behavior, including base64/encoded execution strings, malformed missing-STOP inputs, dangerous alias globals, and root compatibility aliases. No main-detected malicious signal was lost, and no safe fixture regressed to malicious in the equivalence corpus.
Current branch fixture gate before cleanup also passed across 34 committed pickle fixtures and 102 package/adapter/root comparisons.
Benchmark Results
Latest incremental benchmark for this commit (macOS arm64, Python 3.11, release-built editable maturin extension via
uv run --with 'maturin>=1.9,<2' maturin develop --release --manifest-path packages/modelaudit-picklescan/Cargo.toml, thenuv run --with pytest-benchmark pytest tests/benchmarks/test_picklescan_benchmarks.py --benchmark-json /tmp/modelaudit-picklescan-bench-after-release.json --benchmark-min-rounds=16 -q):long_benign_stringsafe_largechunked_streamhidden_suspicious_string_budgetos.system('id')nested_hexnested_base64malicious_reducestack_globalA user-facing CLI subprocess smoke benchmark using
.venv/bin/modelaudit scan --no-cache -q -f jsonmeasured median 277.1 ms for the 1 MiB benign literal and 221.8 ms for a tiny safe pickle, which shows the current wall time is dominated by Python process startup/root CLI orchestration rather than native pickle scanning.Latest local benchmark after the hot-path work using Python 3.12.12 and the rebuilt editable maturin extension. Each value is the median of 5 runs through both standalone
modelaudit_picklescan.scan_bytesand the user-facingPickleScanner().scan(path)path.safe_8m_bytessafe_8m_stringhidden_mid_stringS101suspicious stringA pre-optimization profile showed the Python CLI path was dominated by duplicated compatibility/root detectors rather than Rust itself. The hot-path changes move the root CLI path much closer to standalone Rust while preserving the compatibility signals above.
Validation
Latest local validation after rebasing onto the PR branch tip and pushing commit
d6e05ac:cargo fmt --manifest-path packages/modelaudit-picklescan/Cargo.toml -- --checkcargo check --manifest-path packages/modelaudit-picklescan/Cargo.tomlcargo clippy --manifest-path packages/modelaudit-picklescan/Cargo.toml --all-targets -- -D warningscargo test --manifest-path packages/modelaudit-picklescan/Cargo.toml(77 passed)PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest packages/modelaudit-picklescan/tests/test_api.py packages/modelaudit-picklescan/tests/test_adversarial_pickle_oracle.py -q(567 passed)uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/(381 files already formatted)uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/(All checks passed)uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/(Success: no issues found in 436 source files)PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1(4224 passed, 13 skipped, 21 warnings)