fix: harden pickle nested bypass detection #1027

Merged
mldangelo-oai merged 4 commits into main from mdangelo/codex/fix-pickle-bypass-policy-20260415
Apr 16, 2026

@mldangelo-oai

Summary

  • Expand nested pickle prefix probing to detect valid no-PROTO binary opcode streams in raw bytes, base64, and hex literals.
  • Fail closed with an inconclusive critical finding when raw nested pickle probe candidates exceed the bounded probe budget.
  • Flag loader-termination, crash, resource-limit, and low-level process primitives in both Rust and Python policy tables.
  • Add regression coverage for the reported bypass payload classes and update the changelog.
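
The first bullet can be illustrated with a minimal Python sketch. It is not the scanner's Rust implementation: `opcode_names`, `decoded_views`, and `find_nested` are hypothetical helpers, and the payload is hand-assembled for illustration and is only parsed, never unpickled. The sketch assumes that "valid no-PROTO stream" means the bytes parse as pickle opcodes through STOP.

```python
import base64
import binascii
import pickletools
from typing import Iterator, List, Optional, Tuple

# Hand-assembled protocol-4-style stream with no PROTO header. It is only
# parsed here, never unpickled. Opcodes: SHORT_BINUNICODE 'os',
# SHORT_BINUNICODE 'system', STACK_GLOBAL, SHORT_BINUNICODE 'id',
# TUPLE1, REDUCE, STOP.
NO_PROTO_PAYLOAD = b"\x8c\x02os\x8c\x06system\x93\x8c\x02id\x85R."


def opcode_names(data: bytes) -> Optional[List[str]]:
    """Return opcode names if data parses as opcodes through STOP, else None."""
    try:
        return [op.name for op, _arg, _pos in pickletools.genops(data)]
    except Exception:
        return None


def decoded_views(literal: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield the literal as raw bytes plus any base64/hex decodings that succeed."""
    yield "raw", literal
    try:
        yield "base64", base64.b64decode(literal, validate=True)
    except (binascii.Error, ValueError):
        pass
    try:
        yield "hex", bytes.fromhex(literal.decode("ascii"))
    except ValueError:  # also covers UnicodeDecodeError
        pass


def find_nested(literal: bytes) -> List[Tuple[str, str]]:
    """Report (encoding, first opcode) for every view that parses as a pickle."""
    hits = []
    for kind, data in decoded_views(literal):
        names = opcode_names(data)
        if names:
            hits.append((kind, names[0]))
    return hits
```

Calling `find_nested(base64.b64encode(NO_PROTO_PAYLOAD))` reports a `base64` hit whose first opcode is `SHORT_BINUNICODE`, which is the case a PROTO-only prefix check would miss. A real probe needs stricter validity checks than this, since very short byte strings such as a lone STOP also parse.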

Security fixes

  • Finding 1: detects nested pickle payloads that start with opcodes such as SHORT_BINUNICODE instead of a PROTO header.
  • Finding 2: treats exhausted nested-probe budgets as unsafe/incomplete instead of silently stopping before later payloads.
  • Finding 3: flags builtins.exit and builtins.quit as dangerous callables.
  • Finding 4: flags faulthandler crash helpers, resource.setrlimit, and _posixsubprocess.fork_exec.
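
As a rough illustration of Findings 3 and 4, the sketch below checks the globals referenced by a pickle stream against a small deny table. The table layout and the helper names (`referenced_globals`, `flagged`) are assumptions made for this example, not the project's actual policy tables. The PR does not list the individual faulthandler helpers, so the sketch flags the whole module, which is also an assumption.

```python
import pickletools
from typing import List, Tuple

# Callables named in this PR.
DANGEROUS_CALLABLES = {
    ("builtins", "exit"),
    ("builtins", "quit"),
    ("resource", "setrlimit"),
    ("_posixsubprocess", "fork_exec"),
}
# Assumption: treat every faulthandler attribute as dangerous, because the
# PR only says "faulthandler crash helpers" without naming them.
DANGEROUS_MODULES = {"faulthandler"}

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def referenced_globals(data: bytes) -> List[Tuple[str, str]]:
    """Collect (module, name) pairs from GLOBAL and STACK_GLOBAL opcodes.

    STACK_GLOBAL is resolved heuristically from the two most recent string
    constants; memo-based indirection would need real stack tracking.
    """
    recent: List[str] = []
    found: List[Tuple[str, str]] = []
    for op, arg, _pos in pickletools.genops(data):
        if op.name in STRING_OPS:
            recent.append(arg)
        elif op.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, _, name = arg.partition(" ")
            found.append((module, name))
        elif op.name == "STACK_GLOBAL" and len(recent) >= 2:
            found.append((recent[-2], recent[-1]))
    return found


def flagged(data: bytes) -> List[Tuple[str, str]]:
    """Return the referenced globals that match the deny table."""
    return [
        ref
        for ref in referenced_globals(data)
        if ref in DANGEROUS_CALLABLES or ref[0] in DANGEROUS_MODULES
    ]
```

For example, `flagged(b"\x8c\x08builtins\x8c\x04exit\x93)R.")` returns the `("builtins", "exit")` reference without ever executing the payload.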

Validation

  • uv run --with maturin maturin develop --manifest-path packages/modelaudit-picklescan/Cargo.toml
  • cargo test --manifest-path packages/modelaudit-picklescan/Cargo.toml
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest packages/modelaudit-picklescan/tests/test_rust_engine.py packages/modelaudit-picklescan/tests/test_adversarial_pickle_oracle.py tests/scanners/test_pickle_scanner.py tests/scanners/test_picklescan_adapter.py -q
  • uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1
  • git diff --check

@github-actions

github-actions bot commented Apr 15, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 18 shared benchmarks with a regression threshold of 15%.
Status: 1 regression, 4 improved, 13 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 550.24ms -> 541.80ms (-1.5%).

Top regressions:

  • tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle +16.0% (99.5us -> 115.4us, safe_model.pkl, size=49.4 KiB, files=1)

Top improvements:

  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] -30.7% (898.5us -> 623.1us, long_benign_string, size=1.0 MiB, files=1)
  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] -24.8% (84.9us -> 63.8us, nested_hex, size=130 B, files=1)
  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] -20.8% (3.67ms -> 2.91ms, safe_large, size=278.2 KiB, files=1)

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] | long_benign_string | 1.0 MiB | 1 | 898.5us | 623.1us | -30.7% | improved |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] | nested_hex | 130 B | 1 | 84.9us | 63.8us | -24.8% | improved |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] | safe_large | 278.2 KiB | 1 | 3.67ms | 2.91ms | -20.8% | improved |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64] | nested_base64 | 98 B | 1 | 77.9us | 65.3us | -16.2% | improved |
| tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 99.5us | 115.4us | +16.0% | regression |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce] | malicious_reduce | 52 B | 1 | 52.6us | 46.0us | -12.6% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream | chunked_stream | 278.2 KiB | 1 | 5.86ms | 5.21ms | -11.1% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global] | stack_global | 21 B | 1 | 43.3us | 38.6us | -10.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget | hidden_suspicious_string | 8.0 KiB | 1 | 412.1us | 450.1us | +9.2% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 32.8us | 34.9us | +6.5% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw] | nested_raw | 78 B | 1 | 63.6us | 61.6us | -3.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload | opcode_budget_tail | 14 B | 1 | 44.5us | 43.1us | -3.2% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 22.71ms | 22.13ms | -2.6% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload | multi_stream_padded | 4.1 KiB | 1 | 83.1us | 84.9us | +2.1% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 382.03ms | 376.38ms | -1.5% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 106.83ms | 106.22ms | -0.6% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small] | safe_small | 68 B | 1 | 37.4us | 37.2us | -0.4% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 27.21ms | 27.29ms | +0.3% | stable |
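
The Change and Status columns appear to follow from a simple relative-change rule with the stated 15% threshold. The sketch below is an assumption inferred from the numbers in this comment, not the benchmark workflow's actual code. In particular, the symmetric threshold for "improved" is a guess, and `pct_change` and `classify` are hypothetical names.

```python
def pct_change(baseline: float, current: float) -> float:
    """Relative change in percent; both values must use the same time unit."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold: float = 15.0) -> str:
    """Label a benchmark by comparing its relative change to the threshold."""
    change = pct_change(baseline, current)
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improved"
    return "stable"
```

With the reported medians for `test_detect_file_format_safe_pickle`, `pct_change(99.5, 115.4)` rounds to 16.0, which crosses the threshold and matches the single reported regression. Because the published medians are rounded, recomputing other rows can differ from the reported percentage in the last digit.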

@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 15, 2026 23:31

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8789a88bb7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1134 to +1136
if limit_exceeded {
    self.record_nested_probe_limit_exceeded("raw", value.len(), position);
}

P1: Only raise probe-limit when skipped offsets remain

limit_exceeded is computed before skip_offsets_before filtering, then always emitted here. If offset 0 is a valid full nested pickle, many inner bytes can still count as prefix candidates (e.g., repeated 0x80 0x04 inside string data), trip the 64-offset cap, and force an inconclusive critical result even though all relevant bytes were already covered by the parsed payload.
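
The ordering issue described in this comment can be sketched in Python. This only models the logic, not the Rust code under review: `candidate_offsets`, `plan_probes`, and the `covered_until` parameter are hypothetical stand-ins for the scanner's internals, and the cap of 64 is taken from the comment above.

```python
from typing import List, Tuple

MAX_PROBE_OFFSETS = 64  # the 64-offset cap mentioned in the comment


def candidate_offsets(data: bytes, marker: bytes = b"\x80\x04") -> List[int]:
    """Find every offset where a pickle-like prefix marker occurs."""
    offsets = []
    start = 0
    while (index := data.find(marker, start)) != -1:
        offsets.append(index)
        start = index + 1
    return offsets


def plan_probes(candidates: List[int], covered_until: int) -> Tuple[List[int], bool]:
    """Drop offsets inside an already parsed payload, then apply the cap.

    Filtering first means limit_exceeded is raised only when offsets that
    still need probing would actually be skipped.
    """
    remaining = [offset for offset in candidates if offset >= covered_until]
    limit_exceeded = len(remaining) > MAX_PROBE_OFFSETS
    return remaining[:MAX_PROBE_OFFSETS], limit_exceeded
```

With a value whose string data repeats `0x80 0x04` a hundred times and is fully covered by a payload parsed at offset 0, `plan_probes(candidates, len(data))` returns no probes and `limit_exceeded` is False, whereas checking `len(candidates) > MAX_PROBE_OFFSETS` before filtering would report an inconclusive result.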

@mldangelo-oai mldangelo-oai merged commit c3a3b9d into main Apr 16, 2026
31 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-pickle-bypass-policy-20260415 branch April 16, 2026 06:34
This was referenced Apr 16, 2026