
fix(picklescan): close encoded protocol0 probe gaps #1594

Merged
mldangelo-oai merged 8 commits into main from mdangelo/codex/fix-protocol0-encoded-probes
Jun 9, 2026

Conversation

@mldangelo-oai mldangelo-oai commented Jun 9, 2026

Contributor

Summary

  • detect encoded protocol 0 pickles after long line operands for F, I, L, P, S, V, g, and p
  • cover all three decoded base64 offsets while preserving lenient separators and wrapper shifts
  • reconstruct bounded context for memo-dependent g/p candidates without charging synthetic framing to max_nested_pickle_bytes
  • keep unterminated long-line near-matches clean and fail closed when a real terminator or candidate exceeds bounded coverage
  • preserve the newer structural-prefix behavior from #1597 (fix: restore cross-platform scanner baselines) while ensuring specialized long-line probes run first
  • add malicious positives, benign near-match negatives, exact-budget regressions, and an [Unreleased] changelog entry
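
As a rough illustration of why these opcodes matter (this is not the scanner's Rust code), Python's standard pickletools module can confirm that each listed opcode reads its operand up to the next newline, and that a long S operand can sit in front of a later GLOBAL/REDUCE pair. The payload bytes and the helper names takes_line_operand and opcode_names are made up for this sketch; the pickle is only disassembled, never loaded.

```python
import pickletools

# The protocol 0 opcodes named in the summary, mapped to their pickletools names.
LINE_OPCODES = {
    "F": "FLOAT", "I": "INT", "L": "LONG", "P": "PERSID",
    "S": "STRING", "V": "UNICODE", "g": "GET", "p": "PUT",
}


def takes_line_operand(code: str) -> bool:
    """True if the opcode's argument is read up to the next newline."""
    info = pickletools.code2op[code]
    return info.arg is not None and "nl" in info.arg.name


def opcode_names(data: bytes) -> list[str]:
    """Disassemble without executing anything; pickle.loads is never called."""
    return [op.name for op, _arg, _pos in pickletools.genops(data)]


# Hypothetical payload: a 5000-character S operand, then os.system via GLOBAL/REDUCE.
payload = b"S'" + b"A" * 5000 + b"'\n0cos\nsystem\n(S'echo hi'\ntR."
names = opcode_names(payload)
```

A probe that only inspects the first few bytes of this payload sees a harmless string literal; the import and call opcodes only show up after the line terminator.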

Follow-up to merged #1583 and its post-merge review findings:

Critical review fixes

  • removed the decoded-offset modulo assumption that missed payloads beginning one or two decoded bytes into a base64 alignment
  • broadened coverage from only I/S/V to every protocol 0 line opcode that can hide later dangerous opcodes
  • added bounded contextual recovery for memo reads/writes and capped both candidates and parsing steps
  • fixed synthetic PROTO accounting so exact nested-byte limits remain complete and reported sizes never include the two internal framing bytes
  • reordered unterminated-line handling to avoid repeated contextual parsing before a line terminator is known
  • reconciled the structural-prefix gate from #1597 (fix: restore cross-platform scanner baselines) so unterminated long lines remain clean and accurate long-line windows are emitted before generic embedded candidates
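
One plausible reading of the decoded-offset fix, sketched in Python rather than the scanner's Rust: a probe that only considers decoded offsets that are multiples of 3 finds a payload at offset 0 but misses the same payload one or two bytes later. MARKER, aligned_only, and every_offset are hypothetical names for this sketch, and the filler bytes stand in for whatever wrapper precedes the pickle.

```python
import base64

MARKER = b"cos\nsystem\n"  # hypothetical signature: GLOBAL os.system in protocol 0


def aligned_only(decoded: bytes) -> bool:
    """Flawed probe: only looks at decoded offsets that are multiples of 3."""
    return any(decoded.startswith(MARKER, i) for i in range(0, len(decoded), 3))


def every_offset(decoded: bytes) -> bool:
    """Probe that also covers offsets 1 and 2 within each 3-byte group."""
    return any(decoded.startswith(MARKER, i) for i in range(len(decoded)))


results = []
for lead in range(3):
    # `lead` filler bytes push the payload 0, 1, or 2 bytes into the first group.
    blob = base64.b64encode(b"x" * lead + MARKER + b"(S'id'\ntR.")
    decoded = base64.b64decode(blob)
    results.append((lead, aligned_only(decoded), every_offset(decoded)))
```

Only the lead == 0 case is found by aligned_only; every_offset finds all three.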

All three existing review threads are resolved.

Validation

Scoped to the changed picklescan surfaces, per review policy:

  • cargo test --manifest-path packages/modelaudit-picklescan/Cargo.toml (154 passed)
  • focused Python tests: test_protocol0_line_operands.py plus test_nested_budget_limits.py (425 passed)
  • cargo check, strict Clippy, and Cargo fmt (clean)
  • Ruff check/format and mypy for the focused Python tests (clean)
  • git diff --check (clean)
  • no full local repository suite; CI owns broader coverage

Published revision

  • head: 354ab74599c5b74acf787e9b91d454aeccab277f
  • tree: 69ade858021436fee8a1381f67555ed4326a8e9b
  • synced with main at 493436e795be760b8dcec031d4d60e9a044091f1

Detect base64 protocol 0 payloads that begin with long scalar operands, and continue bounded probing after lenient prefixes unless the entire literal decodes to exactly one pickle.
@mldangelo-oai mldangelo-oai requested a review from mldangelo June 9, 2026 05:40
@mldangelo-oai mldangelo-oai enabled auto-merge (squash) June 9, 2026 05:40
@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.254s -> 1.257s (+0.3%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] | nested_hex | 130 B | 1 | 494.0us | 521.3us | +5.5% | stable |
| warm-cache-rescan | tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan | release-candidate | 547.3 KiB | 32 | 82.88ms | 79.99ms | -3.5% | stable |
| direct-malicious-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload | malicious_reduce | 52 B | 1 | 425.9us | 434.2us | +2.0% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] | nested_base64 | 98 B | 1 | 459.4us | 468.3us | +1.9% | stable |
| suspicious-pickle-intake | tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake | suspicious-intake | 183.8 KiB | 4 | 107.47ms | 108.88ms | +1.3% | stable |
| clean-training-checkpoint | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint | safe_large | 278.2 KiB | 1 | 106.81ms | 108.17ms | +1.3% | stable |
| single-checkpoint-preflight | tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load | single_checkpoint.pkl | 183.0 KiB | 1 | 60.36ms | 61.03ms | +1.1% | stable |
| chunked-upload-stream | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream | chunked_stream | 278.2 KiB | 1 | 109.99ms | 111.19ms | +1.1% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] | nested_raw | 78 B | 1 | 453.3us | 457.6us | +1.0% | stable |
| duplicate-heavy-registry | tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot | registry-snapshot | 915.2 KiB | 13 | 358.63ms | 361.11ms | +0.7% | stable |
| padded-multi-stream-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload | multi_stream_padded | 4.1 KiB | 1 | 513.6us | 516.5us | +0.6% | stable |
| mixed-model-repository | tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository | release-candidate | 547.3 KiB | 32 | 425.11ms | 424.53ms | -0.1% | stable |
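
For readers checking the Change and Status columns, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the symmetric "improved" rule is an assumption; the report only states the 15% regression threshold.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change from baseline to current, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold: float = 15.0) -> str:
    """Label a benchmark against the threshold (the 'improved' rule is assumed)."""
    change = percent_change(baseline, current)
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improved"
    return "stable"


# Two rows from the report, baseline and current in the same unit.
nested_hex = round(percent_change(494.0, 521.3), 1)
warm_cache = round(percent_change(82.88, 79.99), 1)
```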

Sync current main and preserve bounded long-scalar coverage across wrapper alignments and lenient base64 separators.

Contributor Author

Critical review follow-up published at 51772bc840783dcb469c0c93439b08f7a2388a9f.

I reproduced and fixed three clean-verdict false-negative classes in the original branch:

  • one-character wrapper shifts (A / =) before a malicious base64 protocol 0 payload
  • lenient separator insertion, including spaces and !
  • confirmed parseable payloads beginning beyond the bounded 1 MiB mid-scan window

The revised scanner performs bounded four-alignment discovery, requires an actual parseable pickle before promoting a candidate, and uses the existing incomplete/fail-closed signal when a confirmed candidate is beyond bounded coverage. Benign long-scalar near-matches remain clean.
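
The discovery step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Rust implementation: find_alignment, decode_lenient, parses_with_import, and the max_chars bound are hypothetical, every non-alphabet byte (including =) is treated as a separator, and "parseable" is narrowed here to "disassembles to STOP and contains an import opcode".

```python
import base64
import pickletools

B64 = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")
IMPORT_OPS = {"GLOBAL", "STACK_GLOBAL", "INST"}


def decode_lenient(chars: bytes) -> bytes:
    """Decode compact base64 text whose padding was stripped or is missing."""
    if len(chars) % 4 == 1:
        chars = chars[:-1]  # a lone trailing character encodes no full byte
    return base64.b64decode(chars + b"=" * (-len(chars) % 4))


def parses_with_import(data: bytes) -> bool:
    """Promote a candidate only if it disassembles to STOP and imports something."""
    saw_import = False
    try:
        for op, _arg, _pos in pickletools.genops(data):
            saw_import = saw_import or op.name in IMPORT_OPS
            if op.name == "STOP":
                return saw_import
    except Exception:
        return False  # not a parseable pickle at this alignment
    return False


def find_alignment(raw: bytes, max_chars: int = 4096):
    """Try the four base64 character alignments over a bounded compact window."""
    chars = bytes(b for b in raw if b in B64)[:max_chars]
    for shift in range(4):
        if parses_with_import(decode_lenient(chars[shift:])):
            return shift
    return None
```

With a payload such as b"S'" + b"A" * 64 + b"'\n0cos\nsystem\n(S'id'\ntR.", the plain encoding, an = prefix, and an encoding split by spaces and ! all resolve at shift 0, an A prefix resolves at shift 1, and the unterminated literal b"S'" + b"A" * 64 stays clean at every shift.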

Scoped QA is clean: 150 Rust tests, 48 focused Python tests, a 36-case malicious/benign adversarial matrix, Cargo fmt/clippy, Ruff, mypy, and a post-main-sync rerun. The remote tree was fetched back and matches the locally verified tree exactly. No inline review threads are open at publication time.

Fresh CI is now running; moving on to the next randomly selected PR rather than waiting here.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51772bc840

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated
Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated
@mldangelo-oai mldangelo-oai marked this pull request as draft June 9, 2026 08:11
auto-merge was automatically disabled June 9, 2026 08:11

Pull request was converted to draft

@mldangelo-oai mldangelo-oai marked this pull request as ready for review June 9, 2026 08:31
@mldangelo-oai mldangelo-oai enabled auto-merge (squash) June 9, 2026 08:31

@mldangelo-oai mldangelo-oai left a comment

Contributor Author

Critical review found and fixed a false negative in the new long protocol-0 base64 probe. Ignored separators could inflate the raw literal past the 1 MiB probe input budget while the compact encoded payload stayed small, so a 1.5 MiB malicious os.system pickle returned complete/clean. The probe now budgets the compact base64 characters it retains after the bounded start region. Complete long protocol-0 lenient literals use the same bounded single-pickle coverage proof, which avoids a false truncation notice.
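
The budgeting change can be pictured with a small Python sketch (the real probe is Rust and also has a bounded start region that is not modeled here). retain_compact and its budget parameter are hypothetical; the point is only that separators are skipped without being charged, and that running out of budget is reported as incomplete so the caller can fail closed.

```python
B64 = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")


def retain_compact(raw: bytes, budget: int) -> tuple[bytes, bool]:
    """Return (retained base64 characters, complete?) for a lenient literal.

    Only retained characters count against `budget`; ignored separators are free.
    """
    kept = bytearray()
    for byte in raw:
        if byte not in B64:
            continue  # separator: skipped and not charged
        if len(kept) == budget:
            return bytes(kept), False  # more payload than budget: incomplete
        kept.append(byte)
    return bytes(kept), True


encoded = b"QUJD" * 30  # 120 base64 characters
# 100 spaces between every character inflate the raw literal to 12,020 bytes.
separated = (b" " * 100).join(bytes([c]) for c in encoded)
```

Cutting the raw literal at 1024 bytes keeps only a handful of payload characters, while charging the budget to retained characters recovers all 120 and still reports a dense oversized literal as incomplete.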

Validation on the exact pushed tree:

  • 51/51 associated Python tests passed
  • 37/37 nested Rust unit tests passed
  • focused encoded-probe fail-closed state test passed
  • 32-case adversarial separator matrix passed across I, S, and V, including benign unterminated near-matches up to the 8 MiB literal limit
  • Ruff format/check, full mypy (466 files), Cargo fmt/check/Clippy, and package lock check passed
  • 8 MiB benign timing remained approximately 0.04s dense / 0.17s separated locally

All existing review threads remain resolved.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f365c0a75d

Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated
@mldangelo-oai mldangelo-oai marked this pull request as draft June 9, 2026 09:53
auto-merge was automatically disabled June 9, 2026 09:53

Pull request was converted to draft

@mldangelo-oai mldangelo-oai marked this pull request as ready for review June 9, 2026 10:27
@mldangelo-oai mldangelo-oai enabled auto-merge (squash) June 9, 2026 10:27
@mldangelo-oai mldangelo-oai marked this pull request as draft June 9, 2026 10:28
auto-merge was automatically disabled June 9, 2026 10:28

Pull request was converted to draft

@mldangelo-oai mldangelo-oai marked this pull request as ready for review June 9, 2026 12:46
@mldangelo-oai mldangelo-oai merged commit dfaedd1 into main Jun 9, 2026
28 of 30 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-protocol0-encoded-probes branch June 9, 2026 12:46

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 354ab74599

Comment on lines +1226 to +1227
let Some(value) = base64_value(*byte) else {
continue;

P2: Stop base64 scanning at padding

Skipping = lets a padded, unterminated protocol-0 base64 scalar keep consuming the base64 text that follows it. If that later text decodes to \n, benign input is reported as limit-exceeded or malicious, even though normal decoding stops at padding. Stop at padding per guidance.
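
A small Python reproduction of the effect, assuming a collector that drops non-alphabet bytes (the flagged code is Rust, and collect_base64 and decode_lenient are hypothetical stand-ins). The first literal is the padded encoding of the unterminated scalar S'hello!; the second is an unrelated base64 blob that happens to start with K.

```python
import base64

B64 = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")


def collect_base64(raw: bytes, stop_at_padding: bool) -> bytes:
    """Gather base64 characters, either skipping '=' or stopping at the first one."""
    kept = bytearray()
    for byte in raw:
        if byte == ord("="):
            if stop_at_padding:
                break  # normal decoders treat padding as the end of the data
            continue   # flagged behavior: padding is skipped like a separator
        if byte in B64:
            kept.append(byte)
    return bytes(kept)


def decode_lenient(chars: bytes) -> bytes:
    """Decode compact base64 text whose padding was stripped."""
    if len(chars) % 4 == 1:
        chars = chars[:-1]
    return base64.b64decode(chars + b"=" * (-len(chars) % 4))


first = base64.b64encode(b"S'hello!")  # 8 bytes, so the encoding ends with one '='
later = base64.b64encode(b"****")      # unrelated text that begins with 'K'
raw = first + b" " + later

skipped = decode_lenient(collect_base64(raw, stop_at_padding=False))
stopped = decode_lenient(collect_base64(raw, stop_at_padding=True))
```

When padding is skipped, the two leftover zero bits of the first literal combine with the K sextet and produce a \n byte directly after S'hello!, which looks like a line terminator; stopping at padding yields exactly the original eight bytes.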
