fix(picklescan): close encoded protocol0 probe gaps by mldangelo-oai · Pull Request #1594 · promptfoo/modelaudit

mldangelo-oai · 2026-06-09T05:39:37Z

Summary

detect encoded protocol 0 pickles after long line operands for F, I, L, P, S, V, g, and p
cover all three decoded base64 offsets while preserving lenient separators and wrapper shifts
reconstruct bounded context for memo-dependent g/p candidates without charging synthetic framing to max_nested_pickle_bytes
keep unterminated long-line near-matches clean and fail closed when a real terminator or candidate exceeds bounded coverage
preserve the newer structural-prefix behavior from #1597 (fix: restore cross-platform scanner baselines) while ensuring specialized long-line probes run first
add malicious positives, benign near-match negatives, exact-budget regressions, and an [Unreleased] changelog entry
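
To illustrate the first two bullets, here is a minimal Python sketch (not the scanner's Rust implementation) of a protocol 0 pickle that hides a GLOBAL/REDUCE sequence behind a long `S` line operand, and of why the same payload can sit at three different decoded base64 offsets. The helper names `opcode_names` and `find_decoded_offset` and the 4096-byte operand are assumptions made for the example. The payload is only disassembled with `pickletools`, never unpickled.

```python
import base64
import pickletools
from typing import List, Optional

# Hand-built protocol 0 pickle: a long S (STRING) line operand, a POP, then
# GLOBAL os.system, an argument tuple, REDUCE, and STOP.
long_operand = b"S'" + b"a" * 4096 + b"'\n"
payload = long_operand + b"0" + b"cos\nsystem\n(S'echo hi'\ntR."


def opcode_names(data: bytes) -> List[str]:
    """Disassemble without executing; raises ValueError on malformed input."""
    return [op.name for op, _arg, _pos in pickletools.genops(data)]


# Zero, one, or two wrapper bytes in front of the pickle put it on three
# different base64 alignments, so the encoded text differs in each case.
encoded_by_offset = {
    offset: base64.b64encode(b"\x00" * offset + payload) for offset in range(3)
}


def find_decoded_offset(encoded: bytes) -> Optional[int]:
    """Try each of the three decoded offsets and report the first that parses."""
    decoded = base64.b64decode(encoded)
    for offset in range(3):
        try:
            if "REDUCE" in opcode_names(decoded[offset:]):
                return offset
        except ValueError:
            continue  # not a pickle at this offset
    return None
```

A probe that only checks decoded offset 0 would miss the second and third encodings, which is the gap the offset bullet describes.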

Follow-up to merged #1583 and its post-merge review findings:

Critical review fixes

removed the decoded-offset modulo assumption that missed payloads beginning one or two decoded bytes into a base64 alignment
broadened coverage from only I/S/V to every protocol 0 line opcode that can hide later dangerous opcodes
added bounded contextual recovery for memo reads/writes and capped both candidates and parsing steps
fixed synthetic PROTO accounting so exact nested-byte limits remain complete and reported sizes never include the two internal framing bytes
reordered unterminated-line handling to avoid repeated contextual parsing before a line terminator is known
reconciled the structural-prefix gate from #1597 (fix: restore cross-platform scanner baselines) so unterminated long lines remain clean and accurate long-line windows are emitted before generic embedded candidates
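
The synthetic PROTO accounting fix can be pictured with a small Python sketch. It assumes the internal frame is a PROTO opcode (`0x80`) plus one version byte, which matches the "two internal framing bytes" above; the version value `2`, the function name `frame_candidate`, and the return shape are hypothetical. The point is only that the budget check and the reported size use the candidate's real length.

```python
import pickletools

# Assumption: the synthetic frame is PROTO (0x80) followed by a version byte.
SYNTHETIC_PROTO = b"\x80\x02"


def frame_candidate(candidate: bytes, max_nested_pickle_bytes: int):
    """Return (framed_bytes_or_None, reported_size, complete).

    The limit is checked against the candidate alone, so a candidate of
    exactly max_nested_pickle_bytes is still scanned completely, and the
    reported size never includes the two framing bytes.
    """
    reported_size = len(candidate)
    if reported_size > max_nested_pickle_bytes:
        return None, reported_size, False
    return SYNTHETIC_PROTO + candidate, reported_size, True


candidate = b"cos\nsystem\n(S'echo hi'\ntR."
framed, size, complete = frame_candidate(candidate, len(candidate))
first_opcode = next(pickletools.genops(framed))[0].name
```

Charging the frame to the budget instead would turn an exact-limit candidate into an incomplete scan, which is the regression the exact-budget tests guard against.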

All three existing review threads are resolved.

Validation

Scoped to the changed picklescan surfaces, per review policy:

cargo test --manifest-path packages/modelaudit-picklescan/Cargo.toml (154 passed)
focused Python tests: test_protocol0_line_operands.py plus test_nested_budget_limits.py (425 passed)
cargo check, strict Clippy, and Cargo fmt (clean)
Ruff check/format and mypy for the focused Python tests (clean)
git diff --check (clean)
no full local repository suite; CI owns broader coverage

Published revision

head: 354ab74599c5b74acf787e9b91d454aeccab277f
tree: 69ade858021436fee8a1381f67555ed4326a8e9b
synced with main at 493436e795be760b8dcec031d4d60e9a044091f1

github-actions · 2026-06-09T05:41:31Z

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.254s -> 1.257s (+0.3%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex]` | `nested_hex` | 130 B | 1 | 494.0us | 521.3us | +5.5% | stable |
| `warm-cache-rescan` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan` | `release-candidate` | 547.3 KiB | 32 | 82.88ms | 79.99ms | -3.5% | stable |
| `direct-malicious-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload` | `malicious_reduce` | 52 B | 1 | 425.9us | 434.2us | +2.0% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64]` | `nested_base64` | 98 B | 1 | 459.4us | 468.3us | +1.9% | stable |
| `suspicious-pickle-intake` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake` | `suspicious-intake` | 183.8 KiB | 4 | 107.47ms | 108.88ms | +1.3% | stable |
| `clean-training-checkpoint` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint` | `safe_large` | 278.2 KiB | 1 | 106.81ms | 108.17ms | +1.3% | stable |
| `single-checkpoint-preflight` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load` | `single_checkpoint.pkl` | 183.0 KiB | 1 | 60.36ms | 61.03ms | +1.1% | stable |
| `chunked-upload-stream` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream` | `chunked_stream` | 278.2 KiB | 1 | 109.99ms | 111.19ms | +1.1% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw]` | `nested_raw` | 78 B | 1 | 453.3us | 457.6us | +1.0% | stable |
| `duplicate-heavy-registry` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot` | `registry-snapshot` | 915.2 KiB | 13 | 358.63ms | 361.11ms | +0.7% | stable |
| `padded-multi-stream-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload` | `multi_stream_padded` | 4.1 KiB | 1 | 513.6us | 516.5us | +0.6% | stable |
| `mixed-model-repository` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository` | `release-candidate` | 547.3 KiB | 32 | 425.11ms | 424.53ms | -0.1% | stable |
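
For readers checking the Change and Status columns, the arithmetic can be sketched as below. This is an assumption about how the report is derived, not the benchmark tool's code: `percent_change` and `classify` are hypothetical names, and treating a drop beyond the same 15% threshold as "improved" is a guess.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(change_pct: float, threshold_pct: float = 15.0) -> str:
    """Assumed rule: beyond the threshold is a regression or an improvement."""
    if change_pct > threshold_pct:
        return "regression"
    if change_pct < -threshold_pct:
        return "improved"
    return "stable"


# Rows taken from the table above (both values in the same unit per row).
nested_hex = percent_change(494.0, 521.3)
warm_cache = percent_change(82.88, 79.99)
```

Rounded to one decimal place these reproduce the +5.5% and -3.5% entries, both well inside the 15% threshold.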

mldangelo-oai · 2026-06-09T07:46:07Z

Critical review follow-up published at 51772bc840783dcb469c0c93439b08f7a2388a9f.

I reproduced and fixed three clean-verdict false-negative classes in the original branch:

one-character wrapper shifts (A / =) before a malicious base64 protocol 0 payload
lenient separator insertion, including spaces and !
confirmed parseable payloads beginning beyond the bounded 1 MiB mid-scan window

The revised scanner performs bounded four-alignment discovery, requires an actual parseable pickle before promoting a candidate, and uses the existing incomplete/fail-closed signal when a confirmed candidate is beyond bounded coverage. Benign long-scalar near-matches remain clean.
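
As a rough Python sketch of that discovery loop (the real scanner is Rust, and its bounds, fail-closed signal, and exact padding rules are not modeled here): separators are dropped, each of the four character alignments is tried, and a candidate is promoted only when the decoded bytes parse through to STOP. The names `compact_base64`, `parses_as_pickle`, and `discover_alignment` are hypothetical.

```python
import base64
import pickletools
from typing import Optional

B64_ALPHABET = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)


def compact_base64(text: bytes) -> bytes:
    """Keep alphabet characters, skip separators, stop at padding after data."""
    kept = bytearray()
    for byte in text:
        if byte in B64_ALPHABET:
            kept.append(byte)
        elif byte == ord("=") and kept:
            break
    return bytes(kept)


def parses_as_pickle(data: bytes) -> bool:
    """True only if disassembly reaches a STOP opcode."""
    try:
        names = [op.name for op, _arg, _pos in pickletools.genops(data)]
    except Exception:
        return False
    return bool(names) and names[-1] == "STOP"


def discover_alignment(text: bytes) -> Optional[int]:
    """Return the first character shift (0..3) that yields a parseable pickle."""
    compact = compact_base64(text)
    for shift in range(4):
        body = compact[shift:]
        body += b"=" * (-len(body) % 4)  # re-pad after dropping characters
        try:
            decoded = base64.b64decode(body)
        except Exception:
            continue  # impossible length at this alignment
        if parses_as_pickle(decoded):
            return shift
    return None


payload = b"S'" + b"a" * 100 + b"'\n0cos\nsystem\n(S'echo hi'\ntR."
encoded = base64.b64encode(payload)
noisy = b" !".join(encoded[i:i + 7] for i in range(0, len(encoded), 7))
near_match = base64.b64encode(b"S'" + b"a" * 100)  # no line terminator
```

With these inputs, `discover_alignment(b"A" + noisy)` finds the payload at shift 1, while the unterminated `near_match` yields `None` and would stay clean.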

Scoped QA is clean: 150 Rust tests, 48 focused Python tests, a 36-case malicious/benign adversarial matrix, Cargo fmt/clippy, Ruff, mypy, and a post-main-sync rerun. The remote tree was fetched back and matches the locally verified tree exactly. No inline review threads are open at publication time.

Fresh CI is now running; moving on to the next randomly selected PR rather than waiting here.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51772bc840

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

mldangelo-oai

Critical review found and fixed a false negative in the new long protocol-0 base64 probe. Ignored separators could inflate the raw literal past the 1 MiB probe input budget while the compact encoded payload stayed small, so a 1.5 MiB malicious os.system pickle returned complete/clean. The probe now budgets the compact base64 characters it retains after the bounded start region, and complete long protocol-0 lenient literals use the same bounded single-pickle coverage proof to avoid a false truncation notice.
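
The difference between budgeting raw bytes and budgeting retained characters can be shown with a simplified Python sketch. It ignores the bounded start region and uses hypothetical helper names (`compact_length`, `raw_budget_ok`, `compact_budget_ok`); only the 1 MiB figure comes from the comment above.

```python
import base64

B64_ALPHABET = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)
PROBE_INPUT_BUDGET = 1024 * 1024  # 1 MiB


def compact_length(literal: bytes) -> int:
    """Count only the base64 characters a lenient decoder would retain."""
    return sum(1 for byte in literal if byte in B64_ALPHABET)


def raw_budget_ok(literal: bytes, budget: int = PROBE_INPUT_BUDGET) -> bool:
    """Old-style check: separators count against the budget."""
    return len(literal) <= budget


def compact_budget_ok(literal: bytes, budget: int = PROBE_INPUT_BUDGET) -> bool:
    """Revised check: only retained characters count against the budget."""
    return compact_length(literal) <= budget


payload = b"I1\n0cos\nsystem\n(S'echo hi'\ntR."
encoded = base64.b64encode(payload)
# Pad every gap with ignored separators until the literal is about 1.5 MiB.
literal = (b" " * 40_000).join(encoded[i:i + 1] for i in range(len(encoded)))
```

Here the raw literal exceeds the budget although only 40 encoded characters matter, so a raw-length cutoff would stop before the payload is fully seen while the compact budget still covers it.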

Validation on the exact pushed tree:

51/51 associated Python tests passed
37/37 nested Rust unit tests passed
focused encoded-probe fail-closed state test passed
32-case adversarial separator matrix passed across I, S, and V, including benign unterminated near-matches up to the 8 MiB literal limit
Ruff format/check, full mypy (466 files), Cargo fmt/check/Clippy, and package lock check passed
8 MiB benign timing remained approximately 0.04s dense / 0.17s separated locally

All existing review threads remain resolved.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f365c0a75d

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 354ab74599

chatgpt-codex-connector · 2026-06-09T12:53:53Z

+        let Some(value) = base64_value(*byte) else {
+            continue;


Stop base64 scanning at padding

Skipping `=` lets a padded, unterminated protocol-0 base64 scalar keep consuming base64 characters that follow the padding. If those later characters decode to `\n`, benign input is reported as limit-exceeded or malicious, even though a normal decoder stops at padding. Stop scanning at padding, per guidance.
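
A small Python reproduction of the concern, written as a sketch rather than a port of the Rust loop: `decode_lenient` is a hypothetical helper, and the input is constructed so that the characters after the padding happen to supply a `\n` when the padding is skipped.

```python
import base64

B64_ALPHABET = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)
PAD = ord("=")


def decode_lenient(text: bytes, stop_at_padding: bool) -> bytes:
    """Decode base64 text, either skipping '=' or treating it as the end."""
    kept = bytearray()
    for byte in text:
        if byte == PAD:
            if stop_at_padding:
                break
            continue  # the behaviour the review comment objects to
        if byte in B64_ALPHABET:
            kept.append(byte)
    kept += b"=" * (-len(kept) % 4)  # re-pad; a remainder of 1 would still fail
    return base64.b64decode(bytes(kept))


# A benign, unterminated scalar with its own padding...
benign = base64.b64encode(b"S'aaaaa")
# ...followed by unrelated base64 characters taken from a longer encoding.
later = base64.b64encode(b"S'aaaaa\n.")[10:]
text = benign + later

skipped = decode_lenient(text, stop_at_padding=False)
stopped = decode_lenient(text, stop_at_padding=True)
```

Skipping the padding splices the trailing characters onto the scalar and produces a line terminator that was never part of it; stopping at padding returns only the original unterminated bytes.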

fix(picklescan): close encoded protocol0 probe gaps

ba210ce

Detect base64 protocol 0 payloads that begin with long scalar operands, and continue bounded probing after lenient prefixes unless the entire literal decodes to exactly one pickle.

mldangelo-oai requested a review from mldangelo June 9, 2026 05:40

mldangelo-oai enabled auto-merge (squash) June 9, 2026 05:40

fix(picklescan): close encoded protocol0 probe gaps

51772bc

Sync current main and preserve bounded long-scalar coverage across wrapper alignments and lenient base64 separators.

chatgpt-codex-connector Bot reviewed Jun 9, 2026

Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated

Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated

mldangelo-oai marked this pull request as draft June 9, 2026 08:11

auto-merge was automatically disabled June 9, 2026 08:11
Pull request was converted to draft

mldangelo-oai added 2 commits June 9, 2026 01:28

fix(picklescan): align encoded probe coverage bounds

c01240a

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-pr…

7cde8eb

…otocol0-encoded-probes

mldangelo-oai marked this pull request as ready for review June 9, 2026 08:31

mldangelo-oai enabled auto-merge (squash) June 9, 2026 08:31

fix(picklescan): cover sparse encoded protocol 0 operands

f365c0a

mldangelo-oai commented Jun 9, 2026

chatgpt-codex-connector Bot reviewed Jun 9, 2026

Comment thread packages/modelaudit-picklescan/rust/src/nested.rs Outdated

mldangelo-oai marked this pull request as draft June 9, 2026 09:53

auto-merge was automatically disabled June 9, 2026 09:53
Pull request was converted to draft

mldangelo-oai added 2 commits June 9, 2026 02:55

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-pr…

537a811

…otocol0-encoded-probes

fix(picklescan): fail closed on truncated scalar probes

931823b

mldangelo-oai marked this pull request as ready for review June 9, 2026 10:27

mldangelo-oai enabled auto-merge (squash) June 9, 2026 10:27

mldangelo-oai marked this pull request as draft June 9, 2026 10:28

auto-merge was automatically disabled June 9, 2026 10:28
Pull request was converted to draft

Merge main and address encoded protocol 0 review findings

354ab74

mldangelo-oai marked this pull request as ready for review June 9, 2026 12:46

mldangelo-oai merged commit dfaedd1 into main Jun 9, 2026
28 of 30 checks passed

mldangelo-oai deleted the mdangelo/codex/fix-protocol0-encoded-probes branch June 9, 2026 12:46

chatgpt-codex-connector Bot reviewed Jun 9, 2026

mldangelo-oai mentioned this pull request Jun 9, 2026

fix(picklescan): scan encoded byte literals #1602

Merged

mldangelo-oai merged 8 commits into main from mdangelo/codex/fix-protocol0-encoded-probes
