fix: close archive payload scan gaps by mldangelo-oai · Pull Request #1042 · promptfoo/modelaudit

mldangelo-oai · 2026-04-17T00:20:10Z

Summary

sniff hidden PyTorch ZIP pickle members even when data.pkl exists, including storage-looking paths, and fail closed on discovery/JIT coverage limits
detect extensionless protocol 0/1 pickles in 7z probes and mark PyTorch/manifest timeouts inconclusive
keep active payload/CVE/path traversal/incomplete findings from being downgraded by HuggingFace whitelist provenance
scan generic ZIP/TAR Python members for dangerous handlers while preserving benign source and storage near-match negatives
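
For context on why extensionless protocol 0/1 pickles need content sniffing: only protocol 2+ streams begin with a `PROTO` header byte, so a header-only check cannot see older streams. The sketch below illustrates that with the standard library. It is not the scanner code from this PR; `starts_with_proto_header` and `decodes_as_pickle` are hypothetical helper names, and the decode check is a loose heuristic that can accept some short non-pickle byte strings.

```python
import pickle
import pickletools


def starts_with_proto_header(data: bytes) -> bool:
    """True for protocol 2+ streams, which begin with PROTO (0x80) plus a version byte."""
    return len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= pickle.HIGHEST_PROTOCOL


def decodes_as_pickle(data: bytes) -> bool:
    """True if the bytes decode as an opcode stream that reaches STOP.

    pickletools.genops only decodes opcodes; nothing is unpickled or executed.
    """
    try:
        ops = list(pickletools.genops(data))
    except Exception:
        # Unknown opcode, truncated argument, or no STOP before end of data.
        return False
    return bool(ops) and ops[-1][0].name == "STOP"


# Protocol 0 and 1 streams carry no PROTO header but still decode as pickles.
legacy_ascii = pickle.dumps({"a": 1}, protocol=0)
legacy_binary = pickle.dumps({"a": 1}, protocol=1)
modern = pickle.dumps({"a": 1}, protocol=4)
```

A real probe would also need to bound how many bytes it reads and reject benign near-matches, which this sketch does not attempt.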

Validation

uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_pytorch_zip_scanner.py tests/scanners/test_sevenzip_scanner.py tests/scanners/test_zip_scanner.py tests/scanners/test_tar_scanner.py tests/scanners/test_manifest_scanner.py tests/scanners/test_base_scanner.py packages/modelaudit-picklescan/tests/test_api.py --maxfail=1
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

github-actions · 2026-04-17T00:21:23Z

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 2 improved, 17 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 170.83ms -> 148.68ms (-13.0%).

Top improvements:

tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip -34.0% (28.86ms -> 19.06ms, state_dict.pt, size=1.5 MiB, files=1)
tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory -16.9% (68.61ms -> 57.03ms, mixed-corpus, size=1.7 MiB, files=54)

Benchmark	Target	Size	Files	Baseline	Current	Change	Status
`tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	28.86ms	19.06ms	-34.0%	improved
`tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory`	`mixed-corpus`	1.7 MiB	54	68.61ms	57.03ms	-16.9%	improved
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream`	`chunked_stream`	278.2 KiB	1	6.45ms	6.21ms	-3.8%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]`	`safe_large`	278.2 KiB	1	3.37ms	3.26ms	-3.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload`	`multi_stream_padded`	4.1 KiB	1	115.6us	118.9us	+2.9%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]`	`stack_global`	21 B	1	60.4us	61.7us	+2.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]`	`safe_small`	68 B	1	51.2us	52.3us	+2.1%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	21.4us	21.9us	+2.0%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]`	`malicious_reduce`	52 B	1	70.3us	71.4us	+1.5%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files`	`-`	4.6 KiB	256	10.70ms	10.55ms	-1.4%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget`	`hidden_suspicious_string`	8.0 KiB	1	552.7us	558.6us	+1.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]`	`nested_hex`	130 B	1	97.6us	98.5us	+1.0%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload`	`opcode_budget_tail`	14 B	1	66.7us	67.3us	+0.9%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory`	`duplicate-corpus`	840.0 KiB	81	40.24ms	39.98ms	-0.6%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	10.20ms	10.18ms	-0.3%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]`	`long_benign_string`	1.0 MiB	1	1.14ms	1.14ms	-0.3%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]`	`nested_raw`	78 B	1	90.6us	90.4us	-0.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]`	`nested_base64`	98 B	1	92.5us	92.6us	+0.1%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	37.8us	37.8us	+0.1%	stable
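
The Change and Status columns can be reproduced roughly as follows. This is a sketch, assuming the 15% threshold is applied symmetrically (slower by more than 15% is a regression, faster by more than 15% is improved); the `classify` helper is hypothetical and is not the benchmark tooling used by the workflow.

```python
from __future__ import annotations


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> tuple[float, str]:
    """Return (percent change rounded to one decimal, status) for one benchmark."""
    change = (current - baseline) / baseline * 100.0
    if change > threshold_pct:
        status = "regression"
    elif change < -threshold_pct:
        status = "improved"
    else:
        status = "stable"
    return round(change, 1), status


# Values copied from the table above (milliseconds).
pytorch_zip = classify(28.86, 19.06)
mixed_directory = classify(68.61, 57.03)
aggregate = classify(170.83, 148.68)
```

Recomputing from the rounded timings shown in the table can differ from the reported change in the last digit (for example, 6.45ms -> 6.21ms works out to about -3.7%, while the report shows -3.8%), presumably because the report uses unrounded measurements.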

mldangelo-oai · 2026-04-17T00:50:39Z

Superseded by the six split draft PRs: #1043, #1044, #1045, #1046, #1047, and #1048.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85effaabf4

chatgpt-codex-connector · 2026-04-17T00:54:30Z

+        if _zip_entry_looks_like_pickle(archive, entry):
+            add_entry(entry)


Guard hidden-pickle probing against ZIP member read failures

Second-pass discovery probes every non-selected ZIP entry via _zip_entry_looks_like_pickle without exception handling. archive.open(...).read(...) can raise RuntimeError (e.g., encrypted members) or decompression errors, and this path is outside scan_file's handled exceptions (OSError, BadZipFile). A single unreadable non-pickle member can abort scan_file instead of returning a structured report.
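
One way to address this is to wrap the per-member read so that an unreadable entry becomes a recorded reason instead of an uncaught exception. The sketch below is illustrative only: `probe_member` and `PROBE_BYTES` are hypothetical names, not part of modelaudit, and the exact exception list is an assumption based on what `zipfile` and `zlib` are known to raise.

```python
from __future__ import annotations

import io
import zipfile
import zlib

PROBE_BYTES = 64  # hypothetical probe size, not taken from the PR


def probe_member(
    archive: zipfile.ZipFile, entry: zipfile.ZipInfo
) -> tuple[bytes | None, str | None]:
    """Return (prefix, None) on success, or (None, reason) if the member is unreadable."""
    try:
        with archive.open(entry) as member:
            return member.read(PROBE_BYTES), None
    except (RuntimeError, zipfile.BadZipFile, zlib.error, OSError, EOFError) as exc:
        # RuntimeError covers encrypted members opened without a password;
        # NotImplementedError (unsupported compression) is a RuntimeError subclass.
        return None, f"{type(exc).__name__}: {exc}"


# Demo: one readable member and one marked as encrypted in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("ok.bin", b"hello")
    zf.writestr("locked.bin", b"secret")

with zipfile.ZipFile(buf) as zf:
    locked = zf.getinfo("locked.bin")
    locked.flag_bits |= 0x1  # simulate an encrypted member with no password set
    results = {info.filename: probe_member(zf, info) for info in zf.infolist()}
```

A caller could then treat any member with a non-empty reason as an inconclusive finding, so the scan still returns a structured report rather than aborting.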

fix: close archive payload scan gaps

85effaa

mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48

mldangelo-oai closed this Apr 17, 2026

chatgpt-codex-connector bot reviewed Apr 17, 2026

mldangelo-oai wants to merge 1 commit into main from mdangelo/codex/fix-undetectable-payloads
