
fix: close archive payload scan gaps #1042

Closed
mldangelo-oai wants to merge 1 commit into main from mdangelo/codex/fix-undetectable-payloads

@mldangelo-oai
Contributor

Summary

  • sniff hidden PyTorch ZIP pickle members even when data.pkl exists, including storage-looking paths, and fail closed on discovery/JIT coverage limits
  • detect extensionless protocol 0/1 pickles in 7z probes and mark PyTorch/manifest timeouts inconclusive
  • keep active payload/CVE/path traversal/incomplete findings from being downgraded by HuggingFace whitelist provenance
  • scan generic ZIP/TAR Python members for dangerous handlers while preserving benign source and storage near-match negatives
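A minimal sketch of the content-sniffing idea in the first two bullets, not the code from this PR. It assumes a probe of the first few bytes of each ZIP member: protocol 2+ streams are recognised by the `PROTO` opcode, and headerless protocol 0/1 streams by decoding leading opcodes with `pickletools.genops`. The names `looks_like_pickle`, `find_pickle_like_members`, `PROBE_BYTES`, and `MIN_OPCODES` are hypothetical, and the heuristic can produce false positives on short text that happens to decode as opcodes.

```python
import io
import pickle
import pickletools
import zipfile

PROBE_BYTES = 64  # hypothetical probe size, not taken from the PR
MIN_OPCODES = 4   # hypothetical threshold for headerless protocol 0/1 streams


def looks_like_pickle(prefix: bytes) -> bool:
    """Heuristically decide whether `prefix` starts a pickle stream."""
    # Protocol 2+ streams begin with the PROTO opcode (0x80) and a version byte.
    if len(prefix) >= 2 and prefix[0] == 0x80 and 2 <= prefix[1] <= pickle.HIGHEST_PROTOCOL:
        return True
    # Protocol 0/1 streams have no header, so try to decode leading opcodes.
    decoded = 0
    try:
        for opcode, _arg, _pos in pickletools.genops(prefix):
            decoded += 1
            if decoded >= MIN_OPCODES or (opcode.name == "STOP" and decoded > 1):
                return True
    except Exception:
        # Unknown opcode, truncated argument, or exhausted input.
        return False
    return False


def find_pickle_like_members(archive: zipfile.ZipFile) -> list[str]:
    """Return member names whose leading bytes look like a pickle.

    Members are judged by content only, so a pickle hidden under a
    storage-looking path such as `archive/data/0` is still reported.
    Read-error handling is omitted here for brevity.
    """
    hits = []
    for info in archive.infolist():
        if info.is_dir():
            continue
        with archive.open(info) as member:
            prefix = member.read(PROBE_BYTES)
        if looks_like_pickle(prefix):
            hits.append(info.filename)
    return hits
```

Sniffing by content rather than by name or extension is what lets a scanner keep looking after it has already found `data.pkl`; the cost is one small read per member.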

Validation

  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_pytorch_zip_scanner.py tests/scanners/test_sevenzip_scanner.py tests/scanners/test_zip_scanner.py tests/scanners/test_tar_scanner.py tests/scanners/test_manifest_scanner.py tests/scanners/test_base_scanner.py packages/modelaudit-picklescan/tests/test_api.py --maxfail=1
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@github-actions
Contributor

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 2 improved, 17 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 170.83ms -> 148.68ms (-13.0%).
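The percentages in this report can be reproduced with a few lines. This is a rough sketch only: the function names are hypothetical, and treating the 15% threshold as symmetric (a slowdown of at least 15% is a regression, a speedup of at least 15% is an improvement) is an assumption inferred from the statuses shown here, not something the report states.

```python
def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Relative change of `current_ms` against `baseline_ms`, in percent."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def classify(baseline_ms: float, current_ms: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark; the symmetric threshold is an assumption."""
    change = percent_change(baseline_ms, current_ms)
    if change >= threshold_pct:
        return "regression"
    if change <= -threshold_pct:
        return "improved"
    return "stable"


# Aggregate median from the report: 170.83ms -> 148.68ms
aggregate = round(percent_change(170.83, 148.68), 1)
```

Under that assumption the aggregate median change of -13.0% would itself count as stable, even though two individual benchmarks cross the threshold.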

Top improvements:

  • tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip -34.0% (28.86ms -> 19.06ms, state_dict.pt, size=1.5 MiB, files=1)
  • tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory -16.9% (68.61ms -> 57.03ms, mixed-corpus, size=1.7 MiB, files=54)

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 28.86ms | 19.06ms | -34.0% | improved |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 68.61ms | 57.03ms | -16.9% | improved |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream | chunked_stream | 278.2 KiB | 1 | 6.45ms | 6.21ms | -3.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] | safe_large | 278.2 KiB | 1 | 3.37ms | 3.26ms | -3.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload | multi_stream_padded | 4.1 KiB | 1 | 115.6us | 118.9us | +2.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global] | stack_global | 21 B | 1 | 60.4us | 61.7us | +2.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small] | safe_small | 68 B | 1 | 51.2us | 52.3us | +2.1% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 21.4us | 21.9us | +2.0% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce] | malicious_reduce | 52 B | 1 | 70.3us | 71.4us | +1.5% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files | - | 4.6 KiB | 256 | 10.70ms | 10.55ms | -1.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget | hidden_suspicious_string | 8.0 KiB | 1 | 552.7us | 558.6us | +1.1% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] | nested_hex | 130 B | 1 | 97.6us | 98.5us | +1.0% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload | opcode_budget_tail | 14 B | 1 | 66.7us | 67.3us | +0.9% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 40.24ms | 39.98ms | -0.6% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 10.20ms | 10.18ms | -0.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] | long_benign_string | 1.0 MiB | 1 | 1.14ms | 1.14ms | -0.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw] | nested_raw | 78 B | 1 | 90.6us | 90.4us | -0.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64] | nested_base64 | 98 B | 1 | 92.5us | 92.6us | +0.1% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 37.8us | 37.8us | +0.1% | stable |

@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48
@mldangelo-oai
Contributor Author

Superseded by the six split draft PRs: #1043, #1044, #1045, #1046, #1047, and #1048.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85effaabf4


Comment on lines +385 to +386
    if _zip_entry_looks_like_pickle(archive, entry):
        add_entry(entry)

P1: Guard hidden-pickle probing against ZIP member read failures

Second-pass discovery probes every non-selected ZIP entry via _zip_entry_looks_like_pickle without exception handling. archive.open(...).read(...) can raise RuntimeError (e.g., encrypted members) or decompression errors, and this path is outside scan_file's handled exceptions (OSError, BadZipFile). A single unreadable non-pickle member can abort scan_file instead of returning a structured report.
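One way to address this, sketched under assumptions rather than taken from the PR: wrap each member read so that a failure is recorded against that member instead of propagating. `probe_members` and `PROBE_ERRORS` are hypothetical names; the exception types are the ones the standard `zipfile` module is known to raise for encrypted members (`RuntimeError`), unsupported compression or flags (`NotImplementedError`), and corrupt data (`zipfile.BadZipFile`, `zlib.error`, `EOFError`, `OSError`).

```python
import io
import zipfile
import zlib

# Errors that reading one ZIP member may raise; the grouping is an assumption.
PROBE_ERRORS = (
    RuntimeError,         # e.g. encrypted member opened without a password
    NotImplementedError,  # unsupported compression method or flag bits
    zipfile.BadZipFile,   # bad local header or CRC mismatch
    zlib.error,           # corrupt deflate stream
    EOFError,
    OSError,
)


def probe_members(archive: zipfile.ZipFile, size: int = 64):
    """Read a short prefix of every member without letting one failure abort.

    Returns (prefixes, unreadable): `prefixes` maps member names to their
    leading bytes, `unreadable` maps member names to an error description
    that a scanner could surface as an inconclusive finding.
    """
    prefixes = {}
    unreadable = {}
    for info in archive.infolist():
        if info.is_dir():
            continue
        try:
            with archive.open(info) as member:
                prefixes[info.filename] = member.read(size)
        except PROBE_ERRORS as exc:
            unreadable[info.filename] = f"{type(exc).__name__}: {exc}"
    return prefixes, unreadable
```

Reporting unreadable members separately, rather than silently skipping them, keeps the fail-closed behaviour the PR summary aims for: a member that could not be probed is not treated as clean.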
