
fix: detect hidden pytorch zip pickles #1043

Merged
mldangelo-oai merged 6 commits into main from
mdangelo/codex/fix-pytorch-hidden-pickles
Apr 17, 2026

Conversation

@mldangelo-oai
Contributor

Summary

  • always bounded-sniff unselected PyTorch ZIP members for hidden pickle streams, even after discovering data.pkl
  • route extensionless and storage-path pickle payloads into the pickle scanner while keeping benign storage blobs and pickle-ish text clean
  • mirror hidden-member discovery in the standalone modelaudit-picklescan PyTorch ZIP path
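
The bounded sniff described in the first bullet can be pictured with a minimal sketch. It assumes a fixed read bound and a simple protocol 2+ header check; `SNIFF_BYTES`, `looks_like_pickle_prefix`, and `find_hidden_pickles` are hypothetical names for illustration, not the scanner's actual helpers, and the real heuristics (including protocol 0/1 detection) are likely stricter.

```python
import io
import pickle
import zipfile

SNIFF_BYTES = 64  # hypothetical bound on bytes read per member


def looks_like_pickle_prefix(prefix: bytes) -> bool:
    """Cheap check: protocol 2+ pickles start with PROTO (0x80) and a version byte."""
    return len(prefix) >= 2 and prefix[0] == 0x80 and 2 <= prefix[1] <= 5


def find_hidden_pickles(zf: zipfile.ZipFile, selected: set[str]) -> list[str]:
    """Sniff every member that was NOT already selected (e.g. data.pkl)."""
    hidden = []
    for info in zf.infolist():
        if info.is_dir() or info.filename in selected:
            continue
        with zf.open(info) as member:
            prefix = member.read(SNIFF_BYTES)  # bounded read, never the whole member
        if looks_like_pickle_prefix(prefix):
            hidden.append(info.filename)
    return hidden


def build_demo_archive() -> bytes:
    """Build an in-memory ZIP shaped loosely like a PyTorch checkpoint."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("archive/data.pkl", pickle.dumps({"w": 1}, protocol=2))
        zf.writestr("archive/data/0", b"\x00" * 32)  # benign storage blob
        zf.writestr("archive/data/1", pickle.dumps([1, 2, 3], protocol=2))  # hidden pickle
        zf.writestr("archive/notes.txt", b"plain text that merely mentions pickle")
    return buf.getvalue()
```

Calling `find_hidden_pickles(zipfile.ZipFile(io.BytesIO(build_demo_archive())), {"archive/data.pkl"})` flags only the storage-path member that carries a pickle header, while the zero-filled blob and the text file stay clean.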

Finding

Fixes finding 1: hidden PyTorch ZIP pickles are skipped after data.pkl.

Validation

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_pytorch_zip_scanner.py packages/modelaudit-picklescan/tests/test_api.py --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@github-actions
Contributor

github-actions bot commented Apr 17, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 190.45ms -> 192.14ms (+0.9%).

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
|---|---|---|---|---|---|---|---|
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]` | safe_small | 68 B | 1 | 44.8us | 48.1us | +7.3% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload` | multi_stream_padded | 4.1 KiB | 1 | 103.8us | 107.2us | +3.3% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip` | state_dict.pt | 1.5 MiB | 1 | 32.16ms | 33.04ms | +2.8% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]` | nested_raw | 78 B | 1 | 80.0us | 78.0us | -2.5% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle` | safe_model.pkl | 49.4 KiB | 1 | 23.9us | 24.3us | +1.7% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip` | state_dict.pt | 1.5 MiB | 1 | 43.2us | 43.9us | +1.6% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]` | safe_large | 278.2 KiB | 1 | 3.77ms | 3.72ms | -1.4% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files` | - | 4.6 KiB | 256 | 10.14ms | 10.02ms | -1.3% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload` | opcode_budget_tail | 14 B | 1 | 54.9us | 55.6us | +1.2% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory` | mixed-corpus | 1.7 MiB | 54 | 77.06ms | 77.91ms | +1.1% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream` | chunked_stream | 278.2 KiB | 1 | 6.73ms | 6.66ms | -1.0% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]` | malicious_reduce | 52 B | 1 | 61.3us | 60.7us | -1.0% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle` | safe_model.pkl | 49.4 KiB | 1 | 12.03ms | 11.95ms | -0.7% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory` | duplicate-corpus | 840.0 KiB | 81 | 46.15ms | 46.43ms | +0.6% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]` | nested_base64 | 98 B | 1 | 80.4us | 79.9us | -0.6% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget` | hidden_suspicious_string | 8.0 KiB | 1 | 597.4us | 599.8us | +0.4% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]` | stack_global | 21 B | 1 | 52.0us | 52.1us | +0.3% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]` | long_benign_string | 1.0 MiB | 1 | 1.17ms | 1.18ms | +0.1% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]` | nested_hex | 130 B | 1 | 83.8us | 83.7us | -0.1% | stable |
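
For readers checking the Status column, here is a rough sketch of how a percent change could be compared against the 15% threshold. The function names are hypothetical, and treating improvements symmetrically (a drop of more than 15% counts as "improved") is an assumption; the report states only the regression threshold.

```python
REGRESSION_THRESHOLD_PCT = 15.0  # threshold stated in the benchmark report


def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float,
             threshold: float = REGRESSION_THRESHOLD_PCT) -> str:
    """Label a benchmark pair; the symmetric 'improved' cutoff is an assumption."""
    change = percent_change(baseline, current)
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improved"
    return "stable"
```

Applied to the aggregate medians in the report, `percent_change(190.45, 192.14)` rounds to the reported +0.9%, well inside the threshold. Per-row values recomputed from the rounded table entries can differ slightly from the reported percentages, presumably because the bot works from unrounded timings.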

@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4e6c684e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/modelaudit-picklescan/src/modelaudit_picklescan/api.py Outdated
@mldangelo-oai mldangelo-oai force-pushed the mdangelo/codex/fix-pytorch-hidden-pickles branch from f4e6c68 to a274fc8 Compare April 17, 2026 00:56
mldangelo-oai and others added 2 commits April 16, 2026 18:10
Apply the same refinements this reviewer just landed on PR #1048 to the
hidden-pickle discovery path:

1. Aggregate probe-failure INFO checks. An adversarial archive with many
   members that raise on decompression (for example, unsupported methods
   or intermittent I/O) previously produced one `Pickle Discovery` INFO
   finding per member, flooding the checks list. Collect failures into
   a single summary check carrying the per-member exceptions under
   `details["entries"]`, with `details["zip_entries"]` and
   `details["failed_count"]` for quick consumers.
   `mark_inconclusive_scan_result` is hoisted out of the loop so the
   metadata reason is recorded exactly once even if many members fail.

2. Document the identity-based dedup invariant in
   `_discover_pickle_files`. Both passes iterate the same
   `safe_entries` list, so `id(ZipInfo)` is stable for the duration of
   discovery; a future refactor that rebuilds the list from separate
   `infolist()` walks or fresh `ZipInfo` constructions would silently
   defeat the dedup. Call that out inline so we don't re-learn it the
   hard way.

3. Explain the magic thresholds in `_looks_like_binary_pickle_prefix`.
   The `>= 4` clean-parse and `>= 2` truncation thresholds are tuned to
   balance tensor-storage false positives against real pickle prefixes;
   undocumented, they read like arbitrary numbers.

4. Add "keep in sync" comments on the standalone picklescan copies of
   `_looks_like_binary_pickle_prefix` and
   `_looks_like_proto0_or_1_pickle` so the duplication (intentional for
   the standalone package) is at least signposted.

Expand the CHANGELOG bullet to describe the always-on second-pass sniff,
the fail-closed aggregation behavior, and the standalone-package mirror.
Adds a new
`test_pytorch_zip_discovery_aggregates_probe_failures_into_single_check`
regression test that proves 5 failing members collapse to one
`Pickle Discovery` check with a deduplicated reason and all five
per-member records under `details["entries"]`.
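
The aggregation behavior from item 1, which this regression test exercises, might look roughly like the sketch below. The detail keys `entries`, `zip_entries`, and `failed_count` come from the message above; everything else (the function name, the `probe` callable, the `reasons` key, the overall shape of the check dict, and the assumption that `zip_entries` lists the failing member names) is hypothetical.

```python
from typing import Callable, Optional


def aggregate_probe_failures(
    member_names: list[str],
    probe: Callable[[str], None],
) -> Optional[dict]:
    """Probe every member; fold all failures into one summary check."""
    failures = []
    for name in member_names:
        try:
            probe(name)
        except Exception as exc:  # record the failure and keep scanning
            failures.append({"entry": name, "exception": type(exc).__name__})
    if not failures:
        return None
    return {
        "name": "Pickle Discovery",
        "severity": "INFO",
        # Deduplicated reasons: five identical failures yield one reason.
        "reasons": sorted({f["exception"] for f in failures}),
        "details": {
            "entries": failures,  # per-member records
            "zip_entries": [f["entry"] for f in failures],  # assumed meaning
            "failed_count": len(failures),
        },
    }


def failing_probe(name: str) -> None:
    """Stand-in probe that fails for members marked as bad."""
    if name.startswith("bad"):
        raise NotImplementedError("unsupported compression method")
```

With five `bad*` members and one good one, the function returns a single check whose `failed_count` is 5 rather than five separate findings.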

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 234dfe66d3

Comment thread packages/modelaudit-picklescan/src/modelaudit_picklescan/api.py Outdated
…torch-hidden-pickles

# Conflicts:
#	CHANGELOG.md
…torch-hidden-pickles

# Conflicts:
#	CHANGELOG.md
#	modelaudit/scanners/pytorch_zip_scanner.py
@mldangelo-oai mldangelo-oai merged commit 19b6ebe into main Apr 17, 2026
31 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-pytorch-hidden-pickles branch April 17, 2026 14:50
This was referenced Apr 17, 2026

2 participants