
fix: bound pytorch zip jit reads #1048

Merged

mldangelo-oai merged 7 commits into main from mdangelo/codex/fix-pytorch-jit-bounded-reads on Apr 17, 2026

Conversation

@mldangelo-oai
Contributor

Summary

  • bound PyTorch ZIP JIT/network member reads with a configurable per-member cap
  • skip pickle members and numeric storage blobs in the JIT pass to avoid duplicate/raw memory-heavy scans
  • mark oversized JIT/network coverage inconclusive and finish unsuccessful when coverage is incomplete

Finding

Fixes finding 6: JIT scan reads full ZIP members into memory.

Validation

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_pytorch_zip_scanner.py::test_pytorch_zip_jit_scan_size_limit_marks_inconclusive --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@mldangelo-oai mldangelo-oai force-pushed the mdangelo/codex/fix-pytorch-jit-bounded-reads branch from c7a2d07 to 92b7eda on April 17, 2026 00:54
@github-actions
Contributor

github-actions bot commented Apr 17, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 194.79ms -> 191.82ms (-1.5%).

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
|---|---|---|---|---|---|---|---|
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload | opcode_budget_tail | 14 B | 1 | 71.0us | 68.5us | -3.6% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 31.8us | 30.6us | -3.6% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 53.3us | 51.4us | -3.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload | multi_stream_padded | 4.1 KiB | 1 | 134.5us | 130.0us | -3.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global] | stack_global | 21 B | 1 | 67.6us | 65.5us | -3.2% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files | - | 4.6 KiB | 256 | 13.64ms | 13.24ms | -2.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64] | nested_base64 | 98 B | 1 | 102.9us | 100.6us | -2.2% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 77.33ms | 75.89ms | -1.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw] | nested_raw | 78 B | 1 | 101.2us | 99.4us | -1.7% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 48.03ms | 47.27ms | -1.6% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream | chunked_stream | 278.2 KiB | 1 | 6.62ms | 6.51ms | -1.5% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] | safe_large | 278.2 KiB | 1 | 3.47ms | 3.51ms | +1.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] | nested_hex | 130 B | 1 | 109.2us | 108.1us | -1.0% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget | hidden_suspicious_string | 8.0 KiB | 1 | 584.9us | 590.5us | +0.9% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 31.55ms | 31.26ms | -0.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] | long_benign_string | 1.0 MiB | 1 | 1.09ms | 1.08ms | -0.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small] | safe_small | 68 B | 1 | 55.9us | 55.8us | -0.3% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 11.68ms | 11.66ms | -0.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce] | malicious_reduce | 52 B | 1 | 78.2us | 78.1us | -0.1% | stable |
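
The status column above can be read as a simple threshold comparison. A hedged sketch of the assumed classification logic (the bot's actual implementation is not shown in this PR; `classify` is an illustrative name):

```python
def classify(baseline: float, current: float, threshold: float = 0.15) -> str:
    """Label a benchmark against a relative regression threshold (assumed 15%)."""
    change = (current - baseline) / baseline
    if change > threshold:       # slower than baseline beyond the threshold
        return "regression"
    if change < -threshold:      # faster than baseline beyond the threshold
        return "improved"
    return "stable"
```

Under this reading, every row in the table falls well inside the ±15% band, hence "19 stable".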

@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 17, 2026 01:10

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92b7eda723

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/pytorch_zip_scanner.py Outdated
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: 392aba22d4

Comment thread modelaudit/scanners/pytorch_zip_scanner.py Outdated
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: d28f386e00

Comment thread modelaudit/scanners/pytorch_zip_scanner.py Outdated
mldangelo-oai and others added 2 commits April 16, 2026 21:34
…ants

Address the three nice-to-fix items from review:

1. Aggregate JIT/network-pass oversize and read-failure events into a
   single summary INFO check per kind. Adversarial archives with many
   unreachable members previously produced one INFO finding per member
   (verified live: 11 oversize members → 11 checks), flooding SARIF
   output and dashboards. Collect entries into `details["zip_entries"]`
   and `details["entries"]` instead, with a `skipped_count` /
   `failed_count` summary. The `mark_inconclusive_scan_result` call is
   hoisted out of the loop so the metadata reason is recorded once.

2. Document the identity-based pickle dedup. `pickle_files` and
   `safe_entries` share `ZipInfo` instances because both come from the
   same `infolist()` walk upstream, which is what makes `id()` work.
   A future refactor that rebuilds `pickle_files` from filenames or
   from a separate `infolist()` call would silently defeat the dedup;
   the inline comment now calls that out and suggests a fallback key.

3. Document the `pickle_members_scanned` proxy. It means the scanner is
   wired up, not that every pickle member was actually processed — if
   the pickle scanner crashed mid-scan on a member, that member is
   still skipped here. The trade-off is intentional; the comment makes
   it explicit.

Also document why `max_jit_scan_member_bytes=0` falls back to the
default (32 MiB) instead of meaning "unlimited" the way
`ZipScanner.max_entry_size=0` does: this pass cannot safely run
unbounded. Expand the CHANGELOG to mention the aggregation,
duplicate-name handling, pickle dedup, and directory-entry skip.

Test expectations updated: the existing size-limit and read-failure
tests now assert aggregation shape (single check, per-entry list),
and a new `test_pytorch_zip_jit_scan_aggregates_many_oversize_members_
into_one_check` proves 25 generated oversize members collapse to a
single check with a deduplicated reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
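
The aggregation described in item 1 of the commit message can be sketched as follows. Names and shapes here are illustrative, not the scanner's real API; only the pattern (collect during the loop, emit one summary check after it) comes from the commit text.

```python
def scan_members(members, cap):
    """Scan ZIP members, aggregating oversize skips into one INFO check."""
    skipped = []
    for info in members:
        if info.file_size > cap:
            # Collect instead of emitting one finding per member, so an
            # adversarial archive with many oversize members yields one check.
            skipped.append({"name": info.filename, "size": info.file_size})
            continue
        # ... bounded scan of the member would go here ...
    checks = []
    if skipped:
        checks.append({
            "severity": "INFO",
            "message": f"{len(skipped)} member(s) exceeded the JIT scan cap",
            "details": {
                "zip_entries": [e["name"] for e in skipped],
                "entries": skipped,
                "skipped_count": len(skipped),
            },
        })
        # Per the commit message, mark_inconclusive_scan_result(...) is
        # called once here, hoisted out of the loop.
    return checks
```

This mirrors the regression test described above, where 25 generated oversize members collapse into a single check.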
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: b8e0cd8574
Comment thread modelaudit/scanners/pytorch_zip_scanner.py Outdated
mldangelo added a commit that referenced this pull request Apr 17, 2026
Apply the same refinements this reviewer just landed on PR #1048 to the
hidden-pickle discovery path:

1. Aggregate probe-failure INFO checks. An adversarial archive with many
   members that raise on decompression (for example, unsupported methods
   or intermittent I/O) previously produced one `Pickle Discovery` INFO
   finding per member, flooding the checks list. Collect failures into
   a single summary check carrying the per-member exceptions under
   `details["entries"]`, with `details["zip_entries"]` and
   `details["failed_count"]` for quick consumers. `mark_inconclusive_
   scan_result` is hoisted out of the loop so the metadata reason is
   recorded exactly once even if many members fail.

2. Document the identity-based dedup invariant in
   `_discover_pickle_files`. Both passes iterate the same
   `safe_entries` list, so `id(ZipInfo)` is stable for the duration of
   discovery; a future refactor that rebuilds the list from separate
   `infolist()` walks or fresh `ZipInfo` constructions would silently
   defeat the dedup. Call that out inline so we don't re-learn it the
   hard way.

3. Explain the magic thresholds in `_looks_like_binary_pickle_prefix`.
   The `>= 4` clean-parse and `>= 2` truncation thresholds are tuned to
   balance tensor-storage false positives against real pickle prefixes;
   undocumented they read like arbitrary numbers.

4. Add "keep in sync" comments on the standalone picklescan copies of
   `_looks_like_binary_pickle_prefix` and `_looks_like_proto0_or_1_
   pickle` so the duplication (intentional for the standalone package)
   is at least signposted.

Expand the CHANGELOG bullet to describe the always-on second-pass sniff,
the fail-closed aggregation behavior, and the standalone-package mirror.
Adds a new `test_pytorch_zip_discovery_aggregates_probe_failures_
into_single_check` regression test that proves 5 failing members
collapse to one `Pickle Discovery` check with a deduplicated reason and
all five per-member records under `details["entries"]`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
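
The identity-based dedup invariant documented in item 2 can be sketched like this. The function and variable names are hypothetical; the point is that `id()` is only a valid key because both lists share the same `ZipInfo` instances from a single `infolist()` walk.

```python
def discover_hidden_pickles(safe_entries, pickle_files):
    """Return entries not already covered by the pickle pass.

    Invariant: pickle_files and safe_entries share ZipInfo instances
    (both come from the same infolist() walk), so id() works as a key.
    """
    already_scanned = {id(info) for info in pickle_files}
    candidates = []
    for info in safe_entries:
        if id(info) in already_scanned:
            # Would silently stop matching if pickle_files were rebuilt
            # from filenames or a fresh infolist() call -- the failure
            # mode the inline comment warns about.
            continue
        candidates.append(info)
    return candidates
```

A filename-based key (the "fallback key" the commit suggests) would survive such a refactor, at the cost of conflating duplicate member names.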
mldangelo-oai added a commit that referenced this pull request Apr 17, 2026
* fix: detect hidden pytorch zip pickles

* fix: fail closed on hidden pickle probe errors

* fix: aggregate hidden-pickle probe failures and document invariants

(Commit body repeats the "fix: aggregate hidden-pickle probe failures and document invariants" message quoted in full above.)

* fix: align picklescan proto0 probe trivia

---------

Co-authored-by: mldangelo <michael.l.dangelo@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…torch-jit-bounded-reads

# Conflicts:
#	CHANGELOG.md
#	modelaudit/scanners/pytorch_zip_scanner.py
#	tests/scanners/test_pytorch_zip_scanner.py
@mldangelo-oai mldangelo-oai merged commit f920d76 into main Apr 17, 2026
28 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-pytorch-jit-bounded-reads branch April 17, 2026 14:59
@github-actions github-actions bot mentioned this pull request Apr 17, 2026