
fix: preserve scannable skipped ZIP containers #1028

Merged
mldangelo-oai merged 2 commits into main from mdangelo/codex/harden-routing-bypasses
Apr 16, 2026

@mldangelo-oai
Contributor

Summary

  • preserve skipped-suffix ZIP containers when config.json is structurally Keras, even without .keras metadata sidecars
  • sniff .bin archive members for bounded pickle/ZIP signals so Office-like ZIPs cannot hide model payloads solely behind a .docx suffix
  • keep ordinary Office documents and benign OLE .bin embeddings skipped
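A minimal sketch of the routing idea in the three bullets above, not the code in `modelaudit/utils/file/filtering.py`. Every name here (`zip_needs_scan`, `bin_member_is_suspicious`, `config_is_structurally_keras`, `SNIFF_BYTES`, `MAX_CONFIG_BYTES`) is hypothetical, and the specific signals are assumptions: a Keras `config.json` is taken to be a JSON object with a string `class_name` and a dict `config`, a pickle signal is the `PROTO` opcode `0x80` followed by a protocol byte of 2 to 5, and a benign OLE embedding is recognised by the compound-file magic bytes.

```python
import io
import json
import zipfile

SNIFF_BYTES = 64              # assumed bound on bytes read per .bin member
MAX_CONFIG_BYTES = 1_000_000  # assumed bound on config.json size
ZIP_MAGIC = b"PK\x03\x04"
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def looks_like_pickle(prefix: bytes) -> bool:
    """Protocol 2+ pickles start with PROTO (0x80) and a version byte."""
    return len(prefix) >= 2 and prefix[0] == 0x80 and 2 <= prefix[1] <= 5


def bin_member_is_suspicious(prefix: bytes) -> bool:
    """Flag pickle or nested-ZIP prefixes; leave OLE embeddings alone."""
    if prefix.startswith(OLE_MAGIC):
        return False
    return prefix.startswith(ZIP_MAGIC) or looks_like_pickle(prefix)


def config_is_structurally_keras(raw: bytes) -> bool:
    """Assumed structural check: object with class_name and config keys."""
    try:
        parsed = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False
    return (
        isinstance(parsed, dict)
        and isinstance(parsed.get("class_name"), str)
        and isinstance(parsed.get("config"), dict)
    )


def zip_needs_scan(data: bytes) -> bool:
    """Return True if a skipped-suffix ZIP should still be scanned."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return False
    with archive:
        for info in archive.infolist():
            name = info.filename.lower()
            if name == "config.json":
                with archive.open(info) as member:
                    if config_is_structurally_keras(member.read(MAX_CONFIG_BYTES)):
                        return True
            elif name.endswith(".bin"):
                with archive.open(info) as member:
                    if bin_member_is_suspicious(member.read(SNIFF_BYTES)):
                        return True
    return False


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP for experimenting with the checks above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()
```

Under these assumptions, `zip_needs_scan(make_zip({"word/embeddings/payload.bin": b"\x80\x04N."}))` flags a pickle hidden in an Office-like archive, while an archive whose only `.bin` member starts with the OLE magic stays skipped. Reading a bounded prefix keeps the extra cost per member small.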

Validation

  • uv run ruff format modelaudit/utils/file/filtering.py tests/utils/file/test_file_filter.py tests/test_directory_file_filtering.py
  • uv run ruff check modelaudit/utils/file/filtering.py tests/utils/file/test_file_filter.py tests/test_directory_file_filtering.py
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/file/test_file_filter.py tests/test_directory_file_filtering.py -q
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@github-actions
Contributor

github-actions bot commented Apr 15, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 183.23ms -> 183.56ms (+0.2%).

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload | opcode_budget_tail | 14 B | 1 | 61.1us | 58.0us | -5.1% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64] | nested_base64 | 98 B | 1 | 97.1us | 93.8us | -3.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw] | nested_raw | 78 B | 1 | 78.2us | 80.9us | +3.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small] | safe_small | 68 B | 1 | 46.4us | 47.7us | +2.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] | safe_large | 278.2 KiB | 1 | 4.46ms | 4.38ms | -1.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream | chunked_stream | 278.2 KiB | 1 | 7.47ms | 7.60ms | +1.7% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] | long_benign_string | 1.0 MiB | 1 | 1.17ms | 1.16ms | -1.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget | hidden_suspicious_string | 8.0 KiB | 1 | 546.3us | 539.8us | -1.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce] | malicious_reduce | 52 B | 1 | 64.4us | 65.0us | +1.0% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 44.1us | 43.7us | -1.0% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 24.2us | 24.0us | -0.7% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 44.45ms | 44.76ms | +0.7% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files | - | 4.6 KiB | 256 | 10.11ms | 10.06ms | -0.6% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global] | stack_global | 21 B | 1 | 53.3us | 53.0us | -0.5% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] | nested_hex | 130 B | 1 | 108.3us | 108.7us | +0.4% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 30.95ms | 30.82ms | -0.4% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 72.52ms | 72.73ms | +0.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload | multi_stream_padded | 4.1 KiB | 1 | 101.9us | 101.6us | -0.3% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 10.86ms | 10.85ms | -0.1% | stable |
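
For readers checking the Change and Status columns by hand, the arithmetic can be sketched as follows. This is an illustration, not the benchmark workflow's own script: the function names are hypothetical, and treating a speedup beyond the same 15% threshold as "improved" is an assumption (the report only states the regression threshold).

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark; the symmetric 'improved' rule is an assumption."""
    change = percent_change(baseline, current)
    if change > threshold_pct:
        return "regression"
    if change < -threshold_pct:
        return "improved"
    return "stable"


# First table row: 61.1us -> 58.0us
print(f"{percent_change(61.1, 58.0):+.1f}%", classify(61.1, 58.0))
# Aggregate median: 183.23ms -> 183.56ms
print(f"{percent_change(183.23, 183.56):+.1f}%", classify(183.23, 183.56))
```

Both baseline and current must be in the same unit for the ratio to be meaningful, which holds for each row of the report.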

mldangelo-oai marked this pull request as ready for review April 15, 2026 23:31
mldangelo-oai force-pushed the mdangelo/codex/harden-routing-bypasses branch from ebe538c to 64d895b April 16, 2026 06:31
mldangelo-oai merged commit 29747d5 into main Apr 16, 2026
24 checks passed
mldangelo-oai deleted the mdangelo/codex/harden-routing-bypasses branch April 16, 2026 06:31
github-actions bot mentioned this pull request Apr 16, 2026