
fix: fail closed on pytorch zip timeouts #1045

Merged
mldangelo-oai merged 2 commits into main from mdangelo/codex/fix-pytorch-timeout-inconclusive on Apr 17, 2026

@mldangelo-oai
Contributor

Summary

  • mark PyTorch ZIP timeout paths as analysis_incomplete with an inconclusive scan outcome
  • finish timeout results as unsuccessful so partial archive coverage is not reported as complete
  • downgrade the timeout check itself to INFO while preserving fail-closed result semantics
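
For readers without the diff open, the three points above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SketchResult`, `scan_entries`, the `metadata` dictionary, and the `scan_outcome` key are hypothetical stand-ins, and only the `analysis_incomplete` flag, the inconclusive outcome, the unsuccessful finish, and the INFO severity come from the summary.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"


@dataclass
class SketchResult:
    """Hypothetical stand-in for a scanner result; not modelaudit's real class."""

    checks: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    success: bool = True

    def finish(self, success: bool) -> None:
        self.success = success


def scan_entries(entries, timeout_s, clock=time.monotonic):
    """Walk archive entries and fail closed if the time budget runs out."""
    result = SketchResult()
    start = clock()
    for index, name in enumerate(entries):
        if clock() - start > timeout_s:
            # The timeout check is informational, not a finding by itself.
            result.checks.append(
                {
                    "name": "scan timeout",
                    "severity": Severity.INFO,
                    "message": f"timed out before scanning {name}",
                }
            )
            # Fail closed: partial coverage must not look like a clean scan.
            result.metadata["analysis_incomplete"] = True
            result.metadata["scan_outcome"] = "inconclusive"
            result.finish(success=False)
            return result
        # Assumption: real per-entry scanning would happen here.
        result.metadata["entries_scanned"] = index + 1
    result.finish(success=True)
    return result
```

The point of the design is that the severity of the check and the outcome of the scan are decoupled: the timeout is logged at INFO, but a caller that only inspects `success` or the outcome still sees that the archive was not fully covered.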

Finding

Fixes finding 3: timeout returns a complete-looking result.

Validation

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_pytorch_zip_scanner.py::test_pytorch_zip_timeout_marks_inconclusive --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@github-actions
Contributor

github-actions bot commented Apr 17, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 192.04ms -> 193.44ms (+0.7%).

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small] | safe_small | 68 B | 1 | 53.2us | 56.3us | +5.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw] | nested_raw | 78 B | 1 | 95.6us | 100.5us | +5.2% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload | multi_stream_padded | 4.1 KiB | 1 | 129.9us | 136.4us | +5.0% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream | chunked_stream | 278.2 KiB | 1 | 6.66ms | 6.85ms | +2.9% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global] | stack_global | 21 B | 1 | 64.9us | 66.7us | +2.7% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64] | nested_base64 | 98 B | 1 | 100.0us | 102.7us | +2.7% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload | opcode_budget_tail | 14 B | 1 | 69.4us | 71.3us | +2.7% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 50.0us | 51.3us | +2.6% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files | - | 4.6 KiB | 256 | 13.51ms | 13.84ms | +2.4% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce] | malicious_reduce | 52 B | 1 | 75.1us | 76.8us | +2.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget | hidden_suspicious_string | 8.0 KiB | 1 | 585.1us | 593.0us | +1.4% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 29.9us | 30.3us | +1.3% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large] | safe_large | 278.2 KiB | 1 | 3.47ms | 3.50ms | +1.0% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 30.84ms | 31.14ms | +1.0% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 47.47ms | 47.87ms | +0.8% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex] | nested_hex | 130 B | 1 | 106.9us | 107.7us | +0.7% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 11.78ms | 11.81ms | +0.3% | stable |
| tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 75.88ms | 75.96ms | +0.1% | stable |
| tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string] | long_benign_string | 1.0 MiB | 1 | 1.08ms | 1.08ms | +0.0% | stable |
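
As a rough guide to reading the Change and Status columns, the sketch below recomputes a percentage change and compares it against the 15% threshold. The benchmark workflow's actual formula is not shown in this report, so the `classify` helper is hypothetical, and treating a drop of more than 15% as "improved" is an assumption.

```python
def classify(baseline: float, current: float, threshold_pct: float = 15.0):
    """Return (percent change rounded to 0.1, status) for one benchmark.

    Assumption: a slowdown beyond the threshold is a regression and a
    speedup beyond the same threshold counts as improved.
    """
    change = (current - baseline) / baseline * 100.0
    if change > threshold_pct:
        status = "regression"
    elif change < -threshold_pct:
        status = "improved"
    else:
        status = "stable"
    return round(change, 1), status


# Aggregate median from this report: 192.04ms -> 193.44ms
print(classify(192.04, 193.44))  # (0.7, 'stable')
```

Because the displayed timings are already rounded, recomputing from them can differ from the reported change in the last digit for some rows.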

@mldangelo-oai marked this pull request as ready for review on April 17, 2026 00:48
@mldangelo-oai force-pushed the mdangelo/codex/fix-pytorch-timeout-inconclusive branch from bc9654f to 110a02b on April 17, 2026 00:58
Contributor

@ianw-oai left a comment

Looks good. Tight fail-closed timeout fix with direct regression coverage.

@mldangelo-oai merged commit bf72f62 into main on Apr 17, 2026
28 checks passed
@mldangelo-oai deleted the mdangelo/codex/fix-pytorch-timeout-inconclusive branch on April 17, 2026 14:39