fix: route ONNX pb files by content by mldangelo-oai · Pull Request #1029 · promptfoo/modelaudit

mldangelo-oai · 2026-04-15T18:26:04Z

Summary

detect ONNX content hints before falling back to the generic .pb/protobuf extension route
treat ONNX-in-.pb as valid file-type routing and suppress the expected mismatch warning
add filetype and core dispatch regressions for ONNX payloads saved with a .pb suffix
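The routing order described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in modelaudit/utils/file/detection.py or modelaudit/core.py: the function names `route_pb_file` and `mismatch_warning`, the format strings, and the injected `looks_like_onnx` predicate are all hypothetical.

```python
from typing import Callable, Optional


def route_pb_file(suffix: str, header: bytes, looks_like_onnx: Callable[[bytes], bool]) -> str:
    """Pick a scanner route for a file, checking ONNX content hints before the extension.

    `looks_like_onnx` is a caller-supplied predicate (hypothetical here).
    """
    if suffix.lower() != ".pb":
        return "unknown"
    if looks_like_onnx(header):
        # Content wins: an ONNX payload saved as .pb goes to the ONNX route.
        return "onnx"
    # Fall back to the generic extension-based route.
    return "protobuf"


def mismatch_warning(extension_format: str, content_format: str) -> Optional[str]:
    """Return a warning string, or None when the pairing is expected."""
    if extension_format == content_format:
        return None
    if (extension_format, content_format) == ("protobuf", "onnx"):
        # ONNX-in-.pb is treated as valid routing, so no warning is emitted.
        return None
    return f"extension suggests {extension_format} but content looks like {content_format}"
```

Passing the predicate in as an argument keeps the sketch independent of how the content hint is actually computed.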

Validation

uv run ruff format modelaudit/utils/file/detection.py modelaudit/core.py tests/utils/file/test_filetype.py tests/test_core.py
uv run ruff check modelaudit/utils/file/detection.py modelaudit/core.py tests/utils/file/test_filetype.py tests/test_core.py
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/file/test_filetype.py tests/test_core.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

github-actions · 2026-04-15T18:27:24Z

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 167.39ms -> 168.15ms (+0.5%).

Benchmark	Target	Size	Files	Baseline	Current	Change	Status
`tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	21.0us	22.5us	+7.1%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory`	`duplicate-corpus`	840.0 KiB	81	39.80ms	40.92ms	+2.8%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	37.4us	38.2us	+2.0%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files`	`-`	4.6 KiB	256	10.35ms	10.55ms	+1.9%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]`	`safe_small`	68 B	1	52.6us	53.6us	+1.9%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload`	`multi_stream_padded`	4.1 KiB	1	110.8us	112.6us	+1.6%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory`	`mixed-corpus`	1.7 MiB	54	66.53ms	65.60ms	-1.4%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	9.91ms	10.04ms	+1.4%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]`	`nested_hex`	130 B	1	119.8us	121.3us	+1.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload`	`opcode_budget_tail`	14 B	1	69.8us	69.0us	-1.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]`	`stack_global`	21 B	1	61.5us	60.9us	-1.0%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream`	`chunked_stream`	278.2 KiB	1	6.85ms	6.92ms	+1.0%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]`	`nested_base64`	98 B	1	105.8us	104.8us	-0.9%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	27.79ms	27.98ms	+0.7%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]`	`safe_large`	278.2 KiB	1	3.93ms	3.91ms	-0.6%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget`	`hidden_suspicious_string`	8.0 KiB	1	508.6us	511.6us	+0.6%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]`	`malicious_reduce`	52 B	1	72.2us	71.8us	-0.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]`	`long_benign_string`	1.0 MiB	1	972.6us	971.2us	-0.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]`	`nested_raw`	78 B	1	89.4us	89.4us	-0.1%	stable
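
For reference, the Change and Status columns are consistent with a simple relative-change calculation against the 15% threshold. The sketch below assumes that an improvement is flagged symmetrically (a drop of more than 15%); the report only states the regression threshold, so that part and the function names are assumptions.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark; the symmetric 'improved' rule is an assumption."""
    change = percent_change(baseline, current)
    if change > threshold_pct:
        return "regression"
    if change < -threshold_pct:
        return "improved"
    return "stable"


# Aggregate median from the report: 167.39ms -> 168.15ms
print(f"{percent_change(167.39, 168.15):+.1f}%")
# First table row: 21.0us -> 22.5us
print(f"{percent_change(21.0, 22.5):+.1f}%", classify(21.0, 22.5))
```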

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca7e41601e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-15T23:34:52Z

+
+def _looks_like_onnx_model_candidate_file(path: Path, size: int, header: bytes) -> bool:
+    """Run the bounded ONNX parser only for plausible protobuf tag starts."""
+    return _has_onnx_model_tag_start(header) and _looks_like_onnx_model_file(path, size)


Parse ONNX candidates without first-byte whitelisting

_looks_like_onnx_model_candidate_file short-circuits on _has_onnx_model_tag_start(header). Protobuf allows unknown and reordered fields, so a valid ONNX ModelProto can legally start with a different tag (e.g., when an unknown field comes first). In that case the function returns False, so a malicious .pb is routed as generic protobuf and bypasses ONNX-specific checks such as PythonOp detection.
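
A small sketch of why a first-byte check can miss a valid payload. It hand-encodes protobuf varint fields; `ILLUSTRATIVE_START_BYTES` is a made-up whitelist for demonstration and is not the project's actual set. The only ONNX facts relied on are that ModelProto's ir_version is field 1 (varint) and graph is field 7 (length-delimited); field number 1000 stands in for an arbitrary unknown field.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A protobuf tag is (field_number << 3) | wire_type, varint-encoded."""
    return encode_varint((field_number << 3) | wire_type)


def varint_field(field_number: int, value: int) -> bytes:
    """Encode one wire-type-0 (varint) field."""
    return encode_tag(field_number, 0) + encode_varint(value)


# Hypothetical whitelist: tag bytes for ir_version (field 1) and graph (field 7).
ILLUSTRATIVE_START_BYTES = {encode_tag(1, 0)[0], encode_tag(7, 2)[0]}

plain = varint_field(1, 8)                 # starts with the ir_version tag
prefixed = varint_field(1000, 1) + plain   # same message with an unknown field first

print(plain[0] in ILLUSTRATIVE_START_BYTES)     # the plain payload passes the check
print(prefixed[0] in ILLUSTRATIVE_START_BYTES)  # the prefixed payload does not
```

Both byte strings carry the same ir_version field, yet only the first passes a first-byte whitelist.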


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a711c276a


chatgpt-codex-connector · 2026-04-16T06:36:37Z

+
+def _looks_like_onnx_model_candidate_file(path: Path, size: int, header: bytes) -> bool:
+    """Run the bounded ONNX parser only for plausible protobuf tag starts."""
+    return _has_onnx_model_tag_start(header) and _looks_like_onnx_model_file(path, size)


Parse ONNX candidates without first-byte whitelist

_looks_like_onnx_model_candidate_file() only runs the ONNX structural parser when header[0] is in _ONNX_MODEL_TOP_LEVEL_TAG_START_BYTES. Protobuf permits reordered/unknown fields, so valid ModelProto payloads can start with other tags (e.g., unknown-field prefixes or other top-level fields). Those .pb files are then routed as generic protobuf, skipping ONNX-specific checks like PythonOp detection.
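
One way to avoid depending on the first byte is to walk a bounded number of top-level fields and skip any that are not recognized. The sketch below is an assumption about how such a scan could look, not the parser in this PR; `top_level_fields`, `max_fields`, and the final "field 1 plus field 7" heuristic are hypothetical.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf) or shift > 63:
            raise ValueError("truncated or oversized varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def top_level_fields(buf: bytes, max_fields: int = 64) -> List[Tuple[int, int]]:
    """List (field_number, wire_type) pairs, skipping payloads, up to max_fields."""
    fields: List[Tuple[int, int]] = []
    pos = 0
    while pos < len(buf) and len(fields) < max_fields:
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:            # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:          # 64-bit fixed
            pos += 8
        elif wire_type == 2:          # length-delimited
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:          # 32-bit fixed
            pos += 4
        else:
            raise ValueError("unsupported wire type")
        if pos > len(buf):
            raise ValueError("field overruns buffer")
        fields.append((field_number, wire_type))
    return fields


# Unknown field 1000 first, then ir_version (field 1), then an empty graph (field 7).
payload = b"\xc0\x3e\x01" + b"\x08\x08" + b"\x3a\x00"
seen = top_level_fields(payload)
# Hypothetical heuristic: ir_version and graph both present at the top level.
looks_like_model = (1, 0) in seen and (7, 2) in seen
```

The `max_fields` cap keeps the scan bounded on large or adversarial inputs, and malformed data raises `ValueError` so the caller can fall back to the generic protobuf route.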


mldangelo-oai marked this pull request as ready for review April 15, 2026 23:31

chatgpt-codex-connector bot reviewed Apr 15, 2026


mldangelo-oai added 2 commits April 15, 2026 23:31

fix: route onnx pb files by content

1940b4e

fix: harden onnx protobuf routing

7a711c2

mldangelo-oai force-pushed the mdangelo/codex/route-onnx-pb-by-content branch from ca7e416 to 7a711c2 April 16, 2026 06:32

mldangelo-oai merged commit 6e9aa45 into main Apr 16, 2026
8 checks passed

mldangelo-oai deleted the mdangelo/codex/route-onnx-pb-by-content branch April 16, 2026 06:32

github-actions bot mentioned this pull request Apr 16, 2026

chore: release main #1007

Merged

chatgpt-codex-connector bot reviewed Apr 16, 2026

