fix: scan generic archive python handlers by mldangelo-oai · Pull Request #1047 · promptfoo/modelaudit

mldangelo-oai · 2026-04-17T00:47:32Z

Summary

add shared archive-member source helpers for bounded Python static analysis
scan generic ZIP/TAR Python members for dangerous handler code, while skipping NPZ source-like internals
preserve benign Python source members and mark oversized Python-member coverage incomplete

Finding

Fixes finding 5: generic archives pass Python handlers as unknown.

Validation

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_zip_scanner.py::test_scan_zip_flags_dangerous_python_member tests/scanners/test_zip_scanner.py::test_scan_zip_ignores_benign_python_member tests/scanners/test_tar_scanner.py::TestTarScanner::test_scan_tar_flags_dangerous_python_member tests/scanners/test_tar_scanner.py::TestTarScanner::test_scan_tar_ignores_benign_python_member --maxfail=1
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

github-actions · 2026-04-17T00:48:39Z

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 199.02ms -> 196.60ms (-1.2%).

Benchmark	Target	Size	Files	Baseline	Current	Change	Status
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]`	`long_benign_string`	1.0 MiB	1	1.15ms	1.09ms	-5.7%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]`	`nested_base64`	98 B	1	102.1us	105.5us	+3.3%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files`	`-`	4.6 KiB	256	13.58ms	13.18ms	-3.0%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	52.7us	51.4us	-2.4%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	30.6us	29.9us	-2.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload`	`multi_stream_padded`	4.1 KiB	1	136.3us	133.9us	-1.8%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget`	`hidden_suspicious_string`	8.0 KiB	1	575.5us	585.8us	+1.8%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory`	`mixed-corpus`	1.7 MiB	54	79.45ms	78.15ms	-1.6%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	12.24ms	12.06ms	-1.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]`	`safe_small`	68 B	1	55.2us	54.4us	-1.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]`	`stack_global`	21 B	1	65.9us	66.7us	+1.2%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream`	`chunked_stream`	278.2 KiB	1	6.73ms	6.80ms	+1.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]`	`nested_raw`	78 B	1	101.7us	100.7us	-0.9%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory`	`duplicate-corpus`	840.0 KiB	81	48.47ms	48.06ms	-0.8%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload`	`opcode_budget_tail`	14 B	1	73.7us	74.1us	+0.6%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	32.53ms	32.39ms	-0.4%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]`	`nested_hex`	130 B	1	110.0us	109.5us	-0.4%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]`	`safe_large`	278.2 KiB	1	3.48ms	3.49ms	+0.3%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]`	`malicious_reduce`	52 B	1	77.0us	76.9us	-0.1%	stable

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c93337cabd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 67490754a0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 124db27959

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Address three review findings on the generic ZIP/TAR Python-member scan: 1. Rule-code attribution. Previously every high-risk finding was emitted under S104 (eval/exec) regardless of the actual call, so SARIF export, dashboards, and per-rule severity overrides lost the distinction between `os.system` (S101), `subprocess.run` (S103), `importlib. import_module` (S107), `pickle.load`/`pickle.loads` (S213), bare `__import__` (S106), and `eval`/`exec` (S104). Emit one finding per rule code, with the rule code derived from the resolved call name. 2. Config override. The generic ZIP path silently ignored `max_mar_python_analysis_bytes`, so operators who raised or lowered the cap for MAR archives never saw it applied to ordinary ZIPs. Extract a `_max_python_member_analysis_bytes()` helper that the MAR path also consumes, keeping both paths in sync. 3. Code duplication. Move the ~60-line dispatch body that was copy-pasted between `zip_scanner._scan_generic_member_security` and `tar_scanner._scan_generic_member_security` into a shared `scan_archive_member_for_known_risks` helper in `archive_member_security.py`, parameterised by archive kind, analysis cap, and inconclusive-reason string. Also pass source bytes to `ast.parse` directly so PEP 263 encoding declarations (`# -*- coding: latin-1 -*-`) are honored instead of being lossily replaced before parsing — verified with a regression test that embeds a non-UTF-8 byte in a comment. Adds TAR executable-member parity test, parametrized rule-code tests for ZIP and TAR, a mixed-risk test asserting one finding per rule code, a config-override test, a PEP 263 test, and short comments on the deliberate `getattr` string-concat limitation and the `Starred` binding-loss case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 60211ccfcb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86609d64d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…neric-archive-python-handlers # Conflicts: # CHANGELOG.md

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bfd96c6149

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…neric-archive-python-handlers # Conflicts: # CHANGELOG.md

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d1825024e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 94f9dddc04

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-17T15:18:55Z

+    def visit_Assign(self, node: ast.Assign) -> None:
+        self.visit(node.value)
+        for target in node.targets:
+            self._bind_target_to_value(target, node.value)
+            self.visit(target)


Handle match-case alias branches before mutating alias scope

_HighRiskPythonCallVisitor merges aliases for if/while/try, but not match. Case bodies are visited sequentially, so visit_Assign mutates aliases as if every case executed. For match __name__: case 'a': import subprocess as sp; case _: import os as sp; sp.run(...), sp ends up as os and subprocess.run is missed, creating a false-negative bypass in archive Python-member security checks.

Useful? React with 👍 / 👎.

fix: scan generic archive python handlers

c93337c

mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48

mldangelo-oai mentioned this pull request Apr 17, 2026

fix: close archive payload scan gaps #1042

Closed

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py Outdated

Comment thread modelaudit/scanners/zip_scanner.py Outdated

mldangelo-oai added 2 commits April 16, 2026 17:57

fix: tighten archive python member detection

de1a1ca

fix: scan npz archive member payloads

6749075

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py Outdated

Comment thread modelaudit/scanners/archive_member_security.py Outdated

fix: harden archive python member analysis

124db27

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py

Comment thread modelaudit/scanners/archive_member_security.py Outdated

mldangelo-oai and others added 2 commits April 16, 2026 21:50

fix: resolve archive python call aliases

aac8614

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py Outdated

fix: ignore class aliases in archive method scans

86609d6

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py Outdated

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ge…

bfd96c6

…neric-archive-python-handlers # Conflicts: # CHANGELOG.md

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py

mldangelo-oai added 2 commits April 17, 2026 07:58

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ge…

4860df0

…neric-archive-python-handlers # Conflicts: # CHANGELOG.md

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ge…

1d18250

…neric-archive-python-handlers # Conflicts: # CHANGELOG.md

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread modelaudit/scanners/archive_member_security.py Outdated

Comment thread modelaudit/scanners/archive_member_security.py Outdated

fix: preserve archive python aliases across branches

94f9ddd

mldangelo-oai merged commit 5b90a84 into main Apr 17, 2026
28 checks passed

mldangelo-oai deleted the mdangelo/codex/fix-generic-archive-python-handlers branch April 17, 2026 15:17

github-actions bot mentioned this pull request Apr 17, 2026

chore: release main #1007

Merged

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Conversation

mldangelo-oai commented Apr 17, 2026

Summary

Finding

Validation

Uh oh!

github-actions bot commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Performance Benchmarks

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Apr 17, 2026 •

edited

Loading