fix: preserve active payload severities by mldangelo-oai · Pull Request #1046 · promptfoo/modelaudit

mldangelo-oai · 2026-04-17T00:39:18Z

Summary

- Pass check context into whitelist downgrade decisions.
- Keep active pickle payload, CVE, traversal, executable, operational-error, and incomplete-coverage findings at their original severity.
- Preserve existing INFO downgrades for non-exempt findings from trusted whitelisted HuggingFace provenance.
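
The behaviour described in the summary can be pictured with a minimal sketch. This is not the code in `modelaudit/scanners/base.py`: the `Severity`, `Finding`, `EXEMPT_RULE_CODES`, and `apply_whitelist` names are hypothetical, and the rule codes are only an illustrative subset of the codes mentioned in this PR. The idea is that a finding from a whitelisted model is downgraded to INFO unless its rule code is exempt, in which case it keeps its original severity.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Finding:
    rule_code: Optional[str]
    message: str
    severity: Severity


# Illustrative subset only (assumption); the project keeps its own exempt set.
EXEMPT_RULE_CODES = frozenset({"S101", "S115", "S301"})


def apply_whitelist(finding: Finding, model_is_whitelisted: bool) -> Finding:
    """Downgrade to INFO only for non-exempt findings on whitelisted models."""
    if not model_is_whitelisted:
        return finding  # no trusted provenance, nothing to downgrade
    if finding.rule_code in EXEMPT_RULE_CODES:
        return finding  # active-payload finding keeps its original severity
    return replace(finding, severity=Severity.INFO)


payload = Finding("S101", "os.system call in pickle", Severity.CRITICAL)
benign = Finding(None, "Large tensor metadata block", Severity.WARNING)

kept = apply_whitelist(payload, model_is_whitelisted=True)
downgraded = apply_whitelist(benign, model_is_whitelisted=True)
```

In this sketch `kept` stays CRITICAL while `downgraded` becomes INFO, which mirrors the split between exempt and non-exempt findings that the summary describes.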

Finding

Fixes finding 4: whitelist can downgrade active payloads to INFO.

Validation

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_base_scanner.py --maxfail=1
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

github-actions · 2026-04-17T00:40:23Z

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 200.84ms -> 205.71ms (+2.4%).

Benchmark	Target	Size	Files	Baseline	Current	Change	Status
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]`	`stack_global`	21 B	1	66.1us	70.8us	+7.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]`	`nested_raw`	78 B	1	98.1us	103.2us	+5.2%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	12.05ms	12.64ms	+4.9%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload`	`opcode_budget_tail`	14 B	1	73.3us	70.0us	-4.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]`	`malicious_reduce`	52 B	1	76.0us	79.4us	+4.4%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	32.61ms	33.88ms	+3.9%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]`	`safe_small`	68 B	1	55.6us	57.6us	+3.6%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload`	`multi_stream_padded`	4.1 KiB	1	138.6us	134.3us	-3.1%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files`	`-`	4.6 KiB	256	13.71ms	14.08ms	+2.7%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget`	`hidden_suspicious_string`	8.0 KiB	1	583.9us	598.8us	+2.6%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream`	`chunked_stream`	278.2 KiB	1	6.99ms	6.81ms	-2.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]`	`nested_base64`	98 B	1	102.3us	104.7us	+2.3%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory`	`mixed-corpus`	1.7 MiB	54	79.99ms	81.81ms	+2.3%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory`	`duplicate-corpus`	840.0 KiB	81	49.43ms	50.48ms	+2.1%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]`	`safe_large`	278.2 KiB	1	3.56ms	3.50ms	-1.9%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip`	`state_dict.pt`	1.5 MiB	1	52.4us	51.6us	-1.5%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]`	`long_benign_string`	1.0 MiB	1	1.11ms	1.10ms	-1.0%	stable
`tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle`	`safe_model.pkl`	49.4 KiB	1	30.4us	30.1us	-1.0%	stable
`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]`	`nested_hex`	130 B	1	108.6us	108.5us	-0.1%	stable

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b44107acc0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25e9c4180e

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51c7c572da

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb9ae83109

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6215deca0a

The previous exempt set covered S2xx pickle opcodes, a handful of S4xx/S5xx codes, and a keyword fallback over message/check name. It still let S1xx code-execution primitives (os/sys/subprocess/eval/compile/__import__/importlib/runpy/webbrowser/ctypes/builtins) and HIGH-severity S3xx network primitives (raw sockets, ftplib, telnetlib, exfiltration) through when the emitter used a generic message. Concretely, `flax_msgpack_scanner`'s "Suspicious code pattern detected: <regex>" CRITICAL findings were still being silently INFO-downgraded on whitelisted HuggingFace models.

Adds those rule codes to `_WHITELIST_DOWNGRADE_EXEMPT_RULE_CODES` and switches the keyword fallback to word-boundary regex matching so incidental substrings like "executable" inside "ExecuTorch" or "rce" inside "force" no longer over-suppress legitimate downgrades.

Adds parametrized regression coverage for S101–S110, S115, S301, S304, S305, S310, and for the word-boundary substring cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
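
To illustrate the switch from substring to word-boundary matching described in this commit message, here is a small sketch. The keyword tuple and both helper functions are hypothetical and are not taken from the repository. It only shows why `\b` anchors stop a keyword such as "rce" from firing on an unrelated word like "force".

```python
import re

# Hypothetical keyword list for illustration; the project's actual keywords may differ.
KEYWORDS = ("rce", "executable", "traversal")

# One compiled alternation, anchored on word boundaries and case-insensitive.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def substring_match(text: str) -> bool:
    """Old-style fallback: any keyword appearing anywhere in the text."""
    lowered = text.lower()
    return any(k in lowered for k in KEYWORDS)


def word_boundary_match(text: str) -> bool:
    """New-style fallback: keywords must appear as whole words."""
    return _KEYWORD_RE.search(text) is not None


incidental = "Unable to force reload of tokenizer config"
genuine = "Possible RCE via pickle REDUCE opcode"
```

With these definitions, `substring_match(incidental)` is true because "force" contains "rce", while `word_boundary_match(incidental)` is false. Both functions match `genuine`.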

fix: preserve active payload severities

b44107a

mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48

mldangelo-oai mentioned this pull request Apr 17, 2026

fix: close archive payload scan gaps #1042

Closed

chatgpt-codex-connector bot reviewed Apr 17, 2026

Comment thread modelaudit/scanners/base.py Outdated

fix: preserve executable whitelist severities

25e9c41

chatgpt-codex-connector bot reviewed Apr 17, 2026

Comment thread modelaudit/scanners/base.py

fix: preserve operational error severities

51c7c57

chatgpt-codex-connector bot reviewed Apr 17, 2026

Comment thread modelaudit/scanners/base.py Outdated

Comment thread modelaudit/scanners/base.py Outdated

fix: broaden whitelist severity exemptions

cb9ae83

chatgpt-codex-connector bot reviewed Apr 17, 2026

Comment thread modelaudit/scanner_results.py Outdated

Comment thread modelaudit/scanner_results.py

fix: refresh whitelist restore after metadata updates

6215dec

chatgpt-codex-connector bot reviewed Apr 17, 2026

Comment thread modelaudit/scanner_results.py

mldangelo-oai and others added 3 commits April 16, 2026 22:05

fix: remember pre-finish whitelist severity restores

2035037

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-wh…

f209e07

…itelist-active-payloads # Conflicts: # CHANGELOG.md

mldangelo-oai merged commit 13752e9 into main Apr 17, 2026
28 checks passed

mldangelo-oai deleted the mdangelo/codex/fix-whitelist-active-payloads branch April 17, 2026 15:24

github-actions bot mentioned this pull request Apr 17, 2026

chore: release main #1007

Merged

mldangelo-oai merged 8 commits into main from mdangelo/codex/fix-whitelist-active-payloads

2 participants
