fix: preserve active payload severities #1046

Merged
mldangelo-oai merged 8 commits into main from mdangelo/codex/fix-whitelist-active-payloads
Apr 17, 2026

@mldangelo-oai
Contributor

Summary

  • pass check context into whitelist downgrade decisions
  • keep active pickle payload, CVE, traversal, executable, operational-error, and incomplete-coverage findings at their original severity
  • preserve existing INFO downgrades for non-exempt findings from trusted whitelisted HuggingFace provenance
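
The decision described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `CheckContext`, `effective_severity`, and the two-entry `EXEMPT_RULE_CODES` set are hypothetical names chosen for the example, and the real exempt set is larger.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass
class CheckContext:
    """Hypothetical bundle of what the downgrade decision may inspect."""

    name: str
    message: str
    rule_code: Optional[str] = None


# Illustrative subset only; the actual exempt codes are defined in the PR's diff.
EXEMPT_RULE_CODES = frozenset({"S101", "S301"})


def effective_severity(severity: Severity, ctx: CheckContext, whitelisted: bool) -> Severity:
    """Return the severity to report for a finding."""
    if not whitelisted:
        # No trusted provenance: nothing is downgraded.
        return severity
    if ctx.rule_code in EXEMPT_RULE_CODES:
        # Active-payload style findings keep their original severity.
        return severity
    # Non-exempt findings from a whitelisted model are still downgraded.
    return Severity.INFO
```

Passing the context in, rather than only the severity, is what lets the function tell an exempt finding apart from one that may safely drop to INFO.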

Finding

Fixes finding 4: whitelist can downgrade active payloads to INFO.

Validation

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_base_scanner.py --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

@github-actions
Contributor

github-actions bot commented Apr 17, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 19 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 19 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 200.84ms -> 205.71ms (+2.4%).

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[stack_global]` | stack_global | 21 B | 1 | 66.1us | 70.8us | +7.1% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_raw]` | nested_raw | 78 B | 1 | 98.1us | 103.2us | +5.2% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle` | safe_model.pkl | 49.4 KiB | 1 | 12.05ms | 12.64ms | +4.9% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_opcode_budget_tail_payload` | opcode_budget_tail | 14 B | 1 | 73.3us | 70.0us | -4.5% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_dangerous_global_payloads[malicious_reduce]` | malicious_reduce | 52 B | 1 | 76.0us | 79.4us | +4.4% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip` | state_dict.pt | 1.5 MiB | 1 | 32.61ms | 33.88ms | +3.9% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_small]` | safe_small | 68 B | 1 | 55.6us | 57.6us | +3.6% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_multi_stream_padded_payload` | multi_stream_padded | 4.1 KiB | 1 | 138.6us | 134.3us | -3.1% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_skip_filter_plain_text_files` | - | 4.6 KiB | 256 | 13.71ms | 14.08ms | +2.7% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_hidden_suspicious_string_budget` | hidden_suspicious_string | 8.0 KiB | 1 | 583.9us | 598.8us | +2.6% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_stream` | chunked_stream | 278.2 KiB | 1 | 6.99ms | 6.81ms | -2.5% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_base64]` | nested_base64 | 98 B | 1 | 102.3us | 104.7us | +2.3% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory` | mixed-corpus | 1.7 MiB | 54 | 79.99ms | 81.81ms | +2.3% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory` | duplicate-corpus | 840.0 KiB | 81 | 49.43ms | 50.48ms | +2.1% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[safe_large]` | safe_large | 278.2 KiB | 1 | 3.56ms | 3.50ms | -1.9% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip` | state_dict.pt | 1.5 MiB | 1 | 52.4us | 51.6us | -1.5% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_safe_payloads[long_benign_string]` | long_benign_string | 1.0 MiB | 1 | 1.11ms | 1.10ms | -1.0% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle` | safe_model.pkl | 49.4 KiB | 1 | 30.4us | 30.1us | -1.0% | stable |
| `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payloads[nested_hex]` | nested_hex | 130 B | 1 | 108.6us | 108.5us | -0.1% | stable |
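
For readers checking the Change and Status columns by hand, the arithmetic can be sketched as below. The helper names are hypothetical, and the classification rule is an assumption: the report only states a 15% regression threshold, so treating a drop of more than 15% as "improved" is a guess at how the benchmark workflow behaves.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> str:
    """Assumed rule: symmetric threshold around zero change."""
    change = percent_change(baseline, current)
    if change > threshold_pct:
        return "regression"
    if change < -threshold_pct:
        return "improved"
    return "stable"


# Aggregate median from the report: 200.84ms -> 205.71ms
aggregate = round(percent_change(200.84, 205.71), 1)  # 2.4
```

Both timings must be in the same unit before comparison; the table mixes microseconds and milliseconds across rows, but never within one row.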

@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 17, 2026 00:48

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b44107acc0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/base.py Outdated
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: 25e9c4180e

Comment thread modelaudit/scanners/base.py
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: 51c7c572da

Comment thread modelaudit/scanners/base.py Outdated
Comment thread modelaudit/scanners/base.py Outdated
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: cb9ae83109

Comment thread modelaudit/scanner_results.py Outdated
Comment thread modelaudit/scanner_results.py
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Reviewed commit: 6215deca0a

Comment thread modelaudit/scanner_results.py
mldangelo-oai and others added 3 commits April 16, 2026 22:05
The previous exempt set covered S2xx pickle opcodes, a handful of S4xx/S5xx
codes, and a keyword fallback over message/check name. It still let S1xx
code-execution primitives (os/sys/subprocess/eval/compile/__import__/
importlib/runpy/webbrowser/ctypes/builtins) and HIGH-severity S3xx network
primitives (raw sockets, ftplib, telnetlib, exfiltration) through when the
emitter used a generic message — concretely, `flax_msgpack_scanner`'s
"Suspicious code pattern detected: <regex>" CRITICAL findings were still
being silently INFO-downgraded on whitelisted HuggingFace models.

Adds those rule codes to `_WHITELIST_DOWNGRADE_EXEMPT_RULE_CODES` and
switches the keyword fallback to word-boundary regex matching so
incidental substrings like "executable" inside "ExecuTorch" or "rce"
inside "force" no longer over-suppress legitimate downgrades.
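
The difference between the old substring fallback and the word-boundary version can be illustrated with a short sketch. The keyword tuple and function names here are hypothetical stand-ins, not the identifiers used in `modelaudit`.

```python
import re

# Hypothetical keyword list; the real fallback keywords live in the PR's diff.
KEYWORDS = ("rce", "payload")

# \b anchors each keyword to word boundaries, so "rce" no longer matches inside "force".
_KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def substring_match(text: str) -> bool:
    """Old-style check: any keyword appearing anywhere in the text."""
    lowered = text.lower()
    return any(k in lowered for k in KEYWORDS)


def word_boundary_match(text: str) -> bool:
    """New-style check: keywords must appear as whole words."""
    return _KEYWORD_PATTERN.search(text) is not None
```

With these definitions, a message such as "force reload of weights" trips the substring check but not the word-boundary check, while "possible RCE via reduce" matches both.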

Adds parametrized regression coverage for S101–S110, S115, S301, S304,
S305, S310, and for the word-boundary substring cases.
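
As a rough picture of that coverage, the listed codes expand into a case table like the one below. This is a sketch under assumptions: `keeps_severity` and `S999` are hypothetical, and the set contains only the codes named in this commit message, not the full exempt set. In a pytest suite the same `CASES` list would typically be fed to `pytest.mark.parametrize`.

```python
# Codes named in the commit message: S101-S110, S115, S301, S304, S305, S310.
REGRESSION_CODES = [f"S{n}" for n in range(101, 111)] + ["S115", "S301", "S304", "S305", "S310"]

EXEMPT_CODES = frozenset(REGRESSION_CODES)


def keeps_severity(rule_code: str) -> bool:
    """Hypothetical predicate: True when a finding must not be downgraded."""
    return rule_code in EXEMPT_CODES


# One (rule_code, expected) pair per case; "S999" is a made-up non-exempt code.
CASES = [(code, True) for code in REGRESSION_CODES] + [("S999", False)]


def failing_cases() -> list:
    """Return every case whose outcome differs from the expectation."""
    return [(code, expected) for code, expected in CASES if keeps_severity(code) != expected]
```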

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…itelist-active-payloads

# Conflicts:
#	CHANGELOG.md
@mldangelo-oai mldangelo-oai merged commit 13752e9 into main Apr 17, 2026
28 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-whitelist-active-payloads branch April 17, 2026 15:24
@github-actions github-actions bot mentioned this pull request Apr 17, 2026