fix: report actual file size in scan summary when scanner exits early by yash2998chhabria · Pull Request #587 · promptfoo/modelaudit

yash2998chhabria · 2026-02-25T20:01:21Z

Summary

Fix scan summary displaying "Size: 0 bytes" for files whose scanner returns early (missing optional dependency, parse error, etc.)
Add a fallback in _scan_file_internal (core.py) that sets bytes_scanned to the actual file size whenever a scanner leaves it at zero
Affects ONNX, TFLite, Flax, XGBoost, and 20+ other scanners that have early-return paths before setting bytes_scanned

Test plan

Verified ONNX file scan now shows correct size (4.72 MB) instead of "0 bytes"
Verified safetensors file scan still shows correct size (no regression)
Verified GGUF file scan still shows correct size (no regression)
All 351 existing tests pass (python -m pytest tests/ -x -q)

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Scan results now fallback to the actual file size when a scanner leaves bytes-scanned unset.
New Features
- Pickle parsing now handles multiple streams, resynchronizes across separators, and preserves first-stream boundaries for downstream scans.
- Detection coverage expanded to include many additional suspicious modules/functions and extended pickle variants.
Tests
- Added regression tests for multi-stream parsing, blocklist hardening, broadened format/error detection, and related dangerous-pattern cases.

Expand ALWAYS_DANGEROUS_MODULES with ~45 new modules not covered by PR #518: network (smtplib, imaplib, poplib, nntplib, xmlrpc, socketserver, ssl, requests, aiohttp), compilation (codeop, marshal, types, compileall, py_compile), FFI (_ctypes), debugging (bdb, trace), operator bypasses (_operator, functools), pickle recursion (pickle, _pickle, dill, cloudpickle, joblib), filesystem (tempfile, filecmp, fileinput, glob, distutils, pydoc, pexpect), venv/pip (venv, ensurepip, pip), threading (signal, _signal, threading), and more (webbrowser, asyncio, mmap, select, selectors, logging, syslog, tarfile, zipfile, shelve, sqlite3, _sqlite3, doctest, idlelib, lib2to3, uuid). Add uuid as novel detection (VULN-6: _get_command_stdout calls subprocess.Popen) and pkgutil.resolve_name as dynamic resolution trampoline. Mirror new modules in SUSPICIOUS_GLOBALS for defense-in-depth. Fix multi-stream opcode analysis so detailed checks run across all pickle streams. Add NEWOBJ_EX to all opcode check lists. Includes 10 regression tests covering pkgutil trampoline, uuid RCE, multi-stream exploit, NEWOBJ_EX, and spot-checks for smtplib, sqlite3, tarfile, marshal, cloudpickle, and webbrowser. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…e scanning - Fix test regression: track first pickle STOP position and use it for binary content scanning instead of f.tell() (which advances past multi-stream data) - Fix multi-stream resync bypass: skip junk separator bytes between streams (up to 256 bytes) instead of returning immediately - Gracefully handle ValueErrors on subsequent streams (non-pickle data after valid pickle) instead of propagating them as file-level errors - Add type hints: ScanResult return type on _scan_bytes, -> None on all test methods - Add separator-byte resync regression test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Buffer opcodes for subsequent pickle streams and discard partial streams (binary tensor data misinterpreted as opcodes) to prevent MANY_DANGEROUS_OPCODES false positives - Add resync logic to skip up to 256 non-pickle separator bytes between streams, preventing single-byte bypass of multi-stream detection - Track pickle memo (BINPUT/BINGET) for safe ML globals so REDUCE calls referencing safe callables via memo are not counted as dangerous opcodes - Reset dangerous opcode counters at STOP boundaries so each stream is evaluated independently - Use first_pickle_end_pos for binary content scanning instead of f.tell() which advances past tensor data during multi-stream parsing Validated against 5 HuggingFace models with A/B comparison to main branch — no new false positives introduced. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Scanners that return early (missing optional dependency, parse error, etc.) left bytes_scanned at its default value of 0, causing the scan summary to always display "Size: 0 bytes". This affected ONNX, TFLite, Flax, XGBoost, and several other format scanners. Add a fallback in _scan_file_internal that sets bytes_scanned to the actual file size (already computed via os.path.getsize) whenever a scanner leaves it at zero. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-02-25T20:01:41Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds a bytes_scanned fallback in core scanning, massively extends the suspicious-globals mapping, implements multi-stream-aware pickle parsing with NEWOBJ_EX handling and first_pickle_end_pos metadata, and adds regression tests covering multi-stream resync, NEWOBJ_EX, and expanded blocklists.

Changes

Cohort / File(s)	Summary
Core scanning `modelaudit/core.py`	Adds fallback: if a scanner returns `bytes_scanned == 0` but actual file size > 0, set `bytes_scanned` to the file size.
Detectors — suspicious globals `modelaudit/detectors/suspicious_symbols.py`	Extensively expands `SUSPICIOUS_GLOBALS` with many new modules and callable patterns (pkgutil, zipimport, uuid, networking, FFI, serialization, profiling/debugging, packaging, threading, DBs, archives, etc.).
Pickle scanner `modelaudit/scanners/pickle_scanner.py`	Adds multi-stream-aware parsing/resynchronization and buffering, exposes `first_pickle_end_pos` in result metadata, adds `multi_stream` parameter to `_genops_with_fallback`, integrates `NEWOBJ_EX` into detection paths, and expands ALWAYS_DANGEROUS_MODULES/FUNCTIONS (e.g., `pkgutil`, `uuid`, networking/import modules).
Tests — pickle scanner `tests/scanners/test_pickle_scanner.py`	Adds `TestPickleScannerBlocklistHardening` with helpers and ~11 regression tests for pkgutil/uuid, NEWOBJ_EX, multi-stream benign→malicious flows, separator-byte resync, and blocking of many modules (smtplib, sqlite3, tarfile, marshal, cloudpickle, webbrowser).
Tests — real-world joblib/dill `tests/test_real_world_dill_joblib.py`	Loosens parse/format-failure matching to accept additional keywords (`opcode`, `format`, `unable to parse`, `invalid`, `memoryerror`, `corrupted`) when `bytes_scanned == 0` and issues exist.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant PickleScanner as PickleScanner
    participant GenOps as _genops_with_fallback
    participant OpcodeProcessor as OpcodeProcessor
    participant Matcher as Matcher

    Client->>PickleScanner: scan(file_obj)
    PickleScanner->>GenOps: _genops_with_fallback(file_obj, multi_stream=True)
    GenOps->>GenOps: parse stream(s), resync, buffer subsequent streams, set first_pickle_end_pos
    GenOps->>OpcodeProcessor: yield opcodes (including NEWOBJ_EX)
    OpcodeProcessor->>Matcher: evaluate sequences / blocklists / CVE patterns
    Matcher->>PickleScanner: report dangerous findings
    PickleScanner->>Client: return ScanResult (metadata:first_pickle_end_pos, bytes_scanned)

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hop through pickles, bytes in tow,

I stitch the streams where sneaky bytes flow.
NEWOBJ_EX, I sniff and flag,
Blocklists thick, I guard the bag.
Nibble safety, onward I go.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: fixing scan summaries to report actual file size when scanners exit early, which is the primary objective of this PR.
Docstring Coverage	✅ Passed	Docstring coverage is 96.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/size-bug

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 52-54: The fixed 256-byte resync cap (_MAX_RESYNC_BYTES) combined
with the resync_skipped counter can hide malicious second pickle streams; update
the resynchronization logic in pickle_scanner.py to remove or substantially
raise the fixed cap and instead scan for additional pickle stream opcodes until
EOF (or until a safe global size-based limit tied to the input length), i.e.,
revise the loop that increments resync_skipped and checks _MAX_RESYNC_BYTES
(around the code using resync_skipped and _MAX_RESYNC_BYTES between lines
~106-115) so it continues scanning for stream markers across larger/padded gaps
(or uses a sliding-window opcode search) rather than stopping after 256 bytes.
- Around line 1549-1553: When handling the pickle stream boundary (the branch
where opcode.name == "STOP") you currently reset counters but not the
stream-local memo; add logic to clear or reinitialize the scanner's _safe_memo
at the same place you reset dangerous_opcode_count, consecutive_dangerous and
max_consecutive so memo indexes from a previous stream cannot be reused to mark
BINGET->REDUCE pairs in a new stream as safe. Update the STOP handling block in
pickle_scanner.py to reset _safe_memo (e.g., assign a fresh empty memo
structure) while keeping the existing counter resets to preserve detection
behavior.
- Line 1477: The current conditional incorrectly requires isinstance(arg, str)
for OBJ/NEWOBJ/NEWOBJ_EX, making those branches unreachable; split the logic so
INST is handled when opcode.name == "INST" and isinstance(arg, str) as before,
but handle opcode.name in ["OBJ","NEWOBJ","NEWOBJ_EX"] in a separate branch that
does not require arg to be a string and instead inspects the stack or preceding
GLOBAL/STACK items to resolve the class being instantiated (so NEWOBJ_EX
following a dangerous GLOBAL is flagged); update the logic in pickle_scanner.py
where opcode and arg are checked and ensure the test
test_newobj_ex_dangerous_class can detect the dangerous pattern.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 647-695: The module-blocklist spot tests (e.g.,
test_smtplib_blocked, test_sqlite3_blocked, test_tarfile_blocked,
test_marshal_blocked, test_cloudpickle_blocked, test_webbrowser_blocked) only
assert result.has_errors and may pass on unrelated failure paths; add an
explicit assert result.success (immediately after obtaining result from
self._scan_bytes(...)) in each of these test_* methods to ensure the
CRITICAL/module checks validate successful detection flow before the existing
severity/message assertions.

ℹ️ Review info

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e736ebb and 5efc0df.

📒 Files selected for processing (4)

modelaudit/core.py
modelaudit/detectors/suspicious_symbols.py
modelaudit/scanners/pickle_scanner.py
tests/scanners/test_pickle_scanner.py

modelaudit/scanners/pickle_scanner.py

tests/scanners/test_pickle_scanner.py

Resolve conflicts in pickle_scanner.py: - DANGEROUS_FUNCTIONS: keep both branch additions (pkgutil, uuid) and main additions (cProfile, profile, pdb, timeit, ctypes functions) - ALWAYS_DANGEROUS_MODULES: merge main's core blocklist (ctypes, cProfile, profile, pdb, timeit, _thread, multiprocessing, zipimport, importlib, runpy) with branch's extended coverage (pkgutil, smtplib, marshal, functools, pickle, cloudpickle, tempfile, uuid, etc.) and preserve main's NOTE about linecache/logging.config - REDUCE opcode handling: adopt main's resolved_callables lookup with branch's _safe_memo fallback for BINGET/memo patterns Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (4)

modelaudit/scanners/pickle_scanner.py (3)

2209-2213: ⚠️ Potential issue | 🟠 Major

Reset _safe_memo at each stream boundary (STOP).

Line 2209 resets counters only; memo-index safety state can leak into the next stream.

🔧 Suggested fix

         if opcode.name == "STOP":
             dangerous_opcode_count = 0
             consecutive_dangerous = 0
             max_consecutive = 0
+            _safe_memo.clear()
             continue

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 2209 - 2213, At each
pickle stream boundary where the code checks opcode.name == "STOP" (inside the
PickleScanner/scan loop), also reset the memo safety state by clearing or
reinitializing _safe_memo so memo-index safety does not leak between streams;
update the STOP handling block that currently resets dangerous_opcode_count,
consecutive_dangerous, and max_consecutive to additionally reset _safe_memo (and
any related memo tracking structures) and run unit tests for both benign and
malicious samples to ensure detections are preserved.

2132-2138: ⚠️ Potential issue | 🟠 Major

OBJ/NEWOBJ/NEWOBJ_EX branch is unreachable behind isinstance(arg, str).

Line 2132 gates stack-based opcodes behind a string-arg check; only INST can satisfy this in practice.

🔧 Suggested fix

-        if opcode.name in ["INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"] and isinstance(arg, str):
+        if opcode.name in {"OBJ", "NEWOBJ", "NEWOBJ_EX"}:
+            return {
+                "pattern": f"{opcode.name}_EXECUTION",
+                "argument": str(arg),
+                "position": pos,
+                "opcode": opcode.name,
+            }
+        if opcode.name == "INST" and isinstance(arg, str):
             return {
                 "pattern": f"{opcode.name}_EXECUTION",
                 "argument": arg,
                 "position": pos,
                 "opcode": opcode.name,
             }

#!/bin/bash
# Verify pickletools opcode argument descriptors for the affected opcodes.
python - <<'PY'
import pickletools
for name in ("INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"):
    op = next(o for o in pickletools.opcodes if o.name == name)
    print(f"{name}: has_stream_argument={op.arg is not None}")
PY

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 2132 - 2138, The branch
currently requires isinstance(arg, str) for all opcodes, making the
OBJ/NEWOBJ/NEWOBJ_EX cases unreachable; update the logic in pickle_scanner.py
(around the block handling opcode.name in ["INST","OBJ","NEWOBJ","NEWOBJ_EX"])
to separate handling: keep the INST path gated by isinstance(arg, str) to return
the INST_EXECUTION pattern, but remove the str requirement for OBJ, NEWOBJ, and
NEWOBJ_EX so they return their respective _EXECUTION pattern even when arg is
not a string (handle None/int safely), i.e., branch on opcode.name first and
only apply isinstance(arg, str) for INST to avoid crashes and correctly detect
stack-based opcodes.

52-54: ⚠️ Potential issue | 🟠 Major

Fixed 256-byte resync limit is bypassable with larger separators.

Line 113 hard-stops resync after 256 bytes, so padded separators can hide later malicious streams.

🔧 Suggested hardening

-    # Maximum number of consecutive non-pickle bytes to skip when resyncing
-    _MAX_RESYNC_BYTES = 256
@@
-            if resync_skipped >= _MAX_RESYNC_BYTES:
-                return

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".

Also applies to: 113-114

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 52 - 54, The current
fixed resync limit (_MAX_RESYNC_BYTES = 256) and resync_skipped counter can be
bypassed by large padded separators; update the resync logic in the pickle
scanner to stop relying on a small hard cap and instead scan the incoming stream
for valid pickle start/marker patterns until a legitimate pickle header or EOF
is found. Make _MAX_RESYNC_BYTES configurable (or remove the fixed cap) and
change the loop that increments resync_skipped to perform a sliding search for
known pickle opcodes/headers and reset resync_skipped when a candidate is found;
ensure the scanner still fails safely on EOF and include tests for both short
and very large padded separators to validate detection.

tests/scanners/test_pickle_scanner.py (1)

744-792: ⚠️ Potential issue | 🟡 Minor

Add explicit result.success assertions in module spot checks.

These tests currently assert has_errors only; they can pass on non-target failure paths.

🔧 Suggested consolidation

+    def _assert_module_blocked(self, module: str, func: str) -> None:
+        result = self._scan_bytes(self._craft_global_reduce_pickle(module, func))
+        assert result.success
+        assert result.has_errors
+        assert any(
+            i.severity == IssueSeverity.CRITICAL and module in i.message
+            for i in result.issues
+        ), f"Expected CRITICAL {module} issue, got: {[i.message for i in result.issues]}"
@@
     def test_smtplib_blocked(self) -> None:
-        result = self._scan_bytes(self._craft_global_reduce_pickle("smtplib", "SMTP"))
-        assert result.has_errors
-        assert any(i.severity == IssueSeverity.CRITICAL and "smtplib" in i.message for i in result.issues), (
-            f"Expected CRITICAL smtplib issue, got: {[i.message for i in result.issues]}"
-        )
+        self._assert_module_blocked("smtplib", "SMTP")

Based on learnings: "Add comprehensive tests including edge cases and regression coverage for scanner changes".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_pickle_scanner.py` around lines 744 - 792, The tests
(e.g., test_smtplib_blocked, test_sqlite3_blocked, test_tarfile_blocked,
test_marshal_blocked, test_cloudpickle_blocked, test_webbrowser_blocked) only
assert result.has_errors and can pass on unrelated failure paths; update each to
also assert the explicit success flag (e.g., assert result.success is False or
assert not result.success) after calling
_scan_bytes(_craft_global_reduce_pickle(...)) so the test guarantees the scanner
reported a failure state for these dangerous-module pickles.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 435-451: The list literal that adds "logging" to the
always-dangerous modules conflicts with the nearby intent note that
"logging.config" is intentionally excluded; either remove "logging" from the
always-dangerous/modules blocklist or update the comment to explicitly state
that only specific submodules (e.g., "logging.handlers" or others) are
considered dangerous while "logging.config" is excluded. Locate the module list
in pickle_scanner.py (the literal containing "webbrowser", "asyncio", ...,
"logging", ...) and make the change so the code and comment consistently reflect
the intended policy.
- Around line 91-95: The early return on a partial-decode inside the pickle
scanner (the block that checks stream_error and had_opcodes) aborts scanning the
rest of the file and can miss later malicious streams; instead, discard the
current buffer and continue scanning subsequent streams. Edit the code in
pickle_scanner.py (the block using stream_error and had_opcodes inside the
stream-processing loop) to remove the return and ensure control continues to the
next stream (e.g., clear/reset the per-stream buffer/state and use continue or
otherwise advance the outer loop), preserving any detection bookkeeping so later
streams are still scanned.

---

Duplicate comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 2209-2213: At each pickle stream boundary where the code checks
opcode.name == "STOP" (inside the PickleScanner/scan loop), also reset the memo
safety state by clearing or reinitializing _safe_memo so memo-index safety does
not leak between streams; update the STOP handling block that currently resets
dangerous_opcode_count, consecutive_dangerous, and max_consecutive to
additionally reset _safe_memo (and any related memo tracking structures) and run
unit tests for both benign and malicious samples to ensure detections are
preserved.
- Around line 2132-2138: The branch currently requires isinstance(arg, str) for
all opcodes, making the OBJ/NEWOBJ/NEWOBJ_EX cases unreachable; update the logic
in pickle_scanner.py (around the block handling opcode.name in
["INST","OBJ","NEWOBJ","NEWOBJ_EX"]) to separate handling: keep the INST path
gated by isinstance(arg, str) to return the INST_EXECUTION pattern, but remove
the str requirement for OBJ, NEWOBJ, and NEWOBJ_EX so they return their
respective _EXECUTION pattern even when arg is not a string (handle None/int
safely), i.e., branch on opcode.name first and only apply isinstance(arg, str)
for INST to avoid crashes and correctly detect stack-based opcodes.
- Around line 52-54: The current fixed resync limit (_MAX_RESYNC_BYTES = 256)
and resync_skipped counter can be bypassed by large padded separators; update
the resync logic in the pickle scanner to stop relying on a small hard cap and
instead scan the incoming stream for valid pickle start/marker patterns until a
legitimate pickle header or EOF is found. Make _MAX_RESYNC_BYTES configurable
(or remove the fixed cap) and change the loop that increments resync_skipped to
perform a sliding search for known pickle opcodes/headers and reset
resync_skipped when a candidate is found; ensure the scanner still fails safely
on EOF and include tests for both short and very large padded separators to
validate detection.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 744-792: The tests (e.g., test_smtplib_blocked,
test_sqlite3_blocked, test_tarfile_blocked, test_marshal_blocked,
test_cloudpickle_blocked, test_webbrowser_blocked) only assert result.has_errors
and can pass on unrelated failure paths; update each to also assert the explicit
success flag (e.g., assert result.success is False or assert not result.success)
after calling _scan_bytes(_craft_global_reduce_pickle(...)) so the test
guarantees the scanner reported a failure state for these dangerous-module
pickles.

ℹ️ Review info

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5efc0df and a13f819.

📒 Files selected for processing (3)

modelaudit/detectors/suspicious_symbols.py
modelaudit/scanners/pickle_scanner.py
tests/scanners/test_pickle_scanner.py

modelaudit/scanners/pickle_scanner.py

coderabbitai · 2026-02-26T01:32:11Z

modelaudit/scanners/pickle_scanner.py

+    "webbrowser",
+    "asyncio",
+    "mmap",
+    "select",
+    "selectors",
+    "logging",
+    "syslog",
+    "tarfile",
+    "zipfile",
+    "shelve",
+    "sqlite3",
+    "_sqlite3",
+    "doctest",
+    "idlelib",
+    "lib2to3",
+    # uuid — _get_command_stdout internally calls subprocess.Popen (VULN-6)
+    "uuid",


⚠️ Potential issue | 🟡 Minor

logging blocklist addition conflicts with nearby intent note.

Line 440 adds logging to always-dangerous modules, but the nearby note states logging.config is intentionally not in this set. Please align code or comment to avoid policy ambiguity.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelaudit/scanners/pickle_scanner.py` around lines 435 - 451, The list literal that adds "logging" to the always-dangerous modules conflicts with the nearby intent note that "logging.config" is intentionally excluded; either remove "logging" from the always-dangerous/modules blocklist or update the comment to explicitly state that only specific submodules (e.g., "logging.handlers" or others) are considered dangerous while "logging.config" is excluded. Locate the module list in pickle_scanner.py (the literal containing "webbrowser", "asyncio", ..., "logging", ...) and make the change so the code and comment consistently reflect the intended policy.

- Increase _MAX_RESYNC_BYTES from 256 to 8192 to prevent attackers from hiding malicious streams behind padding gaps larger than the resync window - Continue scanning on partial stream decode errors instead of returning early, so malicious payloads later in the file are not missed - Clear _safe_memo at STOP boundaries so stale memo entries from a prior stream cannot mark BINGET->REDUCE pairs in a new stream as safe - Split OBJ/NEWOBJ/NEWOBJ_EX detection from INST: these stack-based opcodes have no string arg, so resolve the class via callable_refs instead - Strengthen module blocklist test assertions with explicit result.success and result.issues checks before severity/message checks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

modelaudit/scanners/pickle_scanner.py (1)

2149-2156: ⚠️ Potential issue | 🟠 Major

INST path currently flags all string-arg INST opcodes as dangerous.

Unlike OBJ/NEWOBJ/NEWOBJ_EX, the INST branch returns immediately without checking whether the resolved target is actually dangerous. This can create false critical findings.

Proposed fix

         # INST encodes the class directly in the string argument.
         if opcode.name == "INST" and isinstance(arg, str):
-            return {
-                "pattern": f"{opcode.name}_EXECUTION",
-                "argument": arg,
-                "position": pos,
-                "opcode": opcode.name,
-            }
+            parsed = _parse_module_function(arg)
+            if parsed:
+                mod, func = parsed
+                if _is_dangerous_ref(mod, func):
+                    return {
+                        "pattern": f"{opcode.name}_EXECUTION",
+                        "argument": f"{mod}.{func}",
+                        "position": pos,
+                        "opcode": opcode.name,
+                    }

♻️ Duplicate comments (1)

modelaudit/scanners/pickle_scanner.py (1)

52-55: ⚠️ Potential issue | 🟠 Major

Fixed resync cap still allows padded multi-stream bypass.

A malicious stream placed after >8192 non-pickle bytes will still be skipped because scanning terminates at Line 117–Line 118.

Hardening direction

-    _MAX_RESYNC_BYTES = 8192
+    # Avoid fixed-size resync cutoffs that can be bypassed with padding.
+    # Runtime is already bounded by scanner timeout / opcode limits.
@@
-            if resync_skipped >= _MAX_RESYNC_BYTES:
-                return
             continue

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".

Also applies to: 117-118

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 52 - 55, The scanner's
fixed resync cap (_MAX_RESYNC_BYTES = 8192) plus the early termination in the
resync logic can be bypassed by padding a malicious pickle after a larger gap;
update the resync handling to not silently stop scanning once the skipped byte
count exceeds _MAX_RESYNC_BYTES—either (a) remove the premature termination and
continue scanning the remainder of the stream for additional pickle markers, or
(b) treat exceeding the cap as a detection (fail/flag) instead of skipping
further checks; locate the resync loop/branch that references _MAX_RESYNC_BYTES
and change its control flow accordingly so no multi-stream payload can hide
beyond the cap.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 93-103: The code currently resets stream_error and then still
yields the prior buffered opcodes, allowing false positives; in the block
handling a partial stream (the branch that checks if stream_error and
had_opcodes) explicitly discard the buffered opcode store (e.g., clear or
reassign buffered = [] or buffered.clear()) before resetting
had_opcodes/stream_error so no buffered data is yielded later; update the logic
around the variables buffered, had_opcodes, and stream_error in
pickle_scanner.py to only yield from buffered when stream_error is false and
buffered was not discarded.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 723-739: The test test_newobj_ex_dangerous_class currently can
pass due to the GLOBAL 'os._wrap_close' match rather than verifying NEWOBJ_EX
handling; update the assertion to specifically verify that the reported issue is
tied to NEWOBJ_EX by checking the issue details from result.issues (e.g.,
message contains "NEWOBJ_EX" or an opcode/source field indicating NEWOBJ_EX) and
still ensure it's CRITICAL; use the existing test helper _scan_bytes and
IssueSeverity.CRITICAL to locate the relevant issue and assert that at least one
issue is NEWOBJ_EX-specific so the test fails if NEWOBJ_EX detection regresses.

---

Duplicate comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 52-55: The scanner's fixed resync cap (_MAX_RESYNC_BYTES = 8192)
plus the early termination in the resync logic can be bypassed by padding a
malicious pickle after a larger gap; update the resync handling to not silently
stop scanning once the skipped byte count exceeds _MAX_RESYNC_BYTES—either (a)
remove the premature termination and continue scanning the remainder of the
stream for additional pickle markers, or (b) treat exceeding the cap as a
detection (fail/flag) instead of skipping further checks; locate the resync
loop/branch that references _MAX_RESYNC_BYTES and change its control flow
accordingly so no multi-stream payload can hide beyond the cap.

ℹ️ Review info

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a13f819 and 7a3ce67.

📒 Files selected for processing (2)

modelaudit/scanners/pickle_scanner.py
tests/scanners/test_pickle_scanner.py

modelaudit/scanners/pickle_scanner.py

coderabbitai · 2026-02-26T03:42:15Z

tests/scanners/test_pickle_scanner.py

+    def test_newobj_ex_dangerous_class(self) -> None:
+        """NEWOBJ_EX opcode with a dangerous class should be flagged."""
+        # Craft pickle: PROTO 4 | GLOBAL 'os _wrap_close' | EMPTY_TUPLE | EMPTY_DICT | NEWOBJ_EX | STOP
+        # Protocol 4 is needed for NEWOBJ_EX (opcode 0x92)
+        proto = b"\x80\x04"
+        global_op = b"c" + b"os\n_wrap_close\n"
+        empty_tuple = b")"
+        empty_dict = b"}"
+        newobj_ex = b"\x92"  # NEWOBJ_EX opcode
+        stop = b"."
+        data = proto + global_op + empty_tuple + empty_dict + newobj_ex + stop
+
+        result = self._scan_bytes(data)
+        assert result.success
+        assert result.has_errors
+        os_issues = [i for i in result.issues if i.severity == IssueSeverity.CRITICAL and "os" in i.message.lower()]
+        assert os_issues, f"Expected CRITICAL os issue for NEWOBJ_EX, got: {[i.message for i in result.issues]}"


⚠️ Potential issue | 🟡 Minor

test_newobj_ex_dangerous_class is too broad to validate NEWOBJ_EX-specific detection.

This test can pass from the GLOBAL os._wrap_close finding alone, even if NEWOBJ_EX handling regresses. Please assert on a check/detail tied to opcode == "NEWOBJ_EX" (or NEWOBJ_EX-specific pattern) to make it a true regression guard.

Based on learnings: "Add comprehensive tests including edge cases and regression coverage for scanner changes".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_pickle_scanner.py` around lines 723 - 739, The test test_newobj_ex_dangerous_class currently can pass due to the GLOBAL 'os._wrap_close' match rather than verifying NEWOBJ_EX handling; update the assertion to specifically verify that the reported issue is tied to NEWOBJ_EX by checking the issue details from result.issues (e.g., message contains "NEWOBJ_EX" or an opcode/source field indicating NEWOBJ_EX) and still ensure it's CRITICAL; use the existing test helper _scan_bytes and IssueSeverity.CRITICAL to locate the relevant issue and assert that at least one issue is NEWOBJ_EX-specific so the test fails if NEWOBJ_EX detection regresses.

… on Windows The test_real_joblib_compressed test only accepted "opcode" in the error message, but compressed joblib files can produce different parse errors depending on the platform (e.g. MemoryError on Windows). Widen the assertion to accept any format/parse-related error keyword so the test passes across all platforms. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…des and reset symbolic state at STOP - Fix bug where partial stream opcodes were yielded instead of discarded after a ValueError in _genops_with_fallback (clear buffer before resetting stream_error flag) - Reset stack/memo in _build_symbolic_reference_maps at STOP boundaries so stale references from stream 1 don't leak into stream 2 resolution - Apply ruff format fix to test_real_world_dill_joblib.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

modelaudit/scanners/pickle_scanner.py (1)

2159-2166: ⚠️ Potential issue | 🟠 Major

Gate INST findings on dangerous target resolution.

Line 2160 currently returns INST_EXECUTION for any INST string arg, even benign classes. This can generate false CRITICAL findings and dilute signal quality.

🔧 Suggested fix

-        if opcode.name == "INST" and isinstance(arg, str):
-            return {
-                "pattern": f"{opcode.name}_EXECUTION",
-                "argument": arg,
-                "position": pos,
-                "opcode": opcode.name,
-            }
+        if opcode.name == "INST" and isinstance(arg, str):
+            parsed = _parse_module_function(arg)
+            if parsed:
+                mod, func = parsed
+                if _is_dangerous_ref(mod, func):
+                    return {
+                        "pattern": f"{opcode.name}_EXECUTION",
+                        "argument": arg,
+                        "position": pos,
+                        "opcode": opcode.name,
+                    }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 2159 - 2166, The current
INST handling in pickle_scanner.py returns an INST_EXECUTION finding for any
string arg, causing false positives; update the INST branch (the code checking
if opcode.name == "INST" and isinstance(arg, str)) to suppress benign types by
checking the arg against a safe builtin whitelist (e.g.,
"object","list","dict","set","tuple","str","int","float","bool","bytes") and
only emit the INST_EXECUTION finding when the arg is not in that whitelist
and/or appears to be a fully-qualified or suspicious class name (e.g., contains
a module qualifier like "." or matches known dangerous targets); implement the
whitelist as a module-level SAFE_BUILTINS and reference it inside the INST
handling so harmless classes are ignored while truly suspicious INST args still
produce a finding.

♻️ Duplicate comments (1)

modelaudit/scanners/pickle_scanner.py (1)
52-56: ⚠️ Potential issue | 🟠 Major

Remove fixed resync ceiling; it still permits padded-stream evasion.

At Line 118, multi-stream parsing stops after 8192 non-pickle bytes. A payload can place a malicious second stream beyond that gap and evade detection.

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".
🔧 Suggested hardening
-    _MAX_RESYNC_BYTES = 8192
     resync_skipped = 0
@@
-            if resync_skipped >= _MAX_RESYNC_BYTES:
-                return
             continue
Also applies to: 118-119
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 52 - 56, The fixed
ceiling _MAX_RESYNC_BYTES and related resync_skipped logic allow evasion by
padding a malicious second pickle stream beyond 8192 bytes; replace the hard
limit with a safer approach: remove or enlarge the fixed cap and instead
implement streaming-aware resync that either (a) reads until EOF when scanning
for additional pickle streams, or (b) enforces a configurable/strict
total-scan-size limit and per-file sanity checks, and update the code paths that
reference _MAX_RESYNC_BYTES and resync_skipped (the multi-stream parsing loop)
to use the new behavior/config parameter; add unit tests covering both benign
long-padding and malicious delayed-stream samples to validate detection.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 2251-2267: The backward scan ranges skip index 0 due to Python's
exclusive stop; update the range stop arguments used in the backward scans in
pickle_scanner.py (the for loops that start with for j in range(i - 1, max(0, i
- 4), -1) and for k in range(j - 1, max(0, j - 10), -1)) so the stop allows
hitting index 0 (e.g., use max(-1, i - 4) and max(-1, j - 10) or otherwise set
the stop to -1) to include the stream-start GLOBAL/STACK_GLOBAL entries when
computing _safe_memo in the surrounding logic that references opcodes,
prev_opcode, prev_arg, and _is_safe_ml_global.

---

Outside diff comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 2159-2166: The current INST handling in pickle_scanner.py returns
an INST_EXECUTION finding for any string arg, causing false positives; update
the INST branch (the code checking if opcode.name == "INST" and isinstance(arg,
str)) to suppress benign types by checking the arg against a safe builtin
whitelist (e.g.,
"object","list","dict","set","tuple","str","int","float","bool","bytes") and
only emit the INST_EXECUTION finding when the arg is not in that whitelist
and/or appears to be a fully-qualified or suspicious class name (e.g., contains
a module qualifier like "." or matches known dangerous targets); implement the
whitelist as a module-level SAFE_BUILTINS and reference it inside the INST
handling so harmless classes are ignored while truly suspicious INST args still
produce a finding.

---

Duplicate comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 52-56: The fixed ceiling _MAX_RESYNC_BYTES and related
resync_skipped logic allow evasion by padding a malicious second pickle stream
beyond 8192 bytes; replace the hard limit with a safer approach: remove or
enlarge the fixed cap and instead implement streaming-aware resync that either
(a) reads until EOF when scanning for additional pickle streams, or (b) enforces
a configurable/strict total-scan-size limit and per-file sanity checks, and
update the code paths that reference _MAX_RESYNC_BYTES and resync_skipped (the
multi-stream parsing loop) to use the new behavior/config parameter; add unit
tests covering both benign long-padding and malicious delayed-stream samples to
validate detection.

ℹ️ Review info

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a7ad66c and 5521b8c.

📒 Files selected for processing (2)

modelaudit/scanners/pickle_scanner.py
tests/test_real_world_dill_joblib.py

coderabbitai · 2026-02-26T17:46:01Z

modelaudit/scanners/pickle_scanner.py

+            for j in range(i - 1, max(0, i - 4), -1):
+                prev_opcode, prev_arg, _prev_pos = opcodes[j]
+                if prev_opcode.name == "GLOBAL" and isinstance(prev_arg, str):
+                    parts = (
+                        prev_arg.split(" ", 1)
+                        if " " in prev_arg
+                        else prev_arg.rsplit(".", 1)
+                        if "." in prev_arg
+                        else [prev_arg, ""]
+                    )
+                    if len(parts) == 2:
+                        _safe_memo[arg] = _is_safe_ml_global(parts[0], parts[1])
+                    break
+                if prev_opcode.name == "STACK_GLOBAL":
+                    strs: list[str] = []
+                    for k in range(j - 1, max(0, j - 10), -1):
+                        pk_op, pk_arg, _ = opcodes[k]


⚠️ Potential issue | 🟡 Minor

Fix off-by-one in backward scans (index 0 is currently skipped).

The ranges at Line 2251 and Line 2266 exclude index 0 due Python’s stop-exclusive semantics. This misses stream-start GLOBAL/STACK_GLOBAL references and can misclassify memo safety.

🔧 Suggested fix

- for j in range(i - 1, max(0, i - 4), -1): + for j in range(i - 1, max(-1, i - 5), -1): @@ - for k in range(j - 1, max(0, j - 10), -1): + for k in range(j - 1, max(-1, j - 11), -1):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelaudit/scanners/pickle_scanner.py` around lines 2251 - 2267, The backward scan ranges skip index 0 due to Python's exclusive stop; update the range stop arguments used in the backward scans in pickle_scanner.py (the for loops that start with for j in range(i - 1, max(0, i - 4), -1) and for k in range(j - 1, max(0, j - 10), -1)) so the stop allows hitting index 0 (e.g., use max(-1, i - 4) and max(-1, j - 10) or otherwise set the stop to -1) to include the stream-start GLOBAL/STACK_GLOBAL entries when computing _safe_memo in the surrounding logic that references opcodes, prev_opcode, prev_arg, and _is_safe_ml_global.

yash2998chhabria and others added 4 commits February 25, 2026 10:46

coderabbitai bot reviewed Feb 25, 2026

View reviewed changes

modelaudit/scanners/pickle_scanner.py Outdated Show resolved Hide resolved

modelaudit/scanners/pickle_scanner.py Outdated Show resolved Hide resolved

modelaudit/scanners/pickle_scanner.py Show resolved Hide resolved

tests/scanners/test_pickle_scanner.py Show resolved Hide resolved

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

yash2998chhabria and others added 2 commits February 26, 2026 06:40

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: report actual file size in scan summary when scanner exits early#587

fix: report actual file size in scan summary when scanner exits early#587
yash2998chhabria wants to merge 8 commits intomainfrom
fix/size-bug

yash2998chhabria commented Feb 25, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 25, 2026 •

edited

Loading

Reviews paused

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Feb 26, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Feb 26, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yash2998chhabria commented Feb 25, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yash2998chhabria commented Feb 25, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 25, 2026 •

edited

Loading