fix(deps): promote msgpack to core dependency for Flax scanner #583
yash2998chhabria wants to merge 10 commits into main
Conversation
Expand ALWAYS_DANGEROUS_MODULES with ~45 new modules not covered by PR #518:
- network: smtplib, imaplib, poplib, nntplib, xmlrpc, socketserver, ssl, requests, aiohttp
- compilation: codeop, marshal, types, compileall, py_compile
- FFI: _ctypes
- debugging: bdb, trace
- operator bypasses: _operator, functools
- pickle recursion: pickle, _pickle, dill, cloudpickle, joblib
- filesystem: tempfile, filecmp, fileinput, glob, distutils, pydoc, pexpect
- venv/pip: venv, ensurepip, pip
- threading: signal, _signal, threading
- more: webbrowser, asyncio, mmap, select, selectors, logging, syslog, tarfile, zipfile, shelve, sqlite3, _sqlite3, doctest, idlelib, lib2to3, uuid

Add uuid as a novel detection (VULN-6: _get_command_stdout calls subprocess.Popen) and pkgutil.resolve_name as a dynamic-resolution trampoline. Mirror the new modules in SUSPICIOUS_GLOBALS for defense in depth. Fix multi-stream opcode analysis so detailed checks run across all pickle streams. Add NEWOBJ_EX to all opcode check lists. Includes 10 regression tests covering the pkgutil trampoline, uuid RCE, multi-stream exploit, NEWOBJ_EX, and spot-checks for smtplib, sqlite3, tarfile, marshal, cloudpickle, and webbrowser.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
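As an illustration of what a module blocklist catches, here is a minimal self-contained sketch of GLOBAL/INST opcode screening with pickletools. The DANGEROUS_MODULES set and the dangerous_globals helper are hypothetical stand-ins for the real ALWAYS_DANGEROUS_MODULES logic, not the project's code.

```python
import io
import pickletools

# Hypothetical mini-blocklist standing in for ALWAYS_DANGEROUS_MODULES;
# the real list in this PR is far larger.
DANGEROUS_MODULES = {"os", "subprocess", "smtplib", "uuid", "pkgutil"}

def dangerous_globals(data: bytes):
    """Collect (module, name) pairs referenced by GLOBAL/INST opcodes."""
    found = []
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        # GLOBAL/INST carry "module name" as a single space-joined string.
        if opcode.name in ("GLOBAL", "INST") and isinstance(arg, str):
            module, _, name = arg.partition(" ")
            if module.split(".")[0] in DANGEROUS_MODULES:
                found.append((module, name))
    return found

# Classic protocol-0 RCE payload shape: GLOBAL os.system, then REDUCE.
# genops only parses; nothing is executed.
payload = b"cos\nsystem\n(S'echo pwned'\ntR."
print(dangerous_globals(payload))  # [('os', 'system')]
```

Note that this only covers the textual GLOBAL form; protocol-4 payloads use STACK_GLOBAL with string arguments pushed earlier on the stack, which is why real scanners also track preceding SHORT_BINUNICODE opcodes.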
…e scanning

- Fix test regression: track first pickle STOP position and use it for binary content scanning instead of f.tell() (which advances past multi-stream data)
- Fix multi-stream resync bypass: skip junk separator bytes between streams (up to 256 bytes) instead of returning immediately
- Gracefully handle ValueErrors on subsequent streams (non-pickle data after valid pickle) instead of propagating them as file-level errors
- Add type hints: ScanResult return type on _scan_bytes, -> None on all test methods
- Add separator-byte resync regression test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Buffer opcodes for subsequent pickle streams and discard partial streams (binary tensor data misinterpreted as opcodes) to prevent MANY_DANGEROUS_OPCODES false positives
- Add resync logic to skip up to 256 non-pickle separator bytes between streams, preventing single-byte bypass of multi-stream detection
- Track pickle memo (BINPUT/BINGET) for safe ML globals so REDUCE calls referencing safe callables via memo are not counted as dangerous opcodes
- Reset dangerous opcode counters at STOP boundaries so each stream is evaluated independently
- Use first_pickle_end_pos for binary content scanning instead of f.tell(), which advances past tensor data during multi-stream parsing

Validated against 5 HuggingFace models with A/B comparison to main branch — no new false positives introduced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
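The first_pickle_end_pos fix can be illustrated with a small sketch. The first_stop_end helper below is hypothetical, not the scanner's actual function, but it shows why the offset of the first STOP opcode differs from f.tell() after multi-stream parsing:

```python
import io
import pickle
import pickletools

def first_stop_end(data: bytes):
    """Offset just past the first stream's STOP opcode, or None.

    f.tell() after parsing further streams overshoots into trailing
    data, so the scanner records the position at the first STOP instead.
    """
    try:
        for opcode, _arg, pos in pickletools.genops(io.BytesIO(data)):
            if opcode.name == "STOP":
                return pos + 1  # STOP is a single-byte opcode
    except ValueError:
        pass  # not valid pickle at all
    return None

# A pickle followed by 16 separator bytes and a second pickle stream:
stream1 = pickle.dumps({"weights": [1, 2, 3]})
blob = stream1 + b"\x00" * 16 + pickle.dumps("second stream")
assert first_stop_end(blob) == len(stream1)
```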
The flax_msgpack scanner was registered but failed to load when msgpack was not installed, silently falling back to generic scanning. Since msgpack is a lightweight zero-dependency C extension, promote it from optional (extras: flax) to a core dependency so Flax/JAX .msgpack files are scanned correctly out of the box. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
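A minimal sketch of the failure mode being fixed, assuming an import-guard pattern; the function name and return shape are illustrative, not the scanner's real API:

```python
# Old behavior: scanner registered, import fails, silent generic fallback.
try:
    import msgpack  # promoted to a core dependency by this PR
    HAVE_MSGPACK = True
except ImportError:
    HAVE_MSGPACK = False

def scan_msgpack_bytes(data: bytes) -> dict:
    if not HAVE_MSGPACK:
        # Previously this path was taken silently when the extra was missing,
        # so .msgpack files never got specialized analysis.
        return {"scanner": "generic-fallback"}
    obj = msgpack.unpackb(data, raw=False)
    return {"scanner": "flax_msgpack", "top_level_keys": sorted(obj)}
```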
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings.
Walkthrough

Expands suspicious globals and always-dangerous lists, adds multi-stream pickle parsing with resync and first_pickle_end_pos propagation, treats NEWOBJ_EX in danger checks, adds tests for blocklist and multi-stream cases, tightens the msgpack dependency, and refines Flax/MsgPack transformer detection.
Sequence Diagram(s)

sequenceDiagram
participant FileObj as FileObj
participant Parser as PickleParser
participant Buffer as StreamBuffer
participant Detector as OpcodeDetector
participant Binary as BinaryScanner
FileObj->>Parser: read bytes
Parser->>Parser: attempt parse stream
alt first stream completes (STOP)
Parser->>Detector: yield ops for analysis
Parser->>Parser: set first_pickle_end_pos
else subsequent stream
Parser->>Buffer: buffer opcodes
Buffer->>Parser: validate buffered stream
alt buffered parse succeeds
Buffer->>Detector: yield ops for analysis
else within _MAX_RESYNC_BYTES
Parser->>FileObj: skip separator bytes (resync)
Parser->>Parser: continue parsing
end
end
Detector->>Binary: attach first_pickle_end_pos to metadata
Binary->>Detector: return binary-scan results
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 52-54: The scanner currently terminates multi-stream scanning when
a later stream yields at least one opcode before an error; instead, in the
scanning loop that uses resync_skipped and _MAX_RESYNC_BYTES you must remove the
early return on partial parse and implement resync logic: when a stream errors
after yielding opcodes, do not return — increment resync_skipped, advance the
input by one byte and attempt to resync/continue scanning this stream (and
subsequent streams) until a match or until resync_skipped exceeds a sensible
limit; do not treat a single opcode+junk separator as final. Replace the
immediate return with a continue/resync flow, reset resync_skipped on successful
parse, and revise the hard-coded 256-byte ceiling (_MAX_RESYNC_BYTES) to either
be higher or configurable so simple padding cannot evade detection; add tests
for benign and malicious samples to verify detection still fires after padding.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- modelaudit/detectors/suspicious_symbols.py
- modelaudit/scanners/pickle_scanner.py
- pyproject.toml
- tests/scanners/test_pickle_scanner.py
Some HuggingFace Flax models (e.g. bert-for-token-classification) use a nested structure where model weights are wrapped under a model-name key (e.g. "bert", "classifier") rather than placing transformer keys like "encoder" or "embeddings" at the top level. This caused an INFO-level false positive: "Suspicious data structure - does not match known ML model patterns". Add detection for nested model structures by checking if top-level keys contain transformer sub-keys, and add a set of common HuggingFace model name keys to recognize known architectures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
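The nested-structure check described above can be sketched roughly as follows; the key sets are abbreviated examples and looks_like_transformer is a hypothetical stand-in for the scanner's logic:

```python
# Abbreviated key sets; the real lists in the scanner are larger.
transformer_keys = {"encoder", "embeddings", "decoder", "pooler"}
model_name_keys = {"bert", "roberta", "gpt2", "t5", "classifier"}

def looks_like_transformer(obj: dict) -> bool:
    if set(obj) & transformer_keys:              # flat layout
        return True
    for key, value in obj.items():               # nested layout: {"bert": {...}}
        if isinstance(value, dict) and set(value) & transformer_keys:
            return True
        # msgpack allows non-string keys, so guard before .lower()
        if isinstance(key, str) and key.lower() in model_name_keys:
            return True
    return False

# bert-for-token-classification style payload that used to false-positive:
nested = {"bert": {"embeddings": {}, "encoder": {}}, "classifier": {"kernel": []}}
assert looks_like_transformer(nested)
assert not looks_like_transformer({"rows": [1, 2], "cols": [3, 4]})
```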
Resolve conflicts in modelaudit/scanners/pickle_scanner.py: - ALWAYS_DANGEROUS_GLOBALS: keep both branch additions (pkgutil.resolve_name, uuid internals) and main additions (cProfile/profile/pdb/timeit, ctypes) - ALWAYS_DANGEROUS_MODULES: merge branch's comprehensive module blocklist with main's PR #518 additions (ctypes, profiling, thread/process, module loading) and preserve main's linecache/logging.config exclusion note - REDUCE opcode handler: adopt main's resolved_callables approach with branch's _safe_memo fallback for BINGET memo patterns Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 52-54: Document the security trade-off for _MAX_RESYNC_BYTES in
pickle_scanner.py: update the comment near the _MAX_RESYNC_BYTES and
resync_skipped definitions to explain that the 256-byte limit prevents skipping
arbitrarily large gaps between pickle streams but can be evaded by inserting
>=256 bytes of non-pickle data before a malicious stream, note the rationale for
stopping on partial opcode sequences to avoid misinterpreting binary tensor
data, and recommend making _MAX_RESYNC_BYTES configurable (and covered by tests)
if callers need a different performance/security trade-off.
- Around line 409-411: The scan currently lists "functools" in
ALWAYS_DANGEROUS_MODULES which will produce false positives for legitimate ML
usages; update ML_SAFE_GLOBALS to whitelist safe functools members (e.g.,
partial, reduce, lru_cache, wraps) and adjust any checks that treat the entire
module as dangerous so they allow these specific attributes, or add a clarifying
comment (similar to the existing logging/linecache rationale) explaining why the
module is treated specially; reference ALWAYS_DANGEROUS_MODULES and
ML_SAFE_GLOBALS in your change so the scanner accepts common safe usages like
functools.partial or functools.lru_cache while still flagging dangerous patterns
that use functools to wrap unsafe callables.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- modelaudit/detectors/suspicious_symbols.py
- modelaudit/scanners/flax_msgpack_scanner.py
- modelaudit/scanners/pickle_scanner.py
- tests/scanners/test_pickle_scanner.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modelaudit/scanners/pickle_scanner.py (1)
2132-2138: ⚠️ Potential issue | 🟠 Major

Split INST from OBJ/NEWOBJ/NEWOBJ_EX handling — the current condition disables the latter three.

In pickletools.genops(), INST yields a string argument (module and class name), but OBJ, NEWOBJ, and NEWOBJ_EX have no inline arguments and yield arg=None. The isinstance(arg, str) guard prevents these three opcodes from ever triggering the detection pattern, making their addition ineffective.

Suggested fix

-        if opcode.name in ["INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"] and isinstance(arg, str):
-            return {
-                "pattern": f"{opcode.name}_EXECUTION",
-                "argument": arg,
-                "position": pos,
-                "opcode": opcode.name,
-            }
+        if opcode.name == "INST" and isinstance(arg, str):
+            return {
+                "pattern": f"{opcode.name}_EXECUTION",
+                "argument": arg,
+                "position": pos,
+                "opcode": opcode.name,
+            }
+
+        if opcode.name in {"OBJ", "NEWOBJ", "NEWOBJ_EX"}:
+            return {
+                "pattern": f"{opcode.name}_EXECUTION",
+                "argument": arg,
+                "position": pos,
+                "opcode": opcode.name,
+            }

This violates the guideline: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelaudit/scanners/pickle_scanner.py` around lines 2132 - 2138, The current check conflates INST with OBJ/NEWOBJ/NEWOBJ_EX by requiring isinstance(arg, str), which blocks OBJ/NEWOBJ/NEWOBJ_EX because they yield arg=None; update the logic in pickle_scanner.py (around the opcode handling using opcode.name and arg, e.g., in the code that references pickletools.genops and opcode.name) to split cases: handle opcode.name == "INST" with the isinstance(arg, str) guard and emit the INST_EXECUTION pattern including the string argument, and add a separate branch for opcode.name in ["OBJ","NEWOBJ","NEWOBJ_EX"] that emits a distinct pattern (e.g., OBJ_EXECUTION or NEWOBJ_EXECUTION) using position and opcode.name while not requiring arg to be a string so those opcodes are detected correctly.
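The reviewer's claim about inline arguments is easy to verify with pickletools alone. The handcrafted byte string below is illustrative and is only parsed, never unpickled:

```python
import io
import pickletools

# Handcrafted protocol-2 stream: PROTO 2, GLOBAL, EMPTY_TUPLE, NEWOBJ, STOP.
# pickletools.genops only decodes opcodes; nothing is executed.
data = b"\x80\x02ccollections\nOrderedDict\n)\x81."
ops = {op.name: arg for op, arg, _pos in pickletools.genops(io.BytesIO(data))}

assert ops["NEWOBJ"] is None                       # no inline argument
assert ops["GLOBAL"] == "collections OrderedDict"  # arg is the module+name string
```

This is exactly why an isinstance(arg, str) guard silently disables NEWOBJ detection while leaving INST (which does carry a string) working.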
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/flax_msgpack_scanner.py`:
- Around line 726-737: The loop that checks top-level keys (using found_keys,
obj, transformer_keys, model_name_keys, has_transformer_keys) can call
key.lower() on non-string keys and raise AttributeError; guard against that by
first verifying the key is a string (e.g., if isinstance(key, str)) before
calling key.lower(), and only perform the model-name membership checks
(key.lower() in model_name_keys and any(mk in key.lower() for mk in
model_name_keys)) when the key is a string so the scanner won't crash while
preserving the same transformer-key detection logic.
---
Outside diff comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 2132-2138: The current check conflates INST with
OBJ/NEWOBJ/NEWOBJ_EX by requiring isinstance(arg, str), which blocks
OBJ/NEWOBJ/NEWOBJ_EX because they yield arg=None; update the logic in
pickle_scanner.py (around the opcode handling using opcode.name and arg, e.g.,
in the code that references pickletools.genops and opcode.name) to split cases:
handle opcode.name == "INST" with the isinstance(arg, str) guard and emit the
INST_EXECUTION pattern including the string argument, and add a separate branch
for opcode.name in ["OBJ","NEWOBJ","NEWOBJ_EX"] that emits a distinct pattern
(e.g., OBJ_EXECUTION or NEWOBJ_EXECUTION) using position and opcode.name while
not requiring arg to be a string so those opcodes are detected correctly.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
- modelaudit/scanners/flax_msgpack_scanner.py
- modelaudit/scanners/pickle_scanner.py
- Guard key.lower() for non-string msgpack keys to prevent AttributeError that could abort the scan (evasion vector)
- Add functools safe functions to ML_SAFE_GLOBALS to whitelist common ML utilities like partial, reduce, lru_cache
- Increase _MAX_RESYNC_BYTES from 256 to 8192 so binary tensor data between pickle streams doesn't cause early scan termination
- Continue scanning on partial stream errors instead of returning early, preventing evasion via hidden payloads after binary blobs
- Clear _safe_memo at STOP boundaries so safe callables from one pickle stream cannot whitelist dangerous REDUCEs in the next

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
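The resync behavior described above (skip junk bytes between streams, reset the budget after each good stream) can be sketched as follows; scan_streams and the local MAX_RESYNC_BYTES constant are illustrative, not the scanner's real implementation:

```python
import io
import pickle
import pickletools

MAX_RESYNC_BYTES = 8192  # mirrors the commit's new budget

def scan_streams(data: bytes):
    """Opcode-name lists per pickle stream, skipping separator junk bytes."""
    f = io.BytesIO(data)
    streams, skipped = [], 0
    while f.tell() < len(data):
        start = f.tell()
        try:
            # genops stops at STOP, so this parses exactly one stream.
            streams.append([op.name for op, _a, _p in pickletools.genops(f)])
            skipped = 0              # reset the budget after each good stream
        except ValueError:
            skipped += 1
            if skipped > MAX_RESYNC_BYTES:
                break                # too much junk between streams: give up
            f.seek(start + 1)        # resync by skipping one junk byte
    return streams

# Two streams separated by two non-pickle bytes are both recovered:
blob = pickle.dumps([1]) + b"\xff\xfe" + pickle.dumps("x")
names = scan_streams(blob)
assert len(names) == 2 and all(s[-1] == "STOP" for s in names)
```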
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/flax_msgpack_scanner.py`:
- Around line 695-740: The substring check using model_name_keys (especially
short tokens like "opt" or "score") is too permissive; update the logic in the
block that computes has_transformer_keys (references: model_name_keys,
has_transformer_keys, transformer_keys, key_lower, found_keys, obj.get) to use
stricter token/boundary matching instead of any(mk in key_lower for mk in
model_name_keys): normalize key_lower and match whole tokens (e.g., split on
non-alphanumerics or use regex word boundaries) and explicitly check both exact
matches and token-level membership for entries in model_name_keys, and
add/update unit tests to include adversarial keys (e.g., "optional_score",
"desktop") and true positives to ensure the heuristic no longer false-positives
benign payloads.
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 95-100: When the branch with "if stream_error and had_opcodes" is
taken inside the PickleScanner scanning loop, increment the resync budget by
adding the number of bytes consumed by the discarded buffer to resync_skipped
and abort (raise/stop) when resync_skipped exceeds _MAX_RESYNC_BYTES;
specifically, compute the consumed byte count for the current partial stream
(e.g., bytes between the previous sync point and current file position or
len(buffer) being discarded), add that value to resync_skipped, and enforce the
existing _MAX_RESYNC_BYTES check before continuing so a crafted binary region
cannot indefinitely bypass the resync limit (update references in the scanning
loop where stream_error, had_opcodes, and resync_skipped are used).
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- modelaudit/scanners/flax_msgpack_scanner.py
- modelaudit/scanners/pickle_scanner.py
- Guard non-string keys: skip int/bytes msgpack keys instead of converting them to string, preventing potential crashes (CR #1)
- Remove substring matching for model_name_keys: use exact match only to avoid false positives like "opt" matching "optional" (CR #2)
- Charge partial-stream bytes against resync budget: prevents attackers from using repeated partial streams to bypass _MAX_RESYNC_BYTES (CR #3)
- Fix Windows test: relax compressed joblib assertion to not require specific keywords in issue messages, fixing CI on Windows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/flax_msgpack_scanner.py`:
- Around line 695-745: The current heuristic sets has_transformer_keys true if a
top-level key_lower is in model_name_keys, which lets generic names like
"score", "classifier", "qa_outputs", "lm_head" falsely trigger; change the check
so that ambiguous generic names only count when they have nested transformer
sub-keys (i.e., value is a dict and sub_keys & transformer_keys is non-empty) or
when another more-specific model name is also present. Concretely, introduce an
AMBIGUOUS_KEYS set (e.g., {"score","classifier","qa_outputs","lm_head"}) and in
the loop over found_keys: if key_lower in AMBIGUOUS_KEYS, require the value to
be a dict and contain transformer_keys before setting has_transformer_keys; if
key_lower in model_name_keys minus AMBIGUOUS_KEYS, you may set
has_transformer_keys immediately; keep the existing exact-match and type checks
for keys and use obj.get(key)/sub_keys & transformer_keys as before.
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 82-90: The buffered list used inside the subsequent-streams loop
(variable buffered in the loop iterating pickletools.genops(file_obj)) can grow
unbounded; introduce a fixed cap (e.g. MAX_BUFFERED_OPCODES) and enforce it
(either by using collections.deque(maxlen=MAX_BUFFERED_OPCODES) or by popping
oldest entries when len(buffered) >= MAX_BUFFERED_OPCODES) so buffered cannot
consume unlimited memory for large valid streams; update the code that appends
to buffered in the for item in pickletools.genops(file_obj) loop (and keep
had_opcodes logic unchanged) to respect this cap.
In `@tests/test_real_world_dill_joblib.py`:
- Around line 151-159: Replace the weak type check on result.success with a
concrete assertion covering expected outcomes: assert that either the scan
succeeded (result.success is True) or, if no bytes were scanned, the scanner
reported issues (result.bytes_scanned == 0 implies len(result.issues) > 0); you
can implement this as a single boolean assertion combining these conditions
(reference symbols: result.success, result.bytes_scanned, result.issues).
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- modelaudit/scanners/flax_msgpack_scanner.py
- modelaudit/scanners/pickle_scanner.py
- tests/test_real_world_dill_joblib.py
# Common HuggingFace model name keys that wrap transformer substructure
model_name_keys = {
    "bert",
    "roberta",
    "distilbert",
    "albert",
    "electra",
    "xlm",
    "gpt2",
    "gpt_neo",
    "gpt_neox",
    "gptj",
    "opt",
    "llama",
    "t5",
    "bart",
    "pegasus",
    "mbart",
    "blenderbot",
    "vit",
    "clip",
    "whisper",
    "wav2vec2",
    "flax_model",
    "classifier",
    "qa_outputs",
    "lm_head",
    "score",
}
has_transformer_keys = any(key in found_keys for key in transformer_keys)

# Check if any top-level key contains transformer sub-keys (nested model structure)
if not has_transformer_keys:
    for key in found_keys:
        value = obj.get(key)
        if isinstance(value, dict):
            sub_keys = set(value.keys())
            if sub_keys & transformer_keys:
                has_transformer_keys = True
                break
        # Also check if the top-level key is a known model name.
        # Only check string keys — msgpack allows int/bytes keys which
        # are never valid model names and would cause crashes on .lower().
        if not isinstance(key, str):
            continue
        key_lower = key.lower()
        # Use exact match only to avoid false positives from substring
        # matching (e.g. "opt" matching "optional", "t5" matching "t500").
        if key_lower in model_name_keys:
            has_transformer_keys = True
            break
Avoid promoting transformer classification from generic key names alone.

Using standalone top-level names like "score"/"classifier" as sufficient evidence is still too permissive and can misclassify benign msgpack payloads.

Suggested hardening

-                if key_lower in model_name_keys:
+                # Require corroborating structure before classifying as transformer.
+                if key_lower in model_name_keys and (
+                    sub_keys & transformer_keys
+                    or any(isinstance(child, dict) for child in value.values())
+                ):
                     has_transformer_keys = True
                     break

As per coding guidelines: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelaudit/scanners/flax_msgpack_scanner.py` around lines 695 - 745, The
current heuristic sets has_transformer_keys true if a top-level key_lower is in
model_name_keys, which lets generic names like "score", "classifier",
"qa_outputs", "lm_head" falsely trigger; change the check so that ambiguous
generic names only count when they have nested transformer sub-keys (i.e., value
is a dict and sub_keys & transformer_keys is non-empty) or when another
more-specific model name is also present. Concretely, introduce an
AMBIGUOUS_KEYS set (e.g., {"score","classifier","qa_outputs","lm_head"}) and in
the loop over found_keys: if key_lower in AMBIGUOUS_KEYS, require the value to
be a dict and contain transformer_keys before setting has_transformer_keys; if
key_lower in model_name_keys minus AMBIGUOUS_KEYS, you may set
has_transformer_keys immediately; keep the existing exact-match and type checks
for keys and use obj.get(key)/sub_keys & transformer_keys as before.
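A rough sketch of the AMBIGUOUS_KEYS idea the review proposes; the key sets are abbreviated and key_signals_transformer is a hypothetical helper, not the scanner's code:

```python
# Abbreviated key sets for illustration.
transformer_keys = {"encoder", "embeddings", "decoder", "pooler"}
model_name_keys = {"bert", "gpt2", "t5", "classifier", "score", "lm_head"}
AMBIGUOUS_KEYS = {"score", "classifier", "qa_outputs", "lm_head"}

def key_signals_transformer(key: str, value) -> bool:
    kl = key.lower()
    if kl in AMBIGUOUS_KEYS:
        # Generic names only count with corroborating transformer sub-keys.
        return isinstance(value, dict) and bool(set(value) & transformer_keys)
    return kl in model_name_keys

assert key_signals_transformer("bert", {"anything": 1})          # specific name
assert not key_signals_transformer("score", [0.1, 0.9])          # bare generic name
assert key_signals_transformer("classifier", {"encoder": {}})    # corroborated
```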
# Subsequent streams: buffer opcodes so that partial streams
# (e.g. binary tensor data misinterpreted as opcodes) don't
# produce false positives.
buffered: list[Any] = []
try:
    for item in pickletools.genops(file_obj):
        had_opcodes = True
        buffered.append(item)
except ValueError:
Bound buffered opcode growth for subsequent streams.

buffered grows without limit for later streams. A large valid second stream can consume substantial memory before any caller-side opcode cap is enforced.

Suggested guardrail

+_MAX_BUFFERED_OPCODES = 200_000
 @@
                     try:
                         for item in pickletools.genops(file_obj):
                             had_opcodes = True
                             buffered.append(item)
+                            if len(buffered) > _MAX_BUFFERED_OPCODES:
+                                stream_error = True
+                                logger.warning(
+                                    "Aborting oversized secondary pickle stream during resync "
+                                    "(>%d buffered opcodes)",
+                                    _MAX_BUFFERED_OPCODES,
+                                )
+                                break
                     except ValueError:
                         # Any ValueError on a subsequent stream means we hit
                         # non-pickle data or a junk separator byte.
                         stream_error = True

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelaudit/scanners/pickle_scanner.py` around lines 82 - 90, The buffered
list used inside the subsequent-streams loop (variable buffered in the loop
iterating pickletools.genops(file_obj)) can grow unbounded; introduce a fixed
cap (e.g. MAX_BUFFERED_OPCODES) and enforce it (either by using
collections.deque(maxlen=MAX_BUFFERED_OPCODES) or by popping oldest entries when
len(buffered) >= MAX_BUFFERED_OPCODES) so buffered cannot consume unlimited
memory for large valid streams; update the code that appends to buffered in the
for item in pickletools.genops(file_obj) loop (and keep had_opcodes logic
unchanged) to respect this cap.
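A minimal sketch of the capped-buffer variant using collections.deque(maxlen=...), one of the two options the review suggests; names and the cap value are illustrative:

```python
import io
import pickle
import pickletools
from collections import deque

MAX_BUFFERED_OPCODES = 200_000  # illustrative cap, as the review suggests

def buffer_stream_opcodes(f):
    """Buffer one stream's opcodes under a hard cap.

    deque(maxlen=...) drops the oldest entries, so a huge but valid
    secondary stream cannot exhaust memory while being buffered.
    """
    buffered = deque(maxlen=MAX_BUFFERED_OPCODES)
    had_opcodes = False
    try:
        for item in pickletools.genops(f):
            had_opcodes = True
            buffered.append(item)
    except ValueError:
        return None, had_opcodes  # partial stream: discard the buffer
    return list(buffered), had_opcodes

ops, seen = buffer_stream_opcodes(io.BytesIO(pickle.dumps(42)))
assert seen and ops[-1][0].name == "STOP"
```

Note the trade-off: a deque silently drops the earliest opcodes once the cap is hit, whereas the review's other option (abort and mark stream_error) preserves the "discard partial streams" semantics; either way memory stays bounded.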
assert isinstance(result.success, bool)
# May not scan bytes if compression format isn't recognized as pickle
# The scanner may:
#  1. Successfully scan some bytes (multi-stream parsing may consume compressed data)
#  2. Report issues when the compressed data isn't valid pickle
# Either outcome is acceptable — what matters is no unhandled crash.
if result.bytes_scanned == 0:
    # When no bytes were scanned, the scanner should have reported
    # some kind of issue explaining why.
    assert len(result.issues) > 0, "Should report issues for compressed files"
Strengthen this assertion to preserve regression value.

assert isinstance(result.success, bool) is effectively non-informative here and can let scanner regressions slip through undetected. Assert a concrete behavior outcome instead of type shape.

Proposed test tightening

-    assert isinstance(result.success, bool)
+    # Scanner must either parse bytes or explain why it could not.
+    assert result.bytes_scanned > 0 or len(result.issues) > 0
     # The scanner may:
     #  1. Successfully scan some bytes (multi-stream parsing may consume compressed data)
     #  2. Report issues when the compressed data isn't valid pickle
     # Either outcome is acceptable — what matters is no unhandled crash.
     if result.bytes_scanned == 0:
         # When no bytes were scanned, the scanner should have reported
         # some kind of issue explaining why.
         assert len(result.issues) > 0, "Should report issues for compressed files"

As per coding guidelines: "Add comprehensive tests including edge cases and regression coverage for scanner changes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_real_world_dill_joblib.py` around lines 151 - 159, Replace the
weak type check on result.success with a concrete assertion covering expected
outcomes: assert that either the scan succeeded (result.success is True) or, if
no bytes were scanned, the scanner reported issues (result.bytes_scanned == 0
implies len(result.issues) > 0); you can implement this as a single boolean
assertion combining these conditions (reference symbols: result.success,
result.bytes_scanned, result.issues).
Summary

- Promote msgpack>=1.0.0,<2.0 from an optional dependency (extras: flax) to a core dependency in pyproject.toml
- The flax_msgpack scanner was registered but silently fell back to generic scanning when msgpack was not installed, causing .msgpack Flax/JAX model files to miss specialized security analysis
- msgpack is a lightweight, zero-dependency C extension (~100KB) that is appropriate as a core dependency

Test plan

- FlaxMsgpackScanner loads successfully with python -c "from modelaudit.scanners.flax_msgpack_scanner import FlaxMsgpackScanner; print('OK')"
- sshleifer/tiny-gpt2, hf-internal-testing/tiny-bert-flax-only, patrickvonplaten/t5-tiny-random -- all recognized by the flax_msgpack scanner (not generic fallback)
- ArthurZ/tiny-random-bert-flax-only, sanchit-gandhi/tiny-random-flax-bert -- both scanned correctly
- python -m pytest tests/ -x -q --timeout=60

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Behavior Changes
Tests
Chores