feat(security): detect CVE-2025-10155 pickle protocol 0/1 bypass via .bin extension#605
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughDetects CVE-2025-10155 by extending Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/utils/file/detection.py`:
- Around line 224-231: The MARK-based pickle detection using magic16 is too
broad: in detection.py the condition "if magic16[:1] in (b'(', b']', b'}') and
any(op in magic16 for op in (b'c', b'\x80')): return 'pickle'" can yield false
positives for files starting with EMPTY_LIST (']') or EMPTY_DICT ('}')—tighten
the heuristic by restricting the start-opcode check to only the '(' MARK byte
(i.e., change the tuple to (b'(',) or otherwise require the exact MARK
structure) or alternatively add unit tests in test_cve_2025_10155_bin_pickle.py
that cover starting bytes b']' and b'}' to justify keeping them; update the
magic16-based conditional in the detection logic accordingly.
In `@tests/test_cve_2025_10155_bin_pickle.py`:
- Around line 20-65: Add two edge-case tests to ensure files starting with ']'
and '}' are not misdetected as pickle: in the test class
TestCVE202510155FormatDetection add methods (e.g.,
test_bracket_start_not_misdetected and test_brace_start_not_misdetected) that
write bytes starting with b"]..." and b"}..." to model.bin and assert
detect_file_format(str(bin_path)) == "pytorch_binary"; reference the detection
function detect_file_format and the existing test patterns for
safetensors/regular binary to mirror setup and assertions.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
CHANGELOG.mdmodelaudit/detectors/suspicious_symbols.pymodelaudit/utils/file/detection.pytests/conftest.pytests/test_cve_2025_10155_bin_pickle.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cve_2025_10155_bin_pickle.py`:
- Around line 49-56: Move the inline import of the stdlib zipfile inside
test_zip_bin_not_misdetected to the module level: add "import zipfile" at the
top of the test file and remove the in-function "import zipfile" statement so
the test function test_zip_bin_not_misdetected uses the module-level import.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
modelaudit/utils/file/detection.pytests/test_cve_2025_10155_bin_pickle.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/utils/file/detection.py`:
- Around line 224-229: The checks for ZIP and pickle magics inside the .bin
handling block are dead code and should be removed: update the function in
modelaudit/utils/file/detection.py to delete the redundant if
magic4.startswith(b"PK") and if any(magic4.startswith(m) for m in pickle_magics)
branches inside the .bin branch so execution falls through to the intended .bin
handling logic; keep existing earlier ZIP and pickle checks (the ones that
return before the .bin block) intact and run tests to confirm no behavioral
change.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
modelaudit/utils/file/detection.py
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 19: The changelog bullet "- **security:** detect CVE-2025-10155 pickle
protocol 0/1 payloads disguised as `.bin` files by recognizing protocol opcodes
during file-type detection" is missing terminal punctuation; update that bullet
(the exact string starting with "**security:** detect CVE-2025-10155...") to end
with a period so it matches surrounding entries and markdown grammar checks.
In `@tests/test_cve_2025_10155_bin_pickle.py`:
- Around line 90-130: Add a regression test inside the
TestCVE202510155PickleScanning test class that verifies a single comment token
does not suppress detection: create a new test (e.g.,
test_comment_token_does_not_bypass_detection) that writes a malicious protocol 0
(or protocol 2) pickle payload into payload.bin but injects a single comment
token character (e.g., b'#' or ASCII comment byte) into the byte stream, then
call PickleScanner().scan(...) and assert that result.issues contains at least
one IssueSeverity.CRITICAL; reference the class TestCVE202510155PickleScanning
and the PickleScanner.scan function to locate where to add this test.
- Around line 102-129: Replace the loose assertions that only check
len(result.issues) and severity with a targeted assertion that an issue exists
whose identifying fields match the expected payload signature (e.g., check_name
/ message / detail indicating "posix.system" or "nt.system") in the
PickleScanner tests (test_nt_system_in_bin_detected,
test_protocol2_posix_in_bin_detected and the posix .bin test above);
specifically, scan result.issues for an entry where issue.check_name (or
issue.message/detail) equals the expected sentinel (like "posix.system" or
"nt.system") and assert its severity is CRITICAL and its detail/message contains
the command string (e.g., "cmd /c whoami" or "id") so the test verifies the
exact detection rather than any unrelated finding.
- Around line 24-148: Add missing type annotations: import Path from pathlib at
the top and annotate each test function signature to include tmp_path: Path and
return type -> None (e.g., def
test_protocol0_global_opcode_detected_as_pickle(self, tmp_path: Path) -> None:).
Apply the same change for every test method in this file (examples:
test_protocol0_mark_opcode_detected_as_pickle, test_protocol2_still_detected,
test_safetensors_bin_not_misdetected, test_zip_bin_not_misdetected,
test_regular_bin_not_misdetected, test_bracket_start_not_misdetected,
test_brace_start_not_misdetected, and all methods inside
TestCVE202510155PickleScanning and TestCVE202510155BinaryPatterns) so each
tmp_path parameter is typed and each function ends with -> None.
- Around line 75-87: The test test_brace_start_not_misdetected uses a loose
negative assertion (assert result != "pickle") which can mask regressions;
update the assertion to require the exact expected format string (assert result
== "pytorch_binary") so the test explicitly verifies detect_file_format returns
the intended "pytorch_binary" value for a .bin file starting with '}' that is
not actually a pickle stream, and leave detect_file_format behavior unchanged
aside from ensuring it still returns "pytorch_binary" for this input.
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
CHANGELOG.mdmodelaudit/detectors/suspicious_symbols.pymodelaudit/utils/file/detection.pytests/conftest.pytests/test_cve_2025_10155_bin_pickle.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cve_2025_10155_bin_pickle.py`:
- Around line 33-45: Add tests covering protocol 1 by creating a .bin file that
starts with the protocol 1 magic byte sequence (0x80 0x01) and contains a
malicious GLOBAL-based payload (e.g., posix.system/'echo pwned') and assert
detect_file_format(str(bin_path)) == "pickle"; also add a separate test that
feeds the same protocol 1 payload to PickleScanner.scan(...) and asserts the
scanner flags it as a pickle/malicious sample (use a test name like
test_protocol1_detected_for_detect_file_format and
test_protocol1_scan_detected_by_PickleScanner_scan so the protocol 1 branch is
exercised for both detect_file_format and PickleScanner.scan).
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
CHANGELOG.mdtests/test_cve_2025_10155_bin_pickle.py
….bin extension Pickle protocol 0/1 files lack the standard \x80\x0N magic bytes, allowing malicious pickles to evade detection when disguised with a .bin extension. - Extend detect_file_format() to recognize GLOBAL opcode (b'c') and MARK-based patterns in .bin files as pickle format - Add posix/nt internal module names (posix.system, nt.system, etc.) to BINARY_CODE_PATTERNS for defense-in-depth - Add 13 tests covering format detection, scanner integration, and pattern verification Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add tests verifying ] and } prefixed .bin files are not misdetected as pickle when they lack a properly-formed GLOBAL opcode (c<module>\n). The prior heuristic matched bare b"c" anywhere in the 16-byte header, catching the ASCII letter 'c' in ordinary text (e.g. "content"). Tighten the check to require b"c" followed by b"\n" within the header window, mirroring the existing c-start detection at line 227. The new has_global_opcode variable makes the intent explicit. Also document the EMPTY_LIST/EMPTY_DICT opcode rationale inline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
02be9af to
6a513e8
Compare
Summary
detect_file_format()to recognize pickle protocol 0/1 files disguised with.binextension by detecting GLOBAL opcode (b'c') and MARK-based patternsposix/ntinternal module names (posix.system,posix.popen,nt.system,nt.popen) toBINARY_CODE_PATTERNSfor defense-in-depth binary scanning\x80\x0Nmagic bytes, allowing malicious pickles to bypass format detection when saved as.binfilesTest plan
modelaudit scanon crafted.binpickle reports CRITICAL findings🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests
Documentation