
Conversation


@monoetienne monoetienne commented Sep 19, 2025

  • Enhanced the Slither-to-JSON Python file
  • Added a CLI-output parsing function with additional vulnerability patterns and structured findings

Summary by CodeRabbit

  • New Features
    • Converts Slither analysis output into standardized diagnostic results.
    • Aggregates findings from multiple sources with regex-based parsing.
    • Supports key detectors: Controlled Delegatecall, Unchecked Low Level Call, and Missing Zero Address Validation, plus a heuristic fallback for general detector logs.
    • Assigns severities with file/function context, line ranges, timestamps, unique IDs, and optional references.
    • Produces consistent, JSON-like findings to streamline security reporting and tooling integration.
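
For illustration, a single finding in that JSON-like shape might look like the following (the field names and values are assumptions inferred from this summary, not the exact schema emitted by the script):

example_finding = {
    "id": "c7c1e9f0-5a3b-4d2e-9f1a-0b6d2c4e8a11",   # unique ID (UUID)
    "detector": "Controlled Delegatecall",
    "severity": "High",
    "file": "contracts/Proxy.sol",                   # hypothetical file
    "function": "forward(bytes)",                    # hypothetical function context
    "line_range": "10-14",
    "timestamp": "2025-09-19T12:00:00+00:00",        # ISO timestamp
    "reference": "https://github.com/crytic/slither/wiki/Detector-Documentation",
}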


coderabbitai bot commented Sep 19, 2025

Walkthrough

Adds a new function parse_slither_cli_output(stdout, stderr, contract_path) to parse Slither CLI outputs using regex, merge stderr/stdout, detect specific categories, infer severities heuristically, and emit standardized finding objects with IDs, timestamps, metadata, and line/function context.

Changes

Cohort: Slither diagnostics parsing
File(s): Static_agent/Slither_agent/slither_to_json_with_diagnostics.py
Summary: Adds parse_slither_cli_output(stdout, stderr, contract_path), which merges the outputs, applies regex patterns for multiple detector categories (Controlled Delegatecall, Unchecked Low Level Call, Missing Zero Address Validation, and generic INFO parsing), assigns severities, and generates structured findings with UUIDs and ISO timestamps.
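
For orientation, a simplified sketch of the function's overall shape based on this summary (the single pattern and the field names shown here are illustrative assumptions, not the actual implementation):

import re
import uuid
from datetime import datetime, timezone

def parse_slither_cli_output(stdout, stderr, contract_path):
    """Sketch: merge Slither CLI streams and emit structured, JSON-like findings."""
    combined = f"{stdout}\n{stderr}"  # stderr often carries the detector output
    findings = []

    # One illustrative category; the real function applies several such patterns.
    pattern = r'([^\n]+)\s+uses delegatecall to a input-controlled function id'
    for match in re.finditer(pattern, combined):
        findings.append({
            "id": str(uuid.uuid4()),
            "detector": "controlled-delegatecall",
            "severity": "High",
            "description": match.group(0).strip(),
            "file": contract_path,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    return findings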

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant SlitherCLI as Slither CLI
  participant Parser as parse_slither_cli_output
  participant Regex as Regex Detectors
  participant Aggregator as Findings Builder
  participant Output as JSON-like Findings

  User->>SlitherCLI: Run slither on contract_path
  SlitherCLI-->>User: stdout, stderr
  User->>Parser: stdout, stderr, contract_path
  Parser->>Parser: Merge stdout + stderr
  Parser->>Regex: Apply category-specific patterns
  Regex-->>Parser: Matched items (file, function, lines, messages)
  Parser->>Aggregator: Normalize + assign severity, IDs, timestamps
  Aggregator-->>Output: Findings array
  Output-->>User: Structured diagnostics
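
A hedged usage example of the flow above, assuming Slither is installed on PATH and the parser is importable from the new module (the import path, CLI invocation, and returned field names are assumptions):

import subprocess

# Assumed import path; adjust to match the actual package layout.
from slither_to_json_with_diagnostics import parse_slither_cli_output

contract_path = "contracts/MyContract.sol"

# Run Slither on the contract and capture both output streams.
result = subprocess.run(
    ["slither", contract_path],
    capture_output=True,
    text=True,
)

# Feed stdout/stderr into the parser to obtain structured findings.
findings = parse_slither_cli_output(result.stdout, result.stderr, contract_path)
for finding in findings:
    print(finding.get("severity"), finding.get("description"))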

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • robaribas

Poem

Thump-thump goes my parser’s heart,
From stderr shadows I glean each part.
Regex whiskers twitch with care,
Findings hop out, crisp and fair.
UUID carrots, timestamps bright—
A bunny logs the audit night.
(_(_)* ฅ^•ﻌ•^ฅ 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 50.00%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title concisely and accurately summarizes the primary change: enhanced Slither CLI output parsing to detect additional vulnerability patterns and emit structured findings, which matches the PR objectives and the added parse_slither_cli_output function in the diff.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
Static_agent/Slither_agent/slither_to_json_with_diagnostics.py (6)

52-53: Hyphen handling in the controlled delegatecall regex

The pattern matches the literal phrase "uses delegatecall to a input-controlled function id". Hyphens only need to be escaped (or placed at the beginning/end) when they appear inside a character class; here the hyphen in input-controlled sits outside any character class, so the pattern is correct as written. Consider using raw strings consistently and documenting the expected Slither output format.

    controlled_delegatecall_pattern = r'([^\n]+)\s+uses delegatecall to a input-controlled function id\n\t- ([^\n]+)'
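
To make the expected format concrete, here is a quick check of that pattern against a sample line shaped like Slither's controlled-delegatecall output (the sample text is illustrative, not captured from a real run):

import re

controlled_delegatecall_pattern = (
    r'([^\n]+)\s+uses delegatecall to a input-controlled function id\n\t- ([^\n]+)'
)

sample = (
    "Proxy.forward(bytes) (contracts/Proxy.sol#10-14) "
    "uses delegatecall to a input-controlled function id\n"
    "\t- target.delegatecall(data) (contracts/Proxy.sol#12)"
)

m = re.search(controlled_delegatecall_pattern, sample)
if m:
    print(m.group(1))  # caller signature and location
    print(m.group(2))  # offending delegatecall expression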


120-126: Dot handling in the regex pattern

In the pattern ([^.]+)\.([^.]+), the dots play two different roles: [^.] matches any character except a literal dot (no escaping is needed inside a character class), while the \. between the groups matches a literal dot. The pattern appears correct; consider adding a comment documenting this distinction so it does not confuse future maintainers.
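
A small illustration of that distinction (the file name is made up):

import re

# '[^.]+' consumes non-dot characters on each side; '\.' requires a literal dot between them.
m = re.match(r'([^.]+)\.([^.]+)', "Vault.sol")
print(m.groups())  # ('Vault', 'sol')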


157-161: Potential for duplicate findings due to overlapping patterns

The code checks if lines contain patterns already processed, but this string-based check could miss variations in formatting or fail if the patterns don't match exactly.

Consider tracking processed findings by a unique identifier (like line number + detector type) instead of string matching:

+    processed_findings = set()  # Track (line_num, detector_type) tuples
     for match in matches:
         lines = match.strip().split('\n')
         current_finding = None
         
         for line in lines:
             line = line.strip()
-            if not line or any(pattern in line for pattern in [
-                'uses delegatecall to a input-controlled function id',
-                'ignores return value by',
-                'lacks a zero-check on'
-            ]):
-                continue  # Skip lines we've already processed above
+            if not line:
+                continue
+            
+            # Create a simple hash to track processed items
+            line_hash = hash(line[:50])  # Use first 50 chars as identifier
+            if line_hash in processed_findings:
+                continue
+            processed_findings.add(line_hash)

178-184: Consider extracting severity inference logic to a separate function

The severity determination logic is embedded in the main parsing function, making it harder to maintain and test. Consider extracting it for better modularity.

def infer_severity(vuln_type):
    """Infer severity level from vulnerability type description"""
    vuln_lower = vuln_type.lower()
    
    high_keywords = ["reentrancy", "controlled", "delegatecall"]
    medium_keywords = ["low level", "unchecked", "missing"]
    low_keywords = ["naming", "convention", "style"]
    
    if any(keyword in vuln_lower for keyword in high_keywords):
        return "High"
    elif any(keyword in vuln_lower for keyword in medium_keywords):
        return "Medium"
    elif any(keyword in vuln_lower for keyword in low_keywords):
        return "Low"
    return "Medium"  # default

Then use it as:

-    # Determine severity based on vulnerability type
-    severity = "Medium"  # default
-    if any(keyword in vuln_type.lower() for keyword in ["reentrancy", "controlled", "delegatecall"]):
-        severity = "High"
-    elif any(keyword in vuln_type.lower() for keyword in ["low level", "unchecked", "missing"]):
-        severity = "Medium"
-    elif any(keyword in vuln_type.lower() for keyword in ["naming", "convention", "style"]):
-        severity = "Low"
+    severity = infer_severity(vuln_type)

202-208: Consider initializing 'elements' list at finding creation

The code checks if 'elements' exists in the finding dictionary before appending, but all findings are created with an 'elements' key. This check seems unnecessary.

     elif current_finding and (line.startswith('\t') or line.startswith('Reference:')):
-        if 'elements' not in current_finding:
-            current_finding['elements'] = []
         current_finding['elements'].append(line)
         
         if line.startswith('Reference:'):
             current_finding['reference'] = line.replace('Reference:', '').strip()

143-144: Inconsistent line_range data type

The line_range is set to a single line number string here, while other findings use a range format like "start-end". This inconsistency could cause issues for downstream consumers.

-    "line_range": line_num,
+    "line_range": f"{line_num}-{line_num}",  # Use consistent range format
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 64bb43a and 4c3b832.

📒 Files selected for processing (1)
  • Static_agent/Slither_agent/slither_to_json_with_diagnostics.py (1 hunks)
🔇 Additional comments (4)
Static_agent/Slither_agent/slither_to_json_with_diagnostics.py (4)

198-199: Line range formatting when start/end lines are missing

    "line_range": f"{file_info[1]}-{file_info[2]}" if file_info[1] and file_info[2] else None

The current implementation is correct: the ternary only builds the "start-end" string when both file_info[1] and file_info[2] are present, and otherwise sets line_range to None, so a "None-None" string cannot be produced. No change needed.


174-175: Null checking of regex match groups

    function_name = func_match.group(1).strip() if func_match else None
    file_info = file_match.groups() if file_match else (contract_path, None, None)

The current code is correct: func_match.group(1) is only accessed when func_match is truthy, so no AttributeError can occur here. However, ensure the same null checking is applied consistently across all regex operations.


57-62: Add error handling for regex group extraction

The code assumes func_match groups exist but doesn't handle cases where the regex might not match the expected format. This could lead to AttributeError if func_match is None.

     func_match = re.search(r'([^(]+)\(([^)]*)\)\s+\(([^#]+)#(\d+)-(\d+)\)', func_sig.strip())
     if func_match:
         contract_func = func_match.group(1).strip()
         file_path = func_match.group(3).strip()
         start_line = func_match.group(4)
         end_line = func_match.group(5)
+    else:
+        # Log warning or handle the case where pattern doesn't match
+        continue

Likely an incorrect or invalid review comment.


69-70: Populate tool_version from Slither instead of "unknown"

Verification attempt failed: slither is not installed in the verification environment (/bin/bash: line 4: slither: command not found).

  • Primary: invoke slither --version (capture stdout/stderr) and parse the semantic version (e.g. regex \d+\.\d+\.\d+); see the sketch below.
  • Fallbacks: read package metadata (python -m pip show slither-analyzer or python -c "import importlib.metadata as m; print(m.version('slither-analyzer'))") or inspect pyproject.lock/requirements.txt.
  • Apply change in Static_agent/Slither_agent/slither_to_json_with_diagnostics.py (lines 69–70): set tool_version to detected version string and add a CI/unit check to catch missing detection.
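
A minimal sketch of that detection, assuming the standard slither CLI entry point and the slither-analyzer package name (the helper name and fallback order are illustrative, not part of the current code):

import re
import subprocess

def detect_slither_version():
    """Best-effort detection of the installed Slither version; 'unknown' as a last resort."""
    # Primary: ask the CLI directly and extract a semantic version.
    try:
        out = subprocess.run(
            ["slither", "--version"], capture_output=True, text=True, timeout=10
        )
        match = re.search(r"\d+\.\d+\.\d+", out.stdout + out.stderr)
        if match:
            return match.group(0)
    except (OSError, subprocess.SubprocessError):
        pass
    # Fallback: read package metadata for slither-analyzer.
    try:
        from importlib.metadata import version
        return version("slither-analyzer")
    except Exception:
        return "unknown"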

@aleskrin aleskrin merged commit ce58770 into main Sep 21, 2025
1 check passed
@robaribas robaribas deleted the feature/stag-sltojson-parse-slither-cli branch October 2, 2025 11:28