Enhanced Slither CLI output parsing with additional vulnerability patterns and structured findings #104
**Walkthrough**

Adds a new function `parse_slither_cli_output(stdout, stderr, contract_path)` that parses Slither CLI output using regular expressions: it merges stderr and stdout, detects specific vulnerability categories, infers severities heuristically, and emits standardized finding objects with IDs, timestamps, metadata, and line/function context.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant SlitherCLI as Slither CLI
    participant Parser as parse_slither_cli_output
    participant Regex as Regex Detectors
    participant Aggregator as Findings Builder
    participant Output as JSON-like Findings
    User->>SlitherCLI: Run slither on contract_path
    SlitherCLI-->>User: stdout, stderr
    User->>Parser: stdout, stderr, contract_path
    Parser->>Parser: Merge stdout + stderr
    Parser->>Regex: Apply category-specific patterns
    Regex-->>Parser: Matched items (file, function, lines, messages)
    Parser->>Aggregator: Normalize + assign severity, IDs, timestamps
    Aggregator-->>Output: Findings array
    Output-->>User: Structured diagnostics
```
**Estimated code review effort**

🎯 3 (Moderate) | ⏱️ ~20 minutes
**Pre-merge checks and finishing touches**

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
> **Tip**
> 👮 Agentic pre-merge checks are now available in preview! Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs. Please see the documentation for more information.

Example:

```yaml
reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).
```
Actionable comments posted: 1

🧹 **Nitpick comments (6)**

`Static_agent/Slither_agent/slither_to_json_with_diagnostics.py` (6)
`52-53`: **Clarify hyphen handling in the regex pattern**

The pattern contains `input-controlled`. Since the hyphen sits outside any character class, it is matched literally and the pattern works as-is; no escaping is needed. If the token were ever moved into a character class, the hyphen would need to be escaped or placed at the beginning or end of the class. Consider using raw strings consistently and documenting the expected output format:

```python
controlled_delegatecall_pattern = r'([^\n]+)\s+uses delegatecall to a input-controlled function id\n\t- ([^\n]+)'
```
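To illustrate why the pattern is safe as written, here is a small standalone check of hyphen semantics in Python regexes (illustrative only, not part of the PR):

```python
import re

# Outside a character class a hyphen is a literal, so 'input-controlled'
# needs no escaping. Inside a class, a hyphen placed first or last is
# literal; between two characters it forms a range.
literal = re.search(r'input-controlled',
                    'uses delegatecall to a input-controlled function id')
class_literal = re.fullmatch(r'[a-z-]+', 'input-controlled')  # trailing hyphen: literal
class_range = re.fullmatch(r'[a-z]+', 'input-controlled')     # no hyphen: fails on '-'
```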
`120-126`: **Clarify dot handling in the regex pattern**

The pattern `([^.]+)\.([^.]+)` mixes two uses of the dot: `[^.]` correctly matches any character except a literal dot, while the `\.` between the groups matches a literal dot. The pattern is correct as written, but consider adding a comment clarifying this distinction, as it might confuse future maintainers.
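A quick standalone demonstration of the distinction (the sample string is invented, not actual Slither output):

```python
import re

# `\.` matches a literal dot, while `[^.]` matches any single character
# except a dot -- so the pattern splits "Contract.function" at the dot.
m = re.match(r'([^.]+)\.([^.]+)', 'Contract.withdraw')
parts = m.groups()
```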
`157-161`: **Potential for duplicate findings due to overlapping patterns**

The code checks whether lines contain patterns already processed, but this string-based check could miss variations in formatting or fail if the patterns don't match exactly.

Consider tracking processed findings by a unique identifier (such as line number + detector type) instead of string matching:

```diff
+ processed_findings = set()  # Track processed line identifiers
  for match in matches:
      lines = match.strip().split('\n')
      current_finding = None
      for line in lines:
          line = line.strip()
-         if not line or any(pattern in line for pattern in [
-             'uses delegatecall to a input-controlled function id',
-             'ignores return value by',
-             'lacks a zero-check on'
-         ]):
-             continue  # Skip lines we've already processed above
+         if not line:
+             continue
+
+         # Create a simple hash to track processed items
+         line_hash = hash(line[:50])  # Use first 50 chars as identifier
+         if line_hash in processed_findings:
+             continue
+         processed_findings.add(line_hash)
```
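The key-based variant can be sketched as a standalone helper; the field names `line_num` and `detector_type` are illustrative, not the PR's own schema:

```python
# Hypothetical sketch of identifier-based deduplication: keep the first
# finding for each (line number, detector type) pair.
def dedupe_findings(findings):
    seen = set()
    unique = []
    for finding in findings:
        key = (finding["line_num"], finding["detector_type"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(finding)
    return unique

sample = [
    {"line_num": 42, "detector_type": "unchecked-lowlevel"},
    {"line_num": 42, "detector_type": "unchecked-lowlevel"},  # duplicate
    {"line_num": 7, "detector_type": "reentrancy-eth"},
]
deduped = dedupe_findings(sample)
```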
`178-184`: **Consider extracting severity inference logic to a separate function**

The severity determination logic is embedded in the main parsing function, making it harder to maintain and test. Consider extracting it for better modularity:

```python
def infer_severity(vuln_type):
    """Infer severity level from vulnerability type description"""
    vuln_lower = vuln_type.lower()
    high_keywords = ["reentrancy", "controlled", "delegatecall"]
    medium_keywords = ["low level", "unchecked", "missing"]
    low_keywords = ["naming", "convention", "style"]
    if any(keyword in vuln_lower for keyword in high_keywords):
        return "High"
    elif any(keyword in vuln_lower for keyword in medium_keywords):
        return "Medium"
    elif any(keyword in vuln_lower for keyword in low_keywords):
        return "Low"
    return "Medium"  # default
```

Then use it as:

```diff
- # Determine severity based on vulnerability type
- severity = "Medium"  # default
- if any(keyword in vuln_type.lower() for keyword in ["reentrancy", "controlled", "delegatecall"]):
-     severity = "High"
- elif any(keyword in vuln_type.lower() for keyword in ["low level", "unchecked", "missing"]):
-     severity = "Medium"
- elif any(keyword in vuln_type.lower() for keyword in ["naming", "convention", "style"]):
-     severity = "Low"
+ severity = infer_severity(vuln_type)
```
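As a quick sanity check on the heuristic, a standalone copy behaves as expected on some invented sample messages (these strings are examples, not actual Slither output):

```python
# Standalone copy of the suggested keyword heuristic for demonstration.
def infer_severity(vuln_type):
    vuln_lower = vuln_type.lower()
    if any(k in vuln_lower for k in ["reentrancy", "controlled", "delegatecall"]):
        return "High"
    elif any(k in vuln_lower for k in ["low level", "unchecked", "missing"]):
        return "Medium"
    elif any(k in vuln_lower for k in ["naming", "convention", "style"]):
        return "Low"
    return "Medium"  # default

high = infer_severity("Reentrancy in withdraw()")
low = infer_severity("Variable is not in mixedCase naming convention")
default = infer_severity("Something unrecognized")
```

Note the keyword lists are checked in priority order, so a message mentioning both "reentrancy" and "naming" is classified High.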
`202-208`: **Consider initializing the 'elements' list at finding creation**

The code checks whether `'elements'` exists in the finding dictionary before appending, but all findings are created with an `'elements'` key, so this check is unnecessary:

```diff
  elif current_finding and (line.startswith('\t') or line.startswith('Reference:')):
-     if 'elements' not in current_finding:
-         current_finding['elements'] = []
      current_finding['elements'].append(line)
      if line.startswith('Reference:'):
          current_finding['reference'] = line.replace('Reference:', '').strip()
```
`143-144`: **Inconsistent `line_range` data type**

`line_range` is set to a single line number string here, while other findings use a range format like `"start-end"`. This inconsistency could cause issues for downstream consumers:

```diff
- "line_range": line_num,
+ "line_range": f"{line_num}-{line_num}",  # Use consistent range format
```
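One way to enforce the consistent format everywhere is a small normalizing helper; the function name and signature below are illustrative, not the PR's own:

```python
# Hypothetical helper normalizing line information to the "start-end"
# string format used by the other findings.
def format_line_range(start, end=None):
    if start is None:
        return None
    end = end if end is not None else start
    return f"{start}-{end}"

single = format_line_range("143")         # single line -> "143-143"
span = format_line_range("198", "199")    # explicit range
missing = format_line_range(None)         # no line info -> None
```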
📜 **Review details**

Configuration used: Path: `.coderabbit.yaml`
Review profile: CHILL
Plan: Pro

📒 **Files selected for processing (1)**

`Static_agent/Slither_agent/slither_to_json_with_diagnostics.py` (1 hunks)

🔇 **Additional comments (4)**

`Static_agent/Slither_agent/slither_to_json_with_diagnostics.py` (4)
`198-199`: **Line range format inconsistency — verified correct**

When `file_info[1]` or `file_info[2]` is `None`, one might worry this creates a string like `"None-None"` rather than setting `line_range` to `None`:

```python
"line_range": f"{file_info[1]}-{file_info[2]}" if file_info[1] and file_info[2] else None
```

On inspection, the current implementation is correct: it only formats the string when both values exist, and otherwise returns `None`.
`174-175`: **Unsafe assumption about regex match groups — verified correct**

The code accesses `func_match.group(1)`, which would raise an `AttributeError` if `func_match` were `None`:

```python
function_name = func_match.group(1).strip() if func_match else None
file_info = file_match.groups() if file_match else (contract_path, None, None)
```

The current code is actually correct thanks to the ternary guards. However, ensure consistent null checking across all regex operations.
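The guarded-access idiom the review endorses can be sketched in isolation; the pattern and sample lines below are illustrative, not the PR's actual regexes:

```python
import re

# Bind the match object once, then branch, so .group() is never
# called on None.
FUNC_RE = re.compile(r'(\w+)\(')

def extract_function_name(line):
    m = FUNC_RE.search(line)
    return m.group(1).strip() if m else None

found = extract_function_name("withdraw() (contract.sol#10-20)")
not_found = extract_function_name("no function signature here")
```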
`57-62`: **Add error handling for regex group extraction**

The code assumes the `func_match` groups exist but doesn't handle cases where the regex might not match the expected format, which could lead to an `AttributeError` if `func_match` is `None`:

```diff
  func_match = re.search(r'([^(]+)\(([^)]*)\)\s+\(([^#]+)#(\d+)-(\d+)\)', func_sig.strip())
  if func_match:
      contract_func = func_match.group(1).strip()
      file_path = func_match.group(3).strip()
      start_line = func_match.group(4)
      end_line = func_match.group(5)
+ else:
+     # Log warning or handle the case where pattern doesn't match
+     continue
```

Likely an incorrect or invalid review comment.
`69-70`: **Populate `tool_version` from Slither instead of `"unknown"`**

Verification attempt failed: `slither` is not installed in the verification environment (`/bin/bash: line 4: slither: command not found`).

- Primary: invoke `slither --version` (capture stdout/stderr) and parse the semantic version (e.g. regex `\d+\.\d+\.\d+`).
- Fallbacks: read package metadata (`python -m pip show slither-analyzer` or `python -c "import importlib.metadata as m; print(m.version('slither-analyzer'))"`) or inspect pyproject.lock/requirements.txt.
- Apply the change in `Static_agent/Slither_agent/slither_to_json_with_diagnostics.py` (lines 69–70): set `tool_version` to the detected version string, and add a CI/unit check to catch missing detection.