PR #3: Text Formatter for Scan Command #393

shivasurya · 2025-11-21T23:26:16Z

Objective

Implement human-readable text output for the scan command with detection type badges, code snippets, severity grouping, and taint flow visualization.

Changes

New Files

output/text_formatter.go (268 lines) - TextFormatter implementation
output/text_formatter_test.go (596 lines) - Comprehensive tests

Modified Files

cmd/scan.go - Integrated enrichment pipeline and text formatter

Features

✅ Detection Type Badges: [Pattern], [Taint-Local], [Taint-Global]
✅ Severity Grouping: Critical → High → Medium → Low (ordered by priority)
✅ Detail Levels:
    Critical/High: Full details with code snippets, taint flow, confidence
    Medium/Low: Single-line abbreviated format
✅ Code Snippets: Line numbers with highlight markers (>)
✅ Taint Flow Visualization: Source → Sink with variable tracking
✅ Summary Statistics: Total findings, severity breakdown
✅ Verbose Mode: Detection method breakdown
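
The severity grouping and the two detail levels listed above could be sketched roughly as follows. This is a minimal illustration, not the code in output/text_formatter.go: the `Finding` type, the `Render` function, and the exact line layout are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Finding is a hypothetical, simplified detection record used only for this sketch.
type Finding struct {
	Severity string // "critical", "high", "medium", or "low"
	Badge    string // "Pattern", "Taint-Local", or "Taint-Global"
	RuleID   string
	Location string // e.g. "auth/login.py:10"
}

// severityRank orders severities by priority.
var severityRank = map[string]int{"critical": 0, "high": 1, "medium": 2, "low": 3}

// rank returns the priority of a severity; unknown values sort last.
func rank(sev string) int {
	if r, ok := severityRank[sev]; ok {
		return r
	}
	return len(severityRank)
}

// Render orders findings Critical -> High -> Medium -> Low. Critical and high
// findings get a multi-line block; everything else is abbreviated to one line.
func Render(findings []Finding) string {
	sorted := append([]Finding(nil), findings...) // copy so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return rank(sorted[i].Severity) < rank(sorted[j].Severity)
	})
	var b strings.Builder
	for _, f := range sorted {
		if rank(f.Severity) <= 1 {
			fmt.Fprintf(&b, "[%s] [%s] %s\n    %s\n", f.Severity, f.Badge, f.RuleID, f.Location)
		} else {
			fmt.Fprintf(&b, "[%s] [%s] %s (%s)\n", f.Severity, f.Badge, f.RuleID, f.Location)
		}
	}
	return b.String()
}

func main() {
	fmt.Print(Render([]Finding{
		{"low", "Pattern", "weak-hash", "util/hash.py:3"},
		{"critical", "Taint-Local", "command-injection", "auth/login.py:10"},
	}))
}
```

A stable sort keeps findings of the same severity in their original order, which keeps the output deterministic between runs.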

Test Results

✅ All Go tests passing (19 packages)
✅ Text formatter coverage: 100%
✅ Output package coverage: 98.4%
✅ All Python tests passing (185 tests)
✅ Linting: 0 issues

Commits

Add text formatter with rich output - Core formatter implementation with 100% test coverage
Integrate text formatter in scan command - Replace old output with enrichment pipeline

Example Output

Code Pathfinder Security Scan

Results:

Critical Issues (1):

  [critical] [Taint-Local] command-injection: Command Injection
    CWE-78 | A03:2021

    auth/login.py:10

      > 10 | eval(user_input)

    Flow: user_input (line 5) -> eval (line 10)
    Tainted variable 'user_input' reaches dangerous sink without sanitization

    Confidence: High | Detection: Intra-procedural taint analysis

Summary:
  1 findings across 5 rules
  1 critical
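
The snippet and flow lines in the example output above could be produced along these lines. This is a sketch under assumptions: `RenderSnippet` and `FlowLine` are hypothetical helper names, and the real formatter's indentation and context handling may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RenderSnippet prints source lines with right-aligned line numbers and marks
// the highlighted line with ">". startLine is the number of lines[0].
func RenderSnippet(lines []string, startLine, highlight int) string {
	// Width of the largest line number, so the "|" column lines up.
	width := len(strconv.Itoa(startLine + len(lines) - 1))
	var b strings.Builder
	for i, text := range lines {
		n := startLine + i
		marker := " "
		if n == highlight {
			marker = ">"
		}
		fmt.Fprintf(&b, "%s %*d | %s\n", marker, width, n, text)
	}
	return b.String()
}

// FlowLine formats a source -> sink taint flow in the style of the example output.
func FlowLine(variable string, sourceLine int, sink string, sinkLine int) string {
	return fmt.Sprintf("Flow: %s (line %d) -> %s (line %d)", variable, sourceLine, sink, sinkLine)
}

func main() {
	fmt.Print(RenderSnippet([]string{"cmd = user_input", "eval(user_input)"}, 9, 10))
	fmt.Println(FlowLine("user_input", 5, "eval", 10))
}
```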

Dependencies

Stacked on: PR #2 (shiva/output-logging-system)
Blocks: PR #4 (JSON & CSV Formatters)

Tech Spec Reference

Implements Section 4.1 of output-standardization tech spec

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

safedep · 2025-11-21T23:26:20Z

SafeDep Report Summary

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

codecov · 2025-11-21T23:27:37Z

Codecov Report

❌ Patch coverage is 90.11628% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.66%. Comparing base (962a10c) to head (3f7d794).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
sourcecode-parser/cmd/scan.go	0.00%	17 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #393      +/-   ##
==========================================
+ Coverage   79.33%   79.66%   +0.33%     
==========================================
  Files          74       75       +1     
  Lines        7378     7542     +164     
==========================================
+ Hits         5853     6008     +155     
- Misses       1283     1292       +9     
  Partials      242      242


shivasurya · 2025-11-22T00:37:37Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

shivasurya · 2025-11-22T00:38:50Z

Merge activity

Nov 22, 12:38 AM UTC: A user started a stack merge that includes this pull request via Graphite.
Nov 22, 12:39 AM UTC: Graphite rebased this pull request as part of a merge.
Nov 22, 12:40 AM UTC: @shivasurya merged this pull request with Graphite.

- Detection type badges (Pattern, Taint-Local, Taint-Global)
- Severity-based grouping with detail levels
- Code snippets with line numbers and highlight
- Taint flow visualization
- Summary statistics
- Comprehensive tests with 100% coverage

Part of output standardization feature.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Replace old printDetections() with enrichment pipeline
- Add enricher to add context and metadata to detections
- Connect enricher -> formatter flow for rich output
- Keep printDetections() for query command compatibility

Part of output standardization feature.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Summary

Implements JSON and CSV output formatters for the `ci` command, replacing the old inline JSON generation with a modular, well-tested implementation.

**Part of output-standardization tech spec (Stacked PRs)**

- ✅ PR #1: Logging System Infrastructure (#391) - **Merged**
- ✅ PR #2: Output Package Foundation (#392) - **In Review**
- ✅ PR #3: Text Formatter for Scan Command (#393) - **In Review**
- 🔄 PR #4: JSON and CSV Formatters ← **This PR**

## Changes

### New Files

- `output/json_formatter.go` (235 lines)
  - Enhanced JSON output with rich metadata structure
  - Tool, scan, results, summary, and errors sections
  - Code snippets with configurable context lines
  - Taint flow source/sink information
  - CWE, OWASP, and reference metadata
- `output/csv_formatter.go` (123 lines)
  - CSV output for CI/CD integration
  - 17 columns: severity, confidence, rule_id, rule_name, cwe, owasp, file, line, column, function, message, detection_type, detection_scope, source_line, sink_line, tainted_var, sink_call
  - Proper escaping via encoding/csv package
- `output/json_formatter_test.go` (415 lines)
  - Comprehensive tests achieving 100% coverage
  - Structure validation, snippet handling, metadata, pattern vs taint detection
- `output/csv_formatter_test.go` (395 lines)
  - Comprehensive tests achieving 100% coverage
  - Header validation, escaping, multiple rows, zero values

### Modified Files

- `cmd/ci.go`
  - Replaced old `generateJSONOutput()` with new formatter integration
  - Added enrichment pipeline using `output.NewEnricher()`
  - Updated output format validation to include "csv"
  - Added CSV formatter support
  - Updated help text and examples
  - Exit code 1 when vulnerabilities found (for CI/CD)
- `cmd/ci_test.go`
  - Skipped obsolete `TestGenerateJSONOutput` (replaced by new formatter tests)
- `main_test.go`
  - Updated expected help text to include CSV output format

## JSON Output Structure

```json
{
  "tool": {
    "name": "Code Pathfinder",
    "version": "1.0.0",
    "url": "https://codepathfinder.dev"
  },
  "scan": {
    "target": "/path/to/project",
    "timestamp": "2025-01-21T10:30:00Z",
    "duration": 5.43,
    "rules_executed": 12
  },
  "results": [{
    "rule_id": "sql-injection",
    "rule_name": "SQL Injection",
    "message": "Unsanitized user input flows to SQL query",
    "severity": "critical",
    "confidence": "high",
    "location": {
      "file": "src/main.py",
      "line": 42,
      "column": 8,
      "function": "process_user",
      "snippet": {
        "start_line": 40,
        "end_line": 44,
        "lines": ["...", "query = f\"SELECT * FROM users WHERE id={user_id}\"", "..."]
      }
    },
    "detection": {
      "type": "taint-local",
      "scope": "intra-procedural",
      "confidence_score": 0.95,
      "source": {"line": 38, "variable": "user_id"},
      "sink": {"line": 42, "call": "execute"}
    },
    "metadata": {
      "cwe": ["CWE-89"],
      "owasp": ["A03:2021"],
      "references": ["https://..."]
    }
  }],
  "summary": {
    "total": 5,
    "by_severity": {"critical": 2, "high": 3},
    "by_detection_type": {"taint-local": 4, "pattern": 1}
  },
  "errors": []
}
```

## CSV Output Format

```csv
severity,confidence,rule_id,rule_name,cwe,owasp,file,line,column,function,message,detection_type,detection_scope,source_line,sink_line,tainted_var,sink_call
critical,high,sql-injection,SQL Injection,CWE-89,A03:2021,src/main.py,42,8,process_user,Unsanitized user input flows to SQL query,taint-local,intra-procedural,38,42,user_id,execute
```

## Testing

- All tests passing (100% coverage for both formatters)
- Output package overall: 98.1% coverage
- Linting checks passed
- Integration tests with ci command verified

## Usage Examples

```bash
# Generate JSON report
pathfinder ci --rules rules/ --project . --output json > results.json

# Generate CSV report
pathfinder ci --rules rules/ --project . --output csv > results.csv

# Generate SARIF report (existing)
pathfinder ci --rules rules/ --project . --output sarif > results.sarif
```

## Breaking Changes

- Old `generateJSONOutput()` function removed from cmd/ci.go
- JSON output structure changed to new rich format (snake_case fields)
- Exit code behavior unchanged (exits 1 when vulnerabilities found)

## Stack Status

This PR stacks on:

- **PR #3**: shiva/output-text-formatter (#393) ← base branch
- **PR #2**: shiva/output-logging-system (#392)
- **main**: Production branch

Next PR:

- PR #5: SARIF Formatter Enhancement (will stack on this PR)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary

Implements enhanced SARIF formatter with code flows, related locations, and rich metadata for optimal GitHub Code Scanning integration.

**Part of output-standardization tech spec (Stacked PRs)**

- ✅ PR #1: Logging System Infrastructure (#391) - **Merged**
- ✅ PR #2: Output Package Foundation (#392) - **In Review**
- ✅ PR #3: Text Formatter for Scan Command (#393) - **In Review**
- ✅ PR #4: JSON and CSV Formatters (#394) - **In Review**
- 🔄 PR #5: Enhanced SARIF Formatter ← **This PR**

## Changes

### New Files

- `output/sarif_formatter.go` (290 lines)
  - SARIF 2.1.0 compliant output formatter
  - Code flows for taint path visualization (source → sink)
  - Related locations for taint sources
  - Help text with markdown and CWE references
  - Security severity scores (9.0, 7.0, 5.0, 3.0)
  - Rule properties: tags, precision
  - Deduplicates rules across multiple detections
- `output/sarif_formatter_test.go` (519 lines)
  - Comprehensive tests achieving 97.5% coverage
  - Tests for version, tool metadata, rules, results
  - Code flow generation tests (taint-local, taint-global)
  - Related locations validation
  - Pattern vs taint detection differentiation

### Modified Files

- `cmd/ci.go`
  - Replaced old `generateSARIFOutput()` with new formatter
  - Uses enriched detections for rich output
  - Removed unused imports (sarif library, json, encoding/json)
  - Consistent pattern with JSON and CSV formatters
- `cmd/ci_test.go`
  - Skipped obsolete SARIF tests
  - Removed unused helper functions

## Key Features

### Code Flows

Taint detections automatically include code flows showing the path from source to sink:

```json
{
  "codeFlows": [{
    "message": {"text": "Taint flow from line 10 to line 20"},
    "threadFlows": [{
      "locations": [
        {
          "location": {"physicalLocation": {"region": {"startLine": 10}}},
          "message": {"text": "Taint source: user_input"}
        },
        {
          "location": {"physicalLocation": {"region": {"startLine": 20}}},
          "message": {"text": "Taint sink: os.system"}
        }
      ]
    }]
  }]
}
```

### Help Text with Markdown

Rules include rich help text with CWE references:

```markdown
## Command Injection

User input flows to shell command without sanitization

### References

- [CWE-78](https://cwe.mitre.org/data/definitions/78.html)
```

### Security Severity Scores

GitHub-compatible severity scores for prioritization:

- Critical: 9.0
- High: 7.0
- Medium: 5.0
- Low: 3.0

### Rule Properties

```json
{
  "properties": {
    "tags": ["security"],
    "security-severity": "9.0",
    "precision": "high"
  }
}
```

## Benefits over Old Implementation

| Feature | Old | New |
|---------|-----|-----|
| Code flows | ❌ None | ✅ Source → Sink visualization |
| Related locations | ❌ None | ✅ Taint sources highlighted |
| Help text | ❌ Plain text | ✅ Markdown with references |
| Security severity | ❌ Level only | ✅ Numeric scores for GitHub |
| Rule properties | ❌ None | ✅ Tags, precision |
| Pattern detection | ❌ Same as taint | ✅ No code flows (correct) |
| Test coverage | ❌ ~60% | ✅ 97.5% |

## Testing

- All tests passing (97.5% coverage on SARIF formatter)
- Output package overall: 97.5% coverage
- Linting checks passed
- Integration with ci command verified

## Usage Examples

```bash
# Generate enhanced SARIF report with code flows
pathfinder ci --rules rules/ --project . --output sarif > results.sarif

# Upload to GitHub Code Scanning
gh api /repos/:owner/:repo/code-scanning/sarifs -F sarif=@results.sarif

# View in GitHub UI with code flows highlighted
```

## SARIF Output Sample

```json
{
  "version": "2.1.0",
  "runs": [{
    "tool": {
      "driver": {
        "name": "Code Pathfinder",
        "version": "0.0.25",
        "rules": [{
          "id": "sql-injection",
          "name": "SQL Injection",
          "fullDescription": {"text": "Unsanitized user input flows to SQL query (CWE-89, A03:2021)"},
          "helpUri": "https://github.com/shivasurya/code-pathfinder",
          "defaultConfiguration": {"level": "error"},
          "properties": {
            "tags": ["security"],
            "security-severity": "9.0",
            "precision": "high"
          }
        }]
      }
    },
    "results": [{
      "ruleId": "sql-injection",
      "message": {"text": "Unsanitized user input flows to SQL query (sink: execute, confidence: 95%)"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "src/db/queries.py"},
          "region": {"startLine": 42, "startColumn": 8}
        }
      }],
      "codeFlows": [...],
      "relatedLocations": [...]
    }]
  }]
}
```

## Breaking Changes

- Old `generateSARIFOutput()` function removed
- SARIF output structure enhanced with additional fields
- Pattern matches no longer include code flows (correct behavior)

## Stack Status

This PR stacks on:

- **PR #4**: shiva/output-json-csv-formatters (#394) ← base branch
- **PR #3**: shiva/output-text-formatter (#393)
- **PR #2**: shiva/output-logging-system (#392)
- **main**: Production branch

Next PR:

- PR #6: Exit Code Standardization (will stack on this PR)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

This was referenced Nov 21, 2025

PR #4: Add JSON and CSV Output Formatters for CI Mode #394

Merged

PR #5: Enhanced SARIF Formatter with Code Flows #395

Merged

This was referenced Nov 22, 2025

PR #2: Structured Logging System #392

Merged

PR #6: Exit Code Standardization & --fail-on Flag #396

Merged

PR #7: Command Cleanup & Documentation #397

Merged

shivasurya changed the base branch from shiva/output-logging-system to graphite-base/393 November 22, 2025 00:38

shivasurya changed the base branch from graphite-base/393 to main November 22, 2025 00:39

shivasurya and others added 2 commits November 22, 2025 00:39

shivasurya force-pushed the shiva/output-text-formatter branch from d6b8b1c to 3f7d794 Compare November 22, 2025 00:39

shivasurya merged commit dd0a468 into main Nov 22, 2025
3 checks passed

shivasurya deleted the shiva/output-text-formatter branch November 22, 2025 00:40
