feat(query): --format sarif | annotations (B.8 — pipe rows into GitHub Code Scanning + PR annotations)#43

Merged
SutuSebastian merged 7 commits into main from feat/sarif-formatter
May 2, 2026
@SutuSebastian
Contributor

@SutuSebastian SutuSebastian commented May 2, 2026

Summary

Adds codemap query --format <text|json|sarif|annotations> so any recipe row-set can be piped into:

  • SARIF 2.1.0 — GitHub Code Scanning (and any SARIF-aware viewer) without a custom Action wrapper.
  • GH annotations: ::notice file=…,line=…::msg per row so PR diffs surface findings inline.

Pure output-formatter additions on top of the existing JSON pipeline; no schema impact. Closes B.8 from docs/research/fallow.md.

Tracers (one commit each)

  1. --format flag parser (cmd-query.ts) — enum + tests + parse-time rejection of incompatible combos (--summary / --group-by / baseline).
  2. formatSarif (application/output-formatters.ts) — pure transport-agnostic; wired end-to-end for deprecated-symbols recipe; 22 unit tests on location detection, message construction, region emission, ad-hoc rule id, recipe-body fullDescription.
  3. formatAnnotations emits ::notice file=…,line=…::msg per row; collapses newlines (GH parser stops at first one); level override supported.
  4. Edge cases — verified by smoke (no new code): aggregate recipes (index-summary, markers-by-kind) emit valid SARIF with results: [] + stderr warning; ad-hoc SQL gets rule.id = codemap.adhoc.
  5. MCP integration — same format argument on query / query_recipe tools; 4 new MCP server tests; same incompatibility guard mirrored at the tool layer. query_batch deferred to v1.x (annotation/sarif on a heterogeneous batch is awkward).
  6. Docs sync — README CLI stripe, glossary (SARIF + GH annotations), architecture.md (Output formatters wiring), agent rule + skill (.agents/ + templates/agents/ per Rule 10), changeset (minor); plan doc deleted on ship per Rule 3.

Decisions (from the now-deleted docs/plans/sarif-formatter.md)

| # | Decision |
| --- | --- |
| Location auto-detection | file_path / path / to_path / from_path priority + optional line_start / line_end |
| Aggregate recipes | Skipped → results: [] + stderr warning |
| rule.id taxonomy | codemap.<recipe-id> for --recipe; codemap.adhoc for ad-hoc SQL |
| result.level | "note" (per-recipe override deferred to v1.x via frontmatter) |
| Flag precedence | --format overrides --json; --json stays as alias for --format json; default text |
| Combo guard | sarif / annotations reject --summary / --group-by / baseline at parse time (different output shapes) |
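
The location auto-detection decision can be illustrated with a minimal sketch. This is not the shipped detectLocationColumn; the Row type, the regionOf helper, and the choice to inspect only the first row are assumptions made for illustration. Only the column names and their priority come from the table above.

```typescript
// Hypothetical sketch: helper names and shapes are assumptions, not the shipped code.
type Row = Record<string, unknown>;

// Priority order taken from the Decisions table.
const LOCATION_COLUMNS = ["file_path", "path", "to_path", "from_path"] as const;

// Returns the first location column (by priority) that holds a string in the
// first row, or undefined when the row-set has no usable location column.
function detectLocationColumn(rows: Row[]): string | undefined {
  const first = rows[0];
  if (first === undefined) return undefined;
  return LOCATION_COLUMNS.find((col) => typeof first[col] === "string");
}

// Builds an optional line region from line_start / line_end when present.
function regionOf(row: Row): { startLine: number; endLine?: number } | undefined {
  const start = row["line_start"];
  if (typeof start !== "number") return undefined;
  const end = row["line_end"];
  return typeof end === "number"
    ? { startLine: start, endLine: end }
    : { startLine: start };
}
```

An aggregate row such as `{ kind: "TODO", count: 4 }` has no location column, so detection returns undefined and the caller can fall back to the empty-results path.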

Behavior change

Additive — no existing flag semantics change. New --format flag; new format arg on two MCP tools. Plan doc deleted on ship.

Test plan

  • 22 formatter unit tests + 4 MCP server tests + 7 parser tests pass locally.
  • End-to-end smoke: bun src/index.ts query --recipe deprecated-symbols --format sarif emits a valid SARIF 2.1.0 doc; --format annotations emits one ::notice line per row.
  • bun run check (format / lint / typecheck / 458+ tests / golden queries / barrel-files) green on every tracer commit.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --format flag to codemap query with support for sarif (GitHub Code Scanning), annotations (GitHub Actions inline notices), json, and text formats.
    • Automatic location column detection for formatted query results.
  • Documentation

    • Updated CLI reference, glossary, and architecture documentation with new output formatting capabilities and examples.

…er 1 of 6)

Adds the parser slice + tests for the new --format flag. No formatter wired yet — sarif/annotations values parse and propagate but render via the existing text/json paths until Tracers 2 + 3 land the actual formatters.

Per docs/plans/sarif-formatter.md § D9, --format overrides --json when both passed; --json alone implies --format json (back-compat); default = text. Plan doc committed alongside (created on commit, deleted on ship per docs-governance Rule 3).
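
The precedence rule described above fits in a few lines. The following is a sketch under assumptions: ParsedFlags and resolveFormat are hypothetical names, and the real parser in cmd-query.ts may structure this differently.

```typescript
// Hypothetical sketch of the --format / --json precedence rule.
type OutputFormat = "text" | "json" | "sarif" | "annotations";

interface ParsedFlags {
  format?: OutputFormat; // value of --format, if it was passed
  json?: boolean; // true when --json was passed
}

function resolveFormat(flags: ParsedFlags): OutputFormat {
  if (flags.format !== undefined) return flags.format; // --format always wins
  if (flags.json) return "json"; // --json alone is an alias for --format json
  return "text"; // default
}
```

Resolving the format once and passing the result downstream avoids having later branches re-read the raw --json flag.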
…f 6)

- New application/output-formatters.ts: pure transport-agnostic formatter; SARIF 2.1.0 doc with auto-detected location columns (file_path / path / to_path / from_path priority) and optional region.startLine + .endLine. Recipes without locations emit results: [] + stderr warning (per plan § D6 + § D8).
- Wired into runQueryCmd: --format sarif short-circuits to printFormattedQuery before printQueryResult; recipe description / body pulled from getQueryRecipeCatalogEntry to populate rule.shortDescription / fullDescription.
- Parser combo guard: --format sarif|annotations cannot be combined with --summary / --group-by / --save-baseline / --baseline (different output shapes — sarif/annotations require flat rows). Tested.
- 22 unit tests on the formatter cover location detection, message construction, region emission, ad-hoc rule id (codemap.adhoc), recipe-body fullDescription.
- Annotations branch is a stub that prints "not yet implemented" + returns 1 — Tracer 3 lands the actual formatter.

Verified end-to-end: bun src/index.ts query --recipe deprecated-symbols --format sarif emits a valid SARIF 2.1.0 doc with one result for the @deprecated fixture symbol.
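
For orientation, here is a rough sketch of how flat rows might map onto a SARIF 2.1.0 document. It is not the shipped formatSarif: toSarif, the driver name "codemap", and the key=value message shape are assumptions. The rule id taxonomy, the "note" level, and the empty results array for non-locatable rows follow the description above. The function stays pure, so any stderr warning is left to the caller.

```typescript
// Hypothetical sketch; names and the message shape are illustrative assumptions.
type Row = Record<string, unknown>;

const LOCATION_COLUMNS = ["file_path", "path", "to_path", "from_path"];
const LINE_COLUMNS = ["line_start", "line_end"];

function toSarif(rows: Row[], recipeId?: string): string {
  const ruleId = recipeId ? `codemap.${recipeId}` : "codemap.adhoc";
  const results: object[] = [];

  for (const row of rows) {
    const col = LOCATION_COLUMNS.find((c) => typeof row[c] === "string");
    if (col === undefined) continue; // not locatable, e.g. an aggregate row

    const start = row["line_start"];
    const end = row["line_end"];
    const region =
      typeof start === "number"
        ? { startLine: start, ...(typeof end === "number" ? { endLine: end } : {}) }
        : undefined;

    // Assumed message shape: remaining columns rendered as "key=value" pairs.
    const text = Object.entries(row)
      .filter(([k]) => k !== col && !LINE_COLUMNS.includes(k))
      .map(([k, v]) => `${k}=${String(v)}`)
      .join(", ");

    results.push({
      ruleId,
      level: "note",
      message: { text },
      locations: [
        {
          physicalLocation: {
            artifactLocation: { uri: row[col] },
            ...(region ? { region } : {}),
          },
        },
      ],
    });
  }

  return JSON.stringify(
    {
      $schema: "https://json.schemastore.org/sarif-2.1.0.json",
      version: "2.1.0",
      runs: [{ tool: { driver: { name: "codemap", rules: [{ id: ruleId }] } }, results }],
    },
    null,
    2,
  );
}
```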
GitHub Actions ::notice file=…,line=…::msg per row. One line per locatable row; rows without a location column are silently skipped (caller decides whether to print a stderr warning); empty input → empty string. Newlines in the message are collapsed to single spaces because the GH parser stops at the first newline.

Default level 'notice'; 'warning' / 'error' overrides supported for future per-recipe severity (sarifLevel frontmatter, deferred to v1.x).

Verified end-to-end: bun src/index.ts query --recipe deprecated-symbols --format annotations emits the expected ::notice line for the @deprecated fixture symbol.
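
The per-row behavior described in this commit (skip rows without a location, collapse newlines, default level 'notice') could look roughly like the sketch below. toAnnotations and the messageOf callback are hypothetical names, and escaping of special characters is deliberately left out here.

```typescript
// Hypothetical sketch; not the shipped formatAnnotations. No escaping is applied.
type Row = Record<string, unknown>;
type Level = "notice" | "warning" | "error";

const LOCATION_COLUMNS = ["file_path", "path", "to_path", "from_path"];

function toAnnotations(
  rows: Row[],
  messageOf: (row: Row) => string, // assumed: caller supplies the message text
  level: Level = "notice",
): string {
  const lines: string[] = [];
  for (const row of rows) {
    const file = LOCATION_COLUMNS.map((c) => row[c]).find((v) => typeof v === "string");
    if (typeof file !== "string") continue; // rows without a location are skipped

    const params = [`file=${file}`];
    const line = row["line_start"];
    if (typeof line === "number") params.push(`line=${line}`);

    // Collapse each newline run (plus surrounding whitespace) to a single space,
    // because the workflow-command parser stops at the first newline.
    const message = messageOf(row).replace(/\s*[\r\n]+\s*/g, " ").trim();
    lines.push(`::${level} ${params.join(",")}::${message}`);
  }
  return lines.join("\n"); // empty input yields an empty string
}
```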
…acer 5 of 6)

Adds the same --format CLI surface to the MCP query / query_recipe tools so agents can request a formatted text payload directly without piping through codemap query.

- New formatEnum on the inputSchema (json | sarif | annotations); 'text' is omitted because terminal-table output is useless to an agent.
- formatToolIncompatibility mirrors the CLI parser's incompatibility check (sarif/annotations + summary/group_by → error).
- formattedQueryResult shared helper between query and query_recipe — query_recipe passes recipeId so the SARIF rule.id derives to codemap.<recipe>; query (ad-hoc) leaves it undefined → codemap.adhoc.
- 4 new MCP server tests cover SARIF on a recipe, annotations on a recipe, sarif+summary rejection, and sarif on ad-hoc SQL.

query_batch deliberately not wired — annotation/sarif on a heterogeneous batch is awkward (every statement could ask for a different format) and no real consumer has asked. Defer to v1.x.
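
A minimal sketch of the two tool-layer rules described above: 'text' is not an accepted format for agents, and the SARIF rule id depends on whether a recipe id was passed. AgentFormat, isAgentFormat, and sarifRuleId are hypothetical names; the actual server validates through formatEnum on the inputSchema.

```typescript
// Hypothetical sketch of the MCP-side format handling; names are assumptions.
const AGENT_FORMATS = ["json", "sarif", "annotations"] as const; // 'text' omitted on purpose
type AgentFormat = (typeof AGENT_FORMATS)[number];

function isAgentFormat(value: unknown): value is AgentFormat {
  return typeof value === "string" && (AGENT_FORMATS as readonly string[]).includes(value);
}

// query_recipe passes a recipe id; ad-hoc query leaves it undefined.
function sarifRuleId(recipeId?: string): string {
  return recipeId ? `codemap.${recipeId}` : "codemap.adhoc";
}
```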
…te plan + changeset (Tracer 6 of 6)

- README.md "Daily commands" stripe: add --format <text|json|sarif|annotations> example pair (sarif > findings.sarif and annotations).
- docs/glossary.md: new 'SARIF' and 'GH annotations' entries (per Rule 9 — new domain nouns).
- docs/architecture.md: new 'Output formatters' wiring paragraph above 'Validate wiring'; covers location auto-detection, rule-id taxonomy, MCP integration, deferred-to-v1.x overrides.
- .agents/rules/codemap.md + templates/agents/rules/codemap.md (Rule 10): new 'SARIF / GH annotations' row in the CLI table.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md: query / query_recipe tool descriptions extended with format arg semantics; query_batch deferral noted.
- .changeset/sarif-formatter.md: minor changeset (new flag).
- docs/plans/sarif-formatter.md: deleted on ship per docs-governance Rule 3.
- src/{cli/cmd-query.ts, application/output-formatters.ts}: replaced dangling cross-refs to the deleted plan with cross-refs to architecture.md § Output formatters.
@changeset-bot

changeset-bot Bot commented May 2, 2026

🦋 Changeset detected

Latest commit: 4cd91ba

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@stainless-code/codemap Minor

@coderabbitai

coderabbitai Bot commented May 2, 2026

Warning

Rate limit exceeded

@SutuSebastian has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 36 minutes and 8 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 14822b4f-fa15-4e5f-ae58-0e49424f44d8

📥 Commits

Reviewing files that changed from the base of the PR and between c592fa5 and 4cd91ba.

📒 Files selected for processing (5)
  • docs/glossary.md
  • src/application/output-formatters.test.ts
  • src/application/output-formatters.ts
  • src/cli/cmd-query.test.ts
  • src/cli/cmd-query.ts
📝 Walkthrough

Walkthrough

This PR adds support for outputting codemap query results as SARIF 2.1.0 JSON or GitHub Actions inline annotations via a new --format flag. Auto-detection of location columns is implemented, with incompatibility guards against --summary, --group-by, and baseline modes. Support extends to CLI, MCP tools, and documentation.

Changes

SARIF / Annotations Output Formatting

| Layer | File(s) | Summary |
| --- | --- | --- |
| Type Definitions | src/cli/cmd-query.ts | Adds OUTPUT_FORMATS constant, OutputFormat type, and isOutputFormat validator for `"text" |
| Core Formatter Implementation | src/application/output-formatters.ts | New module provides detectLocationColumn, hasLocatableRows, buildMessageText, formatSarif, and formatAnnotations functions to convert flat row sets into SARIF 2.1.0 JSON or GitHub Actions workflow commands. Supports recipe metadata for rule IDs and descriptions, with auto-detection of file_path/path and line columns. |
| CLI Flag Parsing & Validation | src/cli/cmd-query.ts, src/cli/main.ts | parseQueryRest extended to parse --format flag and return format: OutputFormat in run payload. formatIncompatibility helper rejects sarif/annotations with --summary, --group-by, or baseline flags. runQueryCmd branches on format to call printFormattedQuery for SARIF/annotations output. main.ts passes parsed format to command handler. |
| MCP Tool Integration | src/application/mcp-server.ts | query and query_recipe MCP tools now accept optional format parameter (validated via formatEnum). Tool handlers branch on `format: "sarif" |
| Documentation | README.md, docs/architecture.md, docs/glossary.md, .agents/rules/codemap.md, .agents/skills/codemap/SKILL.md, .changeset/sarif-formatter.md, templates/agents/rules/codemap.md, templates/agents/skills/codemap/SKILL.md | CLI/MCP tool signatures documented with --format / format parameter and SARIF rule-ID conventions. Architecture and glossary entries clarify location detection, incompatibilities, and per-recipe SARIF overrides via frontmatter fields. |
| Test Coverage | src/application/output-formatters.test.ts, src/application/mcp-server.test.ts, src/cli/cmd-query.test.ts | Comprehensive tests for location detection, message construction, SARIF generation (version, rule IDs, regions, fullDescription), GitHub annotations formatting, MCP tool format handling, and CLI --format flag parsing including precedence and error cases. |

Sequence Diagram

sequenceDiagram
    participant CLI as CLI Parser
    participant Validator as Format Validator
    participant Query as Query Executor
    participant Formatter as Output Formatter
    participant Output as stdout/stderr

    CLI->>Validator: parseQueryRest(args)<br/>extract --format
    Validator->>Validator: validate format vs<br/>--summary/--group-by
    alt Format Incompatibility
        Validator->>Output: return error
    else Valid
        Validator->>Query: runQueryCmd({ format })
        Query->>Query: executeQuery(sql)
        Query->>Formatter: detectLocationColumn(rows)
        alt No Locatable Rows
            Formatter->>Output: warn to stderr
            Formatter->>Output: results: [] (empty SARIF)
        else Locatable Rows Found
            Formatter->>Formatter: format === "sarif"?<br/>buildMessageText(),<br/>construct SARIF results
            Formatter->>Formatter: format === "annotations"?<br/>emit ::notice lines
            Formatter->>Output: print formatted text
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

enhancement, documentation

Poem

🐰 A format so fine, SARIF by line,
Annotations dance on PRs so divine,
Location columns detected with care,
Results flowing forth, rules laid bare!
Query now speaks in formats so rare,
Codemap's new voice rings through the air. 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: adding --format sarif \| annotations to the query command for piping rows into GitHub Code Scanning and PR annotations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/cli/cmd-query.test.ts (1)

542-602: ⚡ Quick win

Add explicit parser tests for --format incompatibility combos.

This suite validates parsing/value errors well, but it doesn’t lock the guard behavior for --format sarif|annotations combined with --summary, --group-by, and baseline flags. Adding those assertions here would prevent regressions on the most failure-prone matrix.

Proposed test additions
 describe("parseQueryRest — --format flag", () => {
+  it("rejects --format sarif with --summary", () => {
+    const r = parseQueryRest([
+      "query",
+      "--format",
+      "sarif",
+      "--summary",
+      "SELECT 1",
+    ]);
+    expect(r.kind).toBe("error");
+  });
+
+  it("rejects --format annotations with --group-by", () => {
+    const r = parseQueryRest([
+      "query",
+      "--format",
+      "annotations",
+      "--group-by",
+      "directory",
+      "SELECT file_path FROM symbols",
+    ]);
+    expect(r.kind).toBe("error");
+  });
+
+  it("rejects --format sarif with --baseline", () => {
+    const r = parseQueryRest([
+      "query",
+      "--format",
+      "sarif",
+      "--baseline=base",
+      "SELECT 1",
+    ]);
+    expect(r.kind).toBe("error");
+  });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/cmd-query.test.ts` around lines 542 - 602, Add explicit tests to
parseQueryRest that assert --format values "sarif" and "annotations" are
rejected when used with incompatible flags: --summary, --group-by, and any
baseline flags (e.g., --baseline, --baseline-compare); for each combination call
parseQueryRest([...]) and assert r.kind === "error" and that r.message mentions
both the offending --format value and the conflicting flag so failures are
clear; add these tests alongside the existing "--format flag" describe block
referencing parseQueryRest and the exact flag names to lock guard behavior and
prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/glossary.md`:
- Around line 383-386: Move the "### GH annotations" entry into the alphabetical
"G" section of the glossary (it currently resides under the "S" section); locate
the header "### GH annotations" and cut/paste its entire paragraph so it appears
under the existing "## G" heading (ensuring surrounding entries remain
alphabetically ordered), and verify the surrounding headings and anchor links or
cross-references still render correctly after the move.

In `@src/application/output-formatters.ts`:
- Around line 200-205: The current annotation assembly emits raw values (file,
lineN, and message) into the GitHub Actions command via lines.push(`::${level}
${params.join(",")}::${message}`), which breaks if file contains "," or ":" or
message contains "%" or CR/LF; update the code around params, file, lineN,
buildMessageText, and the final lines.push so that: percent-encode
workflow-command field values per GitHub rules (replace "%" -> "%25", "\r" ->
"%0D", "\n" -> "%0A") for the message and also escape "," -> "%2C" and ":" ->
"%3A" in parameter values (file and any param strings) before joining and
printing; ensure buildMessageText(row) is still trimmed/collapsed before
applying the encoding so annotations are well-formed.

In `@src/cli/cmd-query.ts`:
- Around line 625-631: The non-SARIF output branch still checks opts.json
instead of the resolved format, so the new --format flag is ignored; update the
result-rendering logic to use the resolved OutputFormat (e.g. a local variable
like resolvedFormat or the field opts.format) instead of opts.json when deciding
between text vs JSON rendering (and likewise ensure any checks that disallow
group-by/summary/baseline for "sarif" or "annotations" use the same resolved
format). Locate the code that reads opts.json and opts.format (and the type
OutputFormat / format?: OutputFormat) and replace conditional checks on
opts.json with checks on the resolved format variable so --format overrides
--json/--text consistently (also apply the same change to the other branch
ranges noted around the other block).
- Around line 47-60: Update the CLI help and parser usage strings to document
the new --format flag and its accepted values: reference OUTPUT_FORMATS
(text|json|sarif|annotations) and the incompatibility rule that --format
overrides --json when both are supplied (with --json remaining an alias for
--format json); specifically edit printQueryCmdHelp() and any usage/description
strings returned by the parser in this file to mention --format and the
behavior, and validate/help text should call out the default (text) and that
--json is equivalent to --format json so users see both options and the
precedence rule.

---

Nitpick comments:
In `@src/cli/cmd-query.test.ts`:
- Around line 542-602: Add explicit tests to parseQueryRest that assert --format
values "sarif" and "annotations" are rejected when used with incompatible flags:
--summary, --group-by, and any baseline flags (e.g., --baseline,
--baseline-compare); for each combination call parseQueryRest([...]) and assert
r.kind === "error" and that r.message mentions both the offending --format value
and the conflicting flag so failures are clear; add these tests alongside the
existing "--format flag" describe block referencing parseQueryRest and the exact
flag names to lock guard behavior and prevent regressions.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e82d91bd-684b-45f8-a154-387a9b77fda3

📥 Commits

Reviewing files that changed from the base of the PR and between e47d1f5 and c592fa5.

📒 Files selected for processing (15)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • .changeset/sarif-formatter.md
  • README.md
  • docs/architecture.md
  • docs/glossary.md
  • src/application/mcp-server.test.ts
  • src/application/mcp-server.ts
  • src/application/output-formatters.test.ts
  • src/application/output-formatters.ts
  • src/cli/cmd-query.test.ts
  • src/cli/cmd-query.ts
  • src/cli/main.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md

…c --format in CLI help (CodeRabbit on #43)

Four CodeRabbit threads, all verified ✅ correct:

1. **(major) Text/JSON path was ignoring --format.** runQueryCmd resolved opts.format in the parser then passed opts.json (raw) to printQueryResult, breaking my own design D9: --format json (no --json) printed a terminal table; --json --format text printed JSON. Fixed by computing effectiveFormat once at the top of runQueryCmd and threading the resulting isJson boolean through every branch (saveBaseline / baselineDiff / groupedQuery / printQueryResult / emitErrorMaybeJson). Two new parser tests lock the precedence (--format text wins over --json; --format json with no flag still emits JSON).

2. **(major) formatAnnotations emitted unescaped fields.** Per actions/toolkit (https://github.com/actions/toolkit/blob/master/packages/core/src/command.ts), property values must escape % \r \n : , and message payloads must escape % \r \n; otherwise file paths with : (Windows drive letters) or , break the annotation, and messages with % get parsed as malformed escape sequences. Added escapeAnnotationData + escapeAnnotationProperty helpers (exported + 5 unit tests covering order-of-operations, empty strings, idempotence). Whitespace collapse still runs first so messages stay single-line by GH spec; the CR/LF escape paths are exercised by property values.

3. **(minor) printQueryCmdHelp didn't advertise --format.** Added the full enum + precedence + sarif/annotations incompatibility note. Both "missing SQL or recipe" usage strings updated to mention --format <fmt>.

4. **(minor) docs/glossary.md alphabetical order.** GH annotations was under ## S (next to SARIF since they shipped together); moved to a new ## G section between F and H per glossary's per-letter structure. Cross-references unchanged.

Smoke verified post-fix: `codemap query --format json` emits JSON; `--json --format text` emits text.

37 new + updated unit tests pass; bun run check green.
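
The escaping rules from item 2 can be sketched as follows. The helper names match the commit message, but the bodies are an assumption modeled on the escapeData / escapeProperty functions in actions/toolkit, not a copy of the shipped code.

```typescript
// Sketch of workflow-command escaping, modeled on actions/toolkit's command.ts.
// Message payloads: escape %, CR and LF. "%" must go first so the "%" characters
// introduced by the later replacements are not escaped a second time.
function escapeAnnotationData(value: string): string {
  return value.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values (file=…, line=…): same as data, plus ":" and ",",
// which would otherwise terminate the property or the property list.
function escapeAnnotationProperty(value: string): string {
  return escapeAnnotationData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}
```

With these helpers, a Windows-style path such as `C:\src\a,b.ts` becomes `C%3A\src\a%2Cb.ts` in the file property, so the drive-letter colon and the comma no longer split the command.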
…odeRabbit nitpick on #43)

7 new parser tests covering every incompatible combo CodeRabbit flagged: --format sarif|annotations × --summary | --group-by | --baseline | --save-baseline (both =name and default-name forms) on ad-hoc SQL + on recipes. Plus 2 negative tests confirming --format text/json compose freely with summary/group-by (text/json don't trip the guard).

The guard logic existed since Tracer 2 but only had end-to-end coverage via the SARIF/annotations smoke + the MCP-side mirror tests; these direct parse-time tests prevent regressions when the parser grows new flags.
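
As an illustration of the guard these tests lock down, a parse-time check might look like the sketch below. QueryFlags, its field names, and the error wording are assumptions; only the rule itself (sarif / annotations reject --summary, --group-by, --baseline, --save-baseline, while text / json compose freely) comes from the PR.

```typescript
// Hypothetical sketch of the parse-time combo guard; not the shipped formatIncompatibility.
type OutputFormat = "text" | "json" | "sarif" | "annotations";

interface QueryFlags {
  format: OutputFormat;
  summary?: boolean;
  groupBy?: string;
  baseline?: string | true; // =name form or default-name form
  saveBaseline?: string | true;
}

// Returns an error message for an incompatible combination, or undefined when valid.
function formatIncompatibility(flags: QueryFlags): string | undefined {
  // text / json never trip the guard.
  if (flags.format !== "sarif" && flags.format !== "annotations") return undefined;

  const conflicts: Array<[string, boolean]> = [
    ["--summary", flags.summary === true],
    ["--group-by", flags.groupBy !== undefined],
    ["--baseline", flags.baseline !== undefined],
    ["--save-baseline", flags.saveBaseline !== undefined],
  ];
  const hit = conflicts.find(([, active]) => active);
  return hit
    ? `--format ${flags.format} cannot be combined with ${hit[0]} (flat rows required)`
    : undefined;
}
```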
@SutuSebastian SutuSebastian merged commit 4061ac3 into main May 2, 2026
10 checks passed
@SutuSebastian SutuSebastian deleted the feat/sarif-formatter branch May 2, 2026 16:19
SutuSebastian added a commit that referenced this pull request May 2, 2026
…llow + competitive-scan (#45)

- fallow.md row B.8 was ❌ Open; PR #43 shipped it. Updated to ✅ with implementation notes (rule.id taxonomy, location auto-detection, deferred frontmatter overrides).
- fallow.md MCP-server bullet said 'HTTP API stays in roadmap backlog'; PR #44 shipped it. Reworded; added a dedicated PR #44 bullet covering the new transport, shared tool/resource handlers, CSRF/DNS-rebinding guard, Zod validation at the HTTP boundary, and 404/500 status semantics.
- competitive-scan-2026-04.md '4. What moved to the roadmap': replaced 'HTTP API (codemap serve) — still backlog' with shipped rows for PR #43 + PR #44.
- Fixed broken anchor: cross-ref to fallow.md was #status-snapshot-as-of-2026-05-01 but the heading bumped to 2026-05-02 in PR #42.