feat: ship workflow hardening#25

Merged
mohanagy merged 2 commits into
mainfrom
feature/workflow-hardening
Apr 30, 2026

Owner

@mohanagy mohanagy commented Apr 30, 2026

Summary

Testing

  • npm run test:run
  • npm run typecheck
  • npm run build
  • npm pack --dry-run (if packaging or install behavior changed)

Checklist

  • I updated docs for any user-visible change
  • I added or updated tests when behavior changed
  • I did not commit secrets, private corpora, or accidental generated artifacts
  • I kept this PR focused on a single change or tightly related set of changes

Related issues

Summary by CodeRabbit

Release Notes v0.9.2

  • New Features

    • Four new analysis tools: relevant files, feature maps, risk analysis, and implementation checklists for enhanced codebase insights.
    • Framework-aware JavaScript/TypeScript code extraction with semantic role detection.
    • Snippet coverage metric in evaluation reports alongside recall and ranking metrics.
  • Improvements

    • Retrieve/impact tools default to compact responses; use verbose: true for legacy full payloads.
    • File paths displayed relative to project root where applicable.
    • Stricter evaluation regression gates with snippet coverage enforcement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai Bot commented Apr 30, 2026

Warning

Rate limit exceeded

@mohanagy has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 33 minutes and 51 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3d672cb8-df6a-4e64-9029-392cfdc90e45

📥 Commits

Reviewing files that changed from the base of the PR and between abca524 and cb6a6e5.

📒 Files selected for processing (7)
  • src/infrastructure/generate.ts
  • src/pipeline/export.ts
  • src/pipeline/extract/combine.ts
  • src/runtime/retrieve.ts
  • src/shared/source-location.ts
  • tests/unit/generate.test.ts
  • tests/unit/retrieve.test.ts
📝 Walkthrough


This PR introduces a snippet coverage metric for evaluation, adds four new MCP developer tools (relevant_files, feature_map, risk_map, implementation_checklist), changes MCP payload semantics from opt-in compact: true to compact by default with a verbose: true override, implements path relativization against the graph root, makes node_kind optional, and bumps the version to 0.9.2.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Version & Metadata | package.json | Version bumped from 0.9.1 to 0.9.2. |
| Documentation | CHANGELOG.md, README.md, docs/language-capability-matrix.md, docs/proof-workflows.md, examples/why-graphify.md | Updated to document snippet coverage metric in eval, new MCP tools (relevant_files, feature_map, risk_map, implementation_checklist), compact-by-default payload semantics with verbose: true override, and new Framework awareness section for JS/TS extractor's framework_role annotations. |
| CI Configuration | .github/workflows/ci.yml | CI eval regression gate now parses and validates snippet coverage metric alongside recall and MRR. |
| CLI Help | src/cli/main.ts | Updated help text to reflect eval measures recall, MRR, and snippet coverage. |
| Core Benchmark System | src/infrastructure/benchmark/quality.ts | Added per-question snippet_coverage metric (percentage of matched nodes with non-empty snippet) and aggregate avg_snippet_coverage to QualityResult and QualityReport interfaces, included in formatted output. |
| Path Relativization Utility | src/shared/source-path.ts | New exported relativizeSourceFile function converts absolute paths to root-relative paths when within configured graph root, with safety checks for escaping/invalid results. |
| Graph Construction | src/pipeline/build.ts | Graph now propagates optional root_path field from extraction input to knowledge graph metadata. |
| Retrieval Runtime | src/runtime/retrieve.ts | Added source file path relativization against graph root; made RetrieveMatchedNode.node_kind optional and conditionally included only when non-empty; updated compaction helpers to preserve optional node_kind. |
| Impact Analysis Runtime | src/runtime/impact.ts | Added source file path relativization for discovered and target nodes; made ImpactNode.node_kind optional; simplified framework ranking to work with optional node_kind. |
| New Runtime Tool: Relevant Files | src/runtime/relevant-files.ts | New function retrieves context, filters peripheral nodes, aggregates matches per source file with symbol tracking, ranks by score, and returns relevant files with match explanations and token counts. |
| New Runtime Tool: Feature Map | src/runtime/feature-map.ts | New function retrieves context and relevant files, aggregates matched nodes into per-community stats, ranks communities and entry points, generates summaries and why explanations, returns structured feature map result. |
| New Runtime Tool: Risk Map | src/runtime/risk-map.ts | New function builds feature scope, discovers graph communities and structural hotspots (bridge/god nodes), scores candidate nodes by impact breadth and structural exposure, returns ranked risk entries with affected files and hotspot classifications. |
| New Runtime Tool: Implementation Checklist | src/runtime/implementation-checklist.ts | New function computes feature and risk maps, derives edit steps from relevant files with entry points and risk derivations, builds validation steps from entry points and top risks, returns checklist result with summary and structured steps. |
| Server Graph Loading | src/runtime/serve.ts | Graph loading now forwards optional root_path field from parsed graph.json into extraction payload. |
| MCP Tool Definitions | src/runtime/stdio/definitions.ts | Replaced compact-only flag with verbose boolean (legacy override); added four new tool schemas for relevant_files, feature_map, risk_map, implementation_checklist with question/budget/limit/file_type/community parameters. |
| MCP Tool Handlers | src/runtime/stdio/tools.ts | Added verbose parameter handling for impact/retrieve; updated result shaping to return compacted output by default, full output when verbose===true; implemented handlers for four new tools with input validation and JSON stringification. |
| Benchmark & Integration Tests | tests/unit/benchmark-quality.test.ts, tests/unit/retrieve.test.ts, tests/unit/impact.test.ts, tests/unit/stdio-server.test.ts | Added regression tests for snippet coverage computation, source path relativization, optional node_kind serialization, and updated retrieve/impact tests to treat compact output as default with verbose override; extended stdio tests to validate new tool endpoints and schemas. |
| Feature Tool Tests | tests/unit/feature-map.test.ts, tests/unit/relevant-files.test.ts, tests/unit/risk-map.test.ts, tests/unit/implementation-checklist.test.ts | New unit tests for each runtime tool verifying correct aggregation by community/file, proper ranking, expected field presence, path handling, and integration across tools. |
| Package Metadata Tests | tests/unit/package-metadata.test.ts | Extended assertions to verify language capability matrix documentation and CI snippet coverage reporting. |
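
The changes table describes snippet_coverage as the percentage of matched nodes with a non-empty snippet, with avg_snippet_coverage as the aggregate. A minimal sketch of that computation follows. The MatchedNode shape, the function names, the treatment of whitespace-only snippets as empty, and the 0 returned for an empty match list are all assumptions made for illustration, not the code in src/infrastructure/benchmark/quality.ts.

```typescript
// Hypothetical minimal node shape; the real matched-node type has more fields.
interface MatchedNode {
  label: string
  snippet?: string | null
}

// Share of matched nodes that carry a non-empty snippet, as a percentage (0-100).
// Assumption: whitespace-only snippets count as empty, and no matches yields 0.
function snippetCoverage(nodes: readonly MatchedNode[]): number {
  if (nodes.length === 0) return 0
  const withSnippet = nodes.filter(
    (node) => typeof node.snippet === 'string' && node.snippet.trim().length > 0,
  ).length
  return (withSnippet / nodes.length) * 100
}

// Arithmetic mean of the per-question coverage values.
function averageSnippetCoverage(perQuestion: readonly number[]): number {
  if (perQuestion.length === 0) return 0
  const total = perQuestion.reduce((sum, value) => sum + value, 0)
  return total / perQuestion.length
}

const sample: MatchedNode[] = [
  { label: 'a', snippet: 'const a = 1' },
  { label: 'b', snippet: '' },
  { label: 'c' },
  { label: 'd', snippet: 'return d' },
]
const coverage = snippetCoverage(sample)
const average = averageSnippetCoverage([coverage, 100])
```

A regression gate like the one described for CI could then compare the average against a stored baseline, in the same way it presumably treats recall and MRR.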

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PR #10: Modifies src/infrastructure/benchmark/quality.ts to add snippet coverage metrics to QualityReport and QualityResult interfaces, directly overlapping with benchmark/eval enhancements in this PR.
  • PR #9: Modifies src/runtime/impact.ts and src/runtime/stdio/{definitions,tools}.ts to adjust impact result shapes and MCP tool handling, overlapping with impact/retrieve runtime and tool schema changes.
  • PR #24: Modifies src/runtime/{retrieve,impact}.ts to add framework-role handling and node_kind metadata management, sharing framework-awareness and optional field logic with this PR.

Poem

🐰 Hops of joy across the graph!

Four new tools bloom in the MCP path,
Snippet coverage shows the way,
Paths shrink to root—no more astray!
Verbose whispers when you ask,
Default compact makes each task.
A burrow of features, hopping smart! 🌿✨

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is entirely a template with no concrete summary of changes, no testing details, and no explanation of why the changes were made. | Fill in the Summary section with details about the MCP tools and eval enhancements; document testing performed; and clarify the intent behind 'workflow hardening' terminology. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'feat: ship workflow hardening' is vague and does not clearly convey the main change; the raw summary shows extensive additions of MCP tools, eval metrics, and runtime features, not just workflow changes. | Use a more specific title that reflects the primary feature additions, such as 'feat: add MCP tools and snippet coverage evaluation' or 'feat: introduce feature-map, risk-map, and implementation-checklist tools'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Review rate limit: 0/1 reviews remaining, refill in 33 minutes and 51 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/retrieve.ts (1)

990-1001: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve framework metadata in the no-node_id fallback.

compactRetrieveResultForStdio accepts RetrieveResult, and RetrieveMatchedNode.node_id is optional. When a caller passes valid nodes without IDs, this fallback rebuild drops framework and framework_role even though both fields are still present on compactResult.matched_nodes, so the helper can silently degrade the payload shape.

Suggested fix
     return {
       label: node.label,
       source_file: node.source_file,
       line_number: node.line_number,
+      framework: node.framework,
+      framework_role: node.framework_role,
       framework_boost: 0,
       file_type: node.file_type ?? compactResult.shared_file_type ?? '',
       snippet: node.snippet,
       match_score: node.match_score,
       relevance_band: node.relevance_band,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/retrieve.ts` around lines 990 - 1001, The fallback branch in
compactRetrieveResultForStdio (handling RetrieveResult where
RetrieveMatchedNode.node_id is missing) currently rebuilds nodes but omits
framework and framework_role, losing metadata that exists on
compactResult.matched_nodes; update the object construction (the returned
literal in compactRetrieveResultForStdio) to preserve node.framework and
node.framework_role when present (e.g., spread or conditional properties similar
to node.node_kind) so the payload shape matches the original matched_nodes
entries.
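
One way to keep optional metadata when rebuilding a node is conditional spreading, as the prompt above hints with its reference to node_kind. The sketch below illustrates the pattern on a deliberately reduced, hypothetical MatchedNode type; rebuildNode is an illustrative name and is not the fallback branch in compactRetrieveResultForStdio.

```typescript
// Hypothetical reduced node shape with the optional fields named in the review.
interface MatchedNode {
  label: string
  node_kind?: string
  framework?: string
  framework_role?: string
}

// Rebuilds a node while copying each optional field only when it is a
// non-empty string, so absent metadata does not appear as a key at all.
function rebuildNode(node: MatchedNode): MatchedNode {
  return {
    label: node.label,
    ...(node.node_kind ? { node_kind: node.node_kind } : {}),
    ...(node.framework ? { framework: node.framework } : {}),
    ...(node.framework_role ? { framework_role: node.framework_role } : {}),
  }
}

const rebuilt = rebuildNode({ label: 'DashboardPage', framework: 'react', framework_role: 'route' })
const serialized = JSON.stringify(rebuilt)
```

Compared with assigning node.framework directly, the spread form avoids emitting keys whose value is undefined, which keeps the rebuilt object's key set aligned with nodes that never had the field.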
🧹 Nitpick comments (4)
tests/unit/stdio-server.test.ts (1)

603-638: ⚡ Quick win

Add one regression check for the deprecated compact: false alias.

This test now covers the new verbose: true path, but it no longer exercises the transition alias that the handler still supports. A single compact: false assertion here would lock the backward-compat contract before the next cleanup pass.

Suggested addition
+      const retrieveLegacyVerbose = await Promise.resolve(handleStdioRequest(graphPath, {
+        id: 21,
+        method: 'tools/call',
+        params: {
+          name: 'retrieve',
+          arguments: {
+            question: 'which react router route renders dashboard page',
+            budget: 5000,
+            file_type: 'code',
+            compact: false,
+          },
+        },
+      }))
+
       const retrieveDefaultPayload = JSON.parse((retrieveDefault?.result as { content: Array<{ text: string }> }).content[0]!.text)
       const retrieveVerbosePayload = JSON.parse((retrieveVerbose?.result as { content: Array<{ text: string }> }).content[0]!.text)
+      const retrieveLegacyVerbosePayload = JSON.parse((retrieveLegacyVerbose?.result as { content: Array<{ text: string }> }).content[0]!.text)
       const impactDefaultPayload = JSON.parse((impactDefault?.result as { content: Array<{ text: string }> }).content[0]!.text)
       const impactVerbosePayload = JSON.parse((impactVerbose?.result as { content: Array<{ text: string }> }).content[0]!.text)

+      expect(retrieveLegacyVerbosePayload).toEqual(retrieveVerbosePayload)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/stdio-server.test.ts` around lines 603 - 638, Add a regression
call that exercises the deprecated alias by invoking handleStdioRequest with
compact: false (instead of verbose: true) so the handler still honors the old
option; specifically, add a Promise.resolve(handleStdioRequest(...)) entry next
to the existing retrieveVerbose/impactVerbose blocks using the same method name
and arguments but replacing verbose: true with compact: false (use a new unique
id, e.g., id: 21 as in the suggested addition) and assert that its result matches the corresponding
verbose:true behavior; target the same function names in the test
(handleStdioRequest, params.name 'retrieve' or 'impact') so the backward-compat
contract is locked in.
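
The backward-compat contract under discussion (compact output by default, full output for verbose: true, and compact: false still honored as a deprecated alias) can be sketched as a small resolver. The ToolArguments type and wantsVerbose name are hypothetical, and the precedence rule (an explicit boolean verbose wins over compact) is an assumption of this sketch, since the handler's actual precedence is not shown in this thread.

```typescript
// Hypothetical argument bag; values are unknown because they arrive from JSON.
interface ToolArguments {
  verbose?: unknown
  compact?: unknown // deprecated alias, per the review comment
}

// Returns true when the full (legacy) payload should be returned.
// Assumption: an explicit boolean `verbose` takes precedence over `compact`.
function wantsVerbose(args: ToolArguments): boolean {
  if (typeof args.verbose === 'boolean') return args.verbose
  return args.compact === false
}

const decisions = {
  defaults: wantsVerbose({}),
  verboseTrue: wantsVerbose({ verbose: true }),
  legacyAlias: wantsVerbose({ compact: false }),
  compactTrue: wantsVerbose({ compact: true }),
  conflicting: wantsVerbose({ verbose: false, compact: false }),
}
```

Isolating the decision in one function makes the alias easy to cover with a single regression assertion and easy to delete in a later cleanup pass.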
src/runtime/feature-map.ts (1)

55-61: 💤 Low value

Duplicate helper function across files.

pushUnique is identically implemented in both feature-map.ts and relevant-files.ts. Consider extracting to a shared utility module to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/feature-map.ts` around lines 55 - 61, The helper function
pushUnique is duplicated; extract it into a single shared utility (e.g., export
pushUnique from a new utility module) and update both locations to import and
use that shared implementation (replace the local pushUnique in feature-map.ts
and the copy in relevant-files.ts with an import), then remove the duplicate
definitions so only the exported function remains; reference pushUnique to
locate and replace the duplicates and ensure the new module exports the same
signature (values: string[], seen: Set<string>, value: string): void.
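
A shared helper with the signature quoted above might look like the following. Only the signature comes from the review; the body is an assumed implementation of the usual "append if unseen" behavior, and in a shared utility module the function would additionally be exported.

```typescript
// Appends `value` to `values` only when it has not been seen before.
// `seen` mirrors the array's contents so membership checks stay O(1).
// Assumption: this matches the duplicated helper's behavior, which is not
// shown in the review thread.
function pushUnique(values: string[], seen: Set<string>, value: string): void {
  if (seen.has(value)) return
  seen.add(value)
  values.push(value)
}

const symbols: string[] = []
const seenSymbols = new Set<string>()
for (const name of ['retrieveContext', 'featureMap', 'retrieveContext']) {
  pushUnique(symbols, seenSymbols, name)
}
```

Both feature-map.ts and relevant-files.ts could then import the one definition, so a future change to the deduplication rule happens in a single place.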
src/runtime/risk-map.ts (1)

75-83: ⚡ Quick win

Redundant retrieveContext call.

featureMap on line 77 internally calls retrieveContext, and then lines 78-83 call it again with the same parameters. Consider either:

  1. Having featureMap return the RetrieveResult it computed, or
  2. Calling retrieveContext once and passing the result to a lower-level feature map builder

This doubles the retrieval work for every riskMap call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/risk-map.ts` around lines 75 - 83, The riskMap function is
calling retrieveContext twice (once indirectly via featureMap and again
directly), doubling retrieval work; to fix, choose one approach: either update
featureMap to also return the RetrieveResult it computes (e.g., change
featureMap to return { feature, retrieveResult } or similar) and use that inside
riskMap, or instead call retrieveContext once in riskMap and pass the resulting
RetrieveResult into a modified featureMap that accepts it as an argument; update
references to featureMap and retrieveContext accordingly (functions: riskMap,
featureMap, retrieveContext) so retrieval happens only once.
src/runtime/implementation-checklist.ts (1)

71-73: 🏗️ Heavy lift

Cascading redundant retrieval calls.

implementationChecklist calls both featureMap and riskMap, but riskMap internally calls featureMap and retrieveContext again. This creates up to 5 redundant retrieveContext calls per checklist request.

Consider restructuring to:

  1. Call retrieveContext once at the top level
  2. Pass the result to specialized builders for feature map and risk analysis

This is a higher-effort change but would significantly improve performance for this tool.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/implementation-checklist.ts` around lines 71 - 73,
implementationChecklist currently calls featureMap and riskMap which cause
redundant retrieveContext calls (riskMap itself calls featureMap and
retrieveContext), so refactor to call retrieveContext once in
implementationChecklist and pass the retrieved context into the specialized
builders; update function signatures for featureMap and riskMap (or create new
internal helpers like buildFeatureMapFromContext and buildRiskMapFromContext) to
accept the pre-fetched context and the existing
KnowledgeGraph/ImplementationChecklistOptions, remove internal retrieveContext
calls inside riskMap/featureMap, and ensure implementationChecklist constructs
the context once and supplies it to featureMap/riskMap to eliminate duplicate
retrievals.
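
The restructuring proposed above, retrieving once at the top level and handing the result to context-based builders, can be sketched with toy types. Everything here is a stand-in: the RetrieveResult, FeatureMap, and RiskMap shapes are simplified, retrieveContext is a fake that only counts its invocations, and the builder bodies are placeholders. The helper names buildFeatureMapFromContext and buildRiskMapFromContext follow the prompt's suggestion.

```typescript
// Hypothetical, simplified shapes; the real types carry far more data.
interface RetrieveResult { matched: string[] }
interface FeatureMap { entryPoints: string[] }
interface RiskMap { risks: string[] }

let retrieveCalls = 0

// Stand-in for the real retrieveContext; it counts calls for illustration.
function retrieveContext(question: string): RetrieveResult {
  retrieveCalls += 1
  return { matched: question.split(/\s+/).filter((word) => word.length > 0) }
}

// Builders take the pre-fetched context instead of retrieving on their own.
function buildFeatureMapFromContext(context: RetrieveResult): FeatureMap {
  return { entryPoints: context.matched.slice(0, 2) }
}

function buildRiskMapFromContext(context: RetrieveResult, feature: FeatureMap): RiskMap {
  return { risks: context.matched.filter((name) => !feature.entryPoints.includes(name)) }
}

function implementationChecklist(question: string): { feature: FeatureMap; risk: RiskMap } {
  const context = retrieveContext(question) // the only retrieval in this flow
  const feature = buildFeatureMapFromContext(context)
  const risk = buildRiskMapFromContext(context, feature)
  return { feature, risk }
}

const checklist = implementationChecklist('auth login session')
```

The public featureMap and riskMap entry points could remain as thin wrappers that call retrieveContext and delegate to these builders, which would keep their existing signatures intact for other callers.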

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 26987a47-d000-4a11-b15f-013dfc8aa0c6

📥 Commits

Reviewing files that changed from the base of the PR and between 05e09cb and abca524.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (29)
  • .github/workflows/ci.yml
  • CHANGELOG.md
  • README.md
  • docs/language-capability-matrix.md
  • docs/proof-workflows.md
  • examples/why-graphify.md
  • package.json
  • src/cli/main.ts
  • src/infrastructure/benchmark/quality.ts
  • src/pipeline/build.ts
  • src/runtime/feature-map.ts
  • src/runtime/impact.ts
  • src/runtime/implementation-checklist.ts
  • src/runtime/relevant-files.ts
  • src/runtime/retrieve.ts
  • src/runtime/risk-map.ts
  • src/runtime/serve.ts
  • src/runtime/stdio/definitions.ts
  • src/runtime/stdio/tools.ts
  • src/shared/source-path.ts
  • tests/unit/benchmark-quality.test.ts
  • tests/unit/feature-map.test.ts
  • tests/unit/impact.test.ts
  • tests/unit/implementation-checklist.test.ts
  • tests/unit/package-metadata.test.ts
  • tests/unit/relevant-files.test.ts
  • tests/unit/retrieve.test.ts
  • tests/unit/risk-map.test.ts
  • tests/unit/stdio-server.test.ts

Comment thread src/pipeline/build.ts
Comment on lines +94 to +96
if (typeof extraction.root_path === 'string' && extraction.root_path.trim().length > 0) {
  graph.graph.root_path = extraction.root_path
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Persist a normalized root_path value.

At Line 95, the value stored can still include leading/trailing whitespace even though the guard trims for emptiness. That can break downstream path relativization in edge cases.

Suggested fix
-  if (typeof extraction.root_path === 'string' && extraction.root_path.trim().length > 0) {
-    graph.graph.root_path = extraction.root_path
-  }
+  if (typeof extraction.root_path === 'string') {
+    const rootPath = extraction.root_path.trim()
+    if (rootPath.length > 0) {
+      graph.graph.root_path = rootPath
+    }
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pipeline/build.ts` around lines 94 - 96, The code stores
extraction.root_path without normalizing it, so leading/trailing whitespace can
still be persisted; update the assignment in the block guarding
extraction.root_path to store a trimmed and normalized value (e.g., const
normalized = path.normalize(extraction.root_path.trim())) and assign
graph.graph.root_path = normalized (use Node's path.normalize or equivalent to
remove redundant separators/trailing slashes) so downstream relativization works
reliably; reference the extraction.root_path guard and the graph.graph.root_path
assignment when making the change.
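
To illustrate why a normalized root matters for relativization, here is a minimal sketch that trims and normalizes a root path and then relativizes a file against it. normalizeRootPath and relativizeWithinRoot are hypothetical names, not the project's relativizeSourceFile, and path.posix is used only so the behavior is the same on every platform. Note that path.normalize collapses repeated separators but keeps a single trailing separator.

```typescript
import * as path from 'node:path'

// Trims and normalizes a configured root; returns undefined for empty or
// non-string input so callers can skip persisting it.
function normalizeRootPath(raw: unknown): string | undefined {
  if (typeof raw !== 'string') return undefined
  const trimmed = raw.trim()
  if (trimmed.length === 0) return undefined
  return path.posix.normalize(trimmed)
}

// Hypothetical stand-in for the project's relativization helper: returns a
// root-relative path when `file` lies inside `root`, otherwise `file` unchanged.
function relativizeWithinRoot(file: string, root: string | undefined): string {
  if (root === undefined || !path.posix.isAbsolute(file)) return file
  const relative = path.posix.relative(root, file)
  const escapesRoot = relative === '..' || relative.startsWith('../')
  if (relative === '' || escapesRoot || path.posix.isAbsolute(relative)) return file
  return relative
}

const root = normalizeRootPath('  /repo//app/ ')
const inside = relativizeWithinRoot('/repo/app/src/build.ts', root)
const outside = relativizeWithinRoot('/other/place/file.ts', root)
```

With an untrimmed root such as ' /repo/app', the leading space makes the root look like a relative path, so a helper of this kind would fall back to the absolute file path instead of producing a root-relative one.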

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mohanagy mohanagy merged commit ae7eba7 into main Apr 30, 2026
6 checks passed