
feat: bundle — v0.16 finishers + 1c-ii.a framework propagation (#79, #80, #81, #72)#106

Merged
mohanagy merged 5 commits into main from feat/bundle-v0.16-and-1c-ii-frameworks
May 10, 2026

Conversation


@mohanagy mohanagy commented May 10, 2026

⚠️ This is a bundle PR by explicit user request. It violates the project's standard "one slice per PR" rule. The user signed off on the trade-off explicitly. Each commit on this branch is self-contained, has its own test coverage, and is reviewable in isolation. If this is too much for one review, the cleanest unbundling is to revert this PR and cherry-pick each of the 4 commits onto its own branch.

Closes part of #79, #80, #81. Advances #72 (slice 1c-ii.a). Built on top of #105 (post-merge to main).

Commits in this PR (each its own logical unit)

1. `adf95ce` — feat(#72): SPI projector → propagate framework_role (slice 1c-ii.a)

When SPI symbols carry a `framework_role` (set by slice 3b's NestJS detector), the projector now surfaces `framework`, `framework_role`, and `node_kind` on the projected ExtractionNode. Maps SPI roles back onto the legacy extractor's shape so downstream consumers can route framework-aware UX without re-classifying. Full byte-equivalence on demo-repo (slice 1c-ii.b through .e — porting Express / Next.js / React Router / Redux extractor logic) remains future work.

  • New: `frameworkForRole`, `nodeKindForRole` helpers in projector.ts
  • Tests: 4 new — nest_module / nest_controller / nest_route propagation, no-tagging-of-plain-classes regression
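The role-to-metadata mapping described above can be sketched roughly as follows. This is illustrative only: the real helpers live in projector.ts, and the exact role strings and node_kind vocabulary shown here are assumptions.

```typescript
/** Map an SPI framework_role to a framework label; all Nest roles map to "nestjs". */
function frameworkForRole(role: string): string | undefined {
  return role.startsWith("nest_") ? "nestjs" : undefined;
}

/** Map an SPI framework_role onto the legacy extractor's node_kind vocabulary (assumed values). */
function nodeKindForRole(role: string): string | undefined {
  const kinds: Record<string, string> = {
    nest_module: "module",
    nest_controller: "controller",
    nest_route: "route",
  };
  return kinds[role];
}

interface ExtractionNode {
  label: string;
  framework?: string;
  framework_role?: string;
  node_kind?: string;
}

// Only tag the projected node when the SPI symbol actually carries a role,
// so plain classes stay untagged (the regression the new tests pin).
function enrich(node: ExtractionNode, role?: string): ExtractionNode {
  if (!role) return node;
  const framework = frameworkForRole(role);
  const node_kind = nodeKindForRole(role);
  return {
    ...node,
    ...(framework ? { framework } : {}),
    framework_role: role,
    ...(node_kind ? { node_kind } : {}),
  };
}
```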

2. `20ad12f` — feat(#79): PR-impact coverage scoring

Adds `coverage_score` (numeric, [0, 1]) and `uncovered_hotspots` (`ChangedNode[]`) to `PrImpactResult` and `CompactPrImpactResult`. Coverage score = ratio of high-impact changed nodes whose label appears in the review bundle. Convention: 1.0 when there are no high-impact nodes (no coverage gap to score). The compact result preserves both fields verbatim so MCP and CLI clients can audit pack coverage without round-tripping through the verbose payload.

  • Tests: 4 new in pr-impact-coverage.test.ts; existing pr-impact.test.ts fixtures updated to include the new fields.
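The coverage convention above can be sketched with simplified shapes (illustrative only, not the actual pr-impact.ts code):

```typescript
interface ChangedNode {
  label: string;
}

// coverage_score = covered high-impact labels / total high-impact labels,
// defaulting to 1 when there is nothing high-impact to cover.
function scoreCoverage(
  highImpactLabels: string[],
  bundleLabels: Set<string>,
  changedNodes: ChangedNode[],
): { coverage_score: number; uncovered_hotspots: ChangedNode[] } {
  if (highImpactLabels.length === 0) {
    return { coverage_score: 1, uncovered_hotspots: [] }; // no coverage gap to score
  }
  const covered = highImpactLabels.filter((l) => bundleLabels.has(l)).length;
  const highImpactSet = new Set(highImpactLabels);
  const uncovered_hotspots = changedNodes.filter(
    (n) => highImpactSet.has(n.label) && !bundleLabels.has(n.label),
  );
  return { coverage_score: covered / highImpactLabels.length, uncovered_hotspots };
}
```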

3. `939c71e` — feat(#80): document + regression-test cache-aware prompt layout

Most of #80's mechanics already lived in `buildContextPrompt` (deterministic sort_key ordering, separate stable_prefix vs dynamic_suffix, `stable_prefix_tokens` / `reused_context_tokens` / `effective_prompt_tokens` metrics, session-aware delta payload). What was missing:

  • The convention for sort_key prefixes that maximises Anthropic's automatic prompt-cache reuse (`01_workspace_`, `10_communities_`, `20_evidence_`, `90_anchor_`). Documented in JSDoc.
  • Regression tests asserting the prefix is byte-stable across two consecutive calls with the same anchor, deterministic across input order, and never embeds an ISO timestamp (cache-invalidation regression guard).
  • Tests: 5 new in context-prompt-cache-stability.test.ts.
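The band convention can be illustrated with a toy renderer. Only the prefix strings (`01_workspace_`, `10_communities_`, `20_evidence_`, `90_anchor_`) come from the text above; the section shape and rendering are assumptions:

```typescript
interface StableSection {
  sort_key: string;
  text: string;
}

// Sorting by sort_key keeps low-band sections (workspace, communities), which
// change least often, at the front of the prompt, where byte-stable prefixes
// let Anthropic's automatic prompt caching reuse them across calls.
function renderStablePrefix(sections: StableSection[]): string {
  return [...sections]
    .sort((a, b) => a.sort_key.localeCompare(b.sort_key)) // deterministic, input-order independent
    .map((s) => s.text)
    .join("\n");
}
```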

4. `273fa64` — feat(#81): standalone delta-pack helper

Adds `computeDeltaContextPack(pack, previouslySentNodeIds)` — pure side-effect-free filter that returns the input pack with overlapping nodes + their relationships removed, plus an explicit `referenced_ids` list of dropped handles and a `bytes_saved` measurement. Plus `collectPackNodeIds(pack)` for callers to record what the agent received after each call.

  • New file: src/runtime/context-pack-delta.ts (130 LOC + JSDoc)
  • Tests: 7 new in context-pack-delta.test.ts.
  • Out-of-scope here: wiring into the stdio context_pack tool's session state. The session-state plumbing (per-session handle store, response shape, reset flow) is substantial and bundling it here would push the diff past safe review size. The helper has full coverage so a one-line follow-up wires it in.
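The helper's contract, sketched with simplified pack shapes (field names like `from`/`to` and the exact return shape are illustrative; the real module is src/runtime/context-pack-delta.ts):

```typescript
interface PackNode { node_id: string; }
interface PackRel { from: string; to: string; }
interface Pack { nodes: PackNode[]; relationships: PackRel[]; }

function computeDeltaContextPack(pack: Pack, previouslySent: Set<string>) {
  // Handles the receiver already holds become references instead of payload.
  const referenced_ids = pack.nodes
    .map((n) => n.node_id)
    .filter((id) => previouslySent.has(id));
  const nodes = pack.nodes.filter((n) => !previouslySent.has(n.node_id));
  // Drop a relationship only when BOTH endpoints were already sent; a mixed
  // edge (one new, one referenced endpoint) still carries novel linkage.
  const relationships = pack.relationships.filter(
    (r) => !(previouslySent.has(r.from) && previouslySent.has(r.to)),
  );
  const delta: Pack = { nodes, relationships };
  const bytes_saved = JSON.stringify(pack).length - JSON.stringify(delta).length;
  return { pack: delta, referenced_ids, bytes_saved };
}

/** Collect shipped node ids so the caller can build its session-state record. */
function collectPackNodeIds(pack: Pack): string[] {
  return pack.nodes.map((n) => n.node_id).filter((id) => typeof id === "string");
}
```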

Test plan

  • `npm run typecheck` — clean
  • `npm run build` — clean
  • `npm run test:run` — 93 files / 1601 tests pass (20 new across the bundle: 4 + 4 + 5 + 7)
  • CI must pass on Ubuntu/macOS/Windows matrix before merge.

What's intentionally deferred

Per the bundle's "minimum viable per issue" scope, each issue is only partially closed by this PR; the remaining work for each is tracked on the referenced issues.

Refs #79, #80, #81, #72.

Summary by CodeRabbit

  • New Features

    • Propagates framework-specific metadata to improve NestJS code detection.
    • Adds PR coverage scoring and uncovered-hotspots reporting.
    • Adds context-pack delta deduplication to reduce multi-turn payloads.
  • Documentation

    • Added guidance on cache-aware prompt layout to improve prompt-cache reuse.
  • Tests

    • Added unit tests validating deduplication, prompt-layout stability, PR-coverage behavior, and framework-role propagation.

Review Change Stack

mohanagy added 4 commits May 10, 2026 23:58
… (slice 1c-ii.a)

When SPI symbols carry a framework_role (set by slice 3b's NestJS detector), the projector now surfaces `framework`, `framework_role`, and `node_kind` on the projected ExtractionNode. Maps SPI roles back onto the legacy extractor's shape so downstream consumers can route framework-aware UX without re-classifying. Full byte-equivalence on demo-repo's framework-specific synthetic nodes (e.g. NestJS route nodes minted as a separate symbol with route_path) remains slice 1c-iii.
Adds `coverage_score` (numeric, [0, 1]) and `uncovered_hotspots` (ChangedNode[]) to PrImpactResult and CompactPrImpactResult. Coverage score = ratio of high-impact changed nodes whose label appears in the review bundle. uncovered_hotspots is the corresponding list of ChangedNode entries that didn't make it. By convention coverage_score = 1.0 when there are no high-impact nodes (no coverage gap to score). The compact result preserves both fields verbatim so MCP and CLI clients can audit pack coverage without round-tripping through the verbose payload.
Most of #80's mechanics already lived in buildContextPrompt: deterministic sort_key ordering for stable sections, separate stable_prefix vs dynamic_suffix rendering, stable_prefix_tokens / reused_context_tokens / effective_prompt_tokens metrics, and a session-aware delta payload. What was missing was (a) the convention for sort_key prefixes that maximises Anthropic's automatic prompt-cache reuse, and (b) regression tests asserting the prefix is byte-stable across follow-ups.

Adds JSDoc on ContextPromptStableSection documenting the recommended sort_key bands (01_workspace_*, 10_communities_*, 20_evidence_*, 90_anchor_*). Adds tests/unit/context-prompt-cache-stability.test.ts pinning: byte-identical stable_prefix on two consecutive calls with the same anchor, deterministic ordering regardless of input order, anchor-shifted prefixes still keep the workspace+communities portion shared, follow-up calls with a prior session_state report non-zero reused_context_tokens, and the stable prefix never embeds an ISO timestamp (cache-invalidation regression guard).
Adds computeDeltaContextPack(pack, previouslySentNodeIds) — a pure helper that filters a CompiledContextPack down to only the nodes the agent has not yet received in the current session, plus an explicit referenced_ids list of the dropped handles. Drops relationships whose endpoints were filtered out, on the basis that the receiver already has the source/target node and can reconstruct the edge from session state if needed. Returns bytes_saved so callers can verify the second-call payload trends down across a multi-turn session.

Plus collectPackNodeIds(pack) for callers to build their session-state record after each call.

This is the standalone, side-effect-free building block of #81. Wiring into the stdio context_pack tool's session state (the per-session handle store, response shape, and reset flow) is intentionally a follow-up — that surgery touches enough of the stdio session infrastructure that bundling it here would balloon the diff past safe-review size for one PR. The helper has full coverage so the consuming follow-up PR can wire it in with a one-line change.
@coderabbitai

coderabbitai Bot commented May 10, 2026

📝 Walkthrough

Walkthrough

This PR propagates SPI framework_role into projected ExtractionNodes (with NestJS role→kind mapping), adds PR-impact coverage metrics (coverage_score and uncovered_hotspots) recalculated after compaction, and introduces a context-pack delta helper plus prompt-cache stability documentation and tests.

Changes

Framework Role Propagation to Extracted Nodes

  • src/pipeline/spi/projector.ts (node enrichment and role mapping): Symbol nodes are created via a temporary node variable and enriched with framework, framework_role, and node_kind from SPI metadata; helpers frameworkForRole and nodeKindForRole encode the NestJS role mappings.
  • tests/unit/spi-projector.test.ts: New suite validates that NestJS decorators (@Module, @Controller, @Get) propagate to projected nodes with the correct framework/framework_role/node_kind; non-Nest classes remain untagged.

PR Impact Coverage Scoring

  • src/runtime/pr-impact.ts (data shapes): PrImpactResult and CompactPrImpactResult gain coverage_score: number and uncovered_hotspots: ChangedNode[] fields.
  • src/runtime/pr-impact.ts (coverage calculation): analyzePrImpact computes coverageScore as the ratio of high-impact node labels present in review_bundle.nodes (defaulting to 1 when no high-impact nodes exist); builds uncoveredHotspots from changed nodes whose labels are high-impact but absent from the bundle; the early-return path initializes both fields.
  • tests/unit/pr-impact-coverage.test.ts, tests/unit/pr-impact.test.ts: Coverage-focused suite validates that compactPrImpactResult recomputes coverage_score and produces correct uncovered_hotspots for partial/full/zero coverage scenarios; existing fixtures extended with the new fields.

Context and Delta Optimization

  • src/infrastructure/context-prompt.ts (prompt-cache strategy documentation): Block comment documents cache-aware stable-section ordering and sort_key prefix recommendations for prompt-cache reuse.
  • src/runtime/context-pack-delta.ts (delta pack data types and computation): New module exports the DeltaContextPackResult interface and a computeDeltaContextPack function that deduplicates previously sent node ids by filtering overlapping nodes and relationships with known endpoints; returns a delta flag, the filtered pack, referenced ids, and a bytes_saved estimate.
  • src/runtime/context-pack-delta.ts (node-id collection): collectPackNodeIds extracts all valid string node_id values from context-pack nodes for tracking across turns.
  • tests/unit/context-pack-delta.test.ts, tests/unit/context-prompt-cache-stability.test.ts: Delta suite validates deduplication (no-overlap, node/relationship filtering, passthrough, bytes_saved); cache-stability suite verifies byte-identical stable prefixes, deterministic sort_key ordering, anchor-dependent updates, and context reuse.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • mohanagy/graphify-ts#98: Adds SPI pass that tags symbols with NestJS framework_role, which this PR propagates into projected nodes.
  • mohanagy/graphify-ts#100: Related projector changes that establish projectSpiToExtraction flow consumed by these projector enrichments.
  • mohanagy/graphify-ts#8: Prior work introducing node_kind that this PR maps framework roles into.

Poem

🐰 A rabbit hops through metadata streams,
Framework roles stitched into node dreams,
Coverage lights hotspots, tidy and clear,
Delta packs whisper "send less" in the ear,
Cache-stable prompts make reuse near!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title accurately summarizes the main change: a bundle PR delivering feature completions across four issues (#79, #80, #81, #72), each with concrete deliverables (framework propagation, coverage scoring, cache-stability docs/tests, and the delta-pack helper).
  • Description check: ✅ Passed. The description covers all required template sections: a clear summary of changes, a detailed commit breakdown, a test plan with verification steps (typecheck/build/test results), intentional deferrals per issue, and explicit acknowledgment of the bundle trade-offs.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/runtime/context-pack-delta.ts`:
- Around line 76-87: The filter currently drops any relationship if either
endpoint is in referencedIds; update the predicate used for keptRelationships
(iterating pack.relationships) to only exclude relationships when both endpoints
are referenced. Specifically, build referencedSet from referencedIds, derive
fromId/toId as done now, and change the condition to return false only when
fromId and toId are both non-null and referencedSet.has(fromId) &&
referencedSet.has(toId); otherwise keep the relationship so mixed
(new↔referenced) edges are preserved.

In `@src/runtime/pr-impact.ts`:
- Around line 1117-1133: Recompute coverageScore and uncoveredHotspots after the
reviewBundle has been compacted: build reviewBundleLabels from the
post-compaction reviewBundle.nodes, create highImpactLabelSet from
highImpactNodes, then recalculate totalHighImpact, coveredHighImpact,
coverageScore (default 1 when totalHighImpact === 0) and uncoveredHotspots by
filtering changedNodes.map(n => n.serialized) against highImpactLabelSet and the
recomputed reviewBundleLabels; update the existing
coverage_score/uncovered_hotspots assignments to use these recomputed values so
compaction-dropped nodes aren’t falsely counted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b43d6abe-30e9-4085-b101-2b775464ed43

📥 Commits

Reviewing files that changed from the base of the PR and between 25cbfa1 and 273fa64.

📒 Files selected for processing (9)
  • src/infrastructure/context-prompt.ts
  • src/pipeline/spi/projector.ts
  • src/runtime/context-pack-delta.ts
  • src/runtime/pr-impact.ts
  • tests/unit/context-pack-delta.test.ts
  • tests/unit/context-prompt-cache-stability.test.ts
  • tests/unit/pr-impact-coverage.test.ts
  • tests/unit/pr-impact.test.ts
  • tests/unit/spi-projector.test.ts

Comment thread src/runtime/context-pack-delta.ts Outdated
Comment thread src/runtime/pr-impact.ts
Two valid catches from the bundle's review:

1. #81 delta-helper edge filter was too aggressive. The original logic dropped a relationship if EITHER endpoint was previously sent (referenced). That's wrong — a mixed edge (one new endpoint, one referenced) carries novel information about how the new node connects to the known one and must be kept. The fix: only drop relationships when BOTH endpoints are already in the receiver's session. The corresponding test was rewritten to assert the new semantic explicitly: 4 input edges (both-new, mixed×2, both-referenced) → 3 kept, 1 dropped.

2. #79 compact coverage_score and uncovered_hotspots inherited the verbose result's values verbatim. If compactReviewBundle drops a high-impact node during compaction, the compact result would silently claim coverage that no longer existed in the compact bundle. The fix: recompute coverage in compactPrImpactResult against the post-compaction review bundle. Builds the compactedReviewBundle once, derives compactedReviewLabels from its nodes, then recomputes compactCoverageScore + compactUncoveredHotspots from the high-impact set. New regression test constructs a 12-hotspot fixture and asserts compact.coverage_score matches the labels actually present in compact.review_bundle.nodes — i.e., honest reporting matches the compact payload.
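Fix (1)'s corrected predicate, sketched against the 4-edge scenario the rewritten test asserts (identifiers are illustrative):

```typescript
// Drop an edge only when BOTH endpoints are already in the receiver's session;
// mixed edges (one new, one referenced endpoint) carry novel linkage and are kept.
function keepEdge(from: string, to: string, referenced: Set<string>): boolean {
  return !(referenced.has(from) && referenced.has(to));
}

const referenced = new Set(["r1", "r2"]);
const edges: Array<[string, string]> = [
  ["n1", "n2"], // both new        -> kept
  ["n1", "r1"], // mixed           -> kept
  ["r2", "n2"], // mixed           -> kept
  ["r1", "r2"], // both referenced -> dropped
];
const kept = edges.filter(([from, to]) => keepEdge(from, to, referenced));
```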

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/pr-impact.ts (1)

1189-1220: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align compact coverage metrics with the compact high-impact list.

Line 1189 computes coverage using the full result.risk_summary.high_impact_nodes, but Line 1219 truncates risk_summary.high_impact_nodes. This can produce a coverage_score/uncovered_hotspots set that cannot be reconciled with the compact payload’s own high-impact list.

💡 Proposed fix
-  const compactHighImpactSet = new Set(result.risk_summary.high_impact_nodes)
+  const compactHighImpactNodes = result.risk_summary.high_impact_nodes.slice(0, MAX_COMPACT_HIGH_IMPACT_NODES)
+  const compactHighImpactSet = new Set(compactHighImpactNodes)
@@
     risk_summary: {
       ...result.risk_summary,
-      high_impact_nodes: result.risk_summary.high_impact_nodes.slice(0, MAX_COMPACT_HIGH_IMPACT_NODES),
+      high_impact_nodes: compactHighImpactNodes,
     },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/pr-impact.ts` around lines 1189 - 1220, The coverage and
uncovered-hotspots calculations should use the same truncated high-impact list
that's put into the compact payload: replace uses of
result.risk_summary.high_impact_nodes when building
compactHighImpactSet/compactTotalHighImpact/compactCoveredHighImpact/compactCoverageScore/compactUncoveredHotspots
with the truncated array result.risk_summary.high_impact_nodes.slice(0,
MAX_COMPACT_HIGH_IMPACT_NODES); ensure you still check membership against
compactedReviewLabels and use the same MAX_COMPACT_HIGH_IMPACT_NODES symbol so
the computed compactCoverageScore and compactUncoveredHotspots align with the
risk_summary.high_impact_nodes included in the returned object.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e494d3de-8a05-4644-974e-c179ce3bf4b8

📥 Commits

Reviewing files that changed from the base of the PR and between 273fa64 and 3a8fc67.

📒 Files selected for processing (4)
  • src/runtime/context-pack-delta.ts
  • src/runtime/pr-impact.ts
  • tests/unit/context-pack-delta.test.ts
  • tests/unit/pr-impact-coverage.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/runtime/context-pack-delta.ts
  • tests/unit/context-pack-delta.test.ts

@mohanagy mohanagy merged commit 0f88b91 into main May 10, 2026
7 checks passed
mohanagy added a commit that referenced this pull request May 11, 2026
…lector (#74)

Extends PR #121 to cover the rest of v0.15 in one slice as requested:

## #81 Delta-only context packs via stdio

- New context_pack parameter: delta_session_id. When set, the response ships only nodes the session hasn't received yet, plus referenced_ids[] for dropped nodes and bytes_saved.

- New MCP tool context_pack_session_reset to clear a delta session and force the next call to ship the full pack.

- 3 new StdioToolHelpers methods: getContextPackNodeIds, recordContextPackNodeIds, clearContextPackNodeIds. Backed by a per-MCP-process Map<sessionId, Set<nodeId>> in StdioSessionState with the same LRU bound as the prompt-session store (256 sessions).

- Reuses the existing computeDeltaContextPack helper (already had the both-endpoints relationship filter fix from PR #106) and collectPackNodeIds for recording shipped ids.

- Diagnostics on delta responses skip the budget_underutilized rule since a delta pack is small-by-design after dedup.
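A rough sketch of the per-process session store those three helpers imply, assuming a Map-based LRU with the stated 256-session bound (the class and method names here are hypothetical analogues, not the actual StdioToolHelpers API):

```typescript
const MAX_SESSIONS = 256; // same LRU bound as the prompt-session store

class DeltaSessionStore {
  private sessions = new Map<string, Set<string>>();

  /** Record node ids shipped to a session (recordContextPackNodeIds analogue). */
  record(sessionId: string, nodeIds: string[]): void {
    const seen = this.sessions.get(sessionId) ?? new Set<string>();
    this.sessions.delete(sessionId); // re-insert to refresh the LRU position
    for (const id of nodeIds) seen.add(id);
    this.sessions.set(sessionId, seen);
    if (this.sessions.size > MAX_SESSIONS) {
      // Map iterates in insertion order, so the first key is least recently used.
      const oldest = this.sessions.keys().next().value;
      if (oldest !== undefined) this.sessions.delete(oldest);
    }
  }

  /** Ids the session already holds (getContextPackNodeIds analogue). */
  get(sessionId: string): Set<string> {
    return this.sessions.get(sessionId) ?? new Set();
  }

  /** context_pack_session_reset behaviour (clearContextPackNodeIds analogue). */
  clear(sessionId: string): void {
    this.sessions.delete(sessionId);
  }
}
```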

## #74 Value-per-token budget selector

- New module src/runtime/value-per-token.ts exporting selectByValuePerToken(candidates, options).

- Greedy density heuristic: sort by score / token_cost descending, pick the prefix that fits within budget. Tie-break: score desc, cost asc, id asc (deterministic).

- Optional pinZeroCost (default true): zero-cost candidates are always included; set false to exclude them entirely.

- Skips items whose individual cost exceeds the budget (cannot fit by definition) and items with non-finite scores or costs.

- Returns selected payload list, total_cost, remaining_budget, and per-candidate ranking[] with rank/density/included for diagnostics.

- Pure helper for now — adopting it inside retrieve.ts's candidate-selection pipeline is a follow-up once we have a benchmark to A/B against the current selector.
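The greedy density heuristic can be sketched as follows. This is a simplified sketch: option handling (pinZeroCost, the per-candidate ranking[] diagnostics) is reduced, and the shipped src/runtime/value-per-token.ts may differ in detail.

```typescript
interface Candidate {
  id: string;
  score: number;
  token_cost: number;
}

function selectByValuePerToken(candidates: Candidate[], budget: number): Candidate[] {
  // Skip non-finite inputs and items that cannot fit in the budget at all.
  const usable = candidates.filter(
    (c) => Number.isFinite(c.score) && Number.isFinite(c.token_cost) && c.token_cost <= budget,
  );
  // Zero-cost candidates get infinite density, so they sort first and are
  // always included (the pinZeroCost=true default).
  const density = (c: Candidate) => (c.token_cost === 0 ? Infinity : c.score / c.token_cost);
  // Density desc, then score desc, cost asc, id asc: the deterministic tie-break.
  usable.sort(
    (a, b) =>
      density(b) - density(a) ||
      b.score - a.score ||
      a.token_cost - b.token_cost ||
      a.id.localeCompare(b.id),
  );
  const selected: Candidate[] = [];
  let spent = 0;
  for (const c of usable) {
    if (spent + c.token_cost <= budget) {
      selected.push(c);
      spent += c.token_cost;
    }
  }
  return selected;
}
```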

## Tests

- 11 new value-per-token tests covering density preference, zero-cost gating, budget overflow skip, non-finite filtering, ranking shape, tie-break determinism, negative budget clamp, empty input.

- MCP tool count increased to 26 (full profile). mcp-schema-budget test stays under the 12,000-byte ceiling after tightening the new tool descriptions.

- Verified: typecheck + build clean, 1760/1760 pass.

## Not in this PR (deferred to v0.16 slice train)

- #76 multi-resolution context representations — needs a new representation layer, structurally invasive.

- #79 PR-impact coverage calibration — needs real PRs to calibrate against, not a code-only delivery.

- #80 cache-aware prompt layout measurement — purely measurement work; sort_key bands already shipped.
mohanagy added a commit that referenced this pull request May 11, 2026
… + value-per-token (#74) (#121)

* feat(#78): context-pack quality diagnostics + bad-run detection (v0.15 slice 1)

Adds a deterministic structural quality scorer for compiled context-packs. Returns a 0-1 quality_score, a list of triggered warnings with kind/severity/message/detail, and the raw signals used to compute the score (node_count, claim_count, snippet_coverage, avg_match_score, budget_utilization, etc.).

Rules implemented (each weighted into the score):

- missing_required_evidence (error, weight 2) — pack lacks a required evidence class

- missing_required_semantic (warn)        — pack lacks a required semantic category

- zero_claims (warn)                      — claims array is empty

- undersized_retrieval (warn)             — fewer than 3 nodes returned

- budget_underutilized (info)             — token_count < 25% of budget on a >= 500-token request

- missing_snippets (warn)                 — > 50% of nodes lack a source snippet

- low_avg_match_score (warn)              — mean match_score < 0.30 (when scores exist)

- orphan_nodes (warn)                     — > 1 nodes but zero relationships

- no_graph_signals (info)                  — both god_nodes and bridge_nodes empty
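One hedged way to fold those weighted rules into a 0-1 score; the penalty normaliser below is hypothetical and not from the PR, only the weight-per-rule idea comes from the list above:

```typescript
type Severity = "error" | "warn" | "info";

interface Warning {
  kind: string;
  severity: Severity;
  weight: number; // e.g. missing_required_evidence carries weight 2 per the list above
}

function qualityScore(warnings: Warning[]): number {
  // Each triggered rule subtracts its weight; clamp so a pile of errors bottoms out at 0.
  const penalty = warnings.reduce((sum, w) => sum + w.weight, 0);
  const MAX_PENALTY = 10; // hypothetical normaliser
  return Math.max(0, Math.min(1, 1 - penalty / MAX_PENALTY));
}
```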

Surface points:

- New contracts file src/contracts/context-pack-diagnostics.ts with ContextPackDiagnosticKind / Severity / Warning / Signals / Diagnostics types.

- New runtime helper src/runtime/context-pack-diagnostics.ts exporting computeContextPackDiagnostics(pack, options?). Pure, deterministic, no I/O — fully unit-testable against synthetic CompiledContextPack inputs.

- contextPackFromRetrieveResult is now exported from retrieve.ts (was private) so the stdio handler can construct the full pack shape from a RetrieveResult and feed it to the scorer.

- stdio context_pack tool response now includes a diagnostics field on the explain branch. Impact and review branches use different pack taxonomies and will land in a follow-up.

Verified: typecheck + build clean, 1749/1749 tests pass (+16 new). No public API surface changes outside the additive diagnostics field.

* feat(v0.15): delta-only context-pack stdio (#81) + value-per-token selector (#74)


* fix(#78): low_avg_match_score must fire on avg=0 (CodeRabbit)

CodeRabbit caught that the predicate excluded the worst-possible case (every node scoring exactly 0). The '> 0' clause meant a pack with three zero-scored nodes silently passed the rule, but that is precisely the kind of retrieval the warning was supposed to catch.

Fix: drop the '> 0' clause. The NaN guard above already covers the 'no scored nodes' case (where avg_match_score is NaN). Added a test pinning the avg=0 case.
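The predicate change, in sketch form (the 0.30 threshold comes from the rule list above; the surrounding shape is illustrative):

```typescript
// Before: `Number.isFinite(avg) && avg > 0 && avg < 0.3` silently passed avg === 0,
// the worst possible retrieval. After: drop the `> 0` clause; the finiteness
// guard still covers the "no scored nodes" case, where avg is NaN.
function lowAvgMatchScore(avg: number): boolean {
  return Number.isFinite(avg) && avg < 0.3;
}
```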
