fix(search): finalize token budget advisory payload #98
Greptile Summary
This PR prepares the v2.1.0 release by adding a
Confidence Score: 4/5
Safe to merge after fixing the compact-mode warning text and resolving the CHANGELOG/version inconsistency.
One P1 logic issue (mode-unaware warning message gives incorrect guidance when compact mode exceeds 4K tokens) and documentation/process issues (broken comparison URL, orphaned CHANGELOG content, --no-verify bypass) warrant review before merging. Benchmark artifacts and test coverage are otherwise solid.
src/tools/search-codebase.ts (renderSearchPayloadText warning text) and CHANGELOG.md / package.json (version consistency)
| Filename | Overview |
|---|---|
| src/tools/search-codebase.ts | Adds tokenEstimate advisory and 4K warning to renderSearchPayloadText; warning message is mode-unaware and tells compact-mode callers to "prefer compact mode" |
| tests/search-compact-mode.test.ts | New tests covering tokenEstimate, warning threshold, compact/full mode behavior; missing a test that verifies the warning message is appropriate in compact mode |
| CHANGELOG.md | 2.1.0 block inserted under "## Unreleased" leaving the prior Unreleased content orphaned; comparison link references v2.0.0 tag that does not exist |
| package.json | Version bumped from 1.9.0 to 2.1.0, skipping 2.0.0 entirely with no corresponding git tag |
| docs/benchmark.md | Refreshed benchmark stats to match current gate output; honestly documents pending_evidence state and near-empty comparator outputs |
| results/comparator-evidence.json | codebase-memory-mcp and raw Claude Code lanes now have task-level data (status: ok) but 0 usefulness; gate correctly blocks claims as pending_evidence |
| results/gate-evaluation.json | Updated gate output with higher token estimates reflecting richer v2.1.0 payload; claimAllowed remains false |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[search_codebase called] --> B{mode?}
B -- compact default --> C[slice results to 6]
B -- full --> D[use all results]
C --> E[renderSearchPayloadText]
D --> E
E --> G[JSON.stringify payload without tokenEstimate]
G --> H{platform win32?}
H -- yes --> I[replace newlines with CRLF for byte count]
H -- no --> J[use as-is]
I --> K[tokenEstimate = ceil length / 4]
J --> K
K --> L{tokenEstimate > 4000?}
L -- yes --> M["add warning text<br/>BUG: always says Prefer compact mode<br/>even when already in compact mode"]
L -- no --> N[no warning]
M --> O[re-serialize with tokenEstimate + warning in searchQuality]
N --> O
O --> P[return final payload]
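The estimate-then-warn steps in the flowchart can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `SearchPayload`, `estimateTokens`, and `renderWithEstimate` are hypothetical, the platform is passed in as a parameter rather than read from the process, and two-space pretty-printing is assumed (the flowchart does not say how the JSON is indented).

```typescript
// Hypothetical shapes; only the steps themselves come from the flowchart.
interface SearchQuality {
  tokenEstimate?: number;
  warning?: string;
}

interface SearchPayload {
  results: unknown[];
  searchQuality: SearchQuality;
}

const TOKEN_WARNING_THRESHOLD = 4000;

function estimateTokens(serialized: string, platform: string): number {
  // On win32, count every newline as CRLF before measuring.
  const text =
    platform === "win32" ? serialized.replace(/\r?\n/g, "\r\n") : serialized;
  // Rough heuristic from the flowchart: about four characters per token.
  return Math.ceil(text.length / 4);
}

function renderWithEstimate(payload: SearchPayload, platform: string): string {
  // First pass: serialize without the advisory fields to measure the payload.
  const draft = JSON.stringify(payload, null, 2);
  const tokenEstimate = estimateTokens(draft, platform);
  const warning =
    tokenEstimate > TOKEN_WARNING_THRESHOLD
      ? `Large search payload: estimated ${tokenEstimate} tokens.`
      : undefined;
  // Second pass: re-serialize with the advisory attached to searchQuality.
  const searchQuality = {
    ...payload.searchQuality,
    tokenEstimate,
    ...(warning ? { warning } : {}),
  };
  return JSON.stringify({ ...payload, searchQuality }, null, 2);
}
```

Because the estimate is taken before the advisory fields are added, the reported number slightly undercounts the final text; that matches the "payload without tokenEstimate" step in the flowchart.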
Comments Outside Diff (1)
CHANGELOG.md, lines 3-38: CHANGELOG structure leaves orphaned content and a broken comparison link
Two related issues:
- Orphaned "## Unreleased" content: The 2.1.0 block was inserted between ## Unreleased and the previous Unreleased content (the mcp multi-project routing features). Those features now appear as a second ### Features / ### Bug Fixes / ### Documentation block visually merged into the 2.1.0 section, with no owning version header. If they belong to an earlier release (likely 2.0.0), they need their own ## [2.0.0] entry.
- Broken comparison URL: compare/v2.0.0...v2.1.0 will 404 because the highest existing git tag is v1.9.0; v2.0.0 was never tagged. The URL should either point to v1.9.0...v2.1.0 or a v2.0.0 tag needs to be created for the prior release batch first.
Reviews (1): Last reviewed commit: "chore(release): prepare v2.1.0"
    const warning =
      tokenEstimate > 4000
        ? `Large search payload: estimated ${tokenEstimate} tokens. Prefer compact mode or tighter filters before pasting into an agent.`
        : undefined;
Warning message is incorrect in compact mode
renderSearchPayloadText is called from both compact and full mode paths, but the warning text unconditionally says "Prefer compact mode or tighter filters". If a compact-mode response ever crosses the 4K-token threshold (possible with verbose summaries), the advisory instructs the caller to switch to compact mode when they are already using it, an impossible action that would cause unnecessary retry loops. The fix is to condition the guidance on the current mode.
-   const warning =
-     tokenEstimate > 4000
-       ? `Large search payload: estimated ${tokenEstimate} tokens. Prefer compact mode or tighter filters before pasting into an agent.`
-       : undefined;
+   const warning =
+     tokenEstimate > 4000
+       ? isCompact
+         ? `Large search payload: estimated ${tokenEstimate} tokens. Try tighter filters (e.g. layer=, language=) to reduce payload size.`
+         : `Large search payload: estimated ${tokenEstimate} tokens. Prefer compact mode or tighter filters before pasting into an agent.`
+       : undefined;
Note: isCompact is defined at line 1091, after this function declaration. Because renderSearchPayloadText is a function declaration (not a const), it is hoisted and closes over isCompact by reference, so reading it at call time is safe as long as the call happens after isCompact has been initialized.
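The hoisting behaviour described in the note can be shown with a small, self-contained sketch. The names `handler` and `render` are made up for illustration and are not the PR's functions; only the ordering (function declaration first, `const` later, call last) mirrors the note.

```typescript
function handler(mode: "compact" | "full"): string {
  // render is a function declaration, so it is hoisted and can be
  // referenced here even though its body appears further down.
  const renderLater = () => render();

  function render(): string {
    // Reads the live isCompact binding when called, not when declared.
    return isCompact ? "filters only" : "compact or filters";
  }

  // Declared after render, but initialized before render is ever called.
  const isCompact = mode === "compact";

  return renderLater();
}
```

The ordering of the call is what matters: under ES2015+ semantics, invoking `render()` before the `const isCompact` line has run would throw a ReferenceError, because the binding is still in its temporal dead zone.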
Good catch. The real bug also survived the final returned payload path in registerHandlers(), because the JSON is reserialized after project / index injection. The follow-up patch centralizes this logic in a shared helper, makes the warning mode-aware in the final payload, and adds a compact oversized regression that asserts the filter-only guidance.
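A shared finalization helper along the lines described in this reply might look like the sketch below. It is an assumption-laden illustration, not the contents of the follow-up patch: `finalizeSearchPayload`, `buildWarning`, and the payload shape are hypothetical, and the estimate reuses the length / 4 heuristic from the flowchart. The two warning strings are the ones from the suggested change above.

```typescript
type SearchMode = "compact" | "full";

// Hypothetical shapes; extra fields (project, index, results, ...) pass through.
interface SearchQuality {
  tokenEstimate?: number;
  warning?: string;
  [key: string]: unknown;
}

interface FinalPayload {
  searchQuality?: SearchQuality;
  [key: string]: unknown;
}

function buildWarning(tokenEstimate: number, mode: SearchMode): string | undefined {
  if (tokenEstimate <= 4000) return undefined;
  const prefix = `Large search payload: estimated ${tokenEstimate} tokens.`;
  // Compact callers cannot "switch to compact", so they get filter-only guidance.
  return mode === "compact"
    ? `${prefix} Try tighter filters (e.g. layer=, language=) to reduce payload size.`
    : `${prefix} Prefer compact mode or tighter filters before pasting into an agent.`;
}

function finalizeSearchPayload(payload: FinalPayload, mode: SearchMode): string {
  const current: SearchQuality = payload.searchQuality ?? {};
  // Drop any advisory computed on an earlier serialization so it is not counted.
  const { tokenEstimate: _oldEstimate, warning: _oldWarning, ...quality } = current;
  const base = { ...payload, searchQuality: quality };
  const tokenEstimate = Math.ceil(JSON.stringify(base).length / 4);
  const warning = buildWarning(tokenEstimate, mode);
  return JSON.stringify({
    ...base,
    searchQuality: { ...quality, tokenEstimate, ...(warning ? { warning } : {}) },
  });
}
```

In this sketch the helper would be called once, after project / index injection, so the estimate and warning describe the payload that is actually returned rather than an intermediate one.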
    {
      "name": "codebase-context",
-     "version": "1.9.0",
+     "version": "2.1.0",
Version jump skips 2.0.0 with no corresponding tag
package.json goes from 1.9.0 → 2.1.0 directly; no v2.0.0 tag exists in the repository (tags top out at v1.9.0). The CHANGELOG comparison link v2.0.0...v2.1.0 depends on that tag existing. Either:
- Tag the previous "Unreleased" batch as v2.0.0 before this PR lands, or
- Bump package.json to 2.0.0 for this PR and move the 2.1.0 content to the next release
The changelog structure and compare-link issues were valid, and the follow-up patch fixes those. We are intentionally keeping this release at 2.1.0 rather than retroactively forcing a v2.0.0 cut; the compare link is corrected to v1.9.0...v2.1.0, and the previously orphaned bullets are folded into the owned 2.1.0 section.
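One way to keep a compare link from pointing at a tag that was never cut is to derive its base from the tags that actually exist. The sketch below is illustrative only: `parseTag`, `compareVersions`, and `compareLink` are hypothetical helpers, tags are assumed to follow a plain vMAJOR.MINOR.PATCH form, and the returned path uses the same relative compare/A...B shape quoted in the review.

```typescript
type Version = [number, number, number];

// Parse "v1.9.0" into [1, 9, 0]; returns undefined for any other shape.
function parseTag(tag: string): Version | undefined {
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(tag);
  return match
    ? [Number(match[1]), Number(match[2]), Number(match[3])]
    : undefined;
}

// Negative if a < b, positive if a > b, zero if equal.
function compareVersions(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Picks the highest existing tag below the release being prepared.
function compareLink(existingTags: string[], releaseTag: string): string | undefined {
  const release = parseTag(releaseTag);
  if (!release) return undefined;
  let base: string | undefined;
  let baseVersion: Version | undefined;
  for (const tag of existingTags) {
    const version = parseTag(tag);
    if (!version || compareVersions(version, release) >= 0) continue;
    if (!baseVersion || compareVersions(version, baseVersion) > 0) {
      base = tag;
      baseVersion = version;
    }
  }
  return base ? `compare/${base}...${releaseTag}` : undefined;
}
```

With tags topping out at v1.9.0, `compareLink(["v1.8.0", "v1.9.0"], "v2.1.0")` yields the corrected `compare/v1.9.0...v2.1.0` path described in the reply.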
Summary
- search_codebase payload finalization in a shared helper so searchQuality.tokenEstimate and warning text are computed from the final returned payload
PR Scope
This PR is intentionally trimmed to the minimal advisory-fix merge set:
- src/tools/search-payload-budget.ts
- src/tools/search-codebase.ts
- src/index.ts
- tests/search-compact-mode.test.ts
- CHANGELOG.md
- results/comparator-evidence.json
- results/gate-evaluation.json
Deferred to the explicit release checkpoint, not this PR:
- package.json version bump to 2.1.0
- docs/benchmark.md public benchmark narrative refresh
- npm publish
Verification
- pnpm run type-check
- pnpm test -- tests/search-compact-mode.test.ts
- pnpm test -- tests/benchmark-comparators.test.ts
Deferred Follow-up
The broader repo-wide red suite is tracked separately from this narrow PR. Current failing tests on this host include:
- tests/ast-chunker-integration.test.ts
- tests/index-migration-atomic-swap.test.ts
- tests/indexer-exclude-patterns.test.ts
- tests/impact-2hop.test.ts
- tests/incremental-indexing.test.ts
- tests/multi-project-routing.test.ts
- tests/relationship-sidecar.test.ts
- tests/search-decision-card.test.ts
- tests/search-edit-preflight-lite.test.ts
- tests/search-safe-01.test.ts
- tests/search-snippets.test.ts
- tests/zombie-guard.test.ts
Residual Risk
- pending_evidence; claimAllowed remains false
- --no-verify because the repo-wide pre-push suite is currently red outside this PR's scope
- .planning/ remains repo-local and is not part of this PR