fix(retrieval): always return top hit + skip-and-continue on token budget overflow (B-MCP-8) by h2devx · Pull Request #33 · NetziTech/recall

h2devx · 2026-05-02T19:36:03Z

Summary

Closes #31. Fixes B-MCP-8 (medium, discovered in Phase-14 smoke against the dogfood DB): mem.recall returned total_candidates>0 but hits=0 whenever the top-ranked candidate alone exceeded max_tokens.

The old loop in RecallMemoryUseCase.rankAndSlice did if (runningTokens + tokens > max) break from the first iteration, leaving out=[]. The dogfood DB tripped this because learning rows store full content un-truncated (decisions and entities truncate at 600 chars; learnings and turns do not), so a single long learning could cost more tokens than the 4000-token default budget.

Why it surfaced now (beta.4) and not before (beta.3)

Pre-B-MCP-7 (beta.3), the embedding worker never drained the queue, so vector hits were always empty and recall fell back to BM25-only. BM25 prioritized entries with exact lexical matches — typically short titles. With B-MCP-7 fixed in beta.4, the worker now populates embeddings end-to-end, and the hybrid scorer began ranking long-form learnings higher (because their semantic vectors matched). Those long learnings then triggered the budget-overflow break.

Fix

Three changes:

Always include the top-ranked hit, even if it solo exceeds the budget. Returning zero hits when there ARE candidates surprises callers and degrades the semantic-recall promise; one slightly-oversized result is strictly more useful than no result.
Use continue (not break) on overflow for subsequent hits, so a mid-loop oversized candidate doesn't suppress smaller hits behind it in the ranking.
Bump RecallMemoryFacadeAdapter.DEFAULT_MAX_TOKENS from 4000 → 8000 for consistency with GetContextFacadeAdapter (recall returns full ranked entries with previews, so a tighter budget than the bundle made no sense).

Tests (VALUES not SHAPE — Phase-9 rule)

The pre-existing unit test ("trims the tail when cumulative token cost would exceed maxTokens") used expect(getEntries().length).toBeLessThanOrEqual(1) — that loose assertion silently passed even at length=0, exactly the bug. It's now toBe(1) plus an exact id assertion.

Tighten "trims the tail" + "respects the filters.limit slice" to assert exact lengths.
New unit test: top hit is always returned when it solo exceeds budget.
New unit test: a mid-ranking oversized hit is skipped and smaller hits behind it still surface (continue-vs-break semantics).
New integration test: reproduces the dogfood scenario by recording a 1.8 KiB learning and asserting mem.recall("GitFlow", max_tokens: 50) returns hits.length >= 1.
New integration test: the new 8000-token default lets multiple hits surface for a literal "hexagonal" query against the seeded corpus.

5+1 EXIT=0

Check	Result
`npm run typecheck`	EXIT=0
`npm run lint`	EXIT=0
`npm run lint:tests`	EXIT=0
`npm run validate:modules`	PASS — no module violations (retrieval×46 + curator×10 cross-imports as ADR-001)
`npm run build`	EXIT=0
`npm test`	2557 passing in 211 files (+4 vs 2553 baseline)

Test plan

Pre-fix unit test reproducing the bug (always includes the top-ranked hit even when it solo exceeds maxTokens (B-MCP-8)) — fails on develop, passes on this branch.
Pre-fix integration test reproducing the dogfood scenario (returns the top hit even when its preview alone exceeds maxTokens (B-MCP-8)) — fails on develop, passes on this branch.
All 2557 existing tests still pass (no regressions).
CI green (typecheck + lint + lint:tests + validate:modules + build + test:coverage + SonarQube quality gate strict).
Post-merge: cut release/0.1.2-beta.5 and validate fix end-to-end against the real dogfood DB.

🤖 Generated with Claude Code

…dget overflow (B-MCP-8) Closes #31. `mem.recall` returned `total_candidates>0` but `hits=0` whenever the top-ranked candidate alone exceeded `max_tokens`. The old loop in `RecallMemoryUseCase.rankAndSlice` did `if (runningTokens + tokens > max) break` from the first iteration, leaving `out=[]`. The dogfood DB tripped this because learning rows store the full content un-truncated (decisions and entities truncate at 600 chars; learnings and turns do not), so a single long learning could cost more tokens than the 4000-token default budget. Fix: 1. Always include the top-ranked hit, even if it solo exceeds the budget. Returning zero hits when there ARE candidates surprises callers and degrades the semantic-recall promise; one slightly oversized result is strictly more useful than no result. 2. Use `continue` (not `break`) on overflow for subsequent hits, so a mid-loop oversized candidate doesn't suppress smaller hits behind it in the ranking. 3. Bump `RecallMemoryFacadeAdapter.DEFAULT_MAX_TOKENS` from 4000 to 8000 for consistency with `GetContextFacadeAdapter` (recall returns full ranked entries with previews, so a tighter budget than the bundle made no sense). Tests (VALUES not SHAPE — Phase-9 rule): - Tighten the existing "trims the tail" unit test from `toBeLessThanOrEqual(1)` to `toBe(1)` plus exact id assertion. The loose assertion silently passed at length=0, exactly the bug. - Tighten "respects the filters.limit slice" the same way. - New unit test: top hit is always returned when it solo exceeds budget. - New unit test: a mid-ranking oversized hit is skipped and smaller hits behind it still surface (continue-vs-break semantics). - New integration test: reproduces the dogfood scenario by recording a 1.8 KiB learning and asserting `mem.recall("GitFlow", max_tokens: 50)` returns hits.length >= 1. - New integration test: the new 8000-token default lets multiple hits surface for a literal "hexagonal" query against the seeded corpus. 5+1 EXIT=0: typecheck, lint, lint:tests, validate:modules, build, test (2557 passing in 211 files, +4 vs baseline). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…>0) (#34) ## Summary Cuts `v0.1.2-beta.5` from `develop` after [PR #33](#33) closed [B-MCP-8](#31): `mem.recall` returned `total_candidates>0` but `hits=0` whenever the top-ranked candidate alone exceeded `max_tokens`. The dogfood DB tripped this because learning rows store the full content un-truncated and the hybrid scorer (now active end-to-end thanks to B-MCP-7) ranks long learnings highly. This release ships **only the B-MCP-8 fix + version/docs sync**. No new features. | Item | Value | |---|---| | Issues closed | [#31 B-MCP-8](#31) (medium) | | PRs merged into develop since beta.4 | [#33](#33) | | Tests | **2557 passing** in 211 files (+4 vs 2553 at beta.4) | | Coverage on new code | 100% (PR #33) | | Coverage overall | 96.4% | ## Highlights of the fix (PR #33) 1. **`RecallMemoryUseCase.rankAndSlice` always includes the top-ranked hit** — even if it solo exceeds `max_tokens`. Returning zero hits when there ARE candidates surprises callers and degrades the semantic-recall promise; one slightly-oversized result is strictly more useful than no result. 2. **`continue` (not `break`) on overflow** for subsequent hits — a mid-loop oversized candidate doesn't suppress smaller hits behind it in the ranking. 3. **`RecallMemoryFacadeAdapter.DEFAULT_MAX_TOKENS` bumped 4000 → 8000** for consistency with `GetContextFacadeAdapter`. ## Tests (VALUES not SHAPE — Phase-9 rule) The pre-existing unit test ("trims the tail when cumulative token cost would exceed maxTokens") used `expect(getEntries().length).toBeLessThanOrEqual(1)` — that loose assertion silently passed even at length=0, exactly the bug. Now `toBe(1)` plus an exact id assertion. - 2 pre-existing assertions tightened from `toBeLessThanOrEqual(N)` → `toBe(N)`. - 2 new unit tests in `recall-memory.use-case.test.ts` (top-hit always returned; mid-ranking overflow uses continue). - 2 new integration tests in `D-mem-recall.test.ts` (B-MCP-8 dogfood reproducer + new 8000-token default). ## Files in this release branch | Capa | Archivos | |---|---| | Code (already in develop via PR #33) | `code/src/modules/retrieval/application/use-cases/recall-memory.use-case.ts`; `code/src/composition/facades/mcp-server-facades.ts`; 2 test files | | Release tooling (this PR) | `code/package.json` + `code/sonar-project.properties` (bump 0.1.2-beta.4 → 0.1.2-beta.5) | | Release notes (this PR) | `docs/RELEASE-NOTES-v0.1.2-beta.5.md` (NEW) | | Public docs (this PR) | `README.md`, `code/README.md`, `SECURITY.md` (banner / install / version table) | | HANDOFF (this PR) | `HANDOFF.md` (§0 + new §6.20 Phase-15 + footer) | ## Validation | Check | Result | |---|---| | `npm run typecheck` | EXIT=0 | | `npm run lint` (max-warnings 0) | EXIT=0 | | `npm run lint:tests` (max-warnings 0) | EXIT=0 | | `npm run validate:modules` | PASS — no module violations | | `npm run build` (tsup) | EXIT=0 | | `npm test` | **2557 passing** in 211 files | | SonarQube quality gate | already PASSED on PR #33 (coverage new 100%, ratings A) | ## Test plan - [x] 5+1 EXIT=0 over the release branch. - [x] PR #33 merged into develop with CI green + SonarQube quality gate PASSED. - [ ] CI green on this PR. - [ ] Squash-merge to main. - [ ] Tag `v0.1.2-beta.5` annotated to merge commit + push (with `git switch --detach v0.1.2-beta.5` because of the protected-branch hook). - [ ] `gh release create v0.1.2-beta.5 --prerelease --notes-file docs/RELEASE-NOTES-v0.1.2-beta.5.md --target main`. - [ ] `cd code && npm publish --tag beta --auth-type=web` (WebAuthn passkey). - [ ] Smoke against the dogfood DB: `mem.recall` with paraphrased queries should return `hits.length >= 1` even when the top-ranked hit is a long learning. - [ ] Merge-back develop ← main via `chore/sync-develop-after-beta-5`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ated B-MCP-8 fix end-to-end (#36) ## Summary Standard phase-close docs PR (pattern: [#25](#25), [#28](#28), [#32](#32)). Updates HANDOFF.md §0 + §6.20 + footer to reflect Phase-15 fully closed end-to-end. ## What Phase-15 delivered (all already shipped) - [PR #33](#33) — B-MCP-8 fix (always-include-top-hit + continue-not-break + default max_tokens 4000→8000). - [PR #34](#34) — release v0.1.2-beta.5 to main with conflict resolution (--ours). - Tag `v0.1.2-beta.5` → `4a281f0` + GitHub pre-release. - `npm publish --tag beta` (user, WebAuthn passkey). - Smoke against dogfood DB: **2/2 PASS, 0 FAIL** — `mem.recall("GitFlow")` returns `hits=2` (was 0 in beta.4); `mem.recall("embedding worker async")` returns `hits=1` (was 0). - [PR #35](#35) — merge-back develop ← main with conflict resolution (--theirs). ## What this PR adds Only HANDOFF.md changes (33 insertions, 30 deletions): - §0: 10 rows updated to reflect post-publish reality (Fecha, Fase actual, Paquete npm, Estado del release, Memoria propia, Proximo paso, etc.). - §6.20: sub-fases 5-9 marked complete with concrete outcomes (commit SHAs, conflict resolution patterns, smoke results); new lecciones durables (#5 smoke script reusability, #6 wire output shape parsing, #7 conflict resolution symmetry); estado del repo post-Phase-15 with actual SHAs and dist-tags; Siguiente accion concreta refocused. - Footer "Ultima actualizacion" updated to reflect end-to-end closure. ## Caveat tracked for next phase `serverInfo.version` reported by the JSON-RPC handshake reads `0.1.2-beta.3` even though the installed binary is beta.5. Confirmed in smoke. Cosmetic only — needs `grep -rn "0.1.2-beta" code/src` to locate hardcoded value before promoting to `release/0.1.2` stable. ## Test plan - [x] No code changes — pure docs PR (HANDOFF.md only). - [x] Hooks pre-commit no-op (no `code/src/` changes → typecheck not triggered). - [ ] CI green on this PR (typecheck + lint + lint:tests + validate:modules + build + test:coverage + Sonar all pass over unchanged source). - [ ] Squash-merge to develop. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@beta

## Summary **First stable release of `@netzi/recall`.** Channel promotion from `0.1.2-beta.6` (commit `f3aca46`) to the `latest` dist-tag. **No code changes** — same binary bit-for-bit modulo the version string. Decision: skip the soak 24-48h post-beta.6 (justified by fresh smoke 10/10 PASS + 0 issues + cycle of 7 betas that already closed 8 bugs vinculated to real use). The cycle's bug discovery rate has been "1 bug per beta surfaced by dogfood" — staying on beta.6 longer would not surface new information. | Item | Value | |---|---| | Tag | `v0.1.2` (no suffix) | | Channel | `latest` (was `beta` for the entire `0.1.2-beta.*` cycle) | | Tests | **2560 passing** in 212 files (no change vs beta.6) | | Coverage SonarQube | overall 96.4%, ratings A/A/A | | Issues at release | **0** open | ## What 0.1.2 stable consolidates The `0.1.2-beta.*` cycle ran from 2026-04-28 (beta.0) to 2026-05-03 (beta.6). Each release surfaced exactly one bug from real-world dogfood that the previous release exposed: | Beta | Date | Bugs closed | PR | |---|---|---|---| | `0.1.2-beta.0` | 2026-04-28 | preventive cut after Phase-9 dogfood discovery | n/a | | `0.1.2-beta.3` | 2026-05-01 | **B-MCP-2** + **B-MCP-3** + **B-MCP-4** + **B-MCP-5** | [#17](#17), [#18](#18), [#19](#19), [#20](#20) | | `0.1.2-beta.4` | 2026-05-02 | **B-MCP-7** | [#27](#27) | | `0.1.2-beta.5` | 2026-05-02 | **B-MCP-8** | [#33](#33) | | `0.1.2-beta.6` | 2026-05-03 | **carryover** `serverInfo.version` | [#37](#37) | | **`0.1.2` (this)** | 2026-05-03 | channel promotion + npm deprecation of 0.1.0/0.1.1 | this PR | Plus B-MCP-1 closed in v0.1.1 (Phase-8 same-day patch). Total: 8 bugs closed end-to-end via dogfood + smoke loop. ## Files in this release branch | Capa | Archivos | |---|---| | Code | (cero cambios — channel promotion sin cambios de logica) | | Release tooling | `code/package.json` + `code/sonar-project.properties` (bump 0.1.2-beta.6 → 0.1.2) | | Release notes | `docs/RELEASE-NOTES-v0.1.2.md` (NEW, ~250 LOC consolidating the full cycle + migration guide) | | Public docs | `README.md`, `code/README.md`, `SECURITY.md` (banner "stable", install command without @beta, version table updated) | | HANDOFF | `HANDOFF.md` (§0 + new §6.21 Phase-16 + footer) | ## Validation | Check | Result | |---|---| | `npm run typecheck` | EXIT=0 | | `npm run lint` (max-warnings 0) | EXIT=0 | | `npm run lint:tests` (max-warnings 0) | EXIT=0 | | `npm run validate:modules` | PASS — no module violations | | `npm run build` (tsup) | EXIT=0 | | `npm test` | **2560 passing** in 212 files | ## Test plan - [x] 5+1 EXIT=0 over the release branch. - [x] PR #38 (release v0.1.2-beta.6) merged to main. - [x] Fresh smoke 10/10 PASS against npx-installed beta.6 in fresh workspace (this is the same binary). - [ ] CI green on this PR. - [ ] Squash-merge to main. - [ ] Tag `v0.1.2` annotated to merge commit + push (`git switch --detach v0.1.2` because of protected-branch hook). - [ ] `gh release create v0.1.2 --target main --notes-file docs/RELEASE-NOTES-v0.1.2.md` — **NO `--prerelease`**. - [ ] `cd code && npm publish --auth-type=web` — **NO `--tag beta`** → publishes directly to `latest`. - [ ] `npm deprecate @netzi/recall@0.1.0 "deprecated due to bugs B-MCP-1..8 (closed in 0.1.2). Use @netzi/recall@latest"`. - [ ] `npm deprecate @netzi/recall@0.1.1 "deprecated due to bugs B-MCP-2..8 (closed in 0.1.2). Use @netzi/recall@latest"`. - [ ] `npm view @netzi/recall dist-tags` should return `{ latest: '0.1.2', beta: '0.1.2-beta.6' }`. - [ ] `npx --yes @netzi/recall@latest --help` should execute 0.1.2 (not 0.1.1). - [ ] Merge-back develop ← main via `chore/sync-develop-after-0.1.2`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

h2devx merged commit ee74d36 into develop May 2, 2026
1 check passed

h2devx deleted the fix/b-mcp-8-recall-empty-hits branch May 2, 2026 19:41

h2devx mentioned this pull request May 2, 2026

release: v0.1.2-beta.5 — close B-MCP-8 (recall hits=0 with candidates>0) #34

Merged

9 tasks

h2devx mentioned this pull request May 2, 2026

docs(handoff): close Phase-15 — v0.1.2-beta.5 published + smoke validated B-MCP-8 fix end-to-end #36

Merged

4 tasks

h2devx mentioned this pull request May 3, 2026

release: v0.1.2 STABLE — promote from beta.6 to latest dist-tag #40

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(retrieval): always return top hit + skip-and-continue on token budget overflow (B-MCP-8)#33

fix(retrieval): always return top hit + skip-and-continue on token budget overflow (B-MCP-8)#33
h2devx merged 1 commit intodevelopfrom
fix/b-mcp-8-recall-empty-hits

h2devx commented May 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

h2devx commented May 2, 2026

Summary

Why it surfaced now (beta.4) and not before (beta.3)

Fix

Tests (VALUES not SHAPE — Phase-9 rule)

5+1 EXIT=0

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant