fix(ai): rerank candidate floor + KbSearchTask scoreThreshold forwarding #496
Closed
sroussey wants to merge 2 commits into
The rerank-mode first-stage size was computed as
`Math.max(topK * firstStageMultiplier, topK)`, which is a no-op since
`topK * mult >= topK` whenever `mult >= 1`. The intended floor was a
fixed minimum (commented as such), so very small `topK` (e.g. topK=1,
mult=5 -> 5 candidates) silently collapsed the reranker's input down to
a handful of candidates with no real choice to make.
Add a new `firstStageMinimum` option to `CreateStandardKbStrategyOptions`
(default 20) and use it as the actual floor:
`Math.max(topK * firstStageMultiplier, firstStageMinimum)`. Update JSDoc
on `firstStageMultiplier` and the new `firstStageMinimum` to describe
how they interact.
Adds a vitest suite that spies on `hybridSearch` / `similaritySearch`
and asserts the first-stage `topK` value forwarded to them across
representative inputs.
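A minimal sketch of the arithmetic described above, not the actual code in `createStandardKbStrategy.ts`. The function names `firstStageSizeBefore` / `firstStageSizeAfter` and the `FirstStageOptions` interface are hypothetical; only the two `Math.max` expressions and the defaults (multiplier 5, minimum 20) are taken from this PR.

```typescript
// Hypothetical option bag; the real type is CreateStandardKbStrategyOptions.
interface FirstStageOptions {
  firstStageMultiplier?: number; // default 5, per the PR description
  firstStageMinimum?: number; // default 20, added by this PR
}

// Old computation: the "floor" is topK itself, so for any multiplier >= 1
// Math.max always returns topK * multiplier and the floor never applies.
function firstStageSizeBefore(topK: number, opts: FirstStageOptions = {}): number {
  const multiplier = opts.firstStageMultiplier ?? 5;
  return Math.max(topK * multiplier, topK);
}

// New computation: a fixed minimum keeps the candidate pool from
// collapsing when topK is very small.
function firstStageSizeAfter(topK: number, opts: FirstStageOptions = {}): number {
  const multiplier = opts.firstStageMultiplier ?? 5;
  const minimum = opts.firstStageMinimum ?? 20;
  return Math.max(topK * multiplier, minimum);
}

console.log(firstStageSizeBefore(1)); // 5
console.log(firstStageSizeAfter(1)); // 20
console.log(firstStageSizeAfter(10)); // 50
```

With the defaults, the floor only takes effect for `topK` below 4; larger values are still governed by the multiplier.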
The input schema accepted no `scoreThreshold`, and `execute` did not
destructure or forward one either, so any threshold supplied via the
task surface was silently dropped before reaching `kb.search`. Callers
wiring a threshold through a workflow would get unfiltered results
with no warning.
Add `scoreThreshold` to the input schema (number, minimum 0) and
forward it in the call to `kb.search(query, { topK, filter,
scoreThreshold })`. Note that the standard strategy still ignores
the threshold in rerank mode by design (cross-encoder logits aren't
on the same scale as cosine/RRF scores) — that contract is documented
in `createStandardKbStrategy` and the new schema description.
Adds a vitest suite that spies on `kb.search` and asserts the
threshold is forwarded when provided and absent (undefined) when
omitted.
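The forwarding fix can be sketched roughly as follows. This is an illustration under assumptions, not the real `KbSearchTask`: the `KbLike`, `SearchOptions`, and `KbSearchInput` types, the `executeSearch` function, and the hand-rolled recording stub (standing in for a vitest spy) are all hypothetical. Only the schema fragment `{ type: "number", minimum: 0 }` and the call shape `kb.search(query, { topK, filter, scoreThreshold })` come from this PR.

```typescript
// Hypothetical shapes standing in for the real knowledge-base types.
interface SearchOptions {
  topK?: number;
  filter?: Record<string, unknown>;
  scoreThreshold?: number;
}
interface KbLike {
  search(query: string, options: SearchOptions): Promise<string[]>;
}
interface KbSearchInput extends SearchOptions {
  query: string;
}

// Schema fragment as described in the PR; the surrounding schema is omitted.
const scoreThresholdSchema = { type: "number", minimum: 0 } as const;

// Destructure scoreThreshold alongside the other inputs and pass it through.
// Before the fix, the equivalent call omitted scoreThreshold entirely.
function executeSearch(kb: KbLike, input: KbSearchInput): Promise<string[]> {
  const { query, topK, filter, scoreThreshold } = input;
  return kb.search(query, { topK, filter, scoreThreshold });
}

// Minimal recording stub: remembers the options of every search call.
function createRecordingKb(): { kb: KbLike; calls: SearchOptions[] } {
  const calls: SearchOptions[] = [];
  const kb: KbLike = {
    async search(_query, options) {
      calls.push(options);
      return [];
    },
  };
  return { kb, calls };
}

const demo = createRecordingKb();
void executeSearch(demo.kb, { query: "example", topK: 3, scoreThreshold: 0.5 });
console.log(demo.calls[0]?.scoreThreshold); // 0.5
```

When the caller omits `scoreThreshold`, the forwarded value is `undefined`, which matches the "absent when omitted" assertion described above.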
Comment (Collaborator, Author):
Cherry-picked (Generated by Claude Code)
sroussey pushed a commit that referenced this pull request on May 15, 2026
CreateStandardKbStrategyFirstStage.test was cherry-picked from PR #496,
where it was authored against the pre-capabilities API. Since then
PR #494 (capabilities-squash) landed in main and changed:
- registerRunFn(provider, taskType, fn) → registerRunFn(provider, { serves, runFn })
- run-fn return value → emit({ type: "finish", data })
- ModelRecord.tasks → ModelRecord.capabilities
Update the test to the post-capabilities shape, mirroring the pattern
already used by KnowledgeBaseStandardStrategy.test. Also drop the
`as never` casts on the spies; those defeated the type system and
caused mockResolvedValue to fail typecheck.
https://claude.ai/code/session_01Ya54WFZhpDFzAqRh1qG8Ex
Summary
Two High-severity fixes against PR #484 (the KB strategy refactor).
H1: `firstStageMultiplier` had a dead `Math.max`, lost candidate floor
`packages/ai/src/kb/createStandardKbStrategy.ts` computed the rerank
first-stage pool as `Math.max(topK * firstStageMultiplier, topK)`,
which is a no-op since `topK * mult >= topK` whenever `mult >= 1`. The
inline comment promised a floor; the code didn't enforce one. For
`topK=1, multiplier=5` (the default), the first-stage pool was 5
candidates, so the reranker had almost nothing to choose from.
- Add `firstStageMinimum?: number` (default `20`) to `CreateStandardKbStrategyOptions`.
- Use it as the actual floor: `Math.max(topK * firstStageMultiplier, firstStageMinimum)`.
- Update JSDoc on `firstStageMultiplier` and the new `firstStageMinimum` to describe how they interact.
H2: `KbSearchTask` silently dropped `scoreThreshold`
`packages/ai/src/task/KbSearchTask.ts` neither declared `scoreThreshold`
on its input schema nor destructured it in `execute`, so any threshold
passed by callers wiring the task into a workflow was discarded before
reaching `kb.search`. Results came back unfiltered with no warning.
- Add `scoreThreshold?: number` (`{ type: "number", minimum: 0 }`) to the task's `inputSchema.properties`, with a description noting the strategy intentionally ignores the threshold in rerank mode.
- Destructure `scoreThreshold` in `execute` and forward it: `kb.search(query, { topK, filter, scoreThreshold })`.
Tests
- `packages/test/src/test/rag/CreateStandardKbStrategyFirstStage.test.ts`: spies on `hybridSearch` / `similaritySearch` and asserts the first-stage `topK` across `{topK=1, mult=5}` -> 20, `{topK=10, mult=5}` -> 50, and `{topK=2, mult=1, firstStageMinimum=20}` -> 20.
- `packages/test/src/test/rag/KbSearchTask.test.ts`: spies on `kb.search` and asserts `scoreThreshold` is forwarded when provided and absent (undefined) when omitted.
Test plan
- `cd packages/test && bun test src/test/rag/CreateStandardKbStrategyFirstStage.test.ts`
- `cd packages/test && bun test src/test/rag/KbSearchTask.test.ts`
- `cd packages/test && bun test src/test/rag/` (regression check on existing RAG suite)
- `cd packages/ai && bun run build-types` (verify the new option/property typecheck cleanly)
Generated by Claude Code