fix(ai): rerank candidate floor + KbSearchTask scoreThreshold forwarding#496

Closed
sroussey wants to merge 2 commits into claude/search-with-rerank-Rl4IX from claude/loving-mendel-bS9js-pr484

@sroussey
Collaborator

Summary

Two High-severity fixes against PR #484 (the KB strategy refactor).

H1 — firstStageMultiplier had a dead Math.max, lost candidate floor

packages/ai/src/kb/createStandardKbStrategy.ts computed the rerank
first-stage pool as Math.max(topK * firstStageMultiplier, topK),
which is a no-op since topK * mult >= topK whenever mult >= 1. The
inline comment promised a floor; the code didn't enforce one. For
topK=1, multiplier=5 (the default), the first-stage pool was 5
candidates — the reranker had almost nothing to choose from.

  • Added optional firstStageMinimum?: number (default 20) to
    CreateStandardKbStrategyOptions.
  • Replaced the no-op with
    Math.max(topK * firstStageMultiplier, firstStageMinimum).
  • Updated JSDoc on firstStageMultiplier and the new
    firstStageMinimum to describe how they interact.
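
The difference between the old and new sizing can be sketched as below. This is a minimal illustration, not the code in createStandardKbStrategy.ts: the helper names `oldFirstStagePoolSize` and `firstStagePoolSize` and the `FirstStageOptions` interface are hypothetical, and only the two `Math.max` expressions and the defaults (multiplier 5, minimum 20) come from the description above.

```typescript
// Hypothetical option shape; the real options live on
// CreateStandardKbStrategyOptions and may carry other fields.
interface FirstStageOptions {
  firstStageMultiplier?: number; // default 5, per the PR description
  firstStageMinimum?: number; // default 20, per the PR description
}

// Old expression: the second argument can never win when mult >= 1,
// so Math.max is a no-op and no floor is enforced.
function oldFirstStagePoolSize(topK: number, firstStageMultiplier = 5): number {
  return Math.max(topK * firstStageMultiplier, topK);
}

// New expression: a fixed minimum acts as the floor for small topK.
function firstStagePoolSize(topK: number, opts: FirstStageOptions = {}): number {
  const { firstStageMultiplier = 5, firstStageMinimum = 20 } = opts;
  return Math.max(topK * firstStageMultiplier, firstStageMinimum);
}
```

Under these assumptions, topK=1 asks the first stage for 20 candidates instead of 5, while topK=10 still asks for 50 because the multiplier term dominates.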

H2 — KbSearchTask silently dropped scoreThreshold

packages/ai/src/task/KbSearchTask.ts neither declared
scoreThreshold on its input schema nor destructured it in execute,
so any threshold passed by callers wiring the task into a workflow was
discarded before reaching kb.search. Results came back unfiltered
with no warning.

  • Added scoreThreshold?: number ({ type: "number", minimum: 0 })
    to the task's inputSchema.properties, with a description noting
    the strategy intentionally ignores the threshold in rerank mode.
  • Destructure scoreThreshold in execute and forward it:
    kb.search(query, { topK, filter, scoreThreshold }).
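
A rough sketch of the forwarding fix follows. It is not the actual KbSearchTask source: `KnowledgeBaseLike`, `KbSearchInput`, `executeKbSearch`, and the description text are stand-ins invented for illustration, and only the schema fragment `{ type: "number", minimum: 0 }` and the `kb.search(query, { topK, filter, scoreThreshold })` call shape are taken from the bullets above.

```typescript
// Assumed shapes for illustration only; the real types are defined in packages/ai.
interface KbSearchOptions {
  topK?: number;
  filter?: Record<string, unknown>;
  scoreThreshold?: number;
}

interface KnowledgeBaseLike {
  search(query: string, options: KbSearchOptions): Promise<unknown[]>;
}

interface KbSearchInput extends KbSearchOptions {
  query: string;
}

// Schema fragment as described in the PR; the description wording is a placeholder.
const scoreThresholdProperty = {
  type: "number",
  minimum: 0,
  description: "Minimum score for results; ignored by the strategy in rerank mode.",
} as const;

// Before the fix, scoreThreshold was neither declared nor destructured,
// so it never reached kb.search. Here it is passed through unchanged.
async function executeKbSearch(
  kb: KnowledgeBaseLike,
  input: KbSearchInput
): Promise<unknown[]> {
  const { query, topK, filter, scoreThreshold } = input;
  return kb.search(query, { topK, filter, scoreThreshold });
}
```

When the caller omits the threshold, the destructured value is `undefined` and is forwarded as such, which matches the "absent (undefined) when omitted" behavior the tests assert.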

Tests

  • packages/test/src/test/rag/CreateStandardKbStrategyFirstStage.test.ts
    spies on hybridSearch / similaritySearch and asserts the
    first-stage topK across {topK=1, mult=5} -> 20,
    {topK=10, mult=5} -> 50, and
    {topK=2, mult=1, firstStageMinimum=20} -> 20.
  • packages/test/src/test/rag/KbSearchTask.test.ts — spies on
    kb.search and asserts scoreThreshold is forwarded when provided
    and absent (undefined) when omitted.

Test plan

  • cd packages/test && bun test src/test/rag/CreateStandardKbStrategyFirstStage.test.ts
  • cd packages/test && bun test src/test/rag/KbSearchTask.test.ts
  • cd packages/test && bun test src/test/rag/ (regression check on existing RAG suite)
  • cd packages/ai && bun run build-types (verify the new option/property typecheck cleanly)

Generated by Claude Code

@sroussey sroussey force-pushed the claude/search-with-rerank-Rl4IX branch from d170868 to 432d33d Compare May 14, 2026 05:05
@sroussey sroussey force-pushed the claude/search-with-rerank-Rl4IX branch 2 times, most recently from 429b16b to c1b92d5 Compare May 15, 2026 00:28
sroussey added 2 commits May 15, 2026 01:01
The rerank-mode first-stage size was computed as
`Math.max(topK * firstStageMultiplier, topK)`, which is a no-op since
`topK * mult >= topK` whenever `mult >= 1`. The intended floor was a
fixed minimum (commented as such), so very small `topK` (e.g. topK=1,
mult=5 -> 5 candidates) silently collapsed the reranker's input down
to a handful of candidates with no real choice to make.

Add a new `firstStageMinimum` option to `CreateStandardKbStrategyOptions`
(default 20) and use it as the actual floor:
`Math.max(topK * firstStageMultiplier, firstStageMinimum)`. Update
JSDoc on `firstStageMultiplier` and the new `firstStageMinimum` to
describe how they interact.

Adds a vitest suite that spies on `hybridSearch` / `similaritySearch`
and asserts the first-stage `topK` value forwarded to them across
representative inputs.

The `KbSearchTask` input schema accepted no `scoreThreshold`, and `execute` did not
destructure or forward one either, so any threshold supplied via the
task surface was silently dropped before reaching `kb.search`. Callers
wiring a threshold through a workflow would get unfiltered results
with no warning.

Add `scoreThreshold` to the input schema (number, minimum 0) and
forward it in the call to `kb.search(query, { topK, filter,
scoreThreshold })`. Note that the standard strategy still ignores
the threshold in rerank mode by design (cross-encoder logits aren't
on the same scale as cosine/RRF scores) — that contract is documented
in `createStandardKbStrategy` and the new schema description.

Adds a vitest suite that spies on `kb.search` and asserts the
threshold is forwarded when provided and absent (undefined) when
omitted.
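
The rerank-mode contract mentioned in the commit message above could look roughly like this. It is a sketch under stated assumptions, not the strategy's real code: the `SearchMode` names, the `ScoredChunk` shape, the `applyScoreThreshold` helper, and the `>=` comparison are all hypothetical. Only the rule itself (the threshold is ignored in rerank mode because cross-encoder logits are on a different scale than cosine/RRF scores) comes from the source.

```typescript
// Hypothetical mode names and result shape, for illustration only.
type SearchMode = "similarity" | "hybrid" | "rerank";

interface ScoredChunk {
  id: string;
  score: number;
}

function applyScoreThreshold(
  results: ScoredChunk[],
  mode: SearchMode,
  scoreThreshold?: number
): ScoredChunk[] {
  // Reranker scores are raw logits, so a cosine/RRF-style cutoff
  // would be meaningless here; pass the results through untouched.
  if (mode === "rerank" || scoreThreshold === undefined) {
    return results;
  }
  // Assumption: results at or above the threshold are kept.
  return results.filter((r) => r.score >= scoreThreshold);
}
```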
@sroussey sroussey force-pushed the claude/loving-mendel-bS9js-pr484 branch from 505100d to d0b2f18 Compare May 15, 2026 01:03
Collaborator Author

Cherry-picked 60cea62 and 505100d into claude/search-with-rerank-Rl4IX (PR #484) as 8ba251d and b984523. Closing as integrated.


Generated by Claude Code

@sroussey sroussey closed this May 15, 2026
sroussey pushed a commit that referenced this pull request May 15, 2026
CreateStandardKbStrategyFirstStage.test was cherry-picked from PR #496,
where it was authored against the pre-capabilities API. Since then
PR #494 (capabilities-squash) landed in main and changed:
  - registerRunFn(provider, taskType, fn) → registerRunFn(provider, { serves, runFn })
  - run-fn return value → emit({ type: "finish", data })
  - ModelRecord.tasks → ModelRecord.capabilities

Update the test to the post-capabilities shape, mirroring the pattern
already used by KnowledgeBaseStandardStrategy.test. Also drop the
`as never` casts on the spies — those defeated the type system and
caused mockResolvedValue to fail typecheck.

https://claude.ai/code/session_01Ya54WFZhpDFzAqRh1qG8Ex