Fix: ENG-1785 — apply composite ranker to manual recall (parity with non-manual) by hungtranphamminh · Pull Request #185 · MystenLabs/MemWal

hungtranphamminh · 2026-05-22T06:34:57Z

Summary

Why

The three recall paths returned different orderings for the same query:

/api/recall and /api/ask apply the CompositeRanker (recency + importance signals, opt-in via scoring_weights) — they run search_similar → hydrate → rank → return.
/api/recall/manual returned raw pgvector cosine order — it exited right after search_similar, never applying the ranker. It also validated scoring_weights and then silently ignored them.

Before the composite ranker existed, this was invisible: all paths were pure cosine order, so they matched. Once the ranker shipped, any caller passing scoring_weights got reordered results from /api/recall and /api/ask but unranked results from /api/recall/manual — the same query, different order. (ENG-1785.)

What

Manual recall now applies the same ranker as the other two paths, while keeping its lightweight contract — it ranks without hydrating (no Walrus fetch, no SEAL decrypt) and still returns (blob_id, distance, …) for the client to hydrate itself.

Solution

Theory. The composite ranker only needs three fields — distance, created_at, importance — and all three live on the SearchHit returned by the vector search. Decryption only produces the memory text, which the ranker never reads. So manual recall can apply the identical ranking on the SearchHit data alone, before (and without) any hydration.
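
To make the "three fields are enough" point concrete, here is a minimal sketch of a composite score computed from a search hit alone. The struct layout, the weight names (`similarity`, `recency`, `importance`), the half-life decay and the linear formula are all assumptions for illustration, not the server's actual `CompositeRanker`; the only thing taken from this PR is that the score reads `distance`, `created_at` and `importance` and never the decrypted text.

```rust
/// Stand-in for the server's `SearchHit`, reduced to the fields this PR names.
/// Field types are assumptions made for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct SearchHit {
    pub blob_id: String,
    pub distance: f64,   // cosine distance from pgvector; smaller means closer
    pub created_at: i64, // unix seconds (assumed representation)
    pub importance: f64, // assumed to lie in [0, 1]
}

/// Hypothetical weight names; the real `scoring_weights` schema may differ.
#[derive(Clone, Copy, Debug)]
pub struct ScoringWeights {
    pub similarity: f64,
    pub recency: f64,
    pub importance: f64,
}

/// Illustrative composite score (higher is better). The formula is an
/// assumption; the point is that it needs no hydrated/decrypted text.
pub fn composite_score(
    hit: &SearchHit,
    w: ScoringWeights,
    now: i64,
    half_life_secs: f64,
) -> f64 {
    let similarity = 1.0 - hit.distance;
    let age_secs = (now - hit.created_at).max(0) as f64;
    // Exponential decay: a memory exactly one half-life old contributes 0.5.
    let recency = 0.5_f64.powf(age_secs / half_life_secs);
    w.similarity * similarity + w.recency * recency + w.importance * hit.importance
}
```

With similarity-only weights this reduces to cosine order; shifting weight onto `importance` or `recency` can reorder hits without any Walrus fetch or SEAL decrypt.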

Reuse, don't re-implement. Rather than write a second scoring function over SearchHit (which would risk drifting from the real ranker — the exact bug class this fixes), the new rank_search_hits helper maps each SearchHit into a throwaway HydratedMemory carrying only the ranked fields (empty text), calls the same Ranker::rank, then reassembles the original SearchHits in the ranked order. One ordering implementation, shared by all three paths.

Index-based reassembly (not blob_id-keyed). blob_id is not unique — vector_entries has no UNIQUE constraint on it, search_similar does not SELECT DISTINCT, and restore can insert multiple rows with the same blob_id. Reassembling by blob_id would collapse duplicates, silently dropping hits and reordering them — re-introducing the very divergence this fixes (the hydrating paths keep duplicates 1:1). Instead each hit's input index is carried through the ranker's opaque blob_id slot and used to reorder, so no hit is ever dropped and the result count always equals the search-hit count.
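
The reuse and index-based reassembly described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `recall.rs`: the struct fields, the `Ranker` trait signature and the `ImportanceFirst` toy ranker are hypothetical stand-ins. What it does mirror from the PR is the shape of the helper: build throwaway `HydratedMemory` values with empty text, put each hit's input index in the `blob_id` slot, call the shared `rank`, then rebuild the original hits by index.

```rust
/// Assumed shape of a search hit (field types are illustrative).
#[derive(Clone, Debug, PartialEq)]
pub struct SearchHit {
    pub blob_id: String,
    pub distance: f64,
    pub created_at: i64,
    pub importance: f64,
}

/// Assumed shape of what the hydrating paths hand to the ranker.
pub struct HydratedMemory {
    pub blob_id: String,
    pub text: String,
    pub distance: f64,
    pub created_at: i64,
    pub importance: f64,
}

/// Hypothetical ranker interface; the real `Ranker::rank` signature may differ.
pub trait Ranker {
    fn rank(&self, memories: Vec<HydratedMemory>) -> Vec<HydratedMemory>;
}

/// Rank hits with the shared ranker, without hydrating them.
pub fn rank_search_hits<R: Ranker>(ranker: &R, hits: Vec<SearchHit>) -> Vec<SearchHit> {
    // Throwaway proxies: only the ranked fields, empty text, and the input
    // index stored in the blob_id slot (the ranker treats it as opaque).
    let proxies: Vec<HydratedMemory> = hits
        .iter()
        .enumerate()
        .map(|(index, hit)| HydratedMemory {
            blob_id: index.to_string(),
            text: String::new(),
            distance: hit.distance,
            created_at: hit.created_at,
            importance: hit.importance,
        })
        .collect();

    // Reassemble by index, so hits sharing a blob_id are never collapsed.
    ranker
        .rank(proxies)
        .into_iter()
        .filter_map(|memory| memory.blob_id.parse::<usize>().ok())
        .filter_map(|index| hits.get(index).cloned())
        .collect()
}

/// Toy ranker for demonstration only: highest importance first (stable sort).
pub struct ImportanceFirst;

impl Ranker for ImportanceFirst {
    fn rank(&self, mut memories: Vec<HydratedMemory>) -> Vec<HydratedMemory> {
        memories.sort_by(|a, b| b.importance.total_cmp(&a.importance));
        memories
    }
}
```

Given three hits with blob ids `a`, `a`, `b`, this sketch returns all three in ranked order. A map keyed by `blob_id` could hold only one of the two `a` entries, which is the failure mode the index-based approach avoids.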

Backward compatible. At default weights the ranker short-circuits, so the pgvector cosine order is returned unchanged — existing callers are unaffected. The response wire shape is unchanged (Vec<SearchHit>); only the order changes when scoring_weights are set. recall_manual now validates scoring_weights up front (400 on malformed) exactly like recall, and the weights actually apply.
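
As a rough illustration of the "validate up front, no-op at defaults" behaviour, the sketch below wraps an arbitrary ranking function. Everything named here is hypothetical: the weight fields, the default values, the validation rule (finite and non-negative) and the `order_hits` wrapper are assumptions, and in the actual change the short-circuit lives inside the ranker rather than in a wrapper. An `Err` stands in for the handler's 400 response.

```rust
/// Hypothetical weight names and defaults; the real schema may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ScoringWeights {
    pub similarity: f64,
    pub recency: f64,
    pub importance: f64,
}

impl Default for ScoringWeights {
    fn default() -> Self {
        // Assumed default: pure similarity, i.e. plain cosine order.
        Self { similarity: 1.0, recency: 0.0, importance: 0.0 }
    }
}

/// Assumed validation rule; an Err here would map to HTTP 400 in a handler.
pub fn validate_weights(w: &ScoringWeights) -> Result<(), String> {
    let fields = [
        ("similarity", w.similarity),
        ("recency", w.recency),
        ("importance", w.importance),
    ];
    for (name, value) in fields {
        if !value.is_finite() || value < 0.0 {
            return Err(format!(
                "scoring_weights.{name} must be a finite, non-negative number"
            ));
        }
    }
    Ok(())
}

/// Validate first, return the input order untouched at default weights,
/// and only call `rank` when the caller opted in to non-default weights.
pub fn order_hits<T>(
    hits: Vec<T>,
    weights: Option<ScoringWeights>,
    rank: impl FnOnce(Vec<T>, ScoringWeights) -> Vec<T>,
) -> Result<Vec<T>, String> {
    let w = weights.unwrap_or_default();
    validate_weights(&w)?;
    if w == ScoringWeights::default() {
        return Ok(hits); // short-circuit: cosine order preserved
    }
    Ok(rank(hits, w))
}
```

Callers that omit `scoring_weights` get back exactly what the vector search returned, which is why existing clients see no change.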

Technical change

| Area | Change |
| --- | --- |
| `services/server/src/types.rs` | `RecallManualRequest` gains optional `scoring_weights`; `SearchHit` derives `Clone` |
| `services/server/src/routes/recall.rs` | New `rank_search_hits` helper; `recall_manual` validates weights + applies the shared ranker; `total` computed from the ranked result count |

No schema change, no migration, no new dependency, no change to the retrieval / storage / decrypt paths.

Types of Changes

Testing

I have tested this code locally
I have added/updated unit tests
I have added/updated integration tests
I have tested in multiple browsers (if applicable)

Full server suite passes (236/236); clippy clean on the changed files. New recall tests cover:

manual ≡ non-manual ordering parity under importance-heavy, recency-heavy, and combined (all-three-signals) weights
default weights preserve cosine order (no-op / backward compatibility)
duplicate-blob_id hits are not dropped (default + active weights)
an 8-item non-trivial permutation round-trips exactly (exercises the index reassembly)
empty hits, single hit, field preservation

A follow-up end-to-end smoke test (live /api/recall vs /api/recall/manual with matching weights) is recommended before merge to confirm parity at the handler level; the ordering logic itself is fully unit-covered. The retrieval-quality benchmarks (LOCOMO / LongMemEval) are not applicable: they exercise only /api/recall at default weights, where this change is a no-op.

Checklist

My code follows the code style of this project
My change requires a change to the documentation
I have updated the documentation accordingly
I have added tests to cover my changes
All new and existing tests passed

Related Issues

Related to Feat: MEM-57 — pre-extraction dedup context (Mem0 v3 pattern) + extract.v4 #178, Feat: MEM-59 — extract.v5 granularity-aware dedup #183 (the cycle-13 composite ranker / extraction work whose ranker this brings to the manual path)

Additional Notes

Reviewed via a multi-agent deep review (ordering parity / code+test integrity / security). The review caught a duplicate-blob_id correctness issue in an earlier blob_id-keyed approach; the index-based reassembly above is the fix, with a regression test pinning it.
The MEM-57 pre-extraction dedup retrieval also calls search_similar but is intentionally not ranked — it's an internal dedup-context read for the extractor prompt, never returned to a caller as recall results.

…non-manual)

`/api/recall/manual` returned raw pgvector cosine order while `/api/recall` and `/api/ask` applied the CompositeRanker (recency + importance, opt-in via scoring_weights), so the same query + weights gave different orderings across endpoints. Manual recall also validated scoring_weights and then ignored them.

Manual recall now applies the same ranker, keeping its lightweight contract: it ranks on the SearchHit fields directly (distance / created_at / importance, all present pre-decrypt) and still returns blob ids + distances WITHOUT a Walrus fetch or SEAL decrypt. All three recall paths now share one ordering logic and agree for the same query + weights.

- New `rank_search_hits` reuses the exact `Ranker::rank` the hydrating paths use (no re-implementation of scoring on SearchHit; that would risk drift).
- Reorder is index-based, not blob_id-keyed: blob_id is not unique (search_similar has no DISTINCT; restore can produce duplicate-blob_id rows), so a blob_id-keyed round-trip would collapse duplicates and drop hits.
- recall_manual validates scoring_weights up front (400 on malformed) like recall.
- Default weights short-circuit → cosine order unchanged → existing callers unaffected. Wire shape unchanged (Vec<SearchHit>); only order changes.

Tests: 236/236. New recall tests cover manual≡non-manual parity (importance / recency / combined weights), default no-op, duplicate-blob_id no-drop, an 8-item permutation round-trip, and empty/single/field-preservation cases.

Closes ENG-1785.

hungtranphamminh · 2026-05-22T06:40:07Z

✅ E2E smoke passed — verified at the handler level against a live benchmark-mode server (this branch's build).

Ingested 5 memories with varied content (so importance buckets differ), embedded a query with the same model the server uses, then called /api/recall (by query text) and /api/recall/manual (by query vector) with matching weights:

default weights: both endpoints returned the same 8 memories in the same (cosine) order ✅
importance_heavy weights: both endpoints returned the same reordered order ✅ — and that order differs from the default-weights order, confirming the ranker actually reorders and manual now follows it (not a coincidental match).

This reproduces the reporter's exact scenario (same query + weights → previously manual gave cosine order while non-manual reordered) and confirms parity end-to-end, not just at the unit level. Counts matched (8 = 8) on both runs, so no hit dropped.

ducnmm approved these changes May 22, 2026

hungtranphamminh merged commit ab43f09 into dev May 22, 2026
8 checks passed

hungtranphamminh deleted the feature/eng-1785-manual-recall-ranking branch May 22, 2026 13:35
