MMR reranking for diverse KNN results via mmr_lambda hidden column #267

Open
MayCXC wants to merge 4 commits into asg017:main from MayCXC:mmr-reranking


MayCXC commented Feb 27, 2026

refs #266

Adds Maximal Marginal Relevance reranking to vec0 KNN queries. When mmr_lambda is provided, candidates are over-fetched and then greedily re-selected to balance relevance against diversity.

create virtual table vec_items using vec0(
  embedding float[4] distance_metric=cosine
);

insert into vec_items(rowid, embedding) values
  (1, '[1,0,0,0]'),
  (2, '[0.99,0.1,0,0]'),
  (3, '[0.98,0.2,0,0]'),
  (4, '[0,1,0,0]'),
  (5, '[0,0,1,0]');

-- Standard KNN: returns 1, 2, 3 (clustered near [1,0,0,0])
select rowid, distance from vec_items
where embedding match '[1,0,0,0]' and k = 3;

/*
┌───────┬──────────┐
│ rowid │ distance │
├───────┼──────────┤
│ 1     │ 0.0      │
│ 2     │ 0.005... │
│ 3     │ 0.020... │
└───────┴──────────┘
*/

-- MMR: returns 1, then diverse picks instead of near-duplicates
select rowid, distance from vec_items
where embedding match '[1,0,0,0]' and k = 3 and mmr_lambda = 0.5;

/*
┌───────┬──────────┐
│ rowid │ distance │
├───────┼──────────┤
│ 1     │ 0.0      │
│ 5     │ 1.0      │
│ 4     │ 1.0      │
└───────┴──────────┘
*/

Tests

9 pytest + syrupy snapshot tests covering:

  • Cosine diversity (baseline vs lambda=1.0, 0.5, 0.0)
  • L2 distance metric
  • Int8 vector element type
  • Cluster monopoly breaking
  • Composition with distance constraints and partition keys
  • Edge cases (k=1, k=0)
  • Error handling (invalid lambda range)
  • Insert guard for hidden column

All 158 existing tests pass (+ 8 skipped). Verified in both aarch64 and x86_64 VMs.

Add Maximal Marginal Relevance (MMR) support to vec0 virtual table.
When mmr_lambda is provided in a KNN query, candidates are over-fetched
and then greedily re-selected to balance relevance against diversity.

API: WHERE embedding MATCH ? AND k = 10 AND mmr_lambda = 0.7

- mmr_lambda range [0.0, 1.0]: 1.0 = pure relevance, 0.0 = pure diversity
- Over-fetch factor: 5x (capped at k_max=4096)
- Supports float32, int8, and bit vector types
- All distance metrics (L2, cosine, L1, hamming)
- Zero impact when mmr_lambda is not provided
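
The over-fetch and greedy re-selection described above can be sketched in Python. This is an illustrative sketch, not the C code in this PR: `mmr_select` and `overfetch_k` are hypothetical names, and the scoring formula (normalized relevance minus normalized similarity to the closest already-selected item) is one plausible reading of the description. `max_dist` is assumed to be the largest query distance among the fetched candidates.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]

OVERFETCH_FACTOR = 5  # over-fetch factor stated in the description
K_MAX = 4096          # cap stated in the description


def overfetch_k(k: int) -> int:
    """Candidates to fetch before reranking: 5x k, capped at K_MAX."""
    return min(k * OVERFETCH_FACTOR, K_MAX)


def mmr_select(
    query: Vector,
    candidates: Sequence[Tuple[int, Vector]],
    k: int,
    mmr_lambda: float,
    dist: Callable[[Vector, Vector], float] = math.dist,  # L2 by default
) -> List[Tuple[int, float]]:
    """Return up to k (rowid, query_distance) pairs chosen greedily by MMR."""
    if not 0.0 <= mmr_lambda <= 1.0:
        raise ValueError("mmr_lambda must be in [0.0, 1.0]")

    # Plain KNN step: keep the nearest overfetch_k(k) candidates.
    pool = sorted(
        ((dist(query, vec), rowid, vec) for rowid, vec in candidates),
        key=lambda t: t[0],
    )[: overfetch_k(k)]
    if not pool:
        return []

    # Assumption: normalize by the largest query distance in the pool.
    # Pairwise distances can exceed it, so similarity may drop below 0.
    max_dist = pool[-1][0] or 1.0

    selected: List[Tuple[float, int, Vector]] = []
    remaining = list(pool)
    while remaining and len(selected) < k:
        best_idx, best_score = -1, -math.inf
        for i, (d_q, _, vec) in enumerate(remaining):
            relevance = 1.0 - d_q / max_dist
            # Similarity to the closest already-selected item (0 if none yet).
            max_sim = max(
                (1.0 - dist(vec, s_vec) / max_dist for _, _, s_vec in selected),
                default=0.0,
            )
            score = mmr_lambda * relevance - (1.0 - mmr_lambda) * max_sim
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(remaining.pop(best_idx))

    return [(rowid, d_q) for d_q, rowid, _ in selected]
```

With `mmr_lambda = 1.0` the penalty term vanishes and the sketch returns candidates in plain KNN order. Lower values increasingly favor candidates far from those already selected.
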

9 test functions covering:
- Cosine diversity (baseline vs lambda=1.0, 0.5, 0.0)
- L2 distance metric compatibility
- Int8 vector element type
- Cluster monopoly breaking
- Composition with distance constraints
- Composition with partition keys
- Edge cases (k=1, k=0)
- Error handling (invalid lambda range)
- Insert guard for hidden column

The copy-back loop iterated k_target times, but the greedy selection
loop can terminate early via `if (best_idx < 0) break`, leaving the
tail of out_rowids/out_distances uninitialized. Add an n_selected
counter and out_n_selected output parameter so only actually-selected
entries are copied back. The caller now sets k_used = n_selected
instead of k_used = k_original.

Credit: mceachen (vlasky#6)
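
The contract this fix introduces can be illustrated with a simplified Python sketch. It is not the PR's C code: `select_into` is a hypothetical helper, the scores are static rather than recomputed MMR scores, and a `-1` sentinel stands in for uninitialized memory. The selection loop reports how many slots it filled, and the caller reads only that many.

```python
from typing import List, Sequence, Tuple


def select_into(
    scored: Sequence[Tuple[int, float]],  # (rowid, score) pairs
    k_target: int,
    out_rowids: List[int],                # preallocated, length >= k_target
) -> int:
    """Greedily fill out_rowids by descending score; return n_selected."""
    used = [False] * len(scored)
    n_selected = 0
    for _ in range(k_target):
        best_idx = -1
        for i, (_, score) in enumerate(scored):
            if not used[i] and (best_idx < 0 or score > scored[best_idx][1]):
                best_idx = i
        if best_idx < 0:
            break  # candidates exhausted: stop before k_target is reached
        used[best_idx] = True
        out_rowids[n_selected] = scored[best_idx][0]
        n_selected += 1
    return n_selected


k_target = 4
out_rowids = [-1] * k_target  # -1 marks slots that were never written
k_used = select_into([(10, 0.2), (11, 0.9)], k_target, out_rowids)
valid_rowids = out_rowids[:k_used]  # only the first k_used entries are valid
```

Here only two candidates exist, so `k_used` is 2 and the last two slots keep their sentinel value. Reading all `k_target` slots would expose them, which is the failure mode the commit describes.
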

The relevance term was already normalized to [0,1] via max_dist, but the
diversity term used raw distances (1 - d). For cosine this is fine since
values are bounded [0,1], but for L2 and L1 the two terms operated on
different scales, making mmr_lambda behave unpredictably.

Normalize the diversity term the same way (d / max_dist) so both terms
are on a consistent [0,1] scale regardless of distance metric.

Ported from vlasky/sqlite-vec@8d4ef9e.
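
One way to write down the scale mismatch, assuming a standard MMR form. The symbols are illustrative, not taken from the code: $q$ is the query, $c$ a candidate, $S$ the set already selected, $d$ the distance metric, and $d_{\max}$ stands for max_dist. Before the change, the score would read:

$$
\mathrm{score}(c) = \lambda\left(1 - \frac{d(q,c)}{d_{\max}}\right) - (1-\lambda)\max_{s \in S}\bigl(1 - d(c,s)\bigr)
$$

After normalizing the diversity term the same way:

$$
\mathrm{score}(c) = \lambda\left(1 - \frac{d(q,c)}{d_{\max}}\right) - (1-\lambda)\max_{s \in S}\left(1 - \frac{d(c,s)}{d_{\max}}\right)
$$

For example, with an L2 distance of $d(c,s) = 50$, the raw term $1 - d(c,s) = -49$ dwarfs a relevance term confined to $[0,1]$, so almost any value of $\lambda$ below 1 lets diversity dominate. Dividing by $d_{\max}$ puts both terms on a comparable scale.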