
Add a new rescore ANN index for vec0 tables#276

Merged
asg017 merged 4 commits into main from pr/rescore
Mar 31, 2026
Conversation

@asg017 asg017 (Owner) commented Mar 31, 2026

This PR adds the first ANN index to sqlite-vec, called "rescore". A rescore
index quantizes input vectors into int8/bit vectors, and performs an oversampled
KNN search on that smaller vector space, then re-ranks or "rescores" those
candidates using the full-sized vectors.

create virtual table vec_articles using vec0(
  id integer primary key,
  headline_embedding float[1024] distance_metric=cosine indexed by rescore(
    quantizer=bit,
    oversample=8
  )
);

insert into vec_articles(id, headline_embedding)
  values(:id, :embedding);


-- KNN query
select
  rowid,
  distance
from vec_articles
where headline_embedding match :query_embedding
  and k = 10;

The indexed by rescore() clause is the new syntax that declares an index on
the headline_embedding column.

In the KNN search shown above, sqlite-vec first performs a coarse KNN search
on the binary-quantized vectors of the headline_embedding column. It
"oversamples" by k * oversample, meaning it finds the top 8 * 10 = 80 closest
vectors based on Hamming distance over the bit-quantized vectors in the index.
It then rescores those 80 candidates with their full-sized float embeddings,
and returns the top 10 results.
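The coarse-then-rescore flow can be sketched in plain NumPy. This is a minimal illustration with hypothetical helper names (`binary_quantize`, `rescore_knn`), not sqlite-vec's actual implementation, and it uses L2 for the rescore pass:

```python
import numpy as np

def binary_quantize(v):
    # 1 bit per dimension: the sign of each component, packed 8 per byte
    return np.packbits((v >= 0).astype(np.uint8), axis=-1)

def hamming(a, b):
    # popcount of XOR over the packed bytes
    return np.unpackbits(a ^ b, axis=-1).sum(axis=-1)

def rescore_knn(vectors, query, k=10, oversample=8):
    # Coarse pass: Hamming distance over the bit-quantized vectors
    coarse = hamming(binary_quantize(vectors), binary_quantize(query))
    candidates = np.argsort(coarse)[: k * oversample]
    # Rescore pass: exact distance (L2 here) on the full-size floats
    exact = np.linalg.norm(vectors[candidates] - query, axis=-1)
    return candidates[np.argsort(exact)[:k]]
```

With distance_metric=cosine as in the example table, the rescore pass would use cosine distance instead of L2; the oversample factor only controls how many coarse candidates survive into the rescore pass.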

Well-versed sqlite-vec users may notice that this is essentially the same
thing as the
old sqlite-vec Binary Quantization re-scoring
method that the doc site recommends:

-- The old, manual way of doing this - not necessary anymore!

create virtual table vec_movies using vec0(
  synopsis_embedding float[768],
  synopsis_embedding_coarse bit[768]
);
insert into vec_movies(rowid, synopsis_embedding, synopsis_embedding_coarse)
  values (:id, :vector, vec_quantize_binary(:vector));

with coarse_matches as (
  select
    rowid,
    synopsis_embedding
  from vec_movies
  where synopsis_embedding_coarse match vec_quantize_binary(:query)
  order by distance
  limit 20 * 8
)
select
  rowid,
  vec_distance_L2(synopsis_embedding, :query)
from coarse_matches
order by 2
limit 20;

Besides being far less verbose, this new rescore index is slightly faster than
the old manual method. Full-size vectors are stored in a key-value table
instead of vec0's chunked storage, so the re-scoring stage is a series of
b-tree lookups instead of overflow-page nightmares. Granted, it's not a huge
difference, maybe 10-15% faster reads, but it's something!
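Conceptually, the storage split looks something like this. The table names below are a hypothetical sketch; the real shadow tables are managed internally by sqlite-vec and may be shaped differently:

```sql
-- Quantized vectors live in the index that gets scanned during the
-- coarse pass (small, cache-friendly rows)
create table vec_articles_rescore_index(
  rowid integer primary key,
  quantized_vector blob   -- int8 or packed-bit form of the embedding
);

-- Full-size vectors live in a rowid-keyed key-value table, so each
-- rescore candidate costs a single b-tree lookup
create table vec_articles_rescore_vectors(
  rowid integer primary key,
  full_vector blob        -- original float[1024] embedding
);
```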

Benchmarks

First off, benchmarks are a notoriously difficult thing to get right:

  • Every embedding model treats binary/int8 quantization differently; some
    handle it better than others.
  • Some datasets are semantically diverse enough to perform well under
    quantization; others are not.
  • Some SSDs/hardware have a page cache that makes the coarse lookup faster;
    others may not.

So take these benchmarks with a grain of salt, and try it on your own data.

But for my use-case on my hardware (MacBook Pro M4), on a semantically diverse
dataset of 1 million New York Times headlines embedded with
mixedbread-ai/mxbai-embed-large-v1,
I get:

| vec0 Configuration         | Insert (s) | Size (GB) | Query (ms)           | Recall @10 |
| -------------------------- | ---------- | --------- | -------------------- | ---------- |
| Flat index                 | 27.4       | 3.85      | 590ms (baseline)     | 1.0        |
| rescore int8, oversample=2 | 27.3       | 5.28      | 225ms (2.6x speedup) | 1.0        |
| rescore bit, oversample=8  | 28.2       | 4.55      | 101ms (5.8x speedup) | 0.988      |
| rescore bit, oversample=4  | 28.2       | 4.55      | 92ms (6.4x speedup)  | 0.962      |

The int8 rescore performs extremely well at preserving precision, with perfect
recall on k=10 queries while being more than twice as fast. Again, the
mixedbread-ai/mxbai-embed-large-v1 model was specifically trained for binary
and int8 quantization, so your mileage may vary.

The binary rescore index performs very well too: 0.988 recall at
oversample=8, and a respectable 0.962 recall at the lower oversample=4.
The speed benefit of Hamming distance really shows here: nearly 6 times faster
for oversample=8, and more than 6 times faster for oversample=4!

You can also choose a different oversample value at query time, letting you
trade recall for faster queries.

Do note that both int8 and bit quantized rescore indexes take up more disk
space, since they hold both the full-size vectors and the quantized vectors.
For a D-dimensional vector, the int8 copy takes an extra D bytes, while the
bit copy takes an extra D / 8 bytes.
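As a quick back-of-envelope check, the per-vector overhead for the 1024-dimensional embeddings used above works out to:

```python
D = 1024          # dimensions per vector
full = D * 4      # float32 storage: 4 bytes per dimension
int8_extra = D    # int8 quantized copy: 1 byte per dimension
bit_extra = D // 8  # bit quantized copy: 1 bit per dimension

print(full, int8_extra, bit_extra)          # 4096 1024 128
print(int8_extra / full, bit_extra / full)  # 0.25 0.03125
```

So the bit index adds only ~3% on top of the full vectors, while the int8 index adds 25%, which is consistent with the size column in the benchmark table.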

Drawbacks

This rescore index is technically still a brute-force search, but instead of
doing full scans over full-size floating point vectors, it brute-forces much
smaller binary or int8 vectors.

So as your vector database grows, query time with the rescore index will also
grow, roughly linearly.

Also, recall will depend almost entirely on your embedding model and dataset.
If your embedding model was specifically trained for binary or integer
quantization, I suspect you'll have a great time using rescore. If not, YMMV;
I've seen recall drop down to 70-80%. Additionally, if your data isn't very
semantically diverse, then quantization may muddle the results.

But in general, if your data and embedding model handle quantization well and
you can take a ~10% accuracy hit, I've found the rescore index to be fantastic!

asg017 added 3 commits March 29, 2026 19:44
Add approximate nearest neighbor infrastructure to vec0: shared distance
dispatch (vec0_distance_full), flat index type with parser, NEON-optimized
cosine/Hamming for float32/int8, amalgamation script, and benchmark suite
(benchmarks-ann/) with ground-truth generation and profiling tools. Remove
unused vec_npy_each/vec_static_blobs code, fix missing stdint.h include.
Add rescore index type: stores full-precision float vectors in a rowid-keyed
shadow table, quantizes to int8 for fast initial scan, then rescores top
candidates with original vectors. Includes config parser, shadow table
management, insert/delete support, KNN integration, compile flag
(SQLITE_VEC_ENABLE_RESCORE), fuzz targets, and tests.
@asg017 asg017 changed the base branch from pr/ann to main March 31, 2026 08:08
@asg017 asg017 merged commit 43982c1 into main Mar 31, 2026