WIP: Support full vector indexes (no PLAID compression) #137

bclavie · 2024-02-14T18:05:03Z

Many RAGatouille users use the library to encode small collections (from a few hundred to a few thousand documents), which don't fully benefit from the highly optimised PLAID indexes.

PLAID is very powerful but introduces complexity, notably custom CUDA code and faiss, which can be finicky, as evidenced by #119, #85, #55, #72, #14, #120 (and more).

This PR introduces another, naive form of indexing: the index is created with all the usual metadata, etc., but the actual "index" is a .pt.gz file containing the gzipped full-sized vectors (compression work is left for the future). Querying will be slower with this approach, but it is quick to implement and low on dependencies, and should provide satisfactory results, with query times still in the hundreds of milliseconds for a collection of 2k documents.
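To make the idea concrete, here is a minimal, dependency-free sketch of what a full-vector index amounts to: store every token vector as-is, then score each document by brute force at query time. It is not the code in this PR. Plain lists of floats stand in for torch tensors, gzipped JSON stands in for the .pt.gz file, and all function names are hypothetical. The scoring uses the MaxSim late-interaction operator that ColBERT-style models rely on.

```python
import gzip
import json

# Hypothetical helpers for illustration only: lists of floats stand in for
# torch tensors, and gzipped JSON stands in for the .pt.gz file.


def save_full_vectors(path, doc_embeddings):
    """Persist every document's token vectors, compressed only by gzip."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(doc_embeddings, f)


def load_full_vectors(path):
    """Load the token vectors back as nested lists."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """For each query token, take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def search(query_vecs, doc_embeddings, k=10):
    """Brute-force search: score every document, return the top k as (doc_id, score)."""
    scored = [
        (doc_id, maxsim_score(query_vecs, doc_vecs))
        for doc_id, doc_vecs in enumerate(doc_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Every query touches every stored vector, so cost grows linearly with collection size, which is why this only makes sense for small collections.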

Additionally, this indexing method will become the default when calling index() with fewer than 512,000 total embeddings (2,000 documents with a 256-token length). Forcing PLAID or FULL_VECTORS (naming?) will be possible by passing it as an explicit argument.
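The default-selection rule could look roughly like the sketch below. The enum, the constant and the function name are hypothetical and only illustrate the threshold logic described above. They are not the actual index() signature.

```python
from enum import Enum
from typing import Optional


class IndexType(Enum):
    PLAID = "PLAID"
    FULL_VECTORS = "FULL_VECTORS"  # name still under discussion in this PR


# 2,000 documents x 256 tokens each
FULL_VECTORS_MAX_EMBEDDINGS = 512_000


def choose_index_type(
    total_embeddings: int, forced: Optional[IndexType] = None
) -> IndexType:
    """Pick the index type: an explicit choice wins, otherwise use the size threshold."""
    if forced is not None:
        return forced
    if total_embeddings < FULL_VECTORS_MAX_EMBEDDINGS:
        return IndexType.FULL_VECTORS
    return IndexType.PLAID
```

For instance, a collection of 1,000 documents at 256 tokens each (256,000 embeddings) would default to full vectors unless PLAID is passed explicitly.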

This will also unlock really easy CRUD. Not for PLAID indexes, but for full-vector indexes there will be very little in the way of adding/removing documents (subsequent PR).

This indexing approach will evolve but remain backward compatible with this first naive implementation.

This is also the first step towards #110 (no proper HNSW implementation or vector compression here, but the overall indexing idea is the same) and #74 (a lower barrier to entry for specifying mps).


bclavie · 2024-03-15T10:49:09Z

Closing as superseded by the much saner #158 rework.

bclavie marked this pull request as draft February 14, 2024 18:05

bclavie force-pushed the feat/full_vectors_indexing branch from add850f to dc3cc2f Compare February 15, 2024 17:10

WIP

141b377

WIP; WIP: functional; fix: force full precision for now; tmp: comment out precision arg; lint

bclavie force-pushed the feat/full_vectors_indexing branch from 92d99d6 to 141b377 Compare February 15, 2024 17:23

Merge branch 'main' into feat/full_vectors_indexing

f8429b7

This was referenced Feb 16, 2024

Inconsistent search results length for high top-k values #135

Open

Pytorch 2.1 on Runpod running Examples hangs with message #146

Closed

bclavie closed this Mar 15, 2024
