Commit
[PERF] Make the index correctly use FTS (#958)
## Description of changes

Previously we were not using the FTS search index correctly. FTS5 (https://sqlite.org/fts5.html#full_text_query_syntax) expects you to query using the name of the FTS table, not a column name. If you want to query by column name, you have to use column filters, as discussed in the link above. We opt to take the path suggested here https://sqlite.org/forum/forumpost/1d45a7f6e17a3460 and match on id in addition to filtering that specific column. The query planner leverages this appropriately, as confirmed with EXPLAIN.

Because we were issuing speculative delete queries on the assumption that the index was being used, the write path was incredibly slow. It is now much faster.

Explain before:

```
-- SCAN VIRTUAL TABLE INDEX 0:
```

-> Full table scan.

Explain after:

```
-- SCAN VIRTUAL TABLE INDEX 0:M2
```

-> Scans the index itself.

The net effect is a large increase in write speed, and the write path time no longer grows with table size.

### Quick Benchmark Results

- N = 100k uniformly random vectors
- D = 128
- Metadata = one small key: value pair
- Document = randomly generated string of length 100
- Added with batch size = 1000

**Without Fix, Overall Time = 469s. Time to add a batch grows linearly to >8000 ms**

<img width="590" alt="Screenshot 2023-08-09 at 5 53 24 PM" src="https://github.com/chroma-core/chroma/assets/5598697/89dde745-9231-4f3f-b62c-bf8486f7e970">

**With Fix, Overall Time = 102s. Time to add a batch grows sublinearly to ~1200 ms**

<img width="587" alt="Screenshot 2023-08-09 at 5 43 12 PM" src="https://github.com/chroma-core/chroma/assets/5598697/2a771788-e5d9-4afe-bacb-dfbfb51b6cd1">

We will also want to make sure that the read path leverages this way of querying. We will address that in a follow-up PR.

## Test plan

Existing tests cover the scope of this change.

## Documentation Changes

None required.
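For reviewers who want to reproduce the EXPLAIN difference described above, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the code from this change: the table `docs_fts`, its columns, and the `plan` helper are hypothetical, and it assumes the local SQLite build has FTS5 enabled. It shows one plausible reading of the approach, where a table-name `MATCH` with a column filter lets FTS5 use its index and a plain equality on `id` keeps the result exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical FTS5 table; the real schema in this repository may differ.
conn.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(id, body)")
conn.executemany(
    "INSERT INTO docs_fts (id, body) VALUES (?, ?)",
    [(f"id{i}", f"document number {i}") for i in range(1000)],
)


def plan(sql, params):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail string


target = "id42"

# Plain equality on an FTS5 column: the FTS index cannot serve this
# constraint, so every row is visited and filtered afterwards.
slow_sql = "SELECT rowid FROM docs_fts WHERE id = ?"

# MATCH against the table name with a column filter (`id : "..."`), plus
# the equality check so that only the exact id is returned.
fast_sql = "SELECT rowid FROM docs_fts WHERE docs_fts MATCH ? AND id = ?"
match_expr = f'id : "{target}"'

slow_plan = plan(slow_sql, (target,))
fast_plan = plan(fast_sql, (match_expr, target))
print("before:", slow_plan)
print("after: ", fast_plan)

slow_rows = conn.execute(slow_sql, (target,)).fetchall()
fast_rows = conn.execute(fast_sql, (match_expr, target)).fetchall()

# The same WHERE clause can drive a delete.
conn.execute(
    "DELETE FROM docs_fts WHERE docs_fts MATCH ? AND id = ?",
    (match_expr, target),
)
remaining = conn.execute("SELECT count(*) FROM docs_fts").fetchone()[0]
```

The exact plan text varies between SQLite versions, so compare the two printed plans rather than relying on a specific string. Note that the column filter matches tokens, which is why the sketch keeps the `id = ?` check alongside it.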