Search filtering #38

Merged
merged 7 commits into from
Dec 6, 2023

@irevoire irevoire commented Dec 5, 2023

To merge after #37

Related to #5

We could probably improve search performance significantly by storing, in every split node, all the IDs below it and choosing the branch accordingly. Tested and failed in #39.
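
As a rough illustration of the split-node idea mentioned above (the one reported as tested and failed in #39), here is a minimal sketch. Everything in it is hypothetical: the `Node` enum, the `ids_below` field, and the `split` and `collect` helpers are made up for this example and are not arroy's actual types, and a `BTreeSet<u32>` stands in for a roaring bitmap so the snippet needs no external crate. It also ignores the geometric side of the search (which side of the hyperplane the query falls on) and only shows how a subtree could be skipped when none of its IDs pass the filter.

```rust
use std::collections::BTreeSet;

// Hypothetical toy tree, not arroy's real node layout.
enum Node {
    // Leaf bucket holding item IDs.
    Descendants(BTreeSet<u32>),
    // Split node that also records every ID stored below it.
    Split {
        ids_below: BTreeSet<u32>,
        left: Box<Node>,
        right: Box<Node>,
    },
}

// Returns the set of IDs reachable from a node.
fn ids(node: &Node) -> &BTreeSet<u32> {
    match node {
        Node::Descendants(items) => items,
        Node::Split { ids_below, .. } => ids_below,
    }
}

// Builds a split node whose `ids_below` is the union of both children.
fn split(left: Node, right: Node) -> Node {
    let ids_below = ids(&left).union(ids(&right)).copied().collect();
    Node::Split {
        ids_below,
        left: Box::new(left),
        right: Box::new(right),
    }
}

// Collects the IDs allowed by `filter`, skipping any subtree that shares
// no ID with it. `visited` counts how many nodes were inspected.
fn collect(node: &Node, filter: &BTreeSet<u32>, out: &mut BTreeSet<u32>, visited: &mut usize) {
    *visited += 1;
    match node {
        Node::Descendants(items) => out.extend(items.intersection(filter).copied()),
        Node::Split { ids_below, left, right } => {
            if ids_below.is_disjoint(filter) {
                return; // nothing below this node can match the filter
            }
            collect(left, filter, out, visited);
            collect(right, filter, out, visited);
        }
    }
}
```

The trade-off in this sketch is that every split node carries an extra set that has to be stored and read back, which may cost more than the pruning saves.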

While filtering out half of the items:

Found 95832 vectors
Louis's query took 5.556166ms
Starts querying all documents ...
Making the stats
On average it took 3.420697ms to query a vector
The fastest query took 873.5µs
The slowest query took 18.778792ms
Standard deviation: 774.028µs

Louis's query got twice as fast, and the fastest query also got much faster than before.
But the slowest and average query times are pretty much the same as before.


Without filtering anything (just to see the impact of deserializing a roaring bitmap for every descendants node):

Found 95832 vectors
Louis's query took 5.5425ms
Starts querying all documents ...
Making the stats
On average it took 3.693682ms to query a vector
The fastest query took 1.093916ms
The slowest query took 14.116625ms
Standard deviation: 691.014µs

Compared to main before this PR:

Found 95832 vectors
Louis's query took 5.133333ms
Starts querying all documents ...
Making the stats
On average it took 3.797529ms to query a vector
The fastest query took 1.048167ms
The slowest query took 26.041417ms
Standard deviation: 779.071µs
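
For readers who want to reproduce the figures in these reports (average, fastest, slowest, standard deviation), here is a minimal sketch of how they could be computed from a list of per-query timings. The `Stats` struct and `stats` function are hypothetical names for this example, not the benchmark's actual code, and the sketch assumes a population standard deviation (dividing by n); the benchmark may use the sample formula instead.

```rust
use std::time::Duration;

// Hypothetical summary of a batch of query timings.
struct Stats {
    mean: Duration,
    fastest: Duration,
    slowest: Duration,
    std_dev: Duration,
}

// Computes the summary, or returns None when there are no timings.
fn stats(timings: &[Duration]) -> Option<Stats> {
    if timings.is_empty() {
        return None;
    }
    let n = timings.len() as f64;
    let secs: Vec<f64> = timings.iter().map(Duration::as_secs_f64).collect();
    let mean = secs.iter().sum::<f64>() / n;
    // Population variance: average squared distance from the mean.
    let variance = secs.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    Some(Stats {
        mean: Duration::from_secs_f64(mean),
        fastest: *timings.iter().min()?,
        slowest: *timings.iter().max()?,
        std_dev: Duration::from_secs_f64(variance.sqrt()),
    })
}
```

Reporting the slowest query alongside the mean matters here because a single outlier barely moves the average over tens of thousands of queries.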

In terms of database size, we saved ~10% on movies:

-rw-------@ 1 irevoire  staff   667M Dec  4 11:44 movies/data.mdb
-rw-------@ 1 irevoire  staff   604M Dec  5 17:54 movies_roaring/data.mdb
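
As a quick check of the ~10% figure, using the rounded sizes from the listing above (so the result is only approximate):

```latex
\frac{667\,\text{M} - 604\,\text{M}}{667\,\text{M}} = \frac{63}{667} \approx 0.094 \quad (\text{about } 9.4\%)
```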


@Kerollmops Kerollmops left a comment

bors merge 😂

src/reader.rs (outdated review comment, resolved)
@Kerollmops Kerollmops merged commit 6d1c0d4 into main Dec 6, 2023
5 checks passed
@Kerollmops Kerollmops deleted the search-filtering branch December 6, 2023 10:05