Avoid exhaustively sorting buckets that will be discarded due to the "offset" search parameter #3123

loiclec · 2022-11-23T13:57:05Z

Meilisearch works by performing a bucket sort of the documents that match a search query (as explained briefly in the Meilisearch documentation).

For example, with the following ranking rules:

words
typo
proximity

Then, conceptually, the following things happen:

words prepares the first bucket of documents. These are the documents which contain all the words from the search query. It gives this bucket to the next ranking rule, typo.
typo prepares a sub-bucket by finding all the documents in that bucket which contain the words from the search query with zero typos. It gives this sub-bucket to proximity.
proximity prepares a sub-sub-bucket by finding all documents where consecutive words in the query are also consecutive in the document.
Since there are no more ranking rules afterward, this sub-sub-bucket is returned. If we don't have enough results yet, we ask proximity for its next sub-bucket. If there aren't any, we ask typo, and finally words again. We go up and down the ranking rules in this way until we have exhaustively sorted enough documents.
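The traversal described above can be sketched as a small recursive bucket sort. This is only a conceptual model, not milli's actual code: the names `rule_from_scores` and `bucket_sort`, the representation of a ranking rule as a function that yields buckets, the made-up per-document scores, and the tie-break by document id are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, List, Set

# Hypothetical model: a ranking rule takes a bucket of document ids and
# lazily yields its sub-buckets, best first.
RankingRule = Callable[[Set[int]], Iterator[Set[int]]]


def rule_from_scores(scores: Dict[int, int]) -> RankingRule:
    """Build a toy ranking rule from per-document scores (lower is better)."""
    def rule(candidates: Set[int]) -> Iterator[Set[int]]:
        for score in sorted({scores[d] for d in candidates}):
            yield {d for d in candidates if scores[d] == score}
    return rule


def bucket_sort(candidates: Set[int], rules: List[RankingRule], limit: int) -> List[int]:
    """Return the first `limit` documents, going up and down the ranking rules."""
    results: List[int] = []

    def descend(bucket: Set[int], depth: int) -> None:
        if len(results) >= limit:
            return
        if depth == len(rules):
            # No more ranking rules: the bucket is returned as-is
            # (ties are broken by document id in this sketch).
            results.extend(sorted(bucket)[: limit - len(results)])
            return
        # Ask the current rule for its buckets one at a time and hand
        # each of them to the next rule.
        for sub_bucket in rules[depth](bucket):
            descend(sub_bucket, depth + 1)
            if len(results) >= limit:
                return

    descend(candidates, 0)
    return results


# Made-up scores for eight documents, one table per ranking rule.
WORDS = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
TYPO = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 0, 6: 1, 7: 1}
PROXIMITY = {0: 1, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1}
RULES = [rule_from_scores(s) for s in (WORDS, TYPO, PROXIMITY)]

print(bucket_sort(set(range(8)), RULES, limit=3))  # [2, 0, 1]
```

Because each rule yields its buckets lazily, the sort stops as soon as enough documents have been collected, which mirrors the "until we have exhaustively sorted enough documents" step.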

However, if a user asks for results starting from offset 500, for example, we should apply slightly different logic. When a ranking rule computes its bucket, it should check whether any document in this bucket could possibly be returned in the results. If not, it should discard this bucket and compute the next one.

Currently, this logic, which skips sorting a bucket that doesn't intersect the requested range of results, is not implemented. In practice, this means that searches with a large offset parameter are much slower than necessary.
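A minimal sketch of the proposed check, under the same toy model of ranking rules as functions yielding buckets, might look like the following. The names `search` and `rule_from_scores`, the made-up scores, and the tie-break by document id are hypothetical and are not taken from milli; the sketch only illustrates the idea that a bucket whose documents all rank before `offset` can be discarded without being passed to the next ranking rule.

```python
from typing import Callable, Dict, Iterator, List, Set

# Hypothetical model: a ranking rule splits a bucket into ordered sub-buckets.
RankingRule = Callable[[Set[int]], Iterator[Set[int]]]


def rule_from_scores(scores: Dict[int, int]) -> RankingRule:
    """Build a toy ranking rule from per-document scores (lower is better)."""
    def rule(candidates: Set[int]) -> Iterator[Set[int]]:
        for score in sorted({scores[d] for d in candidates}):
            yield {d for d in candidates if scores[d] == score}
    return rule


def search(candidates: Set[int], rules: List[RankingRule],
           offset: int, limit: int) -> List[int]:
    """Return documents ranked offset..offset+limit, skipping useless buckets."""
    results: List[int] = []
    skipped = 0  # documents known to rank before the requested window

    def descend(bucket: Set[int], depth: int) -> None:
        nonlocal skipped
        if len(results) >= limit:
            return
        # The whole bucket ranks before `offset`: none of its documents can
        # be returned, so discard it without sorting it any further.
        if skipped + len(bucket) <= offset:
            skipped += len(bucket)
            return
        if depth == len(rules):
            # Leaf bucket overlapping the window: drop the part before
            # `offset`, keep what fits (ties broken by document id).
            ordered = sorted(bucket)
            start = offset - skipped
            results.extend(ordered[start : start + limit - len(results)])
            skipped += start
            return
        for sub_bucket in rules[depth](bucket):
            descend(sub_bucket, depth + 1)
            if len(results) >= limit:
                return

    descend(candidates, 0)
    return results


# Made-up scores for eight documents, one table per ranking rule.
WORDS = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
TYPO = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 0, 6: 1, 7: 1}
PROXIMITY = {0: 1, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1}
RULES = [rule_from_scores(s) for s in (WORDS, TYPO, PROXIMITY)]

print(search(set(range(8)), RULES, offset=4, limit=2))  # [5, 4]
```

In this example the first bucket produced by the words rule holds four documents, all of which rank before offset 4, so it is never handed to typo or proximity. The result is the same as fully sorting and then slicing, but with less work for large offsets.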


loiclec · 2023-05-15T08:48:45Z

Fixed by #3542

loiclec added the milli (related to the milli workspace) and performance (related to search/indexation speed or RAM/CPU/disk consumption) labels Nov 23, 2022

This was referenced Nov 23, 2022

Search performance issues #3111

Closed

Ranking rules should not do any work when only one bucket candidate exists #3124

Closed

loiclec mentioned this issue Feb 28, 2023

Search Relevancy & Performance Improvements #3547

Closed

loiclec closed this as completed May 15, 2023

curquiza added this to the v1.2.0 milestone May 15, 2023

meili-bot added the v1.2.0 label (PRs/issues solved in v1.2.0, released on 2023-06-05) Jun 13, 2023
