Store fuzzy/bucketed positions in `word_position_docids` database #746

loiclec · 2022-12-15T10:41:01Z

Pull Request

Related issue

Fixes (when merged into meilisearch) meilisearch/meilisearch#3222

Implementation

The design is described well in the related issue. For details of how different relative positions are grouped together, see the test `bucketed_position`.

Basically, we no longer store the exact position of words that appear far into an attribute, but instead group relative positions together in buckets whose size grows exponentially with the original position. This is done to improve the relevancy and the performance of the attribute ranking rule.
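To make the idea concrete, here is a minimal sketch of one possible bucketing function. It is not the code from this PR: the name `bucket_position_sketch`, the threshold of 16 exact positions, and the choice of rounding up to the next power of two are all assumptions made for illustration. The actual grouping is the one exercised by the `bucketed_position` test.

```rust
/// Hypothetical helper (not the function from this PR): maps a relative
/// position inside an attribute to a coarser "bucketed" position.
///
/// Assumption: positions below `EXACT_LIMIT` are stored as-is, and later
/// positions are rounded up to the next power of two, so each bucket is
/// twice as large as the previous one.
fn bucket_position_sketch(relative: u16) -> u16 {
    // Assumed threshold below which exact positions are kept.
    const EXACT_LIMIT: u16 = 16;

    if relative < EXACT_LIMIT {
        relative
    } else {
        // `checked_next_power_of_two` returns `None` when the result would
        // not fit in a `u16`; in that case, clamp to the largest value.
        relative.checked_next_power_of_two().unwrap_or(u16::MAX)
    }
}
```

With these assumptions, positions 17 to 32 share one bucket, positions 33 to 64 share the next, and so on, so the number of distinct keys grows only logarithmically with the length of the attribute.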

This is a draft until #742 is merged and the results of the benchmarks are available.

EDIT: I also realised just now that the iterative version of the algorithm needs to be updated as well!


loiclec · 2023-01-02T12:23:19Z

I think I am going to postpone this improvement to v1.1 because:

- The iterative version of the algorithm also needs to be updated.
- I found an unrelated bug in the implementation of the set-based version of the algorithm, and I would like to debug it first.
- The whole design of the attribute ranking rule will change a lot soon, and so will the whole structure of almost all search algorithms, so I don't want to duplicate the work too much.

loiclec marked this pull request as draft December 15, 2022 10:41

Store fuzzy/bucketed positions in word_position_docids database

b307c93

Fixes (when merged into meilisearch): meilisearch/meilisearch#3222

loiclec force-pushed the attribute-fuzzy branch from 648c529 to b307c93 Compare December 22, 2022 11:16
