Swapping searchableAttributes shouldn't trigger a reindexing #4484

Open
5 tasks
ManyTheFish opened this issue Mar 13, 2024 · 0 comments
Labels
  • performance: Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption
  • settings diff-indexing: Issues related to settings diff-indexing

ManyTheFish commented Mar 13, 2024

Related product team resources: PRD (internal only)

⚠️ This issue depends on #4480 being implemented first.

Summary

This issue is a subset of the work implementing the settings diff-indexing enhancement.

The order of searchableAttributes is essential in Meilisearch for ranking documents that match in different attributes. For instance, a document matching in the title is ranked higher than a document matching in the description.
In the current implementation, the ranks are hardcoded in the searchable databases as fieldid-word-docids, where the fieldid is both the ID and the rank of the field. Changing the order therefore forces Meilisearch to reindex, or at least swap, the data in all the fieldid-based databases.

The ID and the rank should be decoupled, and a structure mapping each fieldid to its rank should be created.
This way, swapping two searchable attributes only swaps their ranks in the fieldid-rank mapping table, and no reindexing is needed.
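
As a rough illustration of the idea, the sketch below keeps postings keyed by field ID and stores ranks in a separate table. All names here (`fieldid_word_docids` as a Python dict, `fieldid_rank`, `swap_ranks`, `best_rank`) are hypothetical in-memory stand-ins, not Meilisearch's actual databases or API.

```python
# Hypothetical in-memory stand-ins; not Meilisearch's real storage layer.

# Postings keyed by (field id, word), built once at indexing time.
fieldid_word_docids = {
    (0, "rust"): {1, 2},  # field 0 is assumed to be "title"
    (1, "rust"): {3},     # field 1 is assumed to be "description"
}

# Separate table: field id -> rank (lower means more important).
fieldid_rank = {0: 0, 1: 1}


def swap_ranks(ranks, a, b):
    """Return a new mapping with the ranks of fields a and b exchanged."""
    swapped = dict(ranks)
    swapped[a], swapped[b] = ranks[b], ranks[a]
    return swapped


def best_rank(ranks, postings, word, docid):
    """Best (lowest) rank among the fields in which docid matches word."""
    matching = [
        ranks[fid]
        for (fid, w), docs in postings.items()
        if w == word and docid in docs
    ]
    return min(matching) if matching else None


# Snapshot the postings, then swap the two attributes.
postings_before = {key: set(docs) for key, docs in fieldid_word_docids.items()}
fieldid_rank = swap_ranks(fieldid_rank, 0, 1)
```

After the swap, only `fieldid_rank` has changed. The postings are untouched, yet a ranking lookup now treats field 1 as the most important one.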

Moreover, this new structure eases the implementation of a new feature, allowing the user to assign the same rank to several attributes, like:

{
  "searchableAttributes": {
    // title is the most important
    "title": 1,
    // description and footer are considered equal
    "description": 2,
    "footer": 2
  }
}
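
To make the intended semantics concrete, here is a minimal sketch of how equal ranks could be interpreted. It assumes the proposed object form of the setting shown above (with the comments dropped) and that a lower number means a more important attribute. The helper names `attributes_by_rank` and `match_rank` are hypothetical.

```python
# The proposed setting from the example above, as a plain dict.
searchable_attributes = {"title": 1, "description": 2, "footer": 2}


def attributes_by_rank(setting):
    """Group attribute names by rank, most important rank first."""
    groups = {}
    for name, rank in setting.items():
        groups.setdefault(rank, []).append(name)
    return [sorted(groups[rank]) for rank in sorted(groups)]


def match_rank(setting, matched_attributes):
    """Best (lowest) rank among the attributes a document matched in."""
    return min(setting[name] for name in matched_attributes)


groups = attributes_by_rank(searchable_attributes)
```

Under these assumptions, a match in description and a match in footer get the same rank, so neither is preferred over the other by the attribute order alone.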

poke @macraig on this.

TODO

  • Ensure searchableAttributes swapping is tested in Meilisearch
  • Create and index the fieldid-rank mapping table
  • Make sure that the Attributes criterion takes the fieldid-rank mapping table into account
  • Don't trigger a reindexing when the searchableAttributes are only swapped
  • Do not reindex obkv documents in the documents database
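
The "only swapped" check in the list above could be sketched as follows. This is an illustration under the assumption that searchableAttributes is given as an ordered list of names. `searchable_change_kind` and `needs_reindexing` are hypothetical helpers, not existing Meilisearch functions.

```python
def searchable_change_kind(old, new):
    """Classify a change between two searchableAttributes lists."""
    if old == new:
        return "unchanged"
    if sorted(old) == sorted(new):
        return "reordered"  # same attributes, different order
    return "modified"       # attributes were added or removed


def needs_reindexing(old, new):
    """Only an added or removed attribute requires touching indexed data."""
    return searchable_change_kind(old, new) == "modified"
```

With a fieldid-rank mapping table in place, a "reordered" change would only update that table, while a "modified" change would still go through the settings diff-indexing path.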

Related Benchmarks:

  • settings-remove-add-swap-searchable.json

Additional note from @ManyTheFish

I raised the priority of this issue because fixing it will ease several implementations. Today, the mapping between fields and field IDs is unstable: changing the searchable attributes setting can change this mapping, which forces reindexing data that has not changed, just because the ID of its field did.
For instance, indexing a document like:

{"id": 1, "name": "Many", "age": 28, "realName": "Maxime"}

with the settings:

{"searchableAttributes": ["name"], "filterableAttributes": ["age"]}

currently assigns the field IDs as "name": 0, "id": 1, "age": 2, "realName": 3.
Then, changing the settings to:

{"searchableAttributes": ["name", "realName"], "filterableAttributes": ["age"]}

reassigns the field IDs as "name": 0, "realName": 1, "id": 2, "age": 3,
forcing the reindexing of 3 fields just by adding a searchable attribute.
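
For contrast, a stable assignment could look like the sketch below: IDs are handed out in first-seen order and never reassigned, while the searchable rank is stored separately. `FieldIdMap` is a hypothetical class written for this example and does not reflect Meilisearch's actual implementation.

```python
class FieldIdMap:
    """Append-only name -> id map: an id, once assigned, never changes."""

    def __init__(self):
        self._ids = {}

    def insert(self, name):
        # setdefault keeps the existing id when the name is already known.
        return self._ids.setdefault(name, len(self._ids))

    def as_dict(self):
        return dict(self._ids)


# Initial indexing with the IDs from the example above.
fields = FieldIdMap()
for name in ["name", "id", "age", "realName"]:
    fields.insert(name)
before = fields.as_dict()

# New settings: "realName" becomes searchable. No id is reassigned.
new_searchable = ["name", "realName"]
for name in new_searchable:
    fields.insert(name)
after = fields.as_dict()

# The order of the searchable attributes lives in a separate id -> rank table.
searchable_rank = {
    fields.insert(name): rank for rank, name in enumerate(new_searchable)
}
```

With this scheme, adding "realName" to the searchable attributes leaves the IDs of "id" and "age" untouched, so only the newly searchable field would need indexing.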
