-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split the field id map from the weight of each fields #4631
Split the field id map from the weight of each fields #4631
Conversation
8e6ea3a
to
31ed3a2
Compare
c30371b
to
9fffb8e
Compare
…stricted field id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect to me! Can you also ensure that the field IDs have equal weights, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can merge whenever we can 🙂
bors merge
4631: Split the field id map from the weight of each fields r=Kerollmops a=irevoire # Pull Request ## Related issue Fixes #4484 ## What does this PR do? - Make the (internal) searchable fields database always contain the searchable fields (instead of None when the user-defined searchable fields were not defined) - Introduce a new « fieldids_weights_map » that does the mapping between a fieldId and its Weight - Ensure that when two searchable fields are swapped, the field ID map doesn't change anymore (and thus, doesn't re-index) - Uses the weight instead of the order of the searchable fields in the attribute ranking rule at search time - When no searchable attributes are defined, make all their weights equal to zero - When a field is declared as searchable and contains nested fields, all its subfields share the same weight ## Impact on relevancy ### When no searchable attributes are declared When no searchable attributes are declared, all the fields have the same importance instead of randomly giving more importance to the field we've encountered « the most early » in the life of the index. This means that before this PR, send the following json: ```json [ { "id": 0, "name": "kefir", "color": "white" }, { "id": 1, "name": "white", "last name": "spirit" } ] ``` Would make the field `name` more important than the field `color` or `last name`. This means that searching for `white` would make the document `1` automatically higher ranked than the document `0`. After this PR, all the fields have the same weight, and none are considered more important than others. ### When a nested field is made searchable The second behavior change that happened with this PR is in the case you're sending this document, for example: ```json { "id": 0, "name": "tamo", "doggo": { "name": "kefir", "surname": "le kef" }, "catto": "gromez" } ``` Previously, defining the searchable attributes as: `["tamo", "doggo", "catto"]` was actually defining the « real » searchable attributes in the engine as: `["tamo", "doggo", "catto", "doggo.name", "doggo.surname"]`, which means that `doggo.name` and `doggo.surname` were _NOT_ where the user expected them and had completely different weights than `doggo`. In this PR all the weights have been unified, and the « real » searchable fields look like this: ```json [ "tamo", "doggo", "doggo.name", "doggo.surname", "catto"] ^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^ Weight 0 Weight 1 Weight 2 Co-authored-by: Tamo <tamo@meilisearch.com>
The windows failure here is a bit disheartening |
Canceled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors merge
Pull Request
Related issue
Fixes #4484
Fixes #4639
What does this PR do?
Impact on relevancy
When no searchable attributes are declared
When no searchable attributes are declared, all the fields have the same importance instead of randomly giving more importance to the field we've encountered « the most early » in the life of the index.
This means that before this PR, send the following json:
Would make the field
name
more important than the fieldcolor
orlast name
.This means that searching for
white
would make the document1
automatically higher ranked than the document0
.After this PR, all the fields have the same weight, and none are considered more important than others.
When a nested field is made searchable
The second behavior change that happened with this PR is in the case you're sending this document, for example:
Previously, defining the searchable attributes as:
["tamo", "doggo", "catto"]
was actually defining the « real » searchable attributes in the engine as:["tamo", "doggo", "catto", "doggo.name", "doggo.surname"]
, which means thatdoggo.name
anddoggo.surname
were NOT where the user expected them and had completely different weights thandoggo
.In this PR all the weights have been unified, and the « real » searchable fields look like this: