4631: Split the field id map from the weight of each fields r=Kerollmops a=irevoire

# Pull Request

## Related issue

Fixes #4484

## What does this PR do?

- Make the (internal) searchable fields database always contain the searchable fields (instead of `None` when the user-defined searchable fields were not defined)
- Introduce a new « fieldids_weights_map » that maps each field ID to its weight
- Ensure that when two searchable fields are swapped, the field ID map doesn't change anymore (and thus doesn't trigger a re-index)
- Use the weight instead of the order of the searchable fields in the attribute ranking rule at search time
- When no searchable attributes are defined, make all field weights equal to zero
- When a field is declared as searchable and contains nested fields, all its subfields share the same weight

## Impact on relevancy

### When no searchable attributes are declared

When no searchable attributes are declared, all the fields have the same importance, instead of arbitrarily giving more importance to the field that was encountered earliest in the life of the index.

This means that, before this PR, sending the following JSON:

```json
[
  { "id": 0, "name": "kefir", "color": "white" },
  { "id": 1, "name": "white", "last name": "spirit" }
]
```

would make the field `name` more important than the fields `color` or `last name`. Searching for `white` would therefore automatically rank document `1` higher than document `0`.

After this PR, all the fields have the same weight, and none is considered more important than the others.
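The change described above can be sketched in a few lines. This is only an illustration, not the code from the PR: the `FieldId` and `Weight` aliases are assumed to be `u16`, both function names are hypothetical, and a lower weight is assumed to mean a higher priority (following the `Weight 0`, `Weight 1`, `Weight 2` ordering used in this description).

```rust
use std::collections::BTreeMap;

// Assumed stand-ins for the crate's own types; the real definitions may differ.
type FieldId = u16;
type Weight = u16;

/// Hypothetical model of the behaviour before the PR: the position of a field
/// in discovery order acts as its weight, so earlier fields matter more.
fn weights_from_discovery_order(field_ids: &[FieldId]) -> BTreeMap<FieldId, Weight> {
    field_ids
        .iter()
        .enumerate()
        .map(|(pos, &fid)| (fid, pos as Weight))
        .collect()
}

/// Behaviour after the PR when no searchable attributes are declared:
/// every field receives weight 0, so no field outranks another.
fn weights_without_searchable(field_ids: &[FieldId]) -> BTreeMap<FieldId, Weight> {
    field_ids.iter().map(|&fid| (fid, 0)).collect()
}
```

With the JSON above, and assuming the fields are discovered in the order `id`, `name`, `color`, `last name`, the first function gives `name` a lower weight than `color`, while the second gives every field the same weight.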
### When a nested field is made searchable

The second behavior change introduced by this PR concerns documents such as this one:

```json
{
  "id": 0,
  "name": "tamo",
  "doggo": {
    "name": "kefir",
    "surname": "le kef"
  },
  "catto": "gromez"
}
```

Previously, defining the searchable attributes as `["tamo", "doggo", "catto"]` was actually defining the « real » searchable attributes in the engine as `["tamo", "doggo", "catto", "doggo.name", "doggo.surname"]`, which means that `doggo.name` and `doggo.surname` were _NOT_ where the user expected them and had completely different weights than `doggo`.

In this PR all the weights have been unified, and the « real » searchable fields look like this:

```json
["tamo", "doggo", "doggo.name", "doggo.surname", "catto"]
  ^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^
Weight 0                Weight 1                Weight 2
```

Co-authored-by: Tamo <tamo@meilisearch.com>
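The nested-field rule from the PR description can be illustrated with a small sketch. It is not the engine's implementation: `weight_of` is a hypothetical helper, `Weight` is assumed to be `u16`, and the weight is assumed to be the position of the matching entry in the user-defined searchable attributes.

```rust
// Assumed stand-in for the crate's weight type.
type Weight = u16;

/// Hypothetical helper: returns the weight of a flattened field name, or
/// `None` when the field is not covered by any searchable attribute.
/// A field is covered when it equals the attribute or is nested under it
/// (for example `doggo.name` under `doggo`).
fn weight_of(field: &str, searchable: &[&str]) -> Option<Weight> {
    searchable
        .iter()
        .position(|attr| {
            field == *attr
                || field
                    .strip_prefix(*attr)
                    .map_or(false, |rest| rest.starts_with('.'))
        })
        .map(|pos| pos as Weight)
}
```

With `["tamo", "doggo", "catto"]` as the searchable attributes, `doggo`, `doggo.name`, and `doggo.surname` all resolve to weight 1, and `catto` resolves to weight 2. Checking for the `.` separator keeps an unrelated field such as `doggone` from inheriting the weight of `doggo`.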
Showing 27 changed files with 765 additions and 185 deletions.
@@ -0,0 +1,48 @@

```rust
//! The fieldids weights map is in charge of linking the searchable fields with their weights.

use std::collections::HashMap;

use serde::{Deserialize, Serialize};

use crate::{FieldId, FieldsIdsMap, Weight};

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct FieldidsWeightsMap {
    map: HashMap<FieldId, Weight>,
}

impl FieldidsWeightsMap {
    /// Insert a field id -> weight into the map.
    /// If the map did not have this key present, `None` is returned.
    /// If the map did have this key present, the value is updated, and the old value is returned.
    pub fn insert(&mut self, fid: FieldId, weight: Weight) -> Option<Weight> {
        self.map.insert(fid, weight)
    }

    /// Create the map from the fields ids maps.
    /// Should only be called in the case there are NO searchable attributes.
    /// All the fields will be inserted in the order of the fields ids map with a weight of 0.
    pub fn from_field_id_map_without_searchable(fid_map: &FieldsIdsMap) -> Self {
        FieldidsWeightsMap { map: fid_map.ids().map(|fid| (fid, 0)).collect() }
    }

    /// Removes a field id from the map, returning the associated weight previously in the map.
    pub fn remove(&mut self, fid: FieldId) -> Option<Weight> {
        self.map.remove(&fid)
    }

    /// Returns the weight corresponding to the key.
    pub fn weight(&self, fid: FieldId) -> Option<Weight> {
        self.map.get(&fid).copied()
    }

    /// Returns the highest weight contained in the map, if any.
    pub fn max_weight(&self) -> Option<Weight> {
        self.map.values().copied().max()
    }

    /// Returns an iterator visiting all field ids in arbitrary order.
    pub fn ids(&self) -> impl Iterator<Item = FieldId> + '_ {
        self.map.keys().copied()
    }
}
```
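To show how such a map keeps field IDs stable when searchable fields are swapped, here is a minimal standalone sketch. It re-declares a reduced copy of the struct so that it compiles on its own. The `FieldId` and `Weight` aliases are assumed to be `u16`, and `weights_for` is a hypothetical helper that is not part of the diff.

```rust
use std::collections::HashMap;

// Assumed stand-ins for the crate's own types; the real definitions may differ.
type FieldId = u16;
type Weight = u16;

/// Reduced copy of the struct from the diff, without the serde derives.
#[derive(Debug, Default)]
struct FieldidsWeightsMap {
    map: HashMap<FieldId, Weight>,
}

impl FieldidsWeightsMap {
    /// Inserts a weight for a field id, returning the previous weight if any.
    fn insert(&mut self, fid: FieldId, weight: Weight) -> Option<Weight> {
        self.map.insert(fid, weight)
    }

    /// Returns the weight stored for a field id.
    fn weight(&self, fid: FieldId) -> Option<Weight> {
        self.map.get(&fid).copied()
    }

    /// Returns the highest weight in the map, or `None` if it is empty.
    fn max_weight(&self) -> Option<Weight> {
        self.map.values().copied().max()
    }
}

/// Hypothetical helper: derives weights from the order of the searchable
/// attributes while leaving the name -> field id mapping untouched.
fn weights_for(searchable: &[&str], field_ids: &HashMap<&str, FieldId>) -> FieldidsWeightsMap {
    let mut weights = FieldidsWeightsMap::default();
    for (pos, name) in searchable.iter().enumerate() {
        // Names that are unknown to the field id map are simply skipped.
        if let Some(&fid) = field_ids.get(name) {
            weights.insert(fid, pos as Weight);
        }
    }
    weights
}
```

Calling `weights_for` with `["name", "color"]` and then with `["color", "name"]` against the same `field_ids` map produces two different weight maps, while the field IDs themselves never change. This is the property that lets a swap of searchable fields avoid touching the field ID map.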