Create SettingsDiff structure and run extractions based on it #4480

Closed
6 of 7 tasks
ManyTheFish opened this issue Mar 12, 2024 · 1 comment
Labels
  • performance: Related to performance in terms of search/indexation speed or RAM/CPU/disk consumption
  • settings diff-indexing: Issues related to settings diff-indexing
  • v1.8.0: PRs/issues solved in v1.8.0, released on 2024-05-06
ManyTheFish commented Mar 12, 2024

Related product team resources: PRD (internal only)

⚠️ This issue depends on #4478 being implemented first.

Summary

This issue is a subset of the work implementing the settings diff-indexing enhancement.

Currently, Settings::execute triggers a full reindexing as soon as at least one setting change impacts the database.

Instead, we should create a SettingsDiff structure that computes the differences between the old and the new version of the Settings. The SettingsDiff is then passed through the whole indexing process, so that each part can react only to the changes that concern it.

Additionally, documents shouldn't be sent to the database writer when no document has been added, modified, or deleted.

Structure example

// Borrows both versions of the settings so they can be compared on demand.
struct SettingsDiff<'a> {
  before: &'a Settings,
  after: &'a Settings,
}

impl SettingsDiff<'_> {
  /// Returns true if any setting feeding the searchable pipeline has changed.
  fn process_searchable_pipeline(&self) -> bool {
    let (before, after) = (self.before, self.after);
    before.searchable_attributes != after.searchable_attributes
    || before.stop_words != after.stop_words
    || before.non_separator_tokens != after.non_separator_tokens // ⚠️ This can be a big vector of strings
    || before.separator_tokens != after.separator_tokens // ⚠️ This can be a big vector of strings
    || before.dictionary != after.dictionary // ⚠️ This can be a big vector of strings
    || before.disable_on_words != after.disable_on_words
    || before.disable_on_attributes != after.disable_on_attributes
  }
}
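
One possible way to limit the cost flagged by the ⚠️ comments is to run the comparisons once, when the diff is built, and cache the result as a boolean so that later stages never compare the large vectors again. The sketch below is only an illustration of that idea, not the implementation that was merged. The reduced `Settings` struct, its field types, and the `searchable_changed` field are assumptions made so the example compiles on its own.

```rust
/// Hypothetical, reduced stand-in for the real settings type.
/// Field names follow the example above; the types are assumptions.
#[derive(Debug, Clone, Default, PartialEq)]
struct Settings {
    searchable_attributes: Option<Vec<String>>,
    stop_words: Vec<String>,
    non_separator_tokens: Vec<String>,
    separator_tokens: Vec<String>,
    dictionary: Vec<String>,
    disable_on_words: Vec<String>,
    disable_on_attributes: Vec<String>,
}

/// Variant of SettingsDiff that stores the outcome of the comparison
/// instead of borrowing both settings (assumed design, for illustration).
#[derive(Debug, Clone, Copy)]
struct SettingsDiff {
    searchable_changed: bool,
}

impl SettingsDiff {
    /// Compares the two versions once; `||` short-circuits on the first difference.
    fn new(before: &Settings, after: &Settings) -> Self {
        let searchable_changed = before.searchable_attributes != after.searchable_attributes
            || before.stop_words != after.stop_words
            || before.non_separator_tokens != after.non_separator_tokens
            || before.separator_tokens != after.separator_tokens
            || before.dictionary != after.dictionary
            || before.disable_on_words != after.disable_on_words
            || before.disable_on_attributes != after.disable_on_attributes;
        Self { searchable_changed }
    }

    /// Cheap to call from any indexing stage: it only reads the cached flag.
    fn process_searchable_pipeline(&self) -> bool {
        self.searchable_changed
    }
}
```

The trade-off is that the diff no longer gives access to the old and new values themselves, so a stage that needs them would still have to receive the settings separately.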

TODO

  • Ensure Meilisearch has tests triggering the Settings::execute methods
  • Create the SettingsDiff structure
  • Make the indexer react to the SettingsDiff
    • Process the searchable pipeline only if needed
    • Process the facet pipeline only if needed
    • Process the vector pipeline only if needed
    • Skip documents database writing if only the settings have been changed
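
The "react to the SettingsDiff" items above can be sketched as a small planning step that decides which pipelines run and whether documents are written. Everything here is hypothetical: the three-flag `SettingsDiff`, the `IndexingPlan` struct, and the `plan_indexing` function are names invented for this sketch, and the rule that a document change triggers every pipeline is an assumption rather than something stated in this issue.

```rust
/// Hypothetical summary of which groups of settings changed.
#[derive(Debug, Clone, Copy, Default)]
struct SettingsDiff {
    searchable_changed: bool,
    facets_changed: bool,
    vectors_changed: bool,
}

/// Hypothetical description of the work the indexer has to do.
#[derive(Debug, Default, PartialEq)]
struct IndexingPlan {
    run_searchable: bool,
    run_facets: bool,
    run_vectors: bool,
    write_documents: bool,
}

/// Decides what to run from the settings diff and from whether the batch
/// adds, modifies, or deletes documents.
fn plan_indexing(diff: &SettingsDiff, documents_changed: bool) -> IndexingPlan {
    IndexingPlan {
        // Assumption: changed documents must go through every pipeline.
        run_searchable: documents_changed || diff.searchable_changed,
        run_facets: documents_changed || diff.facets_changed,
        run_vectors: documents_changed || diff.vectors_changed,
        // Documents reach the database writer only if they actually changed.
        write_documents: documents_changed,
    }
}
```

With a settings-only update that touches searchable settings, `plan_indexing(&diff, false)` would enable the searchable pipeline alone and leave `write_documents` false.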

ManyTheFish (Member, Author) commented

fixed by #4504

curquiza added this to the v1.8.0 milestone on Apr 18, 2024
meili-bot added the v1.8.0 label (PRs/issues solved in v1.8.0, released on 2024-05-06) on May 6, 2024