Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Limit the reindexing caused by updating settings when not needed #706

Merged
merged 7 commits into from Dec 1, 2022
Merged

Limit the reindexing caused by updating settings when not needed #706

merged 7 commits into from Dec 1, 2022

Conversation

GregoryConrad
Copy link
Contributor

@GregoryConrad GregoryConrad commented Nov 23, 2022

What does this PR do?

When updating index settings using update::Settings, sometimes a reindex of update::Settings is triggered when it doesn't need to be. This PR aims to prevent those unnecessary reindex calls.

For reference, here is a snippet from the current execute method in update::Settings:

// ...
if stop_words_updated
    || faceted_updated
    || synonyms_updated
    || searchable_updated
    || exact_attributes_updated
{
    self.reindex(&progress_callback, &should_abort, old_fields_ids_map)?;
}
  • faceted_updated - looks good as-is ✅
  • stop_words_updated - looks good as-is ✅
  • synonyms_updated - looks good as-is ✅
  • searchable_updated - fixed in this PR
  • exact_attributes_updated - fixed in this PR

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

@GregoryConrad
Copy link
Contributor Author

GregoryConrad commented Nov 24, 2022

For my logic in update_searchable's Setting::Reset match branch to work, I assume index::delete_all_searchable_fields returns true when reindexing needs to happen. I am not sure if that assumption is correct; thus, please verify that the below should return true when reindexing needs to happen:

pub(crate) fn delete_all_searchable_fields(&self, wtxn: &mut RwTxn) -> heed::Result<bool> {
    self.delete_searchable_fields(wtxn)?;
    self.delete_user_defined_searchable_fields(wtxn)
}

To clarify, I am not sure if this should actually return (for example):

pub(crate) fn delete_all_searchable_fields(&self, wtxn: &mut RwTxn) -> heed::Result<bool> {
    Ok(self.delete_searchable_fields(wtxn)? && self.delete_user_defined_searchable_fields(wtxn)?)
}

@GregoryConrad GregoryConrad marked this pull request as ready for review November 24, 2022 03:33
@GregoryConrad GregoryConrad changed the title Limit the reindexing caused by updating settings in certain cases Limit the reindexing caused by updating settings when not needed Nov 24, 2022
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your changes, it looks amazing indeed, it will speedup a lot of workflows where users were always sending the searchable attributes ❤️

milli/src/update/settings.rs Outdated Show resolved Hide resolved
milli/src/update/settings.rs Outdated Show resolved Hide resolved
@GregoryConrad GregoryConrad requested review from Kerollmops and removed request for ManyTheFish November 26, 2022 17:09
@GregoryConrad
Copy link
Contributor Author

GregoryConrad commented Nov 26, 2022

^^ Don't know why GitHub removed that request for review from @ManyTheFish. I only requested a re-review from @Kerollmops. I don't have the permission necessary to request a review either so sorry about that.

@Kerollmops Kerollmops added the no breaking The related changes are not breaking (DB nor API) label Nov 28, 2022
@Kerollmops
Copy link
Member

For my logic in update_searchable's Setting::Reset match branch to work, I assume index::delete_all_searchable_fields returns true when reindexing needs to happen. I am not sure if that assumption is correct; thus, please verify that the below should return true when reindexing needs to happen:

pub(crate) fn delete_all_searchable_fields(&self, wtxn: &mut RwTxn) -> heed::Result<bool> {
    self.delete_searchable_fields(wtxn)?;
    self.delete_user_defined_searchable_fields(wtxn)
}

To clarify, I am not sure if this should actually return (for example):

pub(crate) fn delete_all_searchable_fields(&self, wtxn: &mut RwTxn) -> heed::Result<bool> {
    Ok(self.delete_searchable_fields(wtxn)? && self.delete_user_defined_searchable_fields(wtxn)?)
}

The delete functions should be mapped on the LMDB mdb_del function and return true if the value was found in the database. However, ensure you don't use the latest code you published, as you'll skip the right part of the && if the left part returns false.

So the short answer is: yes, the delete_* methods must return true if the value existed and was effectively deleted.

Cases: whenever searchable_fields OR user_defined_searchable_fields is modified
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you've done looks good to me. I'll wait for you to answer @irevoire and would like to know what @irevoire thinks of merging this now or waiting for @loiclec to fix the issue of the duplicated and Missing key in the documents database bug. It would be safer to wait for the fix, but I am also quite happy if we could provide that to the users in v0.30.1.

Copy link
Member

@irevoire irevoire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it looks good to me as well.

I'm not sure we want to push this change in the v0.30.1 since it's not a bug fix, though 😱
What do you think @curquiza?
It looks pretty safe to me, but it's typically the kind of thing we said we would not do 😂

@curquiza
Copy link
Member

curquiza commented Nov 29, 2022

We don't want to integrate anything in v0.30.1 that is not a bug fix. However #707 has already been merged into main.
If we don't want #707 into v0.30.1 either, you can merge this PR into main, and we will do a specific branch to release the fixed milli version of v0.30.1

@curquiza
Copy link
Member

curquiza commented Nov 29, 2022

Discussed internally here: #707 is safe to be integrated into v0.30.1.
So don't merge this PR until v0.30.1 is released. I pass it as a draft to avoid any issues!

We will merge it right after v0.30.1 release, thanks @GregoryConrad! 😊

@curquiza curquiza marked this pull request as draft November 29, 2022 17:34
@GregoryConrad
Copy link
Contributor Author

GregoryConrad commented Nov 29, 2022

@curquiza Thanks for the update! Just as a heads up, please hold off on merging this until I get a chance to address this. I need to experiment with the index's current behavior to see if I might be able to throw in another optimization (or if it is good as-is).

Edit: my test went a lot faster than I thought I would. You're all set to merge at your convenience!

@curquiza curquiza marked this pull request as ready for review December 1, 2022 13:56
@curquiza
Copy link
Member

curquiza commented Dec 1, 2022

We can merge this PR to main finally, because other PRs have been merged into main
We will manage with git branch to do a custom release with only the needed commit

bors merge

@bors
Copy link
Contributor

bors bot commented Dec 1, 2022

@bors bors bot merged commit d3731dd into meilisearch:main Dec 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
no breaking The related changes are not breaking (DB nor API)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants