Batch document additions and deletions together #3440
Comments
Thank you, @loiclec! In the future (not saying it will be needed), could we leverage that enhancement to provide an endpoint that accepts hybrid document ops, i.e. additions (update/replace) and deletions in the same HTTP request? I see it as a friction reducer when you receive a mixed batch of ops (from a message queue or a file system) and are forced to separate them by type before sending them to Meilisearch.
@gmourier Sure, it's technically possible :) The main challenge would be in making a good API for it.
Thank you @loiclec. It could be related to meilisearch/product#554
We will try to investigate this and maybe get it done during v1; no guarantee, however.
After an unexpected and hard-to-find/debug bug in the implementation, we decided to cancel this feature: #3667. We'll work on it again later.
Hello everyone 👋 We have just released the first RC (release candidate) of Meilisearch containing this fix!
docker run -it --rm -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:v1.3.0-rc.0
If you encounter any bugs, please report them here. 🎉 The official and stable release containing this change will be available on July 31st, 2023.
When someone sends many document addition tasks interspersed with document deletion tasks, meilisearch is currently forced to process these tasks serially. For example, if the following tasks are sent in this order:
Then they will be batched as follows:
This can cause significant indexing performance problems, as we rely on incremental indexing speed to keep up with the updates. If the updates are sent quicker than meilisearch can process them, then the task queue will keep growing bigger and bigger.
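To make the cost concrete, here is a minimal sketch of a scheduler that only merges consecutive tasks of the same kind into one batch. This is an assumed, simplified model for illustration, not meilisearch's actual scheduler; `TaskKind` and `batch_by_kind` are hypothetical names.

```rust
/// Hypothetical task kinds, for illustration only (not meilisearch's real types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskKind {
    Addition,
    Deletion,
}

/// Groups consecutive tasks of the same kind into one batch and returns
/// each batch as (kind, number of tasks in the batch).
/// Assumption: a batch can never mix additions and deletions.
fn batch_by_kind(tasks: &[TaskKind]) -> Vec<(TaskKind, usize)> {
    let mut batches: Vec<(TaskKind, usize)> = Vec::new();
    for &kind in tasks {
        // Extend the last batch when the kind matches it.
        if let Some((last, count)) = batches.last_mut() {
            if *last == kind {
                *count += 1;
                continue;
            }
        }
        // Otherwise a new batch has to start here.
        batches.push((kind, 1));
    }
    batches
}
```

Under this model, a strictly alternating sequence of additions and deletions degenerates into one batch per task, which is the serial processing described above.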
Ideally, we want to batch all of these tasks into:
Proposed solution
We could add a new function in milli which can accumulate document additions and deletions, respecting the order of the operations. The output of this function should be two things:
1. the documents to delete
2. a Transform or TransformOutput containing the documents to add

Then we can use the existing indexing functions to process (1) and (2) serially.
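A minimal sketch of such an accumulator, under stated assumptions: `DocumentOp`, `MergedOps` and `merge_ops` are hypothetical names, not milli's API; document bodies are simplified to strings; additions are assumed to have replace semantics; and the deletions are assumed to be applied before the additions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical operation type; real tasks carry full document payloads.
#[derive(Debug, Clone, PartialEq, Eq)]
enum DocumentOp {
    /// Add or replace the document with this id.
    Add { id: String, body: String },
    /// Delete the document with this id.
    Delete { id: String },
}

/// The two outputs: ids to delete, and documents to add.
/// The accumulation below assumes `to_delete` is applied first, then `to_add`.
#[derive(Debug, Default, PartialEq, Eq)]
struct MergedOps {
    to_delete: BTreeSet<String>,
    to_add: BTreeMap<String, String>,
}

/// Folds an ordered stream of operations into one deletion set and one addition set.
fn merge_ops(ops: impl IntoIterator<Item = DocumentOp>) -> MergedOps {
    let mut merged = MergedOps::default();
    for op in ops {
        match op {
            DocumentOp::Add { id, body } => {
                // A later addition replaces an earlier pending one for the same id.
                // The id may stay in `to_delete`: deleting first and adding after
                // still leaves the new document in place.
                merged.to_add.insert(id, body);
            }
            DocumentOp::Delete { id } => {
                // A deletion cancels any pending addition of the same id.
                merged.to_add.remove(&id);
                merged.to_delete.insert(id);
            }
        }
    }
    merged
}
```

With this shape, any interleaving of additions and deletions collapses into at most one deletion pass followed by one addition pass, regardless of how the tasks were ordered in the queue.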
Impacted team
This can impact @meilisearch/docs-team, in the parts of the docs that talk about auto-batching.