Make batches for the indexation #9

Closed
mdubus opened this issue Jun 23, 2021 · 2 comments
Labels
good first issue Good for newcomers

Comments

@mdubus
Member

mdubus commented Jun 23, 2021

Split the indexation into batches of 1000 documents (by default).

@mdubus mdubus mentioned this issue Jun 23, 2021
@bidoubiwa
Contributor

Currently, the plugin fetches all the data to be indexed and sends it to MeiliSearch in one big batch:

// Index data to MeiliSearch
const { updateId } = await index.addDocuments(transformedData)

When the dataset is very large, we want to avoid sending a payload that is too big for the server hosting MeiliSearch. The best solution would be to split the documents into smaller chunks before sending them.

For example, if `transformedData` has 3500 documents, the plugin should send three chunks of 1000 documents and one chunk of 500 documents.
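
The chunking described above could look roughly like the following. This is only a minimal sketch, not the plugin's actual implementation: `chunkDocuments` and `addDocumentsBatched` are hypothetical helper names, and `index` is assumed to be any object exposing the `addDocuments` method used in the snippet above.

```javascript
// Hypothetical helper (not part of the plugin): split an array into
// consecutive chunks of at most `batchSize` items.
function chunkDocuments(documents, batchSize = 1000) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer')
  }
  const batches = []
  for (let start = 0; start < documents.length; start += batchSize) {
    batches.push(documents.slice(start, start + batchSize))
  }
  return batches
}

// Hypothetical wrapper: sends the chunks one after another.
// Assumption: `index.addDocuments` returns a promise, as in the snippet above.
async function addDocumentsBatched(index, documents, batchSize = 1000) {
  const results = []
  for (const batch of chunkDocuments(documents, batchSize)) {
    results.push(await index.addDocuments(batch))
  }
  return results
}

// The 3500-document example from this comment.
const transformedData = Array.from({ length: 3500 }, (_, id) => ({ id }))
const sizes = chunkDocuments(transformedData).map(batch => batch.length)
console.log(sizes) // [ 1000, 1000, 1000, 500 ]
```

Sending the chunks sequentially keeps each request small; whether the requests should instead be sent concurrently is a separate design choice that this sketch does not address.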

bors bot added a commit that referenced this issue Oct 26, 2021
54: Implemented adding to index in batches (#9) r=bidoubiwa a=TommasoAmici

Hello, please consider this a draft. I opened the MR so others can see I'm working on this.

I've started implementing the changes required to add documents in batches (#9).

`index.addDocumentsInBatches` appears to fail silently in some cases (e.g. 'Wrong transformer', 'Document has no id'), whereas, judging by the existing tests, `index.addDocuments` used to throw some errors.

I will investigate the behavior of `index.addDocumentsInBatches` further to see whether there are parameters to tweak; otherwise, it looks like some logic needs to be added to the plugin to handle these cases.

Co-authored-by: Tommaso Amici <me@tommasoamici.com>
@mdubus
Member Author

mdubus commented Dec 7, 2021

Closing as this was done with #54

@mdubus mdubus closed this as completed Dec 7, 2021