Not-expected loss of performance during indexation (between v0.22.0 and v0.23.0) #1804
Comments
We've upgraded to v0.23, and the time it takes to index has increased significantly. Since we had to re-index everything (we upgraded from v0.20), it takes hours or days to index 100K records. The server load also increases to the point where it can no longer handle anything and has to be rebooted. We also tried bigger hardware on AWS: a server with 16 or 32 cores and 128 GB of RAM, and the same thing happens. It just runs for a few minutes, and then the load increases so much that it can't do anything else.
Hello @VinceG! It is not normal that 100K records take that amount of time. We always recommend pushing your documents to MeiliSearch in batches and not one by one. The maximum payload size of MeiliSearch is 100 MB and can optionally be increased, which means most datasets can be pushed with one call to the API. It is also really weird that you have this issue on such a big machine: we have succeeded in adding millions of documents using a machine with 8 GB of RAM.
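As an illustration of the batching advice above, here is a minimal sketch (assuming Python; `batch_documents` is a hypothetical helper, not part of any Meilisearch SDK) that splits a document list into batches whose serialized JSON stays under the payload limit:

```python
import json

def batch_documents(docs, max_bytes=100_000_000):
    """Split docs into batches whose serialized JSON stays under max_bytes.

    max_bytes defaults to the 100 MB payload limit mentioned above; adjust it
    if the limit was raised on the server.
    """
    batches, current, current_size = [], [], 2  # 2 bytes for the "[" and "]"
    for doc in docs:
        # +2 accounts for the ", " separator in the serialized array
        size = len(json.dumps(doc).encode("utf-8")) + 2
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 2
        current.append(doc)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch could then be sent to the index's documents endpoint in a single API call.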
@curquiza We have more, but right now we just tried to index 100K and it's still going. We use Laravel with Laravel Scout, and it seems like it does batching: since we only have 550 updates enqueued, it does send over 200 records at a time, but I don't think it reaches 100 MB. Any idea what I can do to troubleshoot? Thank you.
I have a document set of around 10 million; this is for e-commerce, where each document contains a title, a description, and then a handful of other short attributes like brand, merchant, and category. The titles and descriptions are no more than a few hundred characters. First I tried on a server with 16 GB of RAM, with a chunk size of 20,000 and a payload size limit of "256 Mb". That took about 8 hours to get through 4 million documents, then about an additional 24 hours to add another 2 million records to reach 6 million, before I killed the process. Now I'm trying on a server with 128 GB of RAM, with a chunk size of 10,000,000 and a payload size limit of "30 Gb". It has been hanging on indexing for about 8 hours now. My current plan is to lower the chunk size until I find something that works, and then hope it can finish in a reasonable amount of time. Any suggestions are welcome. I'm using Laravel with Scout and Postgres. EDIT: 100,000 didn't work. 50,000 worked and gave the best performance so far: 6 million records in about 8 hours.
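A back-of-the-envelope way to bound the chunk size described above is to divide the payload limit by the average serialized document size, with some headroom. This is a sketch, not anything from Scout or Meilisearch; `max_chunk_size` is a hypothetical helper:

```python
def max_chunk_size(avg_doc_bytes: int, payload_limit_bytes: int,
                   safety: float = 0.8) -> int:
    """Rough upper bound on the chunk size so one batch stays under the
    payload limit. The safety factor leaves headroom for JSON overhead and
    above-average documents."""
    return int(payload_limit_bytes * safety / avg_doc_bytes)

# e.g. ~1 KB e-commerce documents against a "256 Mb" limit:
print(max_chunk_size(1_000, 256_000_000))  # 204800
```

By this estimate, a chunk size of 10,000,000 would vastly exceed even a 30 GB limit for documents of this shape, while values in the tens of thousands fit comfortably.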
Thanks for transferring this here, @curquiza. Looking at the comments above, there seem to be multiple discussions ongoing. I think the primary goal here is preventing regressions, i.e. making sure that in newer versions things stay at least as fast as they were in an older version. Here's an overview from my data. It's not really comparable except within a row, due to the specific machines, OS, etc., and in general due to the small sample size.
So, there's a clear jump between 0.20 and 0.21, especially on small datasets. OTOH, the service became much more responsive during indexing in 0.21+. On Linux, Meilisearch is getting faster and faster, but on Docker for Mac (which is actually Linux running in a VM) it gets worse from 0.22 to 0.23, at least for rather small datasets. The observation which led to this ticket is the jump from 0.22 to 0.23, which I observe in Docker for Mac but not on Linux. Maybe it's related to Docker for Mac's poor disk I/O performance? Maybe it's too little RAM or even swapping (Docker Desktop is configured with 6 GB), or too few cores (5 configured), whereas the same test on the Linux host has plenty of RAM and 12 cores.
I have the same problem here, and the difference is huge... Inserting 10k documents: 0.5 s versus 43.6 s.
Hello @alexchuin, do you have the problem with Docker for Mac too? Which dataset do you use? Hey @bb, do you run your Docker on a Mac M1? (I don't manage to reproduce your bug on my Intel.) Which dataset do you use? Thanks to both of you for the help 😃
@ManyTheFish
@ManyTheFish no, I have an old Intel MacBook Pro. It's a private dataset which I'd need to clean up or anonymize before sharing.
Hey @alexchuin @bb,
@ManyTheFish for me it's the uploading time (when I POST 10k documents with the PHP API).
FWIW, at least for us, it was the queued indexing updates. On an empty index, it took less than a second for the first few hundred records; as we sent more documents to index (500 or 1,000 per batch), each update took longer than the previous one, something like 1 s, then 1.5 s, and so on. The duration kept increasing with each update. Also, since we send updates to delete or update individual records, those take a few seconds as well, for just one document. And to be clear, I was testing with an 8-core machine with 16 GB of RAM as well as a 92-core machine with 756 GB, and it was exactly the same. The load was always high if we sent multiple batches; it seems like maybe nginx was the bottleneck, or something was going on when we sent multiple updates using queue workers. If I ran the batch updates synchronously, it was slow but the load didn't go up. Hope that helps.
I just provided @ManyTheFish with my test dataset and my settings via Slack. The test is a fresh db folder, Meilisearch running in Docker for Mac and three calls:
In the last runs comparing 0.22.0 vs 0.23.1 on this specific dataset, I had 14 times slower indexing (1.6 s → 22.7 s) and 3x slower uploads (5 s → 17 s). The indexing time can be observed by looking at the status updates (note: the updateIds are higher for 0.22.0 because I had made the same settings call multiple times before).

0.22.0 status updates:

```json
{
  "status": "processed",
  "updateId": 5,
  "type": {
    "name": "ClearAll"
  },
  "duration": 0.015,
  "enqueuedAt": "2021-10-13T20:24:41.363020307Z",
  "processedAt": "2021-10-13T20:24:41.386188054Z"
},
{
  "status": "processed",
  "updateId": 6,
  "type": {
    "name": "DocumentsAddition",
    "number": 5942
  },
  "duration": 1.605,
  "enqueuedAt": "2021-10-13T20:24:46.195683759Z",
  "processedAt": "2021-10-13T20:24:47.810306346Z"
}
```

0.23.1 status updates:

```json
{
  "status": "processed",
  "updateId": 1,
  "type": {
    "name": "ClearAll"
  },
  "duration": 0.005,
  "enqueuedAt": "2021-10-13T20:26:25.268266946Z",
  "processedAt": "2021-10-13T20:26:25.277046910Z"
},
{
  "status": "processed",
  "updateId": 2,
  "type": {
    "name": "DocumentsAddition",
    "number": 5942
  },
  "duration": 22.701999999999998,
  "enqueuedAt": "2021-10-13T20:26:42.306488448Z",
  "processedAt": "2021-10-13T20:27:05.022123113Z"
}
```

By the way, the Ruby script's HTTP call to add_documents timed out in 0.23 but not in 0.22. My timeout is configured to be 10 s.
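The "14 times slower" figure follows directly from the `duration` fields of the two DocumentsAddition updates; plain arithmetic, nothing Meilisearch-specific:

```python
# Durations reported by the DocumentsAddition status updates (5942 documents).
duration_v0_22_0 = 1.605
duration_v0_23_1 = 22.701999999999998

slowdown = duration_v0_23_1 / duration_v0_22_0
print(f"indexing is {slowdown:.1f}x slower on 0.23.1")  # roughly 14.1x
```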
@VinceG I also observed consecutive updates on a non-empty DB becoming slower; however, that was already the case in earlier versions and AFAIK not different between v0.22 and v0.23. You should really try a bigger batch size. For me, it's no problem to send 1 million documents at once... this is a giant payload and takes several minutes. If you need to send smaller batches, you
The reason for the slowdown in 0.23 on Docker for Mac is the way bind mounts are handled. The regression I observed appears only when running in Docker and only when using a bind mount, i.e. a local path (as opposed to a named volume). This compose.yaml is slow (but wasn't slow yet in 0.22):

```yaml
services:
  meilisearch:
    restart: "no"
    image: getmeili/meilisearch:v0.23.1
    volumes:
      - ./data-meili:/data.ms
    environment:
      MEILI_ENV: development
    ports:
      - 127.0.0.1:7700:7700
```

while this is fast:

```yaml
services:
  meilisearch:
    restart: "no"
    image: getmeili/meilisearch:v0.23.1
    volumes:
      - meili:/data.ms
    environment:
      MEILI_ENV: development
    ports:
      - 127.0.0.1:7700:7700

volumes:
  meili:
```

Not specifying any volume is also fast. On Docker's side, this is because Docker for Mac tries to keep the container's files and the host file system in sync, which causes a lot of copying. So I guess 0.22 just kept the files open, while 0.23 opens and closes the files, which makes Docker for Mac very busy. @ManyTheFish is already working on a fix and can probably explain more.
#1824 fixes part of the issue: the slowdown when running Docker with a bind-mounted volume.
We have just merged some optimizations that should make ingestion time more acceptable in #1847. This is not gonna be as fast as before, but it is a serious improvement nonetheless. |
The same observation with 0.24.0 and a DigitalOcean droplet. The first new documents take less than a second; batch inserts take ~3 seconds. When we have more data, like 100+, every next insert takes more and more time. For now, it is more than 1 minute :( Any updates on this? In this situation we will be forced to change our search driver, because users can't wait minutes to see their records. There is no difference between the 2- and 4-CPU DO droplets.
Hello everyone reading this issue. The issue regarding the loss of performance between v0.22.0 and v0.23.0 has been fixed. Since it's the specific purpose of this issue, I close it. However, for anyone having issues in general with indexation performance, please, refer to this discussion. Your feedback will be really appreciated 🙂 |
Following: #1798
Seems like there is a loss of performance during indexation between v0.22.0 and v0.23.0
We need to investigate and fix this
@MarinPostma already started a first investigation about the part he updated between v0.22.0 and v0.23.0: #1798 (comment)