
Poor performance with scaling #4572

Open
rrajp opened this issue Mar 28, 2024 · 9 comments
rrajp commented Mar 28, 2024

How to reproduce this bug?

Ingested 18 million objects
Weaviate version 1.24.4 running on Docker
Persistent data size ~180 GB on disk
Instance configuration: 256 GB RAM, 128 cores

What is the expected behavior?

Average query latency should be < 1 sec

What is the actual behavior?

Every new query takes anywhere between 5 and 15 seconds.
Running the exact same query a second time still takes 3-4 seconds.

Supporting information

I am using the DB with its default configuration.

I first hit this issue on Weaviate 1.23.3 and was advised on the forum to upgrade. Migrating all the data took 24 hours, and the situation is almost the same even after the migration.

Current RAM usage is 120 GB.

I am using a custom vectorizer (XLM-RoBERTa, 1024 dimensions).

Server Version

1.24.4


@rrajp rrajp added the bug label Mar 28, 2024
@rthiiyer82

@rrajp : Thank you for reporting the issue.

Could you provide more information on the type of query being used? Are you able to share a code snippet here?


scd10 commented Apr 12, 2024

Same problem here.
I am running 1.24.8 on Docker with 26 million objects. The machine has 80 cores and 512 GB of memory.
After restarting Docker, memory usage keeps climbing for a long time, 20 minutes or more, finally settling at about 260 GB.
Every new query fails with "weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message Deadline Exceeded." If I retry the same query after a while, it returns results very fast, as if served from a cache.


rrajp commented Apr 12, 2024

I did a bit of experimenting and found that the main cause of the latency is hybrid search. I am using hybrid search with BM25 (alpha=0.5). A pure vector query usually returns in 10 ms. Also, my query is not global but applies to roughly 5,000 data points (restricted with a where filter on document id).
It seems like BM25 is searching across the entire dataset and slowing everything down.
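For context on the alpha parameter: it blends the keyword and vector result sets after each is scored. A minimal pure-Python sketch of this kind of score fusion (illustrative only; the normalization and doc IDs here are assumptions, not Weaviate's exact internals):

```python
def fuse_hybrid(bm25_scores, vector_scores, alpha=0.5):
    """Blend min-max-normalized BM25 and vector scores; alpha weights the vector side."""
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {k: (v - lo) / span for k, v in scores.items()}

    b, v = normalize(bm25_scores), normalize(vector_scores)
    fused = {k: alpha * v.get(k, 0.0) + (1 - alpha) * b.get(k, 0.0)
             for k in set(b) | set(v)}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# doc "a" ranks first because it scores highest in both result sets
ranked = fuse_hybrid({"a": 2.0, "b": 1.0}, {"a": 0.9, "c": 0.5}, alpha=0.5)
```

With alpha=0.5 both sides contribute equally, which is why a slow BM25 leg drags down the whole hybrid query even when the vector leg alone returns in milliseconds.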


scd10 commented Apr 12, 2024

I've tried both vector search and hybrid search; both time out. I was using 1.23.10 earlier, but it was very slow: hybrid search normally took 10 seconds or more, so I tried the newest version and get timeouts instead. I found some suggestions about raising the timeout in the Python client, but that won't fix the slow responses.

@pommedeterresautee

FWIW, we have the same problems with BM25 search: it is very slow on a moderately large index, and it is clearly a second-class citizen (no stemmer, etc.). We have simply accepted that we can't use it outside of dataset creation and other offline scenarios.


amourao commented Apr 17, 2024

@pommedeterresautee @scd10 @rrajp
I'm sorry you are having issues with BM25 and hybrid search performance.
This is a topic we are actively working on.
I can't promise a timeline, but we should see considerable performance improvements over the next major releases (1.26+).

Until then, there are a couple of tips that can help:

  • The first query timing out and the second merely being slow points to BM25 indices being loaded into the cache on first read. If you are using macOS, Linux, or another Unix-like system, you can try a tool like vmtouch to ensure that the BM25 data is in memory before it has to be queried. Ideally, you'd cache the _searchable folders for the properties you are searching (you can find those folders inside the LSM folder for your index).
  • Depending on how much memory you have, consider testing the environment variable PERSISTENCE_LSM_ACCESS_STRATEGY set to pread or mmap (it is set to mmap by default).
  • By default, Weaviate indexes and searches all textual fields. To improve query-time performance, you can search only specific fields.
  • Removing stopwords can also greatly help with performance.
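The vmtouch tip above can also be approximated in plain Python by reading the relevant segment files once, so the OS page cache already holds their blocks before the first query. A hedged sketch; the `_searchable` glob pattern is an assumption about the directory layout, so check your actual LSM folder paths:

```python
from pathlib import Path

def warm_page_cache(root, pattern="**/*_searchable*/*"):
    """Read every matching file sequentially so the OS page cache
    is populated before the first real query hits the index."""
    total = 0
    for path in Path(root).glob(pattern):
        if path.is_file():
            with open(path, "rb") as f:
                while chunk := f.read(1 << 20):  # 1 MiB at a time
                    total += len(chunk)
    return total  # bytes touched
```

Running this once after a restart (against the index's LSM directory) plays the same role as `vmtouch -t` on the `_searchable` folders.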

About the scale of the data, I have a couple of questions that may help me give further advice:

  • Do you know how many properties you have and how many words there are per property?
  • What is the average query size (in number of words)?

@pommedeterresautee

Thanks for your answer. Our index holds 40M docs, each with 2 fields indexed for BM25 search.
One field is usually short (< 20 tokens) and the other a few hundred tokens at most. Stop words are removed and everything is stemmed (lower vocabulary size in theory, though we don't count vocabulary size). Queries are short questions, usually < 20 tokens after stop-word removal. Each query takes between 0.5 s and 1 s, which is ... unexpected (we are used to ES query times).


scd10 commented Apr 18, 2024

@amourao Thanks for the reply. For your questions:

  • Our schema has about 20 properties; most of them are metadata and set to "indexFilterable", and only one property is set to "indexSearchable". The indexSearchable property is in Chinese, so we tokenize it beforehand and separate the words with whitespace; Weaviate can then tokenize the property with "whitespace" and build the index. Most values of this property have 100~200 separated tokens.
  • As for the query, we also segment it and send the whitespace-separated query, together with an embedding vector, to search in Weaviate. Queries are normally a sentence, about 10~30 tokens.
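The pre-segmentation step described above can be sketched as follows. The `segment` function here is a stand-in for a real Chinese word segmenter (e.g. jieba, not shown); this demo just splits on existing whitespace:

```python
def segment(text):
    """Placeholder segmenter: a real pipeline would call a Chinese
    word segmenter here; this stand-in splits on whitespace."""
    return text.split()

def to_whitespace_tokens(text):
    """Join pre-segmented tokens with single spaces so a 'whitespace'
    tokenization setting can index each token as a separate word."""
    return " ".join(segment(text))
```

The same function is applied both to the stored property at ingest time and to the query string, so the index and the query share one tokenization.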


amourao commented Apr 19, 2024

OK, both your setups are well thought out; the only thing that jumps to mind is the slightly large queries with 10~30 tokens.
We will keep that type of scenario in mind when improving performance, with the goal of getting closer to < 100 ms in most scenarios.

Keyword search in Weaviate is quite disk-heavy and relies on OS memory caching.
This can result in scenarios like the one @scd10 described, with very different cold and hot query performance.
I've used vmtouch in a similar cold-index scenario before, to great effect.
An alternative is to run a set of queries to warm up the cache.
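The query-based warm-up can be sketched generically; `run_query` below stands in for whatever client call issues a real search (the function and the query list are assumptions, not a Weaviate API):

```python
from concurrent.futures import ThreadPoolExecutor

def warm_up(run_query, queries, workers=4):
    """Replay representative queries concurrently at startup so the
    first real user queries hit warm OS and index caches."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_query, queries))
```

A handful of representative queries covering the searched properties is usually enough, since they touch the same `_searchable` segments that real traffic will read.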
