How to config cache size? #27715
-
The more vectors I add to the database, the more the RAM usage increases. In the configuration file, how can I limit the memory size (e.g. 2 GB) at runtime, so that once the data has filled 2 GB of RAM, all the remaining data is saved to disk? Or is there somewhere else I can modify this?
-
When you continually insert vectors into milvus:
So, as you insert vectors, the querynode's memory increases while the datanode's memory stays at a low level. That is to say, "high memory usage" has nothing to do with "data persistence". If you don't need to search the collection, just call collection.release() to free the memory. But once you have called collection.load(), you must ensure the querynode's memory can hold the entire collection, otherwise you will not be able to search.

queryNode.cacheSize cannot save memory. As we know, the querynode reads data from s3/minio into its own memory. If you call collection.release() and then call collection.load() again, it will read from s3/minio again. Reading data from s3/minio is time consuming, so queryNode.cacheSize allows the querynode to download the files to the local disk. The next time you call collection.load(), the querynode can read the data from the local disk, which is much faster than reading from s3/minio. So, queryNode.cacheSize is "the max size of data to cache on the local disk".
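To make the role of queryNode.cacheSize concrete, here is a minimal sketch of the idea described above: a remote fetch (s3/minio) is slow, so downloaded segment files are kept on local disk up to a size limit and served from there on the next load. This is not Milvus code; the class `SegmentDiskCache` and its method names are hypothetical, purely to illustrate the behavior.

```python
import os

class SegmentDiskCache:
    """Toy model of queryNode.cacheSize: segment files fetched from
    object storage are kept on local disk (up to max_bytes) so a later
    collection.load() can read them locally instead of from s3/minio."""

    def __init__(self, cache_dir, max_bytes):
        self.cache_dir = cache_dir
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.remote_reads = 0  # counts slow trips back to s3/minio

    def _fetch_from_remote(self, name, data):
        # Stand-in for a slow s3/minio download.
        self.remote_reads += 1
        return data

    def load_segment(self, name, remote_data):
        path = os.path.join(self.cache_dir, name)
        if os.path.exists(path):
            # Cache hit: read from local disk, no remote trip.
            with open(path, "rb") as f:
                return f.read()
        data = self._fetch_from_remote(name, remote_data)
        if self.used_bytes + len(data) <= self.max_bytes:
            # Only persist locally while under the size limit.
            with open(path, "wb") as f:
                f.write(data)
            self.used_bytes += len(data)
        return data
```

Note the key point from the reply: this cache saves reload time, not memory. The segment data still ends up in the querynode's RAM either way; the limit only bounds how much is mirrored on local disk.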
-
Hello @yhmo, I am using version 2.3.12, and this is what I understood:
Is this correct?
-
Thank you so much for your answer.
Milvus v1.x can swap data between memory and disk. If the data size is larger than the cache size, it can load/release segments one by one to search. Once the cache is full, it releases a segment and reads the next segment from the hard disk to search. Since reading data from a hard disk is much slower than from memory, the search performance is very poor. Only when the data size is less than the cache size is the search performance at its best.
Milvus v2.x doesn't support swapping data between memory and disk. It requires that the target collection be fully loaded into memory before searching.
The milvus v1.x C++ sdk is here https://github.com/milvus-io/milvus/tree/1.1/sdk, it is inside th…