questions on milvus connection, load collection and search #40390

ranjith502 · 2025-03-05T16:42:20Z

Hello Team,

I have set up a Milvus server using Docker, which is running on a Virtual Machine (VM). From Databricks, I am performing data insertion and search operations.

Questions on Collection Loading & Indexing
I understand that before performing a search, we need to load the collection into memory. However, I have a question regarding index storage:

I am using HNSW, which is an in-memory index. If the index is already stored in memory, why do we still need to load the entire collection into memory before searching?

Fluctuations in Execution Time
I have observed significant fluctuations in execution time across different operations. Below are the recorded times for 10 queries:

Connection Time (ms):
[6, 77, 6, 83, 107, 89, 83, 7, 5, 6]
Load Collection Time (ms):
[107, 12, 11, 11, 97, 11, 15, 98, 100, 10]
Search Time (ms):
[13, 87, 12, 88, 11, 10, 82, 11, 12, 10]

I would like to understand:

What could be the main reasons for these fluctuations in execution time?
Is this expected behavior due to system resource allocation, network latency, or some internal optimizations in Milvus?
Are there any best practices to stabilize query and load times?

Memory Management Questions
Once a collection is loaded into memory, how long does it remain in memory? (e.g., 5 minutes, 10 minutes, or indefinitely until manually released?)
Do we need to manually release the collection after every query, or does Milvus handle memory management automatically?

yhmo · 2025-03-06T02:41:36Z

Collaborator

Index data is stored in S3/MinIO. load_collection() reads the index data from S3/MinIO into the querynode's memory.
If you don't call load_collection(), the index data stays in S3/MinIO and is not searchable.

I believe the fluctuation in "Connection Time" is mainly caused by the network.
"Load Collection" reads data from S3/MinIO, so its latency mainly depends on how much data needs to be read and on the bandwidth between the querynode and S3/MinIO.
Typically, for a collection with millions of indexed vectors, "Search Time" is about 10 to 20 ms, so I think the fluctuation there is also caused by the network.
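
To make the fluctuation easier to see, the samples reported in the question can be summarized as in the sketch below. The 50 ms cut-off is an assumption chosen only to split the samples into two groups; it is not a Milvus setting, and the grouping does not by itself prove the network is the cause.

```python
from statistics import median

# Timing samples (ms) copied from the question above.
samples = {
    "connection": [6, 77, 6, 83, 107, 89, 83, 7, 5, 6],
    "load_collection": [107, 12, 11, 11, 97, 11, 15, 98, 100, 10],
    "search": [13, 87, 12, 88, 11, 10, 82, 11, 12, 10],
}

# Assumption: an arbitrary cut-off used only to separate fast from slow samples.
SLOW_THRESHOLD_MS = 50


def summarize(times_ms, threshold_ms=SLOW_THRESHOLD_MS):
    """Return the median and the fast/slow split for one list of timings."""
    fast = [t for t in times_ms if t < threshold_ms]
    slow = [t for t in times_ms if t >= threshold_ms]
    return {
        "median_ms": median(times_ms),
        "fast_count": len(fast),
        "slow_count": len(slow),
        "fast_max_ms": max(fast) if fast else None,
        "slow_min_ms": min(slow) if slow else None,
    }


summary = {name: summarize(times) for name, times in samples.items()}
for name, stats in summary.items():
    print(name, stats)
```

With that cut-off, every fast sample is at most 15 ms and every slow one is at least 77 ms, so each operation falls into two distinct groups rather than varying smoothly.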

2 replies

ranjith502 Mar 6, 2025
Author

Thanks for the reply.
What about the fluctuations in search time and connection time?

Is it advisable to release the collection after every query?
Once a collection is loaded, how long (5 min, 10 min) does it stay in memory? Is there a parameter to configure this?

yhmo Mar 6, 2025
Collaborator

There is no need to load/release for each query/search.
Declare a MilvusClient.
Call load_collection() once before the searches/queries, and call release_collection() once after all of them are done.

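from pymilvus import MilvusClient

# The URI and collection name are placeholders; search/query arguments are omitted.
client = MilvusClient(uri="http://localhost:19530")
client.load_collection(collection_name="my_collection")     # once, before searching
client.search()
client.search()
client.query()
# ... more search/query calls ...
client.search()
client.release_collection(collection_name="my_collection")  # once, when all work is done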
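
The same load-once, release-once pattern can be wrapped in a small helper. This is a minimal sketch, not an official recipe: `search_many` and `RecordingClient` are hypothetical names introduced here, and `RecordingClient` is only a stand-in so the call order can be checked without a running Milvus server.

```python
def search_many(client, collection_name, query_vectors, limit=5):
    """Load the collection once, run every search, then release once."""
    client.load_collection(collection_name=collection_name)
    try:
        results = []
        for vector in query_vectors:
            hits = client.search(
                collection_name=collection_name,
                data=[vector],
                limit=limit,
            )
            results.append(hits)
        return results
    finally:
        # Runs even if a search raises, so the collection is not left loaded.
        client.release_collection(collection_name=collection_name)


class RecordingClient:
    """Hypothetical stand-in that records calls instead of talking to Milvus."""

    def __init__(self):
        self.calls = []

    def load_collection(self, collection_name):
        self.calls.append("load")

    def search(self, collection_name, data, limit):
        self.calls.append("search")
        return []

    def release_collection(self, collection_name):
        self.calls.append("release")


fake = RecordingClient()
search_many(fake, "demo_collection", [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
print(fake.calls)
```

Against a real server, the helper would be called with a `MilvusClient` instance (for example `MilvusClient(uri="http://localhost:19530")`, where the URI is a placeholder) in place of the stand-in. Releasing in `finally` suits a batch job; a long-running service that searches continuously would more likely keep the collection loaded and skip the release.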

A tool to estimate the in-memory size: https://milvus.io/tools/sizing
