[Bug]: Milvus search blocks permanently #22070
Comments
Complete log file
/assign @congqixia
What caused this problem, and how can I fix it? Restarting the Milvus server did not resolve it; I had to reinstall Milvus, but the same query still hangs, and I can reproduce the problem 100% of the time. (Searching multiple vectors in one query is enough to make Milvus hang.) This is a very serious bug.
@ponponon
/assign @ponponon
What is nq? I am using pymilvus; which pymilvus parameter does nq correspond to?
nq means the number of vectors you are searching in one search request.
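For illustration, a minimal pymilvus sketch of where nq comes from: it is simply the number of query vectors passed to one `search()` call. The connection details, collection name, vector dimension, and the `embedding` field name below are placeholders, not taken from this issue.

```python
from pymilvus import connections, Collection

# Placeholder connection details and collection name.
connections.connect("default", host="localhost", port="19530")
collection = Collection("my_collection")
collection.load()

# nq == len(data) passed to search(). Two query vectors here means nq = 2;
# searching 8000+ vectors at once means nq > 8000 in a single request.
query_vectors = [[0.1] * 128, [0.2] * 128]  # dim 128 assumed

results = collection.search(
    data=query_vectors,          # nq = len(data)
    anns_field="embedding",      # placeholder vector field name
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
)
```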
Then how can the permanent blocking be resolved, other than by reinstalling? Even a later query with a single vector blocks permanently.
I can now confirm that the permanent blocking is related to the nq parameter:

- Previously I searched 8000+ vectors in a single request. After changing it to stay under 2060 as you suggested, using 1000 vectors per request, the permanent blocking no longer occurs (see the batching sketch below).
- However, in another collection a single search for 8000+ vectors succeeds (taking 20+ seconds). So the blocking is related to the number of vectors searched, but not only to that number.
- The difference between the two collections: the one without a string field does not block permanently on an 8000+ search; the one with a string field does.
- The permanent blocking does not affect writes, which still succeed, but search requests block forever and never return.
- It also affects other collections: once the blocking occurs, search operations on all collections stop working.
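A sketch of the client-side batching workaround described above; `batch_size=1000` mirrors the value reported to avoid the hang and is not a documented Milvus limit, and the helper name is hypothetical:

```python
def search_in_batches(collection, query_vectors, batch_size=1000, **search_kwargs):
    """Split one large search into several smaller ones so nq stays small.

    Workaround sketch only: batch_size=1000 is the value that avoided the
    hang in this thread, not an official threshold.
    """
    all_results = []
    for start in range(0, len(query_vectors), batch_size):
        batch = query_vectors[start:start + batch_size]
        # Each call has nq = len(batch) <= batch_size.
        all_results.extend(collection.search(data=batch, **search_kwargs))
    return all_results

# Example use, with the same placeholder parameters as above:
# hits = search_in_batches(collection, query_vectors,
#                          anns_field="embedding",
#                          param={"metric_type": "L2", "params": {"nprobe": 10}},
#                          limit=10)
```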
So how can the bug be resolved once it has been triggered?
If nq is too large, the search is likely to time out, and OOM could also happen. If you can share your schema and your search request, we can try to reproduce it in our environment. So far we know the search times out because the batch is too large.
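As a client-side mitigation only (it does not fix the server-side hang), pymilvus's `search()` accepts a `timeout` argument, so a stuck request at least fails fast instead of blocking forever. The 30-second value below is an arbitrary example:

```python
results = collection.search(
    data=query_vectors,
    anns_field="embedding",      # placeholder field name
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    timeout=30,                  # raise an exception instead of blocking forever
)
```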
This might be fixed by #21852, if you are outputting a string field.
/assign @jiaoew1991
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there an existing issue for this?
Environment
Current Behavior
Milvus becomes unable to search. The symptom: CPU sits at 100% for a dozen seconds, then drops back to very low; after that, all search requests are stuck and never return results. Milvus's CPU stays very low, and it does not appear to be doing any work. Has some bug been triggered?
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
Anything else?
docker ps
The relevant containers are running normally.