[Bug]: Missing data in hybrid search result when using partition key #30607
Comments
Update - First downgrading to Milvus v2.3.2 and pymilvus v2.3.2, then re-upserting specific entities seems to make those vectors suddenly searchable again with any of the hybrid search filtering syntaxes mentioned above. The issue appears to be specific to Milvus or pymilvus greater than version 2.3.2 - unfortunately this means we will need to downgrade our cluster and re-upsert 100 Million+ vectors for search to work again which is obviously less than ideal. Any idea what could have changed in the new versions to break hybrid search? |
Seems to be a potential bug introduced recently. @yanliang567 |
if this is a bug, pls assign to @zhagnlu |
trying to reproduce it in house |
@IsaacWhittakerTR I failed to reproduce the issue in house. could you please reproduce the issue and collect the milvus logs for investigation? I roughly tried below on milvus v2.3.8
|
/assign @IsaacWhittakerTR |
I also failed to reproduce this error using 2.3.7. please provide more detailed infos. |
Thank you both for trying to reproduce the issue. Did either of you try creating a collection first using an older version of Milvus (i.e. v2.3.1 or v2.3.2), and then upgrading the cluster to a newer version (2.3.6) and inserting vectors after upgrading? This is the exact situation which caused issues for us. Downgrading the cluster to v2.3.2 and re-upserting vectors fixed the issue, and then upgrading back to v2.3.7 and re-upserting the vectors would consistently give empty results for hybrid searches on our partition key. I will try to reproduce the issue again and export the full Milvus logs. Is there by chance an e-mail address I can send the logs to instead of uploading them directly here? |
sounds like this could be a compatibility issue? |
@IsaacWhittakerTR thank you for your patience, and I will try to reproduce it with an upgrade. If you have logs, please mail to me: yanliang.qiao@zilliz.com |
@IsaacWhittakerTR I tried today with an upgrade from v2.3.1 to v2.3.7, but no luck. So please share the logs when you got them. thx. |
@yanliang567 Thanks again for your continued assistance on this issue. It is strange to me that it is not reproducible for you as it is very consistent on my end (but I can't think what else would differ between our setup that would change the behavior of the partitioning logic). I sent over our logs via e-mail, here is a detailed list of the steps I took to reproduce the incorrect search results:
|
@Cactus-L @IsaacWhittakerTR thank you both for updates. We will look into the logs and keep you posted. |
@Cactus-L could you please share your collection schema and index params so that I can try to reproduce it? |
#30607 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>
is this info still needed? |
yep...we are testing the fix PR now and will release a new version soon after verification is done. Thank you both for bringing this issue up. @Cactus-L @IsaacWhittakerTR |
fix pr :#30773 |
Glad to hear that the issue is solved! Great job guys 🙌🏻 Will be attentive for the release of the new version to update it on our end as well. |
we have already released 2.3.10. |
…r data (#247) See also milvus-io/milvus#30607 This command could scan binlogs for milvus instance and check partition key data is located in wrong partition due to previously mentioned issue --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>
Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
related issue: milvus-io#30607 and update some test for groupby Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
I'd close this issue as I have verified the fix on milvus 2.3.10 |
Is there an existing issue for this?
Environment
Current Behavior
When performing a hybrid search for vectors that exist in the collection and filtering on the partition key, either an empty result or a result containing only a few distant vectors is returned if the boolean expression uses the TermExpr syntax:
partition_key in [123, 456]
The correct results are returned if the boolean expression uses the CmpOp syntax instead:
partition_key == 123 or partition_key == 456
Correct results are also returned when filtering on any of the non-partition-key scalar fields.
Expected Behavior
When performing a hybrid search on a vector that exists in the collection, and filtering by the partition key which is associated with that vector, the search vector should be returned as one of the results with a zero distance.
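The two filter forms described under Current Behavior are logically equivalent, so on an affected version the TermExpr form can be rewritten into the equality chain that was reported to return correct results. The following is a minimal sketch of that rewrite plus a check for the expected zero-distance hit. The helper names (`term_expr`, `equality_chain`, `has_exact_match`) and the `(primary_key, distance)` hit representation are assumptions made for illustration; they are not part of pymilvus.

```python
from typing import Iterable, Sequence, Tuple


def term_expr(field: str, values: Iterable[int]) -> str:
    """Build the `field in [...]` form that returned missing results in the report."""
    return f"{field} in [{', '.join(str(int(v)) for v in values)}]"


def equality_chain(field: str, values: Iterable[int]) -> str:
    """Build the equivalent `field == a or field == b` form reported to work."""
    parts = [f"{field} == {int(v)}" for v in values]
    if not parts:
        raise ValueError("at least one value is required")
    return " or ".join(parts)


def has_exact_match(
    hits: Sequence[Tuple[int, float]], query_id: int, tol: float = 1e-6
) -> bool:
    """Return True if the query entity appears among the hits at ~zero distance.

    `hits` is a hypothetical list of (primary_key, distance) pairs extracted
    from a search result.
    """
    return any(pk == query_id and abs(dist) <= tol for pk, dist in hits)


if __name__ == "__main__":
    print(term_expr("partition_key", [123, 456]))
    print(equality_chain("partition_key", [123, 456]))
```

Either string can be passed as the boolean filter expression of a search; comparing `has_exact_match` for the two forms on the same query vector is one way to tell whether a deployment shows the behavior described here.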
Steps To Reproduce
Milvus Log
I was not able to find any related error or warning logs, but I can provide the logs on request.
I could not find any issues with the collection using birdwatcher either.
Anything else?
Schema information:
field_1 -> type int32 with scalar index
field_2 -> partition key, type int64 with scalar index
field_3 -> primary key, type int64
field_4 -> type int32 with scalar index
field_5 -> type FloatVector with dim 768 and IVF_SQ8 index (metric type L2, nlist 8192)
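For anyone attempting a reproduction, the schema above can be written down as plain Python data. This is only a sketch: the field names are the report's placeholders, the dictionary layout for the fields is an assumption for illustration, and only the index settings stated above (IVF_SQ8, L2, nlist 8192) are included. Whether field_3 also carries a scalar index is not stated in the report, so it is marked as not indexed here.

```python
# Field layout as listed in the report; "scalar_index" marks fields the
# report says carry a scalar index (field_3 is an assumption: not stated).
FIELDS = [
    {"name": "field_1", "dtype": "INT32", "scalar_index": True},
    {"name": "field_2", "dtype": "INT64", "scalar_index": True, "is_partition_key": True},
    {"name": "field_3", "dtype": "INT64", "scalar_index": False, "is_primary": True},
    {"name": "field_4", "dtype": "INT32", "scalar_index": True},
    {"name": "field_5", "dtype": "FLOAT_VECTOR", "dim": 768},
]

# Vector index settings for field_5, in the dictionary shape commonly used
# for Milvus index parameters.
VECTOR_INDEX_PARAMS = {
    "index_type": "IVF_SQ8",
    "metric_type": "L2",
    "params": {"nlist": 8192},
}

# Convenience lookups for a reproduction script.
PARTITION_KEY_FIELD = next(f["name"] for f in FIELDS if f.get("is_partition_key"))
PRIMARY_KEY_FIELD = next(f["name"] for f in FIELDS if f.get("is_primary"))
```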
A Zilliz user mentioned a very similar problem in this discord thread: https://discord.com/channels/1160323594396635310/1201965988451454996