[Bug]: The search vector being too far away can lead to inaccurate result sets. #33269

Closed
1 task done
zhoujiaqi1998 opened this issue May 22, 2024 · 3 comments
Labels
kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@zhoujiaqi1998

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: v2.4.1
- Deployment mode(standalone or cluster): standalone and cluster
- MQ type(rocksmq, pulsar or kafka): rocksmq and pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): v2.3.4
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

When the search vector is very far from the stored vectors, the result set contains both [0, 0] and [1, 0], all reported at the same distance.
Search vector:
[1<<25, 0]
Database vector:
[0, 0]
[1, 0]
Distance metric:
L2
Data storage type:
FLOAT_VECTOR

Test script:
test.txt
[root@master-00926-0 root]# python3 test.txt

=== start connecting to Milvus ===
=== Create collection hello_milvus ===
=== Start inserting entities ===
Number of entities in Milvus: 3000
=== Start Creating index IVF_FLAT ===
=== Start loading ===
=== Start searching based on vector similarity ===
hit: id: 0, distance: 1125899906842624.0, entity: {'random': 0.456112344881892, 'embeddings': [0.0, 0.0]}
hit: id: 1, distance: 1125899906842624.0, entity: {'random': 0.2938152926171026, 'embeddings': [1.0, 0.0]}
hit: id: 2, distance: 1125899906842624.0, entity: {'random': 0.7528276205814994, 'embeddings': [0.0, 0.0]}
hit: id: 3, distance: 1125899906842624.0, entity: {'random': 0.6917658945233224, 'embeddings': [1.0, 0.0]}

Expected Behavior

[root@master-00926-0 root]# python3 test.py

=== start connecting to Milvus ===
=== Create collection hello_milvus ===
=== Start inserting entities ===
Number of entities in Milvus: 3000
=== Start Creating index IVF_FLAT ===
=== Start loading ===
=== Start searching based on vector similarity ===
hit: id: 1, distance: 11258991476736.0, entity: {'random': 0.9407890488813064, 'embeddings': [1.0, 0.0]}
hit: id: 3, distance: 11258991476736.0, entity: {'random': 0.9319260020306628, 'embeddings': [1.0, 0.0]}
hit: id: 5, distance: 11258991476736.0, entity: {'random': 0.9496021275858367, 'embeddings': [1.0, 0.0]}
hit: id: 7, distance: 11258991476736.0, entity: {'random': 0.4254195287876559, 'embeddings': [1.0, 0.0]}

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@zhoujiaqi1998 added the kind/bug and needs-triage labels May 22, 2024
@xiaofan-luan
Contributor

test.txt

Your test data set doesn't make any sense.
The test vector is too far away from all the vectors in the database.

@xiaofan-luan
Contributor

I don't think this is an issue. You need to find a valid data set and query.

@xiaofan-luan
Contributor

All the distances are the same, 1125899906842624.0; I guess it has already overflowed.
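
For anyone landing here later: the identical distances are consistent with float32 rounding rather than an overflow in the strict sense. A float32 has a 24-bit significand, so 2^25 - 1 = 33554431 is not representable and rounds to 2^25. The squared L2 distance from [1<<25, 0] to [1, 0] then comes out as 2^50 = 1125899906842624, the same as for [0, 0], even though the exact value is (2^25 - 1)^2 = 1125899839733761. Below is a minimal sketch of that arithmetic using only the Python standard library. It assumes the distance is accumulated in float32, which seems likely for FLOAT_VECTOR but is not confirmed in this thread; `to_f32`, `l2_squared_f32` and `l2_squared_f64` are hypothetical helper names, not Milvus or pymilvus APIs.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def l2_squared_f32(a, b):
    """Squared L2 distance with every intermediate rounded to float32.

    Assumption: this mimics a float32 distance kernel; it is not Milvus code.
    """
    total = 0.0
    for x, y in zip(a, b):
        diff = to_f32(to_f32(x) - to_f32(y))
        total = to_f32(total + to_f32(diff * diff))
    return total


def l2_squared_f64(a, b):
    """Squared L2 distance in float64, exact for the values used here."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [float(1 << 25), 0.0]
stored = [[0.0, 0.0], [1.0, 0.0]]

for v in stored:
    print(v, l2_squared_f32(query, v), l2_squared_f64(query, v))
# [0.0, 0.0] 1125899906842624.0 1125899906842624.0
# [1.0, 0.0] 1125899906842624.0 1125899839733761.0
```

If that assumption holds, the two stored vectors are indistinguishable from this query in float32, so either may appear in the top-k. Keeping query and stored vectors within a comparable range (for example by normalizing them) avoids the tie.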
