[Bug]: range search results are not the same as the intersection of search results with ranges #32630
Comments
/assign @liliu-z
I changed my test a bit: use a FLAT index for the first search, then rebuild the collection with the target index type under test. The results are a bit better (50%), but still not perfect.
Try enlarging the range search param.
/assign @yanliang567
/unassign
Range search results are already better than search results.
Besides, we need to align the results across search, range_search, search with pagination, grouping search, etc.
@yanliang567 I don't agree about the result alignment
Also, we have iterator, search, and range search; I don't see a strong signal that we need to align all the results.
@liliu-z From a user's point of view, they do not care about the algorithms. They are just confused that the results do not match each other, and they do not know which results are accurate.
I think there is no need to sync all the search results.
Yes, we have a param that can be tuned to get better recall.
let's set the goal to be:
|
Then we need some improvement here in range search. /assign @liliu-z
/assign
Growing index still does not support fp16/bf16/binary vectors.
plz use this code try again:
|
The test case in milvus may have a problem: when i = 1, it uses the current index to generate the ground truth (gt) instead of FLAT.
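To illustrate the point about generating ground truth with FLAT, here is a minimal, self-contained sketch of exact (brute-force) top-k and range search in plain Python. It does not use Milvus at all. The helper names, the random data, the choice of `radius` as the distance at rank 350, and the L2 convention `range_filter <= distance < radius` are all assumptions made for illustration, not taken from the test case.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_topk(query, vectors, k):
    """Exact (FLAT-style) top-k: score every vector and keep the k nearest."""
    scored = sorted((l2(query, v), i) for i, v in enumerate(vectors))
    return scored[:k]


def exact_range(query, vectors, range_filter, radius):
    """Exact range search; assumes the L2 convention range_filter <= d < radius."""
    scored = sorted((l2(query, v), i) for i, v in enumerate(vectors))
    return [(d, i) for d, i in scored if range_filter <= d < radius]


# Hypothetical data standing in for the collection under test.
random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(2000)]
query = [random.random() for _ in range(8)]

topk = exact_topk(query, vectors, 351)
# Assumed bounds: lower bound at rank 50, upper bound at rank 350.
range_filter, radius = topk[50][0], topk[350][0]
in_range = exact_range(query, vectors, range_filter, radius)

# With exact search on both sides, the range results equal the top-k slice.
same = [i for _, i in topk[50:350]] == [i for _, i in in_range]
print(len(in_range), same)
```

Because both sides are computed exactly, any mismatch seen against this kind of ground truth comes from the index under test rather than from the reference itself.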
There are some errors in the test script:
row_cnt = 0
nb = 2000
for i in range(3):
    data = cf.gen_general_default_list_data(nb=nb, auto_id=True, start=row_cnt,
                                            vector_data_type=vector_data_type, with_json=False)
    collection_w.insert(data)
    row_cnt += nb
This test script needs an update @yanliang567
/assign @yanliang567
updating and rerunning the tests...
Is there an existing issue for this?
Environment
Current Behavior
My case is
range_filter = search_res[0].distances[50]
expected:
the same 300 results appear in both search_res and range_search_results
actual:
only 51 of the range_search_results hit search_res[50:350], less than 20%
Expected Behavior
If it is hard to get 100% identical results with search, the overlap should be >90%.
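The hit rate discussed here can be computed as the fraction of the expected IDs that also show up in the range search results. The sketch below is one way to do that in plain Python; `hit_rate` is a hypothetical helper, and the ID lists are made-up stand-ins for the IDs in `search_res[50:350]` and in the range search results, chosen so that 51 of 300 overlap as in the report.

```python
def hit_rate(expected_ids, actual_ids):
    """Fraction of expected_ids that also appear in actual_ids."""
    expected = set(expected_ids)
    if not expected:
        return 1.0  # nothing was expected, so nothing was missed
    return len(expected & set(actual_ids)) / len(expected)


# Hypothetical IDs: 300 expected, of which only 51 appear in the range results.
topk_slice_ids = list(range(50, 350))
range_ids = list(range(50, 101)) + list(range(1000, 1249))

rate = hit_rate(topk_slice_ids, range_ids)
print(f"hit rate: {rate:.2%}")
```

A check such as `rate > 0.9` would then express the threshold proposed above.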
Steps To Reproduce
No response
Milvus Log
test case:
Anything else?
The same situation occurs with IVF_* and SCANN indexes. HNSW does better, with an 80% hit rate.
![image](https://private-user-images.githubusercontent.com/82361606/325828531-75fdef23-f928-4f2c-833f-f841b0c6625e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA1OTAwNDAsIm5iZiI6MTcyMDU4OTc0MCwicGF0aCI6Ii84MjM2MTYwNi8zMjU4Mjg1MzEtNzVmZGVmMjMtZjkyOC00ZjJjLTgzM2YtZjg0MWIwYzY2MjVlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzEwVDA1MzU0MFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTFhNWM3Mjc3YzhhN2FmZWQ3YzZhYjljNTVmODU1ODRiZDE4YmQ3MzljZDRjYmYwZmNkOTMxZmRkNDk2ODdjOTImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.rnFCa5_7yH9gijAybBmHkBW4hZRKb2Ivp8QsTVJ3pJo)
Checking the first 10 results, you will see the range_res have better distances