[Bug]: Search in collection very slow #15258

Closed
1 task done
TheBritishSabrina opened this issue Jan 17, 2022 · 8 comments
Assignees
Labels
kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@TheBritishSabrina

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.0.0rc9
- Deployment mode(standalone or cluster): cluster running on EKS
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus v2.0.0rc9
- OS(Ubuntu or CentOS): Ubuntu
- CPU/Memory: t3.xlarge - 4 CPUs, 16GB
- GPU: N/A
- Others:

Current Behavior

When searching in one small collection, collection.search() takes 0.3-0.4 seconds, whereas the quoted time is 35 ms.

Setup:

  • Index: IVF_SQ8
  • Number of vectors: 24,000
  • Dimension of vectors: 150
  • nlist: around 620 (4 * sqrt(number of vectors))
  • nprobe: 20
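
For reference, the nlist figure above follows from the 4 * sqrt(number of vectors) rule of thumb quoted in the list. A minimal sketch of that arithmetic, where the helper name `suggested_nlist` is made up for illustration and is not part of pymilvus:

```python
import math


def suggested_nlist(num_vectors: int, factor: int = 4) -> int:
    """Rule of thumb from the setup above: nlist ~= factor * sqrt(num_vectors)."""
    if num_vectors <= 0:
        raise ValueError("num_vectors must be positive")
    return round(factor * math.sqrt(num_vectors))


# 4 * sqrt(24000) is about 619.7, which rounds to 620
print(suggested_nlist(24_000))
```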

We have tried changing nprobe to values between 1 and 512, but this has made no difference. Setting guarantee_timestamp to 1 also has no effect.

Querynode logs attached
query-node-log-170122.txt

Expected Behavior

Search should take less than 0.04 seconds.

Steps To Reproduce

No response

Anything else?

No response

@TheBritishSabrina TheBritishSabrina added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 17, 2022
@xiaofan-luan
Contributor

Hi there~ From the log we can see that a guarantee timestamp is specified, and it seems the server's timetick always lags behind it, so the server spends around 200 ms waiting for the timetick to catch up.

Did you try setting the guarantee timestamp to 1? May I check your client-side code?

@czs007 could you also please follow up on this issue?

@xiaofan-luan
Contributor

/assign @czs007

@xiaofan-luan
Contributor

@TheBritishSabrina also, can you give us some data about your scenario, such as the QPS, latency, and entity count you would expect?

@TheBritishSabrina
Author

Hi @xiaofan-luan! Yes, we set guarantee timestamp to various values, including 1. Here are the latest logs, after changing back to 1 and re-running search.
query-node-log-17012201.txt

The call to collection.search() is currently:

res = self.collection.search(
    vectors,
    partition_names=partition_names,  # partition_names=None by default
    anns_field=self.anns_field_name,  # anns_field_name="embeddings"
    param=search_params,  # search_params={"metric_type": "IP", "params": {"nprobe": self.nprobe}}
    limit=top_k,  # top_k=10
    guarantee_timestamp=1,
)

We don't have firm expectations for these metrics, but given our data size we would expect latency to be considerably lower than what we are seeing.
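
To put numbers on the latency being discussed, the search call can be wrapped in a small timing helper. This is only a sketch: `time_calls` is a hypothetical helper, not part of pymilvus, and it measures client-side wall-clock time (network round trip included) rather than server-side search time.

```python
import statistics
import time


def time_calls(fn, repeats=20):
    """Call fn() repeatedly and return (median_ms, max_ms) wall-clock latency."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in workload so the sketch runs without a Milvus server.
median_ms, max_ms = time_calls(lambda: sum(range(1000)), repeats=5)
print(f"median={median_ms:.3f} ms, max={max_ms:.3f} ms")
```

In the setup above, `fn` would be a lambda wrapping the `self.collection.search(...)` call shown earlier.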

@yanliang567 yanliang567 removed their assignment Jan 18, 2022
@xiaofan-luan
Contributor

That's weird, we'd expect around 15 ms latency at your data size (and this will be improved to under 10 ms in the next big release).
According to your server-side log, every guarantee timestamp is still the current time, which leads to the long waiting time in the queue for data sync.

@czs007 @DragonDriver any suggestions? Maybe we should just upgrade the pymilvus version and let the user set their consistency level?
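
The waiting behaviour described above can be pictured with a toy model. It is a simplified illustration only, not Milvus's actual implementation: timestamps are plain millisecond integers, the function name is made up, and the 200 ms tick interval is an assumption chosen to match the wait seen in the log.

```python
def wait_for_guarantee(guarantee_ts, service_ts, tick_interval_ms=200):
    """Return how long (ms) a search waits in this toy model.

    The search may run only once the node's service timestamp has reached
    guarantee_ts; the service timestamp advances one tick at a time.
    """
    waited = 0
    while service_ts < guarantee_ts:
        service_ts += tick_interval_ms  # next timetick arrives
        waited += tick_interval_ms
    return waited


now = 1_000_000
# Guarantee timestamp of 1: already satisfied, so no waiting.
print(wait_for_guarantee(1, service_ts=now - 100))
# Guarantee timestamp of "now": the node must wait for the next tick.
print(wait_for_guarantee(now, service_ts=now - 100))
```

In this model a guarantee timestamp equal to the current time always sits ahead of the node's last tick, so every search pays for at least one tick, while a very small value such as 1 never waits.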

@xiaofan-luan
Contributor

/assign @DragonDriver

@longjiquan
Contributor

Hello @TheBritishSabrina, we now support specifying the consistency level when you run a search. See details in "Support consistency level when search"; we'll release this commit soon.
If you are interested in trying this feature now, you can reinstall pymilvus via:

$ pip install --extra-index-url https://test.pypi.org/simple/ pymilvus==2.0.0rc10.dev12

@TheBritishSabrina
Author

Hi both, we found that we were using pymilvus SDK version 2.0.0rc7, not rc9. Confusingly, using the 'helm list' command showed v2.0.0rc9, so it was hard to diagnose the issue. After updating, the guarantee_timestamp parameter is now recognised and our search time is much quicker! Thanks for your help.
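
Since the root cause was a mismatch between the assumed and the installed SDK version, one way to check what the Python environment actually has is to ask the interpreter that runs the client. A minimal sketch using the standard library (Python 3.8+); the helper name `installed_version` is made up for illustration:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the pymilvus version seen by this interpreter, or None if not installed.
print(installed_version("pymilvus"))
```

Note that `helm list` reports on the deployed Milvus release, not on the client SDK, so the two version numbers have to be checked separately.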

5 participants