HttpANN algorithm to support language-agnostic implementations (Re: Issue #20) #35
@@ -90,6 +110,17 @@ deep-10M:
"nprobe=2,quantizer_efSearch=8",
"nprobe=4,quantizer_efSearch=4",
"nprobe=2,quantizer_efSearch=16"]
diskann-t2:
There were two sections called "deep-10M" so I just moved the "diskann-t2" section up here to deduplicate the sections.
This looks very nice, thanks @alexklibisz! Have you been able to measure the overhead vs a naive implementation?
My only suggestion here is to split up httpann.py into base-http.py (or whatever you deem to fit) and http-example-sklearn.py to show the interaction between the base http wrapper and the actual implementation that you suggest others to use.
def query(self, X, k):
    body = dict(X=[arr.tolist() for arr in X], k=k)
    self.post("query", body, 200)

def range_query(self, X, radius):
    body = dict(X=[arr.tolist() for arr in X], radius=radius)
    self.post("range_query", body, 200)
I'm wondering what the performance penalty of this will be. I would be happy to
- change the arguments such that the query vector file is exposed
- or add some kind of prepare for providing the query vectors
I've done some local testing, and I think it will be fine, since the HTTP overhead and JSON serialization are incurred only once for the whole query batch. It was a bigger deal with ann-benchmarks because that framework required a request and serialization for every query.
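To illustrate why the cost is paid only once, here is a minimal sketch of a batched request using only the Python standard library. The request body mirrors the `query` method in the diff above (`X` and `k`); the helper names `build_query_body` and `post_json`, the base URL layout, and the assumption that the server replies with JSON are all hypothetical and are not taken from the PR.

```python
import json
import urllib.request


def build_query_body(X, k):
    """Encode the whole query batch as a single JSON document."""
    # Accept numpy arrays (via .tolist()) as well as plain Python sequences.
    rows = [row.tolist() if hasattr(row, "tolist") else list(row) for row in X]
    return json.dumps(dict(X=rows, k=k)).encode("utf-8")


def post_json(base_url, path, payload, expected_status=200):
    """Hypothetical helper: send one POST request and check the status code."""
    req = urllib.request.Request(
        f"{base_url}/{path}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        if resp.status != expected_status:
            raise RuntimeError(f"unexpected status {resp.status} from {path}")
        # Assumption: the server answers with a JSON body (possibly empty).
        return json.loads(resp.read() or b"null")
```

With this shape, all query vectors travel in one request, so connection setup and JSON encoding happen once per run rather than once per query vector.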
Sounds good. I'll break it into two files.
Great, thanks. Is this ready to be merged?
I think so. Please squash if you can, as there are several intermediate/incomplete commits in there.
I'll resolve the conflicts, and I also need to add one bit of documentation. One moment.
Sorry for not getting back to you earlier. Shall I squash and merge with main?
No problem. Yes, good to go. Thanks!
Looking forward to seeing how this is going to be used. Thanks again.
Thanks @maumueller, @gosha1128 and others for the fruitful discussion over in #20. I think I've arrived at an implementation that could fit the purpose of language-agnostic (big) ANN.
The HttpANN algorithm is designed to make HTTP calls to a server. The server executes all indexing and querying, thus enabling language-agnostic ANN implementations with minimal overhead. The only requirements for the server are:
It could in theory even run remotely, although the intended use-case is that the server runs in the same container.
The overhead for data transfer and serialization is minimal. The server only needs to parse the 10k JSON-encoded query vectors and encode the resulting 10k lists of neighbors.
I also included an example implementation which uses scikit-learn. It's too slow for the large datasets, but it works on the smaller random-xs and random-range-xs. So it should be good enough to demonstrate that this algorithm works.
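For readers who want a feel for the server side, here is a rough sketch of a toy server that answers a batched `query` request. It is not the scikit-learn example from this PR: it uses a brute-force search in pure Python, a hard-coded `DATA` list as the index, and it returns the neighbor lists directly in the response body. All of those choices, along with the `/query` path layout and the response format, are assumptions made to keep the sketch short; the PR's actual API may differ.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy in-memory "index" (assumption, for illustration only).
DATA = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]


def brute_force_knn(data, queries, k):
    """For each query, return the indices of the k closest rows (squared Euclidean)."""
    results = []
    for q in queries:
        dists = [sum((a - b) ** 2 for a, b in zip(row, q)) for row in data]
        order = sorted(range(len(data)), key=dists.__getitem__)
        results.append(order[:k])
    return results


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        if self.path == "/query":
            out = json.dumps(brute_force_knn(DATA, body["X"], body["k"])).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(out)))
            self.end_headers()
            self.wfile.write(out)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def serve_in_background(port=0):
    """Start the toy server on a daemon thread; port=0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `serve_in_background()` and posting `{"X": [[0.9, 0.1]], "k": 2}` to `/query` on `server.server_address` would exercise the full round trip. A server in any other language only needs to accept the same kind of JSON request.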
Here is the API that a server must implement: