Add vectorSearch operator for $search pipeline stage #1962
 * @param exact Whether to use exact (ENN) search. If {@code true}, runs exact nearest neighbor search.
 * @return A new {@link VectorSearchOperator}.
 */
VectorSearchOperator exact(boolean exact);
I don't quite understand why we need this exact method if SearchOperator already has vectorSearch with exact set to false (the client doesn't set it, but according to the doc the default value is false when omitted) and vectorSearchExact.
True - it does allow users to programmatically build a vectorSearch and set the exact value, while vectorSearchExact is a convenience helper that improves the discoverability of ENN searches.
Let's remove it - will update.
Done - removed. Good catch.
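For readers following this thread, the difference between the two entry points can be sketched roughly as below. This is a minimal illustration that uses plain Java maps instead of the driver's BSON types. The class name VectorSearchSketch, both method signatures, and the choice to leave numCandidates out of the exact variant are assumptions made for the example, not the API in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the builder discussed above; not the driver's API.
final class VectorSearchSketch {
    private VectorSearchSketch() {
    }

    // Approximate (ANN) variant: "exact" is left out entirely, so the
    // server-side default (false, per the doc quoted in the thread) applies.
    static Map<String, Object> vectorSearch(String path, List<Double> queryVector,
                                            int limit, int numCandidates) {
        Map<String, Object> body = common(path, queryVector, limit);
        body.put("numCandidates", numCandidates);
        return body;
    }

    // Exact (ENN) variant: the helper sets "exact": true itself, so callers
    // never need a separate exact(boolean) setter.
    // Assumption: numCandidates is not supplied for exact search.
    static Map<String, Object> vectorSearchExact(String path, List<Double> queryVector,
                                                 int limit) {
        Map<String, Object> body = common(path, queryVector, limit);
        body.put("exact", true);
        return body;
    }

    // Fields shared by both variants, kept in insertion order.
    private static Map<String, Object> common(String path, List<Double> queryVector,
                                              int limit) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("path", path);
        body.put("queryVector", new ArrayList<>(queryVector));
        body.put("limit", limit);
        return body;
    }
}
```

With two dedicated factories, every value of exact is already reachable, which is why a separate setter adds little.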
}

@Test
void vectorSearch() {
One of the acceptance criteria from the doc is to:
"Implement all necessary testing (unit, integration, e2e) and metrics."
We don't have e2e tests. Is that because the feature is not available in Atlas?
We don't have e2e testing at the drivers level yet. I expect this workflow should be covered by a DRIVERS ticket rather than the doc.
I'll make sure to add a DRIVERS ticket to ensure it's documented in the Specs repo and/or unified tests are added.
 */
@Sealed
@Beta(Reason.CLIENT)
public interface VectorSearchOperator extends SearchOperator {
What about knnVector? From the doc I understood that an index could be created with a knnVector field:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "embeddings": {
        "type": "knnVector",
Should we add a fluent index builder that can set the type to knnVector?
The index work was covered in #1960 - this PR is for creating the query, e.g.:
{
  $search: {
    "index": "<index name>", // optional, defaults to "default"
    "vectorSearch": {
      "exact": true | false,
      "path": "<field-to-search>",
      "queryVector": [<array-of-numbers>],
      "filter": {<filter-specification>},
      "limit": <number-of-results>,
      "numCandidates": <number-of-candidates>,
      "score": {<options>}
    }
  }
}
Which can be done via:
SearchOperator.vectorSearch(
        fieldPath("embedding"),
        asList(1.0, 2.0),
        10,
        50
    ).filter(SearchOperator.text(fieldPath("title"), "hello"))
    .score(boost(2f))
I added VectorSearchOperator to expose options that are specific to vector search and don't apply to all SearchOperator instances. VectorSearchOperatorConstructibleBsonElement is the actual implementation that creates the BSON.
This follows the existing SearchOperator conventions and builds upon them for vector search.
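As a rough illustration of that layering, the sketch below shows a sub-interface that adds a vector-only option and narrows the return type of an inherited one, so chained calls keep the more specific type. The names OperatorSketch, VectorOperatorSketch and VectorOperatorImplSketch are hypothetical, and the example builds plain maps, whereas the real implementation produces BSON.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a general search operator.
interface OperatorSketch {
    OperatorSketch score(double boost);

    Map<String, Object> toMap();
}

// Hypothetical stand-in for the vector-specific operator.
interface VectorOperatorSketch extends OperatorSketch {
    // Only meaningful for vector search, so it lives on the sub-interface.
    VectorOperatorSketch filter(OperatorSketch filter);

    // Covariant override: chaining score(...) keeps the narrowed type,
    // so filter(...) is still available afterwards.
    @Override
    VectorOperatorSketch score(double boost);
}

// Immutable implementation: every call returns a new instance.
final class VectorOperatorImplSketch implements VectorOperatorSketch {
    private final Map<String, Object> fields;

    VectorOperatorImplSketch(Map<String, Object> fields) {
        this.fields = Collections.unmodifiableMap(new LinkedHashMap<>(fields));
    }

    // Copies the current fields and adds one more entry.
    private VectorOperatorImplSketch with(String key, Object value) {
        Map<String, Object> copy = new LinkedHashMap<>(fields);
        copy.put(key, value);
        return new VectorOperatorImplSketch(copy);
    }

    @Override
    public VectorOperatorSketch filter(OperatorSketch filter) {
        return with("filter", filter.toMap());
    }

    @Override
    public VectorOperatorSketch score(double boost) {
        // Mirrors the Atlas Search shape score: { boost: { value: <n> } }.
        return with("score",
                Collections.singletonMap("boost", Collections.singletonMap("value", boost)));
    }

    @Override
    public Map<String, Object> toMap() {
        return fields;
    }
}
```

Because score(...) returns the narrowed type, a chain such as base.score(2.0).filter(other) compiles, which would not be the case if score(...) returned only the general operator type.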
…earchOperator.vectorSearchExact
JAVA-6130