Add vectorSearch operator for $search pipeline stage #1962

Open
rozza wants to merge 5 commits into mongodb:main from rozza:JAVA-6130
Member

@rozza rozza commented May 6, 2026

@rozza rozza requested a review from Copilot May 6, 2026 12:29

This comment was marked as outdated.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread driver-scala/src/main/scala/org/mongodb/scala/model/search/package.scala Outdated
@rozza rozza marked this pull request as ready for review May 6, 2026 14:32
@rozza rozza requested a review from a team as a code owner May 6, 2026 14:32
@rozza rozza requested a review from strogiyotec May 6, 2026 14:32
* @param exact Whether to use exact (ENN) search. If {@code true}, runs exact nearest neighbor search.
* @return A new {@link VectorSearchOperator}.
*/
VectorSearchOperator exact(boolean exact);
Contributor

I don't really understand why we need this exact method when SearchOperator already has vectorSearch, which leaves exact unset (the client doesn't set it, but according to the doc it defaults to false when omitted), and vectorSearchExact.

Member Author

@rozza rozza May 20, 2026

True - it does allow users to programmatically build a vectorSearch and set the exact value. vectorSearchExact is a convenience helper that helps with the discoverability of ENN searches.

Let's remove it - will update.

Member Author

Done - removed. Good catch.
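
For readers following the thread, the distinction being discussed can be sketched in plain Java with no driver dependency. The method names mirror vectorSearch and vectorSearchExact from the comments above, but the signatures and the Map-based representation are assumptions made for illustration, not the driver's API. The approximate variant leaves exact unset, relying on the default of false mentioned in the review comment, while the exact variant sets it to true.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VectorSearchExactSketch {

    // Hypothetical stand-in for SearchOperator.vectorSearch:
    // "exact" is not written, so the documented default (false) would apply.
    static Map<String, Object> vectorSearch(
            String path, List<Double> queryVector, int limit, int numCandidates) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("path", path);
        body.put("queryVector", queryVector);
        body.put("limit", limit);
        body.put("numCandidates", numCandidates);
        return body;
    }

    // Hypothetical stand-in for SearchOperator.vectorSearchExact:
    // the only difference in this sketch is that "exact" is set to true
    // and no numCandidates argument is taken (an assumption).
    static Map<String, Object> vectorSearchExact(
            String path, List<Double> queryVector, int limit) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("exact", true);
        body.put("path", path);
        body.put("queryVector", queryVector);
        body.put("limit", limit);
        return body;
    }

    public static void main(String[] args) {
        System.out.println(vectorSearch("embedding", List.of(1.0, 2.0), 10, 50));
        System.out.println(vectorSearchExact("embedding", List.of(1.0, 2.0), 10));
    }
}
```

With two entry points like these, a separate exact(boolean) setter on the returned operator adds a third way to express the same thing, which is the redundancy the review comment points at.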

}

@Test
void vectorSearch() {
Contributor

One of the acceptance criteria from the doc is to:

Implement all necessary testing (unit, integration, e2e) and metrics.

We don't have e2e tests. Is that because the feature is not available in Atlas?

Member Author

We don't have e2e testing yet at a drivers level. I'm expecting this workflow should be covered by a DRIVERS ticket rather than the doc.

I'll make sure to add a DRIVERS ticket to ensure it's documented in the Specs repo and/or unified tests are added.

*/
@Sealed
@Beta(Reason.CLIENT)
public interface VectorSearchOperator extends SearchOperator {
Contributor

What about knnVector? From the doc, I understood that an index could be created with a knnVector field:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "embeddings": {
        "type": "knnVector",

Should we add a fluent index builder that can set the type to knnVector?

Member Author

The index work was covered in #1960 - this is for creating the query, e.g.:

{
  $search: {
    "index": "<index name>", // optional, defaults to "default"
    "vectorSearch": {
      "exact": true | false,
      "path": "<field-to-search>",
      "queryVector": [<array-of-numbers>],
      "filter": {<filter-specification>},
      "limit": <number-of-results>,
      "numCandidates": <number-of-candidates>,
      "score": {<options>}
    }
  }
}

Which can be done via:

  SearchOperator.vectorSearch(
          fieldPath("embedding"),
          asList(1.0, 2.0),
          10,
          50
  ).filter(SearchOperator.text(fieldPath("title"), "hello"))
          .score(boost(2f))

I added VectorSearchOperator to allow for overrides that vector search supports but that aren't general to all SearchOperator instances. VectorSearchOperatorConstructibleBsonElement is the actual implementation for creating the BSON.

This follows the existing SearchOperator conventions and builds upon them for vector search.
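
As a rough illustration of how the builder call above lines up with the query template, here is a plain-Java sketch that assembles the stage as nested maps, without the driver. It assumes the third and fourth positional arguments are limit and numCandidates, that the text filter renders as a text operator document with path and query, and that boost(2f) renders as a boost score with a value field. None of this is confirmed by the thread, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SearchStageSketch {

    // Builds a Map shaped like the $search template in the comment above.
    static Map<String, Object> stage() {
        // Assumed rendering of SearchOperator.text(fieldPath("title"), "hello").
        Map<String, Object> text = new LinkedHashMap<>();
        text.put("path", "title");
        text.put("query", "hello");

        Map<String, Object> vectorSearch = new LinkedHashMap<>();
        vectorSearch.put("path", "embedding");
        vectorSearch.put("queryVector", List.of(1.0, 2.0));
        vectorSearch.put("filter", Map.of("text", text));
        vectorSearch.put("limit", 10);          // assumed: third positional argument
        vectorSearch.put("numCandidates", 50);  // assumed: fourth positional argument
        // Assumed rendering of score(boost(2f)).
        vectorSearch.put("score", Map.of("boost", Map.of("value", 2f)));

        // "index" is left out, so the name "default" from the template applies.
        return Map.of("$search", Map.of("vectorSearch", vectorSearch));
    }

    public static void main(String[] args) {
        System.out.println(stage());
    }
}
```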

@rozza rozza requested a review from strogiyotec May 20, 2026 08:43