SOLR-16836: introduce support for high dimensional vectors #1680

Merged: 6 commits merged into apache:main on Jun 14, 2023

alessandrobenedetti (Contributor) commented Jun 2, 2023

https://issues.apache.org/jira/browse/SOLR-16836

Description

A very simple draft implementation to support high-dimensional vectors in Apache Solr.
It's a small change but quite useful, as it lets our users work with vectors of any size.
A warning is logged if the default Lucene MAX_DIMENSION is exceeded.
This warning is up for discussion: I am conflicted about it, given that 1024 doesn't really mean anything, nor is it a limit backed by any current data structure or algorithm optimisation.
That said, if it makes the community happier, we can leave the warning; I am not going to put up much resistance there.
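
To make the warning behaviour concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: the class `DimensionCheck`, the method `warnIfAboveDefault`, and the constant `DEFAULT_MAX_DIMENSION` are hypothetical names, and the value 1024 is simply the figure mentioned above.

```java
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of Solr or Lucene.
class DimensionCheck {
    // Assumption: the default limit is the 1024 figure quoted in the description.
    static final int DEFAULT_MAX_DIMENSION = 1024;

    private static final Logger LOG = Logger.getLogger(DimensionCheck.class.getName());

    /**
     * Accepts any positive dimension. Logs a warning and returns true
     * when the dimension is above the assumed default limit.
     */
    static boolean warnIfAboveDefault(String fieldName, int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("Vector dimension must be positive: " + dimension);
        }
        if (dimension > DEFAULT_MAX_DIMENSION) {
            LOG.warning(String.format(
                "Field '%s' uses %d dimensions, above the default limit of %d; "
                    + "performance is not guaranteed.",
                fieldName, dimension, DEFAULT_MAX_DIMENSION));
            return true;
        }
        return false;
    }
}
```

The point of the sketch is that exceeding the default is not an error: the larger dimension is still accepted, and the log line only flags it.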

N.B. as with any other Solr configuration (number of fields, length of content in the fields, values per field, etc.), no performance is guaranteed.
It's your responsibility to carefully test and benchmark your solution if you intend to bring it to production.

Solution

Follows the approach described here: https://lists.apache.org/thread/pc8280kn99s0lf2gjd50chk0nftzmzmt

Tests

A simple test has been added to index vectors with 2048 dimensions.
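
For readers who want to try something similar by hand, the sketch below builds a 2048-dimensional vector and renders it as a JSON document with the vector as an array of floats. It is not the test added in this PR: the class `HighDimVectorDoc`, the field name `vector`, and the generated values are assumptions chosen for illustration.

```java
import java.util.StringJoiner;

// Hypothetical helper, for illustration only; not the test from this PR.
class HighDimVectorDoc {

    /** Builds a deterministic vector whose values cycle through 0.0, 0.1, ..., 0.9. */
    static float[] makeVector(int dimension) {
        float[] vector = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            vector[i] = (i % 10) / 10.0f;
        }
        return vector;
    }

    /** Renders a single JSON document with an id and a float-array vector field. */
    static String toJsonDoc(String id, String vectorField, float[] vector) {
        StringJoiner values = new StringJoiner(", ", "[", "]");
        for (float x : vector) {
            values.add(Float.toString(x));
        }
        return "{\"id\": \"" + id + "\", \"" + vectorField + "\": " + values + "}";
    }

    public static void main(String[] args) {
        float[] vector = makeVector(2048);
        String doc = toJsonDoc("1", "vector", vector);
        System.out.println(vector.length + " dimensions, " + doc.length() + " characters of JSON");
    }
}
```

The resulting string could then be posted to a collection whose schema declares a dense vector field with a matching dimension; the schema and the request itself are outside the scope of this sketch.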

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@alessandrobenedetti alessandrobenedetti merged commit cedb246 into apache:main Jun 14, 2023
3 checks passed