New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-9908: Move VectorValues#search to LeafReader #104
Conversation
@@ -199,40 +229,53 @@ public void checkIntegrity() throws IOException { | |||
|
|||
@Override | |||
public VectorValues getVectorValues(String field) throws IOException { | |||
FieldInfo info = fieldInfos.fieldInfo(field); | |||
if (info == null) { | |||
FieldEntry fieldEntry = fields.get(field); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Across the different reader implementations, we aren't that consistent when we return null
vs. empty. For now I just matched the current pattern in each reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd be consistent with what we do with doc values:
- The low-level API on
VectorReader
only supports getting data for fields that exist and have vectors enabled on their FieldInfo, the behavior is undefined otherwise. - The high-level API on
LeafReader
checks the FieldInfo and returnsnull
for fields that do not exist or that don't have vectors enabled, and otherwise delegates to theVectorReader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to fix this before merging (for both vector-related methods).
@@ -1357,6 +1359,11 @@ public VectorValues getVectorValues(String fieldName) { | |||
return VectorValues.EMPTY; | |||
} | |||
|
|||
@Override | |||
public TopDocs searchNearestVectors(String field, float[] target, int k, int fanout) { | |||
return TopDocsCollector.EMPTY_TOPDOCS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be null
? My understanding based on the rest of the PR is that we use null
when a field doesn't have vectors and EMPTY_TOP_DOCS
when a field has vectors but not on the current segment. Since MemoryIndex doesn't support indexing vectors should it return null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me. I'll update this and getVectorValues
above to return null
.
@@ -199,40 +229,53 @@ public void checkIntegrity() throws IOException { | |||
|
|||
@Override | |||
public VectorValues getVectorValues(String field) throws IOException { | |||
FieldInfo info = fieldInfos.fieldInfo(field); | |||
if (info == null) { | |||
FieldEntry fieldEntry = fields.get(field); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd be consistent with what we do with doc values:
- The low-level API on
VectorReader
only supports getting data for fields that exist and have vectors enabled on their FieldInfo, the behavior is undefined otherwise. - The high-level API on
LeafReader
checks the FieldInfo and returnsnull
for fields that do not exist or that don't have vectors enabled, and otherwise delegates to theVectorReader
.
Also return null in Lucene90VectorReader for simplicity.
This reverts commit ca031d2.
This PR removes
VectorValues#search
in favor of exposing NN search throughVectorReader#search
andLeafReader#searchNearestVectors
. It also marks thevector methods on
LeafReader
as experimental.