Add support for indexing byte-sized knn vectors (#90774)
This change adds element_type as an optional mapping parameter for dense vector fields, as described in #89784. It also adds a byte element_type that stores dense vectors using only 8 bits per dimension; this is supported only when the mapping parameter index is set to true. The code follows a pattern similar to our NumberFieldMapper: an ElementType enum provides methods that DenseVectorFieldType and DenseVectorMapper delegate to for each available type (just float and byte for now).
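The enum-with-delegation pattern described above can be sketched roughly as follows. This is an illustrative Python analogue, not the Java code from the commit: the `parse` and `storage_bytes` methods, the error messages, and the 4-bytes-per-dimension figure for float are assumptions made for the example. The byte range check and the rejection of fractional values are likewise assumed behaviour, not something this excerpt shows.

```python
from enum import Enum


class ElementType(Enum):
    """Hypothetical analogue of an ElementType enum with per-type behaviour."""

    FLOAT = ("float", 4)  # assumption: 32 bits per dimension
    BYTE = ("byte", 1)    # 8 bits per dimension, as stated in the commit message

    def __init__(self, label, bytes_per_dim):
        self.label = label
        self.bytes_per_dim = bytes_per_dim

    def storage_bytes(self, dims):
        """Raw size of one vector with `dims` dimensions."""
        return dims * self.bytes_per_dim

    def parse(self, values):
        """Validate and convert one vector for this element type."""
        if self is ElementType.FLOAT:
            return [float(v) for v in values]
        parsed = []
        for dim, v in enumerate(values):
            # Assumed rule: whole numbers only, even if written as 4.0.
            if not float(v).is_integer():
                raise ValueError(f"non-integer value [{v}] at dim [{dim}]")
            # Assumed rule: values must fit in a signed 8-bit integer.
            if not -128 <= v <= 127:
                raise ValueError(f"value [{v}] out of byte range at dim [{dim}]")
            parsed.append(int(v))
        return parsed


def index_vector(element_type, values):
    """A mapper-like caller that delegates all type-specific work to the enum."""
    return element_type.parse(values)
```

A caller only needs to pick the enum member and delegate, for example `index_vector(ElementType.BYTE, [127.0, -128.0, 0.0, 1.0, -1.0])`. The test file in this commit indexes exactly that vector into a byte field, which suggests that whole-number values written with a decimal point are accepted.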
Showing 7 changed files with 863 additions and 65 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 90774 | ||
summary: Add support for indexing byte-sized knn vectors | ||
area: Vector Search | ||
type: feature | ||
issues: [] |
180 changes: 180 additions & 0 deletions
...-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/45_knn_search_byte.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
setup: | ||
- skip: | ||
version: ' - 8.5.99' | ||
reason: 'byte-sized kNN search added in 8.6' | ||
|
||
- do: | ||
indices.create: | ||
index: test | ||
body: | ||
settings: | ||
number_of_replicas: 0 | ||
mappings: | ||
properties: | ||
name: | ||
type: keyword | ||
vector: | ||
type: dense_vector | ||
element_type: byte | ||
dims: 5 | ||
index: true | ||
similarity: cosine | ||
|
||
- do: | ||
index: | ||
index: test | ||
id: "1" | ||
body: | ||
name: cow.jpg | ||
vector: [2, -1, 1, 4, -3] | ||
|
||
- do: | ||
index: | ||
index: test | ||
id: "2" | ||
body: | ||
name: moose.jpg | ||
vector: [127.0, -128.0, 0.0, 1.0, -1.0] | ||
|
||
- do: | ||
index: | ||
index: test | ||
id: "3" | ||
body: | ||
name: rabbit.jpg | ||
vector: [5, 4.0, 3, 2.0, 127] | ||
|
||
- do: | ||
indices.refresh: {} | ||
|
||
--- | ||
"kNN search only": | ||
- do: | ||
search: | ||
index: test | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: vector | ||
query_vector: [127, 127, -128, -128, 127] | ||
k: 2 | ||
num_candidates: 3 | ||
|
||
- match: {hits.hits.0._id: "3"} | ||
- match: {hits.hits.0.fields.name.0: "rabbit.jpg"} | ||
|
||
- match: {hits.hits.1._id: "2"} | ||
- match: {hits.hits.1.fields.name.0: "moose.jpg"} | ||
|
||
--- | ||
"kNN search plus query": | ||
- do: | ||
search: | ||
index: test | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: vector | ||
query_vector: [127.0, -128.0, 0.0, 1.0, -1.0] | ||
k: 2 | ||
num_candidates: 3 | ||
query: | ||
term: | ||
name: rabbit.jpg | ||
|
||
- match: {hits.hits.0._id: "2"} | ||
- match: {hits.hits.0.fields.name.0: "moose.jpg"} | ||
|
||
- match: {hits.hits.1._id: "3"} | ||
- match: {hits.hits.1.fields.name.0: "rabbit.jpg"} | ||
|
||
- match: {hits.hits.2._id: "1"} | ||
- match: {hits.hits.2.fields.name.0: "cow.jpg"} | ||
|
||
--- | ||
"kNN search with filter": | ||
- do: | ||
search: | ||
index: test | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: vector | ||
query_vector: [5.0, 4, 3.0, 2, 127.0] | ||
k: 2 | ||
num_candidates: 3 | ||
|
||
filter: | ||
term: | ||
name: "rabbit.jpg" | ||
|
||
- match: {hits.total.value: 1} | ||
- match: {hits.hits.0._id: "3"} | ||
- match: {hits.hits.0.fields.name.0: "rabbit.jpg"} | ||
|
||
- do: | ||
search: | ||
index: test | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: vector | ||
query_vector: [2, -1, 1, 4, -3] | ||
k: 2 | ||
num_candidates: 3 | ||
filter: | ||
- term: | ||
name: "rabbit.jpg" | ||
- term: | ||
_id: 2 | ||
|
||
- match: {hits.total.value: 0} | ||
|
||
--- | ||
"kNN search with explicit search_type": | ||
- do: | ||
catch: bad_request | ||
search: | ||
index: test | ||
search_type: query_then_fetch | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: vector | ||
query_vector: [-0.5, 90.0, -10, 14.8, -156.0] | ||
k: 2 | ||
num_candidates: 3 | ||
|
||
- match: { error.root_cause.0.type: "illegal_argument_exception" } | ||
- match: { error.root_cause.0.reason: "cannot set [search_type] when using [knn] search, since the search type is determined automatically" } | ||
|
||
--- | ||
"Test nonexistent field": | ||
- do: | ||
catch: bad_request | ||
search: | ||
index: test | ||
body: | ||
fields: [ "name" ] | ||
knn: | ||
field: nonexistent | ||
query_vector: [ 1, 0, 0, 0, -1 ] | ||
k: 2 | ||
num_candidates: 3 | ||
- match: { error.root_cause.0.type: "query_shard_exception" } | ||
- match: { error.root_cause.0.reason: "failed to create query: field [nonexistent] does not exist in the mapping" } | ||
|
||
--- | ||
"Direct kNN queries are disallowed": | ||
- do: | ||
catch: bad_request | ||
search: | ||
index: test | ||
body: | ||
query: | ||
knn: | ||
field: vector | ||
query_vector: [ -1, 0, 1, 2, 3 ] | ||
num_candidates: 1 | ||
- match: { error.root_cause.0.type: "illegal_argument_exception" } | ||
- match: { error.root_cause.0.reason: "[knn] queries cannot be provided directly, use the [knn] body parameter instead" } |
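
The expected hit order in the tests above can be checked by hand with a brute-force cosine ranking over the three indexed documents. The sketch below is an exact stand-in for illustration only; it is not the approximate search the index actually runs, and `brute_force_knn` and its `name_filter` parameter are hypothetical names. With three documents and `num_candidates: 3`, an exact ranking should agree with the assertions.

```python
import math

# Documents indexed by the setup block of the test file.
DOCS = {
    "1": ("cow.jpg", [2, -1, 1, 4, -3]),
    "2": ("moose.jpg", [127, -128, 0, 1, -1]),
    "3": ("rabbit.jpg", [5, 4, 3, 2, 127]),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_knn(query_vector, k, name_filter=None):
    """Return the ids of the k most similar documents, best first.

    name_filter mimics the term filter on `name`: only documents whose
    name matches are considered at all.
    """
    candidates = [
        (doc_id, cosine(query_vector, vector))
        for doc_id, (name, vector) in DOCS.items()
        if name_filter is None or name == name_filter
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in candidates[:k]]


# "kNN search only": rabbit.jpg first, then moose.jpg.
print(brute_force_knn([127, 127, -128, -128, 127], k=2))
# "kNN search with filter": only rabbit.jpg passes the filter.
print(brute_force_knn([5, 4, 3, 2, 127], k=2, name_filter="rabbit.jpg"))
```

For the "kNN search plus query" case, the same function returns documents 2 and 1 for `k: 2`. The third hit asserted there, rabbit.jpg, is presumably contributed by the term query, whose results are combined with the kNN hits.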